ML Experiment Autopilot

ML Experiment Autopilot CLI

Project Idea

ML Experiment Autopilot is an autonomous agent that designs, executes, and iterates on machine learning experiments without human supervision, explaining every decision along the way.

Inspiration

Traditional AutoML tools operate as black boxes — they select models and tune hyperparameters but never explain why. Data scientists lose insight into what works and what doesn't. ML Experiment Autopilot takes a fundamentally different approach: it thinks like a researcher, forming hypotheses, testing them, analyzing results, and adapting its strategy based on what it learns.

Description

At its core, the system uses Gemini 3 Flash Preview with Thought Signatures to maintain reasoning continuity across the entire experiment session. Four cognitive components — ExperimentDesigner, ResultsAnalyzer, HypothesisGenerator, and ReportGenerator — share a single multi-turn conversation, allowing Gemini to reference early iteration results when designing later experiments. Temperature is set to 1.0 with thinking level "high" for complex reasoning tasks.

The workflow is fully autonomous: data profiling, baseline establishment, then an iterative loop of Gemini-designed experiments executed via Jinja2-generated Python scripts in isolated subprocesses. The agent detects performance plateaus, balances exploration vs. exploitation, and generates publication-ready Markdown reports with visualizations.

Development Tech Stack

Built with scikit-learn, XGBoost, LightGBM, MLflow, and Rich, the system is designed for the Marathon Agent track — long-running autonomous tasks with self-correction and continuous reasoning.

Built With

gemini-3-flash-preview-(google-generativeai)
jinja
lightgbm
matplotlib
mlflow
numpy
pandas
pydantic
pytest
python
python-dotenv
rich
scikit-learn
typer
xgboost

Updates

Srikar Pottabathula started this project — Feb 09, 2026 04:58 PM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.