Inspiration
Training machine learning models is expensive in both time and energy, and much of that cost comes from failed experiments or from running too many epochs before realizing that a configuration is not promising. I wanted to build something that helps teams run greener ML workflows by combining GitLab CI/CD, lightweight hyperparameter search, and cloud infrastructure choices that are more cost- and carbon-aware.
What it does
ML Helper runs short hyperparameter search jobs through GitLab CI/CD, compares candidate configurations in parallel, and keeps the best result as an artifact. It supports both local and GCP-oriented execution, adds Spot VM-aware defaults for greener compute, and exposes a hosted web app that can connect to GitLab and trigger pipelines.
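The compare-and-keep-the-best step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `short_run` simulates training with a made-up loss surface, and the function names, the candidate list, and the `best_config.json` file name are all assumptions.

```python
import json
import random
from pathlib import Path


def short_run(config, epochs=3, seed=0):
    """Stand-in for a short training run; returns a simulated validation loss."""
    rng = random.Random(f"{seed}-{config['lr']}-{config['batch_size']}")
    # Hypothetical loss surface: lowest near lr=0.01, plus a little noise.
    loss = abs(config["lr"] - 0.01) * 10 + 1.0 / epochs + rng.uniform(0, 0.01)
    return round(loss, 4)


def pick_best(candidates, epochs=3):
    """Score every candidate with a short run and return the lowest-loss one."""
    results = [{"config": c, "val_loss": short_run(c, epochs)} for c in candidates]
    return min(results, key=lambda r: r["val_loss"]), results


def write_artifact(best, path="best_config.json"):
    """Write the winning configuration to a file a CI job could keep as an artifact."""
    Path(path).write_text(json.dumps(best, indent=2))
    return path


# Hypothetical candidate configurations.
CANDIDATES = [
    {"lr": 0.1, "batch_size": 32},
    {"lr": 0.01, "batch_size": 32},
    {"lr": 0.001, "batch_size": 64},
]

if __name__ == "__main__":
    best, _ = pick_best(CANDIDATES)
    print(write_artifact(best))
```

In a parallel CI setup, each job would likely score one candidate and a final job would run only the selection step over the collected results.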
How I built it
I built ML Helper with a React and Vite frontend, a lightweight Python training and search pipeline, and a Flask backend for hosted API integration. GitLab CI/CD orchestrates parallel search jobs, Python scripts handle candidate selection and result summarization, and GCP is used for hosting and deployment through Cloud Run, with Spot-friendly configuration for compute planning.
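As an illustration of how a backend could start a search pipeline, here is a sketch that builds a request for GitLab's pipeline trigger endpoint (`POST /api/v4/projects/:id/trigger/pipeline`). It uses only the standard library rather than Flask to stay self-contained. The function names, the `SEARCH_TRIALS` variable, and the example host are assumptions, not the project's actual code.

```python
import urllib.parse
import urllib.request


def build_trigger_request(base_url, project_id, trigger_token, ref, variables=None):
    """Build a POST request for GitLab's pipeline trigger endpoint."""
    # Project paths such as "group/project" must be URL-encoded in the path.
    project = urllib.parse.quote(str(project_id), safe="")
    url = f"{base_url.rstrip('/')}/api/v4/projects/{project}/trigger/pipeline"
    fields = {"token": trigger_token, "ref": ref}
    # Pipeline variables are passed as variables[KEY]=value form fields.
    for key, value in (variables or {}).items():
        fields[f"variables[{key}]"] = str(value)
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def trigger_pipeline(request, timeout=10):
    """Send the request and return GitLab's raw JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping the token on the server side, for example in an environment variable read by the backend, means the browser never has to see it.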
Challenges I ran into
The project became messy because the frontend, backend ideas, CI flow, and GCP setup were all mixed together. I had to untangle duplicate pipeline logic, fix missing execution paths, handle GCP authentication, enable the required Google Cloud APIs, and work around permission issues when trying to use Secret Manager securely. GitLab permissions were also tricky, especially around token types and the visibility of project settings.
Accomplishments that I'm proud of
I’m proud that the project now has a clean GitLab CI/CD flow, a hosted Cloud Run deployment, real GCP integration, and a clearer architecture instead of a demo-only prototype. I’m also proud that it now reflects the original green-ML goal better by using short runs, resumable search artifacts, and Spot-aware cloud defaults.
What I learned
I learned how quickly ML tooling can become hard to manage when experiment logic, infrastructure, and product UI are developed at the same time. I also learned a lot about GitLab pipeline automation, Cloud Run deployment, GCP service configuration, and the importance of separating browser-facing code from secret-bearing backend workflows.
What's next for ML Helper
Next, I want to make the GitLab integration more secure by moving tokens fully into Secret Manager, add true artifact polling and pipeline status updates to the UI, and automate GitLab Runner provisioning on GCP Spot VMs. On the ML side, I want to make the search smarter with better early stopping, checkpoint recovery, and support for real training scripts rather than simulated runs.
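One possible direction for smarter early stopping is successive halving, where weak candidates are dropped after a short run and only the survivors get more epochs. The sketch below is an assumption about how that might look, not a description of what ML Helper does today. `simulated_loss` is a hypothetical stand-in for a real training script.

```python
def successive_halving(candidates, evaluate, min_epochs=1, keep_fraction=0.5):
    """Train survivors for progressively longer and drop the weaker ones each round."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0 < keep_fraction < 1:
        raise ValueError("keep_fraction must be between 0 and 1 (exclusive)")
    survivors = list(candidates)
    epochs = min_epochs
    history = []
    while len(survivors) > 1:
        # Lower loss is better, so the best candidates sort first.
        scored = sorted(survivors, key=lambda c: evaluate(c, epochs))
        keep = max(1, int(len(scored) * keep_fraction))
        history.append({"epochs": epochs, "evaluated": len(scored), "kept": keep})
        survivors = scored[:keep]
        epochs *= 2  # Survivors earn a longer run in the next round.
    return survivors[0], history


def simulated_loss(config, epochs):
    """Hypothetical stand-in for a real training run (lower is better)."""
    return abs(config["lr"] - 0.01) + 1.0 / epochs
```

With four candidates, this evaluates all four for one epoch and only the best two for two epochs, so most of the compute goes to configurations that already look promising.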
Built With
- javascript
- python
- terraform