Inspiration
As a Data Science Master's student, I realized that while we obsess over maximizing accuracy metrics like $R^2$ or minimizing $RMSE$, we rarely discuss the computational cost of our models. Training a single large AI model can emit as much carbon as five cars do over their lifetimes. I wanted to build a tool that makes "Green AI" accessible, allowing developers to see the invisible carbon cost of their algorithms before they deploy them.
What it does
EcoMetric is a benchmarking tool that compares regression and classification machine learning models. Unlike standard evaluation pipelines, it treats CO2 emissions (kg) as a primary metric alongside traditional performance scores.
How I built it
The project is built entirely in Python. It uses scikit-learn for the models (Random Forest, SVM, Linear/Logistic Regression, etc.) and CodeCarbon to track energy consumption in real time from the hardware's power draw, then convert it to emissions using the local power grid's carbon intensity.
The core workflow involves:
- Data Ingestion: Users upload a CSV or generate a synthetic dataset.
- Cross-Validation: Each model is evaluated with 5-fold cross-validation, so the reported scores do not depend on a single train/test split.
- Tracking: During the training and inference loops, the background tracker monitors energy usage.
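The workflow above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the project uses scikit-learn models and a CodeCarbon tracker, whereas here a one-feature least-squares fit stands in for the model and a wall-clock timer marks where the tracker would start and stop. The helper names (`k_fold_indices`, `fit_line`, `cross_validate`) and the synthetic data are hypothetical.

```python
import random
import time


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=5):
    """Return one (mse, seconds) pair per fold for the toy linear model."""
    results = []
    for held_out in k_fold_indices(len(xs), k):
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        start = time.perf_counter()  # stand-in for starting the energy tracker
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        seconds = time.perf_counter() - start  # stand-in for stopping it
        mse = sum((ys[i] - p) ** 2 for i, p in zip(held_out, preds)) / len(held_out)
        results.append((mse, seconds))
    return results


# Synthetic dataset (assumed for illustration): y = 3x + 2 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [3 * x + 2 + rng.gauss(0, 0.1) for x in xs]

fold_results = cross_validate(xs, ys)
for fold, (mse, seconds) in enumerate(fold_results, start=1):
    print(f"fold {fold}: MSE={mse:.4f}, time={seconds:.6f}s")
```

Timing both the fit and the predictions inside the same window mirrors the idea of tracking the training and inference loops together. A real tracker would report energy rather than seconds.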
Mathematical Evaluation
To ensure the tool is scientifically robust, I calculated standard metrics using the following definitions:
Mean Squared Error (MSE): Used to penalize large errors in regression. $$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Coefficient of Determination ($R^2$): To judge the goodness of fit. $$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$$
Carbon Emission Estimate: $$CO_2 = E \times I$$ (where $E$ is the energy consumed in kWh and $I$ is the carbon intensity of the local grid in kg of CO2 per kWh, so the result is in kg).
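As a worked example, the three definitions above translate directly into a few lines of Python. This is a sketch under stated assumptions: the sample values, the 150 W average power draw, the 120 s duration, and the 0.4 kg/kWh grid intensity are all made-up numbers for illustration, and the function names are hypothetical. In the project itself, scikit-learn's `mean_squared_error` and `r2_score` compute the same two metrics.

```python
def mean_squared_error(y_true, y_pred):
    """MSE = (1/n) * sum((y_i - y_hat_i)^2)."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def co2_kg(energy_kwh, intensity_kg_per_kwh):
    """CO2 = E * I, with E in kWh and I in kg CO2 per kWh."""
    return energy_kwh * intensity_kg_per_kwh


# Made-up targets and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

mse = mean_squared_error(y_true, y_pred)  # (0.25 + 0 + 0.25 + 0) / 4
r2 = r_squared(y_true, y_pred)            # 1 - 0.5 / 20

# Hypothetical run: 150 W average draw for 120 s.
# Watts * seconds = joules, and 1 kWh = 3.6e6 J.
energy_kwh = 150 * 120 / 3_600_000
emissions_kg = co2_kg(energy_kwh, intensity_kg_per_kwh=0.4)

print(f"MSE = {mse:.3f}")                 # MSE = 0.125
print(f"R^2 = {r2:.3f}")                  # R^2 = 0.975
print(f"CO2 = {emissions_kg:.3f} kg")     # CO2 = 0.002 kg
```

Note that the energy term depends on both power and duration, so a model that trains faster on the same hardware emits proportionally less under this estimate.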
Challenges I faced
The biggest technical challenge was isolating the energy consumption of the specific training process from the background operating system noise.
Built With
- codecarbon
- github
- matplotlib
- numpy
- pandas
- python
- scikit-learn
- streamlit