Inspiration

The motorsport environment demands rapid, high-stakes decision-making from engineers, drivers, and technicians. Given the severe limitations on physical track time, success relies heavily on leveraging simulation and telemetry to accelerate development.

I observed that while many existing tools offer real-time data visualization, there is a significant lack of integrated AI-driven software. This gap exists because raw telemetry is challenging to utilize—it contains noise and missing values, requiring specialized signal processing before it can effectively power reliable machine learning models.

I created ApexAI to bridge this gap. My focus was on the entire data pipeline: transforming complex, raw telemetry into clean, actionable datasets. Beyond just data cleaning, I built a comprehensive development platform to manage, automate, and accelerate the model creation process itself.

While my current demonstration focuses on real-time vehicle identification based on sensor dynamics, ApexAI’s architecture is flexible enough to adapt to any labeling task—such as driver identification, component fault detection, or skill classification. My ultimate goal is to provide comprehensive, data-driven support to the entire racing ecosystem.

What it does

ApexAI is a comprehensive, on-premise-ready AutoML/MLOps platform designed for time-series data analysis. It provides:

End-to-End Automation & Deployment: Seamless environment setup and reproducible deployment via a single command (Docker).

Flexible Training Modes: Supports three distinct modes: manual Single Training, Hyperparameter Optimization (HPO) for automated parameter tuning, and fully automated BO-NAS for a comprehensive model-and-parameter search.

MLOps Management & Reproducibility: Automatically tracks and visualizes all training progress, metrics, and parameters. All resulting AI models and artifacts are securely saved to dedicated storage for complete reproducibility.

Real-Time Simulation & Proof: The application demonstrates robust, real-time vehicle identification by playing back uploaded CSV data as a simulated live telemetry stream. The user can apply any custom-trained model and utilize multi-model ensembling for significantly improved prediction robustness.
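
As an illustration of the playback-and-ensembling idea above, here is a minimal sketch; the function names, array shapes, and stub models are illustrative assumptions, not the actual ApexAI code:

```python
import time
from typing import Iterator

import numpy as np
import pandas as pd

def stream_rows(df: pd.DataFrame, hz: float = 10.0,
                realtime: bool = False) -> Iterator[pd.Series]:
    """Replay a telemetry DataFrame row by row, as if it were a live feed."""
    for _, row in df.iterrows():
        yield row
        if realtime:
            # Pace the playback to the nominal sample rate.
            time.sleep(1.0 / hz)

def ensemble_predict(window: np.ndarray, models: list) -> int:
    """Average per-class probabilities across models, then take the argmax."""
    probs = np.mean([m(window) for m in models], axis=0)
    return int(np.argmax(probs))
```

Averaging probabilities before the argmax (soft voting) is what makes the ensemble more robust than any single model's hard prediction.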

How we built it

I focused heavily on MLOps and reproducibility. The entire environment, including MLflow, Optuna, and the data pipeline, is containerized with Docker Compose. I engineered a specialized data pipeline to handle the noisy, high-frequency telemetry, including anti-aliasing filtering and resampling to 10 Hz. For time-series feature extraction, the platform supports LSTM, GRU, Transformer, and Informer architectures, and the real-time Streamlit simulator integrates Test-Time Augmentation (TTA) and multi-model ensembling for validation.
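
The filter-then-resample step can be sketched as follows. This version uses a crude boxcar average as the anti-alias filter so it stays dependency-light; a production pipeline would likely use a proper low-pass design (e.g. a Butterworth filter), and the function name and rates here are illustrative:

```python
import numpy as np

def resample_to_10hz(x: np.ndarray, fs_in: float, fs_out: float = 10.0) -> np.ndarray:
    """Crude anti-aliasing (boxcar average) followed by decimation to fs_out."""
    step = int(round(fs_in / fs_out))
    # Averaging over each decimation window suppresses energy above the
    # new Nyquist frequency before the sample rate is reduced.
    kernel = np.ones(step) / step
    smoothed = np.convolve(x, kernel, mode="same")
    return smoothed[::step]
```

For example, a 100 Hz channel passed through this function comes out at 10 Hz, with high-frequency noise attenuated rather than aliased into the training data.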

Challenges we ran into

  1. Lack of Reproducibility and Experiment Management: The core difficulty in deep learning development was maintaining control over the experimental process. The high volume of metrics, training parameters, and model artifacts generated by the scripts made it challenging to track past runs accurately, compare results efficiently, and ensure complete reproducibility.

  2. Noisy and Irregular Telemetry Data: Raw data streams arriving from the vehicles presented significant preparation hurdles. The telemetry exhibits non-uniform sampling rates across sensors, along with latency, dropped signals, and high-frequency noise. This irregularity made building a synchronized, reliable, high-quality training dataset exceptionally difficult.

  3. Complex Conditional Search Space: To leverage advanced models, I had to design an optimization strategy that could handle conditional parameters (e.g., the number of attention heads, which only applies to the Transformer architecture). Taming this vast search space for BO-NAS took significant effort to ensure the optimization converged within the competition timeframe.
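
A conditional search space of this kind is typically expressed define-by-run: a parameter is only sampled when the branch that needs it is active. The sketch below is shaped like the sampling step of an Optuna objective, with a tiny random stub trial so it runs standalone; the parameter names and ranges are illustrative, not ApexAI's actual space:

```python
import random

def sample_architecture(trial):
    """Sample a model config from a conditional search space.

    `trial` only needs Optuna-style suggest_* methods, so this works both
    inside an Optuna study and with the simple random stub below.
    """
    arch = trial.suggest_categorical("arch", ["lstm", "gru", "transformer"])
    params = {"arch": arch,
              "hidden": trial.suggest_int("hidden", 32, 256)}
    if arch == "transformer":
        # Attention heads exist only on the Transformer branch: the
        # classic conditional parameter that inflates the search space.
        params["n_heads"] = trial.suggest_categorical("n_heads", [2, 4, 8])
    return params

class RandomTrial:
    """Minimal illustrative stand-in for optuna.trial.Trial."""
    def suggest_categorical(self, name, choices):
        return random.choice(choices)
    def suggest_int(self, name, low, high):
        return random.randint(low, high)
```

Because the `n_heads` suggestion only fires inside the Transformer branch, the optimizer never wastes trials varying a parameter the sampled architecture ignores.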

Accomplishments that we're proud of

I am most proud of establishing a reproducible, end-to-end MLOps platform, proven by the following achievements:

1. Integrated AutoML & Optimization: I successfully unified Hyperparameter Optimization (HPO) and Neural Architecture Search (NAS) into a single, intelligent system that automatically explores the search space and discovers optimal models.

2. MLOps and Reproducibility: I ensured MLOps compliance by automatically logging and visualizing all optimization trials, parameters, and resulting artifacts via MLflow, guaranteeing complete reproducibility for any future experiment.

3. One-Command Deployment: I made the entire platform trivial to stand up with Docker: executing a single command brings up the complex, multi-component MLOps environment.

4. Automated Data Pipeline: I implemented a specialized pipeline capable of automatically generating high-quality training datasets from raw telemetry, dramatically shortening the data preparation phase.

5. Real-World Simulation: I created a Kafka-style real-time simulator that demonstrates immediate production viability and showcases the robustness gained from Test-Time Augmentation (TTA) and multi-model ensembling.
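
A minimal sketch of what TTA means here, assuming a simple Gaussian-jitter augmentation (the augmentation choice, names, and shapes are illustrative assumptions):

```python
import numpy as np

def tta_predict(window: np.ndarray, model, n_aug: int = 8,
                noise_std: float = 0.01, seed: int = 0) -> np.ndarray:
    """Test-Time Augmentation: average the model's predictions over the
    original window plus several jittered copies of it."""
    rng = np.random.default_rng(seed)
    views = [window] + [window + rng.normal(0.0, noise_std, window.shape)
                        for _ in range(n_aug)]
    return np.mean([model(v) for v in views], axis=0)
```

Averaging over perturbed views smooths out predictions that would flip under small amounts of sensor noise, which is exactly the failure mode live telemetry exposes.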

What we learned

I learned that in cutting-edge data science, speed and infrastructure trump brute-force computation. The true value lies not in finding the single best model manually, but in building a platform that can quickly explore thousands of possibilities (AutoML) in a secure, repeatable environment (MLOps). This crucial shift in focus from algorithm tuning to pipeline automation is the key to accelerating race strategy and making this technology directly useful for my professional work.

Thank you for this invaluable opportunity to gain new knowledge and polish my skills. I have the deepest appreciation for Toyota Gazoo Racing, and for Devpost, for the platform you have given developers to contribute to the future of motorsport.

What's next for ApexAI

ApexAI is built to grow into the foundational tool for TGR. Future steps include:

1. Simulator Integration: Connecting ApexAI to professional driving simulators for real-time driver performance feedback and virtual training based on "driver fingerprinting."

2. Predictive Maintenance: Utilizing the anomaly detection capability (sudden confidence drops) for early warning of mechanical failures or tire degradation during a race.

3. Cloud Deployment: Enhancing the current Docker architecture for seamless deployment onto TGR's internal cloud resources, while maintaining strict data confidentiality.
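
The confidence-drop signal in item 2 could be prototyped as simply as flagging timesteps that fall well below a rolling baseline; this sketch is a hypothetical illustration, not ApexAI code:

```python
import numpy as np

def confidence_drop_alerts(conf: np.ndarray, window: int = 10,
                           drop: float = 0.2) -> np.ndarray:
    """Flag timesteps where prediction confidence falls at least `drop`
    below its rolling mean over the previous `window` steps."""
    alerts = np.zeros(len(conf), dtype=bool)
    for t in range(window, len(conf)):
        baseline = conf[t - window:t].mean()
        alerts[t] = (baseline - conf[t]) >= drop
    return alerts
```

The intuition: a model trained on healthy-car dynamics becomes suddenly unsure when the dynamics change, so a sharp confidence drop is a cheap early-warning proxy for mechanical issues or tire degradation.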

Built With

Docker, MLflow, Optuna, Streamlit