MottSauce
Inspiration
Mott insulators have always felt like a paradox: materials that band theory predicts should conduct electricity, yet experimentally behave as insulators. That tension between elegant theory and messy quantum reality inspired us. If we could learn the boundary between metallic and Mott insulating behavior directly from data, we could help researchers shortcut months of simulation or lab work.
At the heart of the problem is electron–electron interaction. In simple terms, when the on-site Coulomb repulsion U dominates over the hopping parameter t in the Hubbard model,
$$ H = -t \sum_{\langle i,j \rangle,\sigma} c^\dagger_{i\sigma} c_{j\sigma} + U \sum_i n_{i\uparrow} n_{i\downarrow}, $$
the system can transition into a Mott insulating phase, most clearly at half filling (one electron per site). We wanted to see if machine learning could implicitly learn this boundary from real materials data.
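To make the role of U/t concrete, here is a minimal sketch of the smallest possible case: the half-filled Hubbard model on just two sites, solved in closed form. This is a toy illustration, not part of the MottSauce pipeline, and the function name is made up for this example. A two-site cluster has no sharp phase transition, only a smooth crossover, but it shows the basic mechanism: as U/t grows, the ground state suppresses double occupancy and the electrons localize.

```python
import math


def two_site_hubbard_ground_state(U, t):
    """Ground state of the half-filled two-site Hubbard model (singlet sector).

    In the basis {covalent singlet, ionic singlet} the Hamiltonian is
        [[ 0,  -2t],
         [-2t,   U]]
    Returns (ground-state energy, expected number of doubly occupied sites).
    """
    energy = 0.5 * (U - math.sqrt(U * U + 16.0 * t * t))
    # Unnormalised eigenvector (1, b): the first row gives -2t * b = energy.
    b = -energy / (2.0 * t)
    # The ionic state has exactly one doubly occupied site, so its weight
    # in the ground state equals the expected double occupancy.
    double_occupancy = b * b / (1.0 + b * b)
    return energy, double_occupancy


if __name__ == "__main__":
    t = 1.0
    for U in (0.0, 2.0, 4.0, 8.0, 16.0):
        e0, d = two_site_hubbard_ground_state(U, t)
        print(f"U/t = {U / t:5.1f}  E0 = {e0:8.4f}  double occupancy = {d:.4f}")
```

At U = 0 the double occupancy takes its uncorrelated value, and it falls toward zero as U/t increases, which is the localization tendency behind Mott insulating behavior.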
What it does
MottSauce is a machine learning pipeline that:
- Pulls real quantum chemistry data from the Materials Project database
- Featurizes crystal structures and electronic properties
- Trains an XGBoost classifier to distinguish Mott insulators from conventional conductors
Researchers can input a new material’s properties and receive a fast prediction, potentially saving months of computational or experimental work.
How we built it
Data Collection
We queried the Materials Project API to obtain structural, compositional, and electronic properties.
Featurization
Using established materials informatics tools, we converted crystal structures and compositions into numerical descriptors that capture chemistry, bonding environments, and electronic signals.
Label Engineering
We curated a dataset of known Mott insulators and conventional metals, ensuring physically meaningful class separation.
Modeling
We trained an XGBoost classifier, tuning hyperparameters to optimize precision and recall while avoiding overfitting.
Evaluation
We validated performance using cross-validation and confusion matrices, focusing on minimizing false positives that could mislead researchers.
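As an illustration of the evaluation step, the sketch below computes a confusion matrix and the precision and recall that follow from it, treating "Mott insulator" as the positive class (label 1). The labels are made-up toy data, not results from the project, and the helper names are hypothetical. Precision is the quantity to watch when false positives are the main concern, since it is the fraction of predicted Mott insulators that really are Mott insulators.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def precision_recall(counts):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labels for illustration only: 1 = Mott insulator, 0 = metal.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    counts = confusion_counts(y_true, y_pred)
    precision, recall = precision_recall(counts)
    print(counts)
    print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

In a cross-validated setting, the same counts would be accumulated over the held-out folds before computing the two ratios.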
Challenges we ran into
- Label ambiguity: Not all materials are cleanly classified; literature disagreements required careful curation.
- Class imbalance: True Mott insulators are rarer than metals.
- Feature leakage: Avoiding direct proxies that trivially encode the answer.
- Physical interpretability: Ensuring the model wasn’t just statistically correct, but physically plausible.
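One common way to handle the class imbalance mentioned above is to weight the rare class more heavily during training. The sketch below computes the negative-to-positive ratio that XGBoost's documentation suggests as a typical value for its `scale_pos_weight` parameter. Whether MottSauce handled imbalance this way is an assumption; the class counts are invented for illustration and the helper name is hypothetical.

```python
def positive_class_weight(labels, positive=1):
    """Return n_negative / n_positive, a typical value for scale_pos_weight."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples; the ratio is undefined")
    return n_neg / n_pos


if __name__ == "__main__":
    # Hypothetical counts: 20 Mott insulators (1) and 180 metals (0).
    labels = [1] * 20 + [0] * 180
    weight = positive_class_weight(labels)
    print(f"suggested scale_pos_weight = {weight:.1f}")
```

The resulting number would be passed as `scale_pos_weight` when constructing the classifier, so that errors on Mott insulators count proportionally more in the loss.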
Accomplishments that we're proud of
- Built an end-to-end reproducible pipeline from raw materials data to predictions.
- Achieved strong classification performance without overfitting.
- Created a tool that meaningfully bridges physics and machine learning.
- Maintained interpretability using feature importance analysis.
What we learned
- Strong domain knowledge is critical in scientific ML.
- Data curation is often harder than model training.
- Physical intuition (like the ratio U/t) can guide feature design.
- Interpretability matters deeply in scientific applications.
What's next for MottSauce
- Expand the dataset with more experimentally confirmed Mott systems.
- Incorporate graph neural networks for structure-aware modeling.
- Predict transition probabilities instead of binary labels.
- Build a simple web interface so researchers can query materials directly.
Ultimately, we want MottSauce to become a reliable screening tool for discovering quantum materials — accelerating progress in superconductivity, quantum computing, and next-generation electronics.