It started with a craving.

Our team shares a deep, borderline obsessive love for malatang and hotpot. The rich broths, the customizable ingredients, and the numbing mala are what we live for. But every time we tried to find the one, we ended up spending way too long scrolling through Yelp with no real way to filter by what actually matters to us.

So when the datathon kicked off, we already knew what had to be done: build the thing we wished existed.

Divide & Conquer

Once we had the idea, we mapped out exactly what needed to be built and split the work across four roles:

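| Person         | Role                    | Folder          |
|----------------|-------------------------|-----------------|
| Alvin Do       | Data Engineer           | data/           |
| Devin Hua      | Backend Developer       | backend/        |
| Mattheu Nguyen | Recommendation Engineer | recommendation/ |
| Vivian Lam     | Frontend Developer      | frontend/       |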

Everyone stepped up and owned their piece. We used GitHub to stay in sync, pushing and pulling changes as we built, and merged our branches to tie everything together.

How We Built It

The Data Layer (Alvin Do)

The foundation of everything was the Yelp Open Dataset, with over 150,000 businesses and roughly 7 million reviews. Alvin Do tackled the full ETL (Extract, Transform, Load) pipeline: downloading and extracting the raw JSON files, filtering down to only malatang and hotpot restaurants, structuring and cleaning the data with pandas and DuckDB, and loading the result into a SQLite database (malatang.db) that the rest of the team could query.
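To make the filter-and-load step concrete, here is a minimal sketch using only the Python standard library (our pipeline used pandas and DuckDB, so this is a stand-in, not the actual code). It assumes the business file is newline-delimited JSON with `business_id`, `name`, `city`, `stars`, and `categories` fields. The keyword list, the `restaurants` table name, and the function names are hypothetical.

```python
import json
import sqlite3

# Illustrative keywords only; the real filter terms are an assumption.
KEYWORDS = ("hot pot", "hotpot", "malatang")


def is_match(business: dict) -> bool:
    """Return True if the name or category string mentions any keyword."""
    haystack = f"{business.get('name') or ''} {business.get('categories') or ''}".lower()
    return any(keyword in haystack for keyword in KEYWORDS)


def load_restaurants(lines, conn: sqlite3.Connection) -> int:
    """Filter newline-delimited business records and load the matches into SQLite."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS restaurants (
               business_id TEXT PRIMARY KEY,
               name TEXT, city TEXT, stars REAL, categories TEXT)"""
    )
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        b = json.loads(line)
        if is_match(b):
            rows.append(
                (b["business_id"], b.get("name"), b.get("city"),
                 b.get("stars"), b.get("categories"))
            )
    conn.executemany("INSERT OR REPLACE INTO restaurants VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Tiny in-memory demo with made-up records.
sample = [
    '{"business_id": "a1", "name": "Spicy Pot", "city": "Tampa", "stars": 4.5, "categories": "Hot Pot, Chinese"}',
    '{"business_id": "b2", "name": "Burger Barn", "city": "Reno", "stars": 3.0, "categories": "Burgers"}',
    '{"business_id": "c3", "name": "Late Cafe", "city": "Reno", "stars": 4.0, "categories": null}',
]
conn = sqlite3.connect(":memory:")
loaded = load_restaurants(sample, conn)
```

Against the real dataset, the same function could take an open file handle for the business JSON file and a connection to malatang.db instead of the in-memory demo.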

The Recommendation Engine (Mattheu Nguyen)

Mattheu Nguyen built the scoring algorithm that takes a user's preferences (spice level, broth type, meats, ingredients, sides, and appetizers) and matches them against real restaurant data and review text using keyword analysis. The more a restaurant's reviews and categories match your preferences, the higher it scores.
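A keyword-based scorer along these lines might look like the sketch below. The weighting (3 points for a category hit, 1 point per review mention), the function names, and the sample data are all assumptions for illustration, not the actual scoring rules.

```python
def score_restaurant(preferences: list[str], categories: str, reviews: list[str]) -> int:
    """Score a restaurant by counting preference keywords in its categories and reviews."""
    cats = categories.lower()
    text = " ".join(reviews).lower()
    score = 0
    for pref in preferences:
        keyword = pref.lower()
        if keyword in cats:
            score += 3  # hypothetical weight for a category match
        score += text.count(keyword)  # one point per mention in review text
    return score


def best_match(preferences: list[str], restaurants: list[dict]) -> dict:
    """Return the restaurant with the highest score."""
    return max(
        restaurants,
        key=lambda r: score_restaurant(preferences, r["categories"], r["reviews"]),
    )


# Made-up restaurants for demonstration.
restaurants = [
    {"name": "Broth House", "categories": "Hot Pot, Chinese",
     "reviews": ["The tomato broth was mild.", "Great lamb and fish balls."]},
    {"name": "Mala Town", "categories": "Hot Pot, Szechuan",
     "reviews": ["Spicy beef broth, very numbing.", "Loved the spicy lotus root and beef."]},
]
prefs = ["spicy", "beef", "szechuan"]
winner = best_match(prefs, restaurants)
```

Plain substring counting is crude (it cannot tell "spicy" from "not spicy"), but it is cheap and easy to explain, which suits a hackathon timeline.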

The Backend (Devin Hua)

Devin Hua built the backbone that connected everything. He created a Flask API that served as the bridge between the frontend and the recommendation engine. The API handled incoming user preference data from the frontend, passed it to the scoring algorithm, queried the SQLite database, and returned the single best-matched malatang restaurant for the user. He also configured Flask-CORS so the React frontend could call the Python backend across ports, letting the full stack run locally without cross-origin errors.

The Frontend (Vivian Lam)

Vivian was the creative heart of the project. Beyond just building the UI, she completely defined the look, feel, and personality of Malatang Matcher from the ground up. She brainstormed and designed the full UX flow, walking users through a step-by-step preference selection experience that felt fun and intuitive rather than like a boring form. What makes the frontend truly special is that every single illustration in the project was hand-drawn by Vivian herself, from the cute steaming hotpot bowl to the squid, sausage, potato tornado, fried dough, eggs, and veggies, all the way down to the charming "ready to tie the knot?" submit button tied with a bow. The result is a UI that feels warm, playful, and perfectly on brand for a team that genuinely loves malatang.

Challenges

Working with a dataset this large came with real challenges. The Yelp dataset is several gigabytes, too large to store on GitHub and too large to load into memory all at once. Our data engineer had to download and extract the raw JSON files locally and generate the SQLite database on his own machine. To process the massive 7-million-row review file, we skipped writing a chunked pipeline that would load the data piece by piece. Instead, we used DuckDB to query the entire review file in one shot, filtering directly to only the reviews tied to our malatang restaurants without ever loading the full dataset into memory. This made the pipeline significantly faster and cleaner to write than a traditional chunked approach.

This created a tricky moment: the database existed on one laptop but not on the others. Teammates trying to run the backend kept hitting errors because malatang.db simply was not there. Since the file was too large for GitHub and intentionally excluded via .gitignore, we had to AirDrop the database directly between machines, a very hackathon solution to a very real data engineering problem.

What We're Proud Of

In just a few hours, we went from a craving and an idea to a functioning, end-to-end product with real data, real recommendations, and real results. None of us walked in knowing how to do everything we ended up doing. Along the way we picked up skills that were completely new to us, including ETL pipelines, data cleaning, SQL databases, and REST APIs, and we figured out how to connect all of it into something a real user can actually interact with. We learned how to work under pressure, how to unblock each other when things broke, and honestly how to just keep moving and ship something.

But more than the technical stuff, we genuinely had a great time building this together: brainstorming what the product would do, debating what it should look like, and figuring out how all the pieces fit. It never really felt like a competition. It felt like a group of friends who really love malatang finally building the thing they always wished existed.

What's Next

Right now, Malatang Matcher is built for people like us, malatang obsessives who know exactly what they want in a bowl. But the matchmaking system we built is not limited to malatang. The preference-based scoring engine can be extended to any cuisine, any food experience. Our vision is to expand Malatang Matcher into a universal food matchmaking platform so that everyone, no matter what they are craving, can find their match!

Built with 🌶️ and way too much love for malatang.
