Inspiration

We are both aspiring data scientists hoping to make our mark in this rich, complex field. We feel that the connection between LLMs and data science has yet to be explored to its full potential, so we built Melody to explore it.

What it does

With the advent of MCP and the ability for a model to execute code of its own volition, we can empower an LLM to create ML and statistical models, generate visualizations, run statistical analyses, and perform the other key tasks of a data scientist. Melody pairs a low-latency conversational agent with a responsive text interface, so the user can communicate with the agent by voice or in writing.

How we built it

The core of Melody's backend is the Gemini API combined with a FastMCP server that exposes various data science tools to our agent, allowing it to pick which models, visualizations, or analyses to run based on the user's prompts and the model's own reasoning. What sets Melody apart, beyond its analyst capabilities, is a comprehensive text-to-speech pipeline built on ElevenLabs, which also inspired its name. For the frontend, we use Flask with HTML/CSS to provide a responsive UI and a crisp UX, making sure that any plots generated by the agent are rendered appropriately for the user.
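To make the idea of exposing tools to an agent concrete, here is a minimal sketch in plain Python. It is not Melody's code and it does not use the FastMCP API; the `TOOLS` registry, the `tool` decorator, `call_tool`, and the two example tools are hypothetical names chosen for illustration. It only shows the general pattern: functions are registered by name, and the agent picks one and supplies its arguments.

```python
import statistics
from typing import Callable, Dict, List

# Hypothetical stand-in for an MCP tool registry (not the FastMCP API).
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(func: Callable[..., object]) -> Callable[..., object]:
    """Register a function under its own name so an agent can call it by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def describe(values: List[float]) -> dict:
    """Return basic summary statistics for one numeric column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


@tool
def linear_fit(x: List[float], y: List[float]) -> dict:
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def call_tool(name: str, arguments: dict) -> object:
    """Dispatch a tool call of the form the agent would emit: a name plus arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


if __name__ == "__main__":
    # The agent decides which tool to run; here we hard-code one request.
    print(call_tool("linear_fit", {"x": [1, 2, 3], "y": [2, 4, 6]}))
```

In a real MCP setup the server library handles registration, schemas, and transport; the sketch only mirrors the name-plus-arguments dispatch that lets the model choose among tools.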

Challenges we ran into

We ran into quite a few difficulties with both MCP and ElevenLabs. On the Gemini/MCP side, we had issues with model memory: persisting user-supplied data and keeping the model aware of what the user had asked so far. On the ElevenLabs side, we had trouble making sure the audio sent back to the user played correctly. On top of that, our team members work on different operating systems, which made it challenging to get the code working across devices.

Accomplishments that we're proud of

We are happy to say that all of the above challenges were solved! We fixed the model persistence issues by saving server state and by forcing the model to call MCP tools that list the currently available datasets and models between answering user prompts. This way it always knows what is accessible, even if the dialogue in which a dataset or model was introduced is now outside the model's context window. We resolved the ElevenLabs issues simply by poring over the documentation and being rigorous about error-checking.
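The persistence approach described above can be sketched roughly as follows. This is an illustration under assumptions, not Melody's implementation: the `ServerState` class, its JSON file layout, and the `build_prompt` helper are hypothetical, and for simplicity the inventory is injected into the prompt directly rather than fetched through an MCP tool call.

```python
import json
from pathlib import Path
from typing import Dict, List


class ServerState:
    """Hypothetical registry of uploaded datasets and fitted models, saved to disk."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        self.datasets: Dict[str, dict] = {}
        self.models: Dict[str, dict] = {}
        # Reload earlier state so a restart does not lose what the user uploaded.
        if self.path.exists():
            saved = json.loads(self.path.read_text())
            self.datasets = saved.get("datasets", {})
            self.models = saved.get("models", {})

    def _save(self) -> None:
        payload = {"datasets": self.datasets, "models": self.models}
        self.path.write_text(json.dumps(payload))

    def add_dataset(self, name: str, columns: List[str]) -> None:
        self.datasets[name] = {"columns": columns}
        self._save()

    def add_model(self, name: str, kind: str, dataset: str) -> None:
        self.models[name] = {"kind": kind, "dataset": dataset}
        self._save()

    def list_resources(self) -> Dict[str, List[str]]:
        """What a 'list available datasets and models' tool could return."""
        return {"datasets": sorted(self.datasets), "models": sorted(self.models)}


def build_prompt(state: ServerState, user_message: str) -> str:
    """Put the current inventory in front of every user message, so the model
    sees it even when the original upload has left its context window."""
    inventory = state.list_resources()
    return (
        f"Available datasets: {inventory['datasets']}\n"
        f"Available models: {inventory['models']}\n"
        f"User: {user_message}"
    )
```

The design choice is that the server, not the conversation history, is the source of truth: the model's view of available data is rebuilt from saved state on every turn.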

We are also proud that we were able to implement the vision we had from the beginning without compromising on features or capabilities, thanks in large part to the tools and access that HackUMass provided to us.

What we learned

This project has been a significant learning experience for us. Although we have done a hackathon before, this is the most ambitious project we have taken on so far. It was both an opportunity to learn about state-of-the-art tools like ElevenLabs and MCP and a lesson in time management and dividing work efficiently. It also gave us enormous insight into what goes into building complex, production-level systems in which multiple third-party services must all work together.

What's next for MelodyAI

We want to push this vision of a pocket data scientist further. We plan to add more tools, enhance the UI, support more powerful agents with the ability to choose between them, add data transformation capabilities, and so much more!
