Inspiration

Leveraging Large Language Models (LLMs) to enhance music discovery and playlist curation by understanding user moods and preferences.

What it does

Our team developed a backend FastAPI application featuring CRUD endpoints that connect to our MongoDB cluster to manage songs and playlists, among other operations. Additionally, we implemented an LLM chain that accepts parameters such as mood, energy level, and lyrical focus. This input is processed through a LangChain runnable composed of two chains:

  1. Formatting and Song Retrieval Chain: Formats the initial request and identifies relevant songs based on the user's criteria.
  2. External Search Chain: Utilizes tools to search the internet for relevant YouTube and Spotify links.

All data is returned as a formatted JSON object. Initially, we encountered significant challenges in ensuring the chain properly utilized the tools and accurately passed data between steps; extensive prompt engineering was required to resolve these issues. Moreover, earlier iterations allowed users to send any free-form message, making it difficult for the LLM to consistently find the correct links. By restructuring the input to be more structured and less open-ended, the LLM was able to use the tools effectively and retrieve the appropriate links.
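The two-chain pipeline described above can be sketched in plain Python. The function names and the fixed candidate song are illustrative stand-ins (the real app uses LLM-backed LangChain runnables and live search tools), but the shape of the flow — format the structured request, retrieve candidates, look up links, return JSON — is the same:

```python
import json

def format_and_retrieve(request: dict) -> dict:
    """Chain 1 (sketch): normalize the structured request and pick candidate songs."""
    prompt = (
        f"Find songs with mood={request['mood']}, "
        f"energy={request['energy']}, lyrical focus={request['lyrical_focus']}"
    )
    # An LLM call would go here; we return a fixed candidate for illustration.
    return {"prompt": prompt,
            "songs": [{"title": "Example Song", "artist": "Example Artist"}]}

def search_links(state: dict) -> dict:
    """Chain 2 (sketch): look up YouTube/Spotify links for each candidate song."""
    for song in state["songs"]:
        song["youtube"] = None  # filled in by a YouTube search tool in the real app
        song["spotify"] = None  # filled in by a Spotify search tool in the real app
    return state

def run_pipeline(request: dict) -> str:
    """Compose the two chains and return a formatted JSON string."""
    return json.dumps(search_links(format_and_retrieve(request)), indent=2)

result = run_pipeline({"mood": "calm", "energy": "low", "lyrical_focus": "nature"})
```

In LangChain itself this composition is typically written with the `|` operator between runnables; the stubbed functions here just make the data handoff between the two chains visible.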

How we built it

We built the application using the following technologies and methodologies:

  • Backend Framework: We chose FastAPI for its high performance and ease of building RESTful APIs.
  • Database: We utilized MongoDB for its flexibility in handling JSON-like documents, making it ideal for managing songs and playlists.
  • LLM Integration: We leveraged LangChain to create robust LLM chains that handle user input and data retrieval.
  • Prompt Engineering: We developed precise prompts to guide the LLM in processing user inputs and interacting with external tools effectively.
  • Tool Integration: We incorporated tools for searching YouTube and Spotify to fetch relevant links based on user preferences.
  • OAuth Implementation: We implemented OAuth on the frontend application to handle user authentication securely.
  • Component Refactoring: We refactored the modal component that makes requests to our LLM endpoints to use a Shadcn dropdown for structured inputs such as mood and lyrical focus, enhancing the user interface and input consistency.
  • JSON Formatting: We ensured all responses are returned as structured JSON objects for consistency and ease of use in frontend applications.
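The structured-input and JSON-formatting points above boil down to a fixed request shape. A minimal sketch follows, using a stdlib dataclass; in the real app these shapes are Pydantic models on the FastAPI endpoints, and the field names and dropdown options shown here are assumptions:

```python
from dataclasses import dataclass, asdict
import json

# Assumed dropdown options; the real Shadcn dropdown defines the actual set.
ALLOWED_MOODS = {"happy", "sad", "calm", "energetic"}

@dataclass
class PlaylistRequest:
    """Illustrative request shape for the LLM endpoint."""
    mood: str            # constrained by the dropdown
    energy: str          # e.g. "low" | "medium" | "high"
    lyrical_focus: str   # short topic, e.g. "nature"

    def validate(self) -> None:
        # Rejecting unknown values keeps the LLM's input space predictable.
        if self.mood not in ALLOWED_MOODS:
            raise ValueError(f"unknown mood: {self.mood!r}")

req = PlaylistRequest(mood="calm", energy="low", lyrical_focus="nature")
req.validate()
payload = json.dumps(asdict(req))  # the structured JSON body sent to the backend
```

Constraining the fields at the edge like this is what lets the downstream prompt stay identical in structure from request to request.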

Challenges we ran into

  • LLM Tool Utilization: Initially, the LLM struggled to effectively use the integrated tools, leading to inaccurate or irrelevant link retrievals.
  • Data Passing Issues: Ensuring that data was correctly passed between chains required meticulous debugging and prompt adjustments.
  • User Input Variability: Allowing users to send any message made it difficult for the LLM to maintain consistency in finding the correct links.
  • Prompt Engineering: Developing prompts that were both effective and efficient required extensive trial and error.
  • Integration Complexity: Coordinating between FastAPI, MongoDB, LangChain, and external tools added layers of complexity to the project.
  • Cloud Integration: Moving services to a cloud environment introduced latency and performance issues, requiring us to optimize database connections and reduce the size of data transferred between components to ensure smooth operations.
  • Database Schema Formation: Defining a schema that supports both the current playlist structure and future scalability presented challenges. We had to balance flexibility for new features (such as mood-based song recommendations) with maintaining query performance.
  • Next.js Frontend Changes: The frontend integration posed challenges when restructuring the interface to accommodate structured inputs. Implementing state management for the complex LLM responses required significant refactoring in the Next.js application, particularly with React hooks and component re-renders.
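The user-input-variability challenge above was what drove the move to structured prompts. The template below is a hypothetical reconstruction (not the project's actual wording): instead of forwarding an arbitrary user message, the backend fills a fixed template from the dropdown selections, so the LLM always sees the same fields in the same order:

```python
# Illustrative prompt template; the project's actual prompt differs.
PROMPT_TEMPLATE = (
    "You are a music curator. Suggest songs that match ALL of the following "
    "criteria, then return ONLY a JSON array of "
    '{{"title": ..., "artist": ...}} objects.\n'
    "Mood: {mood}\n"
    "Energy level: {energy}\n"
    "Lyrical focus: {lyrical_focus}\n"
)

def build_prompt(mood: str, energy: str, lyrical_focus: str) -> str:
    """Fill the fixed template from the structured dropdown values."""
    return PROMPT_TEMPLATE.format(
        mood=mood, energy=energy, lyrical_focus=lyrical_focus
    )

prompt = build_prompt("calm", "low", "nature")
```

Because the template never changes, prompt-engineering effort goes into one string rather than into coping with every possible free-form message.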

Accomplishments that we're proud of

  • Robust Backend Development: Successfully built a scalable FastAPI backend with comprehensive CRUD operations connected to MongoDB.
  • Effective LLM Integration: Developed a functional LLM chain that accurately processes user inputs and retrieves relevant music links.
  • Overcoming Technical Hurdles: Solved complex issues related to tool utilization and data passing through advanced prompt engineering.
  • Structured User Input Handling: Implemented a more structured input system that significantly improved the LLM's performance in link retrieval.
  • Seamless Tool Integration: Integrated YouTube and Spotify search tools effectively, enhancing the overall user experience.
  • OAuth Implementation: Successfully integrated OAuth on the frontend to secure user authentication.
  • UI Enhancements: Refactored the modal component to use a Shadcn dropdown, improving the user interface and input reliability.
  • Cloud Integration: Overcame performance challenges introduced by cloud services to ensure seamless communication between our FastAPI backend, MongoDB, and the LLM chains.

What we learned

  • Importance of Prompt Engineering: Crafting precise prompts is crucial for guiding LLMs to perform desired tasks accurately.
  • Structured Inputs Enhance Performance: Providing more structured and well-defined inputs can significantly improve the reliability of LLM outputs.
  • Integration Challenges: Combining multiple technologies and tools requires careful planning and iterative testing to ensure seamless operation.
  • Cloud Service Optimization: Migrating to cloud services requires not only architectural changes but also optimization to prevent latency and resource bottlenecks.
  • Database Schema Design: Designing a schema that supports both flexibility for future features and efficient query performance requires careful consideration.
  • State Management in Next.js: Handling dynamic responses from LLMs in the frontend requires robust state management and React hook expertise.

What's next for Discofy

  • Frontend Development: Continuing to build out the Next.js interface for interacting with the backend services and displaying curated playlists.
  • Enhanced Personalization: Incorporating more user-specific data to further tailor music recommendations.
  • Expanded Tool Integration: Adding support for additional music platforms and search tools to broaden the range of available links.
  • Scalability Improvements: Optimizing the backend and database to handle increased user load and data volume.
  • Advanced Analytics: Implementing analytics to gain insights into user preferences and system performance.
  • Continuous Prompt Optimization: Refining prompts to further enhance the accuracy and relevance of LLM-generated outputs.
  • Feature Enhancements with MongoDB: Exploring additional MongoDB capabilities, such as aggregation pipelines, to leverage the database further.
  • Agent with Node Graphs (Future Implementation): Planning a graph-based agent that can route between multiple tools, enhancing the LLM's ability to navigate and retrieve accurate information.

Built With

  • FastAPI
  • LangChain
  • MongoDB
  • Next.js
  • Python
  • Shadcn
