RAGtime Rhythms
🎵 Inspiration
As a software engineer by profession, I've always been fascinated by the intersection of technology and music.
Music is deeply interconnected, often in ways we don't immediately see. The inspiration for RAGtime Rhythms came from reading Will Smith's memoir, where he shared a story about how changing his cadence to match Rakim's helped him create a hit song. It made me wonder: does Rakim even know the impact he had on Will's music?
This is where the idea came from: while I recognize direct influences like covers and samples, what about the hidden connections? How does influence travel through time, genres, and collaborations? RAGtime Rhythms aims to surface these unseen links in music, providing a deeper understanding of how artists shape each other's sounds.
🎼 What it does
RAGtime Rhythms is an AI-powered music knowledge engine that I built to:
- Generate artist influence graphs to visualize how musicians connect through collaborations, samples, and shared genres
- Find hidden musical relationships by leveraging structured data from MusicBrainz and dynamic AI reasoning
- Allow users to explore genre evolution to see how different music styles influenced and fused into new ones
- Offer AI-driven recommendations when graph data is insufficient, ensuring a rich and interactive experience
🛠️ How I built it
I combined graph databases, music metadata, and AI-powered reasoning to create RAGtime Rhythms. My stack includes:
- MusicBrainz: my primary dataset, storing detailed metadata on artists, recordings, and genres
- ArangoDB: the graph database I used to model and query musical relationships
- NetworkX: enabled my graph analysis, such as finding paths between artists and visualizing influence networks
- LangChain: allowed me to dynamically generate and execute AQL queries and NetworkX algorithms, making my chatbot truly intelligent
An example of a tool I used to turn natural language into an AQL query:
@tool
def text_to_aql_to_text(query: str):
    """Converts a natural language query into an optimized AQL query and executes it with fuzzy search."""
    # Check for the closest entity match (fuzzy_search_entity is a helper defined elsewhere)
    closest_match = fuzzy_search_entity(query)
    if closest_match:
        print(f"Fuzzy Search Matched: {closest_match}")
        # Note: this swaps the entire query text for the matched entity name
        query = query.replace(query, closest_match)
    # Construct an optimized AQL prompt
    aql_prompt = f"""
    Convert the following natural language question into a structured AQL query:
    User Query: {query}
    - If the query involves an artist or genre, use fuzzy search to find the closest match.
    - Return only the AQL query itself (no explanations).
    Generated AQL Query:
    """
    # Generate AQL query using LLM (llm is a chat model created elsewhere)
    aql_query = llm.invoke(aql_prompt).content.strip()
Then, when I discovered the ArangoDB integration for LangChain, it turned into this:
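# Import paths vary between LangChain versions; these follow the langchain_community layout
from langchain_community.graphs import ArangoGraph
from langchain_community.chains.graph_qa.arangodb import ArangoGraphQAChain
from langchain_openai import ChatOpenAI

# Initialize LangChain ArangoGraph (db is an existing python-arango database handle)
arango_graph = ArangoGraph(db)
arango_graph.set_schema()  # Infer the graph schema from the database
llm = ChatOpenAI(temperature=0.2, model_name="gpt-4o")
chain = ArangoGraphQAChain.from_llm(llm, graph=arango_graph, verbose=True)

def get_artist_influence_graph(artist_name):
    """Fetches an artist's influence graph using ArangoGraphQAChain."""
    query = f"Find all artists and genres connected to {artist_name}."
    response = chain.run(query)
    if not response:
        return f"⚠️ No influence data found for '{artist_name}'."
    return response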
1. Cover Performances & First Performers
SELECT
    a1.name AS performer,
    a2.name AS first_performed_work,
    'first_performer' AS relationship_type
FROM musicbrainz.l_artist_work law
JOIN musicbrainz.artist a1 ON law.entity0 = a1.id
JOIN musicbrainz.l_recording_work lrw ON law.entity1 = lrw.entity1
JOIN musicbrainz.l_artist_recording lar2 ON lrw.entity0 = lar2.entity1
JOIN musicbrainz.artist a2 ON lar2.entity0 = a2.id
WHERE lrw.link IN (
    SELECT id FROM musicbrainz.link WHERE link_type IN (
        SELECT id FROM musicbrainz.link_type WHERE name = 'performance'
    )
)
LIMIT 200000;
Purpose:
- Identify artists who performed covers of existing works
- Track which artists originally popularized a song
- Analyze how a song's interpretation evolves over time
Insights I Expected:
- How often certain songs are covered and by whom
- Whether certain genres have more reinterpretations than others
- How covers shape an artist's influence (e.g., did an artist gain prominence by covering someone else?)
2. Artist-Genre Relationships
SELECT
    a.name AS artist_name,
    g.name AS genre_name,
    'belongs_to_genre' AS relationship_type
FROM musicbrainz.artist a
JOIN musicbrainz.l_artist_genre lag ON a.id = lag.entity0
JOIN musicbrainz.genre g ON lag.entity1 = g.id
LIMIT 200000;
Purpose:
- Establish a baseline for genre categorization of artists
- Compare artists with overlapping genre affiliations
- Provide data for recommendations: if an artist is linked to multiple genres, I can suggest similar artists from those genres
Insights I Expected:
- Which genres have the most diverse set of artists?
- Are there any artists associated with unexpected genres?
- Do certain genres dominate specific time periods?
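The recommendation idea from the Purpose list above can be sketched in plain Python. This is only an illustration under assumptions, not the project's code: the sample `rows`, the helper names `build_genre_index` and `similar_artists`, and the choice of Jaccard similarity are all hypothetical. It assumes the query output has been loaded as `(artist_name, genre_name)` pairs.

```python
from collections import defaultdict

# Hypothetical (artist_name, genre_name) rows shaped like the query output.
rows = [
    ("Artist A", "hip hop"),
    ("Artist A", "jazz"),
    ("Artist B", "hip hop"),
    ("Artist B", "jazz"),
    ("Artist C", "hip hop"),
    ("Artist D", "metal"),
]

def build_genre_index(rows):
    """Map each artist to the set of genres it is linked to."""
    genres_by_artist = defaultdict(set)
    for artist, genre in rows:
        genres_by_artist[artist].add(genre)
    return genres_by_artist

def similar_artists(artist, genres_by_artist, limit=5):
    """Rank other artists by Jaccard similarity of their genre sets."""
    target = genres_by_artist.get(artist, set())
    scored = []
    for other, genres in genres_by_artist.items():
        if other == artist:
            continue
        union = target | genres
        score = len(target & genres) / len(union) if union else 0.0
        if score > 0:
            scored.append((other, score))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]

index = build_genre_index(rows)
print(similar_artists("Artist A", index))
```

Artists with no genre in common are left out entirely, so an artist with sparse genre data simply gets fewer suggestions.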
3. Genre-to-Genre Influence
SELECT
    g1.name AS genre_1,
    g2.name AS genre_2,
    lt.name AS relationship_type
FROM musicbrainz.l_genre_genre lgg
JOIN musicbrainz.genre g1 ON lgg.entity0 = g1.id
JOIN musicbrainz.genre g2 ON lgg.entity1 = g2.id
JOIN musicbrainz.link l ON lgg.link = l.id
JOIN musicbrainz.link_type lt ON l.link_type = lt.id
WHERE lt.name IN ('subgenre', 'influenced by', 'fusion of')
LIMIT 50000;
Purpose:
- Map how different genres influence each other over time
- Show how subgenres branch out from parent genres
- Understand how musical styles merge to create hybrid genres
Insights I Expected:
- Which genres have the strongest influence on others?
- What are the most hybrid genres, combining elements from multiple styles?
- Can I identify genre shifts over time (e.g., jazz influencing hip-hop)?
Each of these queries gave me a different perspective on musical influence. Some focus on direct connections (collaborations, samples, covers), while others show broader patterns (genre evolution, remix culture). By combining these insights in a graph database, I built a richer picture of how music flows across time and styles.
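As a rough sketch of how rows from several queries could be merged into one graph, the snippet below loads `(source, target, relationship_type)` tuples into a NetworkX `MultiDiGraph`. The row values are placeholders and `build_graph` is a hypothetical helper, so treat it as one possible approach rather than the pipeline I actually ran.

```python
import networkx as nx

# Hypothetical rows in the (source, target, relationship_type) shape that the
# SQL queries produce. Names are placeholders, not real query output.
artist_genre_rows = [("Artist A", "Genre X", "belongs_to_genre")]
genre_genre_rows = [("Genre Y", "Genre X", "subgenre")]
cover_rows = [("Artist B", "Artist A", "first_performer")]

def build_graph(*row_sets):
    """Merge several edge lists into one directed multigraph."""
    graph = nx.MultiDiGraph()
    for rows in row_sets:
        for source, target, relationship_type in rows:
            # A MultiDiGraph keeps parallel edges, so two nodes can be
            # linked by more than one relationship type.
            graph.add_edge(source, target, relationship_type=relationship_type)
    return graph

graph = build_graph(artist_genre_rows, genre_genre_rows, cover_rows)
print(graph.number_of_nodes(), graph.number_of_edges())
```

Storing the relationship type as an edge attribute keeps the option of filtering by type later, for example to walk only genre-to-genre edges.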
Challenges I ran into
1. Sparse Graph Due to Limited Connections in l_artist_artist
Initially, when building the artist influence graph, I relied on the l_artist_artist table, which stores direct artist-to-artist relationships (parent, member of band, spouse). However, this approach resulted in a sparse graph with far fewer connections than I expected.
Solution:
- I came across a whitepaper titled "Analyzing Music Metadata on Artist Influence" by Marek Kopel, which proposed using samples and covers as a proxy for influence
- Inspired by this, I expanded the graph by incorporating sampling and cover performance data, which improved the density and interconnectedness of my graph
- This allowed me to track influence beyond direct collaborations, capturing how music flows between generations of artists
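The effect of adding sample and cover edges can be checked with a couple of NetworkX measures. The sketch below uses placeholder artists and edges (all hypothetical) and compares density and the number of weakly connected components before and after the extra edges are added.

```python
import networkx as nx

# Hypothetical edges; placeholder names only.
artists = ["Artist A", "Artist B", "Artist C", "Artist D"]
direct_relations = [("Artist A", "Artist B")]  # e.g. member of band
sample_and_cover_edges = [
    ("Artist C", "Artist A"),
    ("Artist D", "Artist B"),
    ("Artist D", "Artist C"),
]

# Graph built from direct artist-to-artist relations only.
sparse = nx.DiGraph()
sparse.add_nodes_from(artists)
sparse.add_edges_from(direct_relations)

# Same graph with sample and cover edges layered on top.
enriched = sparse.copy()
enriched.add_edges_from(sample_and_cover_edges)

for label, g in (("direct only", sparse), ("with samples/covers", enriched)):
    print(label, nx.density(g), nx.number_weakly_connected_components(g))
```

Fewer components means more artists can be reached from one another, which is what path-based questions depend on.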
2. Incomplete Artist-to-Genre Mapping
Another challenge I faced was linking artists to genres. The l_artist_genre table, which should provide these relationships, was poorly populated, making it difficult for me to categorize artists by genre.
Solution:
- I downloaded the full artist JSON data dump from MusicBrainz, where genres were stored within each artist object instead of in the table
- I then wrote a custom script to extract genre information from the JSON dump and insert missing data into the l_artist_genre table
- This greatly improved my genre mapping and allowed me to better understand artist-genre relationships
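A minimal sketch of that extraction step is below. It assumes the dump holds one JSON object per line and that each artist object carries a `genres` list whose entries have a `name` key; those field names, the sample lines, and the helper `iter_artist_genres` are assumptions to verify against the real dump, not a copy of my script.

```python
import io
import json

def iter_artist_genres(lines):
    """Yield (artist_id, artist_name, genre_name) from JSON-lines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        artist = json.loads(line)
        # Artists without genre data may have an empty or missing list.
        for genre in artist.get("genres") or []:
            yield artist.get("id"), artist.get("name"), genre.get("name")

# Hypothetical two-line sample standing in for the real dump file.
sample = io.StringIO(
    '{"id": "mbid-1", "name": "Artist A", "genres": [{"name": "jazz"}, {"name": "funk"}]}\n'
    '{"id": "mbid-2", "name": "Artist B", "genres": []}\n'
)
pairs = list(iter_artist_genres(sample))
print(pairs)
```

Reading line by line keeps memory use flat, which matters for a dump that covers every artist. The resulting tuples can then be inserted into whatever table or collection holds the artist-genre links.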
3. Generating High-Quality AQL Queries with the AI Agent
One of the major issues I encountered was getting my AI agent to generate accurate and useful AQL queries. Early iterations of the agent struggled with:
- Generating syntactically incorrect queries
- Using field names that didn't exist in the MusicBrainz schema
- Failing to return relevant results due to mismatches in entity names
Solution:
- I introduced fuzzy searching to help match artist and genre names even when there were spelling variations or inconsistent metadata
- I also allowed more flexibility in query structure, enabling the agent to make educated guesses instead of failing outright when exact matches weren't found
- Ultimately, I decided to use the ArangoDB integration for LangChain, which made things even easier
- These improvements significantly enhanced my AI's ability to interact with the database effectively, making it more resilient to errors
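To illustrate the fuzzy matching idea, here is a small sketch built on Python's standard `difflib` module. It is a stand-in technique, not the matcher used in the project: the helper `fuzzy_match`, the `0.8` cutoff, and the short list of known names are all assumptions chosen for the example.

```python
import difflib

def fuzzy_match(name, known_names, cutoff=0.8):
    """Return the closest known name, or None if nothing is close enough."""
    # Compare case-insensitively, then map back to the stored spelling.
    lowered = {known.lower(): known for known in known_names}
    matches = difflib.get_close_matches(
        name.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None

known = ["Rakim", "Will Smith", "Snoop Dogg"]
print(fuzzy_match("snop dog", known))
print(fuzzy_match("zzzz", known))
```

Returning `None` for weak matches lets the caller fall back to another strategy instead of querying the graph with the wrong entity.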
Accomplishments that I'm proud of
1. Building My First Graph Database 🎶
This was my first time working with a graph database, and I found it exciting to explore how artists, genres, and musical influences interconnect. I tried to stuff the graph with as many nodes and connections as possible!
2. Successfully Implementing the AI Agent 🤖
One of my biggest wins was getting the AI-powered chatbot agent to work effectively. It can:
- Generate dynamic AQL queries to explore my graph
- Run network analysis using NetworkX to uncover hidden connections
- Provide historical context when data isn't explicitly available in the database
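For a sense of what a generated graph query can look like, the sketch below builds an AQL traversal and its bind variables. The collection names `artists` and `influences`, the `relationship_type` attribute, and the helper `build_neighbor_query` are hypothetical; the real schema and the queries the agent produces may differ.

```python
def build_neighbor_query(start_id, edge_collection, max_depth=2):
    """Return an AQL traversal string and its bind variables."""
    if not isinstance(max_depth, int) or max_depth < 1:
        raise ValueError("max_depth must be a positive integer")
    # The depth is validated above and inlined. The start vertex and the
    # edge collection are passed as bind variables instead of being
    # formatted into the query text.
    query = (
        f"FOR v, e IN 1..{max_depth} ANY @start @@edges "
        "RETURN DISTINCT {name: v.name, relation: e.relationship_type}"
    )
    bind_vars = {"start": start_id, "@edges": edge_collection}
    return query, bind_vars

query, bind_vars = build_neighbor_query("artists/123", "influences")
print(query)
print(bind_vars)
# With python-arango, this pair would be run as:
#   cursor = db.aql.execute(query, bind_vars=bind_vars)
```

Keeping names in bind variables is one way to stop a misspelled or malicious artist name from changing the structure of the query.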
3. Discovering Genre Evolution Through the Graph 🎵
One of the most fascinating aspects of my project was seeing how genres evolve over time. For example:
- I found that pop has directly inspired K-pop, synthpop, and electropop
- Blues influenced rock, which in turn led to punk, metal, and alternative rock
- Hip-hop has branches in R&B, trap, and drill music
The graph made it easy for me to visualize how musical styles evolve and influence each other.
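The genre chains listed above can be explored with a directed graph. The sketch below encodes only the blues, rock, and pop examples from this section as edges and uses `nx.descendants` to collect everything a genre influenced, directly or indirectly; the helper name `influenced_by` and the edge direction are my own choices for the illustration.

```python
import networkx as nx

# Edges taken from the examples above, pointing from the influencing genre
# to the genre it influenced.
genre_graph = nx.DiGraph()
genre_graph.add_edges_from([
    ("blues", "rock"),
    ("rock", "punk"),
    ("rock", "metal"),
    ("rock", "alternative rock"),
    ("pop", "K-pop"),
    ("pop", "synthpop"),
    ("pop", "electropop"),
])

def influenced_by(graph, genre):
    """Return every genre reachable from `genre`, directly or indirectly."""
    return sorted(nx.descendants(graph, genre))

print(influenced_by(genre_graph, "blues"))
print(influenced_by(genre_graph, "pop"))
```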
4. Uncovering Unexpected Artist Connections 🤯
One of the longest influence paths I discovered was a connection from Beethoven to Snoop Dogg! Seeing how classical music indirectly shaped modern hip-hop through various generations of influence was really cool to me.
What I learned
1. Working with Graph Databases 🗺️
This was my first deep dive into graph databases, and it was eye-opening to see how nodes and edges can model real-world relationships. I found that graph databases excel at capturing complex relationships, which made them perfect for exploring musical influence and genre evolution. I learned how to structure queries efficiently, especially when working with ArangoDB to extract meaningful insights.
2. Using LangChain for AI-Powered Query Generation 🤖
LangChain made it possible for me to create an AI agent capable of generating dynamic queries and executing them against the graph database. I gained a better understanding of how to integrate natural language processing (NLP) with structured data retrieval. I also learned how to design tools for LLM agents that can generate AQL (Arango Query Language) queries dynamically while handling errors and fuzzy searches gracefully.
3. Graph Theory & Network Analysis
Using NetworkX, I was able to apply graph algorithms to explore paths of influence, artist connections, and genre evolution. I got hands-on experience using algorithms like shortest path search to discover hidden connections between artists. I found it fascinating to see how influence propagates through music history, whether through direct collaborations, samples, or shared genre traits.
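A shortest path search of this kind can be sketched with `nx.shortest_path`. The graph below uses placeholder artists and hypothetical edges, and `influence_path` is an illustrative wrapper rather than the function in my project.

```python
import networkx as nx

# Hypothetical influence edges; placeholder names, not data from the project.
influence = nx.Graph()
influence.add_edges_from([
    ("Artist A", "Artist B"),
    ("Artist B", "Artist C"),
    ("Artist C", "Artist D"),
    ("Artist A", "Artist E"),
])

def influence_path(graph, source, target):
    """Return the shortest chain of artists, or None if they are not connected."""
    try:
        return nx.shortest_path(graph, source=source, target=target)
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return None

print(influence_path(influence, "Artist E", "Artist D"))
print(influence_path(influence, "Artist A", "Unknown Artist"))
```

Catching both exceptions means a chatbot tool can report "no connection found" instead of crashing when an artist is missing from the graph.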
4. The Complexity of Music Metadata 🎼
One of my biggest takeaways was that music metadata is messy. Not every artist is well-categorized, and genres aren't always explicitly linked. To work around this, I had to enhance the dataset manually, using external sources to fill gaps where the structured data was incomplete. This process reinforced the importance of data preprocessing and augmentation in building robust applications.
What's next for RAGtime Rhythms
1. Interactive Music History Games 🎮
One of the most exciting next steps for me is turning RAGtime Rhythms into an interactive learning experience. Some game ideas I have include:
- Six Degrees of Music Separation: given two artists, players must find the shortest path connecting them based on samples, collaborations, or genre influences
- Genre Evolution Challenge: players must arrange genres in the order they influenced each other, uncovering how sounds evolved over time
- Who Sampled Who?: a trivia-style game where users guess which artist sampled a particular track or who originally performed a song
2. Expanding the Knowledge Graph
- I plan to incorporate more metadata sources (e.g., Spotify, Discogs, Genius annotations) to improve artist-genre connections and add more relationships
- I want to refine my fuzzy search techniques so the AI can handle variations in artist names more effectively
3. AI-Enhanced Insights 🤖
- I aim to make the AI agent more conversational, so it can provide deeper historical context on how artists or genres evolved over time
- I'm working on implementing graph-based recommendation models, allowing users to discover underrated artists based on their musical preferences
Built With
- arangodb
- networkx
- python

