Inspiration
We drew inspiration from the engaging and addictive nature of music in interactive games like Piano Tiles and Guitar Hero, where players are immersed in dynamic musical experiences that naturally introduce them to new songs and genres through various levels and challenges. Additionally, we were inspired by Spotify's Daily Mixes, which curate personalized playlists based on users' listening habits and preferences, leveraging data-driven algorithms to offer a highly customized music discovery experience. By combining these elements, we aim to make music discovery on TikTok both interactive and personalized, enhancing user engagement and fostering a deeper connection to diverse musical content.
What it does
Our solution aligns with the goals of building a simple music discovery feature and enhancing the artist/fan community by utilizing the Music and Artist Discovery feature and the Daylist. These tools create an intuitive way for users to discover new and diverse music, supporting increased exposure, content diversity, and the viral spread of music. Additionally, the Artist Discovery feature highlights new artists and showcases those followed by friends, while the Music Tiles game engages users interactively. This fosters a sense of community by connecting users with new artists and their friends' musical interests.
How we built it
Database:
We used MongoDB Atlas as our database, taking advantage of its cluster creation, replication, and sharding capabilities for high availability and scalability. The database structure is inspired by Spotify's Web API.
Tables and attributes:
Artist Table:
_id, SpotifyId, Username, popularity_score, genres, spotify_follower_count, tiktok_follower_count, total_views_count, total_likes_count, total_share_count, total_video_count
Games Table:
_id, Track_ID, Track_Name, Artist, Scores
User Table:
_id, username, bio, spotify_id, following_friends, following_artists, liked, preference
Videos Table:
_id, Track_ID, Track_Name, Acousticness, Danceability, Energy, Instrumentalness, Liveness, Loudness, Speechiness, Tempo, Valence, Duration_ms, Creator_Artist_IDs, Popularity, Creation_Time, Description, Like_Count, Share_Count, View_Count, CDN_URL, Genres, youtube_link
Each table and its attributes serve a specific purpose, supporting the storage and retrieval of data about artists, games, users, and videos in our MongoDB Atlas database.
Some important terminology:
Genre map:
A 2D array organizing similar genres into cohesive groups for streamlined categorization and analysis.
New artist popularity score:
This custom metric, nicknamed "STanRi's measure of popularity", raises the visibility of emerging artists based on their content quality relative to their follower count. It assigns higher scores to artists with fewer followers but exceptional content, moderate scores to established artists with average content, and lower scores to artists with subpar content.
Formula:
(1.5 * total_likes_count + 2 * total_views_count + 1.8 * total_share_count) / (3 * (spotify_follower_count + tiktok_follower_count) + 3 * popularity_score)
Rationale for Weight Assignments:
- Total Likes Count (Weight: 1.5): Likes are given less weight because not everyone who enjoys content actually clicks the like button. It's a measure of appreciation but not an absolute indicator of quality or enjoyment.
- Total View Count (Weight: 2): Views are weighted higher because they directly reflect the popularity and reach of the content. Higher views generally suggest higher quality and more widespread appeal.
- Total Shares (Weight: 1.8): Shares carry more weight than likes but less than views. Sharing indicates a higher level of appreciation as users are willing to endorse the content with their networks, although not all viewers share content they enjoy.
- Total Followers Count (Spotify and TikTok) (Weight: 3): Followers on both platforms are heavily weighted because they signify the artist's existing fan base and potential reach. Artists with more followers are likely more established, so this term sits in the denominator with a high weight to bring the total score down.
- Popularity Score (Spotify) (Weight: 3): The Spotify popularity score is weighted the same as the total followers count. A high score indicates broader popularity among listeners, so it also sits in the denominator with a high weight to bring the total score down.
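As a minimal sketch, the formula above can be written as a small Python function. The dictionary keys mirror the Artist table attributes; the function name and the zero-denominator guard are assumptions added for illustration, not part of the original implementation.

```python
def new_artist_popularity_score(artist):
    """Engagement divided by existing reach, using the weights from the formula."""
    engagement = (
        1.5 * artist["total_likes_count"]
        + 2.0 * artist["total_views_count"]
        + 1.8 * artist["total_share_count"]
    )
    reach = (
        3.0 * (artist["spotify_follower_count"] + artist["tiktok_follower_count"])
        + 3.0 * artist["popularity_score"]
    )
    # Assumption: an artist with no followers and no Spotify popularity scores 0.
    if reach == 0:
        return 0.0
    return engagement / reach


# Two hypothetical artists with identical engagement but different reach.
emerging = {
    "total_likes_count": 1000, "total_views_count": 5000, "total_share_count": 200,
    "spotify_follower_count": 100, "tiktok_follower_count": 200, "popularity_score": 10,
}
established = dict(emerging, spotify_follower_count=100_000,
                   tiktok_follower_count=200_000, popularity_score=80)

print(new_artist_popularity_score(emerging) > new_artist_popularity_score(established))
```

With equal engagement, the artist with the smaller follower base receives the higher score, which is the behaviour the metric is meant to produce.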
Bootstrap resampling: Bootstrap resampling is a statistical method used to estimate the distribution of a statistic by iteratively sampling with replacement from the original dataset. This approach enables the empirical determination of uncertainty or variability in a statistic, without relying on assumptions about the data's underlying distribution.
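To make the definition concrete, here is a small sketch of bootstrap resampling applied to the mean of a sample. The function names, the number of resamples, and the fixed seed are illustrative assumptions; this is not the project's own code.

```python
import random
import statistics


def bootstrap_means(data, n_resamples=1000, seed=0):
    """Resample `data` with replacement and return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    size = len(data)
    return [statistics.fmean(rng.choices(data, k=size)) for _ in range(n_resamples)]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Empirical interval taken directly from the sorted bootstrap statistics."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Hypothetical observations of one audio feature for a user's liked tracks.
observed = [0.2, 0.4, 0.6, 0.8]
means = bootstrap_means(observed, n_resamples=500)
low, high = percentile_interval(means)
print(low, high)
```

The spread of the resampled means gives an empirical estimate of the uncertainty in the statistic without assuming a particular distribution for the data.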
We have developed two key components to enhance music discovery on TikTok, specifically for emerging artists:
- Music and Artist Discovery through Upcoming Artist Recommendations and the Daylist Feature:
It showcases recommended new artists and a leaderboard of emerging talent, ranked by the New artist popularity score. Additionally, it highlights artists followed by friends but not yet followed by the user.
- Daylist Feature: The Daylist feature helps users discover diverse content on the platform.
- Daylist Algorithm:
- Objective:
Curate a diverse list of 10 songs: 3 from the user's preferred genre, 4 from genres similar to their preferences, and 3 from unrelated genres.
- Procedure:
Calculate an embedding whose four elements are the target acousticness, danceability, energy, and instrumentalness.
Factors contributing to target acousticness: Instrumentalness, Liveness, Speechiness
Factors contributing to target danceability: Tempo, Energy, Valence, Loudness
Factors contributing to target energy: Loudness, Tempo, Valence, Liveness, Speechiness
Factors contributing to target instrumentalness: Speechiness, Liveness
Compute each target characteristic as a weighted sum of its contributing factors, with the weights estimated using bootstrap resampling.
- Formula:
target characteristic = sum of (weight * factor) over its contributing factors
We use this method to assign weights because the influence of each factor on a user's characteristic features evolves over time. Therefore, it is impractical to predetermine the values.
Determine the Euclidean distance between these target features and the features of a track. The track with the smallest distance is considered most similar to the target characteristics.
Extract the genre of these tracks, use a genre map to identify similar and dissimilar genres, and curate the playlist using the Spotify recommendation API by passing the genres and target characteristics of the songs.
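The weighted-sum and distance steps of the Daylist procedure can be sketched as follows. The weights shown are placeholders (the write-up estimates them with bootstrap resampling), and the function names and sample tracks are hypothetical.

```python
import math

# Order of the four target characteristics in the embedding.
FEATURES = ("acousticness", "danceability", "energy", "instrumentalness")


def weighted_sum(factors, weights):
    """Target characteristic = sum of weight * factor over its contributing factors."""
    return sum(weights[name] * value for name, value in factors.items())


def closest_track(target, tracks):
    """Return the track whose features have the smallest Euclidean distance to `target`."""
    return min(tracks, key=lambda t: math.dist(target, [t[f] for f in FEATURES]))


# Placeholder weights for the factors behind target danceability (assumed values).
danceability = weighted_sum(
    {"tempo": 0.6, "energy": 0.8, "valence": 0.7, "loudness": 0.5},
    {"tempo": 0.3, "energy": 0.3, "valence": 0.2, "loudness": 0.2},
)

target = [0.2, danceability, 0.7, 0.1]
tracks = [
    {"name": "Track A", "acousticness": 0.9, "danceability": 0.2,
     "energy": 0.3, "instrumentalness": 0.8},
    {"name": "Track B", "acousticness": 0.25, "danceability": 0.7,
     "energy": 0.7, "instrumentalness": 0.1},
]
print(closest_track(target, tracks)["name"])
```

The genre of the closest track would then feed the genre map lookup and the request to the Spotify recommendation API described above.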
- Algorithm for Recommending New Artists:
- Objective: Recommend new and upcoming artists based on the user's genre preferences.
- Procedure:
Calculate the closest genres using the same approach as for the Daylist feature.
Based on the user's genre preferences and the new artist popularity score, recommend the top 10 emerging artists.
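A rough sketch of the recommendation step, assuming the genre map is a 2D list of related genres and that each artist record already carries its New artist popularity score in a hypothetical `score` field. The genre names and function names are illustrative only.

```python
# Hypothetical genre map: each inner list groups similar genres.
GENRE_MAP = [
    ["pop", "dance pop", "electropop"],
    ["rock", "indie rock", "alt rock"],
    ["hip hop", "rap", "trap"],
]


def similar_genres(genre, genre_map):
    """Other genres that share a group with `genre` in the genre map."""
    for group in genre_map:
        if genre in group:
            return [g for g in group if g != genre]
    return []


def recommend_new_artists(preferred_genres, artists, genre_map, limit=10):
    """Top `limit` artists in preferred or similar genres, ranked by score."""
    wanted = set(preferred_genres)
    for genre in preferred_genres:
        wanted.update(similar_genres(genre, genre_map))
    matches = [a for a in artists if wanted.intersection(a["genres"])]
    matches.sort(key=lambda a: a["score"], reverse=True)
    return matches[:limit]


artists = [
    {"name": "A", "genres": ["electropop"], "score": 5.0},
    {"name": "B", "genres": ["rock"], "score": 9.0},
    {"name": "C", "genres": ["pop"], "score": 7.0},
]
print([a["name"] for a in recommend_new_artists(["pop"], artists, GENRE_MAP)])
```

Artist B has the highest score but falls outside the user's genre group, so it is filtered out before ranking.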
- Piano Tiles
The Piano Tiles feature offers users an engaging and interactive way to enjoy songs from their Daylist, new artists, and their favorite artists through a dynamic game format. As users play, the speed of the tiles increases progressively, enhancing the challenge and excitement. A leaderboard ranks players based on their high scores for each song, fostering a competitive and enjoyable experience.
- Game Rules:
- Starting the Game:
Users select a song from their Daylist, a new artist, or their favorite artist to begin playing.
The game starts at a moderate speed, which gradually increases as the game progresses.
- Scoring:
Users must click on the moving tiles in sync with the music to score points.
Each successful tile click increases the user's score.
- Tile Interaction:
Clicking directly on a tile earns points. The more accurately users click on the tiles, the higher their score will be.
If users click outside of a tile, they lose the game immediately.
- Speed Progression:
The speed of the tiles increases incrementally based on the duration of play. The longer the user plays, the faster the tiles move, requiring quicker reflexes and sharper focus.
- Leaderboard:
A leaderboard displays user rankings for each song based on their high scores.
Users can view their position relative to other players, motivating them to improve their scores and climb the ranks.
- Game Over Conditions:
The game ends if a user misses a tile or clicks outside of a tile.
- High Score and Progress Tracking:
The game tracks high scores for each song, allowing users to see their best performances.
Progress is saved so users can aim to beat their previous high scores in subsequent gameplay sessions.
- Game Development Logic:
- Loading Audio
First, we load the audio file. This means decoding the MP3 file into a format that the computer can understand and work with.
- Onset Strength
We calculate the onset strength of the audio signal. Onset strength measures sudden changes or peaks in the audio, which often indicate the start of a musical note.
- Onset Detection Parameters
There are several settings we adjust to find these onsets:
pre_max and post_max: These set how far before and after a point in time we look when checking that the point is a local maximum of the onset strength.
pre_avg and post_avg: These set how far before and after a point we look when computing the local average, which smooths out the changes in sound so that we do not pick up too many small variations.
wait: This setting makes sure we don't pick up onsets that are too close together in time.
delta: This sets a threshold for how far the onset strength must rise above the local average to count as an onset.
- Initial Onsets
Using the onset strength, we identify the initial times where a musical note might be starting. These are our first guesses at where the notes begin.
- Peak Picking
To refine our initial guesses, we look for peaks in the onset strength. These peaks are the strongest indications that a musical note is starting, and we focus on them to pinpoint the exact times.
- Filtering
We make sure that the detected onsets are spaced apart by at least a minimum interval of time. This filters out rapid changes or noise in the audio signal that aren't actual musical notes.
- Output
Finally, we get a list of times where we believe significant musical notes begin in the audio. These times are where the music starts to change or a new note is played, based on our analysis of the audio signal.
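The peak-picking and filtering steps can be illustrated with a simplified, pure-Python sketch that operates on a precomputed list of onset-strength values. The parameter names follow the text above; the window handling, the default values, and the `hop_length` and `sample_rate` used to convert frames to seconds are assumptions, and this is not the audio library routine the project relied on.

```python
def pick_onsets(strength, pre_max=1, post_max=1, pre_avg=2, post_avg=2,
                delta=0.1, wait=2):
    """Return frame indices that look like note onsets in `strength`."""
    onsets = []
    last = None
    total = len(strength)
    for i, value in enumerate(strength):
        # 1. The frame must be the maximum of its surrounding window.
        window_max = strength[max(0, i - pre_max):min(total, i + post_max + 1)]
        if value < max(window_max):
            continue
        # 2. It must exceed the local average by at least `delta`.
        window_avg = strength[max(0, i - pre_avg):min(total, i + post_avg + 1)]
        if value < sum(window_avg) / len(window_avg) + delta:
            continue
        # 3. It must be more than `wait` frames after the previous onset.
        if last is not None and i - last <= wait:
            continue
        onsets.append(i)
        last = i
    return onsets


def frames_to_times(frames, hop_length=512, sample_rate=22050):
    """Convert frame indices to seconds (assumed hop length and sample rate)."""
    return [frame * hop_length / sample_rate for frame in frames]


# Hypothetical onset-strength envelope with two clear peaks.
envelope = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.8, 0.85, 0.1, 0.0, 0.0, 0.0]
onset_frames = pick_onsets(envelope)
print(onset_frames, frames_to_times(onset_frames))
```

Each returned time would correspond to one tile in the game, so tightening `delta` or `wait` yields fewer, more widely spaced tiles.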
Architecture:
Database:
We host our database on a private MongoDB Atlas cluster for its scalability and availability. This setup uses sharding and replication to distribute data across multiple servers, providing horizontal scaling and redundancy. MongoDB Atlas offers monitoring through its web interface, with real-time metrics such as current database size, the number of reads and writes within a specified timeframe, and the number of active connections.
Security is a paramount consideration. MongoDB Atlas lets us whitelist the specific IP addresses that can access the database, ensuring controlled access, and user isolation further strengthens our security posture by segregating user data access.
The NoSQL nature of MongoDB, which stores JSON documents, fits our data structure requirements: we can store and query data in its native JSON format, which streamlines data handling and manipulation. The tables we created are designed for flexibility and ease of implementation, enabling efficient data manipulation and retrieval.
Our database connection layer uses the singleton pattern, so every API call within a session reuses a single, consistent database connection. This prevents exhaustion of the database connection pool and keeps performance and reliability stable. Overall, MongoDB Atlas gives us a robust, secure, and efficient database solution tailored to our needs.
Middleware:
For each of our functionalities, we have implemented REST APIs, enabling a microservices architecture. Each component (artists, videos, and users) has its own dedicated API, promoting modularity and ease of maintenance. We enforce a global rate limit of 1000 API hits per 15 minutes to ensure fair usage and prevent abuse. To secure our APIs, we have protected the headers and implemented selective CORS, allowing access only from specific IP addresses, ports, and content types. All sensitive credentials, such as the database URL, username, password, and API keys, are stored in environment files to enhance security. Additionally, we have implemented response caching to minimize redundant database queries, improving performance and reducing load on the database.
FrontEnd:
To fully leverage microfrontend architecture, we have ensured that no global variables are used across our UI screens. Each UI screen is designed to be self-contained, with its own isolated state and logic. This approach enhances modularity, making it easier to develop, test, and maintain individual components independently. By keeping UI screens self-contained, we also improve the scalability and flexibility of our frontend, allowing different teams to work on separate parts of the application without causing conflicts or dependencies on shared state.
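Returning to the middleware layer, the global rate limit of 1000 API hits per 15 minutes could be sketched as a fixed-window counter. This is an illustrative Python sketch only; the class name and the explicit `now` argument are assumptions, and the project's own middleware may work differently.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` requests in each window of `window_seconds`."""

    def __init__(self, limit=1000, window_seconds=15 * 60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._window_start = None
        self._count = 0

    def allow(self, now=None):
        """Return True if a request arriving at time `now` is within the limit."""
        if now is None:
            now = time.monotonic()
        # Start a fresh window on the first request or once the old one has expired.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._count = 0
        if self._count >= self.limit:
            return False
        self._count += 1
        return True


# Small limit and window so the behaviour is easy to follow.
limiter = FixedWindowRateLimiter(limit=3, window_seconds=10)
print([limiter.allow(now=t) for t in (0, 1, 2, 3, 10)])
```

A fixed window is the simplest variant; a sliding window or token bucket would smooth out bursts at window boundaries at the cost of more bookkeeping.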
Outcomes
Our solution enhances exposure for emerging artists through the Music and Artist Discovery feature, which recommends new artists based on user preferences and friend networks. This increases the visibility of independent musicians and helps artists with good content widen their audience through the new artists leaderboard. As a result, emerging artists gain more exposure, leading to greater recognition and potential career opportunities. Additionally, the Daylist feature curates a diverse list of songs from various genres, both related and unrelated to user preferences, ensuring that users are introduced to a broader range of music. This promotes a more diverse and personalized content consumption experience, allowing users to explore different genres and styles they might not typically encounter.
To boost viral potential and engagement, the Piano Tiles game offers a dynamic and engaging way to discover music. By playing the game and sharing their scores, users contribute to the viral spread of songs and artists. The interactive nature and competitive element of the game encourage users to share and promote new music, increasing the chances of songs going viral. Furthermore, the social aspects of the Artist Discovery feature, such as seeing what friends are following, foster active engagement and interaction with the content. This deepens users' connection to the music and artists, building a stronger community around shared musical interests and enhancing their overall connection to the platform.
Tools
React.js, MongoDB, Express, Node.js, Python, Spotify Web API, YouTube API, JavaScript
Challenges we ran into
Music Integration for Piano Tiles:
- We faced difficulties in transmitting the music files through the API.
- Ensuring that the files were correctly formatted and processed for the game required extensive troubleshooting.
Lack of Available Data:
- Despite utilizing the Spotify Web API, there were limitations on the amount of data we could access.
Navigating Through Rate Limits of API:
- The YouTube API imposed limits on the number of requests we could make within a certain timeframe.
- This necessitated careful planning to avoid exceeding these limits, which could result in temporary access restrictions.
Hosting Database on MongoDB:
- Ensuring the security of our database, including data encryption, access controls, and regular backups, was critical.
Working in Parallel:
- Merging code from different team members without causing conflicts or introducing bugs was challenging.
- We employed version control systems and continuous integration practices to mitigate these issues.
Accomplishments that we're proud of
Our solution enhances exposure for emerging artists through the Music and Artist Discovery feature, which recommends new artists based on user preferences and friend networks. This increases the visibility of independent musicians and helps artists with good content widen their audience through the new artists leaderboard. As a result, emerging artists gain more exposure, leading to greater recognition and potential career opportunities.
Additionally, the Daylist feature curates a diverse list of songs from various genres, both related and unrelated to user preferences, ensuring that users are introduced to a broader range of music. This promotes a more diverse and personalized content consumption experience, allowing users to explore different genres and styles they might not typically encounter.
To boost viral potential and engagement, the Piano Tiles game offers a dynamic and engaging way to discover music. By playing the game and sharing their scores, users contribute to the viral spread of songs and artists. The interactive nature and competitive element of the game encourage users to share and promote new music, increasing the chances of songs going viral. Furthermore, the social aspects of the Artist Discovery feature, such as seeing what friends are following, foster active engagement and interaction with the content. This deepens users' connection to the music and artists, building a stronger community around shared musical interests and enhancing their overall connection to the platform.
What we learned
Designing a Distributed System from Scratch
Embarking on the journey of designing a distributed system from the ground up provided us with invaluable insights and practical experience. We learned how to architect a distributed system by breaking down the application into microservices, ensuring that each service can operate independently yet cohesively. This involved making critical decisions on data storage strategies. We focused on scalability and fault tolerance, implementing mechanisms such as replication, sharding, and load balancing to handle increased load. Managing data consistency across distributed components introduced us to various consistency models and strategies for data synchronization.
Creating REST APIs in Python
Developing RESTful APIs in Python was a critical aspect of our project, enhancing our understanding of backend development and API design. Hands-on experience with popular Python frameworks such as Flask and FastAPI provided the tools and libraries needed to create robust and performant APIs. We learned to structure our code efficiently, handle routing, and manage middleware.
Audio Processing in Python
We learned how to handle various audio file formats using libraries like pydub and librosa, enabling us to read, write, and convert between formats to ensure compatibility with our application. Extracting meaningful features from audio files, such as tempo, pitch, loudness, and spectral characteristics, was crucial for tasks like music recommendation and analysis.
What's next for thic-tok-music
Future Scope
Containerization of our Microservices: Implementing containerization to package our microservices with their dependencies, ensuring consistent environments across different stages of development and deployment.
Monitoring of APIs: Establishing robust monitoring solutions for our APIs to track performance, detect anomalies, and ensure the reliability and availability of our services.
MicroFrontend Patterns: Exploring microfrontend architecture to break down the frontend into smaller, independent pieces, allowing teams to work on different parts of the application concurrently and improve scalability.
Load Balancing using NGINX Reverse Proxy: Using NGINX as a reverse proxy to distribute incoming traffic evenly across our microservices, improving the overall performance and reliability of our system.
Built With
- express.js
- javascript
- mongodb
- node.js
- python
- react.js
- spotify-webapi
- youtube
