We went through ten or fifteen different ideas but couldn't quite shake this one: taking photos and converting them to music. All of us felt drawn to it, whether through an interest in computational photography and the features we could extract from images, or in music and how we could select songs based on pitch variation, energy, and valence (musical positiveness). Perhaps the most distinctive part of our project is the user experience, which simply consists of sending images from one's phone via text, where photos usually live in the first place. No additional app installation or user entry needed!
What it does
Our app uses Twilio to let users send photos to our server, which parses each image and derives features to correlate against accumulated Spotify song data. Using features ranging from the saturation and lightness of the image's color profile to sentiment analysis on keywords from object detection, we determined rules that map these to song features like danceability, valence, energy, and mode. The matched songs form a playlist that reflects the original image - for instance, more saturated images map to more "danceable" songs, and images whose keywords have a higher sentiment magnitude map to higher "energy" songs. After texting the photo, the user gets back a Spotify playlist containing these songs.
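To make the mapping idea concrete, here is a minimal sketch of how rules like these could look, not our exact implementation. It assumes dominant colors arrive as RGB triples with a weight (for example, a pixel fraction) and that song features sit on a 0 to 1 scale. The names `image_targets`, `pick_songs`, and `max_magnitude` are hypothetical, and the lightness-to-valence rule is an illustrative assumption; only the saturation-to-danceability and magnitude-to-energy rules come from the description above.

```python
import colorsys
from dataclasses import dataclass


@dataclass
class Song:
    title: str
    danceability: float  # 0..1
    energy: float        # 0..1
    valence: float       # 0..1


def image_targets(colors, sentiment_magnitude, max_magnitude=5.0):
    """Turn image features into target song features.

    colors: list of ((r, g, b), weight) with r, g, b in 0..255.
    max_magnitude is an assumed cap used to squash magnitude into 0..1.
    """
    total = sum(weight for _, weight in colors)
    saturation = lightness = 0.0
    for (r, g, b), weight in colors:
        # colorsys returns (hue, lightness, saturation), each in 0..1
        _, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        saturation += s * weight / total
        lightness += l * weight / total
    return {
        "danceability": saturation,  # more saturated -> more danceable
        "energy": min(sentiment_magnitude / max_magnitude, 1.0),
        "valence": lightness,        # assumption: lighter -> more positive
    }


def pick_songs(targets, songs, k=2):
    """Return the k songs closest to the targets (squared distance)."""
    def distance(song):
        return sum((getattr(song, name) - value) ** 2
                   for name, value in targets.items())
    return sorted(songs, key=distance)[:k]
```

In practice the chosen songs would then be added to a playlist, and the thresholds and weights would need tuning against real photos.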
How we built it
Our app uses Twilio to handle SMS messaging (both sending images to the server and sending links back to the user). To handle vision and NLP parsing, we used Google Cloud APIs within our Flask app. Specifically, we used the Google Cloud Vision API to extract object names and color profiles from images, and the Google Cloud Natural Language API to run sentiment analysis on the extracted labels and determine the overall mood of an image. For music data, we wrote scripts against the Spotify API to accumulate song data and create playlists.
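The messaging round trip can be sketched without any framework. Twilio posts incoming messages to a webhook as form-encoded fields, including `NumMedia` and `MediaUrl0`, `MediaUrl1`, and so on for attached images, and accepts a TwiML `<Response>` containing a `<Message>` as the reply. The helper names `extract_media_urls` and `twiml_reply` below are hypothetical; in a Flask app the same fields would typically be read from `request.form`. This is a simplified sketch of the idea, not our production handler.

```python
from urllib.parse import parse_qs
from xml.etree import ElementTree as ET


def extract_media_urls(form_body):
    """Pull image URLs out of a form-encoded Twilio webhook body."""
    params = parse_qs(form_body)
    count = int(params.get("NumMedia", ["0"])[0])
    return [params[f"MediaUrl{i}"][0]
            for i in range(count)
            if f"MediaUrl{i}" in params]


def twiml_reply(text):
    """Build a TwiML document that texts `text` back to the sender."""
    response = ET.Element("Response")
    ET.SubElement(response, "Message").text = text
    return ET.tostring(response, encoding="unicode")
```

A handler would call `extract_media_urls` on the request body, run the image through the vision and matching steps, and return `twiml_reply` with the playlist link.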
Challenges we ran into
One challenge we ran into was determining how to map color profiles to musical features. To overcome this, it was incredibly useful to have a variety of skills on our team: some of us had more computational photography experience, some had more of a musical background, and some had more ideas on how to store and retrieve data.
Accomplishments that we're proud of
We're proud of using a number of APIs successfully and handling authentication across all of them. This was also our first time using Twilio and using SMS texts to interface with the user. Overall, we're super proud of coming up with an MVP pretty early on and then each independently building upon it, making our product better and better.
What we learned
We learned a lot about how to derive information from photos and how deep the Spotify API goes. We also learned about how to divide up our strengths and interests so we could finish our project efficiently.
What's next for ColorDJ
Next for ColorDJ: WhatsApp integration, a more efficient song database, and (as a stretch goal) Google Photos integration for auto-generated movies. Machine learning could make an appearance here too, with models trained in parallel to better match color profiles or derived keywords with songs.
Putting music to photos and adding that extra dimension helps further connect people with their creations. ColorDJ makes it easier to generate a playlist to commemorate any memory, literally with two taps on a screen!