Today big data is readily available, but it is becoming harder for the music industry to distinguish meaningful datasets from less meaningful ones. A new avenue for capturing data is opening up as EEG devices reach the general consumer market.
What it does
Using EEG data captured by the MUSE headset, we are able to interpret the listener's engagement, arousal, and focus. Our methodology compiles these three parameters into a single RGB image that is fed to a machine learning image-classification algorithm, which should be able to determine whether the listener likes or dislikes a song.
Our Goal: We are expanding upon the "lean forward"/"lean back" methodology currently used by the big data market to track user engagement. Our goal is to determine whether the listener had a positive or negative experience regardless of their subsequent engagement with the streaming platform.
Implication for Producers: Producers, record labels, and artists will be able to capture more meaningful metadata about the like and dislike patterns of their songs.
Implication for Listeners: The intent is to provide listeners with a curated playlist of music without any active engagement with the streaming application or device. Because each neural network is personalized, the data can remain with the user. Additionally, the like/dislike output of the network can be aggregated and abstracted to form meaningful datasets for the industry.
How we built it
A single test subject listened to a variety of songs in order to train a personalized ML network. The network's goal is to predict whether or not the person liked the song without subsequent active engagement (i.e., the listener hears the song but does not tap like or dislike in the app).
MaxMSP: We used the MuArts EEG signal processing patch on MaxMSP to process the signal data transmitted by the MUSE headband. The patch processes three signals over time: emotional engagement (positive/negative), emotional arousal, and focus/relaxation. These signals were recorded as the test subject listened to each song.
Universal Music Group/7digital: The songs played for the test subject were randomly chosen from the 7digital music library provided by Universal Music Group.
Python/NumPy: We cleaned up the data from the three parameters (engagement, arousal, and focus) and generated an RGB pixel image representing the listener's emotional state at specific points in time. First, we normalized the data to the range 0-255. Then we composed a 3-dimensional array that was compiled into an image using the NumPy library.
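A minimal sketch of this step, assuming three 1-D signal arrays as input (the channel assignment and square reshape are illustrative assumptions, not the exact pipeline):

```python
import numpy as np

def signals_to_rgb(engagement, arousal, focus):
    """Map three EEG-derived signals onto the R, G, B channels of an image.

    Each input is a 1-D array of samples over time.
    """
    def normalize(x):
        # Scale one signal into the 0-255 range expected for 8-bit pixels.
        x = np.asarray(x, dtype=np.float64)
        lo, hi = x.min(), x.max()
        if hi == lo:
            return np.zeros(x.shape, dtype=np.uint8)
        return np.uint8(255 * (x - lo) / (hi - lo))

    r, g, b = (normalize(s) for s in (engagement, arousal, focus))
    side = int(np.sqrt(len(r)))  # crop to the largest square image
    n = side * side
    # Stack the three channels into a (side, side, 3) RGB array.
    return np.dstack([c[:n].reshape(side, side) for c in (r, g, b)])
```

The resulting `uint8` array can be written out as a PNG with any image library and fed directly to the classifier.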
TensorFlow: Using the images generated with Python and the like/dislike label from the listener, we trained a convolutional neural network for image classification. The network used the Inception V3 architecture and was pre-trained on ImageNet. It was trained on 20 samples and tested on 5 samples from the listener.
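This follows the standard transfer-learning recipe: freeze the pre-trained Inception V3 base and train only a small classification head. A comparable sketch in tf.keras (the head layers and hyperparameters here are illustrative assumptions, not our exact configuration):

```python
import tensorflow as tf

# Load Inception V3 pre-trained on ImageNet, without its classification head.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # with ~25 samples, only the new head is trained

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    # Single sigmoid unit: like (1) vs. dislike (0).
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_data=(test_images, test_labels))
```

Freezing the base is important at this sample size; fine-tuning all of Inception V3 on 20 images would overfit immediately.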
Challenges we ran into
1) Our team comprised individuals highly skilled in AR, music, and neuroscience. However, we did not have a proficient full-stack developer to help us automate and connect all the parts under a single platform. As a result, we kept the music streaming, Muse data capture, and data analysis separate.
2) Our initial product concept called for analyzing the data stream from the Muse headband in real time. Unfortunately, developing a real-time machine learning solution proved complicated, so we resorted to offline analysis to train our algorithm.
Accomplishments that we're proud of
1) We are introducing a proof-of-concept for the application of biometrics in the music industry.
2) Using biometric data we created a visual representation of the emotional state of a listener over time.
3) Our image-classification neural network was trained using the RGB images generated from the biometric data.
What we learned
Biometrics will allow us to harness a dimension of data that once seemed intangible. This will have major implications for artistic industries that evoke emotional responses from their audiences. By quantifying and analyzing an individual's physical responses, these industries will be able to expand the datasets they use to evaluate their products' success.
What's next for BrainView
1) Our methodology needs to be automated to streamline the data collection and analysis process.
2) New ways of compiling the image need to be explored to optimize the image-classification algorithm's results.
3) Real-time analysis solutions need to be explored so the listener's response can be predicted as the track plays.
4) Playlist recommendations based on the emotional response of the user
5) Helping paralyzed individuals communicate their media preferences
6) Building a visual AR emotional output that engages the consumer in a game application encouraging song classification