This project is inspired by Microsoft's AI Xiaoice, which generates Chinese poetry from input images. Our team, a group of classical music fans, came up with the idea of composing music from images instead. Since the meaning of music, unlike that of poetry, is highly ambiguous, it is impractical to associate concrete objects with a piano piece. We therefore constrained the labels to human emotions and decided to require that input images contain facial expressions.
What it does
Our Cal Hacks project is a website that generates piano music from images of human faces; the style of the music varies according to the emotions detected in each image.
How we built it
Our implementation builds on the WaveNet project by Google DeepMind and the Emotion API from Microsoft Azure. The facial expressions in the input images are first analyzed by the Emotion API, which assigns a confidence score to each of eight emotion categories, such as happiness, sadness, neutral, and anger. Next, these emotion scores are passed to WaveNet, whose causal convolutional network generates original music from our pre-trained models. The models were trained on a GPU-equipped Google Cloud virtual machine, using a dataset of approximately 8 GB of classical piano music ranging from Bach to Rachmaninoff.
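The glue between the two services can be sketched as follows. This is an illustrative Python sketch, not the project's actual code: the per-checkpoint naming scheme (`wavenet_<emotion>.ckpt`) and the helper names are assumptions, and the input dict simply mimics the per-face emotion scores the Azure Emotion API returns.

```python
# Illustrative sketch: map Azure Emotion API scores to a pre-trained
# WaveNet checkpoint. Names and file layout are assumptions, not the
# project's real code.

# The eight emotion categories reported by the Azure Emotion API.
EMOTIONS = ["anger", "contempt", "disgust", "fear",
            "happiness", "neutral", "sadness", "surprise"]

def dominant_emotion(scores):
    """Return the emotion with the highest confidence score.

    `scores` mimics the per-face mapping returned by the Emotion API,
    e.g. {"happiness": 0.93, "sadness": 0.02, ...}.
    """
    return max(scores, key=scores.get)

def pick_model(scores, model_dir="models"):
    """Choose a WaveNet checkpoint path for the dominant emotion.

    One pre-trained checkpoint per emotion is an assumption of this
    sketch; the real project may organize its models differently.
    """
    return f"{model_dir}/wavenet_{dominant_emotion(scores)}.ckpt"

if __name__ == "__main__":
    face_scores = {e: 0.0 for e in EMOTIONS}
    face_scores["happiness"] = 0.93
    face_scores["sadness"] = 0.05
    print(pick_model(face_scores))  # models/wavenet_happiness.ckpt
```

The chosen checkpoint would then be loaded by the WaveNet generation script to synthesize audio in the matching style.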
Challenges we ran into
We had a difficult time setting up the virtual machine on Google Cloud, and we ran out of time while training the models.
Accomplishments that we're proud of
What we learned
How to build a website; more about Python and object-oriented programming; how to use a virtual machine on Google Cloud; the basics of Ubuntu; and the fundamentals of machine learning, especially convolutional and causal neural networks.
What's next for Life Buzz Creator
Continue developing the project and add more features to it.