This project is inspired by Microsoft's AI Xiaoice, which generates Chinese poetry from input images. Our team, a group of classical music fans, came up with the idea of composing music from images instead. Since the meaning of music, unlike that of poetry, is highly ambiguous, it is impractical to associate concrete objects with a piano piece. We therefore constrained the labels to human emotions and decided to require that the input images contain facial expressions.

What it does

Our Cal Hack project is a website that generates piano music from images of human faces; the style of the music varies according to the emotions detected in these images.

How we built it

In our implementation, we built on the WaveNet project by Google DeepMind and the Emotion API from Microsoft Azure. The facial expressions in the input images are first analyzed by the Emotion API, which assigns a confidence score to each of eight emotion categories, such as happiness, sadness, neutrality, and anger. The emotion scores are then passed to WaveNet, whose causal neural network generates original music from our pre-trained models. These models were trained on a Google Cloud virtual machine equipped with a GPU, using a dataset of approximately 8 GB of classical piano music ranging from Bach to Rachmaninov.
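The pipeline above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the score dictionary mimics the shape of an Emotion API response, the checkpoint paths are hypothetical placeholders, and the causal convolution is a toy version of WaveNet's core operation (output at time t depends only on inputs at times ≤ t).

```python
import numpy as np

# Hypothetical emotion scores, shaped like an Azure Emotion API response
# (eight categories, each with a confidence score).
scores = {
    "anger": 0.01, "contempt": 0.005, "disgust": 0.005, "fear": 0.01,
    "happiness": 0.90, "neutral": 0.05, "sadness": 0.01, "surprise": 0.01,
}

def dominant_emotion(scores):
    """Pick the emotion category with the highest confidence score."""
    return max(scores, key=scores.get)

# Illustrative mapping from emotion to a pre-trained WaveNet checkpoint
# (file names are made up for this sketch).
MODEL_FOR_EMOTION = {
    "happiness": "models/wavenet_major_allegro.ckpt",
    "sadness": "models/wavenet_minor_adagio.ckpt",
}

emotion = dominant_emotion(scores)

# WaveNet's building block is a dilated *causal* 1-D convolution:
# left-padding ensures no future samples leak into the output.
def causal_conv1d(x, w, dilation=1):
    pad = (len(w) - 1) * dilation
    x = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(w[k] * x[t + pad - k * dilation] for k in range(len(w)))
        for t in range(len(x) - pad)
    ])
```

Stacking such convolutions with exponentially growing dilations (1, 2, 4, ...) is what gives WaveNet a large receptive field over the audio waveform while staying strictly causal.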

Challenges we ran into

We had a difficult time dealing with the virtual machine on Google Cloud, and we also ran out of time while training the models.

Accomplishments that we're proud of

What we learned

- How to make a website
- More about Python and OOP
- How to use a virtual machine (Ubuntu) on Google Cloud
- The basics of machine learning, especially convolutional and causal neural networks

What's next for Life Buzz Creator

Continue working toward finishing this project and add more features to it.
