GM already has an app for audiobooks. Why not have one for podcasts?

What it does

It analyzes the spoken content of podcasts and recommends ones similar to those the user has already listened to.

How we built it

We use RevSpeech to transcribe the podcasts, then use doc2vec to build a vector representation of each transcript. These high-dimensional vectors approximate the semantic meaning of the podcast. We use them to pick podcasts similar to the ones the user has listened to.
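As a rough illustration of the last step, here is a minimal sketch of ranking podcasts by similarity once each one has a vector. It assumes the doc2vec vectors are already computed and stored in a dictionary. The names `cosine` and `recommend`, the idea of averaging the listened vectors into a single profile, and the use of cosine similarity are assumptions made for this example, not necessarily what our system does.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(vectors, listened, top_n=3):
    """Rank unheard podcasts by similarity to the average of the listened ones.

    vectors:  dict mapping a podcast id to its vector (assumed to come from doc2vec)
    listened: list of podcast ids the user has already heard
    """
    if not listened:
        return []
    dim = len(vectors[listened[0]])
    # Assumption: the user's taste is summarized by the mean of the listened vectors.
    profile = [
        sum(vectors[pid][d] for pid in listened) / len(listened)
        for d in range(dim)
    ]
    scored = [
        (cosine(profile, vec), pid)
        for pid, vec in vectors.items()
        if pid not in listened
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:top_n]]


# Toy two-dimensional vectors; real doc2vec vectors have many more dimensions.
toy_vectors = {
    "cooking_a": [1.0, 0.0],
    "cooking_b": [0.9, 0.1],
    "physics": [0.0, 1.0],
}
print(recommend(toy_vectors, ["cooking_a"], top_n=1))
```

With the toy data, the podcast whose vector points in nearly the same direction as the one already heard is ranked first.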

We trained the vector representations on Spell's machine learning platform, using the API to initiate experiments.

Challenges we ran into

We needed to transcribe a large volume of podcast data. We have been able to do this thanks to the generosity of the RevSpeech team. Thank you!

We also had trouble connecting the different components of the system, with problems ranging from missing data that should have been attached to the transcript to the format of the feature vectors passed between components.
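One way to catch this kind of mismatch early is to pass a single validated record between components. The sketch below is hypothetical: the `PodcastRecord` class, its field names, and the `VECTOR_SIZE` value are assumptions for illustration and are not taken from our code.

```python
from dataclasses import dataclass
from typing import List

VECTOR_SIZE = 100  # assumed dimensionality of the feature vectors


@dataclass(frozen=True)
class PodcastRecord:
    """Hypothetical record handed from the transcription step to the recommender."""
    podcast_id: str
    title: str
    transcript: str
    vector: List[float]

    def __post_init__(self):
        # Fail fast when metadata that should accompany the transcript is missing.
        if not self.podcast_id or not self.title:
            raise ValueError("podcast_id and title are required")
        # Fail fast when the feature vector does not have the agreed length.
        if len(self.vector) != VECTOR_SIZE:
            raise ValueError(
                f"expected a vector of length {VECTOR_SIZE}, got {len(self.vector)}"
            )
```

Constructing the record at the boundary of each component turns a silent format mismatch into an immediate error.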

Accomplishments that we're proud of

We successfully used machine learning algorithms to create a system that recommends related podcasts. We also managed to produce an attractive and usable interface.

What we learned

  • Python is a great language for rapid prototyping, but the lack of interface definitions often makes it hard to use.
  • How to use Bottle
  • Jetlag and hackathons don't play nice together.

What's next for HackMIT recommends...

We should be able to produce a more elaborate mathematical model for the recommendations. We might also expand to other platforms.
