We're both avid podcast listeners, but we often run into problems like:

During a conversation with a friend, I’m reminded of something I heard in a podcast. But I have no idea which episode it was in, nor how far into the episode that part was.

When listening to a podcast, sometimes I’m bored of the current topic and want to jump ahead. I hit jump-forward a few times and spend a few effortful seconds figuring out what they’re talking about. I repeat this until I find a new topic, then go back until I find where that topic began.

When listening to a podcast, sometimes the speaker says something really interesting that I want to hear again. But after seeking back, I’m again dropped mid-sentence and have to focus to orient myself, then wait a few seconds in that vigilant state until I reach the part I was looking for.

What it does

Yapa transcribes podcasts. That lets me:

  • Find the episode and timestamp I want to tell my friend about using whatever keywords I remember
  • Read through the transcript and jump to the next topic I’m interested in
  • Seek back/forward and land on the beginning of a sentence every time
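The sentence-aligned seeking comes down to snapping a raw seek position onto transcript timestamps. A minimal Ruby sketch, assuming each sentence carries a start time in milliseconds (the data shape and field names here are assumptions, not our actual schema):

```ruby
# Snap a raw seek position to the start of the sentence containing it.
# `sentences` is an array of { text:, start_ms: } hashes sorted by
# start time, as derived from a transcript.
def snap_to_sentence_start(sentences, seek_ms)
  candidates = sentences.select { |s| s[:start_ms] <= seek_ms }
  # Fall back to the first sentence if seeking before any speech.
  (candidates.last || sentences.first)[:start_ms]
end

sentences = [
  { text: "Welcome to the show.",          start_ms: 0 },
  { text: "Today we talk about seabirds.", start_ms: 2_400 },
  { text: "First, the albatross.",         start_ms: 6_100 },
]

snap_to_sentence_start(sentences, 7_000)  # => 6100
```

With this, every seek lands on a sentence boundary instead of mid-word, which is what removes the "orient yourself" cost described above.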

How we built it

Transcription is handled by AssemblyAI. We built a Rails backend deployed on Heroku, and an iOS app in Swift.
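For context, AssemblyAI's transcript API is a two-step flow: POST the episode's audio URL to create a transcription job, then poll the job until its status is `completed`. A rough Ruby sketch of that flow (a sketch only, not our production code; the API key and audio URL are placeholders):

```ruby
require "net/http"
require "json"
require "uri"

API_BASE = URI("https://api.assemblyai.com/v2/transcript")

# Build the JSON body for a transcription request.
def transcript_request_body(audio_url)
  { audio_url: audio_url }.to_json
end

# Kick off transcription and return the job id (requires a real API key).
def create_transcript(api_key, audio_url)
  req = Net::HTTP::Post.new(API_BASE)
  req["authorization"] = api_key
  req["content-type"]  = "application/json"
  req.body = transcript_request_body(audio_url)
  res = Net::HTTP.start(API_BASE.host, API_BASE.port, use_ssl: true) do |http|
    http.request(req)
  end
  JSON.parse(res.body).fetch("id")
end

# Poll until the job finishes; the status moves through
# "queued" -> "processing" -> "completed" (or "error").
def fetch_transcript(api_key, transcript_id)
  uri = URI("#{API_BASE}/#{transcript_id}")
  loop do
    req = Net::HTTP::Get.new(uri)
    req["authorization"] = api_key
    res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
      http.request(req)
    end
    body = JSON.parse(res.body)
    return body if %w[completed error].include?(body["status"])
    sleep 3
  end
end
```

In practice you would run the polling in a background job rather than blocking a web request, since long episodes take a while to transcribe.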

Challenges we ran into

  • Transcriptions aren't perfect! (But we think they're good enough to make this useful)
  • We'd like to automate the process of adding "chapter markers" to a podcast, but automatically classifying sections of a podcast into chapters is difficult.
  • We'd like to automatically identify different speakers in a podcast, but that functionality in the transcription API we chose seems to be broken.

Accomplishments that we're proud of

  • Using this for the first time felt really good. It scratched a huge itch.

What we learned

Being able to navigate audio like text is powerful. It’s one of the biggest pain points of consuming audio, and even our simple v0 proved its usefulness. There’s magic here.

What's next for Yapa

We've got a bunch of other ideas for what we'd like to see in a podcast app as users, as well as ideas that could benefit podcast creators.

User-facing features:

  • Aggregate listening data from other podcast subscribers in order to identify popular parts of an episode, and automatically skip the less interesting parts (like ads).
  • Table-stakes podcast app features like shortening silences, boosting voices, and speed control.
  • Identification and display of different speakers, with controls to "jump to the next speaker" or "jump back to the start of the current speaker".
  • "Smart preview" of an episode/show: collect the most interesting parts of an episode or show into a short "trailer" for users considering subscribing.
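The first idea above, skipping unpopular parts, could work by counting how many listeners actually played each slice of an episode. A Ruby sketch of one possible approach (bin size and threshold are invented for illustration): bucket play spans into fixed-size bins, then flag contiguous runs of low-count bins as skippable (ad reads tend to show up exactly this way).

```ruby
BIN_MS = 10_000  # bucket size: 10-second bins (an assumed tuning choice)

# Count how many play spans covered each bin of the episode.
# `play_spans` is an array of [from_ms, to_ms) pairs, one per listen.
def popularity(play_spans, episode_ms)
  bins = Array.new((episode_ms.to_f / BIN_MS).ceil, 0)
  play_spans.each do |(from_ms, to_ms)|
    (from_ms / BIN_MS).upto((to_ms - 1) / BIN_MS) { |i| bins[i] += 1 }
  end
  bins
end

# Collapse runs of bins below `threshold` listeners into [from_ms, to_ms)
# ranges that the player could offer to auto-skip.
def skippable_ranges(bins, threshold)
  ranges = []
  run_start = nil
  bins.each_with_index do |count, i|
    if count < threshold
      run_start ||= i
    elsif run_start
      ranges << [run_start * BIN_MS, i * BIN_MS]
      run_start = nil
    end
  end
  ranges << [run_start * BIN_MS, bins.size * BIN_MS] if run_start
  ranges
end

bins = popularity([[0, 30_000], [0, 10_000], [20_000, 30_000]], 30_000)
skippable_ranges(bins, 2)  # => [[10000, 20000]]
```

The same bin counts would also drive the creator-facing "most listened / most skipped" analytics below.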

Creator-facing analytic features:

  • Which parts of an episode do subscribers listen to most?
  • Which parts do they skip most?
  • At what points in an episode do the most shares occur?
  • What time of day do people listen?
  • What portion of my subscribers are regular listeners?