Inspiration

Sometimes, when I'm in a meeting, too busy, or otherwise using my phone, I receive a voice message on WhatsApp that I need to listen to but can't. So I created a WhatsApp bot: you forward the audio to it and you get back a text transcription.

What it does

It returns a text transcription of any WhatsApp audio message you send to it.

How we built it

The tech stack is simple.

  • Twilio side (associated with the WhatsApp number): the webhook points to an API Gateway endpoint backed by Lambda.
  • AWS side: the infrastructure is defined with Terraform (see the infrastructure directory in the code). It includes an S3 bucket for the frontend (a simple web page), an AWS queue that receives every audio message coming from Twilio, and a second Lambda callback that picks up each audio, sends it to RunPod running faster-whisper, waits for the transcription, and sends it to the user's number. The server code is on GitHub in the server directory. I'm using Python (with Zappa for AWS Lambda).

It's a simple process: every audio message that is received is sent to an AWS queue, and a Lambda function processes it via RunPod until we get a response, which is then sent to the end user.
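
The flow above could be sketched roughly like this. It is a minimal illustration, not the project's actual code: it assumes the queue is SQS, that the webhook arrives as an API Gateway proxy event carrying Twilio's form-encoded fields (`From`, `NumMedia`, `MediaContentType0`, `MediaUrl0`), and the names `build_queue_message`, `handle_queue_event`, `transcribe`, and `send_reply` are hypothetical. The RunPod call and the Twilio reply are passed in as plain functions so the sketch stays self-contained.

```python
import base64
import json
from urllib.parse import parse_qs


def build_queue_message(event):
    """Turn a webhook event into a JSON queue message (hypothetical helper).

    Assumes an API Gateway proxy event whose body holds Twilio's
    form-encoded fields. Returns None when there is no audio attachment.
    """
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    # parse_qs returns a list per key; keep the first value of each field.
    form = {key: values[0] for key, values in parse_qs(body).items()}
    if int(form.get("NumMedia", "0")) < 1:
        return None
    if not form.get("MediaContentType0", "").startswith("audio/"):
        return None
    return json.dumps({"sender": form["From"], "audio_url": form["MediaUrl0"]})


def handle_queue_event(event, transcribe, send_reply):
    """Process a batch of queue records (assumed SQS-style event shape).

    `transcribe(audio_url)` stands in for the RunPod / faster-whisper call
    and `send_reply(number, text)` for the Twilio reply; both are assumptions.
    Returns the number of records processed.
    """
    processed = 0
    for record in event.get("Records", []):
        job = json.loads(record["body"])
        text = transcribe(job["audio_url"])
        send_reply(job["sender"], text)
        processed += 1
    return processed
```

In this sketch the webhook Lambda would push the string returned by `build_queue_message` to the queue, and the second Lambda would call `handle_queue_event` with real implementations of `transcribe` and `send_reply`. Keeping the slow transcription step behind the queue lets the webhook answer Twilio quickly.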

Challenges we ran into

What we learned

I learned some lessons about implementing everything with Lambda functions and AWS queues.

What's next for Transcribe using whatsapp

  • Implement the same bot for Telegram.
  • Set a rate limit per number per day.

Built With
