VoiceScribe

Screenshots: home page; uploading files; uploaded files; summarized text after transcription; selecting an audio file to interact with via keyboard or microphone.

Inspiration

VoiceScribe was inspired by the need to quickly turn meeting recordings and long voice notes into clear, actionable summaries. With more meetings happening online and people missing important details or struggling to catch up, we wanted to create a tool that makes audio content instantly searchable, summarized, and interactive, helping teams save time and stay aligned.

What it does

VoiceScribe allows users to upload audio recordings in various formats (MP3, WAV, M4A, OGG and more). The app transcribes the audio using AssemblyAI, summarizes key discussion points with Google Gemini, and enables users to ask questions about the meeting transcript via a smart chat interface. Users get structured summaries (including action items, deadlines, and people responsible) and can chat with the transcript to clarify anything that happened during the meeting.
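To make the structured summary concrete, here is a minimal sketch of how a summarization request could be framed and its reply validated. It is not the project's actual code: the function names, the prompt wording, and the JSON field names (summary, actionItems, task, owner, deadline) are all assumptions made for illustration, and the call to Gemini itself is left out.

```javascript
// Hypothetical helpers: names and JSON shape are assumptions, not VoiceScribe's real code.

// Build a prompt that asks the model for a fixed JSON shape.
function buildSummaryPrompt(transcript) {
  return [
    'Summarize the meeting transcript below.',
    'Respond with JSON only, using this shape:',
    '{"summary": string, "actionItems": [{"task": string, "owner": string or null, "deadline": string or null}]}',
    '',
    'Transcript:',
    transcript.trim(),
  ].join('\n');
}

// Pull the JSON object out of the model's reply and check its shape.
function parseSummaryResponse(text) {
  // Models sometimes add text around the JSON, so take the outermost braces.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in model response');
  }
  const data = JSON.parse(text.slice(start, end + 1));
  if (typeof data.summary !== 'string' || !Array.isArray(data.actionItems)) {
    throw new Error('Unexpected summary shape');
  }
  return {
    summary: data.summary,
    actionItems: data.actionItems.map((item) => ({
      task: String(item.task),
      owner: item.owner ?? null,       // person responsible, if the model found one
      deadline: item.deadline ?? null, // deadline, if the model found one
    })),
  };
}
```

Validating the reply before it reaches the UI means a malformed model response shows up as a clear error instead of a broken summary card.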

How we built it

Frontend: Built with React and Tailwind CSS for a clean, responsive UI with drag-and-drop file upload, audio playback, and live chat features.
Backend: Powered by Express.js. Handles file uploads, converts audio with ffmpeg, and manages all API integrations.
AI/ML: Uses AssemblyAI for accurate transcription and Google Gemini for summarization and Q&A.
Deployment: Backend deployed on Railway, frontend hosted on Netlify. CORS and environment variables were configured so the preview and production environments work together smoothly.
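As an illustration of the conversion step on the backend, the sketch below shells out to ffmpeg from Node. It is a rough sketch under stated assumptions, not the project's implementation: the function names are hypothetical, the list of formats passed through unconverted is a guess based on the formats named above, and the choice of mono 16 kHz output is an assumption. It also assumes an ffmpeg binary is available on the PATH.

```javascript
const { spawn } = require('child_process');
const path = require('path');

// Assumption: these formats are sent to transcription as-is; anything else is converted.
const PASSTHROUGH = new Set(['.mp3', '.wav', '.m4a', '.ogg']);

function needsConversion(filename) {
  return !PASSTHROUGH.has(path.extname(filename).toLowerCase());
}

// -y overwrites the output file, -vn drops any video stream,
// -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz.
// ffmpeg infers the output container from the output file extension.
function buildFfmpegArgs(inputPath, outputPath) {
  return ['-y', '-i', inputPath, '-vn', '-ac', '1', '-ar', '16000', outputPath];
}

// Run ffmpeg and resolve with the output path, or reject with the tail of stderr.
function convertAudio(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    const proc = spawn('ffmpeg', buildFfmpegArgs(inputPath, outputPath));
    let stderr = '';
    proc.stderr.on('data', (chunk) => { stderr += chunk; });
    proc.on('error', reject); // for example, ffmpeg is not installed
    proc.on('close', (code) => {
      if (code === 0) resolve(outputPath);
      else reject(new Error(`ffmpeg exited with code ${code}: ${stderr.slice(-500)}`));
    });
  });
}
```

A route handler could call needsConversion on the uploaded file name and only invoke convertAudio (for example to a .wav path) when it returns true, which keeps common formats on the fast path.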

Challenges we ran into

Getting CORS to work seamlessly between Bolt.new preview, production deployments, and third-party APIs required lots of debugging and careful configuration.
Handling different audio file formats (especially webm) required real-time conversion and additional error handling.
Managing rate limits and asynchronous polling with AssemblyAI’s API took care to keep the user experience smooth.
Synchronizing changes across environments took effort, so that the app behaved the same locally, in preview, and in live production.
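The polling challenge above can be sketched as a small helper that checks a job's status with a growing delay, so a long transcription does not burn through rate limits. This is an illustrative sketch, not the project's code: pollUntilDone and its options are hypothetical, fetchStatus stands in for whatever request fetches the transcript job, and the 'completed' and 'error' status strings are assumed to match what the transcription API reports.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// fetchStatus: async function returning an object like { status, error?, ... }.
// Assumed statuses: anything other than 'completed' or 'error' means "still working".
async function pollUntilDone(
  fetchStatus,
  { intervalMs = 3000, maxIntervalMs = 15000, maxAttempts = 60 } = {}
) {
  let delay = intervalMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchStatus();
    if (job.status === 'completed') return job;
    if (job.status === 'error') {
      throw new Error(job.error || 'Transcription failed');
    }
    await sleep(delay);
    // Back off gradually, but never wait longer than maxIntervalMs.
    delay = Math.min(delay * 1.5, maxIntervalMs);
  }
  throw new Error(`Gave up after ${maxAttempts} polls`);
}
```

Capping the number of attempts gives the UI a definite failure to report instead of a spinner that never stops.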

Accomplishments that we're proud of

Built a real-time audio transcription and summarization tool from scratch in a short timeframe.
Created a seamless upload-to-summary workflow, making advanced AI features easy to use for non-technical users.
Made the app flexible enough to support multiple deployment targets and preview environments.
Solved tricky cross-origin and API integration issues to provide a reliable experience.

What we learned

How to connect multiple third-party APIs (AssemblyAI, Google Gemini) and handle their quirks.
Techniques for file conversion, asynchronous polling, and state management in React.
Best practices for managing environment variables and CORS in modern full-stack apps.
The value of clear error handling, logging, and user feedback.
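One way to combine the environment-variable and CORS lessons above is to drive the allowed origins from a single comma-separated setting. The sketch below is an assumption-laden example rather than the project's configuration: the helper names, the ALLOWED_ORIGINS variable mentioned afterwards, and the "*.domain" wildcard convention are all hypothetical.

```javascript
// Parse a comma-separated list such as "https://app.example.com, *.netlify.app".
function parseAllowedOrigins(value = '') {
  return value
    .split(',')
    .map((entry) => entry.trim())
    .filter(Boolean);
}

// Decide whether a request's Origin header matches the allow-list.
function isOriginAllowed(origin, allowed) {
  // Non-browser clients (curl, server-to-server calls) send no Origin header.
  if (!origin) return true;
  return allowed.some((rule) => {
    if (rule.startsWith('*.')) {
      // Wildcard rule: match any subdomain of the given domain.
      let host;
      try {
        host = new URL(origin).hostname;
      } catch {
        return false; // not a valid URL, so it cannot match
      }
      return host.endsWith(rule.slice(1)); // '*.netlify.app' -> '.netlify.app'
    }
    return rule === origin; // exact match, scheme included
  });
}
```

With the cors middleware for Express, a check like this could be passed as the origin option in its callback form, reading the list once at startup from a variable such as ALLOWED_ORIGINS. Preview and production deployments would then differ only in that one setting.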

What's next for VoiceScribe

Add support for even more audio and video file formats.
Allow users to export transcripts and summaries to Google Docs, Notion, or email.
Build in team collaboration features for sharing and commenting on summaries.
Explore more advanced AI features, like speaker identification and topic tracking.

Built With

assemblyai
express.js
gemini
javascript
multer
netlify
node.js
railway
react
tailwind

Submitted to

World’s Largest Hackathon presented by Bolt

Created by

I (Eric) was responsible for backend and API development: setting up the Express.js server, writing the core API endpoints for file uploads, transcription, and Q&A, and managing file upload and conversion logic with Multer and ffmpeg. I also handled cloud deployment to Railway and Netlify, configured environment variables, and helped ensure the project worked smoothly in production.

Eric Afari Jr
Emmanuel Agyei

Updates

Eric Afari Jr started this project — Jun 30, 2025 04:50 PM EDT
