EchoWizard Backend 🧙‍♂️

The backend service for EchoWizard, an AI-powered tool that transcribes and analyzes audio files and YouTube videos, then generates summaries and answers questions, citing timestamps that point back to the source audio.


🔧 Tech Stack

  • Node.js + Express – RESTful API server
  • Google Gemini GenAI – Audio comprehension (summaries and question answering)
  • AssemblyAI – Transcription and timestamp extraction
  • yt-dlp – YouTube audio extraction
  • Multer – File upload handling

✨ Features

  • Upload audio files or provide a YouTube video URL
  • Transcribe audio with sentence-level timestamps
  • Generate detailed summaries with cited timestamps
  • Ask follow-up questions and receive timestamped answers
  • Automatically delete temporary files after processing
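
As a rough illustration of sentence-level timestamps, the sketch below turns a list of sentences into transcript lines. It assumes each sentence object carries `text` and a `start` offset in milliseconds (the unit AssemblyAI uses for its timestamps). The `[mm:ss]` layout and both function names are hypothetical and not taken from the EchoWizard code.

```javascript
// Hypothetical helper: convert a millisecond offset to "mm:ss",
// or "h:mm:ss" once the audio passes the one-hour mark.
function formatTimestamp(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const mmss = `${String(minutes).padStart(2, "0")}:${String(seconds).padStart(2, "0")}`;
  return hours > 0 ? `${hours}:${mmss}` : mmss;
}

// Hypothetical helper: one "[timestamp] sentence" line per sentence.
// Assumes the shape { text: string, start: number /* ms */ }.
function formatTranscript(sentences) {
  return sentences
    .map((s) => `[${formatTimestamp(s.start)}] ${s.text}`)
    .join("\n");
}
```

For example, a sentence starting at 65000 ms would be rendered with the prefix `[01:05]`.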

🔌 API Endpoints

POST /summarize-upload

Description: Upload an audio file or provide a YouTube URL to receive a timestamped transcript and structured summary.

Request (multipart/form-data):

  • audio – (optional) an audio file to upload
  • youtubeUrl – (optional) YouTube video URL to process

Response:

{
  "summary": "Structured summary with timestamps",
  "fileUri": "Google Gemini uploaded file URI",
  "mimeType": "audio/mpeg",
  "transcript": "Formatted transcript with timestamps"
}
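
A minimal client-side sketch of this request, assuming Node.js 18+ (for the global `fetch`, `FormData`, and `Blob`) and the default local address from "Running Locally". The helper names are hypothetical. The README does not say how the server handles a request that carries both fields, so this sketch sends exactly one.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");

const BASE_URL = "http://localhost:3000"; // assumed local default

// Build the multipart body with exactly one source: an audio Blob or a URL.
function buildSummarizeForm({ audio, filename, youtubeUrl }) {
  if (Boolean(audio) === Boolean(youtubeUrl)) {
    throw new Error("Provide either an audio Blob or a youtubeUrl, not both");
  }
  const form = new FormData();
  if (audio) {
    form.append("audio", audio, filename); // field name from the request spec
  } else {
    form.append("youtubeUrl", youtubeUrl);
  }
  return form;
}

// Upload a local file and return the parsed JSON response.
async function summarizeFile(filePath, baseUrl = BASE_URL) {
  const bytes = await fs.readFile(filePath);
  const form = buildSummarizeForm({
    audio: new Blob([bytes], { type: "audio/mpeg" }), // assumes an MP3 file
    filename: path.basename(filePath),
  });
  const res = await fetch(`${baseUrl}/summarize-upload`, { method: "POST", body: form });
  if (!res.ok) throw new Error(`summarize-upload failed: HTTP ${res.status}`);
  return res.json(); // expected keys: summary, fileUri, mimeType, transcript
}
```

Keep the returned `fileUri`, `mimeType`, and `transcript` values, since /answer expects them back.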

POST /answer

Description: Ask a question about an already-processed audio file.

Request (multipart/form-data):

  • fileUri – The URI returned from /summarize-upload
  • mimeType – The MIME type returned from /summarize-upload (e.g. audio/mpeg)
  • transcript – The timestamped transcript returned from /summarize-upload
  • question – Your question

Response:

"A helpful answer based on the transcript and audio, with timestamps included where relevant."
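
A hedged sketch of calling this endpoint from Node.js 18+, where `fetch` and `FormData` are globals. The function names are hypothetical. Because the response above is shown as a quoted string, the sketch accepts either a JSON-encoded string or plain text; which one the server actually sends is an assumption to verify.

```javascript
const BASE_URL = "http://localhost:3000"; // assumed local default

// Copy the three fields returned by /summarize-upload and add the question.
function buildAnswerForm(summaryResponse, question) {
  const form = new FormData();
  form.append("fileUri", summaryResponse.fileUri);
  form.append("mimeType", summaryResponse.mimeType);
  form.append("transcript", summaryResponse.transcript);
  form.append("question", question);
  return form;
}

// Unwrap a JSON-encoded string; fall back to the raw body otherwise.
function parseAnswerBody(body) {
  try {
    const parsed = JSON.parse(body);
    return typeof parsed === "string" ? parsed : body;
  } catch {
    return body;
  }
}

async function ask(summaryResponse, question, baseUrl = BASE_URL) {
  const res = await fetch(`${baseUrl}/answer`, {
    method: "POST",
    body: buildAnswerForm(summaryResponse, question),
  });
  if (!res.ok) throw new Error(`answer failed: HTTP ${res.status}`);
  return parseAnswerBody(await res.text());
}
```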

🧪 Running Locally

1. Clone and install

git clone https://github.com/your-username/echowizard-backend.git
cd echowizard-backend
npm install

2. Add your .env

Create a .env file in the root:

ASSEMBLYAI_API_KEY=your_assemblyai_key
GEMINI_API_KEY=your_google_genai_api_key
PROJECT_ID=your_google_project_id
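
One way to fail fast on a missing key is to validate the environment at startup. The sketch below is an assumption, not the project's code: it treats all three variables as required and reads from `process.env`, leaving open how the .env file gets loaded (the README does not say). `loadConfig` and the returned property names are hypothetical.

```javascript
// Variable names taken from the .env example; treating all as required is an assumption.
const REQUIRED_KEYS = ["ASSEMBLYAI_API_KEY", "GEMINI_API_KEY", "PROJECT_ID"];

// Hypothetical helper: return a config object or throw, listing every missing key.
function loadConfig(env = process.env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key] || env[key].trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    assemblyAiApiKey: env.ASSEMBLYAI_API_KEY,
    geminiApiKey: env.GEMINI_API_KEY,
    projectId: env.PROJECT_ID,
  };
}
```

Calling `loadConfig()` once before the server starts listening surfaces a misconfigured .env immediately instead of on the first request.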

3. Start the server

npm start

The server runs at http://localhost:3000.


📁 Project Structure

.
├── index.js                 # Main Express app
├── helpers/
│   ├── assembly.js          # Transcription logic
│   └── downloadYoutube.js   # YouTube MP3 downloader
├── uploads/                 # Temp audio files
├── .env                     # API keys (not committed)
└── README.md
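
Since uploads/ only holds temporary audio, cleanup can be tied to the processing step itself. The sketch below shows one possible pattern under that assumption: a hypothetical `withTempFile` wrapper that deletes the file in a `finally` block, so it is removed whether processing succeeds or throws. It is not taken from index.js.

```javascript
const fs = require("node:fs/promises");

// Hypothetical wrapper: run `work(filePath)`, then delete the file
// regardless of whether `work` resolved or threw.
async function withTempFile(filePath, work) {
  try {
    return await work(filePath);
  } finally {
    // force: true makes this a no-op if the file is already gone.
    await fs.rm(filePath, { force: true });
  }
}
```

A route handler could wrap its transcription and summarization calls in `withTempFile(uploadedPath, ...)` so no path leaves a stray file behind.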

🧠 Example Flow

1. Send a YouTube URL or audio file to /summarize-upload.

2. Get back a timestamped transcript and a rich summary.

3. Ask specific follow-up questions using /answer.
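
The three steps above can be chained in a single function. This is a sketch under stated assumptions: Node.js 18+, the default local address, and a hypothetical `summarizeThenAsk` helper whose `fetchImpl` option exists only so the flow can be exercised without a live server.

```javascript
// Hypothetical end-to-end helper for the YouTube URL path.
async function summarizeThenAsk(
  youtubeUrl,
  question,
  { baseUrl = "http://localhost:3000", fetchImpl = fetch } = {}
) {
  // Step 1: send the YouTube URL to /summarize-upload.
  const uploadForm = new FormData();
  uploadForm.append("youtubeUrl", youtubeUrl);
  const uploadRes = await fetchImpl(`${baseUrl}/summarize-upload`, {
    method: "POST",
    body: uploadForm,
  });
  if (!uploadRes.ok) throw new Error(`summarize-upload failed: HTTP ${uploadRes.status}`);

  // Step 2: read the transcript, summary, and file reference.
  const { summary, fileUri, mimeType, transcript } = await uploadRes.json();

  // Step 3: pass those values back to /answer together with the question.
  const answerForm = new FormData();
  answerForm.append("fileUri", fileUri);
  answerForm.append("mimeType", mimeType);
  answerForm.append("transcript", transcript);
  answerForm.append("question", question);
  const answerRes = await fetchImpl(`${baseUrl}/answer`, {
    method: "POST",
    body: answerForm,
  });
  if (!answerRes.ok) throw new Error(`answer failed: HTTP ${answerRes.status}`);

  // The raw body is returned as-is; it may be a JSON-quoted string.
  return { summary, transcript, answer: await answerRes.text() };
}
```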


🪄 Future Ideas

  • Support longer audio files
  • Implement OAuth so users can log in and save chat history

📜 Author

Made with 💡 by Sabina Ismailova for AI Hackfest
