Inspiration
Learning a new language is hard, but mastering confident, natural-sounding speech is even harder. For non-native English speakers, classroom learning often doesn't translate to real-world conversations. The fear of making mistakes in pronunciation or using incorrect grammar can be a huge barrier. Our inspiration was to create a personal, non-judgmental AI coach that could bridge this gap—a tool that listens to actual conversations and provides the specific, actionable feedback needed to help users sound like native speakers. We wanted to build something that would genuinely help people feel more confident and integrated when speaking English.
What it does
LanguageAssistant is a web-based application that acts as an AI-powered English fluency tutor. A user can record themselves speaking, and our application will:
Securely upload their audio to the cloud.
Transcribe the entire conversation, identifying different speakers.
Analyze the full transcript for grammatical errors, providing corrections and explaining the rules that were violated.
Assess word-by-word pronunciation, flagging words that might be difficult to understand.
Identify any slang or idioms used, explaining their meaning and whether they were used correctly in context.
The final result is a comprehensive, easy-to-read report that gives the user a clear path to improving their spoken English.
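To make the shape of that report concrete, here is a minimal sketch of the kind of payload the frontend could render. The field names and sample values are illustrative assumptions, not the exact production schema:

```python
# Hypothetical report structure; every field name here is an
# illustrative assumption, not the app's exact schema.
report = {
    "transcript": "I goed to the store yesterday.",
    "grammar": [
        {
            "error": "goed",
            "correction": "went",
            # Explanation of the rule that was violated
            "rule": "'Go' is an irregular verb; its past tense is 'went'.",
        }
    ],
    "pronunciation": [
        # Words whose transcription confidence suggests they may be
        # hard to understand
        {"word": "store", "confidence": 0.62, "flagged": True}
    ],
    "slang": [],  # slang/idiom findings, each with meaning and usage notes
}
```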
How we built it
We built LanguageAssistant on a serverless architecture using Amazon Web Services to create a scalable and powerful backend.
Frontend: A clean, single-page web application built with HTML and styled with Tailwind CSS. It uses the browser's MediaRecorder API to capture audio.
Storage: Amazon S3 is used for storing both the raw audio uploads and the JSON transcripts generated by Amazon Transcribe. We used Amazon Cognito to grant the browser secure, temporary credentials for direct S3 uploads.
Transcription: Amazon Transcribe is the core of our transcription service. It takes the audio file from S3 and produces a highly accurate, word-level JSON transcript with confidence scores for pronunciation.
AI Analysis & Logic: The brain of our application is an AWS Lambda function written in Python. This function orchestrates the entire process: it starts the Transcribe job, waits for completion, parses the transcript, and then uses a powerful prompt to call the Anthropic Claude 3 Sonnet model via Amazon Bedrock.
API: We used Amazon API Gateway to create a secure REST API endpoint, which acts as the bridge between our frontend and the Lambda function.
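The two core steps of the Lambda — parsing Transcribe's word-level output and assembling the Bedrock request — can be sketched roughly as below. The confidence threshold and the prompt wording are assumptions for illustration; only the Transcribe item layout and the Bedrock Messages-API body shape follow the real services:

```python
import json


def flag_low_confidence_words(transcript_json, threshold=0.8):
    """Pull word-level confidence scores out of an Amazon Transcribe
    result and flag words below the threshold as potentially hard to
    understand. The 0.8 threshold is a tunable assumption."""
    flagged = []
    for item in transcript_json["results"]["items"]:
        if item["type"] != "pronunciation":  # skip punctuation items
            continue
        best = item["alternatives"][0]
        confidence = float(best["confidence"])  # Transcribe emits strings
        if confidence < threshold:
            flagged.append({"word": best["content"], "confidence": confidence})
    return flagged


def build_claude_request(transcript_text):
    """Assemble a Bedrock request body for Claude 3 Sonnet using the
    Anthropic Messages API shape. The prompt text is illustrative, not
    our exact production prompt."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{
            "role": "user",
            "content": (
                "Analyze this English transcript for grammar errors, "
                "pronunciation risks, and slang or idiom usage:\n\n"
                + transcript_text
            ),
        }],
    })


# Inside the Lambda, the model call itself would look something like:
# bedrock = boto3.client("bedrock-runtime")
# response = bedrock.invoke_model(
#     modelId="anthropic.claude-3-sonnet-20240229-v1:0",
#     body=build_claude_request(text),
# )
```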
Challenges we ran into
Our journey was a fantastic learning experience filled with real-world debugging.
IAM Permissions: The biggest initial challenge was mastering IAM. Ensuring our Lambda function had the precise permissions to interact with S3, Transcribe, and Bedrock without being overly permissive required a lot of trial and error.
API Gateway Configuration: We spent significant time debugging our API Gateway setup. We initially missed enabling Lambda Proxy Integration, which caused our Lambda function not to receive the request body correctly.
AWS Account Activation: The most significant roadblock was discovering that our backend was failing due to a SubscriptionRequiredException. We learned that new AWS accounts need to be fully activated with a verified payment method before using services like Transcribe. Troubleshooting this taught us a valuable lesson about the administrative side of cloud development.
Asynchronous Processing: Our Lambda function has to wait for the Transcribe job to complete, which can take time. We had to carefully configure the Lambda timeout to prevent it from failing prematurely, highlighting the importance of understanding asynchronous workflows in the cloud.
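The waiting pattern from the last challenge can be sketched as a generic poll-with-deadline helper. This is a simplified stand-in for our actual Lambda code: in production the deadline would come from the Lambda context's remaining time, and `poll_fn` would wrap `get_transcription_job`; the injected clock and sleep here just make the helper easy to test:

```python
import time


def wait_for_job(poll_fn, deadline_seconds, interval=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll until poll_fn() returns a terminal status or the deadline
    passes. Sketch of how a Lambda can wait on a Transcribe job while
    staying safely inside its own configured timeout."""
    start = clock()
    while True:
        status = poll_fn()  # e.g. the job's TranscriptionJobStatus
        if status in ("COMPLETED", "FAILED"):
            return status
        if clock() - start > deadline_seconds:
            raise TimeoutError(
                "Transcribe job did not finish before the deadline")
        sleep(interval)
```

Raising before the Lambda itself is killed lets the function fail with a clear error instead of a silent timeout.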
What we learned
Throughout this hackathon, we learned how to build a complete, end-to-end, AI-powered application using a serverless architecture. We gained hands-on experience with the entire development lifecycle, from frontend JavaScript to backend Python and deep into the configuration of core AWS services like IAM, Lambda, S3, and API Gateway. Most importantly, we learned the art of debugging a distributed system, tracing errors from the browser all the way through the cloud and back.
What's next for LanguageAssistant
We're incredibly excited about the future of LanguageAssistant. Our next steps would be to add multi-language support for transcription, allowing us to help an even wider audience. We also plan to track user progress over time, creating a personalized learning path and visualizing their improvement in pronunciation and grammar.
Built With
- amazon-web-services
- bedrock
- python
- transcribe
