Inspiration

Aura was created as an accessibility tool for users with conditions such as dyslexia, motor disabilities, and visual impairments. Traditional writing and research tools demand repetitive keyboard actions, sustained visual focus, and precise mouse movements, which create major barriers for many people. Our goal was to build a hands-free, voice-first agent that removes these barriers and turns the entire web into an accessible writing interface. By replacing clicking and typing with simple voice commands, Aura lets users focus on their thoughts and ideas rather than the mechanics of digital interaction.

What it does

  • Aura is a voice-first AI Copilot that lives in your Chrome side panel, providing on-demand, context-aware assistance for writing, reading, and research on any webpage.

  • Hands-Free Interaction: Uses continuous listening and the wake word "Hey Aura" to eliminate keyboard and mouse input for editing tasks.

  • Context-Aware Transformations: Instantly analyzes selected text to offer style-switchable AI transformations, including Simplify, Expand, Proofread, Clarify, and Summarize.

  • Accessibility Features: Includes ELI5 (Explain Like I'm 5) style switching to instantly simplify complex jargon for improved reading comprehension, and "Read This Aloud" functionality for auditory processing.

  • Knowledge Management: Creates an active Glossary of complex terms detected on the page and generates multi-format Citations (APA, MLA, Chicago) on demand.

  • Research Automation: Executes voice commands like "Research this" to generate a diverse set of search queries and open them in new tabs for efficient investigation.
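
As a rough illustration of how a transcript from the speech recognizer could be mapped to one of the actions above, here is a minimal sketch. The function name parseVoiceCommand, the ACTIONS phrase table, and the returned object shape are hypothetical and are not taken from Aura's code; only the wake word "Hey Aura" and the phrases "Research this" and "Read This Aloud" come from the feature list.

```javascript
// Hypothetical sketch: map a speech transcript to an action name.
const WAKE_WORD = "hey aura";

// Phrase -> action table (illustrative; the real command set may differ).
const ACTIONS = [
  ["research this", "research"],
  ["read this aloud", "readAloud"],
  ["simplify", "simplify"],
  ["expand", "expand"],
  ["proofread", "proofread"],
  ["clarify", "clarify"],
  ["summarize", "summarize"],
];

function parseVoiceCommand(transcript) {
  // Normalize: lowercase, replace punctuation with spaces, collapse whitespace.
  const text = transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  // Ignore speech that is not addressed to the assistant.
  const wakeAt = text.indexOf(WAKE_WORD);
  if (wakeAt === -1) return null;

  // Only consider the words spoken after the wake word.
  const command = text.slice(wakeAt + WAKE_WORD.length).trim();
  for (const [phrase, action] of ACTIONS) {
    if (command.includes(phrase)) return { action, command };
  }
  return { action: "unknown", command };
}
```

Returning null when the wake word is absent lets the caller keep listening without reacting to background conversation, while the "unknown" action gives it a chance to ask the user to repeat the command.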

How we built it

Aura uses a hybrid AI architecture that combines Chrome's built-in client-side APIs, chosen for privacy and resilience, with the Gemini API for complex tasks:

  • API Used: Summarizer API
    Location: Background Service Worker
    Function & Benefit: Local Summarization. Provides fast, cost-free, and network-resilient text summarization.

  • API Used: Proofreader API
    Location: Background Service Worker
    Function & Benefit: Local Proofreading. Ensures grammar correction is private and available even when offline, a critical accessibility feature.

  • API Used: Translator API
    Location: Background Service Worker
    Function & Benefit: Local Translation. Enables multilingual access directly on the device, greatly expanding accessibility.

  • API Used: Gemini API (Cloud)
    Location: Background Service Worker
    Function & Benefit: Complex/Hybrid Fallback. Powers features requiring complex prompting or specialized knowledge: Expand, Simplify, Clarify, Research Automation, and Citation Generation.

  • API Used: Web Speech API
    Location: Side Panel (sidepanel.js)
    Function & Benefit: Enables continuous listening for the wake word and for voice commands.
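
The split between local and cloud handling described above could be expressed as a small routing function. This is a sketch under stated assumptions, not Aura's implementation: the names LOCAL_TASKS, CLOUD_TASKS, and chooseBackend are hypothetical, and the "available" status string is assumed to mirror what Chrome's built-in AI APIs report from their availability checks.

```javascript
// Hypothetical sketch of hybrid routing: local built-in API first, cloud fallback.
const LOCAL_TASKS = new Set(["summarize", "proofread", "translate"]);
const CLOUD_TASKS = new Set(["expand", "simplify", "clarify", "research", "cite"]);

// `localStatus` maps a task name to an availability string. The values are
// assumed to be "available", "downloadable", "downloading", or "unavailable".
// `online` says whether the network (and therefore the cloud API) is reachable.
function chooseBackend(task, localStatus, online) {
  // Prefer the on-device model whenever it is ready for this task.
  if (LOCAL_TASKS.has(task) && localStatus[task] === "available") {
    return "local";
  }
  // Otherwise fall back to the cloud for any known task, if we are online.
  if ((LOCAL_TASKS.has(task) || CLOUD_TASKS.has(task)) && online) {
    return "cloud";
  }
  // Unknown task, or nothing can serve it right now.
  return "none";
}
```

For example, chooseBackend("proofread", { proofread: "available" }, false) returns "local", which matches the offline proofreading behavior described above, while chooseBackend("expand", {}, true) returns "cloud".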

Challenges we ran into

  • Eliminating the Race Condition: The most significant challenge was synchronizing the asynchronous text selection (from the content script) with the voice command processing (in the side panel). With passive message passing, a command could be processed before the latest selection had arrived. We solved this by replacing the passive selection messages with an Active Context Fetch: when a command fires, the current selection is requested from the content script via chrome.tabs.sendMessage.

  • Maintaining Continuous Listening: Because Manifest V3 service workers are shut down after roughly 30 seconds of inactivity, keeping voice commands working continuously required a multi-strategy Keep-Alive System (alarms, intervals, and long-lived ports) so that the background worker stays available to handle them.

  • Building a Learning Context: We combined user preference tracking (chrome.storage.local) with real-time page analysis (getPageContext) to dynamically adjust the AI's system instructions (background.js), so that suggestions reflect both the user's habits and the current page.
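
The Active Context Fetch idea can be sketched as follows. This is an illustration under assumptions rather than the project's actual code: the function name fetchActiveContext, the GET_CONTEXT message type, the reply shape, and the timeout are all hypothetical. The chrome object is passed in as chromeApi so the function can also be exercised outside the browser with a stub.

```javascript
// Hypothetical sketch: pull the selection at the moment a command fires,
// instead of relying on selection updates pushed earlier by the content script.
async function fetchActiveContext(chromeApi, timeoutMs = 1000) {
  // Find the tab the user is currently looking at.
  const [tab] = await chromeApi.tabs.query({ active: true, currentWindow: true });
  if (!tab || tab.id === undefined) return { ok: false, reason: "no-active-tab" };

  // Ask that tab's content script for its current selection.
  const request = chromeApi.tabs.sendMessage(tab.id, { type: "GET_CONTEXT" });

  // Give up if the content script does not answer in time.
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });

  try {
    const reply = await Promise.race([request, timeout]);
    if (!reply || typeof reply.selection !== "string") {
      return { ok: false, reason: "no-reply" };
    }
    return { ok: true, tabId: tab.id, selection: reply.selection };
  } catch (err) {
    // sendMessage rejects when nothing is listening in the tab.
    return { ok: false, reason: "no-content-script" };
  } finally {
    clearTimeout(timer);
  }
}
```

On the content-script side, a matching chrome.runtime.onMessage listener would answer the (hypothetical) GET_CONTEXT message with something like { selection: window.getSelection().toString() }. Because the selection is read only when the command is processed, the command and its context cannot drift apart.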

Accomplishments that we're proud of

  • Mission-Driven Accessibility: Delivering a hands-free, voice-first interface that directly addresses known barriers for users with motor and cognitive disabilities.

  • Hybrid AI Mastery: Developing a resilient and efficient Hybrid AI model where local client-side APIs handle the most frequent, privacy-sensitive tasks, with a seamless fallback to the cloud for advanced functions.

  • Solving the MV3 Race Condition: Implementing the Active Context Fetch and the multi-strategy Service Worker Keep-Alive system, which addressed two of the hardest architectural challenges we faced in Manifest V3 development.
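
A keep-alive built from alarms, intervals, and long-lived ports might look roughly like the sketch below. It is an assumption-laden illustration, not the project's code: the function names, the "aura-keepalive" name, the PING message, and the timing values are hypothetical. The chrome object is injected as chromeApi, and the 0.5-minute alarm period assumes a recent Chrome version that accepts 30-second alarms.

```javascript
// Hypothetical sketch of a multi-strategy keep-alive.
const KEEPALIVE = "aura-keepalive"; // illustrative name for the alarm and port

// Background side: a periodic alarm wakes the service worker if it was stopped,
// and an onConnect listener accepts the side panel's long-lived port.
function installBackgroundKeepAlive(chromeApi, onWake) {
  chromeApi.alarms.create(KEEPALIVE, { periodInMinutes: 0.5 });
  chromeApi.alarms.onAlarm.addListener((alarm) => {
    if (alarm.name === KEEPALIVE) onWake();
  });
  chromeApi.runtime.onConnect.addListener((port) => {
    // Receiving pings is the point; no reply is needed.
    if (port.name === KEEPALIVE) port.onMessage.addListener(() => {});
  });
}

// Side panel side: hold a port open, ping it on an interval, and reconnect
// whenever the background end goes away. Returns a function that stops it.
function openKeepAlivePort(chromeApi, pingMs = 20000) {
  let port = null;
  let stopped = false;
  const connect = () => {
    if (stopped) return;
    port = chromeApi.runtime.connect({ name: KEEPALIVE });
    port.onDisconnect.addListener(connect);
  };
  connect();
  const timer = setInterval(() => port.postMessage({ type: "PING" }), pingMs);
  return () => {
    stopped = true;
    clearInterval(timer);
    port.disconnect();
  };
}
```

Inside the extension both functions would be called with the real chrome object, the first from the background script and the second from the side panel. The alarm acts as a safety net if the worker is stopped anyway, while the pinged port is intended to keep it active during a listening session.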

What we learned

We gained deep experience with the constraints of the modern Chrome extension environment, particularly the asynchronous communication patterns needed to work around timing gaps between extension contexts. We learned that for time-critical tasks like voice command processing, relying on passively delivered browser events is unreliable; the extension must actively control when and how it retrieves context to keep commands and their context in sync.

What's next for Aura: The Context-Aware Writing Assistant

  • Adaptive Learning: Utilizing the current usage data to automatically suggest the correct style or action based on the user's past behavior and the current page context.

  • Full Multimodal Support: Expanding the Prompt API integration to allow voice commands to analyze images on the screen ("Hey Aura, explain this chart").

  • Advanced Accessibility Layers: Implementing real-time transcription and haptic/visual feedback features to further assist users with severe hearing or visual impairments.
