Inspiration: Beyond the Clinical Grid
Traditional Augmentative and Alternative Communication (AAC) devices are often "digital prisons." They are clinical and expensive, and they force non-verbal individuals into a state of Linguistic Isolation. Users must memorize static grids only to produce stilted, telegraphic speech (e.g., "Want water now"), which strips away their personality and social nuance.
We were inspired to break this cycle. We believe communication shouldn't be a mechanical task; it should be fluid—like an ocean wave. Ocean WAVE was born from the mission to transition AAC from "medical equipment" to "premium personal hardware," giving users not just a voice, but nuance, tone, and agency.
The Magnitude of the Problem
The lack of accessible communication is a global economic and social crisis:
The Economic Toll: In the UK alone, the failure to provide timely speech support costs the economy £8 billion annually in lost productivity and care costs.
The Accessibility Gap: High-end dedicated AAC hardware often costs between $5,000 and $25,000, making "voice" a luxury many cannot afford.
The "Silent" Population: With over 2.1 billion people globally requiring some form of assistive technology, the current "static grid" model is a bottleneck to human potential.
Ocean WAVE disrupts this by leveraging the Gemini 3 Flash model to provide a high-fidelity, affordable, and intelligent alternative that runs on any modern browser.
What It Does
The Intelligence Layer
Ocean WAVE is a high-fidelity, motor-accessible communication terminal that transforms fragmented inputs into human expression.
Hierarchical Intent Construction: Users use a high-contrast symbol grid or keyboard designed for motor impairments.
Contextual Predictive Engine: Using Gemini 3 Flash, the system predicts the user's intent in real time. It doesn't just suggest words; it suggests ideas, reducing motor effort by up to 40%.
The Refinement Engine (Social Nuance): A user can input "Store go want" and select a Tone Profile (Polite, Casual, or Professional). Gemini restructures this into: "I would like to go to the store, please." This allows a non-verbal user to code-switch between a friend and a doctor for the first time.
Neural Audio Synthesis: We bypassed robotic browser voices. Using the Gemini 2.5 Flash TTS (Kore model), Ocean WAVE generates ultra-realistic, warm, and human-like audio that carries emotional weight.
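As a rough illustration of the Refinement Engine step described above, the sketch below shows one way fragmented input and a Tone Profile could be combined into a single instruction for the model. The names `ToneProfile`, `TONE_GUIDANCE`, and `buildRefinementPrompt`, as well as the prompt wording, are hypothetical and are not taken from the Ocean WAVE codebase.

```typescript
// Hypothetical sketch: names and prompt wording are illustrative assumptions.
type ToneProfile = "Polite" | "Casual" | "Professional";

// Short style hints, one per tone profile named in the write-up.
const TONE_GUIDANCE: Record<ToneProfile, string> = {
  Polite: "courteous and warm, with softeners such as 'please'",
  Casual: "relaxed and friendly, as if talking to a close friend",
  Professional: "clear and formal, suitable for a doctor or employer",
};

// Builds the text prompt that would be sent to the text-to-text model.
function buildRefinementPrompt(fragments: string, tone: ToneProfile): string {
  // Collapse stray whitespace left over from grid or keyboard input.
  const cleaned = fragments.trim().replace(/\s+/g, " ");
  return [
    "Rewrite the fragmented input below as one natural first-person sentence.",
    `Tone: ${tone} (${TONE_GUIDANCE[tone]}).`,
    "Keep the speaker's meaning; do not add new requests.",
    `Input: "${cleaned}"`,
  ].join("\n");
}
```

Calling `buildRefinementPrompt("Store go want", "Polite")` yields a four-line prompt ending in `Input: "Store go want"`. Keeping the tone hints in one lookup table means a new profile can be added without touching the prompt logic.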
Technical Architecture: Engineering the Wave
We built a high-performance Single Page Application (SPA) using React 19 and TypeScript, engineered for 99.9% uptime and zero-latency feel.
Unified AI Service Layer: We architected a central services/gemini.ts module that orchestrates three distinct generative streams:
Text-to-Text: For grammatical refinement and tone shifting.
Text-to-JSON: For structured word-prediction based on user history.
Text-to-Audio: For high-fidelity neural speech.
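For the Text-to-JSON stream, model output generally needs to be validated before it reaches the grid. The sketch below assumes, purely for illustration, that the model is asked to reply with an object of the form `{"suggestions": ["...", ...]}`; that shape and the names `PredictionResult` and `parsePredictions` are hypothetical.

```typescript
// Hypothetical response shape; the real schema used by the app is not documented here.
interface PredictionResult {
  suggestions: string[];
}

// Parses a raw model reply and returns at most `limit` non-empty suggestions.
// Any malformed or unexpected reply yields an empty list instead of throwing.
function parsePredictions(raw: string, limit = 5): PredictionResult {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data === "object" && data !== null && "suggestions" in data) {
      const list = (data as { suggestions: unknown }).suggestions;
      if (Array.isArray(list)) {
        const suggestions = list
          .filter((s): s is string => typeof s === "string" && s.trim() !== "")
          .map((s) => s.trim())
          .slice(0, limit);
        return { suggestions };
      }
    }
  } catch {
    // Invalid JSON: fall through to the empty result below.
  }
  return { suggestions: [] };
}
```

Returning an empty list on failure lets the interface fall back to the static grid, so a bad model reply does not block the user.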
Custom Audio Pipeline: Since Gemini returns raw PCM audio data, we built a custom decoding pipeline using the Web Audio API. We manually handle the 24 kHz sample rate and decode base64 chunks into Float32Arrays for low-latency, "no-fry" playback.
Hardware Blueprint UI: Using Tailwind CSS, we developed a design system focused on Visual Processing Disorders. It features high-contrast (Blue/Orange/White) borders, distinct drop shadows, and active:translate-y animations to provide tactile "press" feedback on touchscreens.
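The decoding step in the Custom Audio Pipeline described above can be sketched as follows. This is a minimal illustration that assumes the payload is 16-bit signed little-endian mono PCM; the write-up states only "raw PCM" at 24 kHz, so the sample format is an assumption. The helper names `base64ToBytes` and `pcm16ToFloat32` are hypothetical.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decodes a base64 string to bytes without relying on browser or Node globals.
function base64ToBytes(b64: string): Uint8Array {
  const clean = b64.replace(/[^A-Za-z0-9+/]/g, ""); // drop padding and whitespace
  const out = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let acc = 0;  // bit accumulator
  let bits = 0; // number of valid bits currently held in acc
  let o = 0;    // write position in out
  for (let i = 0; i < clean.length; i++) {
    acc = (acc << 6) | B64_ALPHABET.indexOf(clean.charAt(i));
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[o++] = (acc >> bits) & 0xff;
      acc &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return out;
}

// Converts 16-bit little-endian PCM bytes (an assumption) to floats in [-1, 1).
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(Math.floor(bytes.byteLength / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}
```

In the browser, the resulting samples can be copied into an `AudioBuffer` created with `audioContext.createBuffer(1, samples.length, 24000)` and played through an `AudioBufferSourceNode`. Reading the bytes with the wrong sample width, byte order, or sample rate is one plausible source of distorted playback, which is why the format is stated explicitly here.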
Challenges & Triumphs
The "Fried" Audio Hurdle: Initially, the raw PCM data sounded distorted on playback. We had to write a custom decodeAudioData wrapper to manage buffer synchronization so that the voice sounds natural.
The Latency Paradox: AI takes time, but communication is instant. We implemented a "Polishing" overlay that provides visual reassurance, turning a 500ms wait into a moment of anticipation rather than a technical lag.
What’s Next: The Future of Ocean WAVE
Multimodal Vision: Snap a photo of a menu, and Gemini will instantly populate the grid with those food items.
Voice Banking: Enabling users to "clone" their original voice (or a loved one's) to maintain their unique vocal identity.
Built With
- cloud
- cs
- gemini
- html
- java
- typescript