Cognitive Arena & Phonics

APP ICON
Hugging Face space
Dashboard

🧠 Inspiration

The inspiration for the Cognitive Arena module of Mount AI Scholar stemmed from a critical gap in modern EdTech: the lack of real-time, privacy-first cognitive remediation for dyslexia and phonological processing disorders. The complexity of mapping phoneme-grapheme correspondence requires more than just flashcards; it requires a dynamic environment that listens, analyzes, and adapts. I wanted to build an engine that doesn't just score a student, but algorithmically understands their phonetic deviation and generates instant, customized remediation pathways.

⚙️ What it does

The Cognitive Arena is a secured neuro-cognitive session environment. It provides:

Active Phonetic Rehabilitation: Users practice pronunciation while the system performs live comparative validation against expected transcripts.
AI-Driven Remediation: Leveraging Gemini (as a bridge toward my ultimate Gemma 4 Edge local inference architecture), the system detects specific graphemic confusions (e.g., b/d/p/q or Sibilants/Fricatives) and instantly creates adaptive practice regimens.
Secure Patient Sessions: Every session is securely logged with 100% data segregation, ensuring absolute privacy and allowing for historical progress tracking.

🔨 How I built it

I engineered the platform using a modern, scalable full-stack approach:

Frontend Architecture: React 18, TypeScript, and Tailwind CSS. The UI is heavily optimized using functional components, hook-based state management, and strict render control to prevent UI blocking during real-time cognitive processing.
Backend & Security: Google Firebase. I implemented Google Authentication married with Firestore. For absolute data integrity, I wrote strict Firestore Security Rules ensuring that arena_sessions are inaccessible to anyone but the authenticated user.
AI & Voice Pipeline: I integrated the Web Speech API for zero-latency voice ingestion, passing the transcript delta to our AI remediation engine to construct the patient's cognitive profile in real-time.

Mathematically, the core challenge is minimizing the phonetic deviation $D$. If $p_i$ is the target phoneme and $\hat{p}i$ is the spoken phoneme, the system's objective function for the remediation engine is to construct exercises that minimize the error over $N$ speech samples: $$ \mathcal{L}{cognitive} = \frac{1}{N} \sum_{i=1}^{N} \omega_i \cdot \delta(p_i, \hat{p}_i) $$ Where $\omega_i$ represents the contextual weight of the specific phoneme based on the user's historical dyslexia profiles.

⚠️ Challenges I ran into

Building a synchronous AI pipeline over asynchronous voice streams in the browser was structurally complex.

State Management: Handling continuous audio transcripts without causing recursive render loops in React required heavy memoization and strict useEffect dependency arrays.
Privacy-by-Design Compliance: Writing standard NoSQL queries wasn't enough. I had to design compound queries and granular Firebase security rules to ensure patient data could theoretically pass highly strict medical/educational compliance frameworks.

🏆 Accomplishments that I'm proud of

I am incredibly proud of the seamless UX/UI integration. The Arena doesn't look like a standard, clunky medical or educational app. It features a dark, immersive "Cosmic-Slate" UI with glassmorphism, instant visual feedback, and a highly performant session history dashboard. It looks and feels like elite software, while housing highly technical cognitive-computing operations under the hood.

📚 What I learned

Managing real-time interaction between a client-side Speech API, a NoSQL data stream (Firestore), and an AI generation layer forced me to think deeply about architectural bottlenecks. I leveled up my ability to structure non-blocking UI threads and manage secure, scoped data streams in Firebase.

🚀 What's next for Mount AI Scholar

The Cognitive Arena is Phase 1. The Master Plan is to transition the cloud-based AI calls entirely to local edge inference (deploying Gemma 4 locally via Python/FastAPI) to achieve true, offline "Privacy by Design" with zero data leaving the device. Following that, Phase 2 targets the WWDC 2027 Swift Student Challenge: porting this entire engine into the Apple ecosystem using CoreML and projecting these neuro-cognitive sessions into spatial environments via ARKit and SwiftUI.

Built With

ai
elastic
gemini
huggingface
ml
react
tts

Updates

tahawinner25-ai winner started this project — Jun 06, 2026 02:52 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.