Neuro Notes: The Cognitive Neural Layer for Synchronous Collaboration
1. Inspiration: The Physics of Information Decay
We approached the problem of lost meeting knowledge as a thermodynamic system. In an unobserved meeting, structured data (decisions, logic) degrades into unstructured noise over time. We modeled this "Value Retention" $V(t)$ using a modified Information Entropy decay function: $$V(t) = V_0 \cdot e^{-(\lambda + \delta)t}$$ Where:
- $V_0$: Initial Semantic Density (bits of useful information per minute).
- $\lambda$: The "Forgetting Constant" (cognitive decay of attendees).
- $\delta$: The "Documentation Friction" (loss due to manual note-taking lag).
The Neuro Notes Innovation: Traditional transcription only captures the raw signal. Neuro Notes introduces a Real-Time Structure Function that runs synchronously. By structuring data at capture time ($t \approx 0$), we effectively drive $\lambda$ and $\delta$ toward zero, creating an immutable digital asset that resists entropy.
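The decay model above can be sketched numerically; the parameter values here are purely illustrative, not measured constants:

```javascript
// Sketch of the Value Retention model V(t) = V0 * exp(-(lambda + delta) * t).
// All parameter values below are illustrative assumptions.
function valueRetention(v0, lambda, delta, t) {
  return v0 * Math.exp(-(lambda + delta) * t);
}

// Unstructured notes after 30 minutes, with assumed decay rates:
const raw = valueRetention(100, 0.05, 0.10, 30);
// Structured at capture time: lambda and delta driven to ~0, value preserved:
const structured = valueRetention(100, 0, 0, 30);
```

With both decay constants near zero the exponent vanishes, so the structured record retains its full initial semantic density.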
2. System Capabilities: From Signal to Intelligence
Neuro Notes operates like a high-frequency trading algorithm for speech—ingesting, analyzing, and executing on data streams in milliseconds.
A. Active Semantic Filtering & Signal Extraction
The system utilizes a Relevance Function to filter the noisy speech stream. For every speech segment $s_i$, we calculate a vector importance score:
$$R(s_i) = \alpha \cdot \text{sim}(s_i, \text{Context}_{\text{global}}) + \beta \cdot \mathbb{I}(\text{Intent}_{\text{markers}})$$
- If $R(s_i) > \theta_{retention}$ (Threshold), the segment is persisted as Key Context.
- If explicit intent markers (e.g., "I will," "We decided") are detected, the segment is routed to the Deterministic Extraction Pipeline to create JSON artifacts (Action Items/Decisions).
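The routing logic can be sketched as follows. The similarity score here is passed in pre-computed (the real system derives it from embeddings), and the marker list, weights, and threshold are assumed values:

```javascript
// Illustrative routing for R(s) = alpha * sim(s, context) + beta * I(intent).
const INTENT_MARKERS = ["i will", "we decided", "let's lock", "agreed"];

function hasIntentMarker(segment) {
  const s = segment.toLowerCase();
  return INTENT_MARKERS.some((m) => s.includes(m));
}

function relevance(simScore, segment, alpha = 0.7, beta = 0.3) {
  return alpha * simScore + beta * (hasIntentMarker(segment) ? 1 : 0);
}

function route(simScore, segment, threshold = 0.5) {
  // Explicit intent markers bypass scoring and go straight to extraction.
  if (hasIntentMarker(segment)) return "extraction-pipeline";
  return relevance(simScore, segment) > threshold ? "key-context" : "discard";
}
```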
B. Generative UI (The "Polymorphic Interface")
We pioneered Voice-to-Visuals, where the frontend interface is not static but a function of the conversation.
- Command: "Neuro Notes, plot the latency vs. user load."
- Transformation: The system executes a mapping function $f_{\text{gen}}$:
$$f_{\text{gen}}: (\text{Speech}, \text{Data}_{\text{context}}) \to \text{RenderableComponent}(\text{Type}, \text{Config}_{\text{JSON}})$$
- Output: The React frontend dynamically mounts a `<Recharts />` or `<Mermaid />` component based on the inferred `Type`.
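A sketch of $f_{\text{gen}}$'s output contract: a JSON spec the frontend can mount. The keyword heuristic stands in for the LLM that actually infers the component type, and the component names mirror the two mentioned above:

```javascript
// Infer which renderable component a voice command maps to.
// The regex heuristic is a stand-in for LLM-based intent inference.
function inferComponent(command) {
  const c = command.toLowerCase();
  if (/\bplot\b|\bchart\b|\bvs\.?\b/.test(c)) return "Recharts";
  if (/\bflow\b|\bdiagram\b|\bsequence\b/.test(c)) return "Mermaid";
  return "Markdown";
}

// Build the RenderableComponent(Type, Config_JSON) spec.
function toRenderableComponent(command, data) {
  return { type: inferComponent(command), config: { title: command, data } };
}
```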
C. Context-Aware RAG (The "Archive Auditor")
We enable "Time-Travel Querying" using high-dimensional vector search. When a user asks a question $Q$, we retrieve relevant context $D_i$ using Cosine Similarity over the embedding space $\mathbb{R}^{768}$:
$$\text{Score}(Q, D_i) = \frac{\sum_{j=1}^{n} Q_j D_{ij}}{\sqrt{\sum_{j=1}^{n} Q_j^2} \sqrt{\sum_{j=1}^{n} D_{ij}^2}}$$
This allows precise retrieval of facts (e.g., "What was the budget decision in Q3?") from a dataset of millions of tokenized meeting minutes.
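The scoring formula translates directly to code. The embeddings themselves would come from an embedding model; the retrieval step ranks stored chunks against the query vector:

```javascript
// Cosine similarity, term-for-term the Score(Q, D_i) formula above.
function cosineSimilarity(q, d) {
  let dot = 0, qNorm = 0, dNorm = 0;
  for (let j = 0; j < q.length; j++) {
    dot += q[j] * d[j];
    qNorm += q[j] * q[j];
    dNorm += d[j] * d[j];
  }
  return dot / (Math.sqrt(qNorm) * Math.sqrt(dNorm));
}

// Rank stored chunks by similarity to the query embedding.
function topK(queryVec, docs, k = 3) {
  return docs
    .map((d, i) => ({ i, score: cosineSimilarity(queryVec, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```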
3. Engineering Architecture: The "Tri-Layer" Distributed System
We architected Neuro Notes as a distributed neural system designed for resilience and low latency.
Layer 1: The Ingestion Layer (The Ear)
Technology: "Grey Hat" Chrome Extension (Manifest V3)
Core Logic: DOM Mutation Observation
Instead of processing raw audio (high latency, privacy risk), we scrape the DOM. We attach a MutationObserver to the caption container with a specific config:
const config = { childList: true, subtree: true, characterData: true };
To handle the continuous stream, we model the capture as a Discrete Sampling Function: $$S_{captured}(t) = \int_{t}^{t+\Delta t} \sum \delta(t - t_{change}) \cdot \text{DOM}_{text} \, dt$$
- $\Delta t$: The sampling rate ensures we capture rapid speech without blocking the browser's main thread.
- Deduplication: We implement a Rolling Hash Algorithm (Rabin-Karp) on incoming strings to instantly reject duplicate caption frames sent by Google Meet.
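The dedup step can be sketched with a polynomial (Rabin-Karp-style) hash per caption frame, checked against a bounded set of recent hashes. The hash base, modulus, and window size here are assumptions:

```javascript
// Polynomial hash of a caption frame (Rabin-Karp-style fingerprint).
const BASE = 257n;
const MOD = (1n << 61n) - 1n;

function frameHash(text) {
  let h = 0n;
  for (const ch of text) h = (h * BASE + BigInt(ch.codePointAt(0))) % MOD;
  return h;
}

// Returns a function that rejects frames whose hash was seen recently.
function makeDeduper(windowSize = 64) {
  const recent = [];
  const seen = new Set();
  return function isNewFrame(text) {
    const h = frameHash(text);
    if (seen.has(h)) return false; // duplicate caption frame: drop it
    seen.add(h);
    recent.push(h);
    if (recent.length > windowSize) seen.delete(recent.shift());
    return true;
  };
}
```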
Layer 2: The Cognitive Core (The Brain)
Technology: Node.js Cluster + Gemini 1.5 Flash
Core Logic: Sliding Window Segmentation with Context Carry-Over
To solve the "Boundary Problem" (sentences cut between batches), we define our processing window with a Context Overlap $\epsilon$: $$W_n = [ t_{start} - \epsilon, \ t_{end} ]$$
- $\epsilon$: The last 10 seconds of Batch $N-1$ are prepended to Batch $N$.
- Reasoning Pipeline: The LLM does not just "summarize." It executes a strictly typed extraction: $$f_{LLM}: \text{Text}_{raw} \xrightarrow{\text{Schema Validation}} \{ \text{Actions}: \text{Array}, \text{Decisions}: \text{Array} \}$$
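The windowing with carry-over can be sketched as below. The timestamped-segment shape and batch length are assumptions; the 10-second $\epsilon$ follows the text:

```javascript
// Sliding-window segmentation with context carry-over: each new batch
// keeps the tail (epsilon seconds) of the previous batch so sentences
// cut at a boundary still arrive intact.
function buildWindows(segments, batchSeconds = 60, epsilon = 10) {
  const windows = [];
  let start = segments.length ? segments[0].t : 0;
  let current = [];
  for (const seg of segments) {
    if (seg.t - start >= batchSeconds) {
      windows.push(current);
      // Carry over the last epsilon seconds as overlap.
      current = current.filter((s) => s.t >= seg.t - epsilon);
      start = seg.t;
    }
    current.push(seg);
  }
  if (current.length) windows.push(current);
  return windows;
}
```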
Layer 3: The Synchronization Layer (The Hand)
Technology: Socket.io (Transport) + Firebase Firestore (Persistence) + n8n (Effectors)
| Component | Responsibility | Latency Budget |
|---|---|---|
| Broadcaster | Pushes JSON diffs to client (Socket.io) | |
| Ledger | Persists state to Firestore (Atomic Writes) | |
| Effector | n8n Webhook dispatch to Jira/Slack | Async |
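The Broadcaster's diff step can be sketched as a pure function; only changed keys travel over the socket. The diffing scheme is an illustrative assumption, and the Socket.io emit is shown as a comment because it needs a live server:

```javascript
// Shallow JSON diff between the last broadcast state and the new state.
function shallowDiff(prev, next) {
  const diff = {};
  for (const key of Object.keys(next)) {
    // Stringify comparison handles nested values cheaply for small states.
    if (JSON.stringify(prev[key]) !== JSON.stringify(next[key])) {
      diff[key] = next[key];
    }
  }
  return diff;
}

// Usage with a Socket.io server instance `io` (hypothetical wiring):
// io.emit("state:diff", shallowDiff(lastState, newState));
```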
4. Challenges & Algorithmic Solutions
Challenge A: The "Hallucination" of Decisions
LLMs often interpret casual suggestions as firm decisions. Solution: The Consensus Verification Algorithm. We treat decision extraction as a probabilistic classification problem. A decision is only committed to the database if the Conditional Probability of Agreement given the Context exceeds a high threshold: $$P(\text{Agreement} \mid \text{Context}) > \tau_{consensus} \quad (\text{where } \tau = 0.85)$$
- We enforce this via Chain-of-Thought Prompting: The model must identify explicit linguistic markers (e.g., "Agreed," "Let's lock that in," "No objections") before calculating the probability score.
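A deterministic stand-in for the consensus check: marker weights and the prior are assumptions (the production probability comes from the LLM's chain-of-thought output), while $\tau = 0.85$ mirrors the text:

```javascript
// Score a candidate decision by explicit agreement markers; commit only
// above tau. Weights and the 0.3 prior are illustrative assumptions.
const AGREEMENT_MARKERS = [
  { pattern: /\bagreed\b/i, weight: 0.9 },
  { pattern: /let'?s lock (that|it) in/i, weight: 0.9 },
  { pattern: /no objections/i, weight: 0.85 },
  { pattern: /\bmaybe\b|\bcould\b|\bmight\b/i, weight: -0.5 }, // hedge words
];

function agreementProbability(context) {
  let p = 0.3; // assumed prior for any candidate decision
  for (const { pattern, weight } of AGREEMENT_MARKERS) {
    if (pattern.test(context)) p += weight;
  }
  return Math.max(0, Math.min(1, p));
}

const TAU_CONSENSUS = 0.85;

function isCommittedDecision(context) {
  return agreementProbability(context) > TAU_CONSENSUS;
}
```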
Challenge B: The Latency vs. Context Trade-off
Real-time AI is a fight between speed and intelligence.
- Small Buffer: Fast updates, but the AI lacks context (misses "it" references).
- Large Buffer: Deep context, but the UI lags behind reality.
- Solution: We found the global optimum at a **60-second buffer**. This creates a "Reasoning Pulse" every minute that feels natural to the user while providing enough tokens for the LLM to reason correctly.
5. Accomplishments: Quantifiable Success
1. The "Money Clock" (Behavioral Engineering)
We implemented a real-time integral calculus function to visualize the Burn Rate of the meeting. This is not just a metric; it is a psychological nudge to reduce meeting bloat. $$Cost(T) = \int_{0}^{T} \left( \sum_{i=1}^{N} \frac{\sigma_i}{\Omega_{annual}} \right) dt$$
- $N$: Number of attendees (detected via the Extension).
- $\sigma_i$: Estimated annual salary of attendee $i$.
- $\Omega_{annual}$: Standard working hours (2080).
- Result: Test groups reduced meeting duration by 18% when the ticker was visible.
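The integral discretizes to simple per-hour arithmetic. Salaries here are hypothetical; 2080 follows the definition of $\Omega_{annual}$ above:

```javascript
// Money Clock: discretized Cost(T) = sum_i (sigma_i / Omega_annual) * T.
const OMEGA_ANNUAL = 2080; // standard working hours per year

// Combined hourly burn rate of everyone in the room.
function burnRatePerHour(salaries) {
  return salaries.reduce((sum, s) => sum + s / OMEGA_ANNUAL, 0);
}

// Running cost of the meeting after `elapsedMinutes`.
function meetingCost(salaries, elapsedMinutes) {
  return burnRatePerHour(salaries) * (elapsedMinutes / 60);
}
```

For example, a one-hour meeting with two attendees on hypothetical salaries of $208k and $104k burns $150.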
2. End-to-End Latency
By decoupling the Ingestion (Client) from Processing (Server), we achieved an Event-to-Insight Latency of < 200ms for live transcription updates and < 2s for AI-generated insights.
6. What's Next: The "Organizational Knowledge Graph"
We are moving beyond single-meeting intelligence to a Graph-Based Memory System. We formally define the organization as a Knowledge Graph $G = (V, E)$, where:
- Nodes ($V$): Entities (Projects, People, Decisions, Dates).
- Edges ($E$): Semantic Relationships ("Owned By," "Due On," "Blocked By").
The Future Algorithm: If "Project Alpha" is discussed in Meeting $A$ and Meeting $B$, Neuro Notes will infer a relationship edge:
$$E(A, B) = f_{\text{link}}(\text{Entity}_{\text{Alpha}}, \text{Context}_A, \text{Context}_B)$$
This creates a self-healing, searchable corporate brain where you can query: "Show me the decision timeline for Project Alpha across all Q3 meetings."
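The planned inference can be sketched with a trivial co-occurrence rule standing in for $f_{\text{link}}$ (the real system would score each relationship); the data shapes are assumptions:

```javascript
// Index which meetings each entity appears in.
function buildEntityIndex(meetings) {
  const index = new Map(); // entity -> [meetingId, ...]
  for (const m of meetings) {
    for (const e of m.entities) {
      if (!index.has(e)) index.set(e, []);
      index.get(e).push(m.id);
    }
  }
  return index;
}

// Infer an edge between every pair of meetings sharing an entity.
function inferEdges(meetings) {
  const edges = [];
  for (const [entity, ids] of buildEntityIndex(meetings)) {
    for (let a = 0; a < ids.length; a++) {
      for (let b = a + 1; b < ids.length; b++) {
        edges.push({ entity, from: ids[a], to: ids[b], rel: "co-discussed" });
      }
    }
  }
  return edges;
}
```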