Neuro Notes: The Cognitive Neural Layer for Synchronous Collaboration

1. Inspiration: The Physics of Information Decay

We approached the problem of lost meeting knowledge as a thermodynamic system. In an unobserved meeting, structured data (decisions, logic) degrades into unstructured noise over time. We modeled this Value Retention $V(t)$ using a modified Information Entropy decay function: $$V(t) = V_0 \cdot e^{-(\lambda + \delta)t}$$ Where:

  • $V_0$: Initial Semantic Density (bits of useful information per minute).
  • $\lambda$: The "Forgetting Constant" (cognitive decay of attendees).
  • $\delta$: The "Documentation Friction" (loss due to manual note-taking lag).

The Neuro Notes Innovation: Traditional transcription only captures the raw signal. Neuro Notes introduces a Real-Time Structure Function that runs synchronously. By structuring data at $t \approx 0$, we effectively drive $\lambda \to 0$ and $\delta \to 0$, creating an immutable digital asset that resists entropy.


2. System Capabilities: From Signal to Intelligence

Neuro Notes operates like a high-frequency trading algorithm for speech: ingesting, analyzing, and acting on data streams within milliseconds.

A. Active Semantic Filtering & Signal Extraction

The system utilizes a Relevance Function $R(s_i)$ to filter the noisy speech stream $S$. For every speech segment $s_i$, we calculate a vector importance score:

$$R(s_i) = \alpha \cdot \text{sim}(s_i, \text{Context}_{\text{global}}) + \beta \cdot \mathbb{I}(\text{Intent}_{\text{markers}})$$

  • If $R(s_i) > \theta_{retention}$ (the retention threshold), the segment is persisted as Key Context.
  • If explicit intent markers (e.g., "I will," "We decided") are detected, the segment is routed to the Deterministic Extraction Pipeline to create JSON artifacts (Action Items/Decisions).
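The gate above can be sketched in a few lines. This is a minimal illustration, not the production filter: the weights `ALPHA`, `BETA`, the threshold, and the marker list are assumed values, and the embeddings are presumed to come from an external model.

```typescript
// Sketch of the relevance gate R(s_i). ALPHA, BETA, THETA_RETENTION and the
// intent-marker list are illustrative assumptions, not the real config.
const ALPHA = 0.7;
const BETA = 0.3;
const THETA_RETENTION = 0.5;
const INTENT_MARKERS = ["i will", "we decided", "let's", "agreed"];

// Cosine similarity between two equal-length embedding vectors.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Indicator I(Intent_markers): 1 if the segment contains an explicit marker.
function hasIntentMarker(text: string): number {
  const lower = text.toLowerCase();
  return INTENT_MARKERS.some((m) => lower.includes(m)) ? 1 : 0;
}

// R(s_i) = alpha * sim(s_i, Context_global) + beta * I(Intent_markers)
function relevance(segEmb: number[], globalCtx: number[], text: string): number {
  return ALPHA * cosineSim(segEmb, globalCtx) + BETA * hasIntentMarker(text);
}

function shouldRetain(segEmb: number[], globalCtx: number[], text: string): boolean {
  return relevance(segEmb, globalCtx, text) > THETA_RETENTION;
}
```

A segment with an explicit marker ("We decided…") gets the full $\beta$ boost, so commitments clear the threshold even when they are semantically far from the running context.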

B. Generative UI (The "Polymorphic Interface")

We pioneered Voice-to-Visuals, where the frontend interface is not static but a function of the conversation.

  • Command: "Neuro Notes, plot the latency vs. user load."
  • Transformation: The system executes a mapping function $f_{\text{gen}}$:

$$f_{\text{gen}}: (\text{Speech}, \text{Data}_{\text{context}}) \to \text{RenderableComponent}(\text{Type}, \text{Config}_{\text{JSON}})$$

  • Output: The React frontend dynamically mounts a <Recharts /> or <Mermaid /> component based on the inferred Type.
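The shape of $f_{\text{gen}}$ can be sketched as a function from a command plus contextual data to a component spec. The keyword matching here is a simplified stand-in for the LLM-driven inference; the type names and config fields are assumptions for illustration.

```typescript
// Sketch of f_gen: map (speech, context data) -> renderable component spec.
// The keyword heuristic stands in for the real LLM inference step.
type RenderableComponent = {
  type: "recharts" | "mermaid";
  config: Record<string, unknown>;
};

function fGen(speech: string, data: Record<string, number[]>): RenderableComponent {
  const lower = speech.toLowerCase();
  if (lower.includes("plot") || lower.includes("chart")) {
    // "plot X vs. Y" -> a line-chart config the frontend can mount as <Recharts />
    return {
      type: "recharts",
      config: { chart: "line", series: Object.keys(data), data },
    };
  }
  // Fall back to a Mermaid diagram for flow/relationship requests.
  return { type: "mermaid", config: { diagram: "graph TD" } };
}
```

The frontend only needs to switch on `type` and spread `config` into the mounted component, which keeps the UI fully data-driven.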

C. Context-Aware RAG (The "Archive Auditor")

We enable "Time-Travel Querying" using high-dimensional vector search. When a user asks a question $Q$, we retrieve relevant context using Cosine Similarity over the embedding space $\mathbb{R}^{768}$:

$$\text{Score}(Q, D_i) = \frac{\sum_{j=1}^{n} Q_j D_{ij}}{\sqrt{\sum_{j=1}^{n} Q_j^2} \sqrt{\sum_{j=1}^{n} D_{ij}^2}}$$

This allows precise retrieval of facts (e.g., "What was the budget decision in Q3?") from a dataset of millions of tokenized meeting minutes.
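The retrieval step reduces to scoring every stored embedding against the query and keeping the top-$k$. A minimal sketch, assuming the 768-dimensional embeddings already exist (the vector store and embedding model are external):

```typescript
// Score(Q, D_i): cosine similarity between query and document embeddings.
function cosineScore(q: number[], d: number[]): number {
  let dot = 0, nq = 0, nd = 0;
  for (let j = 0; j < q.length; j++) {
    dot += q[j] * d[j];
    nq += q[j] * q[j];
    nd += d[j] * d[j];
  }
  return dot / (Math.sqrt(nq) * Math.sqrt(nd) || 1);
}

// Brute-force top-k retrieval; a real deployment would use an ANN index.
function topK(query: number[], docs: { id: string; emb: number[] }[], k: number): string[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineScore(query, d.emb) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((d) => d.id);
}
```

At millions of documents the linear scan would be replaced by an approximate-nearest-neighbor index, but the scoring function is the same.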


3. Engineering Architecture: The "Tri-Layer" Distributed System

We architected Neuro Notes as a distributed neural system designed for resilience and low latency.

Layer 1: The Ingestion Layer (The Ear)

Technology: "Grey Hat" Chrome Extension (Manifest V3)
Core Logic: DOM Mutation Observation

Instead of processing raw audio (high latency, privacy risk), we scrape the DOM. We attach a MutationObserver to the caption container with a specific config:

const config = { childList: true, subtree: true, characterData: true };

To handle the continuous stream, we model the capture as a Discrete Sampling Function: $$S_{captured}(t) = \int_{t}^{t+\Delta t} \sum \delta(t - t_{change}) \cdot \text{DOM}_{text} \, dt$$

  • $\Delta t$: The sampling interval, chosen so we capture rapid speech without blocking the browser's main thread.
  • Deduplication: We implement a Rolling Hash Algorithm (Rabin-Karp) on incoming strings to instantly reject duplicate caption frames sent by Google Meet.
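The deduplication step can be sketched as a polynomial hash over each incoming caption frame plus a set of recently seen hashes. The base/modulus constants are typical Rabin-Karp choices, not necessarily the ones used in production:

```typescript
// Rabin-Karp style polynomial hash over a caption frame.
// BASE/MOD are standard illustrative constants; values stay below 2^53,
// so plain number arithmetic remains exact.
const BASE = 31;
const MOD = 1_000_000_007;

function rabinKarpHash(s: string): number {
  let h = 0;
  for (const ch of s) {
    h = (h * BASE + ch.codePointAt(0)!) % MOD;
  }
  return h;
}

const seen = new Set<number>();

// Returns true if the frame is new (and records it), false for a duplicate
// caption frame re-sent by Google Meet.
function acceptFrame(frame: string): boolean {
  const h = rabinKarpHash(frame);
  if (seen.has(h)) return false;
  seen.add(h);
  return true;
}
```

In practice the `seen` set would be bounded (e.g., a small ring buffer), since Meet only re-sends recent frames.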

Layer 2: The Cognitive Core (The Brain)

Technology: Node.js Cluster + Gemini 1.5 Flash
Core Logic: Sliding Window Segmentation with Context Carry-Over.

To solve the "Boundary Problem" (sentences cut between batches), we define our processing window with a Context Overlap $\epsilon$: $$W_n = [\, t_{start} - \epsilon,\ t_{end} \,]$$

  • $\epsilon = 10\text{s}$: The last 10 seconds of Batch $n-1$ are prepended to Batch $n$.
  • Reasoning Pipeline: The LLM does not just "summarize." It executes a strictly typed extraction: $$f_{LLM}: \text{Text}_{raw} \xrightarrow{\text{Schema Validation}} \{\, \text{Actions}: \text{Array},\ \text{Decisions}: \text{Array} \,\}$$
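The windowing with carry-over can be sketched directly from the definition of $W_n$. The caption record shape and timestamps are illustrative:

```typescript
// Sliding window W_n = [tStart - epsilon, tEnd] over a caption stream,
// so a sentence cut at a batch boundary still appears whole in the next batch.
type Caption = { t: number; text: string }; // t = seconds since meeting start

const EPSILON = 10; // seconds of context carried over from the previous batch

function buildWindow(captions: Caption[], tStart: number, tEnd: number): string {
  return captions
    .filter((c) => c.t >= tStart - EPSILON && c.t <= tEnd)
    .map((c) => c.text)
    .join(" ");
}
```

A caption spoken at $t = 55$s lands in both the $[0, 60]$ batch and the $[60, 120]$ batch, which is exactly the redundancy the Boundary Problem requires.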

Layer 3: The Synchronization Layer (The Hand)

Technology: Socket.io (Transport) + Firebase Firestore (Persistence) + n8n (Effectors)

| Component | Responsibility | Latency Budget |
| --- | --- | --- |
| Broadcaster | Pushes JSON diffs to client (Socket.io) | |
| Ledger | Persists state to Firestore (Atomic Writes) | |
| Effector | n8n Webhook dispatch to Jira/Slack | Async |

4. Challenges & Algorithmic Solutions

Challenge A: The "Hallucination" of Decisions

LLMs often interpret casual suggestions as firm decisions. Solution: The Consensus Verification Algorithm. We treat decision extraction as a probabilistic classification problem. A decision is only committed to the database if the Conditional Probability of Agreement given the Context exceeds a high threshold: $$P(\text{Agreement} \mid \text{Context}) > \tau_{consensus} \quad (\text{where } \tau_{consensus} = 0.85)$$

  • We enforce this via Chain-of-Thought Prompting: The model must identify explicit linguistic markers (e.g., "Agreed," "Let's lock that in," "No objections") before calculating the probability score.
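The gate can be sketched as follows. The probability function here is a toy stand-in for the LLM's chain-of-thought score; the base prior, the per-marker boost, and the marker list are assumptions for illustration:

```typescript
// Consensus gate: commit a decision only when explicit agreement markers
// push P(Agreement | Context) above tau. The scoring is a toy stand-in
// for the LLM's chain-of-thought probability estimate.
const TAU_CONSENSUS = 0.85;
const AGREEMENT_MARKERS = ["agreed", "let's lock that in", "no objections"];

// Toy probability: a 0.5 prior plus a boost per explicit marker found.
function agreementProbability(context: string): number {
  const lower = context.toLowerCase();
  const hits = AGREEMENT_MARKERS.filter((m) => lower.includes(m)).length;
  return Math.min(1, 0.5 + 0.4 * hits);
}

function commitDecision(context: string): boolean {
  return agreementProbability(context) > TAU_CONSENSUS;
}
```

A casual "maybe we could try React?" never clears $\tau_{consensus}$, while "Agreed, let's lock that in" does, which is the hallucination filter in miniature.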

Challenge B: The Latency vs. Context Trade-off

Real-time AI is a fight between speed and intelligence.

  • Small Buffer: Fast updates, but the AI lacks context (misses "it" references).
  • Large Buffer: Deep context, but the UI lags behind reality.
  • Solution: We found the optimum at a 60-second buffer. This creates a "Reasoning Pulse" every minute that feels natural to the user while providing enough tokens for the LLM to reason correctly.

5. Accomplishments: Quantifiable Success

1. The "Money Clock" (Behavioral Engineering)

We implemented a real-time integral calculus function to visualize the Burn Rate of the meeting. This is not just a metric; it is a psychological nudge to reduce meeting bloat. $$Cost(T) = \int_{0}^{T} \left( \sum_{i=1}^{N} \frac{\sigma_i}{\Omega_{annual}} \right) dt$$

  • $N$: Number of attendees (detected via the Extension).
  • $\sigma_i$: Estimated annual salary of attendee $i$.
  • $\Omega_{annual}$: Standard working hours per year (2,080).
  • Result: Test groups reduced meeting duration by 18% when the ticker was visible.
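Discretized per second, the integral reduces to a constant burn rate times elapsed time. A minimal sketch, with illustrative salary figures (the real extension estimates these):

```typescript
// Money Clock: per-second burn rate is the sum of each attendee's hourly
// rate (annual salary sigma_i / 2080 h) divided by 3600 s/h.
const OMEGA_ANNUAL_HOURS = 2080;

// Cost(T) for T elapsed seconds, given estimated annual salaries sigma_i.
function meetingCost(salaries: number[], elapsedSeconds: number): number {
  const perSecond = salaries.reduce(
    (sum, sigma) => sum + sigma / OMEGA_ANNUAL_HOURS / 3600,
    0
  );
  return perSecond * elapsedSeconds;
}
```

A single attendee on a $208,000 salary burns $100/hour, so the ticker visibly climbs every few seconds even in a small meeting.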

2. End-to-End Latency

By decoupling the Ingestion (Client) from Processing (Server), we achieved an Event-to-Insight Latency of < 200 ms for live transcription updates and < 2 s for AI-generated insights.


6. What's Next: The "Organizational Knowledge Graph"

We are moving beyond single-meeting intelligence to a Graph-Based Memory System. We formally define the organization as a Knowledge Graph $G = (V, E)$, where:

  • Nodes ($V$): Entities (Projects, People, Decisions, Dates).
  • Edges ($E$): Semantic Relationships ("Owned By," "Due On," "Blocked By").

The Future Algorithm: If "Project Alpha" is discussed in Meeting $A$ and Meeting $B$, Neuro Notes will infer a relationship edge:

$$E(A, B) = f_{\text{link}}(\text{Entity}_{\text{Alpha}}, \text{Context}_A, \text{Context}_B)$$

This creates a self-healing, searchable corporate brain where you can query: "Show me the decision timeline for Project Alpha across all Q3 meetings."
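Since this layer is still planned, here is only a hedged sketch of the data shapes and the $f_{\text{link}}$ inference: the node/edge types and the linking rule are assumptions about how it could work, not the shipped design.

```typescript
// Hypothetical knowledge-graph shapes for the planned memory system.
type NodeKind = "Project" | "Person" | "Decision" | "Date";
type GraphNode = { id: string; kind: NodeKind; meetings: string[] };
type GraphEdge = { from: string; to: string; rel: string };

// f_link sketch: if the same entity appears in two meetings, infer an edge
// between those meetings labeled with the shared entity.
function inferEdges(nodes: GraphNode[]): GraphEdge[] {
  const edges: GraphEdge[] = [];
  for (const n of nodes) {
    for (let i = 0; i < n.meetings.length; i++) {
      for (let j = i + 1; j < n.meetings.length; j++) {
        edges.push({ from: n.meetings[i], to: n.meetings[j], rel: `Shares:${n.id}` });
      }
    }
  }
  return edges;
}
```

A timeline query like "decision timeline for Project Alpha across Q3" then becomes a walk over the edges labeled with that entity, ordered by meeting date.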
