Name: Second Brain AI


Inspiration

In today's fast-paced digital world, we are constantly bombarded with information: meetings, tutorials, brainstorming sessions, and online conferences. However, the brightest ideas and most crucial actions often vanish into thin air as soon as the window closes. Traditional note-taking tools are manual and often interrupt the flow of thought. We were inspired by the concept of a "Second Brain"—a system for organizing and connecting knowledge—and asked ourselves: What if we could build one that operated autonomously? What if an AI could be our ever-present companion, observing, listening, and weaving a web of knowledge in real time without us having to lift a finger?

What it does

Second Brain AI transforms your computer into an intelligent knowledge partner. It's a web application that uses the Gemini Live API to:

  1. See and Hear Your Screen: When you start a session, you share your screen and microphone. The AI processes the visual and auditory context simultaneously.

  2. Converse in Real Time: You can talk to the AI, ask it questions about what's on the screen, and it will respond with low-latency audio, creating a fluid and natural conversation.

  3. Automatically Extract Ideas: As you work and speak, the AI analyzes the conversation and screen content to identify and extract key "ideas": facts, actions, questions, and memories.

  4. Build a Dynamic Knowledge Graph: This is where the magic happens! The AI not only extracts ideas but also understands the relationships between them. These connections are instantly visualized in a stunning, interactive 3D knowledge graph that grows and evolves throughout your session.

  5. Create a Persistent Memory: Each session is saved, including the full transcript, all extracted ideas, and an AI-generated summary. Over time, you build a complete and searchable archive of your work and thoughts.

  6. Unify Knowledge: The application can merge the graphs of all past sessions into a "Global Graph," revealing unexpected connections between topics from different days or projects.

It is, in essence, an automated system that builds your personal knowledge base while you simply do your work.
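To make the "Global Graph" idea concrete, here is a minimal sketch of how per-session graphs could be merged. The `SessionGraph` shape, the `mergeSessionGraphs` name, and the rule of treating ideas with the same normalized text as one node are all assumptions made for illustration, not a description of the project's actual data model.

```typescript
// Hypothetical shapes: the project's real session format is not shown here.
type IdeaKind = "fact" | "action" | "question" | "memory";
interface IdeaNode { id: string; text: string; kind: IdeaKind; }
interface IdeaLink { source: string; target: string; }
interface SessionGraph { sessionId: string; nodes: IdeaNode[]; links: IdeaLink[]; }
interface GlobalGraph { nodes: IdeaNode[]; links: IdeaLink[]; }

// Normalize idea text so "Ship the demo" and " ship the demo " count as one idea.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

function mergeSessionGraphs(sessions: SessionGraph[]): GlobalGraph {
  const canonicalByText = new Map<string, IdeaNode>(); // normalized text -> first node seen
  const links: IdeaLink[] = [];
  const seenLinks = new Set<string>();

  for (const session of sessions) {
    // Maps each session-local id to the id of the canonical global node.
    const remap = new Map<string, string>();
    for (const node of session.nodes) {
      const key = normalize(node.text);
      let canonical = canonicalByText.get(key);
      if (!canonical) {
        // Prefix with the session id so ids from different sessions cannot collide.
        canonical = { id: `${session.sessionId}:${node.id}`, text: node.text, kind: node.kind };
        canonicalByText.set(key, canonical);
      }
      remap.set(node.id, canonical.id);
    }
    for (const link of session.links) {
      const source = remap.get(link.source);
      const target = remap.get(link.target);
      // Skip links that point at unknown nodes or collapse into self-loops.
      if (!source || !target || source === target) continue;
      const linkKey = `${source}->${target}`;
      if (seenLinks.has(linkKey)) continue;
      seenLinks.add(linkKey);
      links.push({ source, target });
    }
  }
  return { nodes: Array.from(canonicalByText.values()), links };
}
```

Under these assumptions, an idea that recurs across sessions becomes a single shared node, which is what lets links from different days meet in one place. Exact text matching is the simplest possible rule; a real system might prefer a fuzzier similarity measure.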

How we built it

  • Frontend: We built a polished and responsive user interface with React, TypeScript, and Tailwind CSS, focusing on an intuitive user experience.

  • Real-Time Conversational AI: The core of the application is the Gemini Live API (gemini-2.5-flash-native-audio-preview-09-2025). We handle audio streams (input and output) and image streams (from screenshots) to enable real-time multimodal interaction.

  • Intelligent Idea Extraction: We use Gemini 2.5 Flash with a defined JSON response schema (responseSchema). This forces the AI to return ideas in a structured format and, more importantly, to identify and list the relationships between new and existing ideas, which directly feeds the graph.

  • Data Visualization: The 3D knowledge graph is rendered using react-force-graph-3d, leveraging Three.js for a smooth and interactive experience.

  • Persistence: In the current prototype, we use the browser's localStorage API for rapid development, allowing users to save and reload sessions.
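The path from structured extraction to the graph can be sketched as follows. This is an illustration under stated assumptions: the field names (`ideas`, `relationships`, `from`, `to`) and the helpers `parseExtraction` and `toGraphData` are hypothetical, since the actual responseSchema is not reproduced here. The output uses the `{ nodes, links }` shape with `id`, `source`, and `target` fields that react-force-graph-3d accepts as `graphData` by default.

```typescript
// Hypothetical response shape; the real responseSchema may use different field names.
type IdeaKind = "fact" | "action" | "question" | "memory";
interface Idea { id: string; kind: IdeaKind; text: string; }
interface Relationship { from: string; to: string; }
interface Extraction { ideas: Idea[]; relationships: Relationship[]; }
interface GraphData {
  nodes: { id: string; name: string; group: IdeaKind }[];
  links: { source: string; target: string }[];
}

const KINDS: IdeaKind[] = ["fact", "action", "question", "memory"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Parse the model's JSON text and keep only entries that match the expected shape.
function parseExtraction(jsonText: string): Extraction {
  const raw: unknown = JSON.parse(jsonText);
  if (!isRecord(raw)) throw new Error("Response is not a JSON object");
  const rawIdeas = raw["ideas"];
  const rawRelationships = raw["relationships"];
  if (!Array.isArray(rawIdeas) || !Array.isArray(rawRelationships)) {
    throw new Error("Response is missing the ideas or relationships array");
  }
  const ideas: Idea[] = [];
  for (const item of rawIdeas) {
    if (!isRecord(item)) continue;
    const id = item["id"];
    const kind = item["kind"];
    const text = item["text"];
    if (typeof id !== "string" || typeof text !== "string" || typeof kind !== "string") continue;
    if (KINDS.indexOf(kind as IdeaKind) === -1) continue; // unknown category: drop it
    ideas.push({ id, kind: kind as IdeaKind, text });
  }
  const relationships: Relationship[] = [];
  for (const item of rawRelationships) {
    if (!isRecord(item)) continue;
    const from = item["from"];
    const to = item["to"];
    if (typeof from === "string" && typeof to === "string") relationships.push({ from, to });
  }
  return { ideas, relationships };
}

// Build graph data from all known ideas, dropping links whose endpoints are unknown.
function toGraphData(ideas: Idea[], relationships: Relationship[]): GraphData {
  const knownIds = new Set(ideas.map((idea) => idea.id));
  return {
    nodes: ideas.map((idea) => ({ id: idea.id, name: idea.text, group: idea.kind })),
    links: relationships
      .filter((rel) => knownIds.has(rel.from) && knownIds.has(rel.to))
      .map((rel) => ({ source: rel.from, target: rel.to })),
  };
}
```

Filtering dangling links before rendering is a defensive choice: even with a response schema, the model may reference an idea id that was never returned.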

Architecture with Google Cloud Run in Mind: While the current prototype operates on the client side, it is designed to scale with a serverless backend on Google Cloud Run. This backend would handle:

  • Secure Data Storage: Migrating persistence from localStorage to a scalable database such as Firestore or Cloud SQL.

  • User Authentication: Managing secure user accounts.

  • API Key Management: Centralizing and securing the Gemini API key instead of exposing it on the client.

  • Asynchronous Processing: Running heavier tasks, such as re-analyzing old sessions or generating complex summaries, as Cloud Run services.

Challenges We Ran Into

  1. Multimodal Stream Synchronization: Managing microphone audio input, screen video frames, AI audio output, and real-time transcription updates without creating latency or race conditions was a significant engineering challenge.

  2. Generating Reliable Connections: "Convincing" the AI not only to extract ideas but also to consistently identify and return the connections between them proved difficult.
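One common way to tame the race conditions described in the first challenge is a "latest wins" guard on asynchronous updates. The sketch below is a generic illustration, not the synchronization logic the prototype actually uses. It assumes each transcription update is stamped with an increasing sequence number by whatever code produces it; `TranscriptUpdate` and `LatestWinsTranscript` are hypothetical names.

```typescript
// Hypothetical update shape: seq is assumed to be assigned by the producer in send order.
interface TranscriptUpdate {
  seq: number;
  text: string;
}

// Keeps only the newest transcript text, ignoring updates that arrive out of order.
class LatestWinsTranscript {
  private lastSeq = -1;
  private current = "";

  // Returns true if the update was applied, false if it was stale and dropped.
  apply(update: TranscriptUpdate): boolean {
    if (update.seq <= this.lastSeq) return false;
    this.lastSeq = update.seq;
    this.current = update.text;
    return true;
  }

  value(): string {
    return this.current;
  }
}
```

With this guard, a slow response that resolves after a newer one cannot overwrite the text already on screen.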
