Inspiration
We all do it. You're scrolling TikTok and a chef breaks down the perfect carbonara technique. You save it. An Instagram reel recommends five hidden restaurants in Paris. You save it. A LinkedIn post explains a fundraising strategy that could change your startup. You save it.
You never go back to any of them.
We realised the problem isn't access to knowledge; it's that the platforms designed to deliver it are built for consumption, not retention. Your saved folder is a graveyard. The best insights in the world are shared in 60-second clips and forgotten in 60 minutes.
BrickNodes started from a simple frustration: why is there no infrastructure to make the content we consume actually stick?
We wanted to build the infrastructure that sits between consumption and knowledge: something that captures what you save, extracts what matters, and gives it back to you in a form you can actually work with.
What It Does
BrickNodes is an AI-powered knowledge base for the content you already consume.
One tap shares a TikTok, Instagram reel, carousel, or LinkedIn post from any app to BrickNodes. We handle the rest:
- Capture: Detect the platform and content type automatically (see the sketch below)
- Process: Gemini 2.0 Flash-Lite transcribes video, OCRs carousels, and extracts key learnings, all in a single API call
- Organise: Content is structured into a searchable personal library with semantic tags and categories
- Synthesise: Select multiple pieces of content on the same topic and generate AI-written guides with citations back to the original sources, or structured lists (restaurants, books, tools) pulled directly from the content
- Query: Spin up RAG-powered AI agents grounded in the content you've curated: no hallucination from general training data, just your knowledge
The core loop: save → capture → synthesise → retrieve. Your saved content becomes a living knowledge base that compounds as you use it.
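As a minimal sketch of the capture step (illustrative only, not the production parser), the app can classify a shared link by platform and content type before anything is sent to the backend:

```typescript
// Rough classification of a shared URL; the hostname and path rules are assumptions,
// and the backend later confirms the exact content type from the post metadata.
type Platform = "tiktok" | "instagram" | "linkedin" | "unknown";
type ContentType = "video" | "reel" | "carousel" | "post" | "unknown";

export function classifySharedUrl(url: string): { platform: Platform; contentType: ContentType } {
  if (url.includes("tiktok.com")) return { platform: "tiktok", contentType: "video" };
  if (url.includes("instagram.com")) {
    // Reels live under /reel/; single posts and carousels both use /p/.
    return { platform: "instagram", contentType: url.includes("/reel/") ? "reel" : "post" };
  }
  if (url.includes("linkedin.com")) return { platform: "linkedin", contentType: "post" };
  return { platform: "unknown", contentType: "unknown" };
}
```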
How We Built It
Mobile App: React Native (Expo) with an iOS Share Extension for native one-tap capture from any app, built with Antigravity IDE for agentic coding.
Backend: Next.js API routes handling all third-party API calls through a proxy layer — no client-exposed keys. Authentication via Google OAuth with Firebase Auth.
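To illustrate the proxy layer (a hedged sketch: the route name, upstream endpoint, and env var names are hypothetical), a Next.js API route keeps every key server-side while the mobile client only ever talks to our backend:

```typescript
// pages/api/capture.ts — sketch of the proxy idea: the client sends a URL, the
// server calls the third-party metadata API using keys from server env vars.
import type { NextApiRequest, NextApiResponse } from "next";

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method !== "POST") return res.status(405).end();

  // In the real app the Firebase Auth ID token would be verified here first.
  const { url } = req.body as { url?: string };
  if (!url) return res.status(400).json({ error: "Missing url" });

  // Hypothetical upstream endpoint; the actual metadata provider differs.
  const upstream = await fetch("https://metadata-provider.example.com/v1/resolve", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.METADATA_API_KEY}`, // never shipped to the client
    },
    body: JSON.stringify({ url }),
  });

  if (!upstream.ok) return res.status(502).json({ error: "Upstream request failed" });
  return res.status(200).json(await upstream.json());
}
```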
AI Pipeline:
- Gemini 2.0 Flash-Lite for extraction — transcription, key learnings, tags, and key moments in a single call (see the sketch after this list)
- Gemini 2.0 Flash for synthesis — guide generation, list creation, and agent responses
- Gemini text-embedding-004 for vector embeddings enabling semantic search and RAG retrieval
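To make the extraction step concrete, here is a minimal sketch using the @google/generative-ai Node SDK. The model name comes from this write-up; the schema fields and prompt are illustrative, and only the text/caption path is shown (media goes into the same call as additional parts):

```typescript
// Single-call extraction sketch: ask Gemini 2.0 Flash-Lite for JSON containing
// the transcript, key learnings, tags, and key moments in one request.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

interface Extraction {
  transcript: string;
  keyLearnings: string[];
  tags: string[];
  keyMoments: { timestamp: string; summary: string }[];
}

export async function extractFromPost(caption: string): Promise<Extraction> {
  const model = genAI.getGenerativeModel({
    model: "gemini-2.0-flash-lite",
    generationConfig: { responseMimeType: "application/json" }, // request JSON directly
  });

  const prompt = `You are an extraction engine. Return JSON with "transcript",
"keyLearnings", "tags", and "keyMoments" ({timestamp, summary}) for this post.
Caption: ${caption}`;

  const result = await model.generateContent(prompt);
  return JSON.parse(result.response.text()) as Extraction;
}
```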
Observability (Opik):
Every LLM call is traced through Opik with full input/output logging, cost tracking, and metadata. We built an LLM-as-judge evaluation pipeline that scores every extraction on four dimensions: tag relevance, key moment accuracy, extraction completeness, and overall quality. Prompts are versioned (v1/v2/v3) and linked to experiments through Opik's native evaluate() API, with fixed datasets enabling direct comparison across prompt versions.
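The judge itself is conceptually simple. This is a simplified sketch of the idea (not Opik's evaluate() API itself): a second Gemini call grades an extraction on the four dimensions, and the scores are then attached to the trace:

```typescript
// LLM-as-judge sketch: score an extraction against its source on four 0-1 dimensions.
// The prompt wording and score names are illustrative.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

interface JudgeScores {
  tagRelevance: number;
  keyMomentAccuracy: number;
  extractionCompleteness: number;
  overallQuality: number;
}

export async function judgeExtraction(source: string, extractionJson: string): Promise<JudgeScores> {
  const judge = genAI.getGenerativeModel({
    model: "gemini-2.0-flash",
    generationConfig: { responseMimeType: "application/json" },
  });

  const prompt = `Score this extraction against its source. Return only JSON with
"tagRelevance", "keyMomentAccuracy", "extractionCompleteness", "overallQuality",
each between 0 and 1.
SOURCE:
${source}
EXTRACTION:
${extractionJson}`;

  const result = await judge.generateContent(prompt);
  return JSON.parse(result.response.text()) as JudgeScores;
}
```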
Data & Storage: Firebase (Firestore) for structured data, vector storage for embeddings.
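As a rough sketch of that storage path (collection names and keeping the vector on the same document are assumptions for illustration), each processed item is embedded with text-embedding-004 and written to Firestore:

```typescript
// Embed the extracted learnings and persist the item; at query time, stored
// vectors are ranked by cosine similarity for semantic search and RAG retrieval.
import { initializeApp } from "firebase/app";
import { addDoc, collection, getFirestore, serverTimestamp } from "firebase/firestore";
import { GoogleGenerativeAI } from "@google/generative-ai";

const app = initializeApp({ projectId: "bricknodes-demo" }); // placeholder config
const db = getFirestore(app);
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

export async function saveItem(userId: string, sourceUrl: string, keyLearnings: string[], tags: string[]) {
  const embedder = genAI.getGenerativeModel({ model: "text-embedding-004" });
  const { embedding } = await embedder.embedContent(keyLearnings.join("\n"));

  await addDoc(collection(db, "users", userId, "items"), {
    sourceUrl,
    keyLearnings,
    tags,
    embedding: embedding.values, // vector used for retrieval
    createdAt: serverTimestamp(),
  });
}
```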
Challenges We Faced
Shipping a frictionless product. Our main challenge at the start was eliminating friction as much as possible, given the variety of sources content can come from. At first we used three different third-party APIs to get a post's metadata, transcript, and key moments. This was not only costly but also slow and unreliable. Eventually we reduced it to a single third-party API for the content's metadata, plus Gemini 2.0 Flash-Lite for OCR and key learnings.
Carousel processing speed. Our initial pipeline made 10 parallel Gemini OCR calls per carousel, which immediately hit rate limits and took 20 seconds instead of 4. We solved this by batching all images into a single Gemini call with the post caption, dropping to one API call and ~3 seconds.
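A hedged sketch of that batched call (the prompt wording is illustrative, and the real pipeline also requests structured output):

```typescript
// All carousel images go into one generateContent request alongside the caption,
// replacing the earlier one-OCR-call-per-image approach.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

export async function extractCarousel(caption: string, imagesBase64: string[]) {
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash-lite" });

  const result = await model.generateContent([
    `OCR every slide of this carousel and summarise the key learnings. Caption: ${caption}`,
    ...imagesBase64.map((data) => ({ inlineData: { mimeType: "image/jpeg", data } })),
  ]);

  return result.response.text(); // one call, one response for the whole carousel
}
```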
Prompt engineering for structured output. Getting Gemini to consistently return valid JSON with the right schema across wildly different content types (a 15-second dance tutorial vs a 3-minute coding walkthrough vs a 10-image travel carousel) required three prompt iterations. Opik's experiment framework was invaluable here: we could run the same dataset against v1, v2, and v3 and see exactly where each version improved or regressed.
What We Learned
- Curation is the bottleneck, not content. This is the insight that drove the entire product. Everyone has access to the same information, but not all of it is useful; the differentiator is who can retain and organise it.
- Evaluation isn't optional. Without Opik scoring every extraction, we had no idea if prompt changes were actually improvements. LLM-as-judge turned prompt engineering from vibes into data.
What's Next
- Specialist agents built from curated content. No more "assume you're an expert in..." prompts; you know exactly where the expertise comes from.
- Fine-tuning pipeline. Every extraction scored by the LLM judge creates graded training data, and we're building towards QLoRA fine-tuning of smaller models for lower cost, faster extraction, and better quality. The product improves itself.
Built With
- ensembledata
- expo-router
- firebase
- gemini
- k6
- opik
- react-native
- tanstack