Inspiration
Our inspiration was the frustration of fragmented knowledge management. Developers, students, and analysts often keep documents, code files, and images in separate tools, so nothing shares context. Traditional keyword search matches terms but misses concepts and intent. We built PRISM to address this by unifying all of that data in a single reasoning space, with the aim of improving knowledge accessibility and productivity by up to 65%.
What it does
PRISM transforms chaotic digital content into a queryable knowledge hub. It indexes 46+ file types (documents, 35+ code languages, and images) and allows users to query them via a dual-mode RAG chat. It provides context-aware answers with source citations, visualizes semantic relationships via an interactive vector graph, and offers smart recommendations to discover hidden cross-modal connections between documents, code, and images.
How we built it
The application is built on a custom, high-performance RAG pipeline integrating three core services: Qdrant Cloud stores the vector index; Google Gemini AI handles embeddings (text-embedding-004), chat (gemini-2.5-flash), and image vision; and Appwrite Cloud manages authentication, storage, and chat history. We engineered a proprietary pipeline named UVAMP (Unified Vector and Multimodal Processor) to handle document chunking, code parsing, and image description generation.
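To illustrate the document-chunking stage, here is a minimal sketch that splits text into overlapping word windows before embedding. It is not PRISM's actual UVAMP code: the function name chunk_text, the word-based splitting, and the chunk_size and overlap defaults are assumptions chosen for the example, and Python is used only for brevity.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping windows of words.

    chunk_size and overlap are hypothetical defaults, not values from PRISM.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is made purely of overlap.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_text(sample, chunk_size=4, overlap=1)
```

The overlap means neighbouring chunks share a few words, which helps a sentence that straddles a boundary stay retrievable from either side.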
Challenges we ran into
The primary challenge was achieving high RAG Groundedness across three disparate modalities. Specifically:
Code Similarity: Ensuring semantic code search (finding similar functions, not just keywords) worked across 35+ languages.
Multimodal Unification: Consistently embedding and retrieving text, code, and Gemini-generated image descriptions within the same 768-dimensional vector space.
Real-time Synchronization: Building an event-driven architecture to keep Appwrite storage statistics, document counts, and Qdrant vector status synchronized.
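One way to picture the multimodal unification challenge: each modality is first reduced to text (a document chunk, source code, or a generated image description) and then embedded by the same model, so every vector lands in the same 768-dimensional space. The sketch below assumes that approach. Item, to_embeddable_text, the bracketed tags, and placeholder_embed are hypothetical names for illustration; placeholder_embed is a deterministic stand-in with no semantic meaning, not the Gemini embedding call.

```python
import hashlib
from dataclasses import dataclass

EMBEDDING_DIM = 768  # dimensionality stated in this write-up


@dataclass
class Item:
    modality: str       # "document", "code", or "image"
    content: str        # chunk text, source code, or an image description
    language: str = ""  # only meaningful for code


def to_embeddable_text(item: Item) -> str:
    """Reduce any modality to plain text so one embedding model can handle it."""
    if item.modality == "code":
        return f"[code:{item.language}]\n{item.content}"
    if item.modality == "image":
        return f"[image description]\n{item.content}"
    if item.modality == "document":
        return item.content
    raise ValueError(f"unknown modality: {item.modality}")


def placeholder_embed(text: str) -> list[float]:
    """Stand-in for a real embedding call: deterministic, NOT semantic."""
    values: list[float] = []
    counter = 0
    while len(values) < EMBEDDING_DIM:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(byte / 255.0 for byte in digest)
        counter += 1
    return values[:EMBEDDING_DIM]


items = [
    Item("document", "Quarterly report text"),
    Item("code", "def add(a, b): return a + b", language="python"),
    Item("image", "A bar chart of monthly revenue"),
]
vectors = [placeholder_embed(to_embeddable_text(item)) for item in items]
```

Because all three items pass through the same text-to-vector step, a single similarity search can return documents, code, and images side by side.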
Accomplishments that we're proud of
Achieving 89% RAG Groundedness validated by a strict 0.5 semantic retrieval threshold.
Successfully building a True Multimodal RAG system that unifies documents, code, and images.
Developing the innovative Vector Insights Dashboard featuring a force-directed graph with UMAP dimensionality reduction for 768D vectors.
Shipping a mobile-optimized, secure application with a professional dark-themed UI.
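A minimal sketch of how a retrieval threshold like the 0.5 value above can work: score every indexed vector against the query and keep only those at or above the cutoff. It assumes the threshold applies to cosine similarity, which this write-up does not state. The retrieve function, the toy 2D vectors, and the top_k default are hypothetical. Qdrant's search API does offer a score_threshold parameter, though whether PRISM relies on it is also an assumption.

```python
import math

SCORE_THRESHOLD = 0.5  # threshold value quoted in this write-up


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, indexed, threshold=SCORE_THRESHOLD, top_k=5):
    """Return up to top_k (score, source) pairs scoring at least `threshold`."""
    scored = [(cosine_similarity(query_vec, vec), source)
              for source, vec in indexed.items()]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy 2D vectors stand in for real 768D embeddings.
indexed = {"a.md": [1.0, 0.0], "b.py": [1.0, 1.0], "c.png": [0.0, 1.0]}
results = retrieve([1.0, 0.0], indexed)
```

Chunks below the cutoff never reach the chat model, so an answer can only cite sources that were reasonably close to the question.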
What we learned
We gained deep expertise in:
Using metadata filtering (userId, documentType) in Qdrant payloads to enforce security and user isolation.
Advanced prompt engineering and response structuring to enable multimodal cross-referencing within the Gemini RAG workflow.
Using UMAP to represent complex, high-dimensional vector spaces in a meaningful 2D format.
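The metadata-filtering lesson can be sketched as follows. The filter dictionary uses Qdrant's JSON filter shape (a "must" list of key/match conditions) with the userId and documentType payload fields named above. The helper names build_user_filter and payload_matches are hypothetical, and payload_matches only mimics, locally, how this small subset of filter conditions behaves; it is not part of any Qdrant client.

```python
from typing import Any, Optional


def build_user_filter(user_id: str,
                      document_type: Optional[str] = None) -> dict[str, Any]:
    """Build a Qdrant-style filter that always pins results to one user."""
    conditions = [{"key": "userId", "match": {"value": user_id}}]
    if document_type is not None:
        conditions.append(
            {"key": "documentType", "match": {"value": document_type}}
        )
    return {"must": conditions}


def payload_matches(payload: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Local illustration: every `must` condition has to match exactly."""
    return all(
        payload.get(cond["key"]) == cond["match"]["value"]
        for cond in flt["must"]
    )


# Example payloads with made-up user IDs.
points = [
    {"userId": "user-1", "documentType": "code"},
    {"userId": "user-1", "documentType": "image"},
    {"userId": "user-2", "documentType": "code"},
]
user_filter = build_user_filter("user-1", document_type="code")
visible = [p for p in points if payload_matches(p, user_filter)]
```

Attaching the userId condition on every query, rather than filtering results afterwards, means another user's vectors are never candidates in the first place.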
What's next for PRISM
3D Vector Visualization: Explore documents, code, and images in immersive 3D space.
Real-time Collaboration Tools: Share files, insights, and recommendations with teams in real time.
Support for YouTube Indexing: Integrate YouTube URLs to index transcripts, chapters, and visual metadata for semantic search.
Native Mobile App: Access your knowledge base on the go with dedicated iOS and Android apps.
Built With
- appwrite
- gemini
- nextjs
- qdrant
- tailwindcss
- uvamp-custom-pipeline