Inspiration

We wanted to build a PDF report advisor because current LLMs are inefficient at navigating and querying long, structured documents. With a RAG pipeline, we can index a document once and retrieve only the chunks most relevant to each question, reducing latency and cost.

What it does

  • Upload & extract: Users POST a PDF; we parse every page with PyMuPDF and combine the text.
  • RAG backend: We split the text into chunks, embed them with HuggingFace embeddings, and store/retrieve vectors in ChromaDB. On a question, we retrieve the top‑n relevant passages and feed those with the query into Google Gemini.
  • Stateless chat API: Two endpoints (/upload-pdf, /chat) that handle file ingestion, retrieval, and LLM querying.
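The retrieve-then-generate flow can be sketched in miniature. This is a hypothetical in-memory version: the real backend uses ChromaDB with HuggingFace embeddings, while the toy `embed` below just counts a few keywords so the example stays self-contained.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: keyword counts.
    # The actual app uses HuggingFace sentence embeddings.
    vocab = ["revenue", "risk", "growth", "cost"]
    return [text.lower().count(w) for w in vocab]

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 for zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_n(query, chunks, n=2):
    # Rank stored chunks by similarity to the query embedding and
    # keep the n best — these become the context for the LLM call.
    q = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:n]

chunks = [
    "Revenue grew 12% year over year.",
    "The board met in March.",
    "Cost of goods sold rose due to supply risk.",
]
print(top_n("What happened to revenue and growth?", chunks, n=1))
```

In the real pipeline, ChromaDB performs this similarity search over persisted vectors, so only the retrieved passages (not the whole report) are sent to Gemini.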

How we built it

  • Flask for routing and API endpoints.
  • Flask‑CORS to enable cross‑origin calls for a decoupled front-end.
  • PyMuPDF to extract raw text from PDFs.
  • LangChain orchestrating text splitting and retrieval logic.
  • HuggingFaceEmbeddings + ChromaDB for vector store and similarity search.
  • Google Gemini SDK (google.genai) as the LLM inference engine.
  • dotenv for managing environment variables securely.
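The splitting step can be sketched as fixed-size character windows with overlap. This is a simplified sketch: the real pipeline uses LangChain's `RecursiveCharacterTextSplitter`, which also tries to break on paragraph and sentence boundaries.

```python
def split_text(text, chunk_size=500, overlap=50):
    # Slide a fixed-size window over the text; consecutive chunks
    # share `overlap` characters so a sentence cut at a boundary
    # still appears whole in at least one chunk.
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = split_text("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```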

Challenges we ran into

  • Token limits: Feeding entire documents to the LLM crashed on longer reports; chunking and retrieval solved this.
  • State management: Balancing chat history growth vs. context freshness.
  • Error handling: Corrupted PDF pages and rate limits required robust exception handling.
  • Frontend integration: We implemented the full RAG pipeline on the backend but haven’t yet wired the React/Next.js UI to consume the /chat and retrieval endpoints.
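The per-page error handling can be sketched like this. The names are illustrative: `pages` is anything iterable yielding objects with a `get_text()` method, so in production it would be a PyMuPDF `fitz.Document`; here fake pages keep the example self-contained.

```python
def extract_text(pages):
    # Extract text page by page, catching exceptions per page so one
    # corrupted page does not abort ingestion of the whole PDF.
    parts, skipped = [], []
    for i, page in enumerate(pages):
        try:
            parts.append(page.get_text())
        except Exception:
            skipped.append(i)  # record the bad page index and move on
    return "\n".join(parts), skipped

class GoodPage:
    def get_text(self):
        return "fine text"

class BadPage:
    def get_text(self):
        raise RuntimeError("corrupted page stream")

text, skipped = extract_text([GoodPage(), BadPage(), GoodPage()])
print(skipped)  # index of the corrupted page
```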

Accomplishments that we’re proud of

  • RAG pipeline: Chunking, embedding, and similarity search fully functional, drastically cutting down context size.
  • End‑to‑end prototype: From file upload to LLM response in under 500 lines of clean, modular Python.
  • LLM integration: Smooth hot‑reload of API keys and prompt templates via .env.

What we learned

  • Efficient retrieval: How chunk size, overlap, and embedding model selection affect relevance.
  • Prompt engineering: Crafting retrieval‑augmented prompts improved answer accuracy.
  • Integration patterns: Decoupling the retrieval layer from the chat history, setting up CORS, and designing stateless APIs.
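A retrieval-augmented prompt of the kind described above can be sketched as follows; the template wording is illustrative, not our exact prompt.

```python
def build_prompt(question, passages):
    # Number the retrieved passages so the model can refer to them,
    # and instruct it to stay grounded in the supplied context.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the passages below. "
        "If the answer is not in them, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What drove cost increases?",
    ["Cost of goods sold rose due to supply risk."],
)
print(prompt)
```

Grounding instructions like the one above noticeably reduced off-context answers in our testing.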

What’s next for Prism

  • Frontend hookup: Build/extend our frontend to call our /upload-pdf and /chat endpoints, displaying retrieved passages and chat bubbles.
  • UX improvements: Highlight source passages in the PDF viewer for transparency.
  • Persistence: Add user authentication and save past reports/chat histories in a database.
  • Performance tuning: Experiment with alternative vector stores (FAISS) and embedding models for speed/cost trade‑offs.
