GraspEaser

Inspiration

Some information is simply hard to understand, so we set out to build a project that simplifies content. Later we added a chatbot that lets users chat with content supplied as a URL, plain text, a PDF, or a DOCX file.

What it does

GraspEaser is a content simplification tool that uses a Large Language Model (LLM) to process several input types: video, images, text, PDFs, DOCX files, and URLs. Videos and images are sent directly to the LLM for simplification, whereas the other input types are simplified with the help of Retrieval-Augmented Generation (RAG). When given a URL, GraspEaser first scrapes the webpage to extract its text. For every input other than video and images, it loads the content, splits it into chunks, generates embeddings, and stores the chunks in an in-memory vector store.

GraspEaser also features a chatbot that lets users ask questions about the uploaded content. The system retrieves the relevant chunks from the vector store and passes both the retrieved content and the user's prompt to the LLM, so that the generated response is grounded in the source material.
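The retrieval step described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not GraspEaser's actual code: `embed` is a toy word-count vector standing in for a real embedding model, and `InMemoryVectorStore` and `buildPrompt` are hypothetical names introduced only for this example. The sketch stops at building the prompt and leaves out the LLM call.

```typescript
// Toy stand-in for an embedding model: a sparse word-count vector.
// Assumption: a real pipeline would call an embedding model instead.
type SparseVector = { [word: string]: number };

function embed(text: string): SparseVector {
  // Object.create(null) avoids clashes with inherited keys such as "constructor".
  const vector: SparseVector = Object.create(null);
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const word of words) {
    vector[word] = (vector[word] || 0) + 1;
  }
  return vector;
}

// Cosine similarity between two sparse vectors (0 when either is empty).
function cosineSimilarity(a: SparseVector, b: SparseVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const word of Object.keys(a)) {
    const x = a[word] || 0;
    normA += x * x;
    dot += x * (b[word] || 0);
  }
  for (const word of Object.keys(b)) {
    const y = b[word] || 0;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface StoredChunk {
  text: string;
  embedding: SparseVector;
}

// Hypothetical in-memory store: keeps every chunk next to its vector.
class InMemoryVectorStore {
  private chunks: StoredChunk[] = [];

  add(texts: string[]): void {
    for (const text of texts) {
      this.chunks.push({ text: text, embedding: embed(text) });
    }
  }

  // Returns the k chunks most similar to the query, best match first.
  search(query: string, k: number): string[] {
    const queryVector = embed(query);
    return this.chunks
      .map((chunk) => ({
        text: chunk.text,
        score: cosineSimilarity(queryVector, chunk.embedding),
      }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((scored) => scored.text);
  }
}

// Combines the retrieved chunks and the user's question into one LLM prompt.
function buildPrompt(question: string, context: string[]): string {
  return (
    "Answer using only the context below.\n\n" +
    "Context:\n" + context.join("\n---\n") + "\n\n" +
    "Question: " + question
  );
}

// Usage: index three chunks, then retrieve the best match for a question.
const store = new InMemoryVectorStore();
store.add([
  "Photosynthesis converts sunlight into chemical energy in plants.",
  "The French Revolution began in 1789.",
  "Mitochondria produce energy for the cell.",
]);
const question = "When did the French Revolution begin?";
const topChunks = store.search(question, 1);
// topChunks is ["The French Revolution began in 1789."]
const ragPrompt = buildPrompt(question, topChunks);
```

Word counts only match exact words, so this toy version would miss paraphrases. A learned embedding model is what lets retrieval work when the question and the content use different wording.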

How we built it

Frontend: Next.js, TypeScript, Jotai
Backend: Node.js, Express.js, Prisma, PostgreSQL, LangChain
API: Gemini

Challenges we ran into

Processing diverse input types and ensuring smooth handling of video, images, text, PDFs, DOCX files, and URLs.
Chunking text efficiently and retrieving the right chunks from the in-memory vector store for a given prompt.
Optimizing LLM responses to ensure high-quality content simplification and relevance.
Managing large-scale data storage and retrieval while keeping the system fast and efficient.
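
The chunking challenge above comes down to a trade-off: chunks must be small enough to retrieve precisely, yet overlap enough that a sentence cut at a boundary is not lost. The following is a minimal word-based sketch of that idea. The function name `splitIntoChunks` and its parameters are hypothetical, and this is a hand-rolled illustration rather than the splitter the project actually used.

```typescript
// Splits text into chunks of at most `chunkSize` words, where each chunk
// repeats the last `overlap` words of the previous one.
function splitIntoChunks(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }

  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time

  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once the window has reached the end, so no tail-only duplicate is emitted.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}

// Usage: 10 words, windows of 4 words, 1 word shared between neighbours.
const exampleChunks = splitIntoChunks("a b c d e f g h i j", 4, 1);
// exampleChunks is ["a b c d", "d e f g", "g h i j"]
```

A larger overlap preserves more context across boundaries but stores and embeds more duplicated text, so the two parameters are worth tuning against real content.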

Accomplishments that we're proud of

Successfully built a system that processes video, images, text, PDFs, DOCX files, and URLs for content simplification.
Implemented Retrieval-Augmented Generation (RAG) with an in-memory vector store so that responses are grounded in the uploaded content.
Developed a chatbot that lets users interact with the content they provide.
Efficiently handled LLM integration to improve response quality.
Overcame challenges related to scraping and query optimization for a better user experience.

What we learned

Efficient data processing for text, PDFs, DOCX files, and URLs.
Implementing Retrieval-Augmented Generation (RAG) using embeddings and an in-memory vector store.
Optimizing LLM interactions for better content simplification and response generation.
Handling web scraping challenges to extract text from URLs.
Enhancing user experience by building an interactive chatbot for content-based queries.
Managing large-scale data by effectively splitting, storing, and retrieving relevant chunks.
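
As an illustration of the scraping point above, here is a deliberately simplified sketch of turning fetched HTML into plain text before chunking. The function name `extractText` is hypothetical, and the regex approach is an assumption made for brevity: it handles only two HTML entities and would be replaced by a proper HTML parser in a production scraper.

```typescript
// Reduces an HTML document to its visible text.
// Simplified on purpose: regexes cannot handle every HTML edge case.
function extractText(html: string): string {
  return html
    // Drop script and style elements together with their contents.
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, " ")
    // Replace every remaining tag with a space so adjacent words stay separated.
    .replace(/<[^>]+>/g, " ")
    // Decode the two entities this sketch knows about.
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    // Collapse runs of whitespace and trim the ends.
    .replace(/\s+/g, " ")
    .trim();
}

// Usage: styles and scripts disappear, visible text remains.
const samplePage =
  "<html><head><style>p{color:red}</style><script>var x = 1;</script></head>" +
  "<body><h1>Title</h1><p>Hello &amp; welcome</p></body></html>";
const pageText = extractText(samplePage);
// pageText is "Title Hello & welcome"
```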