Inspiration
Browsers like Comet and OpenAI Atlas have a feature for interacting with a web page: the user can ask about the content of the current page and an LLM answers.
What it does
It summarizes content, which can be a web page or text you have copied. When it summarizes the active tab (the web page the user is currently viewing in Chrome), the user can also ask questions about the content of that page. This boosts productivity by reducing the time spent reading long articles. If internet connectivity is poor or there is no internet at all, the user can still get answers to general queries.
How I built it
I used the Gemini Nano model built into Chrome to process my content, such as text to summarize and prompts. I built the extension with JavaScript, HTML5, and CSS3, using VS Code as my IDE.
The page content is scraped and then split into chunks before it is fed to Gemini Nano, so that the input does not exceed the model's context window.
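A minimal sketch of how such chunking could look, assuming a simple character budget per chunk. The function name `chunkText` and the default limit of 1000 characters are illustrative assumptions, not values taken from the extension.

```javascript
// Hypothetical helper: split text into chunks of at most maxChars characters,
// breaking only on whitespace so that words stay intact.
// Note: a single word longer than maxChars still becomes its own (oversized) chunk.
function chunkText(text, maxChars = 1000) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let current = "";

  for (const word of words) {
    const candidate = current ? current + " " + word : word;
    if (candidate.length > maxChars && current) {
      // Adding this word would overflow the budget: close the current chunk.
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

A character budget is only a rough stand-in for a token budget, so the limit would need to be set conservatively for the real context window.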
I store the page content as chunks in Chrome's localStorage. When the user asks a question, the extension searches those chunks, picks the one whose keywords best match the query, and passes it to Gemini Nano, which then enhances that text into the answer.
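The keyword matching described above could be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: the helper names, the small stop-word list, and the scoring rule (count of shared keywords) are all hypothetical. The storage object is passed in as a parameter so that `window.localStorage` can be supplied in the browser.

```javascript
// Illustrative stop-word list (an assumption; a real list would be longer).
const STOP_WORDS = new Set([
  "a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "or",
  "what", "how", "does", "do", "it", "this", "that",
]);

// Lowercase the text and keep runs of letters and digits, minus stop words.
function keywords(text) {
  const tokens = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
  return tokens.filter((word) => !STOP_WORDS.has(word));
}

// Persist and restore chunks as JSON. `storage` is any object with the
// Web Storage setItem/getItem methods, e.g. window.localStorage.
function saveChunks(storage, key, chunks) {
  storage.setItem(key, JSON.stringify(chunks));
}

function loadChunks(storage, key) {
  const raw = storage.getItem(key);
  return raw ? JSON.parse(raw) : [];
}

// Return the chunk sharing the most keywords with the query,
// or null when no chunk shares any keyword.
function findBestChunk(chunks, query) {
  const queryKeywords = new Set(keywords(query));
  let best = null;
  let bestScore = 0;

  for (const chunk of chunks) {
    const chunkKeywords = new Set(keywords(chunk));
    let score = 0;
    for (const word of queryKeywords) {
      if (chunkKeywords.has(word)) score++;
    }
    if (score > bestScore) {
      best = chunk;
      bestScore = score;
    }
  }
  return best;
}
```

Exact keyword overlap needs no embedding model or network access, which fits the offline constraint, but it will miss synonyms and inflected forms such as "run" versus "runs".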
To scrape the page content I use plain JavaScript, such as document.getElementsByTagName(), to read the tags that hold most of the content.
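As a rough sketch of this kind of scraping, the function below collects text from a fixed set of tags. The tag list and the name `extractPageText` are assumptions made for illustration; the write-up only says that the tags holding most of the content are read.

```javascript
// Assumed set of content-bearing tags (not specified in the write-up).
const CONTENT_TAGS = ["h1", "h2", "h3", "p", "li"];

// Collect the visible text of every element with one of the given tags.
// `doc` is a DOM Document, e.g. `document` inside a content script.
function extractPageText(doc, tagNames = CONTENT_TAGS) {
  const parts = [];
  for (const tag of tagNames) {
    // getElementsByTagName returns a live HTMLCollection; copy it to an array.
    for (const el of Array.from(doc.getElementsByTagName(tag))) {
      // Collapse runs of whitespace and skip empty elements.
      const text = (el.textContent || "").replace(/\s+/g, " ").trim();
      if (text) parts.push(text);
    }
  }
  return parts.join("\n");
}

// Usage in a content script (browser only):
//   const pageText = extractPageText(document);
```

Because the loop goes tag by tag, the output is grouped by tag rather than kept in document order, and text inside nested matches (for example a p inside an li) appears twice. Whether that matters depends on how the chunks are used afterwards.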
On the API side, I used the Prompt API and the Summarizer API.
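For orientation, here is a hedged sketch of how the two built-in APIs might be called. It assumes the `Summarizer` and `LanguageModel` globals as documented for recent Chrome versions; the API surface has changed across releases, so the exact names and options should be checked against the current Chrome documentation. The wrapper functions, the option values, and the prompt wording are illustrative assumptions, not the extension's code.

```javascript
// Summarize text with Chrome's built-in Summarizer API.
// `api` defaults to the global and can be replaced for testing.
async function summarizeText(text, api = globalThis.Summarizer) {
  if (!api) throw new Error("Summarizer API is not available in this browser");
  if ((await api.availability()) === "unavailable") {
    throw new Error("Summarizer model is unavailable on this device");
  }
  // Option values here are one possible choice, not the extension's settings.
  const summarizer = await api.create({
    type: "key-points",
    format: "plain-text",
    length: "short",
  });
  return summarizer.summarize(text);
}

// Answer a question from one retrieved chunk with the Prompt API.
async function answerFromChunk(chunk, question, api = globalThis.LanguageModel) {
  if (!api) throw new Error("Prompt API is not available in this browser");
  if ((await api.availability()) === "unavailable") {
    throw new Error("Language model is unavailable on this device");
  }
  const session = await api.create();
  // Hypothetical prompt template: ground the answer in the retrieved chunk.
  const prompt = [
    "Answer the question using only the context below.",
    "Context:",
    chunk,
    "Question: " + question,
  ].join("\n");
  return session.prompt(prompt);
}
```

Both calls run against the on-device model, so once the model has been downloaded they do not need a network connection.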
It resembles a RAG implementation, though not a complete one, since it runs purely on the client side.
Users can use it without an internet connection.
Challenges I ran into
The major challenge was mimicking RAG-like behavior entirely on the client side. Because users can also access the extension offline, I couldn't use embedding models, vector databases, or similarity search such as cosine similarity, since all of those would have required a server-side implementation as well.
Accomplishments that I'm proud of
It performs very well, providing RAG-like behavior entirely on the client side. So NO INTERNET IS REQUIRED TO INTERACT WITH THE LLM IN THE BROWSER.
What I learned
I learned how to use the Gemini Nano model built into Chrome and how it can be used to build good products that solve many problems. With this model, a website can interact with Gemini Nano on the client side in Chrome and use it for tasks such as translation, chat, and answering queries.
What's next for Pagesumm AI chrome extension
Enhancing performance, making it persistent with content, adding memory, and improving search.