Accomplishments
I wanted to create a tool that lets me quickly gather information from web pages and PDFs.
I'm satisfied with the result: the program analyzes text well, captures the key aspects, and organizes topics clearly. Some parts could still be refined, but I built this project on my own alongside several others, so my time was limited.
Inspiration
In an information-saturated digital age, distinguishing fact from opinion and deeply understanding complex documents is more challenging than ever. The inspiration for Lens AI Analysis Toolkit came from the need for an intelligent tool to "see through" the noise. The goal was to create a personal assistant that doesn't just summarize content, but actively helps you question, analyze, and converse with any text, whether it's a news article or a dense PDF document.
What it does
Lens is a powerful Chrome extension that integrates into your browser as a side panel, transforming how you interact with online content.
Standard Analysis: For any webpage or PDF, Lens can perform instant analysis to identify:
Bias: Detects prejudice and one-sided viewpoints in the text.
Tone: Analyzes the emotional sentiment and writing style.
Argument: Extracts the main claims and their supporting evidence.
Conceptual Map: Generates a visual map (using Mermaid.js) of the main ideas and their connections.
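Advanced PDF Features: For PDF documents, Lens offers two additional modes:
Fast In-Depth Analysis: For long PDFs, it splits the document into chunks, sends multiple requests to the AI in parallel, and then assembles a single, detailed final report.
Chat with your PDF: Ask questions about the document; Lens finds the most relevant chunks and passes them to the AI as context for the answer.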
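As a rough illustration of the Conceptual Map feature, the sketch below turns a list of idea links into Mermaid flowchart text. The helper name toMermaid and the { from, to } edge shape are assumptions made for this example, not the extension's actual code.

```javascript
// Hypothetical helper: turns a list of idea links into Mermaid flowchart text.
// The { from, to } edge shape is an assumption made for this sketch.
function toMermaid(edges) {
  const ids = new Map(); // label -> stable node id (n0, n1, ...)
  const idFor = (label) => {
    if (!ids.has(label)) ids.set(label, `n${ids.size}`);
    return ids.get(label);
  };
  // Double quotes would end a Mermaid label early, so swap them for single quotes.
  const quote = (label) => `"${label.replace(/"/g, "'")}"`;

  const lines = ["graph TD"];
  for (const { from, to } of edges) {
    lines.push(`  ${idFor(from)}[${quote(from)}] --> ${idFor(to)}[${quote(to)}]`);
  }
  return lines.join("\n");
}

const diagram = toMermaid([
  { from: "Main claim", to: "Evidence A" },
  { from: "Main claim", to: "Evidence B" },
]);
console.log(diagram);
```

The resulting text can then be handed to Mermaid.js for rendering in the side panel.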
How I built it
The project was built using a mix of modern web technologies and powerful artificial intelligence APIs.
Architecture: It's a Chrome Extension (Manifest V3) built with HTML, CSS, and JavaScript.
AI Engine: The core of the system is the Google Gemini API, specifically the gemini-1.5-pro-latest model, chosen for its intelligence and massive context window, which is ideal for document analysis.
PDF Processing: The ability to read PDFs directly in the browser is powered by the open-source pdf.js library from Mozilla, which extracts text on a page-by-page basis.
Performance: To speed up the in-depth analysis, I implemented a system of concurrent API calls using Promise.all in JavaScript, which dramatically reduces waiting times.
Chat Logic: The "Chat with your PDF" feature is based on an approach similar to Retrieval-Augmented Generation (RAG): the document's text is indexed into small chunks; when the user asks a question, a search algorithm identifies the most relevant chunks to provide as context to the AI.
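The page-by-page extraction described under PDF Processing could look roughly like the sketch below. It assumes a document object already loaded with pdf.js (for example via pdfjsLib.getDocument); the function name extractPages and the shape of the returned records are illustrative choices.

```javascript
// Collects the text of each page, one page at a time.
// `pdf` is expected to be a pdf.js document, e.g. the result of
// `await pdfjsLib.getDocument(url).promise` (loading is left out of this sketch).
async function extractPages(pdf) {
  const pages = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber); // pdf.js pages are 1-indexed
    const content = await page.getTextContent();
    // Text items expose their string in `str`; other items are skipped.
    const text = content.items
      .filter((item) => typeof item.str === "string")
      .map((item) => item.str)
      .join(" ");
    pages.push({ pageNumber, text });
  }
  return pages;
}
```

Processing one page at a time keeps only a single page's text content in flight, which fits the chunking strategy used for large files.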
Challenges
Development presented several complex challenges:
Memory Management: The initial approach of reading an entire PDF into memory caused the browser to crash with large files. This led to the implementation of the "chunking" strategy.
Analysis Speed: The sequential analysis of chunks was too slow. The challenge was to re-architect the logic to execute API calls in parallel while managing the API's rate limits.
Chrome's Security (CSP): Integrating modern JavaScript modules (.mjs) in a Manifest V3 extension caused several errors related to the Content Security Policy, which required specific configurations in the manifest.json to resolve.
Prompt Engineering: Crafting prompts that "force" the AI to answer based only on the provided context, without inventing information, was a process of careful iteration and refinement.
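One possible way to run calls in parallel while respecting rate limits, as described under Analysis Speed, is to process chunks in small batches with a pause between them. This is a minimal sketch under that assumption; mapInBatches, batchSize, and pauseMs are hypothetical names and values, and the real limits depend on the API plan in use.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `worker` over `items` in fixed-size batches. Calls within a batch run
// concurrently via Promise.all; batches run one after another with a pause.
// `batchSize` and `pauseMs` are illustrative knobs, not values from the project.
async function mapInBatches(items, worker, { batchSize = 3, pauseMs = 1000 } = {}) {
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    const batchResults = await Promise.all(
      batch.map((item, offset) => worker(item, start + offset))
    );
    results.push(...batchResults);
    const isLastBatch = start + batchSize >= items.length;
    if (!isLastBatch) await sleep(pauseMs);
  }
  return results; // same order as `items`
}
```

Note that Promise.all rejects as soon as any call in the batch fails; Promise.allSettled is an alternative when partial results are acceptable.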
What I learned
This project was an incredible learning opportunity.
I deepened my knowledge of asynchronous JavaScript, especially async/await and Promise.all for managing complex operations.
I gained solid practical experience in developing Chrome Manifest V3 extensions, navigating their intricacies and security rules.
I learned how to design and implement a basic Retrieval-Augmented Generation (RAG) system, a fundamental technique for creating AI applications that respond accurately based on specific source material.
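To make the RAG-style flow concrete, here is a minimal sketch: split the text into chunks, rank chunks against the question, and build a prompt restricted to the best matches. Plain keyword overlap stands in for whatever search algorithm the project actually uses, and all function names, the chunk size, and the prompt wording are assumptions.

```javascript
// Splits text into chunks of roughly `size` words (the size is an arbitrary choice here).
function chunkText(text, size = 50) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Lowercases and keeps ASCII letters and digits only (a simplification).
const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) || [];

// Scores a chunk by how many distinct question words it contains.
function score(chunk, question) {
  const chunkWords = new Set(tokenize(chunk));
  const questionWords = new Set(tokenize(question));
  let hits = 0;
  for (const word of questionWords) if (chunkWords.has(word)) hits++;
  return hits;
}

// Returns up to `k` best-matching chunks, best first; chunks with no overlap are dropped.
function topChunks(chunks, question, k = 3) {
  return chunks
    .map((chunk) => ({ chunk, score: score(chunk, question) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Builds a prompt that restricts the model to the retrieved context.
function buildPrompt(contextChunks, question) {
  return [
    "Answer using only the context below.",
    'If the answer is not in the context, reply "Not found in the document."',
    "",
    "Context:",
    ...contextChunks.map((chunk, i) => `[${i + 1}] ${chunk}`),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The prompt returned by buildPrompt would then be sent to the model in place of the full document, which keeps each request small.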
What's next
I want the summaries to be more accurate and to capture the key arguments more reliably. Lens already does this, but there is room for improvement. I'd also like to optimize the API calls to reduce the cost per call as much as possible, and to make the interface more visually appealing.
Built With
- css
- gemini-2.5-pro-latest
- html
- javascript
- json
- mermaid.js
- pdf.js