Inspiration

While working on the Jigsaw 2025 Kaggle challenge over the last two weeks, I had to read several foundational NLP papers to build enough background to train a model with BERT. I realized I could decide much faster which papers were worth reading if I had a summarizer right in my pocket.

What it does

User clicks Extract → Detect content type → Extract text → User clicks Summarize → Tier decision → Process → Display summary

How we built it

Vanilla HTML and CSS, plus a popup.js and a background.js that extract the text, handle background work, and make the API calls.
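As a rough illustration of how popup.js and background.js can divide the work, here is a minimal sketch of a background message handler. It assumes a Chrome-style extension messaging API (`chrome.runtime.onMessage`); the message shape `{ type, text }`, the `SUMMARIZE` type, and the `summarize` stub are hypothetical names chosen for this example and are not taken from the project's code.

```javascript
// background.js (sketch): route messages from the popup to handlers.
// The message shape { type, text } and summarize() are hypothetical.

async function summarize(text) {
  // Placeholder for the real API call; here it only truncates the input.
  return text.length > 80 ? text.slice(0, 80) + "..." : text;
}

async function handleMessage(message) {
  if (message.type === "SUMMARIZE") {
    return { ok: true, summary: await summarize(message.text) };
  }
  return { ok: false, error: `Unknown message type: ${message.type}` };
}

// Register the listener only when running inside an extension
// (assumption: a Chrome-style runtime messaging API is available there).
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    handleMessage(message).then(sendResponse);
    return true; // keep the message channel open for the async response
  });
}
```

Under the same assumption, the popup side would call `chrome.runtime.sendMessage({ type: "SUMMARIZE", text })` and render the response it gets back.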

Challenges we ran into

Initially, extracting text from regular web pages was easy with document.body.innerText. PDFs were harder, because they have no accessible DOM to read from. Then the tool crashed when I tried to summarize the Wikipedia page on cats, because the text was too long for a single request. So I came up with a chunking method: I split the text into smaller chunks and make one API call per chunk. But chunking is overkill for academic papers, which typically run 10–20 pages and would need many calls. So I built a function that outputs a confidence score out of 5. A document that scores 3 or higher is treated as an academic paper, and I only make API calls on its Abstract, Introduction, Results, and Summary sections.
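The chunking fallback and the paper check can be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: the five regex signals, the list of known headings, the 4000-character default, and the names `chunkText`, `paperScore`, `keySections`, and `planInputs` are all hypothetical. Only the "score out of 5, threshold of 3" rule and the four target sections come from the description above.

```javascript
// Split text into chunks of at most maxChars, breaking on whitespace.
// A single word longer than maxChars becomes its own oversized chunk.
function chunkText(text, maxChars = 4000) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let current = "";
  for (const word of words) {
    if (current && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = current ? current + " " + word : word;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Five hypothetical signals, one point each, for a score out of 5.
const PAPER_SIGNALS = [
  /^\s*abstract\b/im,                         // "Abstract" heading
  /^\s*(\d+\.?\s+)?introduction\b/im,         // "Introduction" heading
  /^\s*(references|bibliography)\b/im,        // reference list
  /\[\d+\]|\(\w+(?: et al\.)?,? \d{4}\)/,     // [12] or (Author, 2020) citations
  /\b(doi:|arxiv:)/i,                         // DOI or arXiv identifier
];

function paperScore(text) {
  return PAPER_SIGNALS.filter((re) => re.test(text)).length;
}

// Assumed heading vocabulary; a section runs until the next known heading.
const KNOWN_HEADINGS = ["abstract", "introduction", "related work", "methods",
  "method", "results", "result", "discussion", "conclusion", "summary", "references"];
const WANTED = new Set(["abstract", "introduction", "result", "results", "summary"]);

function keySections(text) {
  const sections = {};
  let current = null;
  for (const line of text.split("\n")) {
    // Normalize "1. Introduction" to "introduction" before matching.
    const name = line.trim().toLowerCase().replace(/^\d+\.?\s+/, "");
    if (KNOWN_HEADINGS.includes(name)) {
      current = WANTED.has(name) ? name : null;
      if (current) sections[current] = "";
    } else if (current) {
      sections[current] += line + "\n";
    }
  }
  return sections;
}

// Tier decision: one API input per key section for papers, else plain chunks.
function planInputs(text, maxChars = 4000) {
  if (paperScore(text) >= 3) {
    const parts = Object.values(keySections(text)).map((s) => s.trim()).filter(Boolean);
    if (parts.length > 0) return parts;
  }
  return chunkText(text, maxChars);
}
```

Each string returned by `planInputs` would become one API call. Falling back to chunking when no key section is found keeps a misclassified document from producing an empty summary.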

What's next for AI Text Summarizer

Next, I want to tune how a long text is split into chunks, balancing the number of API calls against the quality of the output. The next feature is to build a Prompt AI into the extension so you can ask follow-up questions that go beyond the generated summary.
