Inspiration
We often see big PDF files that contain many different documents together, like invoices and delivery notes. It takes time to find where each document starts. We wanted to use AI to solve this problem.
What it does
This tool looks at each page of a PDF and tells you which pages start a new document. It can handle things like:
How we built it
We convert each page of the PDF to text or an image.
We send pages to a language model (LLM) and ask: “Is this a new document?”
We collect all the answers and return the page numbers that start new documents.
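The steps above can be sketched roughly as follows. This is a minimal illustration, not our actual code: `find_document_starts`, `PROMPT_TEMPLATE`, and the `ask_llm` callable are hypothetical names, the YES/NO answer format is an assumption, and so is the rule that the first page always starts a document. `ask_llm` stands in for whichever model backend is in use (for example a local Ollama model or a hosted API).

```python
from typing import Callable, List

# Hypothetical prompt; the exact wording is an assumption for illustration.
PROMPT_TEMPLATE = (
    "You are splitting a PDF that contains several documents.\n"
    "Page text:\n{page}\n\n"
    "Is this a new document? Answer YES or NO."
)


def find_document_starts(
    pages: List[str], ask_llm: Callable[[str], str]
) -> List[int]:
    """Return the 1-based page numbers that start a new document.

    `pages` holds the extracted text of each page, in order.
    `ask_llm` takes a prompt string and returns the model's reply.
    """
    starts: List[int] = []
    for index, page_text in enumerate(pages):
        if index == 0:
            # Assumption: the first page always begins a document,
            # so the model is not asked about it.
            starts.append(1)
            continue
        reply = ask_llm(PROMPT_TEMPLATE.format(page=page_text))
        # Treat any reply that begins with "yes" (any case) as a boundary.
        if reply.strip().upper().startswith("YES"):
            starts.append(index + 1)
    return starts
```

Passing the model call in as a function keeps the loop independent of any one LLM, and makes it easy to try with a stub before connecting a real model.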
Challenges we ran into
Some PDFs are too long, so we had to break them into parts.
Teaching the LLM to spot document changes was hard at first.
Setting up Ollama locally took time and disk space (big models!).
Some scanned PDFs had poor-quality text.
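As an illustration of breaking a long PDF into parts, here is one possible way to compute page ranges. It is a sketch under assumptions: the function name `split_into_chunks`, the default chunk size of 20 pages, and the one-page overlap between neighbouring chunks are all hypothetical choices, not values from our project. The overlap is there so that a document starting right at a chunk edge is still seen together with the page before it.

```python
from typing import List, Tuple


def split_into_chunks(
    page_count: int, chunk_size: int = 20, overlap: int = 1
) -> List[Tuple[int, int]]:
    """Split pages 1..page_count into (first_page, last_page) ranges.

    Neighbouring ranges share `overlap` pages. Both ends are inclusive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    chunks: List[Tuple[int, int]] = []
    step = chunk_size - overlap  # how far each new chunk advances
    start = 1
    while start <= page_count:
        end = min(start + chunk_size - 1, page_count)
        chunks.append((start, end))
        if end == page_count:
            break  # the last page is covered, so stop
        start += step
    return chunks
```

Because chunks overlap, the same page can be reported twice, so the results from each chunk should be merged into a set of page numbers before returning them.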
Accomplishments
The tool works with different LLMs.
It can run both online and offline.
It’s easy to use and ready for real-world PDFs.
What we learned
How to use different LLMs in the same project.
How to write better prompts for page classification.
How to handle long or messy PDF files with AI.
What’s next
Add a simple web interface.
Make the model smarter with more examples.
Classify page types (invoice vs receipt).
Add confidence scores to help trust the results.