I started building PDF Analyzer after seeing similar needs at a hackathon. Semantic chunking wasn't the stated problem, but I realized it's a real challenge: even today, no general-purpose AI can reliably chunk large PDFs by meaning, which matters especially for scientific papers. Initially I used GPT plus regex rules, which handle roughly 50% of cases, but I hit the core issue: no one-size-fits-all logic exists. So now I'm developing my own AI that understands document structure and performs intelligent, adaptive chunking.
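To illustrate the regex-rule approach mentioned above, here is a hypothetical minimal sketch (not the project's actual code) of splitting a paper's extracted text at numbered section headings like "1. Introduction" or "2.1 Methods". It works for cleanly formatted papers and fails for the rest, which is exactly the brittleness described above.

```typescript
interface Chunk {
  heading: string;
  body: string;
}

// Hypothetical heading-based chunker: splits text wherever a line looks
// like a numbered section heading (e.g. "3. Results", "2.1 Setup").
function chunkByHeadings(text: string): Chunk[] {
  // Matches lines such as "1. Introduction" or "2.1 Experimental Setup".
  const headingRe = /^(\d+(?:\.\d+)*)\.?[ \t]+([A-Z][^\n]*)$/gm;
  const chunks: Chunk[] = [];
  let prev: { heading: string; start: number } | null = null;
  let match: RegExpExecArray | null;

  while ((match = headingRe.exec(text)) !== null) {
    if (prev) {
      // Close out the previous section at the start of this heading.
      chunks.push({
        heading: prev.heading,
        body: text.slice(prev.start, match.index).trim(),
      });
    }
    // Body starts right after the heading line itself.
    prev = { heading: match[2], start: match.index + match[0].length };
  }
  if (prev) {
    chunks.push({ heading: prev.heading, body: text.slice(prev.start).trim() });
  }
  return chunks;
}
```

A rule like this breaks as soon as a paper uses unnumbered headings, runs headings inline with body text, or loses line breaks during PDF extraction, which is why a structure-aware model is the goal.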

Built With

  • mongodb
  • nextjs
  • openai
  • prisma
  • shadcn/ui
  • tailwind
  • xenova/transformers