DocWeb


Inspiration

Many organizations struggle to convert static PDF documents into accessible, interactive web content. AI models such as Baidu's ERNIE and PaddleOCR-VL offered an opportunity to automate that conversion.

What it does

DocWeb is a web application that transforms PDF documents into responsive webpages using artificial intelligence. The platform:

Extracts text from multi-page PDFs using PaddleOCR-VL for accurate optical character recognition
Converts content into structured Markdown format
Generates HTML webpages with AI-powered styling via ERNIE 4.5
Exports multiple formats (HTML, Markdown, JSON) for maximum flexibility
Provides real-time previews of generated content
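
As an illustration of the multi-format export, here is a minimal sketch of how the three download files could be assembled from per-page Markdown and a generated HTML string. The function name `build_exports`, the file names, and the JSON layout are assumptions made for this example, not DocWeb's actual schema.

```python
import json


def build_exports(title, pages_markdown, html):
    """Return a mapping of file name -> file contents for the three export formats.

    Hypothetical helper: the JSON structure below is illustrative only.
    """
    # Join per-page Markdown with a blank line between pages.
    markdown = "\n\n".join(pages_markdown)
    payload = {
        "title": title,
        "page_count": len(pages_markdown),
        "pages": [
            {"page": number, "markdown": text}
            for number, text in enumerate(pages_markdown, start=1)
        ],
    }
    return {
        "document.md": markdown,
        "document.html": html,
        # ensure_ascii=False keeps non-ASCII document text readable in the JSON file.
        "document.json": json.dumps(payload, indent=2, ensure_ascii=False),
    }
```

In a Streamlit app, each entry could then be offered through `st.download_button(label, data, file_name=...)`.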

How I built it

DocWeb was built using a modern, modular architecture:

Frontend: Streamlit for an intuitive, responsive user interface
OCR Engine: PaddleOCR-VL for document text extraction
AI Models: Baidu's ERNIE 4.5 for intelligent HTML generation and styling
Processing Pipeline: Custom Python modules for PDF extraction, Markdown conversion, and HTML generation
Styling: CSS-based theming for a clean, professional interface with full customization

The workflow follows a logical five-step process: Upload → Extract → Convert → Generate → Download
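
The flow above can be sketched as a small function that chains the three processing stages between upload and download. This is only a sketch under stated assumptions: `run_pipeline`, `PipelineResult`, and the three stage callables are hypothetical names, and the real OCR and ERNIE calls are injected as plain functions because their exact APIs are not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    pages: List[str]   # raw text per PDF page (Extract)
    markdown: str      # structured Markdown (Convert)
    html: str          # generated webpage (Generate)


def run_pipeline(
    pdf_bytes: bytes,
    extract_pages: Callable[[bytes], List[str]],
    to_markdown: Callable[[List[str]], str],
    generate_html: Callable[[str], str],
) -> PipelineResult:
    """Run Extract -> Convert -> Generate on an uploaded PDF.

    The stage functions are placeholders for the OCR engine, the Markdown
    converter, and the LLM-backed HTML generator.
    """
    pages = extract_pages(pdf_bytes)    # Extract: OCR each page
    markdown = to_markdown(pages)       # Convert: text -> Markdown
    html = generate_html(markdown)      # Generate: Markdown -> HTML
    return PipelineResult(pages=pages, markdown=markdown, html=html)
```

Passing the stages in as arguments keeps each one replaceable and lets the flow be tested with simple stand-in functions before any model is connected.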

Challenges we ran into

Several technical challenges shaped development:

OCR Accuracy: Ensuring reliable text extraction from PDFs with varying quality, layouts, and fonts
Markdown Conversion: Preserving document structure and formatting during conversion
UI/UX Design: Creating an intuitive interface while maintaining performance with large files
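
To make the Markdown conversion challenge concrete, here is a minimal sketch of one way to keep document structure when turning OCR output into Markdown. It assumes the OCR step yields a list of blocks shaped like `{"type": ..., "text": ...}` with types `title`, `heading`, `list_item`, and `text`. That block schema and the function name `blocks_to_markdown` are assumptions for illustration, not PaddleOCR-VL's actual output format.

```python
def blocks_to_markdown(blocks):
    """Convert labeled text blocks into Markdown, preserving basic structure."""
    lines = []
    prev_type = None
    for block in blocks:
        kind = block.get("type", "text")
        # Collapse line breaks and repeated spaces left over from OCR.
        text = " ".join(block.get("text", "").split())
        if not text:
            continue  # skip empty blocks entirely
        if kind == "title":
            rendered = "# " + text
        elif kind == "heading":
            rendered = "## " + text
        elif kind == "list_item":
            rendered = "- " + text
        else:
            rendered = text
        # Separate blocks with a blank line, except between consecutive
        # list items, which stay together as one list.
        if lines and not (kind == "list_item" and prev_type == "list_item"):
            lines.append("")
        lines.append(rendered)
        prev_type = kind
    return "\n".join(lines)
```

Unknown block types fall back to plain paragraphs, so an unexpected label degrades the formatting rather than dropping content.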

Accomplishments that we're proud of

Multiple Export Formats: Users can download the result as HTML, Markdown, or JSON
Real-time Preview: Implemented live preview functionality so users see results instantly
Error Handling: Robust error management with user-friendly feedback messages
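
As a rough sketch of the error-handling idea, each pipeline step can be wrapped so that failures become a short message suitable for display instead of a stack trace. `DocWebError` and `safe_run` are hypothetical names introduced for this example.

```python
class DocWebError(Exception):
    """Hypothetical error type whose message is safe to show to users."""

    def __init__(self, user_message):
        super().__init__(user_message)
        self.user_message = user_message


def safe_run(step_name, func, *args):
    """Run one step and return (result, error_message).

    Exactly one of the two values is None.
    """
    try:
        return func(*args), None
    except DocWebError as exc:
        # Known failure: surface the specific, user-friendly reason.
        return None, f"{step_name} failed: {exc.user_message}"
    except Exception:
        # Unknown failure: hide internals and give a generic hint.
        return None, (
            f"{step_name} failed unexpectedly. "
            "Please try again or use a different file."
        )
```

In a Streamlit interface, a non-empty message could be shown with `st.error(message)` while the rest of the page stays usable.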

What I learned

This project taught me valuable lessons:

How to integrate multiple AI services (OCR + LLM) into a cohesive workflow
The balance between automation and user control in AI applications
Document processing challenges: PDFs are not uniformly structured, requiring flexible approaches
The power of combining multiple specialized AI models for superior results

What's next for DocWeb

Planned enhancements:

Batch Processing: Enable users to convert multiple PDFs simultaneously
Advanced Styling: Give users templates and customization options for generated HTML
API Endpoint: Expose DocWeb as an API for enterprise integration

Built With

ernie
paddleocr-vl
streamlit

Updates

Ujwal Kandi started this project — Dec 18, 2025 10:48 PM EST
