📝 About the Project
🌟 Inspiration
Every PDF contains information (reports, assignments, brochures, menus, research papers), but most of it stays locked inside static pages. With ERNIE and PaddleOCR-VL, I wanted to explore how AI can free that content and transform a PDF into a modern, accessible webpage.
This warm‑up task was the perfect opportunity to experiment with:
- AI‑based OCR
- Markdown transformation
- HTML generation using ERNIE
- GitHub Pages deployment
It felt like building a tiny “AI publishing pipeline” from scratch.
🛠️ How I Built It
1. Extracting Text with PaddleOCR-VL
I uploaded my PDF to PaddleOCR-VL and used its advanced OCR + layout features to extract:
- Clean text
- Structure
- Basic layout information
This gave me a solid foundation for generating Markdown.
2. Converting OCR Output into Markdown
I cleaned the OCR output and structured it as readable Markdown, organizing headings, bullet points, and paragraphs to mirror the PDF’s layout.
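A minimal sketch of this conversion step, assuming the OCR output has already been reduced to a list of simple blocks. The block schema here (`type`, `level`, `text`) and the function name `blocks_to_markdown` are hypothetical, chosen for illustration; they are not PaddleOCR-VL's actual output format.

```python
from typing import Iterable


def blocks_to_markdown(blocks: Iterable[dict]) -> str:
    """Render layout blocks as Markdown.

    Each block is a hypothetical dict such as
    {"type": "heading", "level": 2, "text": "..."}.
    """
    lines = []
    prev_kind = None
    for block in blocks:
        # Collapse stray whitespace and line breaks left over from OCR.
        text = " ".join(block.get("text", "").split())
        if not text:
            continue
        kind = block.get("type", "paragraph")
        if kind == "heading":
            level = min(max(int(block.get("level", 1)), 1), 6)
            rendered = "#" * level + " " + text
        elif kind == "list_item":
            rendered = "- " + text
        else:
            rendered = text
        # Separate blocks with a blank line, but keep list items together.
        if lines and not (kind == "list_item" and prev_kind == "list_item"):
            lines.append("")
        lines.append(rendered)
        prev_kind = kind
    return "\n".join(lines) + "\n" if lines else ""
```

Keeping consecutive list items on adjacent lines while separating everything else with a blank line is what makes the result render as one list rather than several.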
3. Generating the Webpage Using ERNIE
I fed the Markdown into ERNIE with a simple prompt:
“Convert this Markdown into a clean HTML webpage with modern styling.”
ERNIE generated a responsive HTML page with inline CSS, matching the original content and adding a professional look.
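This step can be sketched roughly as follows. The model call is passed in as a plain function (`generate`, a hypothetical stand-in for whatever ERNIE client is used), so nothing here depends on a specific ERNIE API. The fence-stripping helper is also an assumption: it covers the common case where a model wraps its HTML in a fenced code block.

```python
import re
from typing import Callable

PROMPT = "Convert this Markdown into a clean HTML webpage with modern styling."

# Build the fence marker programmatically so this snippet can itself sit inside a fence.
_FENCE = "`" * 3
_FENCED_BLOCK = re.compile(
    _FENCE + r"(?:html)?\s*\n(.*?)" + _FENCE, re.DOTALL | re.IGNORECASE
)


def build_prompt(markdown: str) -> str:
    """Put the instruction first, then the Markdown document."""
    return f"{PROMPT}\n\n{markdown.strip()}\n"


def extract_html(reply: str) -> str:
    """Return the HTML from a model reply, unwrapping a fenced block if present."""
    match = _FENCED_BLOCK.search(reply)
    html = match.group(1) if match else reply
    return html.strip()


def markdown_to_html(markdown: str, generate: Callable[[str], str]) -> str:
    """Run the prompt through `generate` (hypothetical model call) and sanity-check the result."""
    html = extract_html(generate(build_prompt(markdown)))
    if "<html" not in html.lower():
        raise ValueError("model reply does not look like a full HTML page")
    return html
```

Injecting the model call keeps the prompt and post-processing logic testable without network access, which helps when iterating on prompts.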
4. Deploying with GitHub Pages
I created a repository, uploaded the generated index.html, and deployed it using GitHub Pages to publish the final webpage online.
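As a rough sketch of preparing the files for upload, the helper below writes the generated page as index.html at the root of an output folder. The function name `prepare_site` and the folder layout are assumptions for illustration. The empty `.nojekyll` file is optional; GitHub Pages treats it as a signal to skip Jekyll processing and serve the files as-is.

```python
from pathlib import Path


def prepare_site(html: str, out_dir: str) -> Path:
    """Write the generated page into a folder ready to push to a Pages repository."""
    root = Path(out_dir)
    root.mkdir(parents=True, exist_ok=True)
    # GitHub Pages serves index.html from the root of the publishing source.
    index = root / "index.html"
    index.write_text(html, encoding="utf-8")
    # An empty .nojekyll file asks GitHub Pages not to run the site through Jekyll.
    (root / ".nojekyll").touch()
    return index
```

The contents of the folder can then be committed to the repository branch that GitHub Pages is configured to publish.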
🚧 Challenges I Faced
- OCR noise: Extracted text sometimes needed cleaning, especially in tables or unusual fonts.
- Maintaining layout: Matching the visual structure of the PDF required careful Markdown formatting.
- Prompt tuning: Getting ERNIE to generate clean HTML required multiple prompt iterations.
- Deployment tweaks: Ensuring the HTML rendered properly on GitHub Pages took some adjustments.
Each challenge helped me better understand AI models, modern web rendering, and the importance of clean structured data.
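For the OCR noise mentioned above, a small cleanup pass goes a long way. The sketch below is one possible set of rules, not the exact cleaning used in this project, and the function name `clean_ocr_text` is hypothetical. Tables would still need separate handling.

```python
import re
import unicodedata


def clean_ocr_text(text: str) -> str:
    """Apply a few conservative fixes to raw OCR text."""
    # Fold compatibility characters, e.g. the "fi" ligature, into plain letters.
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words hyphenated across a line break. This can also merge
    # genuine hyphenated compounds that happen to break at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim spaces on either side of line breaks.
    text = re.sub(r" *\n *", "\n", text)
    # Keep at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Running a pass like this before building the Markdown keeps the later steps simpler, since the model sees consistent paragraphs instead of layout debris.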
🎓 What I Learned
- How to integrate OCR, LLMs, and web deployment into one workflow
- How ERNIE handles Markdown → HTML transformations
- The power of PaddleOCR-VL for structured text extraction
- Practical GitHub Pages deployment
- How to convert a static document into an interactive digital experience
This task strengthened my confidence in building pipelines that combine OCR + LLMs + Web.
🧰 Built With
- PaddleOCR-VL — PDF text + layout extraction
- ERNIE 5 — Markdown-to-HTML generation
- Markdown — Intermediate document formatting
- HTML/CSS — Final webpage
- GitHub Pages — Deployment
- AI Studio / HuggingFace — For model access
Built With
- css
- ernie-4.5
- ernie-5
- github-pages
- html
- huggingface
- markdown
- paddleocr-vl
- python