ERNIE & PaddleOCR Implementation Proof
Live Demo: https://sanketnawale.github.io/ernie-warmup-task/
The deployed webpage clearly shows our AI-powered pipeline:
PaddleOCR-VL Usage:
- Tool: Baidu AI Studio PaddleOCR-VL API
- Input: z/OS TSO/E Command Reference PDF (448 pages)
- Output: 1,031,508 extracted characters
- Code: See
step1_extract_pdf_v2.pyin repository
ERNIE 4.0 API Usage:
- API: erniebot Python SDK with
ernie-4.0-turbo-8kmodel - Input: Extracted PDF content
- Output: Generated HTML structure + CSS styling (15,253 characters)
- Code: See
step2_generate_webpage.pyin repository
Processing Pipeline: PDF → PaddleOCR-VL → ERNIE 4.0 → GitHub Pages
Visible Proof: The live webpage displays " Built with AI" section with full attribution to both technologies.##Inspiration Working with IBM mainframe documentation, I noticed how valuable technical PDFs are locked away in non-searchable, non-interactive formats. I wanted to leverage ERNIE's AI capabilities to automate the transformation of these resources into modern, accessible web pages.
What it does
This project automatically converts complex technical PDFs into beautiful, responsive web pages using a two-step AI pipeline: PaddleOCR-VL intelligently extracts text from 448-page IBM z/OS Command Reference ERNIE 4.0 generates clean, structured HTML with modern CSS styling GitHub Pages hosts the final result at https://sanketnawale.github.io/ernie-warmup-task/
How I built it
Tech Stack:
PaddleOCR-VL (via Baidu AI Studio) for intelligent document parsing
ERNIE 4.0 API for AI-powered HTML generation
Python 3 with PyPDF2 for fallback text extraction
GitHub Pages for zero-cost hosting
Pipeline:
python
Step 1: Extract PDF content
python step1_extract_pdf_v2.py
Processes 448 pages → 1,031,508 characters
Step 2: Generate webpage
python step2_generate_webpage.py
ERNIE 4.0 creates HTML/CSS → 15,253 characters
Step 3: Deploy
git push origin main
GitHub Pages auto-deploys
Challenges I ran into
Large PDF processing: Initial OCR attempts hit API limits; solved by implementing chunk-based processing ERNIE prompt engineering: Required iteration to get clean HTML output without unnecessary wrappers GitHub Pages deployment: Fought with Jekyll workflow errors; switched to simple branch deployment Character encoding: Handled special mainframe characters and formatting preservation
Accomplishments
Successfully processed 1M+ characters from complex technical documentation Generated production-ready responsive HTML in under 2 minutes Created fully automated pipeline requiring zero manual intervention IBM-themed professional design with mobile responsiveness
What I learned
Advanced prompt engineering for ERNIE 4.0 to generate structured output PaddleOCR-VL's capabilities for technical document understanding GitHub Pages deployment strategies and troubleshooting Efficient chunking strategies for large document processing
What's next
Add search functionality to the generated webpage Support for multiple PDF formats and languages Interactive table of contents generation Batch processing for documentation libraries Integration with CI/CD pipelines for automated doc publishing
Built With
- ai
- css3
- ernie
- github
- html5
- natural-language-processing
- ocr
- paddlepaddle
- python


Log in or sign up for Devpost to join the conversation.