Inspiration
The idea behind this project came from wanting to understand how AI tools like PaddleOCR and ERNIE can automate simple but time-consuming tasks. PDF documents, especially CVs, are commonly used but not very interactive. I wanted to transform a static PDF into a clean, responsive webpage entirely through an AI pipeline. The warm-up task felt like a great way to learn how OCR and LLMs can work together to build something useful in minutes.
What it does
The project converts a PDF CV into a fully generated HTML webpage using PaddleOCR-VL and ERNIE.
It:
- Extracts text and structure from a PDF using OCR
- Converts the extracted content into Markdown
- Uses ERNIE to turn the Markdown into a modern HTML webpage
- Publishes the page automatically using GitHub Pages
The final result is a clean, responsive, AI-generated personal website.
How we built it
- PaddleOCR-VL was used to extract text and layout from the CV PDF.
- The OCR output was cleaned and formatted into Markdown so ERNIE could understand the structure.
- Using the Markdown as input, ERNIE generated a fully styled, responsive HTML webpage.
- The generated HTML was saved as index.html and uploaded to a GitHub repository.
- GitHub Pages was enabled to instantly deploy the website online.
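The glue between the steps above can be sketched in plain Python. This is a minimal illustration, not the code used in the project: the function names (clean_markdown, build_prompt, save_page), the cleaning rules, and the prompt wording are all assumptions, and the OCR and ERNIE calls themselves are left out because the write-up does not say how they were invoked.

```python
import re
from pathlib import Path


def clean_markdown(raw: str) -> str:
    """Normalize OCR-derived Markdown before handing it to an LLM (hypothetical rules)."""
    lines = []
    for line in raw.splitlines():
        line = line.rstrip()
        # Normalize common bullet glyphs at the start of a line to a Markdown dash.
        line = re.sub(r"^(\s*)[•·▪*]\s+", r"\1- ", line)
        lines.append(line)
    text = "\n".join(lines)
    # Collapse runs of blank lines to a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"


def build_prompt(markdown: str) -> str:
    """Wrap the cleaned Markdown in an illustrative HTML-generation instruction."""
    return (
        "Convert the following CV, given as Markdown, into a single "
        "responsive HTML page with inline CSS. Return only the HTML.\n\n"
        + markdown
    )


def save_page(html: str, out_dir: str) -> Path:
    """Write the generated HTML as index.html directly inside out_dir."""
    target = Path(out_dir) / "index.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target
```

In this sketch the prompt returned by build_prompt would be pasted into (or sent to) ERNIE, and the HTML it returns would be passed to save_page with the repository root as out_dir.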
Challenges we ran into
- Character limits in the ERNIE interface required condensing the Markdown and restructuring the content.
- Markdown formatting needed to be carefully cleaned so the final webpage would look professional.
- GitHub Pages initially served a blank page because the wrong folder was selected as the deployment source. Switching Pages to serve from the repository root fixed the issue.
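For the character-limit issue, the project condensed the Markdown by hand. One alternative, sketched here only as an assumption and not as what was actually done, is to split the Markdown at heading boundaries so that each piece fits within a given budget. The function name split_by_headings and the limit value are hypothetical; the real limit of the ERNIE interface is not stated in this write-up.

```python
def split_by_headings(markdown: str, limit: int) -> list[str]:
    """Pack Markdown sections into chunks of roughly at most `limit` characters.

    A section starts at a line beginning with '#'. A single section longer
    than `limit` is kept whole in its own chunk rather than cut mid-text.
    """
    # Step 1: break the document into sections at heading lines.
    sections, current = [], []
    for line in markdown.splitlines(keepends=True):
        if line.startswith("#") and current:
            sections.append("".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("".join(current))

    # Step 2: greedily pack consecutive sections into chunks under the limit.
    chunks, buf = [], ""
    for section in sections:
        if buf and len(buf) + len(section) > limit:
            chunks.append(buf)
            buf = ""
        buf += section
    if buf:
        chunks.append(buf)
    return chunks
```

Joining the returned chunks reproduces the input, so nothing is lost by splitting. The trade-off is that each chunk is processed without the context of the others, which may lead to inconsistent styling between sections of the generated page.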
Accomplishments that we're proud of
- Successfully building a complete AI pipeline with OCR → Markdown → HTML → Deployment.
- Creating a polished, responsive webpage generated entirely by AI.
- Finishing the warm-up task with a smooth workflow that can be reused for more complex applications.
What we learned
- How OCR models extract and structure PDF content.
- How to prepare Markdown to maximize the quality of HTML generated by ERNIE.
- How LLMs can automate UI layout and styling.
- How to deploy static websites easily using GitHub Pages.
- The importance of input cleaning, prompt design, and handling tool limitations.
What's next for Web Builder: Build a Web Page with PaddleOCR & ERNIE
- Adding customizable themes so users can choose different styles.
- Allowing users to upload any PDF (reports, articles, resumes) and auto-generate a website.
- Integrating multilingual support using ERNIE to translate content before generating the webpage.
- Expanding the pipeline into a full “PDF-to-Website AI Builder” platform for non-technical users.
