About the Project
🚀 Inspiration
Vietnam OCR was born out of the need to automate and simplify the extraction of Vietnamese text from physical documents, such as invoices, forms, and handwritten notes. In many industries in Vietnam, data is still entered manually from paper documents, a process that is time-consuming and error-prone.
We wanted to build a tool that could accurately recognize Vietnamese characters, handle various fonts and layouts, and easily integrate into existing systems for digital transformation.
🛠️ How We Built It
We developed the system using a combination of:
- YOLOv8 for text detection and layout analysis
- VietOCR and custom-trained transformer models for Vietnamese text recognition
- FastAPI for serving the OCR API backend
- MySQL for managing projects, images, and annotations
- Docker for scalable deployment
- A web interface built with React for managing uploads and viewing results
We also implemented image preprocessing (denoising, resizing, and contrast adjustment) to improve recognition accuracy.
🔗 Live demo: https://sbxai.devhub.io.vn/
📚 What We Learned
- Fine-tuning OCR models for Vietnamese requires careful handling of tone marks and diacritics
- Text layout detection is critical for multi-column documents
- Keeping processing time under 10 seconds per image required memory optimization and batching techniques
- Designing APIs with scalability and maintainability in mind is essential for long-term usage
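One concrete piece of the tone-mark handling mentioned above is Unicode normalization. The same Vietnamese character can be encoded either as a single precomposed code point or as a base letter followed by combining marks, so training labels and model output should be brought to one form before they are compared. The sketch below uses Python's standard `unicodedata` module; the helper names `normalize_vi` and `same_text` are hypothetical, and choosing NFC as the target form is an assumption rather than something the project states.

```python
import unicodedata


def normalize_vi(text: str) -> str:
    """Return text in NFC form (precomposed characters where they exist)."""
    return unicodedata.normalize("NFC", text)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalization, ignoring encoding differences."""
    return normalize_vi(a) == normalize_vi(b)


# "Việt" written two ways: precomposed, and as 'e' + circumflex + dot below.
precomposed = "Vi\u1ec7t"
decomposed = "Vie\u0302\u0323t"

print(precomposed == decomposed)           # False: different code points
print(same_text(precomposed, decomposed))  # True: same text after NFC
```

Without this step, a correct prediction can be counted as an error simply because the label and the output use different encodings of the same diacritics.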
⚠️ Challenges
- Vietnamese has complex diacritics and varied handwriting styles, making OCR more difficult than for English
- Handling low-quality scans, skewed documents, and background noise was a major hurdle
- Deploying on resource-constrained servers while maintaining high accuracy and performance
- Building a flexible schema to support various document types (invoices, IDs, receipts, etc.)
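As an illustration of the flexible-schema challenge, one possible shape is a generic document record with a type tag, a list of recognized text regions, and a free-form mapping for type-specific fields. This is a sketch under assumptions, not the project's actual schema: the class names `TextRegion` and `OcrDocument`, their attributes, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TextRegion:
    text: str
    bbox: tuple[int, int, int, int]  # x, y, width, height in pixels
    confidence: float


@dataclass
class OcrDocument:
    doc_type: str  # e.g. "invoice", "id", "receipt"
    regions: list[TextRegion] = field(default_factory=list)
    # Type-specific key/value pairs, so new document types need no new columns.
    fields: dict[str, str] = field(default_factory=dict)

    def full_text(self) -> str:
        """Join region texts in reading order: top to bottom, then left to right."""
        ordered = sorted(self.regions, key=lambda r: (r.bbox[1], r.bbox[0]))
        return "\n".join(r.text for r in ordered)


# Hypothetical example data.
doc = OcrDocument(doc_type="invoice")
doc.regions.append(TextRegion("Tổng cộng: 150.000 đ", (40, 300, 200, 24), 0.93))
doc.regions.append(TextRegion("HÓA ĐƠN", (40, 20, 120, 30), 0.98))
doc.fields["total"] = "150.000 đ"
print(doc.full_text())
```

Keeping the shared structure fixed and pushing per-type details into a key/value mapping lets invoices, IDs, and receipts share the same storage and API shape, at the cost of weaker validation for the type-specific fields.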
Vietnam OCR is part of our broader vision to support Vietnamese businesses in digitizing their workflows through AI. We're actively improving accuracy and plan to support multi-language recognition in the future.