Inspiration
The seed for Nhaka 2.0 was planted when I found a box of my grandmother's letters from 1923. Time had been cruel: the ink had oxidized into "ghost text," water stains obscured dates, and the paper was brittle.
I tried existing OCR tools; they failed completely. I tried standard AI chatbots; they "hallucinated" names and dates that didn't exist, rewriting history instead of preserving it.
I realized that Heritage Preservation is not just about reading text; it's about trust. We don't just need a tool that guesses; we need a system that thinks like a team of experts: a historian to verify dates, a linguist to understand old dialects, and a conservator to assess damage.
Nhaka means "Heritage" in Shona. I built this to ensure that the history of Zimbabwe—and the world—doesn't fade away with the ink.
What it does
Nhaka 2.0 is an Archive Resurrection System powered by a multi-agent AI swarm. Instead of a single "black box" model, it deploys five specialized AI agents that collaborate, argue, and verify each other in real-time to restore damaged documents.
The Scanner (Vision): Uses PaddleOCR-VL to map the document layout, detect physical defects (tears, stains), and extract raw text confidence scores.
The Linguist (Language): Specialized in historical dialects (including the 1931 Doke Shona orthography), handling transliteration of obsolete characters like ɓ, ɗ, and ȿ.
The Historian (Context): Cross-references extracted entities against a database of historical facts (1890-1950) to flag anachronisms (e.g., "This treaty date contradicts the known timeline").
The Validator (Quality Control): Acts as the judge. It compares the outputs of the Scanner, Linguist, and Historian. If they disagree, it flags the text as "Low Confidence" rather than guessing.
The Repair Advisor (Conservation): Analyzes the visual damage and generates an AR-ready "Hotspot Map" with chemical preservation advice for physical archivists.
The user watches this entire process unfold in the "Agent Theater," following each agent's reasoning, character by character, in full transparency.
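To make the Validator's "flag, don't guess" rule concrete, here is a minimal sketch. The class, field names, and threshold are illustrative stand-ins, not the shipped Nhaka code:

```python
from dataclasses import dataclass

@dataclass
class AgentReading:
    agent: str         # e.g. "scanner", "linguist", "historian"
    text: str          # the agent's proposed transcription for a region
    confidence: float  # the agent's own 0.0-1.0 confidence

def validate(readings: list[AgentReading], threshold: float = 0.75) -> dict:
    """Accept a region only when the agents agree; otherwise flag it
    as low confidence instead of guessing."""
    texts = {r.text for r in readings}
    if len(texts) == 1 and all(r.confidence >= threshold for r in readings):
        return {"status": "verified", "text": texts.pop()}
    # Disagreement (or a hesitant agent): surface uncertainty, don't invent.
    return {
        "status": "low_confidence",
        "candidates": sorted(texts),
        "min_confidence": min(r.confidence for r in readings),
    }

if __name__ == "__main__":
    print(validate([
        AgentReading("scanner", "Chief Mutasa, 1923", 0.62),
        AgentReading("linguist", "Chief Mutasa, 1928", 0.81),
    ]))  # -> low_confidence, with both candidate readings preserved
```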
How we built it
We architected the system to be modular, verifiable, and scalable.
1. The Intelligence Layer (Novita AI & ERNIE)
We leveraged the ERNIE 4.0/4.5 models via Novita AI for the cognitive agents.
Why ERNIE? Its strong logical reasoning was crucial for the Historian and Validator agents. We used Chain-of-Thought (CoT) prompting to force the agents to "show their work" before outputting a result (sketched below).
Multimodal Input: We fed the OCR coordinates and visual defect data into the context window, allowing the text agents to "see" where the damage was.
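As a flavor of the CoT prompting, here is a hedged sketch of a Historian-style system prompt sent through Novita AI's OpenAI-compatible interface. The base URL and model id are placeholders; substitute the real values from your Novita dashboard:

```python
from openai import OpenAI

# Assumed Novita AI OpenAI-compatible endpoint and a placeholder model id.
client = OpenAI(base_url="https://api.novita.ai/v3/openai",
                api_key="YOUR_NOVITA_KEY")

HISTORIAN_SYSTEM = """You are the Historian agent (Zimbabwe, 1890-1950).
Before answering you MUST show your work:
1. List every entity (person, place, date) found in the text.
2. For each, name the independent fact you are checking it against.
3. Only then give a verdict: CONSISTENT, ANACHRONISM, or UNKNOWN.
Never invent facts. If you cannot verify, answer UNKNOWN."""

resp = client.chat.completions.create(
    model="baidu/ernie-4.5",  # placeholder model id
    temperature=0.2,          # low temperature: verification, not creativity
    messages=[
        {"role": "system", "content": HISTORIAN_SYSTEM},
        {"role": "user",
         "content": "Extracted text: 'Treaty signed at Fort Salisbury, 1885.'"},
    ],
)
print(resp.choices[0].message.content)  # should flag 1885 (fort founded 1890)
```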
2. The Vision Layer
PaddleOCR-VL: Chosen over Tesseract for its superior handling of noisy backgrounds and skewed text.
OpenCV: Used for pre-processing (adaptive thresholding, deskewing) before the AI even sees the image.
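A rough sketch of that pre-processing pass; the threshold parameters are illustrative and would be tuned per collection in practice:

```python
import cv2
import numpy as np

def preprocess(path: str) -> np.ndarray:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Adaptive thresholding copes with uneven staining far better than a
    # single global threshold (31 = blockSize, 15 = C; both illustrative).
    binary = cv2.adaptiveThreshold(
        gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 15
    )
    # Deskew: estimate the dominant angle of the ink pixels (value 0 after
    # thresholding) and rotate the page back to horizontal.
    ink = np.column_stack(np.where(binary == 0)).astype(np.float32)
    angle = cv2.minAreaRect(ink)[-1]
    if angle > 45:  # minAreaRect reports angles in (0, 90]
        angle -= 90
    h, w = binary.shape
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(binary, rot, (w, h),
                          flags=cv2.INTER_CUBIC,
                          borderMode=cv2.BORDER_REPLICATE)
```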
3. The Orchestration (Backend)
Python & FastAPI: Built a fully asynchronous pipeline.
Server-Sent Events (SSE): This was critical. We stream the "thought process" of every agent to the frontend. There are no loading spinners—only live intelligence.
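A stripped-down sketch of the SSE wiring, with a faked agent pipeline standing in for the real one (route name illustrative; run with uvicorn):

```python
import asyncio
import json
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def agent_thoughts():
    # In Nhaka this is the live multi-agent pipeline; faked here for brevity.
    events = [
        ("scanner", "Water stain detected over lines 3-5."),
        ("historian", "Date 1923 is consistent with the letterhead."),
    ]
    for agent, thought in events:
        # SSE frames are "data: <payload>\n\n"
        yield f"data: {json.dumps({'agent': agent, 'thought': thought})}\n\n"
        await asyncio.sleep(0.1)  # simulate incremental reasoning

@app.get("/resurrect/stream")  # illustrative route
async def stream():
    return StreamingResponse(agent_thoughts(), media_type="text/event-stream")
```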
Hypothesis (Testing): We didn't just write unit tests; we used Property-Based Testing to generate thousands of random document scenarios to prove our data models (Coordinates, Confidence Scores) never break.
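Here is a minimal property test in that vein; the Region model is a toy stand-in for our real coordinate and confidence models:

```python
from dataclasses import dataclass
from hypothesis import given, strategies as st

@dataclass
class Region:
    x: int
    y: int
    confidence: float

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

@given(st.integers(), st.integers(),
       st.floats(allow_nan=False, allow_infinity=False))
def test_region_never_accepts_invalid_confidence(x, y, c):
    # Property: either construction fails, or the stored value is valid.
    try:
        r = Region(x, y, c)
    except ValueError:
        assert not (0.0 <= c <= 1.0)
    else:
        assert 0.0 <= r.confidence <= 1.0
```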
4. The Experience (Frontend)
React + TypeScript + Vite: For a type-safe, lightning-fast UI.
Agent Theater: A custom UI component that visualizes the multi-agent debate.
Challenges we ran into
The "Hallucination" Trap: Early versions of the Historian agent would confidently invent kings who never existed. We solved this by implementing an Adversarial Validator—an agent whose only job is to doubt the other agents.
Doke Shona Orthography: The Shona language used a unique alphabet (Doke's recommendation) between 1931 and 1955. Standard LLMs struggle with characters like ȿ (the whistled 's'). We had to equip the Linguist agent's system prompt with explicit Unicode mappings to transliterate them accurately, as in the sketch below.
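The mapping itself is tiny. As we understand the 1955 reform, the Doke characters collapse to modern digraphs roughly like this (verify against primary orthography references before archival use):

```python
# Doke-era characters and their assumed modern equivalents.
DOKE_TO_MODERN = {
    "\u0253": "b",   # ɓ  (implosive b)
    "\u0257": "d",   # ɗ  (implosive d)
    "\u023f": "sv",  # ȿ  (whistled s)
    "\u0240": "zv",  # ɀ  (whistled z)
}

def transliterate(text: str) -> str:
    for doke, modern in DOKE_TO_MODERN.items():
        text = text.replace(doke, modern)
    return text

assert transliterate("\u023f") == "sv"  # ȿ -> sv
```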
Async Orchestration: Coordinating 5 agents to talk about the same document without race conditions was difficult. We used a "Shared Context Board" pattern in Python where agents append insights sequentially.
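A minimal sketch of that pattern, with a single asyncio.Lock serializing writes (all names illustrative):

```python
import asyncio

class ContextBoard:
    """Shared board that agents append insights to, one at a time."""

    def __init__(self):
        self._insights: list[dict] = []
        self._lock = asyncio.Lock()

    async def post(self, agent: str, insight: str) -> None:
        async with self._lock:  # serialize writes across concurrent agents
            self._insights.append({"agent": agent, "insight": insight})

    def snapshot(self) -> list[dict]:
        return list(self._insights)

async def main():
    board = ContextBoard()
    await asyncio.gather(
        board.post("scanner", "Tear across paragraph two."),
        board.post("linguist", "Doke-era ȿ detected; transliterating to 'sv'."),
    )
    print(board.snapshot())

asyncio.run(main())
```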
Accomplishments that we're proud of
Transparency First: We didn't hide the AI. We made the "Thinking Process" the main feature. Users love seeing the Historian correct the Scanner.
Rigorous Testing: We achieved a high level of code reliability with Hypothesis property-based testing; the system is robust against malformed inputs.
Real Utility: This isn't just a chat app. It produces structured, validated archival data that can actually be used by libraries and museums.
What we learned
Confidence > Correctness: In archival work, it is better for the AI to say "I am 40% sure" than to lie and say "I am 100% sure." Uncertainty is valuable data.
Multi-Agent Superiority: A swarm of small, specialized prompts outperforms one giant "do everything" prompt. The agents check each other's blind spots.
What's next for Nhaka 2.0 Archive-Restoration
Mobile Field Unit (PWA): Most archives in rural Zimbabwe (like mission schools) don't have high-speed internet. We are optimizing the frontend into a Progressive Web App (PWA) with "Offline Mode," allowing researchers to scan documents in the field and sync the restoration when they return to connectivity.
Human-in-the-Loop (RLHF): AI is not infallible. We are building a "Correct & Teach" feature where verified historians can manually correct the Agent's transliterations. These corrections will be used to fine-tune our ERNIE models, creating a feedback loop that makes the system smarter about Zimbabwean history over time.
Language Expansion: Doke Shona was just the start. We plan to expand the Linguist Agent's capabilities to support Old Ndebele and Chewa, covering the major historical languages of the Federation era (1953-1963).
Public API for Libraries: We aim to decouple the backend and release the POST /resurrect endpoint as a paid API. This would allow institutions like the University of Zimbabwe Library to integrate our restoration engine directly into their existing digital repository software (like DSpace) without needing new hardware.
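For a sense of shape, that endpoint might look roughly like this; the schema fields are our guesses, not a published contract:

```python
from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()

class Resurrection(BaseModel):
    text: str                        # validated transcription
    confidence: float                # overall Validator score, 0.0-1.0
    low_confidence_spans: list[str]  # regions flagged rather than guessed

@app.post("/resurrect", response_model=Resurrection)
async def resurrect(scan: UploadFile):
    # ...run the five-agent pipeline on the uploaded scan...
    return Resurrection(text="...", confidence=0.4,
                        low_confidence_spans=["line 3"])
```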
Built With
- ernie-4.0
- fastapi
- hypothesis-testing
- novita-ai
- opencv
- paddleocr
- property-based-testing
- python
- react
- server-sent-events
- shadcn-ui
- supabase
- tailwind-css
- typescript
- vite