Inspiration

Growing up in Karachi, we watched family members miss government deadlines, pay middlemen for information that should be free, and make decisions about who to vote for with almost zero real data. Pakistan publishes its federal budget every year — 17 trillion rupees of public money. The National Assembly publishes attendance records for every MNA. But it's all buried in dense Excel files, scanned PDFs, and government portals that ordinary citizens simply cannot navigate.

The final push came when we discovered that 59 MNAs attended zero assembly sessions last year — yet drew full salaries. That data was publicly available on na.gov.pk. Nobody had made it readable.

We asked ourselves: what if a rickshaw driver in Karachi could open an app, type his city name, and instantly see whether his elected representative even showed up to work? What if a schoolteacher could see that less than 2% of the federal budget goes to education — and share that fact with one tap?

That question became HisaabKitaab.


What it does

HisaabKitaab Pakistan (حساب کتاب — "The Account Book") is a bilingual civic intelligence platform that transforms Pakistan's raw public finance and parliamentary data into something every citizen can understand, use, and share.

🕐 Live National Debt Clock

A real-time counter showing Pakistan's debt interest accruing at PKR 309,692 every single second, based on the official FY2025-26 debt servicing budget of PKR 9,775 billion. The moment you open the app, you feel the fiscal crisis not as an abstract statistic but as a live number ticking in front of you.

🏛️ WakalaCheck: MNA Accountability Explorer

Search any of Pakistan's 336 MNAs by name, city, or constituency. See their attendance percentage, bills sponsored, questions raised in assembly, AI performance grade, voting record on key bills, committee memberships, and contact details, all sourced from na.gov.pk and PILDAT reports. Compare any two MNAs side-by-side. Sort the full national leaderboard by attendance, bills, or questions raised.

🧮 Viral Tax Calculator & Share Card

Enter your monthly salary. See exactly how your income tax is split: 48.4% to debt servicing, 21.8% to provincial transfers, 15% to defence, under 2% to education. One tap generates a shareable image card for WhatsApp and Twitter.
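The split behind the share card can be sketched as a simple proportional allocation. This is a minimal sketch, assuming the tax amount has already been computed elsewhere (no tax slabs are modelled here). The names `BUDGET_SHARES` and `splitTax` are hypothetical, and because education is given only as "under 2%", it is folded into a remainder bucket rather than assigned an invented figure.

```typescript
// Shares of federal spending quoted in the description above (percent).
// Education is stated only as "under 2%", so it is folded into "other" here.
const BUDGET_SHARES: { category: string; percent: number }[] = [
  { category: "debt servicing", percent: 48.4 },
  { category: "provincial transfers", percent: 21.8 },
  { category: "defence", percent: 15 },
];

interface TaxSlice {
  category: string;
  amount: number; // PKR, rounded to the nearest rupee
}

// Splits an already-computed income tax amount across the categories.
// Whatever the listed shares do not cover is reported as "other".
function splitTax(taxPaid: number): TaxSlice[] {
  const slices: TaxSlice[] = BUDGET_SHARES.map((s) => ({
    category: s.category,
    amount: Math.round((taxPaid * s.percent) / 100),
  }));
  const allocated = slices.reduce((sum, s) => sum + s.amount, 0);
  slices.push({ category: "other", amount: taxPaid - allocated });
  return slices;
}
```

Computing the remainder last keeps the slices summing exactly to the tax paid, even after rounding.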

📊 3-Year Budget Trend Explorer & AI Anomaly Detector

Compare any ministry's budget across FY2023-24, FY2024-25, and FY2025-26. An AI-powered anomaly detector automatically flags suspicious patterns, such as the Climate Change budget dropping 40% despite catastrophic national flood risk.
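One simple way to pre-screen for such patterns before any AI step is a year-over-year threshold check. The following is a minimal sketch and not the project's actual detector: the `MinistryBudget` shape, the `flagAnomalies` name, and the 30% default threshold are all assumptions made for illustration.

```typescript
interface MinistryBudget {
  ministry: string;
  // Allocation per fiscal year, oldest first (e.g. FY2023-24, FY2024-25, FY2025-26).
  allocations: number[];
}

interface Anomaly {
  ministry: string;
  yearIndex: number;     // index of the later year in the compared pair
  changePercent: number; // negative for a cut, positive for an increase
}

// Flags every consecutive-year change whose magnitude reaches the threshold.
function flagAnomalies(budgets: MinistryBudget[], thresholdPercent: number = 30): Anomaly[] {
  const anomalies: Anomaly[] = [];
  for (const b of budgets) {
    for (let i = 1; i < b.allocations.length; i++) {
      const prev = b.allocations[i - 1];
      const curr = b.allocations[i];
      if (prev <= 0) continue; // no meaningful baseline to compare against
      const changePercent = ((curr - prev) / prev) * 100;
      if (Math.abs(changePercent) >= thresholdPercent) {
        anomalies.push({ ministry: b.ministry, yearIndex: i, changePercent });
      }
    }
  }
  return anomalies;
}
```

A rule like this only surfaces candidates; judging whether a flagged change is actually suspicious still needs context, which is where an AI or human review step would come in.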

📄 AI Legislative Bill Summarizer

Drag and drop any official parliamentary bill PDF. AI reads it and produces a plain-language 3-sentence summary in both English and Urdu Nastaliq script. No legal literacy required.

🤖 Bilingual Budget Chatbot

Ask anything about Pakistan's budget in English or Urdu. The chatbot queries 3 years of official Finance Division spreadsheet data before answering, so every response is grounded in real figures, not AI guesswork.

Everything is fully bilingual — Urdu Nastaliq and English throughout, with RTL layout support. Because Pakistan's public data belongs to all Pakistanis, not just English-literate ones.


How we built it

Data Pipeline

We downloaded official federal budget Excel files from finance.gov.pk for three fiscal years (FY2023-24, FY2024-25, FY2025-26) and built a custom Node.js XLSX parser that extracts ministry-level allocations, normalizes inconsistent naming across years, and filters out domestic debt principal roll-overs to show true operating budgets.

For parliamentary data, we built a Cheerio-based scraper against na.gov.pk to extract all 336 MNA profiles — English and Urdu names, constituency codes, party affiliations, attendance figures, bills sponsored, questions raised, committee memberships, and profile images — cross-referenced with PILDAT parliamentary watch reports.

AI Architecture

All AI features run on a Gemini 2.0 Flash → Groq LLaMA3-70B fallback chain. If Gemini fails or rate-limits, Groq silently takes over. Every endpoint also has a mock fallback using real computed data, so the app still returns a usable response when both providers are unavailable.
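The fallback chain can be sketched as an ordered list of providers tried one after another. This is a minimal sketch under stated assumptions: the `Provider` type and `callWithFallback` function are hypothetical names, and the real Gemini and Groq client calls are represented only as functions that return a promise.

```typescript
// A provider takes a prompt and resolves with text, or rejects on failure.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

interface ChainResult {
  source: string; // which provider (or "mock") produced the text
  text: string;
}

// Tries each provider in order; the first success wins.
// If every provider rejects, the precomputed mock answer is returned instead.
async function callWithFallback(
  prompt: string,
  providers: Provider[],
  mockAnswer: string,
): Promise<ChainResult> {
  for (const p of providers) {
    try {
      const text = await p.call(prompt);
      return { source: p.name, text };
    } catch {
      // Swallow the error (rate limit, timeout, ...) and try the next provider.
    }
  }
  return { source: "mock", text: mockAnswer };
}
```

Returning the `source` alongside the text lets the caller log which tier answered, which is useful for spotting how often the primary provider is failing.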

The budget chatbot uses a dynamic RAG approach: before each AI call, it scans all ministry data for terms matching the user's query, injects the relevant multi-year figures as context, then calls the AI — so answers are grounded in actual spreadsheet data, not hallucinations.
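The retrieve-then-prompt step described above can be sketched as plain keyword matching followed by context injection. This is a minimal sketch, not the project's code: the `MinistryRecord` shape, the `retrieve` and `buildPrompt` names, and the instruction wording in the prompt are all assumptions.

```typescript
interface MinistryRecord {
  ministry: string;
  keywords: string[]; // lowercase search terms, English and/or Urdu
  allocations: { fiscalYear: string; amount: number }[];
}

// Returns the records whose name or keywords appear in the query.
function retrieve(query: string, records: MinistryRecord[]): MinistryRecord[] {
  const q = query.toLowerCase();
  return records.filter((r) =>
    [r.ministry.toLowerCase(), ...r.keywords].some((term) => q.includes(term)),
  );
}

// Builds the prompt: instruction first, then matched figures, then the question.
function buildPrompt(query: string, records: MinistryRecord[]): string {
  const context = retrieve(query, records)
    .map(
      (r) =>
        `${r.ministry}: ` +
        r.allocations.map((a) => `${a.fiscalYear} = ${a.amount}`).join(", "),
    )
    .join("\n");
  return [
    "Answer using only the figures below. If none apply, say so.",
    context || "(no matching budget data)",
    `Question: ${query}`,
  ].join("\n\n");
}
```

Storing Urdu terms in `keywords` is what lets a query such as "تعلیم کا بجٹ کتنا ہے" match the same record as an English one.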

Stack

  • Frontend: React 19, TypeScript, Tailwind CSS, Framer Motion, Recharts, Vite PWA
  • Backend: Node.js, Express, TypeScript, pdf-parse, XLSX, Cheerio
  • AI: Google Gemini 2.0 Flash (primary), Groq LLaMA3-70B (fallback)
  • Data: finance.gov.pk, na.gov.pk, PILDAT parliamentary watch reports
  • Deployment: Vercel (frontend), Railway (backend)

Challenges we ran into

Data normalization was the hardest problem. The Pakistani government renames ministries between budget years — "Ministry of IT & Telecom" becomes "IT & Telecom Division" becomes "Information Technology." Without a careful name-mapping system, 3-year comparisons produce completely wrong results. We spent hours manually building and verifying this mapping against the official documents.
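A name-mapping layer of this kind can be sketched as a lookup from every known variant to one canonical name. This is a minimal sketch: `CANONICAL_NAMES`, `normalizeKey`, and `canonicalName` are hypothetical names, only the three variants quoted above are included, and picking "Information Technology" as the canonical form is an arbitrary choice for illustration.

```typescript
// Maps every known variant (lowercased, whitespace-collapsed) to one canonical name.
// Only the three variants quoted above are listed; a real table would be much longer.
const CANONICAL_NAMES: Record<string, string> = {
  "ministry of it & telecom": "Information Technology",
  "it & telecom division": "Information Technology",
  "information technology": "Information Technology",
};

// Trims, collapses runs of whitespace, and lowercases a raw spreadsheet label.
function normalizeKey(raw: string): string {
  return raw.trim().replace(/\s+/g, " ").toLowerCase();
}

// Returns the canonical ministry name, or null so that unmapped
// names can be surfaced for manual review instead of silently dropped.
function canonicalName(raw: string): string | null {
  const key = normalizeKey(raw);
  return Object.prototype.hasOwnProperty.call(CANONICAL_NAMES, key)
    ? CANONICAL_NAMES[key]
    : null;
}
```

Returning `null` for unknown labels, rather than passing the raw name through, makes gaps in the mapping visible before they corrupt a multi-year comparison.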

na.gov.pk is not an API. Extracting 336 MNA profiles required reverse-engineering their HTML structure, handling inconsistent page layouts, managing rate limits, and building a data hydration system for when the live site is slow or down during demos.

Urdu Nastaliq rendering was more complex than expected. Getting AI models to output proper Nastaliq script (not Roman Urdu) consistently required careful prompt engineering. RTL layout had to be applied selectively — not globally — because the app mixes English and Urdu content on the same screen.
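Applying RTL per block rather than globally can be sketched with a first-strong-character check, similar in spirit to what HTML's `dir="auto"` does. This is a minimal sketch; the `textDirection` helper is a hypothetical name, and it only considers the basic Arabic block and ASCII letters.

```typescript
// Matches characters in the main Arabic Unicode block (U+0600 to U+06FF), which Urdu uses.
const ARABIC_SCRIPT = /[\u0600-\u06FF]/;
const LATIN_LETTER = /[A-Za-z]/;

// Picks a direction from the first strongly directional letter in the text,
// so each block can get its own direction instead of flipping the whole page.
function textDirection(text: string): "rtl" | "ltr" {
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (ARABIC_SCRIPT.test(ch)) return "rtl";
    if (LATIN_LETTER.test(ch)) return "ltr";
  }
  return "ltr"; // digits or punctuation only: fall back to the default
}
```

In a React component the result could be passed straight to an element's `dir` attribute, so an Urdu summary and an English label on the same screen each lay out correctly.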

The debt clock math had to be verified precisely against the official budget document to ensure the PKR 309,692/second rate was accurate and defensible, not just dramatic. We worked backwards from the annual allocation figure and verified it three times before showing it to anyone.
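Working backwards from the annual allocation amounts to dividing it by the number of seconds in a year. The following is a minimal sketch assuming a 365-day year; the constant and function names are hypothetical. Under that convention the result lands at roughly PKR 310,000 per second, and the exact digits depend on the year-length convention and on how the allocation figure is rounded, so it may not reproduce the quoted rate to the last rupee.

```typescript
// 365 days * 24 h * 60 min * 60 s = 31,536,000 seconds (assumes a non-leap year).
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60;

// Annual debt servicing allocation in PKR (9,775 billion, per the text above).
const ANNUAL_DEBT_SERVICING_PKR = 9775 * 1e9;

// Converts an annual amount into a per-second rate.
function perSecondRate(annualAmount: number, secondsPerYear: number = SECONDS_PER_YEAR): number {
  return annualAmount / secondsPerYear;
}

// Amount accrued between two timestamps (milliseconds), for driving a ticking counter.
function accruedSince(startMs: number, nowMs: number, ratePerSecond: number): number {
  return ((nowMs - startMs) / 1000) * ratePerSecond;
}
```

A counter component would call `accruedSince` with the page-load time and the current time on every animation frame, rather than incrementing a stored value, so the display does not drift if frames are dropped.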

AI hallucination on budget figures was a real risk. Early versions of the chatbot would confidently invent budget numbers. The RAG pipeline — injecting real spreadsheet data as context before every AI call — was the solution, but building it correctly took iteration.


Accomplishments that we're proud of

336 real MNA records with Urdu names, attendance data, voting records, committee memberships, and contact details, all scraped from official sources and queryable in seconds. We are not aware of any other public tool in Pakistan that offers this.

A chatbot that actually knows the budget. When you ask "تعلیم کا بجٹ کتنا ہے", it returns the real figure from the Finance Division spreadsheet — not a hallucinated estimate. That required real engineering, not just an API call.

The AI features degrade gracefully instead of crashing. Gemini fails → Groq takes over → mock data fallback. Judges can demo this at any time of day, on any network, and it will still show something real and functional.

A PWA that installs on your phone. HisaabKitaab can be added to a home screen like a native app, works offline for budget data, and shows an install banner automatically. Most civic tech projects are desktop-only — ours works for the 78% of Pakistanis who access the internet on mobile.

We built this in under a week — two students from FAST-NUCES Karachi, working remotely, with no budget, using only publicly available data and free-tier APIs.


What we learned

We learned that the bottleneck in Pakistani civic transparency is not data availability — it's data accessibility. The government publishes everything. Finance Division releases budget books. The National Assembly lists every MNA's attendance. PILDAT publishes parliamentary watch reports. The problem is that none of it is designed to be used by citizens. It's designed to satisfy a compliance requirement, not to inform anyone.

We learned that bilingual design is not a feature — it's a requirement. Building the Urdu interface forced us to think about who actually needs this tool. If your civic transparency app only works in English, you've built it for the wrong audience. Pakistan's accountability crisis doesn't affect English speakers most — it affects everyone else.

We learned that AI is most powerful when it's invisible. The best AI features in HisaabKitaab are the ones users don't notice as "AI" — the MNA grade that just appears, the bill summary that just works, the chatbot that just knows the right figure. When AI draws attention to itself, it feels like a gimmick. When it quietly removes a barrier between a citizen and their government's data, it becomes civic infrastructure.

We also learned that scraping na.gov.pk at 2am is a deeply humbling experience.


What's next for HisaabKitaab — Pakistan's AI Accountability Platform

Provincial assemblies — the same architecture scales to Punjab (371 MPAs), Sindh, KPK, and Balochistan with the same scraper pointed at different source URLs. Pakistan's accountability gap is even worse at the provincial level.

WhatsApp bot: wrapping the bilingual chatbot backend in a WhatsApp Business API to reach citizens on low-end phones or unreliable connections who would never install a separate app. Pakistan has 100M+ WhatsApp users. Most civic tech never reaches them.

Citizen audit layer — allowing citizens to upload geotagged photos of government development projects (PSDP-funded schools, roads, hospitals) and report whether they were actually built. Crowd-sourced ground-truthing of budget claims.

Push alerts — subscribe to notifications when your MNA votes on a bill you care about, or when a ministry's budget changes significantly year-over-year.

Open data API — a public REST API returning structured Pakistani budget and parliamentary data for journalists, researchers, and NGOs who need machine-readable civic data.

Sustainability — we are in conversations with Code for Pakistan and civil society organizations about long-term hosting, data partnerships with PILDAT, and potential integration with official open government data initiatives.

Pakistan's public data has always been technically available. HisaabKitaab makes it actually accessible — and we're just getting started.
