Brief Project Description

ScoutBot is an open-source, fully automated opportunity-discovery platform that scrapes 15+ scholarship, fellowship, and internship sites every week and delivers a curated digest to 507 Nigerian university students -- completely autonomously, at zero cost, with zero paid marketing.

Public Demo / Repository

GitHub: https://github.com/TechHub-Extensions/ScoutBot
Live site: https://TechHub-Extensions.github.io/ScoutBot/
Live database: (paste your public Google Sheet link here)

The email digest open in a browser (email_preview.html)
The live Google Sheet with listings
The terminal output showing sent confirmations
The GitHub repo homepage

Problem Our Startup Solves

Every week, hundreds of scholarships, fellowships, and internships open for Nigerian students -- on a national and international scale -- and most of them expire unfound.

I built ScoutBot because close friends of mine have missed brilliant opportunities: the Afara Initiative, Microsoft internships, fully funded exchanges. Not because they weren't qualified. Because they found out too late -- days after the deadline, weeks after the listing was buried in a blog post they stumbled on by accident. Watching that happen made something clear: the problem was never the opportunities. We're told at every conference that Africa needs more opportunities, and we do -- but the bigger, quieter problem is distribution.

Nigeria has over 2 million university students. The average student checks 6-8 different websites, WhatsApp groups, and social media feeds just to stay informed. By the time an opportunity surfaces in their feed, it's often already crowded or closed.

I wanted to build something that wakes up every week, does the searching for them, and delivers only the real ones -- clean, current, directly to their inbox. No noise. No dead links. No finding out too late.

How It Was Built

The architecture is intentionally lean -- the entire system runs on free tiers. Every design decision started with one question: will this still run correctly in six months when I'm not looking at it?

Scraping layer -- Python

15+ site-specific scrapers run using BeautifulSoup4 and Requests. Each scraper handles its source's structure, filters by keyword and recency, and passes only credible, current listings downstream. Items are filtered before they ever touch the database.
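The keyword-and-recency filter described above might look roughly like the sketch below. It is illustrative only: the Listing fields, the KEYWORDS tuple, and the 30-day MAX_AGE_DAYS cutoff are assumptions made for the example, not values taken from the ScoutBot repository.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Listing:
    # Hypothetical shape of one scraped item.
    title: str
    link: str
    posted: date


# Assumed values for illustration; the real keyword list and cutoff may differ.
KEYWORDS = ("scholarship", "fellowship", "internship")
MAX_AGE_DAYS = 30


def is_relevant(listing, today, keywords=KEYWORDS, max_age_days=MAX_AGE_DAYS):
    """Return True if the title matches a keyword and the post is recent."""
    title = listing.title.lower()
    has_keyword = any(word in title for word in keywords)
    age = today - listing.posted
    is_recent = timedelta(0) <= age <= timedelta(days=max_age_days)
    return has_keyword and is_recent


def filter_listings(listings, today):
    """Keep only listings that pass the filter, before anything is stored."""
    return [item for item in listings if is_relevant(item, today)]
```

Passing today in as an argument, rather than calling date.today() inside the function, keeps the filter easy to check against fixed dates.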

Storage layer -- Google Sheets + gspread

Surviving opportunities land in a live two-tab Google Sheet (Nigeria / International) with the columns Title, Category, Application Link, Deadline, and Date Added. Schema migration is automatic -- if the bot detects an old column structure, it resets headers on the next run without manual input. cleanup.py runs after every scrape and removes expired listings, finding each column by its header name rather than its position, which keeps it robust to future schema changes.
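A minimal sketch of header-name column lookup follows, assuming the sheet has already been read into a list of rows with the header first (the shape gspread's get_all_values() returns). The function name remove_expired, the ISO date format, and the choice to keep rows whose deadline cannot be parsed are all assumptions for illustration, not a copy of cleanup.py.

```python
from datetime import date, datetime


def remove_expired(rows, today, deadline_header="Deadline", date_format="%Y-%m-%d"):
    """Drop rows whose deadline is before `today`.

    The deadline column is located by header name, so reordering or adding
    columns does not break the cleanup. The date format is an assumption.
    """
    if not rows:
        return rows
    header, body = rows[0], rows[1:]
    col = header.index(deadline_header)  # raises ValueError if the header is missing
    kept = [header]
    for row in body:
        raw = row[col].strip() if col < len(row) else ""
        try:
            deadline = datetime.strptime(raw, date_format).date()
        except ValueError:
            # Assumed policy: keep rows with a blank or free-text deadline.
            kept.append(row)
            continue
        if deadline >= today:
            kept.append(row)
    return kept
```

Because the lookup uses header.index("Deadline") instead of a fixed position, the same function works whether Deadline is the fourth column or the first.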

Delivery layer -- Gmail SMTP + IMAP bounce tracking

notify.py builds a responsive HTML email, assembles the subscriber list from multiple sources, deduplicates and validates every address, records bounces in a persistent tab in Google Sheets, and sends via Gmail SMTP. One bad address does not abort the whole send -- each address is handled independently.
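The "one bad address does not abort the whole send" behaviour can be sketched as a loop that catches errors per recipient. This is an assumption-laden illustration, not notify.py itself: dedupe, send_all, and the injected send_one callable are hypothetical names, and the real SMTP call is left to the caller.

```python
import smtplib


def dedupe(addresses):
    """Normalise case and whitespace, and drop repeats while keeping order."""
    seen = set()
    unique = []
    for raw in addresses:
        addr = raw.strip().lower()
        if addr and addr not in seen:
            seen.add(addr)
            unique.append(addr)
    return unique


def send_all(addresses, send_one):
    """Send to each address independently.

    `send_one` is a hypothetical callable that delivers one email and raises
    on failure (for example a thin wrapper around smtplib.SMTP.send_message).
    A failure is recorded and the loop moves on to the next address.
    """
    sent, failed = [], {}
    for addr in dedupe(addresses):
        try:
            send_one(addr)
        except (smtplib.SMTPException, OSError) as exc:
            failed[addr] = repr(exc)
        else:
            sent.append(addr)
    return sent, failed
```

The failed mapping is the kind of record that could then be written to a bounce-tracking tab, so that known-bad addresses are skipped on later runs.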

Scheduling -- cloud-hosted Python scheduler

The full pipeline fires every Sunday at 7 AM WAT via a cloud-hosted scheduled job. No VPS, no paid server, no manual trigger. It runs indefinitely at zero cost.
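To make the "every Sunday at 7 AM WAT" rule concrete, here is a standard-library sketch that computes the next run time. The tech stack lists python-schedule for the real scheduler; this example deliberately does not use it, and next_sunday_run is a hypothetical helper. WAT is treated as a fixed UTC+1 offset, since it has no daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# West Africa Time: fixed UTC+1 offset, no daylight saving.
WAT = timezone(timedelta(hours=1), "WAT")


def next_sunday_run(now):
    """Return the next Sunday 07:00 WAT strictly after `now`.

    `now` should be a timezone-aware datetime; it is converted to WAT first.
    """
    local = now.astimezone(WAT)
    days_ahead = (6 - local.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = (local + timedelta(days=days_ahead)).replace(
        hour=7, minute=0, second=0, microsecond=0
    )
    if candidate <= local:
        # Already at or past this week's slot: move to next Sunday.
        candidate += timedelta(days=7)
    return candidate
```

A long-running worker could sleep until next_sunday_run(datetime.now(timezone.utc)) and then trigger the pipeline, which is broadly the job a scheduling library takes care of.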

Community distribution on top

Contributors extended the platform with a WhatsApp broadcast bridge (Node.js + whatsapp-web.js + SQLite + Express) and a Telegram digest bot. ScoutBot now reaches students across three channels -- email, WhatsApp, and Telegram.

Tech stack: Python 3.11 / BeautifulSoup4 / Requests / gspread / Google Sheets API / Gmail SMTP and IMAP / python-schedule / Node.js / whatsapp-web.js / SQLite

Challenges We Ran Into

Cloudflare blocking

Major opportunity sites block scraping from cloud server IPs even with real browser user agents. Fix: direct site-specific scrapers targeting listing pages, with parser logic tailored to each source's structure.

Subscriber email validation

Early digests bounced against corporate and university mail servers. I built a validation pipeline: regex format checking, MX record lookup via dnspython, a persistent Bounced tab in Google Sheets blocking resends to known-bad addresses, and per-address SMTP error handling.
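The validation steps listed above could be chained roughly as follows. This is a sketch under assumptions: the regex is a deliberately simple format check rather than the pattern ScoutBot uses, is_deliverable and dns_has_mx are hypothetical names, and the bounced set stands in for the contents of the Bounced tab. The MX check is passed in as a callable so the core logic runs without network access.

```python
import re

# Simple format check; an assumed pattern, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def is_deliverable(address, bounced, has_mx):
    """Apply format check, bounce-list check, then MX lookup, in that order.

    `bounced` is a set of lowercase addresses known to have bounced.
    `has_mx` is a callable taking a domain and returning True or False.
    """
    addr = address.strip().lower()
    if not EMAIL_RE.match(addr):
        return False
    if addr in bounced:
        return False
    domain = addr.rsplit("@", 1)[1]
    return has_mx(domain)


def dns_has_mx(domain):
    """One possible `has_mx`, using dnspython (third-party, imported lazily)."""
    import dns.exception
    import dns.resolver

    try:
        dns.resolver.resolve(domain, "MX")  # raises if there is no MX answer
        return True
    except dns.exception.DNSException:
        return False
```

Ordering the checks from cheapest to most expensive means the DNS query only happens for addresses that are well-formed and not already known to bounce.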

Building for longevity on zero budget

Every component had to answer: what happens when I cannot afford to fix this -- when the laptop breaks, the internet drops, or I'm deep in exams? That constraint drove the auto-migration logic, header-name column lookups, the retention cap on listings, and persistent bounce tracking. The system has to take care of itself.

Knowing when to cut

I prototyped an AI scoring layer using Google's Gemini Flash -- each opportunity was scored 1-10 for relevance to Nigerian students. Technically it worked. But on a free-tier quota it became the most fragile part of the system and the first thing to fail under load. I cut it. The lesson: AI adds real value when the problem is genuinely ambiguous. When budget is tight and reliability is the product, simplicity wins. AI is the next chapter, not the foundation.

Accomplishments We're Proud Of

507 verified subscribers, grown entirely by word of mouth. Zero paid marketing. Zero advertising budget.

A three-channel distribution system (email, WhatsApp, Telegram) built in stages -- core by one person, extended by contributors who believed in the project enough to open pull requests.

Zero operating cost. As a student from a middle-class background in a lower-income country, this is the accomplishment I'm most proud of. To build is one thing. To build something that sustains itself -- when you cannot always afford to sustain it yourself -- is another.

What We Learned

When to leave AI out. Gemini was the most technically interesting part of the project and the first thing cut. The real lesson is not that AI is bad -- it's that AI layers earn their place when the problem is genuinely ambiguous. When the core value is reliability, complexity is a liability.

Prompt engineering still matters, even when you cut the feature. Before cutting Gemini, the scoring prompt went through six iterations. The phrase "Score 7+ ONLY if the opportunity is explicitly open to Nigerians or Africans broadly" reduced irrelevant high scores by roughly 40%. That precision now lives in how I write every filtering rule in the system.

Build for the moment you're not there. Every decision -- auto-migration, bounce tracking, the 23-day cleanup window -- was made by asking: what happens when I cannot fix this? That mindset is what separates a project from a product.

Future Plans for Our Startup

ScoutBot is already a product. The next phase is making it a platform.

AI-powered opportunity matching -- students fill a short profile (course, year, interests, GPA) and the system ranks which opportunities are most relevant to them personally. The data is already there; the matching layer is next.

University partnerships -- licensing the digest to student unions and career offices across Nigerian universities, reaching students through institutional channels rather than individual sign-ups.

Mobile app -- push notifications the moment a high-match opportunity drops, before it's crowded.

Pan-African expansion -- extending coverage to Ghana, Kenya, South Africa, and diaspora scholarship databases.

Structured contributor programme -- a bounty system for adding new scrapers so coverage grows beyond what one person can maintain.

The market is 2 million Nigerian university students. The infrastructure is already built, already running, already serving real people. What's next is scale.

Every week, hundreds of scholarships, fellowships, and internships open for Nigerian students, and most of them still expire unfound. ScoutBot is changing that, one digest at a time.

Team: Kamsi Richard Ivanna -- creator and sole builder of the core system (scraping engine, Google Sheets pipeline, email delivery, scheduling). Contributors extended it with WhatsApp and Telegram distribution.
