Inspiration

As UIUC students, we've watched countless peers struggle to break into research. The process is opaque—you don't know which labs are actively looking for undergrads, what their actual requirements are, or how to craft emails that don't immediately get ignored. Most students either spam generic emails or give up entirely. We wanted to build an AI agent system that solves both the discovery problem and the outreach barrier.

What it does

LabScout is a multi-agent AI system that automates research opportunity discovery at UIUC. You upload your resume, specify your interests, and our agent pipeline goes to work:

  1. Discovery Agent searches across UIUC department pages for active research labs
  2. Parsing Agent extracts structured information (PI names, research focus, requirements) from unstructured faculty websites
  3. Matching Agent uses semantic similarity between your background and lab requirements to rank opportunities
  4. Outreach Agent generates personalized cold emails that reference specific projects and demonstrate genuine fit

The result: instead of spending hours manually searching and crafting emails, students get ranked opportunities with one-click personalized outreach in under 30 seconds.
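The four-step pipeline above can be sketched as a simple sequential orchestrator. This is a minimal illustration, not our actual implementation: the `PipelineState` fields and the agent callables (`discover`, `parse`, `match`, `outreach`) are hypothetical stand-ins for the real agent calls routed through Keywords AI.

```python
from dataclasses import dataclass, field

@dataclass
class PipelineState:
    """Shared context passed between agents (illustrative structure)."""
    interests: str
    resume_text: str
    lab_urls: list = field(default_factory=list)
    labs: list = field(default_factory=list)    # parsed lab records
    ranked: list = field(default_factory=list)  # (score, lab) pairs, best first
    emails: dict = field(default_factory=dict)  # lab name -> draft email

def run_pipeline(state, discover, parse, match, outreach):
    """Run the four agents in sequence, each enriching the shared state."""
    state.lab_urls = discover(state.interests)            # 1. Discovery Agent
    state.labs = [parse(url) for url in state.lab_urls]   # 2. Parsing Agent
    state.ranked = match(state.resume_text, state.labs)   # 3. Matching Agent
    top_lab = state.ranked[0][1]                          # 4. Outreach Agent
    state.emails[top_lab["name"]] = outreach(state.resume_text, top_lab)
    return state
```

Keeping all intermediate results on one state object is what lets a later agent reference earlier output (e.g. the email citing a specific parsed project) without re-fetching anything.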

How we built it

  • Backend: Python FastAPI handling agent orchestration and job queue management
  • Agent Architecture: Four specialized agents built on Keywords AI's LLM gateway. Claude Opus 4.5 handles search query generation, HTML parsing, complex matching logic and reasoning, and email generation (strong instruction-following); Keywords AI routing let us optimize model selection and cost per agent task.
  • Search & Parsing: Tavily API for web discovery, BeautifulSoup for HTML extraction
  • Matching Engine: text-embedding-3-small via Keywords AI to generate resume/requirement embeddings, cosine similarity for ranking
  • Frontend: React with Vite, clean card-based UI for opportunity browsing
  • Resume Processing: PyPDF2 for text extraction, GPT-4o for structured skill/experience parsing
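The matching engine boils down to cosine similarity over embeddings. Here is a minimal sketch of that ranking step; it assumes the resume and lab-requirement vectors have already been produced (in our stack, by text-embedding-3-small via Keywords AI), so no API calls appear here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_labs(resume_vec, labs):
    """labs: list of (lab_name, requirement_vec) pairs.

    Returns (score, lab_name) pairs sorted best-first.
    """
    scored = [(cosine(resume_vec, vec), name) for name, vec in labs]
    return sorted(scored, reverse=True)
```

In production you would use numpy for speed, but the math is identical: a lab whose requirement embedding points in the same direction as your resume embedding scores near 1.0.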

Challenges we ran into

  • Unstructured web data: UIUC faculty pages have zero consistency. Some labs have detailed READMEs, others just list papers. We had to make our parsing agent robust to HTML chaos and missing information—this required careful prompt engineering and fallback logic.
  • Agent coordination: Getting four LLM agents to reliably pass data between steps without hallucinating or dropping context was harder than expected. We used Keywords AI's observability dashboard to debug which agent calls were failing and why.
  • Semantic matching quality: Early versions ranked opportunities poorly because we were only matching on keywords. We switched to embedding-based similarity with GPT-4o providing reasoning for each match, which dramatically improved relevance.
  • Demo timing: Orchestrating four sequential LLM calls meant latency was initially 45+ seconds. We parallelized the parsing agent across multiple URLs and added progress indicators to keep the UX responsive.
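The latency fix above, fanning the parsing agent out across URLs, is a standard asyncio pattern. This sketch uses a sleep as a stand-in for the real LLM/network call and a semaphore to cap concurrency (the `limit=5` value is illustrative, not our tuned setting):

```python
import asyncio

async def parse_lab(url):
    """Stand-in for the parsing agent's LLM call (hypothetical)."""
    await asyncio.sleep(0.05)  # simulate network/LLM latency
    return {"url": url, "status": "parsed"}

async def parse_all(urls, limit=5):
    """Parse many faculty pages concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            return await parse_lab(url)

    # gather preserves input order, so results line up with urls
    return await asyncio.gather(*(bounded(u) for u in urls))
```

With N pages, wall-clock time drops from roughly N × per-page latency to about ceil(N / limit) × per-page latency, which is what brought our demo under 30 seconds.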

Accomplishments that we're proud of

  • Built a working multi-agent system that actually produces useful results—not just a proof-of-concept
  • Achieved 85%+ match quality in our testing with real UIUC CS/ECE labs
  • Generated genuinely good cold emails that reference specific projects and demonstrate research fit (we'd actually send these)
  • Clean, intuitive UX that doesn't feel like a hackathon project
  • Successfully used Keywords AI for model routing—different agents use different models based on task requirements, with full observability

What we learned

  • LLM observability is critical: Keywords AI's dashboard saved us hours of debugging by showing exactly which prompts were failing and why
  • Semantic search requires context: Raw embeddings aren't enough—having an LLM explain why something matches helps validate results
  • Web scraping with LLMs >> traditional parsers: Instead of writing regex for every faculty page format, one well-prompted LLM handles it all
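The "one well-prompted LLM" approach above, plus the fallback logic mentioned under Challenges, looks roughly like this. The prompt text, field names, and the `llm` callable are all illustrative; the key idea is validating the model's JSON and guaranteeing every field exists even when a faculty page (or the model) misbehaves:

```python
import json

EXTRACT_PROMPT = """Extract the following fields from this faculty page HTML
and return JSON only: pi_name, research_focus, requirements.

HTML:
{html}"""

FIELDS = ("pi_name", "research_focus", "requirements")

def parse_faculty_page(html, llm):
    """Ask an LLM for structured JSON; fall back to nulls on bad output."""
    raw = llm(EXTRACT_PROMPT.format(html=html))
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            data = {}
    except json.JSONDecodeError:
        data = {}
    # Fallback: every field is present, possibly None, never a KeyError
    return {k: data.get(k) for k in FIELDS}
```

Downstream agents can then rely on a fixed schema regardless of how chaotic the source page was.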

What's next for LabScout

Short-term:

  • Launch for users at UIUC
  • Add filters for commitment level (10hrs/week vs full-time summer)
  • Email tracking to see which messages get responses, so we can fine-tune and improve our cold-outreach strategy

Long-term:

  • Expand beyond UIUC to other universities: the Illinois system, the Big Ten, and eventually nationwide
  • Response optimization: Learn which email styles get replies, use RL to improve generation
  • Interview prep agent: Once you get a response, help prepare for the research interview
  • Network effects: Aggregate anonymous data on which labs are responsive, best times to reach out, typical response rates
  • Integration with UIUC's official research portal (if we can get institutional buy-in)

GitHub: https://github.com/aadivyaraushan/LabScout
