Inspiration

As international students, we found the process of choosing the right university and figuring out how to fund our education overwhelming. The data we needed (tuition costs, acceptance rates, alumni experiences) was scattered across thousands of outdated, hard-to-navigate college websites. We realized that students needed a smart, centralized companion that could cut through the noise. That inspired us to build UniBuddy: an AI-powered platform that automatically builds a rich catalog of universities and pairs it with a conversational assistant to guide students through their higher-education journey.

What it does

UniBuddy is an intelligent, automated university matching platform. It acts as a 24/7 personalized advisor for prospective students. Instead of manually filtering through generic search engines, a user can fill in their profile — GPA, test scores, budget, preferred countries, and personal statement — and UniBuddy instantly matches them to the universities most likely to accept them.

The platform runs a semantic search over a continuously updated catalog of 1,000+ universities to surface well-fitting recommendations. It ranks every university with a transparent hybrid match score that combines semantic vector similarity with structured criteria (GPA fit, budget fit, location fit, campus fit, and IELTS fit). Each result comes with human-readable match reasons, a competitiveness label (Safe / Balanced / Ambitious), and a tier (Strong Fit / Target / Reach).

Furthermore, when the user wants to go deeper on a specific school, UniBuddy generates a full AI Application Strategy — a personalized roadmap with a timeline, document checklist, gap analysis, and fallback alternatives — all powered by GPT-4o and refinable through conversational chat.

How we built it

We built UniBuddy using a modern, AI-native hybrid architecture:

Frontend & Foundation: We used Next.js and React for a responsive, modern web interface with a clean onboarding flow and real-time dashboard.
Data Orchestration & Automation: We built a sophisticated enrichment pipeline. We fetch large baseline datasets using BrightData web scrapers, then dispatch AI agents to crawl specific official university URLs, programmatically extracting complex data points (tuition, acceptance rates, GPAs) into a structured schema.
Vector Database & Semantic Search: Every piece of structured data is vectorized using OpenAI Embeddings (text-embedding-3-small) and stored in a scalable Zilliz (Milvus) vector database, making our catalog lightning-fast and semantically searchable.
Hybrid Scoring Engine: At query time, the student's profile embedding is compared against 1,000+ university vectors via cosine similarity (60% weight), combined with a structured score across five dimensions — GPA fit, budget fit, location fit, campus fit, and IELTS fit (40% weight) — to produce a final match score from 0–100.
AI Strategy Generator: We integrated OpenAI GPT-4o to generate personalized application strategies per university, including timeline milestones, gap analysis, required documents, and a fallback alternative. Strategies are refinable through an in-app conversational chat and exportable as a PDF.
Real-time Fallback & Web Intelligence: To reduce hallucinations and outdated answers, we integrated Exa as a live RAG fallback. If the catalog lacks data for a specific query, Exa searches the live web. We also use Exa's semantic search to find real-world alumni profiles (LinkedIn, Facebook) so students can connect with people who walked the path before them.
Deep Scan: Users can trigger an on-demand AI crawl of any university's official website to enrich its data in real time — extracting programs, tuition, admission requirements, and campus details autonomously.
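
The Hybrid Scoring Engine described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the 60/40 weights and the five dimensions come from the description, while the function names, the equal weighting of the five structured dimensions, the [0, 1] range of each fit score, and the clamping of negative similarities to zero are assumptions made for the example.

```python
import math

SEMANTIC_WEIGHT = 0.6    # cosine similarity share, as stated in the write-up
STRUCTURED_WEIGHT = 0.4  # share of the five structured dimensions


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_match_score(profile_vec, university_vec, fits):
    """Blend semantic and structured fit into a 0-100 score.

    `fits` maps dimension names (gpa, budget, location, campus, ielts) to
    scores in [0, 1]. Equal weighting across dimensions is an assumption.
    """
    if not fits:
        raise ValueError("at least one structured fit score is required")
    # Assumption: negative similarities are treated as "no semantic match".
    semantic = max(0.0, cosine_similarity(profile_vec, university_vec))
    structured = sum(fits.values()) / len(fits)
    blended = SEMANTIC_WEIGHT * semantic + STRUCTURED_WEIGHT * structured
    return round(100 * blended, 1)


if __name__ == "__main__":
    fits = {"gpa": 0.9, "budget": 0.7, "location": 1.0, "campus": 0.5, "ielts": 0.8}
    print(hybrid_match_score([0.2, 0.8, 0.1], [0.3, 0.7, 0.2], fits))
```

In a real deployment the similarity would typically come back from the vector database query rather than being recomputed in application code; the pure-Python version here only keeps the sketch self-contained.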

Challenges we ran into

Our biggest challenge was data fragmentation. Every university structures its website differently, which makes traditional scraping nearly impossible. We solved this with an orchestration layer that tries multiple methods in order: first the UniBuddy scraper parses the page, then BrightData Web Scrapers take over if that fails, and finally BrightData SERP queries fill in any fields that are still missing.
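
The fallback order described above can be pictured as a simple chain that only asks the next source for fields that are still empty. The sketch below is illustrative: the field names, the `Fetcher` type, and `enrich_with_fallbacks` are hypothetical, and each fetcher stands in for a real integration (the UniBuddy scraper, BrightData Web Scrapers, BrightData SERP) whose actual client calls are not shown.

```python
from typing import Callable, Dict, List, Optional

# A fetcher takes a university URL and returns whatever fields it could extract.
Fetcher = Callable[[str], Dict[str, Optional[str]]]

# Illustrative field names; the real schema is larger.
REQUIRED_FIELDS = ("tuition", "acceptance_rate", "gpa")


def enrich_with_fallbacks(url: str, fetchers: List[Fetcher]) -> Dict[str, Optional[str]]:
    """Try each fetcher in order, filling only the fields that are still missing."""
    record: Dict[str, Optional[str]] = {field: None for field in REQUIRED_FIELDS}
    for fetch in fetchers:
        missing = [f for f in REQUIRED_FIELDS if record[f] is None]
        if not missing:
            break  # everything is filled, no need to call further sources
        try:
            result = fetch(url)
        except Exception:
            continue  # a failing source should not abort the whole chain
        for field in missing:
            if result.get(field) is not None:
                record[field] = result[field]
    return record
```

Earlier sources win when two sources disagree, which matches the idea that later methods only fill gaps rather than overwrite data.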

Another challenge was keeping the conversational AI grounded. Standard LLMs hallucinated university deadlines and costs. We solved this by injecting results retrieved from the Zilliz vector database (for core catalog data) and the Exa search API (for live web data) directly into the GPT-4o system prompt.
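
As a rough illustration of this grounding step, the sketch below assembles a system prompt from already-retrieved text. The function name, the prompt wording, and the two-section layout are assumptions for the example; the retrieval calls to Zilliz and Exa and the GPT-4o request itself are deliberately left out.

```python
def build_grounded_system_prompt(catalog_facts, web_snippets):
    """Build a system prompt that restricts the model to retrieved context.

    `catalog_facts` are strings retrieved from the vector database and
    `web_snippets` are strings from live web search (both hypothetical inputs).
    """
    def as_bullets(items):
        if not items:
            return "- (none retrieved)"
        return "\n".join(f"- {item}" for item in items)

    return (
        "You are a university admissions assistant.\n"
        "Answer ONLY from the context below. If the context does not contain "
        "the answer, say that you do not know.\n\n"
        "Catalog data:\n" + as_bullets(catalog_facts) + "\n\n"
        "Live web results:\n" + as_bullets(web_snippets)
    )


if __name__ == "__main__":
    print(build_grounded_system_prompt(
        ["Example University: application deadline 15 January"],
        [],
    ))
```

Keeping the "none retrieved" marker explicit gives the model a clear signal that a section is empty, instead of leaving it to guess.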

A third challenge was scoring fairness — ensuring the match score felt meaningful and explainable, not like a black box. We designed the hybrid scoring system to be fully transparent, generating human-readable match reasons and roadmap suggestions for every result so students understand exactly why a university was recommended.
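
One simple way to turn per-dimension scores into human-readable reasons is sketched below. The thresholds (0.8 and 0.5), the wording of the messages, and the `explain_match` name are hypothetical choices for illustration, not the values used in UniBuddy.

```python
STRONG_FIT = 0.8  # hypothetical cut-off for "strong fit"
WEAK_FIT = 0.5    # hypothetical cut-off below which a dimension is a gap

LABELS = {
    "gpa": "GPA",
    "budget": "Budget",
    "location": "Location",
    "campus": "Campus",
    "ielts": "IELTS",
}


def explain_match(fits):
    """Split per-dimension scores in [0, 1] into match reasons and gaps."""
    reasons, gaps = [], []
    for key, score in fits.items():
        label = LABELS.get(key, key)
        if score >= STRONG_FIT:
            reasons.append(f"{label} is a strong fit ({score:.0%})")
        elif score < WEAK_FIT:
            gaps.append(f"{label} is a gap to address ({score:.0%})")
        # Scores between the two thresholds are neither highlighted nor flagged.
    return reasons, gaps
```

The gaps list is the natural input for roadmap suggestions, since each entry names a dimension the student could work on.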

Accomplishments that we're proud of

We are incredibly proud of the hybrid enrichment pipeline. We successfully built an architecture that doesn't just passively read static data, but autonomously crawls, cleans, vectorizes, and serves new university data using multiple AI agents working in tandem.

We're also proud of the AI Application Strategy feature — using GPT-4o to bridge the gap between a student's profile and a concrete, actionable roadmap. The fact that the strategy is conversationally refinable and exportable as a polished PDF makes it genuinely useful, not just a demo gimmick.

Finally, the hybrid scoring system, which combines semantic vector similarity with structured academic criteria, produces recommendations that feel accurate and explainable, something we rarely see in AI-driven discovery tools.

What we learned

We learned a tremendous amount about AI agent orchestration, vector embeddings, and Retrieval-Augmented Generation (RAG). Working with Zilliz taught us how to efficiently handle high-dimensional data, while orchestrating tools like Exa and BrightData taught us that combining multiple specialized data-fetching APIs yields significantly better results than relying on a single AI model to "know everything."

We also learned that explainability matters as much as accuracy. Users trust a recommendation more when they can see exactly why it was made — match reasons, gap analysis, and competitiveness labels turned out to be just as important as the score itself.

What's next for UniBuddy

Moving forward, we plan to:

Expand our automated pipeline to cover global scholarships tailored to underrepresented communities.
Introduce an Application Tracker dashboard where students can save the universities recommended by the AI and track their application status end-to-end.
Implement personalized email alerts for upcoming application deadlines based on the user's semantic profile.
Add peer comparison — letting students see anonymized profiles of admitted students at their target universities to benchmark themselves realistically.
Explore partnerships with universities to bring verified, real-time admissions data directly into the platform.

Built With

brightdata
exa
nextjs
tinyfish
zilliz

Updates

Hàn Đỗ started this project — Mar 21, 2026 01:10 AM EDT
