Inspiration
As Stony Brook students ourselves, we felt the friction firsthand. Finding basic information — withdrawal deadlines, tuition details, housing policies — meant digging through dozens of pages on the SBU website, never sure if what we found was current. We wanted a single place where any Seawolf could get a reliable, cited answer in seconds.
That frustration expanded into a bigger question: what if AI could serve the entire Seawolf lifecycle — not just current students, but graduates looking to reconnect, and learners who want to actually understand their material, not just get the answer?
SeaWolves was born from that question.
What We Learned
Building SeaWolves taught us how much engineering goes into making AI trustworthy, not just functional.
On the RAG side, we learned that retrieval quality matters far more than the LLM itself. A smarter model cannot fix bad chunks. We spent significant time on source authority scoring and reranking — so that the registrar's official page always outranks a student blog.
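The idea above can be sketched as a small reranking pass. This is a minimal illustration, not the project's actual code: the domain weights, the longest-prefix matching rule, and the 0.7/0.3 blend are assumptions for the example.

```python
# Hypothetical authority-weighted reranking: blend retrieval score with a
# per-domain weight so official pages outrank student blogs. All weights
# and the blend ratio below are illustrative assumptions.
AUTHORITY = {
    "stonybrook.edu/registrar": 1.0,
    "stonybrook.edu": 0.8,
    "blogs.stonybrook.edu": 0.3,
}

def authority(url: str) -> float:
    # Longest matching key wins, so "blogs.stonybrook.edu" beats the
    # broader "stonybrook.edu" entry for blog URLs.
    matches = [(len(k), w) for k, w in AUTHORITY.items() if k in url]
    return max(matches)[1] if matches else 0.1

def rerank(chunks: list[dict]) -> list[dict]:
    # chunks: [{"url": ..., "score": cosine similarity in [0, 1]}, ...]
    for c in chunks:
        c["final"] = 0.7 * c["score"] + 0.3 * authority(c["url"])
    return sorted(chunks, key=lambda c: c["final"], reverse=True)
```

With this blend, a registrar page with a slightly lower similarity score still lands above a student blog post that merely mentions the same keywords.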
For alumni matching, we learned that pure vector similarity is not enough. Two people can have similar embeddings but completely mismatched career paths. We implemented multi-signal reranking combining Jaccard similarity on skills, graduation proximity, and MMR diversity with $\lambda = 0.7$ to balance relevance and variety:
$$ \text{MMR}(d_i) = \lambda \cdot \text{sim}(d_i, q) - (1 - \lambda) \cdot \max_{d_j \in S} \text{sim}(d_i, d_j) $$
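The MMR formula above translates directly into a greedy selection loop. The sketch below is illustrative: the similarity values and the frozenset-keyed pairwise map are assumptions, and only the Jaccard signal from the multi-signal mix is shown.

```python
def jaccard(a: set, b: set) -> float:
    # Skill overlap between two alumni profiles: |A ∩ B| / |A ∪ B|.
    return len(a & b) / len(a | b) if a | b else 0.0

def mmr_select(query_sim: dict, pairwise_sim: dict, k: int, lam: float = 0.7):
    # query_sim: candidate -> sim(d_i, q)
    # pairwise_sim: frozenset({d_a, d_b}) -> sim(d_a, d_b)
    # Greedily pick the candidate maximizing
    #   lam * sim(d_i, q) - (1 - lam) * max_{d_j in S} sim(d_i, d_j)
    selected: list = []
    candidates = set(query_sim)
    while candidates and len(selected) < k:
        def mmr(d):
            redundancy = max(
                (pairwise_sim[frozenset((d, s))] for s in selected),
                default=0.0,
            )
            return lam * query_sim[d] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With λ = 0.7, a candidate that is nearly a duplicate of an already-selected match gets penalized enough that a less similar but more diverse profile can win the next slot.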
For StudyCoach, we learned that how you prompt the model is as important as what you feed it. A strict Socratic system prompt — one concept, one question, never give the answer — completely changes the learning dynamic.
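A prompt following those rules might look like the sketch below; the exact wording SeaWolves ships is an assumption here, only the constraints are from the description above.

```python
# Illustrative Socratic system prompt for StudyCoach. The precise text is
# hypothetical; the rules mirror the ones described above.
SOCRATIC_PROMPT = """You are StudyCoach, a Socratic tutor.
Rules:
- Cover exactly ONE concept at a time.
- Ask exactly ONE guiding question per reply.
- NEVER state the answer directly; lead the student toward it.
- If the student is stuck, offer a smaller hint, not the solution."""
```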
How We Built It
SeaWolves is a full-stack AI platform built in 24 hours:
- Frontend — Four Next.js apps (Ask Seawolf `:3000`, Admin `:3001`, SB-Alumni `:3002`, StudyCoach `:3003`) with TailwindCSS
- Backend — A single shared FastAPI backend on `:8000` with four route groups; JWT authentication restricted to `@stonybrook.edu` domains
- Vector search — PostgreSQL + pgvector for cosine similarity retrieval across 22,000+ chunks crawled from stonybrook.edu
- Caching — Redis for session management and rate limiting
- AI providers — Switchable via a single environment variable: `mock`, `local` (Ollama), `openai` (GPT-4o), or `bedrock` (Claude)
- Infrastructure — Docker Compose for local development, AWS ECS Fargate + RDS for production
Challenges
Data quality was the hardest problem. Raw web crawls are noisy — navigation menus, footers, and boilerplate text pollute every chunk. We built a cleaning pipeline to strip boilerplate, filter irrelevant pages by URL pattern, and score source authority per domain.
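The cleaning pipeline can be sketched in two passes: skip low-value URLs before crawling, then strip boilerplate from fetched text. The patterns below are illustrative assumptions, not the project's actual rule set.

```python
import re

# Hypothetical boilerplate patterns; the real pipeline's rules are assumed.
BOILERPLATE = [
    r"Skip to (main )?content",
    r"Quick Links",
    r"©\s*\d{4} Stony Brook University",
]
SKIP_URL_PATTERNS = [r"/calendar/", r"/search\?", r"\.pdf$"]

def should_crawl(url: str) -> bool:
    # URL-pattern filter: drop calendars, search pages, and raw PDFs.
    return not any(re.search(p, url) for p in SKIP_URL_PATTERNS)

def clean(text: str) -> str:
    # Strip nav/footer boilerplate, then collapse leftover whitespace.
    for pattern in BOILERPLATE:
        text = re.sub(pattern, "", text)
    return re.sub(r"\s{2,}", " ", text).strip()
```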
Keeping the RAG pipeline honest. Early versions confidently hallucinated. We added confidence scoring, source citation on every answer, and an evaluation runner in the Admin Dashboard that runs structured SBU Q&A test suites and reports pass/fail rates per case — so we always know exactly how accurate the system is.
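The evaluation-runner idea can be sketched as follows. The test-case shape and the `answer_fn` hook are assumptions for illustration; the real runner lives in the Admin Dashboard and reports per-case pass/fail.

```python
# Hypothetical eval runner: feed canned SBU questions through the pipeline
# and check each answer for required keywords, reporting a pass rate.
def run_suite(answer_fn, cases):
    results = []
    for case in cases:
        answer = answer_fn(case["question"]).lower()
        passed = all(kw.lower() in answer for kw in case["must_contain"])
        results.append({"question": case["question"], "passed": passed})
    rate = sum(r["passed"] for r in results) / len(results)
    return results, rate
```

Keyword checks are crude but cheap, and running the same suite after every pipeline change makes accuracy regressions visible immediately.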
Building four apps in 24 hours. Scope is the enemy of hackathons. We stayed disciplined — shared backend, shared auth, shared database schema — so each frontend team could move independently without duplicating infrastructure.
Built With
- amazon-web-services
- docker
- fastapi
- jwt
- next.js
- ollama
- pgvector
- postgresql
- python
- rds
- redis
- sqlalchemy
- typescript