SBU Sense - AI-Powered Course Discovery Platform

Inspiration

As Stony Brook University students, we've all experienced the frustration of course planning. With thousands of courses, hundreds of professors, and complex degree requirements, finding the right classes can feel overwhelming. Traditional course catalogs are static and don't consider your academic history, interests, or goals. We wanted to create a solution that makes course discovery intelligent, personalized, and data-driven.

SBU Sense transforms course planning from guesswork into a strategic, informed decision-making process powered by real student data and AI.

What It Does

SBU Sense is a course discovery and recommendation platform that helps Stony Brook University students search for courses, get personalized recommendations, and explore grade data:

Intelligent Course Search

  • Semantic Search: Natural language queries like "machine learning" or "organic chemistry" find relevant courses instantly
  • Advanced Filtering: Filter by subject, SBC requirements, professors, course level, letter grade distribution, and more
  • Real-Time Statistics: View A-ratio, student enrollment, and grade distributions for every course

AI-Powered Recommendations

  • Personalized Suggestions: Describe your academic interests and goals, and get tailored course recommendations
  • Transcript Analysis: Upload your PDF transcript to automatically extract your course history
  • Smart Matching: AI analyzes your background, prerequisites, and progression to suggest courses that fit where you are in your program
  • GPA Boost Calculator: See estimated GPA impact for each recommended course
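One simple way to estimate GPA impact is a credit-weighted average. The sketch below is illustrative only: the function name, its parameters, and the use of an expected grade on a 4.0 scale are assumptions, not the project's actual calculation.

```python
def estimate_gpa_boost(current_gpa: float, credits_earned: float,
                       expected_grade_points: float, course_credits: float) -> float:
    """Return the estimated change in cumulative GPA after adding one course.

    All names here are hypothetical. expected_grade_points is the grade the
    student expects on a 4.0 scale (for example 3.67 for an A-).
    """
    if credits_earned < 0 or course_credits <= 0:
        raise ValueError("credits_earned must be >= 0 and course_credits > 0")
    # Quality points so far, plus the points the new course would contribute.
    total_points = current_gpa * credits_earned + expected_grade_points * course_credits
    new_gpa = total_points / (credits_earned + course_credits)
    return new_gpa - current_gpa
```

For a student with a 3.0 GPA over 60 credits, an A (4.0) in a 3-credit course gives a boost of 3/63, or roughly 0.05 grade points.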

Data-Driven Insights

  • Leaderboards: Discover top-performing courses and professors by A-ratio
  • Professor Profiles: Compare instructors across courses with detailed statistics
  • Course Progression: Understand prerequisite chains and recommended next courses
  • SBC Tracking: Automatically track which Stony Brook Curriculum requirements you've fulfilled

How We Built It

Backend Architecture

  • FastAPI: High-performance Python API with async support
  • AI Integration:
    • OpenAI GPT-4o-mini for advanced reasoning
    • Pinecone vector database for semantic search
  • Data Processing:
    • Scraped and processed 100,000+ student grade entries spanning 10 years
    • PDF transcript parsing using pdfplumber
    • Course bulletin scraping and indexing

Frontend Architecture

  • React + TypeScript: Modern, type-safe UI development
  • Vite: Lightning-fast build tool and dev server
  • Tailwind CSS: Utility-first styling for rapid UI development
  • Recharts: Data visualization for statistics

Key Features Implementation

  1. Vector Search:

    • Embedded course descriptions using OpenAI embeddings
    • Semantic similarity search via Pinecone
    • RAG (Retrieval-Augmented Generation) for context-aware recommendations
  2. Transcript Processing:

    • PDF parsing to extract course codes and grades
    • Automatic course history extraction
    • GPA calculation and progression analysis
  3. Recommendation Engine:

    • Multi-stage filtering and ranking
    • Prerequisite checking
    • Course progression analysis
    • Professor rating aggregation
  4. Real-Time Search:

    • Keyword matching across course codes and titles
    • Multi-criteria filtering with AND/OR modes
    • Results sortable by A-ratio, enrollment, and more
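The multi-criteria filtering with AND/OR modes described in the last item could look roughly like this. It is a minimal sketch under stated assumptions: the `Course` fields, the `filter_courses` function, and the reading of A-ratio as the fraction of A grades are all hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass(frozen=True)
class Course:
    # Hypothetical record shape, for illustration only.
    code: str
    title: str
    subject: str
    level: int
    a_ratio: float  # assumed: fraction of A grades, from 0.0 to 1.0

Predicate = Callable[[Course], bool]

def filter_courses(courses: Iterable[Course], predicates: list[Predicate],
                   mode: str = "AND", sort_key: str = "a_ratio") -> list[Course]:
    """Keep courses matching all (AND) or any (OR) predicates, sorted descending."""
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    combine = all if mode == "AND" else any
    # With no predicates every course passes; any([]) alone would reject them all.
    matches = [c for c in courses
               if not predicates or combine(p(c) for p in predicates)]
    return sorted(matches, key=lambda c: getattr(c, sort_key), reverse=True)
```

Treating each filter as a plain predicate keeps the AND/OR switch to a single line, since Python's built-in `all` and `any` already express the two modes.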

Challenges We Ran Into

Data Quality & Processing

  • Challenge: Processing 100,000+ grade entries with inconsistent formatting
  • Solution: Built robust data cleaning pipelines and validation checks
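A cleaning step for inconsistently formatted grade entries might look like the sketch below. The field layout, the `clean_entry` name, the three-letter/three-digit course code pattern, and the set of accepted letter grades are assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed grade scale; the real dataset may include other marks.
VALID_GRADES = {"A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "F"}
# Accepts variants such as "cse114", "CSE-114", or " CSE 114 ".
COURSE_CODE_RE = re.compile(r"^\s*([A-Za-z]{3})[\s\-_]*(\d{3})\s*$")

def clean_entry(raw_code: str, raw_grade: str, raw_count: str) -> Optional[tuple[str, str, int]]:
    """Normalize one raw row, or return None if it fails validation."""
    match = COURSE_CODE_RE.match(raw_code)
    if match is None:
        return None
    code = f"{match.group(1).upper()} {match.group(2)}"
    grade = raw_grade.strip().upper()
    if grade not in VALID_GRADES:
        return None
    try:
        count = int(raw_count.strip())
    except ValueError:
        return None
    if count < 0:
        return None
    return code, grade, count
```

Returning `None` for rejected rows lets the caller count and log failures instead of silently keeping malformed data.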

PDF Transcript Parsing

  • Challenge: Transcripts have varied formats and layouts
  • Solution: Implemented flexible parsing with regex patterns and fallback strategies
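The regex-plus-fallback idea can be sketched as follows. The two line layouts, the pattern names, and both function names are hypothetical; real transcripts may need different patterns. Text is assumed to come from pdfplumber's `page.extract_text()`.

```python
import re

# Primary layout (assumed): line starts with the course code and ends with the grade.
PRIMARY = re.compile(r"^([A-Z]{3})\s+(\d{3})\b.*\s([A-DF][+-]?)\s*$")
# Fallback (assumed): code anywhere in the line, then the first standalone grade token.
FALLBACK = re.compile(r"\b([A-Z]{3})\s*(\d{3})\b.*?(?<=\s)([A-DF][+-]?)(?=\s|$)")

def extract_pdf_text(path: str) -> str:
    """Read all page text from a PDF using pdfplumber."""
    import pdfplumber  # third-party; imported lazily so the parser works without it
    with pdfplumber.open(path) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)

def parse_transcript_text(text: str) -> list[tuple[str, str]]:
    """Return (course_code, grade) pairs found in transcript text."""
    history = []
    for line in text.splitlines():
        line = line.strip()
        # Try the strict pattern first, then the looser one.
        match = PRIMARY.match(line) or FALLBACK.search(line)
        if match:
            subject, number, grade = match.groups()
            history.append((f"{subject} {number}", grade))
    return history
```

Lines that match neither pattern, such as term summaries, are skipped rather than guessed at.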

AI Response Reliability

  • Challenge: Ensuring AI recommendations only include valid courses
  • Solution: Multi-layer validation, course code verification, and fallback mechanisms
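One way to layer these checks is to parse the model output, verify each course code against the known catalog, and fall back to retrieved candidates when nothing survives. This is a sketch under assumptions: the JSON-list response format, the `validate_recommendations` name, and its parameters are hypothetical.

```python
import json

def validate_recommendations(raw_response: str, catalog: set[str],
                             fallback: list[str], limit: int = 5) -> list[str]:
    """Return only recommended course codes that exist in the catalog."""
    # Layer 1: the response must be valid JSON.
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return fallback[:limit]
    # Layer 2: the payload must be a list.
    if not isinstance(payload, list):
        return fallback[:limit]
    # Layer 3: each code must be a string that exists in the catalog.
    seen: set[str] = set()
    valid: list[str] = []
    for item in payload:
        if not isinstance(item, str):
            continue
        code = " ".join(item.upper().split())  # normalize case and spacing
        if code in catalog and code not in seen:
            seen.add(code)
            valid.append(code)
    # Fallback: if the model produced nothing usable, use retrieved candidates.
    return valid[:limit] if valid else fallback[:limit]
```

Checking against the catalog after generation means a hallucinated course code is dropped before it ever reaches the user.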

Performance Optimization

  • Challenge: Fast search across large datasets
  • Solution: In-memory indexing, LRU caching, and efficient data structures
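A minimal version of in-memory indexing with LRU caching, using the standard library's `functools.lru_cache`, might look like this. The `COURSES` sample entries, `build_index`, and `search` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from functools import lru_cache

# Sample entries for illustration only.
COURSES = {
    "CSE 214": "Data Structures",
    "CSE 373": "Analysis of Algorithms",
    "AMS 210": "Applied Linear Algebra",
}

def build_index(courses: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase token in a code or title to the codes containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for code, title in courses.items():
        for token in f"{code} {title}".lower().split():
            index[token].add(code)
    return dict(index)

INDEX = build_index(COURSES)

@lru_cache(maxsize=1024)
def search(query: str) -> tuple[str, ...]:
    """Return codes whose code or title contains every query token."""
    tokens = query.lower().split()
    if not tokens:
        return ()
    hits = set.intersection(*(INDEX.get(token, set()) for token in tokens))
    # A tuple is returned because cached values are shared between callers.
    return tuple(sorted(hits))
```

Repeated queries are served from the cache; `search.cache_info()` reports hit and miss counts, which helps when choosing `maxsize`.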

Vector Search Integration

  • Challenge: Setting up and managing Pinecone vector database
  • Solution: Automated index creation, dimension detection, and error handling
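Dimension detection can be as simple as measuring one embedding before the index is created. The sketch below uses a small in-memory stand-in instead of the Pinecone client, so `InMemoryIndex`, `create_index_for`, and the method names are hypothetical and are not Pinecone's API.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryIndex:
    """Toy stand-in for a vector database, for illustration only."""

    def __init__(self, dimension: int):
        if dimension <= 0:
            raise ValueError("dimension must be positive")
        self.dimension = dimension
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: list[float]) -> None:
        # Reject vectors whose size does not match the index.
        if len(vector) != self.dimension:
            raise ValueError(f"expected {self.dimension} dimensions, got {len(vector)}")
        self._vectors[item_id] = list(vector)

    def query(self, vector: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        if len(vector) != self.dimension:
            raise ValueError(f"expected {self.dimension} dimensions, got {len(vector)}")
        scored = sorted(((cosine(vector, v), item_id)
                         for item_id, v in self._vectors.items()), reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]

def create_index_for(sample_embedding: list[float]) -> InMemoryIndex:
    """Detect the dimension from a sample embedding and create a matching index."""
    return InMemoryIndex(dimension=len(sample_embedding))
```

Deriving the dimension from a real embedding avoids hard-coding a number that would break if the embedding model changed.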

Accomplishments We're Proud Of

Comprehensive Data Coverage: Processed and indexed 100,000+ student grade entries spanning 10 years

Advanced Functionality:

  • PDF transcript analysis
  • GPA boost estimation
  • Course progression tracking
  • Prerequisite validation

Real-World Impact: Built something that solves a genuine problem for thousands of SBU students

What We Learned

  • RAG Architecture: How to effectively combine vector search with LLMs for context-aware recommendations
  • Data Engineering: Processing and cleaning large-scale educational datasets
  • PDF Processing: Techniques for extracting structured data from unstructured documents
  • AI Prompt Engineering: Crafting prompts that produce reliable, structured outputs
  • Performance Optimization: Caching strategies and efficient data structures for fast queries
  • Full-Stack Integration: Connecting a React frontend to a FastAPI backend

What's Next

Short-Term Enhancements

  • [ ] User accounts and saved course lists
  • [ ] Course comparison tool
  • [ ] Schedule conflict detection
  • [ ] Email notifications for course availability
  • [ ] Mobile app development

Long-Term Vision

  • [ ] Integration with SBU's official course registration system
  • [ ] Peer reviews and course ratings
  • [ ] Study group matching
  • [ ] Career path recommendations based on course history
  • [ ] Expansion to other universities

