About the project

Inspiration

SyllabusSync started as a practical answer to a recurring student problem: syllabi are full of dates, but manually transferring assignment and exam deadlines into calendars is tedious and error-prone. I wanted a tool that makes course deadlines instantly actionable—scan a syllabus once and automatically populate your calendar with due dates and reminders—so students and instructors can focus on learning instead of scheduling.

What I built

SyllabusSync is a small web app that parses course syllabi to extract assignment and event dates, normalizes them, and syncs the resulting events to Google Calendar. A Supabase backend stores the parsed items for persistence and multi-device sync. The UI supports uploading syllabus files, reviewing parsed items, and choosing which events to add to a calendar.

How I built it

  • Frontend: built with a TypeScript + React stack using Vite, Tailwind CSS, and a collection of UI primitives in ui. The app provides an upload flow, a processing step, and a calendar sync flow.
  • Backend / persistence: Supabase handles user/session management and stores parsed assignments and user preferences; the integration code lives in supabase and supabase.ts.
  • Parsing: a core parser in file-parser.ts extracts candidate date strings from uploaded syllabus text, then applies normalization and heuristics to interpret ambiguous dates (e.g., "Monday, Oct. 5" vs "Oct 5" vs "5 Oct").
  • Calendar integration: Google OAuth + Calendar API is handled in google-auth.ts and google-calendar.ts to create events based on the normalized deadlines.
  • Testing: unit tests for critical parsing and normalization logic live under test and use the project's test setup in vitest.config.ts.
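
To make the calendar integration step more concrete, here is a minimal sketch of how a normalized deadline could be turned into an all-day event body for the Google Calendar API. The `ParsedDeadline` shape, the helper names, and the 24-hour default reminder are assumptions for illustration and are not taken from google-calendar.ts. The one Calendar API detail worth noting is that all-day events use `date` fields and an exclusive end date.

```typescript
// Hypothetical shape for a parsed deadline; not taken from the project source.
interface ParsedDeadline {
  title: string;
  isoDate: string; // "YYYY-MM-DD"
}

// Subset of the Google Calendar event resource needed for an all-day event.
interface AllDayEvent {
  summary: string;
  start: { date: string };
  end: { date: string };
  reminders: {
    useDefault: boolean;
    overrides: { method: "popup" | "email"; minutes: number }[];
  };
}

// Return the ISO date one day after the given ISO date (computed in UTC).
function nextDay(isoDate: string): string {
  const d = new Date(`${isoDate}T00:00:00Z`);
  d.setUTCDate(d.getUTCDate() + 1);
  return d.toISOString().slice(0, 10);
}

// Build the request body for an all-day event with a single popup reminder.
function toAllDayEvent(deadline: ParsedDeadline, reminderMinutes = 24 * 60): AllDayEvent {
  return {
    summary: deadline.title,
    start: { date: deadline.isoDate },
    // The end date of an all-day event is exclusive, so use the following day.
    end: { date: nextDay(deadline.isoDate) },
    reminders: {
      useDefault: false,
      overrides: [{ method: "popup", minutes: reminderMinutes }],
    },
  };
}
```

The resulting object can be sent as the body of an event insert request once the user has confirmed the item in the review step.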

Key idea (brief formalization)

Parsing dates follows two main steps: detection and normalization. Let S be the set of detected date-like substrings in the syllabus text T. The parser applies a function detect(T) → S, then a normalization function norm(s, ctx) → $\hat{d}$ that resolves s into an ISO date using the syllabus context ctx (course term, year hints).
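
As an illustration of the detection step, the sketch below implements a simplified detect(T) → S with a single regular expression covering month-first ("Oct. 5", "Dec 10, 2024") and day-first ("5 Oct") forms. The pattern and the `DateCandidate` type are assumptions made for this example; the actual parser in file-parser.ts may detect candidates differently and handles more formats.

```typescript
// A detected date-like substring and its offset in the source text (hypothetical type).
interface DateCandidate {
  text: string;
  index: number;
}

// Month names and common abbreviations, optionally followed by a period.
// The negative lookahead stops matches inside longer words such as "octopus".
const MONTH =
  "(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?|aug(?:ust)?|" +
  "sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)(?![a-z])\\.?";

// "Oct 5", "Oct. 5", "October 5, 2024"
const MONTH_FIRST = "\\b" + MONTH + "\\s+\\d{1,2}\\b(?:,?\\s+\\d{4}\\b)?";
// "5 Oct", "12 Nov 2024"
const DAY_FIRST = "\\b\\d{1,2}\\s+" + MONTH + "(?:\\s+\\d{4}\\b)?";

// Scan the text left to right and collect every date-like substring.
function detect(text: string): DateCandidate[] {
  // Built per call so the global regex carries no lastIndex state between calls.
  const pattern = new RegExp(MONTH_FIRST + "|" + DAY_FIRST, "gi");
  const found: DateCandidate[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push({ text: match[0], index: match.index });
  }
  return found;
}
```

A weekday prefix such as "Monday," is simply skipped here: "Monday, Oct. 5" yields the candidate "Oct. 5". Numeric-only dates, ordinals ("Oct 5th"), and relative phrases are out of scope for this sketch.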

We can express normalization heuristically as:

  • If s contains a full year, use it.
  • Else infer year y from the syllabus term; if the inferred date is earlier than the syllabus start, increment y. A concise rule used in code is: $$\hat{d} = \text{norm}(s, y_{\text{term}})$$ where norm resolves month/day and chooses $y \in \{y_{\text{term}}, y_{\text{term}}+1\}$ minimizing $|\hat{d} - m_{\text{term}}|$, with $m_{\text{term}}$ the median (midpoint) date of the term.
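
The year-inference rule above can be sketched as follows. This is a minimal version under stated assumptions: the month and day have already been extracted, the term context is given as two ISO dates (the `TermContext` shape and function names are hypothetical), and the term year is taken from the term start date.

```typescript
// Hypothetical context shape: ISO "YYYY-MM-DD" bounds of the course term.
interface TermContext {
  start: string;
  end: string;
}

// Milliseconds since the epoch at UTC midnight of an ISO date.
function utcMs(isoDate: string): number {
  return Date.parse(`${isoDate}T00:00:00Z`);
}

// Resolve a month/day pair (1-based month) into an ISO date.
// An explicit year always wins; otherwise pick the year in
// {termYear, termYear + 1} whose date lies closest to the term midpoint.
// Note: invalid days (e.g. Feb 30) roll over rather than being rejected.
function normalizeDate(
  month: number,
  day: number,
  ctx: TermContext,
  explicitYear?: number
): string {
  const toIso = (year: number): string =>
    new Date(Date.UTC(year, month - 1, day)).toISOString().slice(0, 10);

  if (explicitYear !== undefined) return toIso(explicitYear);

  const termYear = new Date(utcMs(ctx.start)).getUTCFullYear();
  const midpoint = (utcMs(ctx.start) + utcMs(ctx.end)) / 2;
  const candidates = [termYear, termYear + 1].map(toIso);

  // Keep whichever candidate has the smaller distance to the midpoint.
  return candidates.reduce((best, candidate) =>
    Math.abs(utcMs(candidate) - midpoint) < Math.abs(utcMs(best) - midpoint)
      ? candidate
      : best
  );
}
```

For a fall term running from 2024-09-03 to 2025-01-20, "Oct 5" resolves to 2024-10-05 while "Jan 15" resolves to 2025-01-15, which is the new-year overlap case the rule is meant to handle.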

What I learned

  • Robust text parsing needs layered heuristics. Simple regexes find candidate dates but fail on context-dependent phrases like "first Monday after spring break" or relative deadlines ("two weeks after lecture 4").
  • Testing with real-world syllabi (PDFs, DOCX, and HTML) revealed many edge cases: nonstandard date formats, typos, and date ranges.
  • OAuth and calendar sync flows are straightforward in development, but production use requires careful handling of token refresh and permission scoping.
  • Building a pleasant UX for verifying parsed dates is as important as the parser itself—users must be able to correct mistakes before events are pushed to calendars.

Challenges faced

  • Date ambiguity: phrases without a year require inference from context; term boundaries (e.g., fall term overlapping the new year) can lead to off-by-one-year errors.
  • Document formats: PDFs often need OCR or PDF text extraction; varied authoring styles mean the parser must be tolerant and resilient.
  • Privacy and permission scope: only request necessary OAuth scopes and clearly explain why calendar access is needed.
  • Edge-case scheduling: recurring items, ranges ("due Oct 10–12"), and relative deadlines required special handling and sometimes manual user confirmation.

Notable implementation details

  • The parser's normalization pipeline ranks cues in priority order: explicit year > month name > month abbreviation > numeric-only dates. When a date is still ambiguous, a small scoring function measures its distance from the course term midpoint $m_{\text{term}}$: $$\text{score}(\hat{d}) = |\hat{d} - m_{\text{term}}|$$ and the parser picks the $\hat{d}$ with the smallest score.
  • For date ranges the system creates a single event at the end date by default but exposes the range to the user so they can create multi-day events if needed.
  • Supabase stores parsed events with a confidence score; the UI surfaces low-confidence items for manual review.
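
The range and confidence behavior described above could look roughly like this. The `StoredEvent` and `ReviewItem` shapes and the 0.7 review threshold are assumptions for illustration; the source does not state the actual schema or cutoff.

```typescript
// Hypothetical record shape for a parsed event as stored in the backend.
interface StoredEvent {
  title: string;
  startDate: string; // ISO "YYYY-MM-DD"; equals endDate for single-day items
  endDate: string;
  confidence: number; // 0..1, higher means the parser is more certain
}

// What the review UI would show for one parsed event (hypothetical).
interface ReviewItem {
  title: string;
  eventDate: string; // default date for the calendar event
  range: { start: string; end: string } | null; // exposed so the user can opt into multi-day
  needsReview: boolean;
}

// Assumed cutoff below which an item is surfaced for manual review.
const REVIEW_THRESHOLD = 0.7;

function toReviewItem(event: StoredEvent, threshold = REVIEW_THRESHOLD): ReviewItem {
  const isRange = event.startDate !== event.endDate;
  return {
    title: event.title,
    // Ranges default to a single event on the end date.
    eventDate: event.endDate,
    range: isRange ? { start: event.startDate, end: event.endDate } : null,
    needsReview: event.confidence < threshold,
  };
}
```

Keeping the original range on the review item means the default stays conservative (one event on the last day) while the user can still choose a multi-day event before anything is pushed to the calendar.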

Future work / improvements

  • Add more NLP: use a lightweight transformer or rule-enhanced model to interpret relative phrases ("two weeks after midterm").
  • Improve PDF parsing with a robust OCR fallback to handle scanned syllabi.
  • Add calendar templates or course import (CSV) for instructors to upload canonical schedules.
  • Handle recurring events and series more intelligently (e.g., weekly labs).

Closing note

SyllabusSync grew from a simple frustration into a practical toolchain: file ingestion → heuristic parsing → user review → calendar sync. The core trade-offs were between automation and user control; deliberate confirmation steps help keep calendar data accurate while saving a lot of manual work.
