College students face the dread of course registration every season. We all wrestle with how much time we're willing to spend crafting the "perfect" schedule. Some students want tighter schedules while others want more time between classes; some want fewer class days with longer sessions while others want shorter sessions spread over more days; some students just want a choice.

What it does

Simply enter your classes and preferences, and the web app will analyze and filter every possible schedule combination, then return the 25 best schedules that fit your preferences. In my tests, most schedules are created and rendered in less than 2 seconds, although the application itself can analyze more than 2,000,000 schedules per second. It currently supports UCSB, SDSU, and UC Berkeley registration.
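To make "the 25 best schedules" concrete, here is a minimal sketch of how candidate schedules could be scored against preferences and trimmed to the top N. The scoring rule, the weights, and the `(day, start_minute, end_minute)` meeting format are all assumptions made for illustration, not the app's actual ranking logic.

```python
import heapq
from collections import defaultdict

def score(schedule, day_weight=60.0, gap_weight=1.0):
    """Higher is better. Penalizes class days and idle minutes between classes.

    `schedule` is a list of (day, start_minute, end_minute) meetings.
    The weights are hypothetical knobs standing in for user preferences.
    """
    by_day = defaultdict(list)
    for day, start, end in schedule:
        by_day[day].append((start, end))

    gap_minutes = 0
    for meetings in by_day.values():
        meetings.sort()
        # Sum the idle time between consecutive meetings on the same day.
        for (_, prev_end), (next_start, _) in zip(meetings, meetings[1:]):
            gap_minutes += max(0, next_start - prev_end)

    return -(day_weight * len(by_day) + gap_weight * gap_minutes)

def best_schedules(candidates, n=25, **weights):
    """Return the n highest-scoring schedules without sorting every candidate."""
    return heapq.nlargest(n, candidates, key=lambda s: score(s, **weights))
```

A student who wants fewer class days would raise `day_weight`; one who does not mind extra days could set it to zero so that only gaps matter.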

How I built it

I started with a top-down approach, reminding myself to keep "universality" in mind. I'm most familiar with my own school's registration, so I began by designing a dynamic web scraper to extract the registration data I wanted. I noted which pieces of data students need when deciding what to register for (CourseID, Schedule #, CourseTitle, # of seats, and days/times).

After completing the SDSU web scraper, I stripped it down to its essentials and created an abstract GeneralScraper for registration data that I later used for UCSB and Berkeley. I spent the majority of this project trying to figure out a way to iterate through class schedule combinations in a reasonable amount of time. Once I made the process work for SDSU, I moved on to UCSB and UCB, and then to implementing a user-friendly online interface with a basic admin-page style of design.
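The idea of one abstract scraper shared by several schools can be sketched roughly as follows. Only the name GeneralScraper and the list of fields come from this writeup; the method names, the `Section` record, and the demo subclass with its made-up data are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Section:
    """One registrable section, using the fields listed above."""
    course_id: str
    schedule_number: str
    title: str
    seats_open: int
    days: str
    times: str

class GeneralScraper(ABC):
    """School-independent skeleton; each school supplies fetch() and parse()."""

    @abstractmethod
    def fetch(self, course_id: str) -> str:
        """Return raw registration data (HTML, JSON, ...) for one course."""

    @abstractmethod
    def parse(self, raw: str) -> List[Section]:
        """Turn raw data into Section records."""

    def sections(self, course_id: str, only_open: bool = True) -> List[Section]:
        found = self.parse(self.fetch(course_id))
        return [s for s in found if s.seats_open > 0] if only_open else found

class DemoScraper(GeneralScraper):
    """Stand-in for a real school scraper; serves canned, made-up data."""
    DATA = {
        "DEMO 101": "DEMO 101|00001|Demo Course|3|MWF|0900-0950\n"
                    "DEMO 101|00002|Demo Course|0|TR|1100-1215",
    }

    def fetch(self, course_id: str) -> str:
        return self.DATA.get(course_id, "")

    def parse(self, raw: str) -> List[Section]:
        result = []
        for line in raw.splitlines():
            cid, number, title, seats, days, times = line.split("|")
            result.append(Section(cid, number, title, int(seats), days, times))
        return result
```

With this shape, adding a school means writing only the two school-specific methods, while filtering and everything downstream stays shared.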

Challenges I ran into

SDSU's registration was straightforward to parse, with many labels in the HTML. Berkeley and UCSB were whole other nightmares in terms of their public registration data. I was close to giving up until I noticed Berkeley has an Enrollment Data API. However, it's only accessible to students with a department-approved individual API key. I ended up needing to reverse-engineer how this API makes requests by inspecting multiple minified JavaScript files. After piecing together elements of the API and their public registration-search web app, I was able to get the information I needed.

UCSB's public registration-search web app gave the bare minimum of parseable hooks in its page source HTML. It also generates a Base64-encoded event-validation key hidden in the document form, making remote queries very hard. I decided to use a server-side Selenium WebDriver to act like a user, fill out and submit the form, and then download the resulting page source to be parsed.
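A minimal sketch of that approach, assuming Selenium 4 with headless Chrome and a form made of plain text inputs. The URL, field names, and submit-button name passed in are hypothetical placeholders, not UCSB's real ones; the small parser only shows how hidden form fields such as a validation key appear in the page source.

```python
from html.parser import HTMLParser

class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""

def hidden_fields(page_source):
    """Return the hidden form fields found in an HTML document."""
    parser = HiddenFieldParser()
    parser.feed(page_source)
    return parser.hidden

def fetch_results_page(url, form_values, submit_name):
    """Fill in a form like a user would and return the resulting page source.

    `form_values` maps input names to the text to type (hypothetical names).
    """
    # Imported lazily so the parsing helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # no visible browser on the server
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        for name, value in form_values.items():
            driver.find_element(By.NAME, name).send_keys(value)
        driver.find_element(By.NAME, submit_name).click()
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

Because the browser itself submits the form, the hidden validation value travels along automatically and never has to be reproduced by hand.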

While the program can permute through schedules quickly, there are major holdups in retrieving and parsing registration data from UCSB and UCB. These holdups can last up to 3-4 seconds per entered course (about 30 seconds total for the user to get results).

Accomplishments that I'm proud of

At first, my method for permuting through possible course schedules was slow and very brute-force. I was able to speed it up significantly, to an average of 1.5 to 2 million schedule permutations per second.
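One common way to get past plain brute force, shown here only as a sketch and not necessarily the method this project uses, is to encode each section's weekly meeting times as a bitmask and abandon a partial schedule as soon as two sections collide. The 5-minute slot size, the day letters, and the function names are assumptions.

```python
from typing import Iterator, List, Tuple

SLOTS_PER_DAY = 24 * 12  # one bit per 5-minute slot
DAY_INDEX = {"M": 0, "T": 1, "W": 2, "R": 3, "F": 4}

def meeting_mask(days: str, start: str, end: str) -> int:
    """Encode e.g. ("MWF", "09:00", "09:50") as an integer bitmask."""
    sh, sm = map(int, start.split(":"))
    eh, em = map(int, end.split(":"))
    first = (sh * 60 + sm) // 5
    last = (eh * 60 + em + 4) // 5  # exclusive end, rounded up
    day_bits = ((1 << (last - first)) - 1) << first
    mask = 0
    for day in days:
        mask |= day_bits << (DAY_INDEX[day] * SLOTS_PER_DAY)
    return mask

def schedules(courses: List[List[Tuple[str, int]]]) -> Iterator[Tuple[str, ...]]:
    """Yield every conflict-free pick of one (section_id, mask) per course."""

    def extend(i: int, used: int, chosen: List[str]):
        if i == len(courses):
            yield tuple(chosen)
            return
        for section_id, mask in courses[i]:
            if used & mask == 0:  # a single AND detects any time overlap
                chosen.append(section_id)
                yield from extend(i + 1, used | mask, chosen)
                chosen.pop()  # backtrack and try the next section

    yield from extend(0, 0, [])
```

The conflict check costs one bitwise AND per section, and a clash prunes every combination that would have been built on top of it, which is where most of the savings over checking complete schedules come from.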

What I learned

Sleep is more important than finishing the feature you've been working on for hours. As I'm writing this I have gotten less than 3 hours of sleep within the last 36 hours and it is painful.

In retrospect, I spent more time working on the UI than my dead tired eyes should have allowed.

What's next for College Schedule Optimizer

More schools, better domain name!
