As I was registering for my classes last week, I realized how hard it was to get the information needed to make an educated choice about which professors and classes to pick. By the time I finally found the perfect one, too much time had passed and the class was already full. I decided I wanted to make a tool that helps students everywhere get into the classes, and with the professors, that match them best. I saw how many students at my school were asking for information about teachers, but no one was able to provide it.
My goal is to make tools that others can use and benefit from, and this was the perfect opportunity. I wasted no time in getting started.
What it does
ProfesSort does four things:
- First, it parses a school's site to get all the information it needs.
- Then it pushes all that information to a database.
- Next, it presents the information in a way that is easy to navigate and provides actionable data.
- Finally, it gives each class and professor a rating out of 5 based on administration surveys.
How I built it
I started off by figuring out how my school's site was structured. Then I built a web scraper in Python using Selenium to download all the data points.
My school's system was very slow and unreliable, so I calculated it would take over 4 hours to get all the data.
So I scraped only one page first. Then, while the scraper was collecting everything else, I used that one page to develop the upload to my database and build the web app.
This way, as soon as the rest of the data was collected, my site would simply scale to accommodate it.
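The one-page-first approach can be sketched roughly as below. This is a minimal illustration, not the project's actual scraper: the function names, the `table tr` selector, and the assumption that results sit in a plain HTML table are all hypothetical, since each school's site is laid out differently.

```python
from typing import Dict, List


def rows_to_records(header: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair each row of cell text with the column names from the header row."""
    # Rows whose cell count does not match the header are skipped.
    return [dict(zip(header, row)) for row in rows if len(row) == len(header)]


def scrape_page(url: str) -> List[Dict[str, str]]:
    """Load one page in a browser and return its table rows as dicts."""
    # Imported here so the pure helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Hypothetical selector: assumes the data sits in a plain HTML table.
        table_rows = driver.find_elements(By.CSS_SELECTOR, "table tr")
        cells = [
            [cell.text.strip() for cell in row.find_elements(By.CSS_SELECTOR, "th, td")]
            for row in table_rows
        ]
    finally:
        driver.quit()  # Always close the browser, even if the page fails.
    if not cells:
        return []
    return rows_to_records(cells[0], cells[1:])
```

Keeping the row-to-record step separate from the browser code means the records from a single scraped page are enough to start building the database upload and the web app while the full scrape is still running.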
Challenges I ran into
Web scraper would stop when the school's website failed - I fixed this by saving the data from each page to a file, so that if the scraper did fail, I could just resume from where it stopped.
Having all the features work at the same time - The site requires a lot of sorting through data, and I had many different features that I needed to make sure did not interfere with each other.
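The save-per-page fix for the scraper can be sketched like this. It is a rough illustration under assumptions: the JSON file format, the `scrape_with_resume` name, and the `fetch_page` callback are hypothetical stand-ins for whatever the real scraper used.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def scrape_with_resume(
    page_ids: Iterable[str],
    fetch_page: Callable[[str], List[Dict[str, str]]],
    out_dir: Path,
) -> int:
    """Fetch each page once, saving results to out_dir/<page_id>.json.

    Returns the number of pages fetched during this run.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    fetched = 0
    for page_id in page_ids:
        target = out_dir / f"{page_id}.json"
        if target.exists():
            continue  # Saved by an earlier run, so skip it.
        try:
            records = fetch_page(page_id)
        except Exception as err:
            # Stop here; everything already on disk is kept for the next run.
            print(f"Stopped at {page_id}: {err}")
            break
        # Write to a temp file first so a crash never leaves a half-written page.
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_text(json.dumps(records), encoding="utf-8")
        tmp.replace(target)
        fetched += 1
    return fetched
```

Running the same call again after a failure skips every page that already has a file and picks up at the first missing one.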
Accomplishments that I'm proud of
- Turning a massive site into 14 MB worth of information in a DB with 27k entries
- Using a mix of different technologies to make this project work. If something didn't work in one language, I just transferred the data to another and continued!
- Having a tool I can send to my friends and have it work perfectly without bugs!!! (At least none that I found)
- Getting feedback from friends and integrating the changes based on what they would use!
What I learned
- Web scraping strategies and using jQuery to find exactly what I need.
- All about web element nodes
- Dynamic sorting of Arrays of Objects
- Each school has its own data labeling, storage, and access permissions, even among CUNY schools
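As an illustration of the dynamic sorting mentioned above, here is a small Python sketch that sorts a list of records by whichever field the user picks. The field names and sample records are made up for the example, and the site's own sorting code may look quite different.

```python
from typing import Any, Dict, List


def sort_records(
    records: List[Dict[str, Any]], field: str, descending: bool = False
) -> List[Dict[str, Any]]:
    """Return a new list sorted by `field`; records missing the field go last."""
    present = [r for r in records if r.get(field) is not None]
    missing = [r for r in records if r.get(field) is None]
    return sorted(present, key=lambda r: r[field], reverse=descending) + missing


# Made-up sample data, only to show the call shape.
professors = [
    {"name": "Lee", "rating": 4.2},
    {"name": "Garcia", "rating": 4.8},
    {"name": "Chen", "rating": None},
]
top_first = sort_records(professors, "rating", descending=True)
by_name = sort_records(professors, "name")
```

Passing the field name as a parameter lets one function back every sortable column, and putting records with no value at the end keeps unrated professors from crowding the top of the list.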
What's next for ProfesSort
- Get teammates and people to work with
- Send it out to everyone at my school and get feedback
- Add professor images
- Add forum where students can give more subjective feedback on professors
- Get a student from every possible school on board to collect that school's data, so the tool can be used by anyone and can even be used to pick schools!
- Get big enough that schools will give us direct access to all the metrics used to evaluate professors