Syllabus Reader

Inspiration

Our inspiration came from our day-to-day experience as university students. Many students we know have wished the university offered a built-in calendar that collected the important dates from all of their classes in one place, but that doesn't exist. Most of us don't want to spend hours of our day going through every syllabus and painstakingly entering each date into a calendar, so we decided to make a program to do it for us.

What it does

This program takes a folder of PDFs and converts them into text files. It then parses each file to locate dates and their associated keywords, collects them, and adds them to Google Calendar using Google's API.

How we built it

We built this program primarily on Tika-Python and the Google Calendar API. We used Tika-Python to convert PDFs into plain .txt files, which we then parsed mainly with regular expressions. We used the Google Calendar API to add the parsed and organized data from those text files to Google Calendar automatically.
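To make the two ends of that pipeline concrete, here is a minimal sketch rather than our actual script. The function names (pdf_to_text, build_all_day_event, add_event) and the sample event title are hypothetical, chosen only for illustration. The sketch assumes the tika package and a Java runtime are installed, and that an authorized Calendar API service object has already been created elsewhere.

```python
from datetime import date, timedelta


def pdf_to_text(pdf_path):
    """Extract plain text from a PDF with Tika-Python (needs the tika package and Java)."""
    # Imported lazily so the rest of this sketch runs without Tika installed.
    from tika import parser

    parsed = parser.from_file(pdf_path)
    # "content" can be None when Tika finds no extractable text (e.g. scanned PDFs).
    return parsed.get("content") or ""


def build_all_day_event(summary, day):
    """Build a Google Calendar event body for an all-day event on the given date."""
    return {
        "summary": summary,
        "start": {"date": day.isoformat()},
        # For all-day events the end date is exclusive, so use the following day.
        "end": {"date": (day + timedelta(days=1)).isoformat()},
    }


def add_event(service, event, calendar_id="primary"):
    """Insert the event using an already-authorized Calendar API service object."""
    return service.events().insert(calendarId=calendar_id, body=event).execute()


if __name__ == "__main__":
    # Hypothetical example event; no network call is made here.
    print(build_all_day_event("Midterm", date(2019, 11, 10)))
```

Building the event body separately from the API call keeps the date logic easy to check without credentials or network access.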

Challenges we ran into

The text files contain a lot of raw data, and parsing them to find only a handful of specific dates and keywords was challenging. It was also difficult to ensure that we associated the right dates with the right keywords and could tell which values were important and which were extraneous.
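One simple way to approach the date-to-keyword association is sketched below. It is an illustration under stated assumptions, not our actual parser: the keyword list, the regular expressions, and the function name find_dated_keywords are all hypothetical. It assumes dates are written as a month name followed by a day, that the year is supplied by the caller because syllabi often omit it, and that a date belongs to the nearest keyword on the same line.

```python
import re
from datetime import date

# Month prefixes mapped to month numbers; matches "Sep", "Sept", "September", etc.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

DATE_RE = re.compile(
    r"\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.?\s+(\d{1,2})\b",
    re.IGNORECASE,
)

# Hypothetical keyword list; a real syllabus parser would need many more cases.
KEYWORD_RE = re.compile(
    r"\b(midterm|final|quiz|assignment|lab|project)\b", re.IGNORECASE
)


def find_dated_keywords(text, year):
    """Return (keyword, date) pairs, pairing each date with the nearest keyword on its line."""
    results = []
    for line in text.splitlines():
        keywords = list(KEYWORD_RE.finditer(line))
        if not keywords:
            continue  # a date with no keyword on the same line is treated as extraneous
        for match in DATE_RE.finditer(line):
            nearest = min(keywords, key=lambda k: abs(k.start() - match.start()))
            month = MONTHS[match.group(1).lower()]
            try:
                day = date(year, month, int(match.group(2)))
            except ValueError:
                continue  # skip impossible dates such as "Feb 31"
            results.append((nearest.group(1).lower(), day))
    return results


if __name__ == "__main__":
    sample = (
        "Assignment 1 due Sept 20\n"
        "Office hours: Mondays\n"
        "Midterm: October 24; final exam: December 12"
    )
    for keyword, day in find_dated_keywords(sample, 2019):
        print(keyword, day.isoformat())
```

This heuristic is deliberately crude. Any word that starts with a month prefix and is followed by a number (a name like "Mark 10", for example) would produce a false positive, which is the kind of case that made this part of the project difficult.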

Accomplishments that we're proud of

We are very proud of our text-parsing program. It was complicated and had to account for a large number of cases and keywords, and it uses regular expressions to let us search for information cleanly. This took a lot of time, effort, and learning to pull off, and we are quite proud of how it turned out.

What we learned

We learned a lot about using regular expressions and a little about working with APIs. We also gained experience collaborating with teammates using Git and GitHub.

What's next for Syllabus Reader

We would like to turn the app into a web-based program so that others can access it. We would also like to make the parsing cleaner and more accurate, and to support more formats than just text-based PDFs, such as Word documents and, by adding OCR, scanned PDFs.

Built With

google-calendar
python
regex
tika
tkinter

Submitted to

HackED Beta 2019

Created by

I took lead on the Google Calendar API implementation as well as creating the primary script that uses all of our team's functions.

Jadon Latta
I took lead on the part of the program that parses through the raw text to find and organize the keywords and dates associated with them.

Dennea MacCallum
I took lead on the portion of the script that turns the PDFs at a specified path into .txt files for parsing.

Steven Jiao

Updates

Jadon Latta started this project — Nov 10, 2019 11:16 AM EST
