Hearing about the wonders of Assembly AI, we were inspired to look at how useful it could be to a student. Transcribing recorded lectures is an obvious way to review the content later, but what happens when you need to look up a topic in your 639-page textbook? We wanted to write a program that could listen to a lecture and report the pages of your selected textbook where its keywords show up, saving time compared with searching the index or, worse, paging through a chapter.

What it does

Page Pal uploads an audio file to Assembly AI for processing and saves the returned transcription to a new file. Python then takes this text and removes redundant and common words to find the keywords spoken. This list is passed to the final step, where Python searches your specified textbook PDF file for each keyword, recording the page numbers along the way. You are left with a dictionary that maps each keyword to a list of the pages where it was found, so you do not waste time looking something up.
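The keyword and page-lookup steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the stop-word list, the function names (`tokenize`, `extract_keywords`, `find_pages`) and the `min_length` cutoff are all assumptions made for the example, and the textbook is assumed to have already been turned into a list of per-page strings by whatever PDF library is in use.

```python
import re

# Tiny illustrative stop-word list (an assumption for this sketch;
# a real run would want a much longer list).
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "that",
    "this", "we", "will", "about", "for", "on", "with", "are", "be", "as",
}


def tokenize(text):
    """Lowercase the text and split it into words, keeping inner apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def extract_keywords(transcript, stop_words=STOP_WORDS, min_length=3):
    """Return the unique non-stop-words of the transcript, in spoken order."""
    seen = set()
    keywords = []
    for word in tokenize(transcript):
        if len(word) < min_length or word in stop_words or word in seen:
            continue
        seen.add(word)
        keywords.append(word)
    return keywords


def find_pages(keywords, page_texts):
    """Map each keyword to the 1-based page numbers whose text contains it."""
    index = {keyword: [] for keyword in keywords}
    for page_number, text in enumerate(page_texts, start=1):
        words_on_page = set(tokenize(text))
        for keyword in keywords:
            if keyword in words_on_page:
                index[keyword].append(page_number)
    return index
```

Calling `find_pages(extract_keywords(transcript), page_texts)` gives the keyword-to-pages dictionary. Matching on whole words rather than substrings is a design choice of this sketch; it avoids counting "art" as a hit on every page that says "particle".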

How we built it

Working separately, we split the task in two. One part of the team used Python and sample audio to push a request to Assembly AI. The other part of the team focused on PDF processing in Python.
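For the transcription half, the request flow looks roughly like the sketch below: upload the audio, start a transcription job, then poll until it finishes. It is written against Assembly AI's v2 REST endpoints using only the standard library. The function names, the injectable `http` parameter and the polling interval are assumptions for this example rather than our exact code.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def _http(method, url, api_key, data=None, content_type=None):
    """Send one request with the API key header and return the parsed JSON reply."""
    headers = {"authorization": api_key}
    if content_type:
        headers["content-type"] = content_type
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def transcribe(audio_bytes, api_key, http=_http, poll_seconds=3.0):
    """Upload audio, request a transcript, and poll until the text is ready."""
    # Step 1: upload the raw audio; the reply carries a private upload URL.
    upload = http("POST", f"{API_BASE}/upload", api_key,
                  audio_bytes, "application/octet-stream")
    # Step 2: ask for a transcript of that uploaded audio.
    body = json.dumps({"audio_url": upload["upload_url"]}).encode("utf-8")
    job = http("POST", f"{API_BASE}/transcript", api_key, body, "application/json")
    # Step 3: poll the job until it completes or fails.
    while True:
        result = http("GET", f"{API_BASE}/transcript/{job['id']}", api_key)
        if result["status"] == "completed":
            return result["text"]
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        time.sleep(poll_seconds)
```

The `http` parameter exists only so the flow can be exercised without a network connection or an API key; in normal use it is left at its default.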

Challenges we ran into

The biggest challenge was PDF processing. We quickly learned that not all PDF files are created equal and that there is great variation in how a PDF is made. Some PDFs are just a collection of images, while others are beautifully organized for PDF mining.
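One way to spot the image-only case early is to check how many pages yield any extractable text at all. The sketch below is an illustration under assumptions, not part of Page Pal as built: the function names and both thresholds (`min_chars`, `max_empty_fraction`) are hypothetical, and the input is assumed to be a list of per-page strings already produced by a PDF library, with `None` or an empty string where nothing could be extracted.

```python
def pages_without_text(page_texts, min_chars=20):
    """Return 1-based numbers of pages with almost no extractable text."""
    return [
        page_number
        for page_number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


def is_searchable(page_texts, max_empty_fraction=0.5):
    """Guess whether a PDF is worth keyword-searching.

    A document where most pages come back empty is probably a set of
    scanned images and would need OCR before any keyword lookup.
    """
    if not page_texts:
        return False
    empty_fraction = len(pages_without_text(page_texts)) / len(page_texts)
    return empty_fraction <= max_empty_fraction
```

With a library such as pypdf, the page strings could come from calling `extract_text()` on each page of a `PdfReader`; any extractor that returns one string per page would work the same way.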

Accomplishments that we're proud of

We are proud that it works and that we have a starting point for a new project (see "What's next for Page Pal").

What we learned

We learned how to access and use Assembly AI, which opens up many possibilities going forward.

What's next for Page Pal

The next step for Page Pal is to incorporate an image-search function. We want to isolate images from the PDF textbook and assign them keywords based on the "context clues" of the page each image is found on. These images can then be looked up using the text transcribed by Assembly AI.
