Inspiration

The COVID-19 pandemic upended traditional education and brought the unique challenges of digital learning. In a University of Waterloo survey, 75% of students reported that lack of motivation was their biggest challenge [1]. That lack of motivation snowballs until students feel more overwhelmed online than in in-person lectures, and it is compounded by a feeling of disconnect and by fewer opportunities to ask questions in an online setting. COVID-specific research supports this: 41% of students reported that their instructor’s knowledge of their personal strengths and weaknesses was worse during remote instruction [2].

What if we could build a tool…

  • to help students survive the stress of not knowing what they don’t know?
  • to help students know what they don’t know?
  • to help connect teachers with struggling students?
  • to increase the teacher’s understanding of where individual students stand on course material?
  • to supplement each student’s education with information tailored to their needs?

What it does

  • Compares teacher content to student notes to find keywords that the student may not have picked up on.
  • Presents these keywords to the user as ‘Suggested Study Terms’.
  • Uses machine learning to pull out those keywords, saving time and energy.
  • Offers low-pressure practice quiz questions on the terms a student may have missed, so that they can continue learning.
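
The comparison step above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name suggested_study_terms and the sample data are hypothetical, and it assumes the teacher's keywords have already been extracted. It uses only plain string matching from the standard library.

```python
import re


def suggested_study_terms(teacher_keywords, student_notes):
    """Return the teacher keywords that never appear in the student's notes."""
    # Lowercase and strip punctuation so matching ignores case and formatting.
    normalized_notes = " ".join(re.findall(r"[a-z0-9']+", student_notes.lower()))
    missing = []
    for keyword in teacher_keywords:
        normalized_keyword = " ".join(re.findall(r"[a-z0-9']+", keyword.lower()))
        # Pad with spaces so "cell" does not match inside "cellular".
        if f" {normalized_keyword} " not in f" {normalized_notes} ":
            missing.append(keyword)
    return missing


# Hypothetical sample data for illustration only.
teacher_keywords = ["mitochondria", "cell membrane", "osmosis"]
student_notes = "The cell membrane controls what enters. Mitochondria make energy."
print(suggested_study_terms(teacher_keywords, student_notes))  # ['osmosis']
```

Exact matching like this would miss typos and abbreviations, so a real version would likely need some tolerance for near matches.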

How we built it

  • Our backend is built in Python
  • The frontend is built with Streamlit.
  • Users can upload text files (.txt) of their notes and their teacher’s notes; each upload arrives as a subclass of BytesIO.
  • Each file is parsed, and keywords are extracted using the Natural Language Toolkit (NLTK) and the keyphrase extraction toolkit (PKE).
  • Knowledge from our team member (a Ph.D. student in Applied Linguistics and Technology), combined with peer-reviewed academic journals, led us to use the unsupervised TopicRank() model due to its “[relatively high precision] when filtering out topic-redundant [keyword] candidates” [3].
  • Keywords are classified based on the context of the sentence using word sense disambiguation (pywsd) and NLTK tags.
  • The text is split into sentences using nltk.tokenize, and each keyword is located and replaced with a [BLANK].
  • Distractors are pulled from the classified keyword groups, drawing on three sources:
  • Other keywords with the same NLTK tags from the teacher’s notes
  • Words pulled from WordNet, grouped into sets of cognitive synonyms (synsets) [4] [5]
  • Words pulled using the ConceptNet API (a semantic network)
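
The blanking and distractor steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: a regular expression stands in for nltk.tokenize so the sketch runs without any downloads, the keyword pool is assumed to be one already-classified group, and the names split_sentences and make_question are hypothetical.

```python
import random
import re


def split_sentences(text):
    # Stand-in for nltk.tokenize: split after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def make_question(sentence, keyword, keyword_pool, n_distractors=3, seed=0):
    """Blank out `keyword` in `sentence` and pick distractors from `keyword_pool`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    if not pattern.search(sentence):
        return None  # The keyword does not occur in this sentence.
    stem = pattern.sub("[BLANK]", sentence)
    # Distractors are other keywords from the same (assumed) group.
    candidates = [k for k in keyword_pool if k.lower() != keyword.lower()]
    rng = random.Random(seed)
    distractors = rng.sample(candidates, min(n_distractors, len(candidates)))
    options = distractors + [keyword]
    rng.shuffle(options)
    return {"stem": stem, "answer": keyword, "options": options}


# Hypothetical sample data for illustration only.
notes = "Osmosis moves water across a membrane. Mitochondria produce energy for the cell."
pool = ["osmosis", "mitochondria", "ribosome", "nucleus"]
for sentence in split_sentences(notes):
    question = make_question(sentence, "osmosis", pool)
    if question:
        print(question["stem"], question["options"])
```

Seeding the random generator keeps the option order reproducible, which makes a quiz easier to debug; a live quiz would probably drop the fixed seed.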

Challenges we ran into

  • We used a very old library, pywsd, so we ran into problems trying to compile our code. With the help of a mentor, we figured out that we needed to roll back other dependencies for pywsd to be compatible.
  • It was also difficult to choose the best PKE model, since all the models were similar in their effectiveness. We tweaked a few of the parameters and eventually settled on TopicRank().
  • It was difficult to work with the NLTK outputs in their nested format. Eventually, we decided to use numpy to flatten the array so that it was easier to work with.
  • We were interested in app development and wanted to try Flutter so that users could take pictures of their hand-written notes. However, we were short on time and decided to use Streamlit straight from Python instead.
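
For the nested-output problem mentioned above, here is a minimal sketch of one way to flatten such data. It is not the numpy approach the team used: it swaps in a plain recursive generator, which also copes with unevenly nested lists. The shape of the sample data (lists of (word, tag) tuples) is an assumption, and flatten is a hypothetical helper name.

```python
def flatten(nested):
    """Yield the leaf items of arbitrarily nested lists, in order.

    Tuples are treated as leaves so that (word, tag) pairs stay intact.
    """
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


# Hypothetical nested output: lists of (word, tag) pairs at uneven depths.
nested_tags = [[("cell", "NN"), ("membrane", "NN")], [[("osmosis", "NN")]]]
flat_tags = list(flatten(nested_tags))
print(flat_tags)
```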

Accomplishments that we're proud of

  • Our team had several first-time hackathon participants.
  • For several team members, this was their first time using Python.
  • All team members challenged themselves and used technologies they had never tried before.

What we learned

  • Natural language processing (NLP) algorithms and the many tools that Python offers.
  • Unsupervised and supervised keyphrase extraction models (PKE): TextRank(), SingleRank(), TopicRank(), PositionRank(), MultipartiteRank(). The documentation provided many useful examples.
  • The Lesk algorithm and how it is used to determine a word’s definition based on context.
  • UploadedFiles are different from static files: each one is a subclass of BytesIO (“file-like”), so it must be handled with care.
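
The last point can be illustrated with a small sketch. Since an uploaded file behaves like a BytesIO object, a plain io.BytesIO stands in for it here so the example runs without Streamlit. The helper name read_uploaded_text and the UTF-8 encoding are assumptions.

```python
import io


def read_uploaded_text(uploaded_file, encoding="utf-8"):
    """Decode a file-like upload to str and leave it rewound for later reads."""
    uploaded_file.seek(0)  # A previous read may have moved the cursor to the end.
    raw = uploaded_file.read()  # Bytes, not str.
    uploaded_file.seek(0)  # Rewind so the next consumer sees the whole file.
    return raw.decode(encoding, errors="replace")


# io.BytesIO stands in for an uploaded .txt file in this sketch.
fake_upload = io.BytesIO("Osmosis moves water.\n".encode("utf-8"))
first_read = read_uploaded_text(fake_upload)
second_read = read_uploaded_text(fake_upload)
print(first_read == second_read)
```

The two things that tend to trip people up are that read() returns bytes rather than text, and that a second read() returns nothing unless the cursor is rewound first.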

What's next for Student notes checker

  • Allow teachers more fine-tuned control over the topics/keywords.
  • The tool currently works only on typed notes (.txt files). Implementing OCR technology will allow students with hand-written notes to upload theirs as well, and fuzzy logic can account for errors in image-to-text translation, typos, and intentional short forms.
  • Automatically flag students with a low similarity score and notify the teacher, so that teachers aren’t left in the dark about their students’ knowledge.
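
As a rough sketch of how typo tolerance and flagging might fit together, the example below uses approximate string matching from Python's standard difflib module. Everything here is an assumption rather than a planned design: the function names, the 0.8 match cutoff, the 0.5 flagging threshold, and the restriction to single-word keywords.

```python
import difflib


def fuzzy_contains(keyword, note_words, cutoff=0.8):
    """True if some word in the notes is a close match for `keyword`."""
    matches = difflib.get_close_matches(keyword.lower(), note_words, n=1, cutoff=cutoff)
    return bool(matches)


def coverage_score(teacher_keywords, student_notes, cutoff=0.8):
    """Fraction of single-word teacher keywords found (approximately) in the notes."""
    note_words = [w.strip(".,;:!?").lower() for w in student_notes.split()]
    if not teacher_keywords:
        return 1.0
    hits = sum(fuzzy_contains(k, note_words, cutoff) for k in teacher_keywords)
    return hits / len(teacher_keywords)


def needs_attention(score, threshold=0.5):
    """Flag a student whose coverage falls below the (assumed) threshold."""
    return score < threshold


# Hypothetical notes containing typos of the kind OCR or fast typing might produce.
keywords = ["osmosis", "mitochondria", "ribosome"]
typo_notes = "Osmossis moves water. The mitochondira make energy."
score = coverage_score(keywords, typo_notes)
print(round(score, 2), needs_attention(score))
```

The cutoff trades missed typos against false matches between unrelated short words, so it would need tuning on real student notes.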

Resources


[1] https://uwaterloo.ca/institutional-analysis-planning/sites/ca.institutional-analysis-planning/files/uploads/files/spring_2020_student_survey_results_final_final-ua.pdf
[2] https://digitalpromise.org/wp-content/uploads/2020/07/ELE_CoBrand_DP_FINAL_3.pdf
[3] https://www.aclweb.org/anthology/N18-2105.pdf
[4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6294274/
[5] https://www.c-sharpcorner.com/article/lesk-algorithm-in-python-to-remove-word-ambiguity/
