Inspiration

I like to think that I am an extremely fair student. Even when I have the top score, I go to the teacher and get my marks corrected if I spot a mistake of my own. During the pandemic, the education sector, particularly the submission and correction of work, has suffered a setback. Being a student while my mother is a teacher (at different institutions) gave me both perspectives on online checking.

Usually, the submission and correction process goes as follows: teachers assign a question paper to students on platforms like Google Classroom, and students submit photos of their answers. Seeing my mother open each image, compare it to the original answer, squint at students' handwriting, and scour the whole answer again and again to find points worth awarding marks was a reality check. It stopped me from joining my friends' conversations about how long teachers took to check our papers, and drove me to try to build a more efficient tool that speeds up checking without compromising on accuracy. AI seemed to be a perfect fit: it was the right enzyme for this substrate of a problem, like pepsin is to proteins. I thought I could ease a problem that teachers around the world face, and also help students correct their own answers independently. It also helps that the tool keeps pace with current technology and contributes to modernizing the education sector.

What it does

My solution is an offline grading tool. The teacher specifies the correct answer and how many marks the question carries. An image of a student's answer is then uploaded, and a CV model detects the handwritten text in the image. The resulting text is passed to a document similarity implementation, which checks how similar the answer is to the question and to the user-specified standard correct answer, as a percentage. Then, according to threshold values in the code, the tool calculates what percentage of the marks the answer should be awarded. Based on the marks allotted to the question (specified by the user), it outputs the marks to be given.
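The thresholding step described above can be sketched roughly as follows. This is only an illustration: the band cut-offs and fractions in `THRESHOLD_BANDS` and the function name `award_marks` are hypothetical placeholders, not the values or names used in the project.

```python
# Hypothetical threshold bands: (minimum similarity %, fraction of marks awarded).
# The project's real cut-offs are not stated here; these are placeholders only.
THRESHOLD_BANDS = [
    (80.0, 1.0),
    (60.0, 0.75),
    (40.0, 0.5),
    (20.0, 0.25),
]


def award_marks(similarity_percent: float, allotted_marks: float) -> float:
    """Map a similarity percentage to marks using the first band it reaches."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity_percent must be between 0 and 100")
    # Bands are listed from highest to lowest, so the first match wins.
    for minimum, fraction in THRESHOLD_BANDS:
        if similarity_percent >= minimum:
            return allotted_marks * fraction
    # Below the lowest band, no marks are awarded.
    return 0.0


if __name__ == "__main__":
    print(award_marks(65.0, 4))
```

With these placeholder bands, an answer that is 65% similar to the standard answer on a 4-mark question falls in the 60% band and gets three quarters of the marks.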

How I built it

I used Flask for the frontend, and Pytesseract, a Python wrapper for Tesseract-OCR, to convert images of answers to text. CV is used to process the image so that it is suitable for handwriting recognition, and NLP (cosine similarity) gives the percentage similarity. Specifically, I used the scikit-learn library to compute the cosine similarity between the correct answer and the students' answers.
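To show what the cosine similarity comparison computes, here is a minimal, dependency-free sketch. It is a plain-Python stand-in using raw word counts, not the scikit-learn code from the project, and the names `tokenize` and `cosine_similarity_percent` are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity_percent(reference: str, answer: str) -> float:
    """Cosine similarity of the two texts' word-count vectors, as a percentage."""
    ref_counts = Counter(tokenize(reference))
    ans_counts = Counter(tokenize(answer))
    # Dot product over the reference vocabulary; missing words count as 0.
    dot = sum(count * ans_counts[word] for word, count in ref_counts.items())
    ref_norm = math.sqrt(sum(c * c for c in ref_counts.values()))
    ans_norm = math.sqrt(sum(c * c for c in ans_counts.values()))
    if ref_norm == 0.0 or ans_norm == 0.0:
        return 0.0  # An empty text has no direction to compare.
    return 100.0 * dot / (ref_norm * ans_norm)


if __name__ == "__main__":
    correct = "Pepsin is an enzyme that breaks down proteins in the stomach"
    student = "Pepsin breaks proteins down in the stomach"
    print(round(cosine_similarity_percent(correct, student), 1))
```

Because only word counts are compared, word order does not matter, and a student answer that reuses the key terms of the standard answer scores high even if it is phrased differently.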

Challenges I ran into

At first I couldn't find the right thresholds for the marking scheme, but after asking teachers for their opinions, I devised one. I couldn't write code for proper handwriting-to-text conversion, so I used OCR (optical character recognition) to prove my idea and give a working demo.

Accomplishments that I'm proud of

It works quite well, considering it is my first project that combines several uses of AI. It is well suited to demos, with a simple frontend. I am also happy that my idea is unique and relevant to the present scenario.

What I learned

I learned how various document similarity techniques work and, on a broader scale, how NLP actually works. I also learned why images need to be pre-processed before running a handwriting recognition tool on them, and about various Python libraries for doing so. I learned how to harness Tesseract-OCR from Python (using its Python wrapper), how to make a frontend for Python code, and how to use the CSS stylesheet provided by Bootstrap for a more attractive frontend.

What's next for Grade-It!

First off, I would like to use a full-fledged handwriting recognition library rather than OCR. I also want the grading threshold to be decided by AI itself rather than manually encoded. The numbers obtained by running the cosine similarity model on the correct answer and the student's answer would be graded using logistic regression and a data prediction model. This functionality needs a large CSV database containing the percentage similarity of the standard answer to the question, of the standard answer to many candidates' answers, and of the candidates' answers to the question (all float values). The value to be predicted is the percentage of the allotted marks to be awarded. For example, an 80% predicted score on a 5-mark question means it is awarded 4 marks. Finally, I would like to convert this into a Chrome extension that automatically takes in image submissions from Google Classroom, and then gradually add other submission facilities.
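As a rough sketch of what one row of that CSV database and the final mark conversion might look like: the column names, the helper names `load_rows` and `marks_from_percent`, and the numbers in `SAMPLE_CSV` are all invented placeholders for illustration, not project data.

```python
import csv
import io

# Invented placeholder rows, only to show the assumed layout:
# three similarity percentages as inputs, one target percentage of marks.
SAMPLE_CSV = """standard_vs_question,standard_vs_candidate,candidate_vs_question,marks_percent
62.5,91.0,58.0,80.0
62.5,40.0,35.5,30.0
"""


def load_rows(csv_text: str) -> list[dict[str, float]]:
    """Parse the CSV text into rows whose values are all floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def marks_from_percent(marks_percent: float, allotted_marks: float) -> float:
    """Convert a predicted percentage of marks into actual marks."""
    return round(allotted_marks * marks_percent / 100.0, 2)


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    # An 80% predicted score on a 5-mark question gives 4 marks.
    print(marks_from_percent(rows[0]["marks_percent"], 5))
```

A prediction model would be trained on the first three columns to predict the fourth; the conversion function then applies the 80%-of-5-marks rule from the example above.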
