Inspiration
Our project was centred around the theme of 'augmentation', along with consideration of the coding challenge 'sustainability'. Our program offers a new method of study to those who may be unable to afford or access conventional education. We developed it during DurHack 2022.
What it does
Our project culminates in a smartphone app that scans (using the smartphone's camera) worksheets, problem sheets, textbooks and other such study media. The scan highlights keywords, topics and points of interest within the text and returns to the user possible areas to research. For example, as seen in the demo pitched to the assessors, the program might scan a maths question and deduce which topic of maths best describes the question, whether that is arithmetic, algebra, geometry and so on. Another example might be scanning a history textbook, which would in turn highlight key names, places and events and return information that might be of interest to the user. The most important aspect of this program is that it helps students learn and fosters a curiosity that inspires further learning. This means the program does not hand the user a direct answer to a question but only points them in some of the right directions, which turned out to be one of the most challenging parts of this project; more difficult, in fact, than giving the user a direct answer.
How we built it
In the 24 hours of this Hackathon, we started by creating a camera-scanner app using React Native in JavaScript that can capture and crop pictures of words and symbols. The app then connects to and sends the image to a simple API built with Python's Flask module. This API calls a function we designed using the PyTesseract module (a Python wrapper for the Tesseract image-to-text framework), which reads the characters and symbols in a given image and returns the text observed within. This data is then fed to a neural network. The network was originally developed and trained using Wolfram Functional Programming, but we decided to train a model using Python's {module here} instead, as it integrated much more easily with the rest of our project. The neural network then feeds back to the smartphone app a list of possible topics (commonly known to practitioners of the observed subject) that a given question in a worksheet might be alluding to. The app then reports these findings back to the user. The images attached show the prototype we arrived at by the end of the Hackathon: an app that was able to capture and translate images into text.
Challenges we ran into
Two main challenges cemented themselves as the biggest issues we had to solve. The first was creating and training a neural network to observe a dataset of maths questions (in plain English) and deduce which topic each question fell under; for example, rounding, arithmetic, logarithms or geometry. The second was integrating all the parts of the project together. We originally wanted to develop this network using Wolfram Functional Programming, which advertised a very capable and powerful system. The struggle came with integrating this neural network with an API. Since the smartphone app was written in React Native, it made sense to build the API in JavaScript and integrate the neural network with JavaScript. But this proved very time-consuming and extremely problematic in execution, which led us to take a different approach. We instead developed a neural network in Python, trained in a similar way to carry out the same task. We also developed the Tesseract API in Python, as it happened to integrate very easily with the smartphone app; much better than the Wolfram neural network did with JavaScript.
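The topic-classification task can be illustrated with a tiny text classifier. The write-up does not name the exact Python module we used, so the scikit-learn pipeline and the hand-made example questions below are purely illustrative of the idea: map a plain-English maths question to a topic label.

```python
# Illustrative sketch (not the original model): classify maths questions
# by topic using a bag-of-words pipeline from scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A toy labelled dataset; the real one was gigabytes of questions.
questions = [
    "Round 3.456 to two decimal places",
    "Round 17891 to the nearest thousand",
    "Solve for x: 2x + 5 = 11",
    "Factorise the quadratic x^2 - 9",
    "Find the area of a circle with radius 4",
    "Calculate the perimeter of a 3 by 5 rectangle",
]
topics = ["rounding", "rounding", "algebra", "algebra", "geometry", "geometry"]

# TF-IDF turns each question into a word-frequency vector; logistic
# regression then learns which words indicate which topic.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(questions, topics)

print(model.predict(["What is the circumference of a circle of radius 2?"]))
```

A neural network (as we used) would replace the logistic-regression stage, but the pipeline shape, text in and topic label out, is the same.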
Accomplishments that we're proud of
We were especially proud of creating the React Native smartphone app that scans and takes photos for our project. We were also impressed at how quick it was to develop a Python API and train a Python neural network, especially with gigabytes' worth of datasets. We are very proud of how far we got as a team in this Hackathon and are very happy to submit what we have.
What we learned
We learnt that it is difficult to train a neural network in Wolfram Functional Programming and integrate it with JavaScript.
What's next for Study Buddy
This prototype currently covers Mathematics, so we would aim to expand the neural network's dataset to handle other subjects, such as Chemistry, History, English and Music. We would also want to optimise our application and its API for security and performance.
Ironically, in the last five minutes of the event, Enego figured out how to interface with the neural network from the Python server. Sadly, however, we were unable to integrate this into our project in time.