As students with little free time, we constantly seek resources to further our education. One effective way to improve our coding skills is writing code out on paper or a whiteboard. Unfortunately, this method costs extra time, because the code must later be retyped into an editor before it can be compiled and tested. This inspired us to create CodingBoard, an image-to-code scanner and compiler/interpreter: simply take a picture of your code and CodingBoard compiles it, letting coders test handwritten draft code quickly without typing it out.
What it does
Using Google’s Cloud Vision API (OCR), the app accurately pulls text out of an image, passes it (as code) to a compiler or interpreter, and runs your program within seconds. It also recognizes the language of the code in the image through a custom-built language recognizer. We currently support Python and C, and adding more languages would be straightforward.
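The write-up doesn't show how the custom language recognizer works, but a minimal keyword-scoring heuristic for distinguishing Python from C might look like the sketch below (the hint lists and scoring rule are illustrative assumptions, not CodingBoard's actual logic):

```python
# Minimal sketch of a keyword-scoring language recognizer for Python vs. C.
# The hint sets and scoring below are illustrative assumptions, not the
# actual heuristics used in CodingBoard.

PYTHON_HINTS = ("def ", "import ", "print(", "elif ", ":")
C_HINTS = ("#include", "int main", ";", "{", "}")

def detect_language(source: str) -> str:
    """Return 'python' or 'c' based on which hint set scores higher."""
    py_score = sum(source.count(hint) for hint in PYTHON_HINTS)
    c_score = sum(source.count(hint) for hint in C_HINTS)
    return "python" if py_score >= c_score else "c"
```

In practice a recognizer like this would be tuned on real OCR output, since recognition errors (e.g. a missing semicolon or brace) can weaken individual hints.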
How we built it
CodingBoard is a mobile and web app backed by a few backend services. The mobile application is built in Objective-C; the iOS app can take images and crop them, heavily reducing image size and improving OCR performance. The web app uses React Ace so the user can edit code quickly; it lets you upload images, test outputs, and make quick edits by hand. Both clients interface with the Go backend.
We have two Python microservices that handle image processing and programming language detection, and a Go service that calls on these two services. The Go service simply acts as the data frontend for both the web and mobile clients.
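The flow the Go service coordinates can be sketched as a three-step pipeline. The sketch below is in Python for brevity (the real service is in Go), and the service callables are injected so the pipeline can be exercised without live microservices; in production each would wrap an HTTP call to the corresponding service:

```python
# Sketch of the request flow the Go backend coordinates, shown in Python
# for brevity. The three callables are hypothetical stand-ins for the OCR
# microservice, the language-detection microservice, and the
# compiler/interpreter API.

def run_snapshot(image_bytes, ocr_service, lang_service, runner):
    """Turn a photo of code into program output.

    ocr_service:  image bytes   -> extracted source text
    lang_service: source text   -> language name ('python' or 'c')
    runner:       (lang, code)  -> program output
    """
    code = ocr_service(image_bytes)        # 1. OCR the photo
    language = lang_service(code)          # 2. detect the language
    return runner(language, code)          # 3. compile/interpret and run

# Usage with stub services, standing in for the real microservices:
output = run_snapshot(
    b"<photo>",
    ocr_service=lambda img: "print('hello')",
    lang_service=lambda code: "python",
    runner=lambda lang, code: f"[{lang}] hello",
)
```

Injecting the services as callables also makes the orchestration logic easy to test independently of the network.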
Challenges we ran into
One challenge we faced was handling a multi-server app. With several services in play, the app becomes highly network-bound, and coordinating work across the team becomes more important.
Tuning the image processing was also a challenge, which we resolved by filtering the images to make the text stand out.
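One common way to make text stand out for OCR is binarization: mapping every pixel to pure black or white around a threshold. The sketch below illustrates the idea on a grayscale image represented as rows of 0–255 values; the actual pipeline uses image-processing libraries and tuned parameters, so this is only a conceptual example:

```python
# Illustrative sketch of a "pop out the text" filtering step: fixed-threshold
# binarization of a grayscale image. The threshold value and pure-Python
# representation are assumptions for illustration; a real pipeline would use
# an image library and tuned (often adaptive) thresholds.

def binarize(pixels, threshold=128):
    """Map each 0-255 pixel to 0 (ink) or 255 (background)."""
    return [[0 if p < threshold else 255 for p in row] for row in pixels]
```

A fixed threshold works for evenly lit whiteboard photos; uneven lighting usually calls for adaptive thresholding, where the cutoff is computed per region.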
Accomplishments that we're proud of
We’re really proud that we finished all the core features of the app. We were able to effectively use the Google Cloud Vision API, set up the compiler/interpreter APIs, and create both a web and a mobile app.
What we learned
[Chau] learned how to integrate multiple services together.
[Hoa] learned how to connect to multiple APIs from the iOS app.
[Tuong] learned how to use multiple React libraries, such as Material-UI and React Ace.
[Georgio] learned how to use Flask for Python and how to manipulate images through libraries.
What's next for CodingBoard
Where do we start? There’s so much that we want to incorporate into the app, whether it’s features or bug fixes.
First, we would like to add support for more programming languages. Second, we want to spend more time developing machine-learning models to improve text recognition for handwritten words. We also want to explore other ways of processing the images so that the OCR can extract text more accurately.