Create an OCR tool and StudyBot specifically for educational content, with a focus on identifying interactive elements such as fill-in-the-gap (cloze) tests and tick boxes, so that non-interactive content becomes digital, interactive and mobile optimised.

Technology is still under-utilised in many segments of education. I studied Spanish in 2009; 10 years later I’m studying Turkish, and in physical schools very little has changed, yet the way I study as a student has changed a lot.

The goal is to use AI to accelerate student learning by improving learning content and making it easy and fun to get help via Sato the StudyBot.


There are 3 issues I’m trying to solve:

1) Publishers and educational institutions need to adapt to changing student behaviours

In a mobile, digital world, delivering classes with photocopied handouts or expecting students to carry large, heavy textbooks is still surprisingly common, and it is not going to cut it for much longer. Adapting will be a key competitive advantage in attracting modern students looking for places to study.

Pearson (one of the world's largest publishers) recently announced that it is going to phase out paper textbooks.

Most publishers and educational institutions, small and large, have a wealth of paper-based learning assets that are not suitable for digital consumption and not easy to convert.

2) Study notes, whether handwritten or typed, are inefficient.

Normally when I learn something new I have several sources of learning. Currently I go to a physical language school, take online classes and use a language learning app.

Although my use case is language focused, the app works well for any subject (math, history, sciences; anything that has textbooks and study notes, either typed or handwritten).

Math Example

I quite like handwriting notes, and studies indicate that handwriting, compared with typing, improves the ability to recall. But I find dealing with handwritten notes really ineffective: I can’t search them, they are not easy to edit and I have to carry them around.

Apps are super convenient but very limited in how much control you can take over your own learning. There is no easy way to combine them with other sources, so you end up taking parallel learning paths, although some repetition and reinforcement is good.

Online classes are great: you have someone to hold you accountable, you get personalised homework and you learn at the pace that suits you best.

The problem is that all these sources have pros and cons and they are very siloed, so I end up with fragmented study notes.

When it comes to the challenges of learning, streamlining and removing any barrier makes a huge difference in staying motivated, avoiding procrastination and achieving goals.

3) Building a meaningful education-focused dataset is hard. By integrating with study notes and course content I have a built-in method of creating a fit-for-purpose dataset for the Study Bot.


Paper Magic AI

Paper Magic AI provides a central, easily searchable location for all my study notes that works with any study method I use.

It’s agnostic with what I provide:

  • My study notes or someone else’s (typed or handwritten)

  • Photos of pages from a textbook/handouts

  • PDF of a textbook

  • Photos/screenshots of any other apps I’m using

  • Screenshots of online class

  • Photo of teacher’s whiteboard

This is more than just a study note repository. Using AI and machine learning it makes all my notes interactive: not just the text but also any known interaction types, such as fill-in-the-gap/cloze tests and true-or-false exercises.

So by simply taking a photo of the content I get a version that is better: searchable, enriched with translations, text to speech etc., and it helps me study by automatically generating quizzes based on the keywords and phrases in the notes.
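As a rough illustration of quiz generation from highlighted keywords, here is a minimal Python sketch that blanks out a keyword in a sentence to produce a cloze question. It is not the app's actual implementation; the function names `make_cloze` and `build_quiz` and the blank marker are assumptions made up for this example.

```python
import re


def make_cloze(sentence, keyword, blank="_____"):
    """Blank out the first whole-word match of `keyword` in `sentence`.

    Returns (question, answer), or None when the keyword does not occur.
    Hypothetical helper for illustration only.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    match = pattern.search(sentence)
    if match is None:
        return None
    question = sentence[:match.start()] + blank + sentence[match.end():]
    # The answer keeps the casing used in the original sentence.
    return question, match.group(0)


def build_quiz(notes, keywords):
    """Create one cloze item per (sentence, keyword) pair that matches."""
    quiz = []
    for sentence in notes:
        for keyword in keywords:
            item = make_cloze(sentence, keyword)
            if item is not None:
                quiz.append({"question": item[0], "answer": item[1]})
    return quiz


notes = ["Ben okula gidiyorum.", "Kitap masada."]
quiz = build_quiz(notes, ["okula", "kitap"])
```

In a real pipeline the keywords would come from the student's highlights rather than a hard-coded list.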

Complement this with an Azure-powered “study bot” to help, and I’ve built myself the perfect study companion app.

This also provides a very organic way to seamlessly build out an education focused dataset over time.

For publishers it provides an easy way to make their content "digital". Once it’s digital you can make it interactive and enrich the content to obtain learning analytics and improve the content in real time.

Simply uploading a textbook as PDF or even taking photos page by page will produce a digital, mobile optimised interactive version.

How I built it

Front End:

Ionic (single cross-platform code base for web/Android/iOS), Vue.js JavaScript framework, Fabric.js HTML5 canvas library, H5P interactions, Cloudinary

Back End:

Python, Django, Django REST framework, OpenCV, Wolfram Alpha, spaCy, Node.js for the Azure Bot

Microsoft Services:

  • Azure Virtual Machine
  • Azure Database for PostgreSQL server
  • Azure Cognitive Services
  • Computer Vision
  • Custom Computer Vision
  • Azure Bot Builder
  • LUIS
  • QnA Maker
  • Bing Search
  • Bing Entity Search
  • Bing Answer Search
  • Translator
  • Speech to text
  • Text to speech
  • Immersive Reader SDK
  • Adaptive Cards


1) A student, publisher or teacher provides some content. This content can be in any format: PDF, PowerPoint, handwritten or typed. It will even work with a photo of a teacher’s whiteboard.

2) Using Cloudinary I convert the content into a PNG image (important for PDFs and PowerPoint files).

3) I use the Azure Computer Vision API to extract the text from the printed or handwritten document.

4) I run a Custom Computer Vision function to identify known interaction types, currently fill-in-the-gap/cloze tests and tick boxes.

5) Clean and tag the data using a combination of Azure ML and spaCy.

6) Render in the UI: using the bounding boxes I overlay the text and recognised interactions on top of the original image using HTML5 Canvas (fabric.js).

7) As a student I can highlight any keywords or phrases, which are then stored in the database.

8) The user completes the interactions and highlights any keywords or sentences they want help studying

9) Using some custom NLP functions, metadata is generated (nearest neighbours, synonyms, part-of-speech tags etc.) to help generate suitable questions and answers and align them with the correct "intent".

10) Using several datasets (Wikipedia QnA, Netflix subtitles) I look for similar usage of the word/phrase and rank by frequency to try to find suitable matching phrases and answers. (This is part of the process of building out the education-focused dataset.)

11) Quiz questions are automatically generated and new data is added to the QnA knowledge base so the Study Bot is primed to answer potential questions (using Azure Bot Builder, LUIS and Azure QnA Maker). I used a combination of H5P and Microsoft Adaptive Cards to create the interactions.

12) If no matches are found within a certain level of confidence, I use combinations of Bing Entity Search, Wolfram Alpha, Bing Web Search and Bing Answer Search to retrieve potential quiz answers on the fly. The order in which the back ends are queried depends on the "intent" and "entities" detected within the text/question.

13) We now have personalised study notes and quizzes based on the keywords and phrases highlighted and interactions automatically identified.

14) Study notes and textbook data are now easily accessible on mobile devices, searchable and interactive, no matter their original source.

15) The Study Bot is now capable of providing better help, as the scope of possible answers is narrowed to the subject (history, math, language, science etc.) and suggestions can be more accurate.
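To make steps 4 and 6 more concrete, here is a small Python sketch of one way a gap could be located in an OCR'd line and mapped onto the rendered canvas. It is a simplified, text-only stand-in and not the Custom Computer Vision approach used in the app. It assumes that a gap is recognised as a run of three or more underscores, that characters in a line are evenly spaced, and that boxes are `(x, y, width, height)` tuples in image pixels; all function names are hypothetical.

```python
import re

# Assumption: the OCR returns a printed gap as a run of 3+ underscores.
GAP_PATTERN = re.compile(r"_{3,}")


def find_gaps(line_text):
    """Return (start, end) character offsets of each underscore run."""
    return [(m.start(), m.end()) for m in GAP_PATTERN.finditer(line_text)]


def gap_box(line_box, line_text, span):
    """Estimate a gap's box inside a line, assuming evenly spaced characters.

    line_box is (x, y, width, height) in image pixels.
    """
    x, y, width, height = line_box
    char_width = width / len(line_text)
    start, end = span
    return (x + start * char_width, y, (end - start) * char_width, height)


def scale_box(box, image_size, canvas_size):
    """Map a box from image pixels to canvas coordinates."""
    scale_x = canvas_size[0] / image_size[0]
    scale_y = canvas_size[1] / image_size[1]
    x, y, width, height = box
    return (x * scale_x, y * scale_y, width * scale_x, height * scale_y)


line_text = "Ben ____ gidiyorum"
line_box = (100, 50, 360, 40)  # made-up OCR bounding box for the line
spans = find_gaps(line_text)
box_on_image = gap_box(line_box, line_text, spans[0])
box_on_canvas = scale_box(box_on_image, (1000, 2000), (500, 1000))
```

The scaled box is where an input field would be overlaid on the canvas; the even-spacing assumption is crude and would drift on proportional fonts or handwriting.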


Challenges I ran into

The auto-generated quizzes and the Study Bot need a lot of data. They sometimes produce incorrect or “awkward” results, but combining LUIS and QnA Maker with some fallback external APIs works pretty well, and hopefully I can continue to build out a good corpus over time.
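The fallback idea can be sketched in a few lines of Python. This is only an illustration under assumptions: each source is modelled as a plain function returning an `(answer, confidence)` pair or `None`, the 0.7 threshold is arbitrary, and the stub sources below are invented stand-ins rather than real calls to QnA Maker, Bing or Wolfram Alpha.

```python
def answer_with_fallback(question, primary, fallbacks, min_confidence=0.7):
    """Ask the primary source first, then each fallback in order.

    Returns the first (answer, confidence) that meets the threshold.
    If none does, returns the most confident result seen, or None.
    """
    best = None
    for source in [primary] + list(fallbacks):
        result = source(question)
        if result is None:
            continue
        answer, confidence = result
        if confidence >= min_confidence:
            return answer, confidence
        if best is None or confidence > best[1]:
            best = (answer, confidence)
    return best


# Hypothetical stand-ins for the real services.
def knowledge_base(question):
    return ("Istanbul", 0.4)  # low-confidence match


def entity_search(question):
    return None  # nothing found


def web_search(question):
    return ("Ankara", 0.9)


result = answer_with_fallback(
    "What is the capital of Turkey?",
    knowledge_base,
    [entity_search, web_search],
)
```

Reordering the `fallbacks` list per detected intent would be one way to reflect the intent-dependent ordering.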

The main takeaway from the AI/machine learning perspective is the importance of identifying the right tool for the job. The Azure Custom Computer Vision API and Machine Learning Studio are awesome! However, to get satisfactory results when detecting some content interactions, combining them with OpenCV gave me a better outcome; a mix of both seemed ideal for my use case.
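One possible way to mix two detectors is to keep every box from one of them and add boxes from the other only where they do not overlap. The Python sketch below shows that idea using intersection over union (IoU). It is an assumption about how such a merge could work, not a description of the app's code; boxes are `(x, y, width, height)` tuples and the 0.5 overlap threshold is arbitrary.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    a_right, a_bottom = a[0] + a[2], a[1] + a[3]
    b_right, b_bottom = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(a_right, b_right) - max(a[0], b[0]))
    inter_h = max(0, min(a_bottom, b_bottom) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def merge_detections(primary, secondary, overlap=0.5):
    """Keep all primary boxes; add secondary boxes that overlap none of them."""
    merged = list(primary)
    for box in secondary:
        if all(iou(box, kept) < overlap for kept in merged):
            merged.append(box)
    return merged


# Made-up tick box detections from two hypothetical detectors.
from_custom_vision = [(10, 10, 20, 20)]
from_opencv = [(12, 12, 20, 20), (100, 100, 20, 20)]
merged = merge_detections(from_custom_vision, from_opencv)
```

Here the first OpenCV box is dropped as a duplicate of the Custom Vision box, while the second one is kept as a new detection.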

Accomplishments that I'm proud of

Functional web and mobile apps (waiting for approval in the iOS and Android app stores).

Using Azure back ends and APIs to do a lot of the heavy lifting makes it easy for an individual developer to do a lot! As of the submission date I have achieved roughly 85% of what I had in mind for the first iteration of this app.

The mobile app works better than I imagined. I've started using it for my own studying needs, and I'm also testing out some digital versions of textbooks that I now have access to on my mobile.

By encouraging students to upload educational content I have an implicit way of obtaining the data to create the corpus for the StudyBot. Over time this should improve the app's ability to auto-generate new content and provide more meaningful help via the StudyBot: the more students use it, the better it gets.

This was my first use of a “chat bot” and I’m pretty excited about the potential uses of this in the future to help students stay motivated using a fun and engaging interface.

Motivated students are good students, and Sato the Study Bot has built-in motivation memes :-P

Study Bot makes learning fun

What's next for PaperMagic AI

Over time I hope to turn a lot of the data into a purpose-built dataset to improve the quiz generation and answer data for the Study Bot.

Get the mobile version published in the app stores.

Improve the course builder to allow "theming" and add an easy way to convert PowerPoint files into interactive courses.

A public API for content publishers, providing an easy way to make their paper-based content digital first. Once it is digital you can enrich the content and also provide feedback on usage.

Based on users' interactions, detailed learning analytics could be derived to further improve the content and optimise learning outcomes.

Try it out: link

Username: azure, password: azuredemo
