Sumaritan

Inspiration:

Our inspiration came from the people around us: elders and peers whose visual impairments hinder their everyday lives, and students with an endless amount of reading and not enough time.

What it does:

Sumaritan is a program that captures an image and gives the user either a spoken reading or a summary of the text in it. Users capture an image with the camera, and the text is extracted by machine learning models provided through the Google Cloud Vision API. They can then choose to receive a summary of the text or to have the text read aloud, so that visually impaired users are able to "see" it. After receiving a summary, they also have the option of sharing it with acquaintances through Gmail.
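The flow described above can be sketched roughly as follows. This is an illustration in Python, not the team's code. The names `run` and `Result` and the four callables are hypothetical stand-ins for the real OCR, summarization, text-to-speech, and email services.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    text: str                # text extracted from the image
    summary: Optional[str]   # set only when the user asks for a summary
    spoken: bool             # True when the text was read aloud
    shared: bool             # True when the summary was passed to share()


def run(image: bytes,
        mode: str,
        extract_text: Callable[[bytes], str],
        summarize: Callable[[str], str],
        speak: Callable[[str], None],
        share: Optional[Callable[[str], None]] = None) -> Result:
    """Run one captured image through the flow.

    mode is "summary" or "audio". Every callable is a hypothetical
    stand-in for a real service (OCR, summarizer, speech, email).
    """
    if mode not in ("summary", "audio"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Step 1: pull the text out of the captured image.
    text = extract_text(image)

    # Step 2a: audio mode reads the full text aloud and stops there.
    if mode == "audio":
        speak(text)
        return Result(text, None, True, False)

    # Step 2b: summary mode condenses the text, then optionally shares it.
    summary = summarize(text)
    if share is not None:
        share(summary)
    return Result(text, summary, False, share is not None)
```

Passing the services in as functions keeps the flow itself independent of any one vendor, which would make it easy to swap the summarizer or the speech engine.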

How we built it:

As a team, we started by brainstorming ideas for our project. We then drew a diagram of the steps in the sequence and assigned each team member a part. We worked on our parts separately and connected the programs once they were done. The first part was the image capture function, which uses the camera. For the text recognition function, we used the Google Cloud Vision API.
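As a rough illustration of the capture and text recognition steps, here is a minimal Python sketch. It is not the team's implementation. It assumes the `opencv-python` package, version 2 or later of the `google-cloud-vision` client library, and Google Cloud credentials already configured in the environment. The function names are hypothetical.

```python
from typing import Sequence


def capture_jpeg(camera_index: int = 0) -> bytes:
    """Grab a single frame from the camera and return it as JPEG bytes."""
    import cv2  # opencv-python; imported here so the module loads without it

    cap = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cap.read()
    finally:
        cap.release()  # always free the camera, even if the read fails
    if not ok:
        raise RuntimeError("could not read a frame from the camera")

    ok, buf = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("could not encode the frame as JPEG")
    return buf.tobytes()


def full_text(annotations: Sequence) -> str:
    """Return the whole detected text, or "" when nothing was found.

    In a text detection response, the first annotation holds the entire
    text and the remaining ones hold the individual words.
    """
    return annotations[0].description if len(annotations) > 0 else ""


def extract_text(image_bytes: bytes) -> str:
    """Send image bytes to Cloud Vision text detection and return the text."""
    from google.cloud import vision  # assumes google-cloud-vision 2.x or later

    client = vision.ImageAnnotatorClient()  # needs Google Cloud credentials
    response = client.text_detection(image=vision.Image(content=image_bytes))
    if response.error.message:
        raise RuntimeError(response.error.message)
    return full_text(response.text_annotations)
```

With a camera attached and credentials set, `extract_text(capture_jpeg())` would return the text found in the current frame.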

Challenges I ran into:

One of our major challenges was finishing on time. We had hoped to include more functions, but that was not possible in the time we had.

What's next for Sumaritan:

We hope to add the functions we could not finish in the limited time. For example, we had the idea of letting students highlight text, which would then be collected into a separate document containing only the highlights. Most importantly, we hope to make Sumaritan more useful by connecting the software to a GoPro and the user's phone, so that visually impaired users have the program with them at all times.

Built With

aylien
gmail
google-cloud
google-cloud-vision
opencv
talk-to-text

Created by

I helped formulate a plan for the project, including which challenges we should pursue and how an audio rendering of the text would add to the application's benefits for the visually impaired. I created the initial interface design (a panel with responsive buttons). I researched the documentation for different summarization APIs. I wrote a summary of the project and defined the critical points to articulate in a pitch.

Alisha Sehgal
Brainstormed the idea with the team and mapped out how to approach the solution. Changed camera input to a picture when necessary. Used the Google Cloud Vision API to perform machine learning analysis on camera input and extract words from the image. Looked into a deep learning API for text summarization, but found that Aylien performed the job better. Used MaryTTS to convert the extracted text to speech for audio learners and the visually impaired. Finally, connected Gmail to send the extracted information to acquaintances by email.

Regina Wang
I collaborated with my team to brainstorm and plan the elements of our application. I suggested adding a way to share with friends through the application, so that students can share their summaries with other students. I also introduced the idea of including an auditory element, so that auditory learners can listen to their readings as well as read them, and so that students can review their readings by ear while going about their daily lives. I helped search for a text summarization API and contributed to the initial screen interface of our application. Additionally, I contributed to planning and presenting our project pitch.

Emily Ferguson
Angela Wang

Updates

Regina Wang started this project — Oct 14, 2018 10:03 AM EDT
