This is the fifth hackathon for the Vektor-Tech team. We have always built web apps and wanted to expand to a new platform. We had recently experimented with YOLOv3, a real-time object detection model, and wanted to apply it to real-life problems. Computers can now 'see' in a reasonable amount of time, and we thought this was the perfect opportunity to aid visually impaired people with a product that makes their lives easier.
What it does
Currently, Seeker is a mobile phone application. We imagine a future where visually impaired users wear a fixed camera (e.g. mounted on glasses or a headset) for a better user experience. The app detects objects in the camera's field of view and translates them into basic speech for the user to hear.
How we built it
We built the mobile application with the Flutter SDK, used Google's Object Localization API to detect objects in each frame, and spoke the results through the device's native text-to-speech API.
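The core of the pipeline is turning a list of detected objects into a short spoken phrase. The app itself is written in Dart/Flutter, but the idea can be sketched in Python; the `(label, confidence)` detection format and the `describe_scene` name are illustrative, not the actual API response shape:

```python
from collections import Counter

def describe_scene(detections, min_confidence=0.6):
    """Turn raw object detections into a short phrase for text-to-speech.

    detections: list of (label, confidence) pairs, a simplified stand-in
    for an object-localization API response.
    """
    # Count each label that clears the confidence threshold.
    counts = Counter(
        label for label, conf in detections if conf >= min_confidence
    )
    if not counts:
        return "I don't see anything right now."
    # "a chair" for one, "2 chairs" for several (naive pluralization).
    parts = [
        f"{n} {label}s" if n > 1 else f"a {label}"
        for label, n in counts.items()
    ]
    if len(parts) == 1:
        return f"I see {parts[0]}."
    return f"I see {', '.join(parts[:-1])} and {parts[-1]}."
```

For example, `describe_scene([("chair", 0.9), ("chair", 0.8), ("person", 0.7)])` yields "I see 2 chairs and a person.", which is then handed to the device's text-to-speech engine.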
Challenges we ran into
We ran into a lot of challenges, mostly around UX. We wanted near real-time object detection on the device, but the phone was too slow to process the full frame feed. Natural Language Generation (NLG) was a whole other beast: we envisioned responses on par with Google Assistant and Siri, but that was too complex to build in 24 hours.
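One common way to cope with a frame feed that arrives faster than detection can run is to simply drop frames, capping how often the detector is invoked. The sketch below shows the idea in Python under those assumptions; the class name and the choice of 2 FPS are hypothetical, not what we shipped:

```python
import time

class FrameThrottle:
    """Forward a frame only if enough time has passed since the last one,
    so detection runs at most `max_fps` times per second."""

    def __init__(self, max_fps=2.0, clock=time.monotonic):
        self.min_interval = 1.0 / max_fps
        self.clock = clock  # injectable for testing
        self.last = float("-inf")  # always accept the first frame

    def should_process(self):
        now = self.clock()
        if now - self.last >= self.min_interval:
            self.last = now
            return True  # run detection on this frame
        return False     # drop this frame
```

The camera callback would check `should_process()` and skip detection when it returns False, keeping the UI responsive at the cost of a lower detection rate.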
Accomplishments that we're proud of
We are proud that we stuck with the idea even when it was challenging and felt like we had no options left.
What's next for Seeker
We want to train an NLG model to generate natural responses to the user's queries, and to swap in a better, faster object localization model.