ML Model for object detection and narration for blind people

Inspiration

Physically challenged people face many challenges in their daily lives. People who are blind, in particular, find it difficult to go out alone on the road or to drive, and they are restricted from many such activities. We want to give them computer vision, translated into audio, so that they can carry out day-to-day activities with ease.

What it does

The ML model detects objects using a deep neural network and then narrates the detected objects to visually impaired users as audio, in a comprehensive and intuitive manner.
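The narration step can be illustrated with a small sketch. It assumes the detector returns a list of (label, confidence) pairs; the function name `narrate`, the 0.5 confidence threshold, and the sentence wording are hypothetical choices for illustration, not the project's actual code.

```python
from collections import Counter


def narrate(detections, min_confidence=0.5):
    """Turn (label, confidence) pairs into one sentence suitable for speech.

    Assumption: `detections` is a list of (label, confidence) tuples;
    the real model's output format may differ.
    """
    # Keep only detections the model is reasonably confident about.
    labels = [label for label, conf in detections if conf >= min_confidence]
    if not labels:
        return "No objects detected."

    # Count repeated labels; Counter keeps first-seen order.
    counts = Counter(labels)

    # Naive pluralisation (just appends "s"); good enough for a sketch.
    parts = [f"{n} {label}" + ("s" if n > 1 else "") for label, n in counts.items()]

    if len(parts) == 1:
        return f"Detected {parts[0]}."
    return "Detected " + ", ".join(parts[:-1]) + " and " + parts[-1] + "."


example = [("person", 0.98), ("chair", 0.91), ("chair", 0.77), ("dog", 0.30)]
print(narrate(example))
```

The resulting sentence would then be handed to whichever text-to-speech engine is available on the target device. Grouping repeated labels keeps the spoken output short, which matters more for audio than for on-screen text.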

How we built it

TensorFlow, Python 3, ImageDataGenerator, Keras, ResNet50
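As a rough illustration of how these pieces might fit together, the sketch below builds a ResNet50-based image classifier in Keras. The number of classes, the input size, the pooling and dense head, and the name `build_classifier` are all assumptions made for this example, not the project's actual configuration. It also classifies whole images rather than predicting bounding boxes, and it leaves out the ImageDataGenerator input pipeline.

```python
import tensorflow as tf

NUM_CLASSES = 5               # assumption: hypothetical number of object categories
INPUT_SHAPE = (224, 224, 3)   # assumption: the input size commonly used with ResNet50


def build_classifier(num_classes=NUM_CLASSES, input_shape=INPUT_SHAPE):
    """Build a ResNet50 backbone with a small softmax classification head."""
    # weights=None avoids downloading pretrained weights in this sketch;
    # weights="imagenet" would enable transfer learning instead.
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=None, input_shape=input_shape
    )

    inputs = tf.keras.Input(shape=input_shape)
    x = backbone(inputs)
    # Collapse the spatial feature map to one vector per image.
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_classifier()
print(model.output_shape)
```

Training would then call `model.fit` on batches of labelled images, which is where a data generator would plug in.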

Challenges we ran into

We faced many challenges in setting up the LinuxONE server and in importing and installing essential libraries and frameworks such as YOLOv8, Ultralytics, and OpenCV. YOLOv8 and its library, Ultralytics, were not available on the LinuxONE platform, so we had to fall back on TensorFlow and Keras and build the object detection with deep neural networks, which was a very challenging phase.

Accomplishments that we're proud of

We built a working object detection model using deep neural networks and ImageDataGenerator, with an accuracy of almost 99%. We also developed a text-to-audio converter model, which is still in progress and has a minor error. This model will help visually impaired people perceive nearby objects by detecting them.

What we learned

We learned how to use TensorFlow's DNN model for accurate object detection, how to run the training process on the IBM LinuxONE Community Cloud, and how to integrate the comprehensive COCO dataset.

What's next for ML Model for object detection and narration for blind people

We plan to explore future advancements and enhancements for the ML model, aiming to further improve accuracy, expand functionality, and cater to the specific needs of blind individuals.

Built With

imagedatagenerator
keras
python
resnet50
tensorflow

Submitted to

IBM Z Datathon 2023

Created by

Worked on the neural network and on setting up the server. Made the presentation.

Srimanth Dhondy
I created an informative and visually engaging video presentation for the project, explaining the concept and implementation. Demonstrated expertise in executing the terminal commands and running the Python script for the object recognition model.

Nisarga G R
I worked on setting up the LinuxONE server and on the neural networks, uploaded the project and submitted all of its details, and managed the team's collaboration and coordination.

Raj Goyal
Priya Nandi
Bhuvana reddy Kancharla