The VisionX Project
Written without a drop of AI; it only looks that way because of the formatting.
What it does
VisionX is a portable device to assist individuals with visual impairments, autism, and dyslexia. It uses AI to analyze images and provide real-time audio descriptions of the content. For example:
- It can scan text from books, newspapers, or signs and read it aloud to the user.
- It can describe objects in the user's environment, helping them navigate and understand their surroundings.
How we built it
Hardware:
- Raspberry Pi: The device's main computer, running the software and controlling the camera.
- PiCamera Module: Captures images of the text or objects.
- Button: Triggers the image capture and audio output.
- Power Source: A rechargeable battery pack to power the device.
Software:
- Python Programming Language: Used to control the Raspberry Pi and interact with the AI.
- AI Model: An AI model (such as Gemini) analyzes the captured images and generates descriptive text.
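The flow described above (button press, image capture, AI description, audio output) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `run_once`, `fake_capture`, and `fake_describe` are hypothetical names, and the capture, description, and speech steps are passed in as plain functions so the sketch runs without a Raspberry Pi, camera, or network access. On the real device, those three functions would presumably wrap the PiCamera, the AI model call, and a text-to-speech engine.

```python
from typing import Callable, List


def run_once(
    capture: Callable[[], bytes],
    describe: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> str:
    """Capture one image, get a description for it, and speak the result.

    All three callables are hypothetical stand-ins for the hardware and
    AI steps; the real project may structure this differently.
    """
    image = capture()
    if not image:
        # Nothing came back from the camera, so tell the user instead of
        # sending an empty request to the model.
        text = "No image was captured. Please try again."
    else:
        text = describe(image).strip() or "No description available."
    speak(text)
    return text


# Stand-ins so the sketch runs anywhere (assumptions, not real drivers).
def fake_capture() -> bytes:
    return b"\xff\xd8fake-jpeg-bytes"


def fake_describe(image: bytes) -> str:
    return f"An image of {len(image)} bytes."


spoken: List[str] = []
result = run_once(fake_capture, fake_describe, spoken.append)
```

Passing the steps in as functions keeps the button handler small and makes it possible to test the flow on a laptop before wiring up the hardware.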
3D Design:
- Enclosure Design: Created using Tinkercad, a 3D modeling software.
- Focused on protecting the internal components.
- The 3D-printed case houses all the hardware safely.
Challenges we ran into
- AI Accuracy: Ensuring the AI model consistently provided accurate and informative descriptions.
- Image Processing: Optimizing the camera and algorithms to handle various lighting conditions and text sizes.
- Size of Power Source: At the STEAM day event, we realized the power source was too big to fit alongside the other components, so we had to trim some wires to make space.
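One common way to cope with dim or washed-out lighting before handing an image to a text reader is a simple contrast stretch. The sketch below is only an illustration of that general technique, not necessarily what VisionX does: `stretch_contrast` is a hypothetical helper, and it assumes the image has already been reduced to a flat list of grayscale values between 0 and 255.

```python
from typing import List, Sequence


def stretch_contrast(pixels: Sequence[int], low: int = 0, high: int = 255) -> List[int]:
    """Linearly rescale grayscale values so they span the range low..high.

    Hypothetical helper for illustration; it uses integer arithmetic so the
    results are exact and easy to check by hand.
    """
    darkest, brightest = min(pixels), max(pixels)
    if brightest == darkest:
        # A completely flat image has no contrast to stretch.
        return list(pixels)
    span = brightest - darkest
    return [low + (p - darkest) * (high - low) // span for p in pixels]


# A dull image whose values only cover 50..150 gets spread over 0..255.
stretched = stretch_contrast([50, 100, 150])
```

A stretch like this is cheap enough to run on a Raspberry Pi, though it will not rescue images that are badly blurred or far too dark.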
Accomplishments that we're proud of
- Successfully developing a functional prototype that meets the core objectives of assisting individuals with visual impairments.
- Mastering the use of Raspberry Pi and Python for hardware and software development.
- Designing and 3D-printing a custom enclosure for the device.
- Integrating AI technology into a real-world application with a positive social impact.
- Learning about the challenges involved in developing assistive technologies.
What we learned
- Technical Skills: Python programming, hardware interfacing, 3D modeling and printing, AI integration.
- Project Management: Teamwork, time management, problem-solving, and iterative development.
- Interdisciplinary Collaboration: Working effectively with team members possessing different skills and expertise.
- User-Centered Design: Considering the needs and perspectives of target users throughout the design and development process.
- Social Impact of Technology: Understanding the potential of technology to address real-world challenges and improve lives.
What's next for The VisionX
- Miniaturization: Reducing the size and weight of the device for improved portability and ease of use.
- Enhanced AI Capabilities: Integrating advanced AI models for improved accuracy, multilingual support, and real-time video processing.
- User Interface Enhancements: Developing a more intuitive user interface, potentially with voice control options.
- Expanding Use Cases: Exploring applications beyond text reading, such as object recognition, scene understanding, and navigation assistance.
- User Testing and Feedback: Conducting extensive user testing to gather feedback and refine the device based on user needs and preferences.
Repository: https://github.com/lordshyam/The-VisionX-Files