Vishy

Vishy User Interface

Challenge

Challenge 3 - AI for Social Good

Inspiration

We noticed that many images lack alternative text (alt text) describing their content. This is a problem for visually impaired users, so we built a tool that describes an image.

What it does

The user uploads an image, optionally types any additional information, and the tool displays a description. Pressing a button reads the description aloud.

How we built it

We built two separate programs: a front-end and a back-end. The front-end provides the user interface, while the back-end uses the Gemini API to generate a description of the image.
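The back-end step described above can be sketched roughly as follows. This is a minimal illustration rather than our actual code: the function names `build_prompt` and `describe_image`, the prompt wording, and the `generate` callable (a stand-in for whatever call sends a prompt and an image to the Gemini API) are all assumptions made for the example.

```python
from typing import Callable


def build_prompt(extra_info: str = "") -> str:
    """Combine a fixed instruction with the user's optional extra information."""
    # The instruction text is an assumption, not the prompt used in the project.
    prompt = "Describe this image for a visually impaired user."
    extra_info = extra_info.strip()
    if extra_info:
        prompt += " Additional context from the user: " + extra_info
    return prompt


def describe_image(
    image_bytes: bytes,
    extra_info: str,
    generate: Callable[[str, bytes], str],
) -> str:
    """Return a description for the uploaded image.

    `generate` is a hypothetical hook: it takes the prompt and the raw image
    bytes and returns the model's text. In a real back-end it would wrap the
    Gemini API call.
    """
    if not image_bytes:
        raise ValueError("no image data provided")
    return generate(build_prompt(extra_info), image_bytes).strip()
```

Passing the model call in as a parameter keeps the sketch independent of any particular client library and lets the front-end and back-end be tested without network access.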

Challenges we ran into

We ran into several challenges, including figuring out which model to use, a model not running on one of our computers, and working within the Gemini API's rate limits.
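One common way to work within API rate limits is to retry a failed request after an increasing delay. The sketch below illustrates that idea only; it is not necessarily how we handled it. `RateLimitError` is a hypothetical exception standing in for whatever error the API client raises when the limit is hit, and `call_with_backoff`, `max_attempts`, and `base_delay` are names chosen for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error representing a 'rate limit exceeded' response."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying with exponential backoff on RateLimitError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            # Give up after the final attempt and let the caller see the error.
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter defaults to `time.sleep` but can be replaced in tests so that retries do not actually pause.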

Accomplishments that we're proud of

We are proud of having built, in only a few hours, a functional prototype that can accurately describe an image.

What we learned

For each of us, this was either our first or second hackathon, so we learned a lot about working in a hackathon setting. We also learned technical skills, such as building a back-end and a front-end.