Inspiration
Having seen many other programmers attempting to bring good to humanity, we decided to pursue other endeavors. Why use a 100-megabyte computer vision dataset and OpenAI's generation-shifting artificial intelligence models if not to create the greatest ROASTING bot to ever grace the face of the Earth?
What it does
Using a large database, the user's face is mapped onto 68 different points to form the basic facial grid. These points are then transferred to an algorithm that converts the data into human-readable features, such as large eyes or a small nose. These features are then used to derive the user's mood, tiredness, and (tentatively) even their personality. Upon choosing a ROAST or a compliment, an OpenAI prompt is created to deliver to the user exactly what they want. Of course, the AI is trained not to cross any boundaries and to keep everything light-hearted and fun.
How we built it
The entire backend is written in Python, including the API calls. A few libraries, such as dlib, cv2 and imutils were used to import and ease the use of the computer vision database all using a Python virtual environment to keep it easy to run on every platform. The frontend has both a web app and (again, tentatively) a React Native setup. Aariyan, the man writing this DevPost at this very moment, has no clue how frontend programming works and so will leave it to the rest of the team to explain those shenanigans.
Challenges we ran into
On the backend, the most difficult challenge was debugging the boilerplate for three hours before even starting the project. Having never set up a virtual environment before, and due to our team having two MacBooks and two Windows computers, we had to figure out how to develop the project in a platform-agnostic setting. For the frontend, React Native was also notoriously difficult to set up due to none of us veritably knowing how to use React (our skillset only barely extended to JavaScript.)
Accomplishments that we're proud of
On a more serious note, we are genuinely really proud of our project. Though it seems like we're messing around, this computer vision system that we created could be transformed into an API and used for tracking very important things such as notifying drowsy truck drivers or identifying someone at the front door looking with malicious intent. It is an extremely flexible system and can prove very useful when given the circumstances.
What we learned
Having been our first computer vision project, we have learned how to integrate the webcam quite well with any of our future projects. Again, the boilerplate being one of the hardest parts made it so that we learned a whole bunch about how different computers and OSes organize their components. Also, our frontend team basically learned React, Expo, and React Native from scratch by doing this project. This entire project will greatly help us in our future endeavors, and we had a lot of fun doing it!
What's next for Roast.me
Roast.me could definitely use a change of names, and could integrate more computer vision techniques such as having memory for identification and color detection (the dataset is trained on monochrome pictures to keep everything light) to transform it into an application that could be used for cybersecurity, identification or even something as simple as an auto responder at the door based on who shows up. Modules such as an IR sensor or an Ultrasound sensor can be used to detect humans in the dark as well. We see many future avenues open for this kind of technology and we are extremely happy with the project we've made for it being our first dive into computer vision.
Built With
- css
- cv2
- dlib
- expo.io
- html
- imutils
- javascript
- openai
- python
- react
- react-native
- typescript
Log in or sign up for Devpost to join the conversation.