Inspiration
Programmers talk through problems with rubber ducks, but the ducks don't talk back. We wanted to build something that, after a few conversations, sounds and thinks _just like you_. Getting the best ideas depends on talking to the right people, and often that person is yourself.
What it does
Shdw lets you talk to an AI voice clone (one you create with ElevenLabs, or one of our own) that learns from your conversations and gradually adapts to your speaking and thinking style. Over time, it can become a near-perfect replica of who you are. You can also interact in 3D: the model picks up the actions you make in your 3D space and learns from them to build a more accurate, detailed understanding of who you are.
How we built it
We started simple with a very basic MVP for face recognition and tracking using OpenCV. From there, we split up and started iterating, each of us working on a different section: the avatar-based system, UI/integration, and the speaking and camera system. Starting small and iterating quickly paid off, since all we had to do at the end was connect the database and the microservices together.
Challenges we ran into
Running these AI models cross-platform was a real pain, since packages like JAX were Mac-based. We wanted to keep the app open to all users, which meant thinking quickly and finding solutions ASAP.
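One common way to soften this kind of cross-platform problem is to check at startup which packages are actually installed and fall back to a portable alternative. The snippet below is a minimal sketch of that idea, not the code Shdw uses: the function names `pick_backend` and `describe_runtime`, and the choice of `numpy` as a fallback, are assumptions made for illustration.

```python
import importlib.util
import platform


def pick_backend(preferred=("jax", "numpy")):
    """Return the first top-level module in `preferred` that is installed, else None.

    The default order (jax first, numpy as fallback) is a hypothetical
    choice for this sketch.
    """
    for name in preferred:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def describe_runtime():
    """Summarize the current OS, CPU architecture, and chosen backend."""
    return {
        "os": platform.system(),
        "arch": platform.machine(),
        "backend": pick_backend(),
    }


if __name__ == "__main__":
    print(describe_runtime())
```

Running the check once at startup lets the app report a clear message when no supported backend is available, rather than failing partway through a conversation.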
Accomplishments that we're proud of
These past 36 hours were some of the most innovative we've ever had, since this was _our_ opportunity to make absolutely anything. Because the project depends heavily on AI/ML models, cross-platform compatibility took quite a bit of time, and we were happy to find workarounds for those issues.
What we learned
Because we relied heavily on vision tracking, we learned a lot about AI models for tracking. For example, a few of our teammates got to use MediaPipe for the first time and picked up some experience with it. The same goes for technologies such as MongoDB and Next.js.
What's next for Shdw
We would like to take the MediaPipe video feed feature and go all out, letting the user experience it from inside the 3D space itself. Another extremely important quality-of-life change would be to actually ship the project. Users don't want to run npm start; they just want to click an executable and be on their way. While it is quite difficult to package such a large and complicated project into an executable, we believe it is possible.