Inspiration
The pandemic changed all our lives in one way or another. Education is one of the sectors that has undergone rapid transformation in the last few years. Experienced teachers find it difficult to teach in an online environment that either forces them to prepare slides instead of writing and explaining on a whiteboard, or requires an additional hardware device to write or draw. We want to break this barrier and bring a paradigm shift to teaching and learning by enabling teachers to write on the screen on the go. While attending a class, learning should focus on understanding and analyzing rather than on note-taking. All of us have, at least once, missed a piece of information while taking notes, and we want to put an end to that.
What it does
Our application allows hand-gesture-based drawing and writing on the live screen during a real-time class, lecture, or seminar. This gesture-based virtual board makes it easier to explain intricate concepts that demand a lot of imagination. Alongside, it captures the real-time audio feed, transcribes it, and feeds the transcript into an LLM to create summary notes for students, which lets them stay focused, interactive, and imaginative during lectures.
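The gesture-based drawing described above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: it assumes a hand tracker already supplies normalized (x, y) positions for the index fingertip and thumb tip on every frame, and it assumes a "pinch" (the two tips close together) means the pen is down. The `StrokeRecorder` class and the `pinch_threshold` value are hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # normalized (x, y) in the range 0..1


@dataclass
class StrokeRecorder:
    """Collects fingertip positions into strokes while a pinch is held."""

    # Assumed threshold in normalized image units; would need tuning.
    pinch_threshold: float = 0.05
    strokes: List[List[Point]] = field(default_factory=list)
    _pen_down: bool = False

    def update(self, index_tip: Point, thumb_tip: Point) -> None:
        """Process one video frame's fingertip positions."""
        distance = hypot(index_tip[0] - thumb_tip[0],
                         index_tip[1] - thumb_tip[1])
        if distance < self.pinch_threshold:
            if not self._pen_down:
                # Pinch just started: begin a new stroke.
                self.strokes.append([])
                self._pen_down = True
            self.strokes[-1].append(index_tip)
        else:
            # Fingers apart: lift the pen so the next pinch starts fresh.
            self._pen_down = False
```

Each recorded stroke is a list of points that a front end could draw as a polyline over the video frame. Lifting the pen between pinches keeps separate letters or shapes from being joined by stray lines.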
How we built it
Python is the primary programming language used for development. Flask is used to create the backend server for the project. The client-side frontend is developed in JavaScript using ReactJS and NodeJS. The project uses the Speech-to-Text API from Google Cloud Platform together with PyAudio to create a transcript file. This transcript file is then fed into Gemini-Pro, a generative large language model, which in turn produces a summary. For the virtual board, we use computer vision and machine learning, built on a stack that includes MediaPipe, TensorFlow, OpenCV, and CUDA, to process images, detect hand gestures, and write on-screen.
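The transcript-to-summary step described above might look roughly like the sketch below. It is an illustration under stated assumptions, not our production code: the transcript is split into word-limited chunks, each chunk is wrapped in a prompt, and the prompt is passed to a `generate` callable that stands in for the actual LLM call. The function names, the prompt wording, and the 300-word chunk size are all hypothetical.

```python
from typing import Callable, List


def chunk_transcript(text: str, max_words: int = 300) -> List[str]:
    """Split a transcript into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def build_prompt(chunk: str) -> str:
    """Wrap one transcript chunk in a summarization instruction."""
    return ("Summarize the following lecture transcript as concise "
            "bullet-point notes for students:\n\n" + chunk)


def summarize_transcript(text: str,
                         generate: Callable[[str], str],
                         max_words: int = 300) -> str:
    """Summarize each chunk with `generate` and join the partial notes.

    `generate` is a placeholder for whatever function sends a prompt
    to the language model and returns its text response.
    """
    notes = [generate(build_prompt(chunk))
             for chunk in chunk_transcript(text, max_words)]
    return "\n".join(notes)
```

Passing the model call in as a plain function keeps the chunking and prompt logic testable without network access, and makes it easy to swap the model later.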
Challenges we ran into
We tried to incorporate Apache Kafka and WebRTC to implement this over the cloud, but found that it was not feasible at this point in time.
Accomplishments that we're proud of
Hand gesture recognition, which enables illustration on the go during a class. Text summarization using GenAI, which provides students with a stress-free learning environment.
What we learned
Using Google Cloud services, multithreading, and multiprocessing. Finding optimal ways to integrate a client and a server that involve various technologies.
What's next for PragatiPath AI
Deploying to the cloud. Drawing geometric shapes. Implementing a peer-to-peer network for streaming.
Built With
- computer-vision
- docker
- flask
- gcp
- gemini-pro
- genai
- machine-learning
- mediapipe
- natural-language-processing
- node.js
- pyaudio
- python
- react
- speech-to-text
- tensorflow