Inspiration
Volunteering at a school for people with hearing disabilities opened my eyes to how hard the communication barrier can be, which is why I wanted to use AI to help tackle the problem.
What it does
The project is a website that provides tutorial lessons to teach the user the basics of ASL. Each tutorial lesson is composed of four parts:
- Show the user a video of at most 3 seconds that teaches how to perform a certain sign in ASL
- Allow the user to record a video of themselves performing the sign
- A program in the front end checks whether the sign was performed correctly
- If the sign was done correctly, the website moves on to the next word (sign). Otherwise, the website generates a video of the sign performed correctly with the user's face, and the lesson restarts from part 2.
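The branching in the last part can be sketched as a small pure function. This is only an illustration of the flow described above, not the project's actual front-end code: the function name `next_step` and the action strings are hypothetical, and the end-of-lesson case is an assumption the write-up does not cover.

```python
from typing import Tuple


def next_step(sign_index: int, sign_correct: bool, num_signs: int) -> Tuple[int, str]:
    """Decide what the lesson does after the user's recording has been checked.

    sign_index   -- zero-based index of the sign currently being taught
    sign_correct -- result of the correctness check (part 3)
    num_signs    -- total number of signs in the lesson
    """
    if not sign_correct:
        # Stay on the same sign: show the corrected (face-swapped) video,
        # then go back to part 2 so the user records another attempt.
        return sign_index, "show_corrected_video_then_record"
    if sign_index + 1 >= num_signs:
        # Assumption: the lesson ends after the last sign is performed correctly.
        return sign_index, "lesson_complete"
    # Correct sign: move on to the demo video (part 1) of the next sign.
    return sign_index + 1, "show_demo_video"
```

For example, `next_step(0, False, 3)` keeps the user on the first sign, while `next_step(0, True, 3)` advances to the second.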
How we built it
- Used React to build the frontend
- Used FastAPI to build the backend
- A Kaggle notebook for removing unnecessary frames and extracting the necessary ones
- A Kaggle notebook for generating the user's landmarks for each sign
- Preprocessed the data with the intent of building a neural network
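One plausible way to drop unnecessary frames is to keep a frame only when the landmarks have moved noticeably since the last kept frame. The sketch below assumes that "unnecessary" means near-duplicate frames and that each frame is already a list of normalized (x, y) landmarks. The names `mean_displacement` and `select_key_frames` and the `min_motion` threshold are hypothetical, not taken from the notebooks.

```python
import math
from typing import List, Sequence, Tuple

Landmark = Tuple[float, float]   # normalized (x, y) position of one landmark
Frame = Sequence[Landmark]       # all landmarks detected in one video frame


def mean_displacement(a: Frame, b: Frame) -> float:
    """Average Euclidean distance between corresponding landmarks of two frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must have the same non-zero number of landmarks")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def select_key_frames(frames: Sequence[Frame], min_motion: float = 0.02) -> List[int]:
    """Return indices of frames that differ enough from the previously kept frame.

    min_motion is an assumed threshold in normalized image coordinates.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the reference
    for i in range(1, len(frames)):
        if mean_displacement(frames[kept[-1]], frames[i]) >= min_motion:
            kept.append(i)
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, means slow but steady motion still accumulates until it crosses the threshold.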
Challenges we ran into
- Finding a good API for face swapping that is both efficient and versatile
- Creating an accordion header whose text also contains a link to another page of the project (which we gave up on)
- GitHub stopped working during the last hour
- We could not get the webcam to work in deployment
Accomplishments that we're proud of
- Using FILM to interpolate a group of frames into a video
- Managing to develop a website using TypeScript without having used the language before
What we learned
Simon:
- How to use FastAPI
- How to use FILM to add new frames between two frames
Hannah:
- How to use TypeScript to develop a webpage
- How to add styles, links, and toggles, and how to enable the webcam to record video
Kevin:
- How to pre-process data
- How to analyze videos using OpenCV and MediaPipe
What's next
- Train a large model on a full ASL dataset (ASL Citizen or ASL YouTube)
- Try to 'predict' the user's next position in order to better guide them during the learning process
- Instead of face swapping, use Stable Diffusion with ControlNet so that the result is more seamless and 'believable'
Built With
- fastapi
- film
- html
- javascript
- kaggle
- media-pipe
- python
- react