Inspiration
Imagine a little kid going on YouTube only to see a comment that says, "Hey this is MrBeast, click on this link to win a million dollars." This actually happens: fake giveaways, obscene links, and scams run rampant in YouTube comment sections. When I was brainstorming ideas, I realized that I wanted to make a project that is futuristic and benefits society. I came up with CyberBuddy, a novel app that removes spam comments from YouTube comment sections.
What it does
My app is a Streamlit web app. It uses a custom-built and tuned binary classification ML model to detect spam in YouTube comments, then removes those comments through YouTube's API.
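The flow described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `extract_video_id`, `flag_spam`, and `keyword_stub` are hypothetical names, the keyword check only stands in for the trained model, and fetching or deleting comments through the YouTube API is left out.

```python
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.strip("/").split("/")[0] or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
    return None


def flag_spam(
    comments: Iterable[Tuple[str, str]],
    is_spam: Callable[[str], bool],
) -> List[str]:
    """Check each (comment_id, text) pair and return the IDs judged spam."""
    return [comment_id for comment_id, text in comments if is_spam(text)]


def keyword_stub(text: str) -> bool:
    """Placeholder classifier (assumption): the real app uses a trained model."""
    return "click on this link" in text.lower()


# Made-up comments for illustration only.
sample_comments = [
    ("c1", "Hey this is MrBeast, click on this link to win a million dollars"),
    ("c2", "Great video, thanks for sharing"),
]
video_id = extract_video_id("https://www.youtube.com/watch?v=abc123")
flagged = flag_spam(sample_comments, keyword_stub)
```

Keeping the classifier as a plain callable means the same loop works whether the check is a keyword stub or a trained model's prediction.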
How we built it
It was built using Python, with Sklearn's Multinomial Naive Bayes classifier and a TfidfVectorizer that converts comments into numeric, machine-readable data for the model. The vectorizer and classifier were combined into a Pipeline, and hyperparameter tuning was performed on it to find the best-performing model. The model was saved using Pickle, and the web app was made using Streamlit: the user enters a video link and clicks a button, and the app uses the YouTube Data API v3 to fetch the comment section and checks each comment individually for spam.
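A minimal sketch of that kind of training setup is shown below. The tiny dataset, the parameter grid, the use of `GridSearchCV`, and the `cv=2` setting are all assumptions made for illustration; the write-up does not say which tuning method, grid, or data the project used.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny made-up dataset for illustration only (1 = spam, 0 = not spam).
comments = [
    "Click this link to win a million dollars",
    "Free giveaway, claim your prize now",
    "Win a free phone, click here",
    "Claim your free gift card at this link",
    "Great video, thanks for the tutorial",
    "I learned a lot from this explanation",
    "The editing in this video is really good",
    "Thanks, this helped me fix my code",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Vectorizer and classifier chained so raw text goes in, a label comes out.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("nb", MultinomialNB()),
])

# Hypothetical search space; step names prefix each parameter.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "nb__alpha": [0.1, 0.5, 1.0],
}

# Cross-validated grid search over the whole pipeline.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(comments, labels)
model = search.best_estimator_

# Round-trip through pickle, as a saved model would be loaded by the web app.
serialized = pickle.dumps(model)
restored = pickle.loads(serialized)
prediction = restored.predict(["Click here to claim your free prize"])
```

Tuning the pipeline as a whole, rather than the classifier alone, lets the search vary vectorizer settings and classifier settings together, and the vectorizer is refit inside each cross-validation fold.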
Challenges we ran into
I ran into multiple challenges building the app and had to debug continuously. Some involved OAuth and the API key, both of which the YouTube API requires. Another challenge was training the model: I had initially overfit it, and it was giving me false positives.
Accomplishments that we're proud of
I was able to train a model to 94% accuracy! I also used technologies such as Streamlit, YouTube's API, and OAuth for the first time. I am also proud that I was able to create a non-trivial project by myself.
What we learned
I learned how to train an ML model using Sklearn, how to create web apps using Streamlit, and how to work with the YouTube API, as I had never worked with any of these technologies before.
What's next for CyberBuddy
I will integrate CyberBuddy into other sites that need it, such as TikTok. I also plan to use frontend technologies like React, TailwindCSS, and TypeScript paired with a backend API server (FastAPI) to make the application full-stack, and then deploy it. I was unable to build these features because of the time limit.