Inspiration
The inspiration comes from the social media giant TikTok, whose algorithm boosts user engagement and retention. Our predictor aims to help creators and companies reach larger audiences by estimating which posts will go viral based on metrics such as the duration of the video, whether the poster is banned, and whether the poster is verified.
What it does
It predicts how likely a post is to go viral before it is posted. By entering video metrics, you get a likelihood score indicating whether the post will go viral.
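To illustrate what a likelihood score could look like, here is a minimal sketch. It assumes a logistic regression classifier and a tiny hand-made training set; the feature order, the data, and the helper name `virality_score` are all hypothetical and not taken from the actual project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training rows: [duration_seconds, is_verified, is_banned].
# These values are made up for illustration only.
X_train = np.array([
    [15, 1, 0],
    [22, 1, 0],
    [30, 0, 0],
    [12, 1, 0],
    [45, 0, 1],
    [60, 0, 1],
    [50, 0, 0],
    [55, 0, 1],
])
# 1 = went viral, 0 = did not (again, invented labels).
y_train = np.array([1, 1, 1, 1, 0, 0, 0, 0])

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)


def virality_score(duration_seconds, is_verified, is_banned):
    """Return the model's estimated probability that a post goes viral."""
    features = np.array([[duration_seconds, is_verified, is_banned]])
    # predict_proba returns one column per class; column 1 is class 1 ("viral").
    return float(model.predict_proba(features)[0, 1])


score = virality_score(20, 1, 0)
print(f"Likelihood of going viral: {score:.2f}")
```

The score is a probability between 0 and 1, so it can be shown directly to a creator or thresholded into a yes/no prediction.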
How we built it
I was able to build the predictor using Python and scikit-learn. I started by cleaning and preprocessing the TikTok dataset, then selected key features like video duration, text, and engagement metrics. I trained a machine learning model to predict virality based on these inputs and evaluated its performance using accuracy and other metrics.
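The steps above can be sketched roughly as follows. This is not the project's actual code: the column names (`video_duration_sec`, `is_verified`, `is_banned`, `text`, `is_viral`), the synthetic data, and the choice of logistic regression are assumptions made so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the TikTok dataset (all values are invented).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "video_duration_sec": rng.integers(5, 61, size=n),
    "is_verified": rng.integers(0, 2, size=n),
    "is_banned": rng.integers(0, 2, size=n),
    "text": rng.choice(
        ["funny dance challenge", "cooking tutorial recipe",
         "news update today", "cute pet compilation"],
        size=n,
    ),
})
# Invented labeling rule, used only to give the model something to learn.
df["is_viral"] = (
    (df["video_duration_sec"] < 30) & (df["is_banned"] == 0)
).astype(int)

# Cleaning step: drop rows with missing values.
df = df.dropna()

numeric_cols = ["video_duration_sec", "is_verified", "is_banned"]
X = df[numeric_cols + ["text"]]
y = df["is_viral"]

# TF-IDF for the text column, standard scaling for the numeric columns.
preprocess = ColumnTransformer([
    ("text", TfidfVectorizer(), "text"),
    ("num", StandardScaler(), numeric_cols),
])
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Wrapping preprocessing and the classifier in one `Pipeline` keeps the test data out of the fitting step, so the reported metrics are not inflated by leakage.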
Challenges we ran into
One challenge I ran into was finding the right dataset. I had hoped to find a dataset containing all of the metrics I wanted, both before and after posting. For example, the sound a video uses is known before posting and is very important to how well the post will do.
Accomplishments that we're proud of
I'm proud of learning many different machine learning skills, developing a minimum viable product, and doing it all on my own.
What we learned
I learned how to use Kaggle for datasets, scikit-learn for machine learning, pandas for data cleaning and manipulation, TfidfVectorizer for text feature extraction, and Matplotlib for data visualization.
What's next for TikTok Virality Predictor
I'm interested in expanding the capabilities of this predictor. As mentioned above, I want to utilize other features such as the caption and sound of a video to make the predictor better. I also want to be able to measure a video's likelihood of success after posting. Things like the number of likes, comments, shares, and the time since posting are really important in tracking the trajectory of a video. This feature would be best used by creators to improve their content creation skills.
Built With
- kaggle
- matplotlib
- natural-language-processing
- numpy
- pandas
- python
- scikit-learn
- tfidfvectorizer