Almost all moderation tools act only after a post has been published. For large groups such as Hackathon Hackers, this is a problem when banned users get back into the group or when new users are allowed to post. The usual remedy is to hold posts for admin approval, but this adds more lag.
If there were a way to predict how controversial a post might be based on its content, we could create automated moderators that offer advice on making posts better received in a given group.
What it does
By analyzing the posting history of a given Facebook group (parsing the text, comments, likes, and reactions of its posts), we can train a bot using Bayesian probabilities to estimate, on a scale of 0 to 1, how each post would score in categories such as controversy, success rate, and negative feedback.
How we built it
We used the Facebook Graph API to fetch posts from Hackathon Hackers. We then trained a Naive Bayes bot (a model that treats word probabilities as independent) on a data set marked popular / not popular by a simple 'likes' algorithm. Finally, we ran it on a set of test data to measure how accurately it separated good posts from bad.
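The training and scoring steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `label_by_likes` rule, the threshold of 50 likes, the add-one smoothing, and the four toy posts are all assumptions made up for the example, and the Graph API fetch is left out entirely.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation (single-word grams)."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w]

def label_by_likes(posts, threshold):
    """Hypothetical 'likes' rule: a post is popular if it has at least `threshold` likes."""
    return [
        (p["message"], "popular" if p["likes"] >= threshold else "not_popular")
        for p in posts
    ]

class NaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training posts
        self.vocab = set()

    def train(self, labeled_posts):
        for text, label in labeled_posts:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def predict_proba(self, text):
        """Return a score between 0 and 1 for each label; scores sum to 1."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                # Add-one (Laplace) smoothing so unseen words do not zero the product.
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long posts.
        top = max(log_scores.values())
        exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {label: v / norm for label, v in exp_scores.items()}

# Toy data, invented for illustration only.
posts = [
    {"message": "Free pizza at the hackathon tonight", "likes": 120},
    {"message": "Anyone want to team up for the hackathon", "likes": 85},
    {"message": "Buy my used textbook", "likes": 2},
    {"message": "Selling my old laptop, buy now", "likes": 1},
]

model = NaiveBayes()
model.train(label_by_likes(posts, threshold=50))
scores = model.predict_proba("hackathon pizza")
print(scores)
```

Multiplying per-word probabilities independently is what makes the model "naive"; the same structure would extend to other categories by swapping in a different labeling rule.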
Challenges we ran into
Turning Python into a decent front end
Accomplishments that we're proud of
Given a varied dataset, we scored 80-100% accuracy at predicting whether a post would be popular.
What we learned
Naive Bayes modelling and machine learning
What's next for PHP
- Add more categorical options and nicer GUI features.
- Grab data with more than one thread to bypass Facebook's time limit, and train on thousands of documents.
- Use full Bayes (words depending on each other instead of independent modelling), add comment analysis, and/or use larger n-grams (currently single-word grams) so that phrases are analysed in a way that makes more linguistic sense.
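As a rough idea of what moving from single words to larger n-grams could look like, here is a small sketch. The `ngrams` helper is hypothetical and not part of the current project; it simply slides a window of `n` words over the post text.

```python
def ngrams(text, n):
    """Return all runs of n consecutive words in text, lowercased."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Example post text, invented for illustration.
bigrams = ngrams("Free pizza at the hackathon", 2)
print(bigrams)
```

Feeding these phrases to the classifier in place of single words lets it pick up on word order, at the cost of needing more training data for each phrase to appear often enough.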