On social networks, the reliability and fairness of posts are more problematic than ever. We have seen cases in which users posted and shared conspiratorial or discriminatory claims. In Japan especially, people tend to shy away from using their real names for their accounts: approximately 75% of users are anonymous, and unfounded tweets are posted under that anonymity.
For example, we see users who
- post unwarranted predictions of earthquakes and tsunamis in order to stoke people's anxiety.
- spread conspiracy theories that make people distrust coronavirus vaccines.
- assert discriminatory ideas, such as that women are inferior to men or vice versa.
Now that we face many problems such as climate change, geopolitical risk, and pandemics, people crave accurate information. Twitter is still an important source of information, so we want to make it more trustworthy.
What it does
Wikipedia, which has existed since the early days of the Internet, adopts verifiability, a policy that ensures the validity of contributions by emphasizing the source of the information.
Our Twitter bot applies the concept of verifiability and enables users to verify posted content.
When users see a questionable tweet, they reply to it and mention our Twitter bot. The bot then receives a request to check the tweet's content.
Next, the bot extracts keywords from the content and searches for related news or books using the Google Books and Bing News search engines. The IBM Debater API then evaluates each result, considering its relevance to the original tweet and its stance (positive / negative) on it.
Finally, the bot replies with the highest-rated source, whether its stance is positive or negative. Users can also specify a stance beforehand and search for sources that match it.
How we built it
- Reply Event Discovery
The bot discovers requests to search for sources through the Filtered stream function of Twitter API v2, which delivers matching tweets over a persistent HTTP streaming connection.
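The discovery step could look roughly like the sketch below. It is an illustration under assumptions, not the bot's actual code: the handle `Debatter_bot`, the rule tag, and the helper names `build_rule`, `parse_request`, and `iter_requests` are hypothetical. It only shows how a filtered-stream rule for mentions in replies might be expressed and how the newline-delimited JSON events (including blank keep-alive lines) might be turned into verification requests.

```python
import json
from typing import Iterable, Iterator, Optional

# Assumed bot handle; the real account name may differ.
BOT_HANDLE = "Debatter_bot"


def build_rule(bot_handle: str) -> dict:
    """Payload for POST /2/tweets/search/stream/rules.

    Matches replies that mention the bot. The tag is an arbitrary label.
    """
    return {
        "add": [
            {"value": f"@{bot_handle} is:reply", "tag": "verification-request"}
        ]
    }


def parse_request(line: bytes) -> Optional[dict]:
    """Turn one streamed line into a request, or None if it is not one."""
    if not line.strip():
        return None  # blank keep-alive line sent to hold the connection open
    tweet = json.loads(line).get("data", {})
    # referenced_tweets is only present when requested via tweet.fields.
    for ref in tweet.get("referenced_tweets", []):
        if ref.get("type") == "replied_to":
            return {
                "request_id": tweet["id"],   # the reply that mentions the bot
                "target_id": ref["id"],      # the tweet to be verified
                "text": tweet.get("text", ""),
            }
    return None


def iter_requests(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield verification requests from an iterable of raw stream lines."""
    for line in lines:
        request = parse_request(line)
        if request is not None:
            yield request
```

In a live setting, `lines` would come from a streaming GET on `/2/tweets/search/stream` with `tweet.fields=referenced_tweets`, for example the line iterator of an HTTP client opened in streaming mode. Keeping the parsing separate from the connection makes the logic easy to test with canned lines.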
- Related Source Search
When it receives a request, the bot isolates the main part of the claim and extracts the keywords it contains. This process uses several features of the IBM Debater API. Based on the extracted keywords, the bot searches for related documents. The Bing News API is used for news search and the Google Books API is used for book search.
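As a rough sketch of the search step, the code below builds query URLs for the two search services and normalizes their results into one shape. The helper names and the normalized dictionary layout are assumptions for illustration, and keyword extraction itself is left out because the writeup does not say which IBM Debater API functions are used. Missing titles, authors, or text in Google Books entries are replaced with empty defaults so that incomplete records can be filtered out instead of raising errors.

```python
from typing import Dict, List
from urllib.parse import urlencode

BING_NEWS_ENDPOINT = "https://api.bing.microsoft.com/v7.0/news/search"
GOOGLE_BOOKS_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def bing_news_url(keywords: List[str], count: int = 10) -> str:
    """Query URL for Bing News Search (the subscription key goes in a header)."""
    return f"{BING_NEWS_ENDPOINT}?{urlencode({'q': ' '.join(keywords), 'count': count})}"


def google_books_url(keywords: List[str], max_results: int = 10) -> str:
    """Query URL for the Google Books volumes search."""
    params = {"q": " ".join(keywords), "maxResults": max_results}
    return f"{GOOGLE_BOOKS_ENDPOINT}?{urlencode(params)}"


def normalize_news(item: Dict) -> Dict:
    """Map one entry of a Bing News `value` list to a common shape."""
    return {
        "kind": "news",
        "title": item.get("name", ""),
        "authors": [],
        "url": item.get("url", ""),
        "snippet": item.get("description", ""),
    }


def normalize_book(item: Dict) -> Dict:
    """Map one entry of a Google Books `items` list to a common shape."""
    info = item.get("volumeInfo", {})
    # Prefer the search snippet; fall back to the book description.
    snippet = item.get("searchInfo", {}).get("textSnippet") or info.get("description", "")
    return {
        "kind": "book",
        "title": info.get("title", ""),
        "authors": info.get("authors", []),
        "url": info.get("infoLink", ""),
        "snippet": snippet,
    }


def usable(documents: List[Dict]) -> List[Dict]:
    """Keep only documents that have both a title and some text to score."""
    return [d for d in documents if d["title"] and d["snippet"]]
```

The Bing request additionally needs the `Ocp-Apim-Subscription-Key` header. Normalizing both result types into one dictionary shape lets the later scoring step treat news and books the same way.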
The search results include the relevant sections of text extracted by the respective Bing and Google search engines. A score is then calculated for each of these sections, taking into account the stance (positive or negative), the strength of the text's claim, and the relevance of the text.
The individual ratings needed to compute these scores are calculated separately by several functions of the IBM Debater API.
Lastly, the bot replies with the highest-scoring document as the source that most strongly supports or denies the content of the original tweet.
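The writeup does not give the exact scoring formula, so the following is only one plausible way to combine the three ratings. The `Candidate` type, the value ranges, and the multiplicative formula are all assumptions made for illustration; in the real bot the ratings come from the IBM Debater API, whereas here they are passed in as plain numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    title: str
    stance: float     # assumed range -1 (denies) .. +1 (supports)
    strength: float   # assumed range 0 .. 1, strength of the text's claim
    relevance: float  # assumed range 0 .. 1, relevance to the tweet


def score(c: Candidate) -> float:
    """Hypothetical score: strong stance, strong claim, and high relevance all help."""
    return abs(c.stance) * c.strength * c.relevance


def pick_source(candidates: List[Candidate],
                wanted_stance: Optional[str] = None) -> Optional[Candidate]:
    """Return the highest-scoring candidate, optionally limited to one stance.

    wanted_stance may be "positive", "negative", or None for either.
    """
    if wanted_stance == "positive":
        pool = [c for c in candidates if c.stance > 0]
    elif wanted_stance == "negative":
        pool = [c for c in candidates if c.stance < 0]
    else:
        pool = list(candidates)
    # default=None covers the case where no candidate matches the stance.
    return max(pool, key=score, default=None)
```

Using the absolute value of the stance means a strongly opposing source can win over a weakly supporting one, which matches the idea of replying with the strongest source regardless of direction unless the user asks for a specific stance.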
Challenges we ran into
While we were developing this app, the Node SDK on Azure did not run correctly, and some books in Google Books had no registered title, author, or content. We had difficulty dealing with these anomalies.
Accomplishments that we're proud of
In the early phase of this project, we could not find a way to calculate the score, so the search function did not work well. However, thanks to the intuitive usability of the Twitter API, we could focus on changing the logic and succeeded in refining it.
What we learned
Through this hackathon, we became familiar with the Twitter API. We found that developing a Twitter app is easier than we expected.
What's next for Debatter_bot
Although only authoritative sources are currently used as sources, we plan to include users with verification badges in the future. By handling anonymous users' posts, we want to make discussion on Twitter more active.