FAQTR © - A revolution in verifying statistical claims
Inspiration
All of us were tired of listening to claims made by politicians, celebrities, friends, and even the household help!
We already had enough on our plates; we didn't want to spend time debating what someone else said!
That's when we thought of creating FAQTR.
What it does
FAQTR is a prime example of the confluence of imagination with computational power.
It brings together a lot of functionality - NLP, neural networks, speech recognition and web parsing - all to provide you with reliable information about statistical claims.
You can speak to it -- just say a statement, and if it contains any statistical claims, FAQTR searches the web extensively for them and passes the results through complicated algorithms to provide you with the metrics you need.
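As a rough illustration of the claim-spotting step (the exact rules FAQTR uses aren't documented here, so the pattern and function below are illustrative placeholders):

```python
import re

# Illustrative patterns for numbers, percentages, and magnitude words that often
# signal a statistical claim. These are placeholder rules, not FAQTR's actual logic.
STAT_PATTERN = re.compile(
    r"\d+(\.\d+)?\s*(%|percent|per cent)|\b\d{3,}\b|\b(million|billion|crore|lakh)\b",
    re.IGNORECASE,
)

def contains_statistical_claim(sentence: str) -> bool:
    """Return True if the sentence looks like it contains a statistical claim."""
    return bool(STAT_PATTERN.search(sentence))

print(contains_statistical_claim("Unemployment fell by 3.2% last year"))  # True
print(contains_statistical_claim("I had a great day today"))              # False
```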
How we built it
We built FAQTR primarily in Python. It uses custom speech recognition backed by Google APIs to transcribe your speech, which is then analysed with an LSTM built on top of a CNN to determine the type of claim you made.
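For reference, the Google-backed transcription step can be done with the SpeechRecognition package roughly as below; the wrapper function is an illustrative sketch, not FAQTR's exact code:

```python
import speech_recognition as sr

def transcribe_from_microphone() -> str:
    """Capture audio from the default microphone and transcribe it using
    Google's Web Speech API via the SpeechRecognition package."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # compensate for background noise
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # the speech was unintelligible

if __name__ == "__main__":
    print(transcribe_from_microphone())
```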
If the claim is accepted, FAQTR uses the Bing APIs to search the web and retrieve the relevant results.
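A Bing Web Search (v7) call from Python looks roughly like this; the key handling and parameters are illustrative, and the endpoint is the one from Microsoft's current documentation:

```python
import os
import requests

BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"

def search_web(query, count=10):
    """Query the Bing Web Search API and return title/url/snippet records.
    The subscription key is read from an environment variable here purely
    for illustration."""
    headers = {"Ocp-Apim-Subscription-Key": os.environ["BING_SEARCH_KEY"]}
    params = {"q": query, "count": count}
    response = requests.get(BING_ENDPOINT, headers=headers, params=params, timeout=10)
    response.raise_for_status()
    pages = response.json().get("webPages", {}).get("value", [])
    return [{"title": p["name"], "url": p["url"], "snippet": p["snippet"]} for p in pages]
```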
To determine whether the retrieved results are of any use, we used another CNN, along with a homemade algorithm!
Add a little bit of mathematics and you get a nice little formula for verifying statistical claims!
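The formula itself isn't spelled out above, but the general idea, weighting how strongly each retrieved result agrees with the claim by how relevant that result is, can be sketched like this (the scoring scheme is purely an illustrative assumption):

```python
def verdict_score(results):
    """Combine per-result scores into one confidence value in [0, 1].

    Each result is assumed to carry a 'relevance' score (how useful the page is,
    e.g. from the relevance CNN) and an 'agreement' score (how strongly the page
    supports the claim). The relevance-weighted average below is an illustrative
    stand-in, not FAQTR's actual formula.
    """
    total_weight = sum(r["relevance"] for r in results)
    if total_weight == 0:
        return 0.0
    return sum(r["relevance"] * r["agreement"] for r in results) / total_weight

# Two strongly supporting pages and one weak, disagreeing page.
print(verdict_score([
    {"relevance": 0.9, "agreement": 1.0},
    {"relevance": 0.8, "agreement": 1.0},
    {"relevance": 0.2, "agreement": 0.0},
]))  # ~0.89
```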
Challenges
The primary challenge we had to deal with was data acquisition. It was extremely difficult to find sentences containing statistical claims, so we had to build our own datasets.
The second challenge we faced was developing a semantic analysis module that could run without any existing corpora. We racked our brains the entire night trying to solve this problem, and finally managed to train a CNN to do the job.
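For a picture of how a small text CNN can be trained from a hand-made dataset with no external corpora, here is a minimal Keras sketch; the sample sentences, labels and layer sizes are all invented for illustration:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# A tiny hand-labelled dataset (1 = statistical claim, 0 = not), invented for illustration.
sentences = [
    "GDP grew by 7 percent last quarter",
    "Nearly 40% of voters stayed home",
    "I really enjoyed the concert yesterday",
    "She said the food was delicious",
]
labels = np.array([1, 1, 0, 0])

# Build a vocabulary from our own sentences only, since no external corpus is used.
tokenizer = Tokenizer(num_words=1000, oov_token="<unk>")
tokenizer.fit_on_texts(sentences)
x = pad_sequences(tokenizer.texts_to_sequences(sentences), maxlen=12)

# A small 1-D convolutional classifier over the token sequences.
model = models.Sequential([
    layers.Embedding(1000, 32),
    layers.Conv1D(32, 3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, labels, epochs=10, verbose=0)
```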
We also faced challenges with conflicting libraries -- importing NLTK somehow broke speech recognition. We had to find workarounds and patch all sorts of bugs.
Accomplishments we are proud of
We are extremely proud of FAQTR, primarily because a lot of discussion groups told us it would be very, very difficult to pull the idea off.
Although the broader problem (verification of general claims) is considered NP-hard, we are very happy to present a proof of concept for a very limited subset of it.
Through FAQTR, we were able to integrate LSTMs, CNNs, NLTK, regular expressions, speech recognition and semantic analysis all under one roof.
What we learnt
The first thing we learnt in these two days is that no problem is too difficult to approach.
We also learnt the importance of having a team -- FAQTR could not have existed if even one team member had been missing.
We learnt to take pride in what we built, and we will not hesitate to show our creation to the world.
Future of FAQTR
Currently, the neural networks used in FAQTR are trained on very limited datasets. FAQTR has huge potential once the right datasets are found.
Also, we currently rely on Bing for search. Integrating Facebook, Twitter or similar APIs would make the search far more capable, and FAQTR even more reliable.
We truly believe that FAQTR is a unique approach to claim verification, and with the right guidance, we are sure we will propel it to great heights.
Team TeenLaddu