Building a Question Answering System

Quiz game between AI and human
Buzz csv file for QA system

Inspiration

QuizBowl

What it does

The question answering machine builds tf-idf representations of the quiz data and uses logistic regression and one-hot encoding to compete against a human in a quiz bowl setting.

How we built it

We create a tf-idf vector representation of the questions and answers and build a dataframe that holds these vectors alongside the provided data. We then convert the answer categories and subcategories into binary vectors using one-hot encoding and split the dataframe into training and test sets in an 80:20 ratio. Next, we fit a logistic regression model to the training data. Finally, we predict the answer for each test question, along with the probability of that answer under the learned logistic regression model. The project is written in Python using numpy, pandas, and sklearn.
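The steps above can be sketched roughly as follows. This is a minimal illustration on made-up toy data, not the project's actual code: the column names (`question`, `category`, `subcategory`, `answer`), the toy rows, and the choice to fit the tf-idf vectorizer on the training split only are all assumptions.

```python
import pandas as pd
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the quiz bowl questions.
newton = [
    "This physicist formulated three laws of motion",
    "He described universal gravitation in the Principia",
    "This scientist studied optics and split light with a prism",
    "He developed calculus independently of Leibniz",
    "An apple is said to have inspired his theory of gravity",
]
shakespeare = [
    "This playwright wrote Hamlet and Macbeth",
    "He wrote the tragedy Romeo and Juliet",
    "This author of many sonnets worked at the Globe Theatre",
    "He created the characters Othello and Iago",
    "This bard from Stratford wrote King Lear",
]
df = pd.DataFrame({
    "question": newton + shakespeare,
    "category": ["Science"] * 5 + ["Literature"] * 5,
    "subcategory": ["Physics"] * 5 + ["Drama"] * 5,
    "answer": ["Isaac Newton"] * 5 + ["William Shakespeare"] * 5,
})

# 80:20 split, stratified so both answers appear in each split.
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df["answer"], random_state=0
)

# tf-idf features for the question text (vocabulary learned from train only).
vectorizer = TfidfVectorizer()
X_train_text = vectorizer.fit_transform(train_df["question"])
X_test_text = vectorizer.transform(test_df["question"])

# One-hot encode the categorical columns into binary indicator features.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train_cat = encoder.fit_transform(train_df[["category", "subcategory"]])
X_test_cat = encoder.transform(test_df[["category", "subcategory"]])

# Combine the text and categorical features into one sparse matrix.
X_train = hstack([X_train_text, X_train_cat]).tocsr()
X_test = hstack([X_test_text, X_test_cat]).tocsr()

# Fit logistic regression on the training data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, train_df["answer"])

# Predicted answer and the model's probability for that answer.
predicted = model.predict(X_test)
probabilities = model.predict_proba(X_test)
confidence = probabilities.max(axis=1)

train_accuracy = model.score(X_train, train_df["answer"])
test_accuracy = model.score(X_test, test_df["answer"])
print(predicted, confidence, train_accuracy, test_accuracy)
```

In a quiz bowl setting, the `confidence` value could plausibly be compared against a threshold to decide when the machine should buzz, though the threshold itself would need tuning on real data.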

Challenges we ran into

This was our first time using Python on a project, and we started without any knowledge of machine learning or natural language processing, so it took us time to learn the syntax and the ML/NLP concepts the project relies on.

Accomplishments that we're proud of

As first-time users, we implemented the code in Python, trained a model, and measured its training and testing accuracy, although improvements are still needed.

What we learned

We learned Python and its libraries and how to use them for ML/NLP. We also learned machine learning concepts and how to collaborate on and contribute to a software project.

What's next for Building a Question Answering System

We want to make the search and tokenization finer-grained, down to the level of individual words, and add more features to improve the accuracy of the computer's guesses.
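One possible way to experiment with word-level tokenization and richer features is to let the tf-idf vectorizer emit both single words and word pairs. The sketch below is only an assumption about what such a step might look like; the sample sentences and the choice of `ngram_range=(1, 2)` are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample questions.
docs = ["quiz bowl question", "quiz bowl answer"]

# Word-level tokens plus two-word phrases (unigrams and bigrams).
vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)

# vocabulary_ maps each learned token or phrase to its column index.
features = sorted(vectorizer.vocabulary_)
print(features)
```

Bigrams such as "quiz bowl" give the model phrase-level signals that single words miss, at the cost of a larger feature matrix.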