Inspiration
QuizBowl
What it does
Our question-answering machine represents the quiz data as tf-idf vectors and uses logistic regression and one-hot encoding to compete against a human in a quiz bowl setting.
How we built it
We build a tf-idf vector representation of the questions and answers and store it in a dataframe alongside the provided data. We then convert the answer categories and subcategories into binary columns using one-hot encoding and split the dataframe 80:20 into training and test sets. Next, we fit a logistic regression model to the training data. Finally, we predict an answer for each test question, along with the probability that the learned logistic regression model assigns to that answer. The project is written in Python using numpy, pandas, and sklearn.
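The steps above can be sketched roughly as follows. This is a minimal illustration, not our actual project code: the toy questions, the column names (question, category, answer), and the hyperparameters are all assumptions made for the example. It also fits the tf-idf vectorizer and the encoder on the training split only, which is one reasonable way to keep test data out of the learned vocabulary.

```python
import pandas as pd
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; the real quiz bowl dataset is much larger.
data = pd.DataFrame({
    "question": [
        "This physicist formulated three laws of motion",
        "He described universal gravitation in the Principia",
        "This scientist developed calculus independently of Leibniz",
        "An apple supposedly inspired his theory of gravity",
        "He studied optics and built a reflecting telescope",
        "This playwright wrote Hamlet and Macbeth",
        "He wrote the tragedy Romeo and Juliet",
        "This author is known as the Bard of Avon",
        "His plays were performed at the Globe Theatre",
        "He wrote sonnets and the comedy Twelfth Night",
    ],
    "category": ["Science"] * 5 + ["Literature"] * 5,
    "answer": ["Isaac Newton"] * 5 + ["William Shakespeare"] * 5,
})

# 80:20 split; stratifying keeps both answers present in the training set.
train_df, test_df = train_test_split(
    data, test_size=0.2, random_state=0, stratify=data["answer"]
)

# tf-idf for the question text, one-hot encoding for the category column.
vectorizer = TfidfVectorizer()
encoder = OneHotEncoder(handle_unknown="ignore")

X_train = hstack([
    vectorizer.fit_transform(train_df["question"]),
    encoder.fit_transform(train_df[["category"]]),
]).tocsr()
X_test = hstack([
    vectorizer.transform(test_df["question"]),
    encoder.transform(test_df[["category"]]),
]).tocsr()

# Fit logistic regression on the training features and answer labels.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, train_df["answer"])

# Predicted answer and the probability the model assigns to it.
predicted = model.predict(X_test)
confidence = model.predict_proba(X_test).max(axis=1)

for question, guess, prob in zip(test_df["question"], predicted, confidence):
    print(f"{question!r} -> {guess} (p={prob:.2f})")

print("train accuracy:", model.score(X_train, train_df["answer"]))
print("test accuracy:", model.score(X_test, test_df["answer"]))
```

In a quiz bowl setting, the confidence value could be compared against a threshold to decide whether the machine should buzz in or wait for more of the question.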
Challenges we ran into
This was our first time using Python on a project, and we started without any knowledge of machine learning or natural language processing, so it took us time to learn the syntax and the ML/NLP concepts the project required.
Accomplishments that we're proud of
As first-time users, we implemented the code in Python, trained a model, and measured its training and testing accuracy, although improvements are needed.
What we learned
We learned Python and its libraries and how to use them for ML/NLP. We also learned machine learning concepts and how to collaborate on and contribute to a software project.
What's next for Building a Question Answering System
We want to refine the search and tokenization to a finer, word-level granularity and include more features to improve the accuracy of the computer's guesses.

