Playing QuizUp got really boring because we were limited to predefined topics. What if we could enter any topic and have the game automatically generate questions from the immense Wikipedia knowledge base?
What it does
It scrapes the Wikipedia article for a given topic and tries to generate multiple-choice questions from it.
How we built it
We used a number of NLP libraries for Python, and a RESTful server for the actual game. First, we filter the sentences to strip out citations and other unneeded characters. Then we eliminate the longer sentences, which would be very difficult to parse. Next, we pre-process every sentence, using a hidden Markov model algorithm to tag the part of speech of every word. Breaking each sentence into noun phrases, verb phrases and prepositional phrases then lets us crudely generate a question.
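The steps above can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: the function names, the citation pattern, and the 25-word cutoff are all assumptions, and the part-of-speech tags are passed in by hand (Penn Treebank style) rather than produced by a hidden Markov model tagger. The fill-in-the-blank question at the end is just one simple way to turn a noun phrase into a question.

```python
import re

# Assumed pattern for Wikipedia-style markers such as [1] or [citation needed].
CITATION = re.compile(r"\[(?:\d+|citation needed|note \d+)\]")
MAX_WORDS = 25  # assumed cutoff; longer sentences are dropped as too hard to parse

# Tags treated as part of a noun phrase (determiners, adjectives, nouns).
NP_TAGS = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS"}


def clean(text):
    """Remove citation markers and collapse runs of whitespace."""
    text = CITATION.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive splitter: break after . ! or ? when a capital letter follows."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)


def keep_short(sentences, max_words=MAX_WORDS):
    """Keep only sentences with at most max_words whitespace-separated words."""
    return [s for s in sentences if len(s.split()) <= max_words]


def noun_phrases(tagged):
    """Group maximal runs of noun-phrase tags into phrases.

    `tagged` is a list of (word, tag) pairs; in a real pipeline these
    would come from a part-of-speech tagger.
    """
    phrases, current = [], []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


def cloze(sentence, phrase):
    """Blank out the first occurrence of phrase; return (question, answer)."""
    return sentence.replace(phrase, "_____", 1), phrase


if __name__ == "__main__":
    raw = "Paris is the capital of France.[1] It is on the Seine.[citation needed]"
    sentences = keep_short(split_sentences(clean(raw)))
    # Hand-written tags for the first sentence, for illustration only.
    tagged = [("Paris", "NNP"), ("is", "VBZ"), ("the", "DT"),
              ("capital", "NN"), ("of", "IN"), ("France", "NNP"), (".", ".")]
    answer = noun_phrases(tagged)[-1]
    print(cloze(sentences[0], answer))
```

To get multiple-choice options, the other noun phrases collected from the same article could serve as wrong answers, although picking plausible ones is a harder problem than this sketch addresses.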
Challenges we ran into
This subject is not well documented: there have been only a few attempts at it, and even those were made by PhD students for their dissertations. We also ran into several library bugs and dependency problems.
Accomplishments that we're proud of
What we learned
A lot more than we expected.
What's next for _scrapr
We want to improve the question generation algorithm, and we also plan to implement a classifier to rank the generated questions by their syntactic structure. With more time, we would also have implemented an algorithm to check the grammar of the questions and correct any mistakes.