Disclaimer: I didn't have time to polish this, but it's all there!

Inspiration

  • The lack of 'world understanding' or 'world modeling' is a major shortcoming of modern AI. Can we build an AI that understands relationships between concepts well enough to navigate the huge knowledge graph of Wikipedia?

  • WikiRace was a game we used to play in elementary school when all other online games were blocked.

  • The game involves 2 players agreeing on a starting page and a destination page, then following links between Wikipedia articles to reach the destination. WikiRace is a delicate balance of speed reading and strategy/wisdom: picking a strategic route quickly is the name of the game.

What it does

  • NLP has come a long way in 'understanding/modelling' text. Can we incorporate NLP into this game to train an agent that 'understands' the relationships between articles well enough to find its way to the destination page?

How I built it

  • Fetching data from a SQLite dump and the Wikipedia API

  • Formulating the problem:

1) Given the current page, vectorize the text of every page it links to

2) Concatenate each of these vector representations with the vector representation of the target article

3) Run a breadth-first search to measure the actual graph 'distance' from each linked article to the target (needed for training only)

4) Build a model on top of [vectTarget, vectLinkedArticleFromCurrent, distanceBetweenArticles]

5) Learn the relationship between the text 'features' of articles and 'graphDistanceFromTarget' so the agent can navigate a WikiRace intelligently
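
The five steps above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the tiny `LINKS` graph and `TEXT` corpus are made-up stand-ins for the Wikipedia data, `vectorize` is a simple bag-of-words count rather than whatever vectorizer the project actually used, and `overlap_score` is a hand-written placeholder for the trained model, not a learned one. All function names here are hypothetical.

```python
from collections import Counter, deque

# Toy link graph standing in for Wikipedia (made-up data): page -> pages it links to.
LINKS = {
    "Cat": ["Mammal", "Pet"],
    "Pet": ["Dog", "Cat"],
    "Mammal": ["Animal", "Dog"],
    "Dog": ["Animal"],
    "Animal": [],
}

# Toy article text (made-up); real text would come from the dump or the API.
TEXT = {
    "Cat": "cat small mammal pet",
    "Pet": "pet animal kept home dog cat",
    "Mammal": "mammal animal warm blood",
    "Dog": "dog mammal pet animal",
    "Animal": "animal living organism",
}

VOCAB = sorted({word for text in TEXT.values() for word in text.split()})


def vectorize(title):
    """Step 1: bag-of-words count vector over VOCAB (stand-in vectorizer)."""
    counts = Counter(TEXT[title].split())
    return [counts[word] for word in VOCAB]


def bfs_distance(start, target):
    """Step 3: link clicks on a shortest path from start to target, or None."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in LINKS.get(page, []):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def training_rows(current, target):
    """Steps 2 and 4: one ([vectTarget + vectLinked], distance) pair per link."""
    target_vec = vectorize(target)
    rows = []
    for linked in LINKS[current]:
        dist = bfs_distance(linked, target)
        if dist is None:  # target unreachable through this link; skip it
            continue
        rows.append((target_vec + vectorize(linked), dist))
    return rows


def overlap_score(features):
    """Placeholder for the trained model: more shared words -> lower score."""
    half = len(features) // 2
    return -sum(a * b for a, b in zip(features[:half], features[half:]))


def navigate(start, target, predict, max_clicks=10):
    """Step 5: greedily follow the link with the smallest predicted distance."""
    path = [start]
    target_vec = vectorize(target)
    while path[-1] != target and len(path) <= max_clicks:
        links = LINKS[path[-1]]
        if not links:
            break
        path.append(min(links, key=lambda p: predict(target_vec + vectorize(p))))
    return path


rows = training_rows("Cat", "Animal")
path = navigate("Cat", "Animal", overlap_score)
```

In a real run, the `rows` pairs would be fed to a regressor (for example an sklearn model), and its prediction function would be passed to `navigate` in place of `overlap_score`. BFS is only needed to label the training data; at play time the agent relies on the model alone.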

Challenges I ran into

BIG_DATA --> melts my computer

DATA_MANIPULATION --> melts my brain

Accomplishments that I'm proud of

Getting it all to (almost) work together... so many hurdles (:

What I learned

An interesting new way of thinking about NLP/graph problems. Looking forward to continuing this and seeing where it goes!

What's next for Wikied Fast!

  • Swap out the current sklearn models for a fully deep-learning approach (leveraging BERT, ULMFiT, etc.)