What it does
Uses a Graph Convolutional Network to predict whether two web pages in the graph are connected.
It is a mini-implementation of GraphSAGE, a popular learning algorithm for graph data.
Achieved ~91.5% classification accuracy on the test set!
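To make the idea concrete, here is a minimal sketch of one GraphSAGE-style layer with a mean aggregator, followed by a dot-product link score. It is an illustration in plain Python, not the project's actual code: the function names, the identity weights, and the sigmoid dot-product scorer are all assumptions, and there is no neighbor sampling or training here.

```python
import math

def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer: h' = ReLU(W_self h + W_neigh mean(h of neighbors)).

    features:  dict node -> list of floats (input embedding)
    neighbors: dict node -> list of neighbor nodes
    w_self, w_neigh: weight matrices as lists of rows (hypothetical, untrained)
    """
    out = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            # Mean of the neighbors' feature vectors, one dimension at a time.
            agg = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(len(h))]
        else:
            # Isolated node: nothing to aggregate.
            agg = [0.0] * len(h)
        out[node] = [
            max(0.0, sum(ws * x for ws, x in zip(row_s, h))
                     + sum(wn * a for wn, a in zip(row_n, agg)))
            for row_s, row_n in zip(w_self, w_neigh)
        ]
    return out

def link_score(h_u, h_v):
    """Probability-like score that an edge (u, v) exists: sigmoid of the dot product."""
    z = sum(a * b for a, b in zip(h_u, h_v))
    return 1.0 / (1.0 + math.exp(-z))

# Tiny toy graph: node 0 links to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = sage_mean_layer(features, neighbors, identity, identity)
score_01 = link_score(hidden[0], hidden[1])
```

Stacking several such layers lets each node see further into its neighborhood, which is what a deeper GraphSAGE refers to.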
How we built it
Data
- Nodes (numeric ID): web pages
- (22470 linked, 1655 isolated)
- Edges: an edge exists if two pages link to each other (132039 edges)
- Page’s text description
- Page type (label)
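As a rough sketch of how the linked and isolated counts above could be derived from an edge list, assuming node IDs run from 0 to num_nodes - 1 and edges arrive as (u, v) pairs. The helper names are hypothetical, and dropping self-loops is an assumption rather than something the dataset description states.

```python
def build_adjacency(num_nodes, edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {n: set() for n in range(num_nodes)}
    for u, v in edges:
        if u == v:
            continue  # assumption: a page linking to itself does not count as an edge
        adj[u].add(v)
        adj[v].add(u)
    return adj

def split_linked_isolated(adj):
    """Return (linked, isolated) node lists; isolated nodes have no neighbors."""
    linked = [n for n, nbrs in adj.items() if nbrs]
    isolated = [n for n, nbrs in adj.items() if not nbrs]
    return linked, isolated

# Toy example with 5 pages, two real links and one self-loop.
adj = build_adjacency(5, [(0, 1), (1, 2), (3, 3)])
linked, isolated = split_linked_isolated(adj)
```

Isolated nodes matter for a neighborhood-aggregating model because they have no neighbors to draw information from, so it helps to count them up front.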
Pre-processing
- Node features
- labels: provided, 4 types
- Embedding text one-hot vectors
- Use Doc2Vec; choose the output feature dimension based on the raw sentence length
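The node feature described here can be sketched as a one-hot label vector concatenated with the text embedding. In this sketch the four page-type names are placeholders (the real label names are not given above), and the text embedding is passed in as an already computed list of floats standing in for the Doc2Vec output.

```python
# Placeholder names for the 4 page types; the real label names are an assumption.
PAGE_TYPES = ["type_a", "type_b", "type_c", "type_d"]

def one_hot(label, classes=PAGE_TYPES):
    """Encode a page type as a one-hot vector over the known classes."""
    vec = [0.0] * len(classes)
    vec[classes.index(label)] = 1.0  # raises ValueError for an unknown label
    return vec

def node_feature(label, text_embedding, classes=PAGE_TYPES):
    """Concatenate the one-hot label with the (precomputed) text embedding."""
    return one_hot(label, classes) + list(text_embedding)

# Example: a page of the third type with a 2-dimensional stand-in embedding.
feature = node_feature("type_c", [0.25, -0.5])
```

The resulting vector has length 4 plus the embedding dimension, so a longer text embedding directly widens the model's input.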
Problem Abstraction: Link Prediction in Graph
- Small model: increase complexity
- Deeper GraphSAGE
- higher number of channels
- longer text embedding
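Framing link prediction as binary classification needs negative examples (page pairs with no link) alongside the real edges. The sketch below shows one common way to get them, uniform random negative sampling, plus a thresholded accuracy metric. Whether the project sampled negatives this way is an assumption; the function names, the seed, and the 0.5 threshold are illustrative.

```python
import random

def sample_negative_edges(num_nodes, positive_edges, k, seed=0):
    """Sample k distinct node pairs that are not in positive_edges (undirected)."""
    rng = random.Random(seed)
    existing = {frozenset(e) for e in positive_edges if e[0] != e[1]}
    max_pairs = num_nodes * (num_nodes - 1) // 2 - len(existing)
    if k > max_pairs:
        raise ValueError("not enough non-edges to sample from")
    negatives = set()
    while len(negatives) < k:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        pair = frozenset((u, v))
        if u != v and pair not in existing:
            negatives.add(pair)
    # Return each pair as (smaller, larger) in sorted order for reproducibility.
    return sorted(tuple(sorted(p)) for p in negatives)

def accuracy(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the 0/1 label."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)

# Toy example: 4 pages, 2 real links, 3 sampled non-links.
positives = [(0, 1), (1, 2)]
negatives = sample_negative_edges(4, positives, k=3)
```

With positives labeled 1 and negatives labeled 0, the model's link scores can be evaluated with `accuracy`, which is the kind of classification accuracy reported for the test set.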
Graph
- Nodes: pages
- Edges: connectivity of pages
- Node feature: label + (embedded) text