Inspiration
The supernode problem is a common issue in the graph database space. Being able to do the following query efficiently on nodes with thousands, millions or more relationships would make a whole new range of use cases possible with Neo4j.
MATCH (node)-[r:MY_REL {i: $i}]-(other)
WHERE id(node) = $id
RETURN r, other
What it does
Define the index
CALL ga.index.create('Person', 'VISITED', 'date')
Lookup using the index:
MATCH (n:Person {name:'Frantisek'})
CALL ga.index.lookup([n], 'VISITED', 'date', $date) YIELD r,other
RETURN r,other
How we built it
- A transaction handler checks all modifications to relationships in each transaction before commit.
- For each node with more relationships than defined threshold it creates an explicit index where it indexes all relationships of that node. The index is local to that node.
- The lookup procedure then checks if a node local index exists, and if it does it uses the index to get the relationships, otherwise it iterates over all the neighbours.
Challenges we ran into
Databases are hard.
Accomplishments that we're proud of
We consider this proof of concept successful and look forward to making it production ready.
What we learned
Lot's of Neo4j internals.
What's next for Node Local Relationship Indexes
Move index definition from the db itself (node with special label) to a better place
Support direction of the relationship - now the relationship is indexed regardless of the direction, provide ability to define an index on outgoing/incoming direction
Use new native indexes instead of explicit indexes - we considered this at the beginning, but decided against it due to the complexity of such solution. We hope Neo4j will expose more of the native indexing functionality in 4.0.
Support range queries - due to the limitation of explicit indexes we didn't implement range queries, it should also support returning results in sorted order
Log in or sign up for Devpost to join the conversation.