Wednesday, 4 June 2014

Graph Local queries - revisited

Recently I had some feedback from Chuck Daniels on my blog post on Graph Local queries. In his comment, Chuck challenged me, saying that I should really be comparing apples to apples, and that I should make sure I was applying the same kind of locality.

Basically his question was as follows: once we go from the small dataset to the large one, shouldn't the query that we test performance for also include a richer, more complex pattern? Essentially he wanted me to compare these two queries:

match 
(eqt:EQUIPMENT_TYPE)<-[:IS_OF_TYPE]-(eq:EQUIPMENT)-[:LOCATED_AT]->(ol:OBSERVATION_LOCATION)<-[:OBSERVED_AT_LOCATION]-(o:OBSERVATION {id:1001})
return eqt.name, ol.name, o.id; 

and

match 
(eqt:EQUIPMENT_TYPE)<-[:IS_OF_TYPE]-(eq:EQUIPMENT)-[:LOCATED_AT]->(ol:OBSERVATION_LOCATION)<-[:OBSERVED_AT_LOCATION]-(o:OBSERVATION {id:1001}),
(eq)-[:USED_FOR]->(ot:OBSERVATION_TYPE)<-[:IS_OF_TYPE]-(o)
return eqt.name, ol.name, o.id; 

The second query is obviously a bit more complex, as it adds a few more "hops" to the traversal.

So let's test this out

I have created a little gist so that you can try this yourself: it covers loading the data and testing the results. We start with an empty database, of course, and load the small dataset first.

Then we run the sample queries (both of them: the easier one AND the more complex one) and create the index on the OBSERVATION nodes.
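
For reference, here is a minimal sketch of what creating that index could look like. It assumes the index is on the id property of the OBSERVATION label, since that is the property both queries use to anchor the pattern; the exact statement in the gist may differ.

```cypher
// Assumption: the index covers the id property on :OBSERVATION,
// because both queries start from (o:OBSERVATION {id:1001}).
// This is the Neo4j 2.x syntax that was current when this post was written.
CREATE INDEX ON :OBSERVATION(id);

// On recent Neo4j versions the equivalent statement would be:
// CREATE INDEX FOR (o:OBSERVATION) ON (o.id);
```

The index is only used to find the starting node. From there on, the rest of the pattern is matched by following relationships from that node.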

Next up is adding the larger dataset, and running the queries again:

And surprise surprise, the principle of graph locality and index-free adjacency still holds - the queries are still lightning fast.
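
If you want to look beyond the raw timings, one option is to profile the more complex query and compare the execution plan on the small and the large dataset. This is just a sketch: it assumes a Neo4j version that accepts PROFILE as a query prefix, and I am not showing any plan output here as it will depend on your version and data.

```cypher
// Same query as the second one above, prefixed with PROFILE
// so that Neo4j reports the execution plan alongside the results.
PROFILE
MATCH
(eqt:EQUIPMENT_TYPE)<-[:IS_OF_TYPE]-(eq:EQUIPMENT)-[:LOCATED_AT]->(ol:OBSERVATION_LOCATION)<-[:OBSERVED_AT_LOCATION]-(o:OBSERVATION {id:1001}),
(eq)-[:USED_FOR]->(ot:OBSERVATION_TYPE)<-[:IS_OF_TYPE]-(o)
RETURN eqt.name, ol.name, o.id;
```

If the query is truly graph local, the work reported in the plan should depend on the neighbourhood of the one OBSERVATION we start from, not on the total size of the database.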

Hope this clarifies the point that Chuck raised - and reinforces the fact that graph local queries are GREAT for many different use cases!

All the best

Rik
