That's why I started experimenting with how a graph database like Neo4j could help with this. Some of the tracing problems that we will face are uniquely well suited to a graph database approach: it lets us see and understand the indirect contacts that healthy and sick people may have had with one another, and the effects that these contacts could have in our environments. It also allows for some unique predictive analytics: the structure of our contacts, the network/graph that they form, actually says a lot about the importance that parts of the network may play in the evolution of the pandemic. Graph Data Science can give us pointers as to where this should direct our policies.
This has ended up being quite an extensive piece of work. In order to keep it readable, I have cut it up into 4 blogposts, which I will put up all at the same time:
- Part 1: how I go about creating a synthetic dataset, and import that into Neo4j
- Part 2: how I can start running some interesting queries on the dataset, which helps me understand some of the interesting data points in there and the questions that one might ask
- Part 3: how I can use graph data science on this dataset, understand some of the predictive metrics like pagerank and betweenness, and use community detection to direct policies
- Part 4: a number of loose ends that I touched on during my exploration - but surely did not exhaust.
There's so much potential in this dataset, and in this problem domain in general. I feel like I have gone down the rabbit hole and have just resurfaced for some air. But who knows, maybe I will dive back in and do some more digging. After all, this is interesting stuff, and I love working on interesting topics.
Hope this is as interesting for you as it was for me.
All the best
Note that these demos will require the following environment:
- Neo4j Desktop 1.2.7, Neo4j Enterprise 3.5.17, apoc 188.8.131.52, gds 1.1.0, or
- Neo4j Desktop 1.2.7, Neo4j Enterprise 4.0.3, apoc 184.108.40.206 (NOT later! a bug in apoc.coll.max/apoc.coll.min needs to be resolved)