What happened before
As you may remember, I created a little beer graph some time ago to experiment and have fun with beer and graphs. And yes, I have been having LOTS of fun with it - using it to explain graph concepts to lots of not-so-technical folks, like myself. Many people liked it, and even more people had questions about it - they started thinking in graphs, basically. Which is way more than I ever hoped for - so that's great!
One of the questions that people always asked me was about the model. Why did I model things the way I did? Are there no other ways to model this domain? What would be the *best* way to model it? All of these questions have somewhat vague answers because, as a rule, there is no *one way* to model a graph. The data does not determine the model - it's the QUERY that will drive the modelling decisions.
Taking the AlcoholPercentage to the next level
So in my new version of my beergraph, I have done something different. I used the example of Peter to create an in-graph index of AlcoholPercentages - a bit like the picture of the new model that you see here.
Essentially, I am connecting all the alcohol percentages into a chain of AlcoholPercentages, using the [:PRECEDES] relationship. In Cypher-style ascii-art that would be something like
... -(alcperc-0.2)-[:PRECEDES]->(alcperc-0.1)-[:PRECEDES]->(alcperc)-[:PRECEDES]->(alcperc+0.1)-[:PRECEDES]->(alcperc+0.2)- ...
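To make the chain a bit more concrete, here is a minimal sketch in Python (not Cypher, so that it runs without a Neo4j instance) of how the rows of such a "PRECEDES" tab could be generated. It assumes the chain holds every 0.1 step between a lowest and a highest value, as the ascii-art suggests; the function name precedes_pairs and the sample range are made up for illustration and do not come from the actual spreadsheet.

```python
from decimal import Decimal

# Assumed step size between neighbouring AlcoholPercentage nodes.
STEP = Decimal("0.1")


def precedes_pairs(lowest, highest, step=STEP):
    """Return (lower, higher) pairs, one per PRECEDES relationship."""
    pairs = []
    current = lowest
    while current + step <= highest:
        pairs.append((current, current + step))
        current += step
    return pairs


# Hypothetical range, purely for illustration.
pairs = precedes_pairs(Decimal("6.0"), Decimal("6.3"))
for lower, higher in pairs:
    print(f"({lower})-[:PRECEDES]->({higher})")
```

Decimal is used instead of float so that repeatedly adding 0.1 does not drift away from the values that the beers actually point to.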
To do this, I of course did have to modify my beer-spreadsheet a little bit. You can find the new version over here. But from the screenshot below you can see that all I did was create another tab that had all the alcoholpercentages and that "PRECEDES" relationship between them. Easy peasy.
Nice. So what? The resulting dataset is very similar to what we had before - it's just a little bit richer. You immediately notice it as you start "walking" the graph in the WebUI: the links to the AlcoholPercentage-chain give me a new and interesting way to explore the graph.
But what else can we do with this? Well, querying it is the obvious answer. Let me give you a couple of examples:
- How can I find beers that have the same beertype and a "same or similar" alcohol percentage (let's say + or - 1%, which is up to 10 steps of 0.1 along the chain) as a beer that I really like (Orval)? That's now become very easy:
start
orval=node:node_auto_index(name="Orval")
match
orval-[:IS_A]-beertype,
orval-[:HAS_ALCOHOL_PERCENTAGE]-alcperc,
alcperc-[:PRECEDES*0..10]-otheralcperc,
otherbeer-[:HAS_ALCOHOL_PERCENTAGE]-otheralcperc,
otherbeer-[:IS_A]-beertype,
otherbeer-[:BREWS]-otherbrewery
return
otherbeer.name, beertype.name, otherbrewery.name;
Or another example:
- How can I find other beers from the same brewery that have a similar AlcoholPercentage to a beer that I also like (Duvel)?
start
duvel=node:node_auto_index(name="Duvel")
match
duvel-[:BREWS]-brewery,
duvel-[:IS_A]-beertype,
duvel-[:HAS_ALCOHOL_PERCENTAGE]-alcperc,
alcperc-[:PRECEDES*1..10]-otheralcperc,
otherbeer-[:HAS_ALCOHOL_PERCENTAGE]-otheralcperc,
otherbeer-[:IS_A]-otherbeertype,
otherbeer-[:BREWS]-brewery
return
otherbeer.name, otherbeertype.name, brewery.name,
otheralcperc.name
order by
otherbeer.name;
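For readers who do not have the graph at hand, the idea behind the first query can be sketched in plain Python: walk up to 10 PRECEDES links in either direction from the starting percentage, then keep the beers of the same beertype that hang off one of the percentages reached. Everything in this sketch is an assumption made for illustration: the beer names, types, breweries and percentages are invented placeholders, the function similar_beers is hypothetical, and the sketch explicitly leaves out the starting beer.

```python
from decimal import Decimal

# Assumed step size between neighbouring AlcoholPercentage nodes.
STEP = Decimal("0.1")

# Invented placeholder data: name -> (beertype, brewery, alcohol percentage).
BEERS = {
    "Beer A": ("Trappist", "Brewery X", Decimal("6.2")),
    "Beer B": ("Trappist", "Brewery Y", Decimal("6.9")),
    "Beer C": ("Trappist", "Brewery Z", Decimal("8.0")),
    "Beer D": ("Pils", "Brewery X", Decimal("6.0")),
}


def similar_beers(name, max_hops=10):
    """Beers of the same type within max_hops chain steps of the given beer."""
    beertype, _, abv = BEERS[name]
    # Percentages reachable by walking 0..max_hops links up or down the chain.
    reachable = {
        abv + sign * hops * STEP
        for hops in range(max_hops + 1)
        for sign in (1, -1)
    }
    return sorted(
        other
        for other, (other_type, _, other_abv) in BEERS.items()
        if other != name and other_type == beertype and other_abv in reachable
    )


result = similar_beers("Beer A")
print(result)  # ['Beer B']
```

Here "Beer C" falls outside the 10-step window and "Beer D" has a different beertype, so only "Beer B" comes back. In the graph itself no arithmetic is needed: the traversal simply follows the relationships.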
Both of the queries above gave me some new, interesting insights that I did not have before, allowing me to discover even more and nicer Belgian beers. But what's important is of course that these in-graph indexes are fantastically interesting. By "pulling the data out", normalising even further, and then indexing the normalised data as a subgraph of its own, we can much more easily derive new and interesting insights. And that, my dear friends, is what graphs are all about :) ...
Hope this was useful. If you like this post and want to discuss more about graphs and beer, please come to our Graph Café in June in Antwerp or Amsterdam - or at a pub near you?