This release packs in so many new features that it is hard to keep track of everything. The ones that I can most easily get my head around are clearly
- multi-database support - finally, Neo4j has the concept of running multiple databases on one database server. This is a multi-tenancy solution that many of our users and customers have requested and anticipated.
- a VERY advanced schema-based security module that lets people extend the existing role-based security model of Neo4j even further - and make it crazy powerful. We'll spend a lot of time on that in this blogpost.
If you want to read up on these features, here are some good places to start:
- the manual pages on managing multiple databases
- the pages on authentication and authorisation
- a great video by Louise Söderström from Neo4j engineering:
Adding the Belgian BeerGraph as a "tenant"
I decided to revisit my dear old Belgian BeerGraph, fire up the 4.0 server, and create a couple of databases on it, among them the one with all the Belgian beers. This was super easy, following some of the instructions on the developer site:
//create database
create database beergraph;
And in the Neo4j browser I could see the new database appear:
I quickly added the indexes to this database:
//create the indexes
create index on :BeerBrand(name);
create index on :BeerType(name);
create index on :Brewery(name);
create index on :AlcoholPercentage(value);
and that made it super easy to get started:
All I needed to do was to run this query to import the beers:
//Import the beergraph
load csv with headers from
"https://docs.google.com/spreadsheets/d/1FwWxlgnOhOtrUELIzLupDFW7euqXfeh8x3BeiEY_sbI/export?format=csv&id=1FwWxlgnOhOtrUELIzLupDFW7euqXfeh8x3BeiEY_sbI&gid=0" as csv
with csv
where csv.BeerType is not null
merge (b:BeerType {name: csv.BeerType})
with csv
where csv.BeerBrand is not null
merge (b:BeerBrand {name: csv.BeerBrand})
with csv
where csv.Brewery is not null
merge (b:Brewery {name: csv.Brewery})
with csv
where csv.AlcoholPercentage is not null
merge (b:AlcoholPercentage {value: tofloat(replace(replace(csv.AlcoholPercentage,'%',''),',','.'))})
with csv
match (ap:AlcoholPercentage {value: tofloat(replace(replace(csv.AlcoholPercentage,'%',''),',','.'))}),
(br:Brewery {name: csv.Brewery}),
(bb:BeerBrand {name: csv.BeerBrand}),
(bt:BeerType {name: csv.BeerType})
merge (bb)-[:HAS_ALCOHOLPERCENTAGE]->(ap)
merge (bb)-[:IS_A]->(bt)
merge (bb)<-[:BREWS]-(br);
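The nested `replace` calls in the import query strip the percent sign and swap a decimal comma for a dot before `tofloat` converts the string. Here is a minimal Python sketch of that same clean-up, for anyone who wants to check their CSV values before importing. The function name and the sample values are hypothetical and not taken from the actual spreadsheet.

```python
from typing import Optional


def parse_alcohol_percentage(raw: Optional[str]) -> Optional[float]:
    """Mirror tofloat(replace(replace(x, '%', ''), ',', '.')) from the import query."""
    if raw is None:
        # The Cypher query skips these rows with "is not null"
        return None
    cleaned = raw.replace("%", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        # Assumption: treat unparseable text as missing instead of raising
        return None


# Hypothetical sample values, not rows from the real spreadsheet
samples = ["8,5%", "6.2%", "12", None, "n/a"]
parsed = [parse_alcohol_percentage(s) for s in samples]
print(parsed)
```

Running this over a column of raw values is a quick way to spot entries that would not turn into a usable number.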
If you have followed the beergraph story on this blog, you know that I also have an "in-graph" alcoholpercentage timeline in there. Not as useful anymore as it used to be, but I still added it:
//create the in-graph index
MATCH (ap:AlcoholPercentage)
WITH ap
ORDER BY ap.value ASC
WITH collect(ap) as sorted_ap
FOREACH(i in RANGE(0, size(sorted_ap)-2) |
FOREACH(sorted_ap1 in [sorted_ap[i]] |
FOREACH(sorted_ap2 in [sorted_ap[i+1]] |
MERGE (sorted_ap1)-[:PRECEDES]->(sorted_ap2))));
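The nested FOREACH above boils down to "sort the values, then link every value to the next one". A small Python sketch of that idea, using hypothetical values, might look like this. It only models which PRECEDES pairs get created and does not talk to Neo4j.

```python
def precedes_pairs(values):
    """Return (earlier, later) pairs linking each distinct value to the next one."""
    # The import used MERGE, so each AlcoholPercentage value exists only once
    ordered = sorted(set(values))
    # Same index range as RANGE(0, size(sorted_ap)-2) in the Cypher query
    return [(ordered[i], ordered[i + 1]) for i in range(len(ordered) - 1)]


# Hypothetical values for illustration
chain = precedes_pairs([8.5, 5.0, 6.2, 5.0])
print(chain)
```

With n distinct values you end up with n-1 PRECEDES relationships, forming a single chain from the lowest to the highest percentage.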
So there we are: the database was imported.
So far, I must say, there is not that much difference. All you notice is that there is a system database and a user database, very similar to what other DBMSs do.
Now let's get started with the schema-based security parts - which are super interesting.
Securing the Belgian Beergraph tenant
Now, if you flip through the manual and the video mentioned above, you will find that these new schema-based security features are very advanced. They are built on simple definitions of users, roles, and privileges, so getting started is easy. For some of these tasks you can use Halin to configure things graphically, but I have found that it is almost as easy to do this in Cypher. Mind you, in the Neo4j Browser you always have to select which database you are using: the system database, or your "beergraph" user database. Please take care with this, as the privileges all need to be granted in the system database.
Part 1: restricting the reading of properties
Here's the problem that I wanted to explore: in the Belgian Beergraph, we have 4 entities: BeerBrand, BeerType, Brewery and AlcoholPercentage.
So, since everyone knows I care very deeply about anything related to Political Correctness (please watch this clip if you think this is true - as it is NOT), I thought it would be fun to try and hide the AlcoholPercentage values in the graph from children. Totally realistic and completely sensible - so here goes.
First, we had to create a user in the database that would represent a child (the "childreader" user), and assign that new user a role (the "childreaderrole" role):
CREATE USER childreader SET PASSWORD "changeme" CHANGE NOT REQUIRED;
CREATE ROLE childreaderrole AS COPY OF reader;
SHOW ROLES;
GRANT ROLE childreaderrole TO childreader;
That was easy:
Then we went on to change the privileges in the graph, and with them how different users see the values of the AlcoholPercentage nodes. This is the privilege that I created:
DENY READ {value} ON GRAPH `beergraph` NODES AlcoholPercentage TO childreaderrole;
If I then ran the specific query below as admin or as reader:
MATCH (o:BeerBrand {name:"Orval"}), (d:BeerBrand {name:"Duvel"}),
path = allshortestpaths ((o)-[*]-(d))
RETURN path;
I would get the following result:
However, with the new "politically correct" privilege in place, protecting our vulnerable children from knowing the values of the AlcoholPercentage nodes, running the EXACT same query as user "childreader" would yield this result:
The difference is subtle but important: the query returns the same graph structure, but the values that the user is not entitled to see are simply omitted / greyed out. Pretty cool.
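To make the effect of the DENY READ privilege a bit more concrete, here is a toy model in Python. It is not how Neo4j implements this, just a sketch of the observable behaviour: the node is still returned, but the denied property is left out. The function name and the sample data are hypothetical.

```python
def visible_properties(label, properties, denied):
    """Return only the properties that a role may read on a node with this label."""
    hidden = denied.get(label, set())
    return {key: value for key, value in properties.items() if key not in hidden}


# Models: DENY READ {value} ON GRAPH beergraph NODES AlcoholPercentage TO childreaderrole
denied_for_child = {"AlcoholPercentage": {"value"}}

node_props = {"value": 8.5}  # hypothetical AlcoholPercentage node
as_reader = visible_properties("AlcoholPercentage", node_props, {})
as_child = visible_properties("AlcoholPercentage", node_props, denied_for_child)
print(as_reader, as_child)
```

In this model the node itself never disappears from the result. Only the lookup of its `value` comes back empty for the restricted role.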
Part 2: restricting the traversal through the graph
But here's where I think everything gets a bit more fantastic. In 4.0, we can not only restrict access to properties (or entire subgraphs, if we so desire), we can also restrict the behaviour of traversals based on schema-based security privileges. Look at this privilege:
DENY TRAVERSE ON GRAPH `beergraph` RELATIONSHIPS HAS_ALCOHOLPERCENTAGE to childreaderrole;
As you can tell, this basically allows us to say that "children are not allowed to traverse the HAS_ALCOHOLPERCENTAGE relationship", meaning that if we rerun the above query to look for all the shortest paths between two beers (Orval and Duvel) we are very likely to find a VERY different result. Look at what happened:
As you can see, the traversal all of a sudden becomes a lot deeper. Instead of Duvel and Orval being 4 hops away from one another (see the traversal above), the privilege has doubled the length of the shortest path to 8 hops. Really cool.
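Why does the path get longer? With the relationship type denied, the shortest-path search simply cannot use those relationships any more and has to find a detour. The sketch below shows this with a plain breadth-first search in Python over a tiny, made-up graph. Apart from Orval and Duvel, every node name here is hypothetical, and the hop counts (3 and 4) are those of the toy graph, not the 4 and 8 of the real BeerGraph.

```python
from collections import deque

# Tiny made-up graph as (node, relationship type, node) triples
EDGES = [
    ("Orval", "HAS_ALCOHOLPERCENTAGE", "ap_low"),
    ("ap_low", "PRECEDES", "ap_high"),
    ("Duvel", "HAS_ALCOHOLPERCENTAGE", "ap_high"),
    ("Orval", "IS_A", "TypeA"),
    ("BeerX", "IS_A", "TypeA"),
    ("BreweryY", "BREWS", "BeerX"),
    ("BreweryY", "BREWS", "Duvel"),
]


def shortest_hops(edges, start, goal, denied_types=frozenset()):
    """Breadth-first search that ignores direction and skips denied relationship types."""
    neighbours = {}
    for a, rel_type, b in edges:
        if rel_type in denied_types:
            continue  # models DENY TRAVERSE: the relationship is invisible
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None  # no path that this role is allowed to traverse


unrestricted = shortest_hops(EDGES, "Orval", "Duvel")
restricted = shortest_hops(EDGES, "Orval", "Duvel", {"HAS_ALCOHOLPERCENTAGE"})
print(unrestricted, restricted)
```

The query stays the same in both runs. Only the set of relationships the search is allowed to see changes, and the result changes with it.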
Wrap-up and conclusion
Finally, to wrap up this post, let's see how to remove these privileges again, for example to show the before/after scenarios. Revoking the read/traversal restrictions is really easy:
//in case you want to remove the restrictions and start over
REVOKE DENY READ {value} ON GRAPH `beergraph` NODES AlcoholPercentage from childreaderrole;
REVOKE DENY TRAVERSE ON GRAPH `beergraph` RELATIONSHIPS HAS_ALCOHOLPERCENTAGE from childreaderrole;
I personally was extremely impressed with the power and flexibility of both the multi-database and the schema-based security features of 4.0. Take it for a spin yourself. All of the code above is on GitHub as usual, so it should be super easy to test for yourself.
Hope this is useful.
Cheers
Rik