Friday 20 October 2017

Podcast Interview with Marco Falcier and Alberto d'Este, Neo4j Versioner

Just before GraphConnect, I thought I would publish another podcast on a subject that many people have been pondering - and even struggling with. Ever since Ian Robinson wrote about it on his blog and in the O'Reilly Graph Databases book, it's been a great topic of interest for many people in many different use cases: how can I keep track of versions in a graph? How can I look at the state of the graph at a particular moment in time, in other words, travel through time? Aleksa Vukotic presented some of this in a real-world application too, but the guests on this episode of the podcast decided they wanted a more generic solution - and so they got out their coding hats and got cracking. Here's my conversation with them:
Here's the transcript of our conversation:
RVB: 00:02.467 Hello everyone. My name is Rik. Rik Van Bruggen from Neo Technology. And I keep making the same mistake. It's no longer Neo Technology. It's Neo4j now. That's our name. And here we are recording another weekly podcast for the Graphistania podcast. And tonight, I have two people from Italy on the other side of this Skype call. And I'm really jealous of them because they're in the lovely Venice, north of Italy. And that's Marco Falcier and Alberto d'Este. Hello guys.
MF: 00:35.840 Hi. 
Ad'E: 00:36.521 Hi Rik. 
RVB: 00:37.355 Hey, thanks for coming online. I really appreciate it. 
MF: 00:40.151 Thank you for having us here. 
RVB: 00:42.336 Fantastic. So guys, we got introduced by Mark Needham who runs our online meetup and he showed me some fantastic stuff about what you guys have been doing with Neo4j. We're really impressed and we'll talk more about that. But before we do, would you mind introducing yourselves? Who are you and what do you do? What's your relationship to the wonderful world of graphs? 
MF: 01:05.175 Okay. I start that I am Marco Falcier. I am a software engineer at Pixartprinting. And I've been introduced with Neo4j and the graph database two years ago during a Van Gogh meetup with a sort of art and graph database topic that was introduced by Lorenzo Speranzoni from LARUS (note: I interviewed Lorenzo back in 2015 for the podcast). And yes, I was of course in love with graphs because I fell in love with art, with the graph because they were quite a lot expressive and the every daytime situation was something that I could really transpose on graphs as well. So this was a, yes, a love at the first sight [laughter]. 
RVB: 02:02.158 Exactly. 
Ad'E: 02:02.879 Exactly. 
RVB: 02:04.054 Fantastic. What about you, Alberto? 
Ad'E: 02:06.234 So I am Alberto d'Este, as you said. I am a software developer at the Gruppo PAM, which is an Italian retail company. And I get in touch-- I can speak for only local, so with graphs, while developing JDBC connector for Neo4j, I also work at LARUS together with Marco. At this period of time, we were developing such software and I also fall in love with graphs because, as Marco said, it's a way you think about real things, but also the way you draw it. When you think of something, you start drawing things, and then you connect them, and that's a graph. And if you also just convert it to Cypher for this, you've done your software interpretation, so it's really crazy way of developing for me. 
RVB: 03:01.955 Absolutely. I always talk about WYDIWYS, what you draw is what you store. Right [laughter]? 
MF: 03:06.727 Yep. 
Ad'E: 03:07.300 Yeah, exactly. 
RVB: 03:09.579 That's the way it is, right? So and you guys worked on the JDBC connector? Is that what I heard? That was when you worked at LARUS, right? 
MF: 03:15.829 We both worked for the Bolt module of the Neo4j JDBC driver while we were in LARUS of course. 
RVB: 03:24.520 Oh, okay, very cool. I think there are a lot of people using that. That's really interesting. 
MF: 03:28.147 Well, we hope so [laughter]. 
RVB: 03:29.079 Yeah, yeah. No, but seriously I mean, it's the way that lots of people integrate their legacy tooling or older relational tooling with Neo4j, right? Because those older tools, they usually know how to speak JDBC. So hey, that's a really easy way to make the bridge, right? 
MF: 03:47.759 Yep. Between us and the future [laughter]. 
RVB: 03:51.099 Exactly. Exactly. This is actually a very good segue because past and the future, that means different versions, right? 
MF: 03:59.420 Yes, of course [laughter]. Since we make [inaudible]. We also love bridges, so [laughter]. 
RVB: 04:05.075 I see. Yes, exactly. But you guys worked-- I mean, this was-- lots of people have asked us this, right? So how do I deal with time in graphs and evolving graphs and versions of graphs and those types of thing? And I always refer people to articles written by Ian Robinson and other people in the Graph Databases book that really explain how to do that from a modelling perspective. But as I understand it, you guys have written some software now, some APOCs, some procedures that help with that. Do you mind explaining a little bit of that to us? 
Ad'E: 04:41.247 Oh, yeah. So we started the developing, thinking up the entity state model. And it's like you have a node, your a node. The one you want to version, which is going to be the entity. And all the properties you're going to version is you make a snapshot and you store the snapshot in a connected status node. And you can repeat this as many time as you want. To do a new version you do a new snapshot, a new state, and so on. We also thought out the performances when doing this, so we develop the two ways of connecting the entity node with the state node. But I want Marco to explain this a little bit more. 
MF: 05:38.118 So yeah, basically what we did with the relationships and entities and states was creating two different models merged in one. So basically, for example, every entity node got one state or more state, which is some kind of relationship that came after graphically like a flower. And also, all state node are connected together with a link at least, which improve in some kind of our performance depending on which procedure you are going to code. For example, we will have some punctual query for one node or some other queries for retrieving, I don't know, a list of the states for example.
Ad'E: 06:36.954 This state and the previous and the previous and the previous and the previous. 
MF: 06:39.341 Yes, a typical linked list. 
RVB: 06:41.918 Interesting. 
MF: 06:42.171 Also, we created some rollback situations, so adding another type of relationship which also connects states together. So we will never delete states, but we're always adding new ones. So, of course, people can have the whole history of the entity and the states. 
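(note: to make the model described above a bit more concrete, here is a minimal in-memory sketch in Python. It is not the Versioner's code and does not use its procedures; the class and field names such as `Entity`, `State`, `previous` and `rolled_back_to` are made up for illustration. It only assumes what Marco and Alberto describe: an entity connected to all of its states, states chained as a linked list, and a rollback that adds a state and an extra link rather than deleting anything.)

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    """Snapshot of an entity's versioned properties (hypothetical name)."""
    props: dict
    previous: Optional["State"] = None        # linked-list edge to the prior state
    rolled_back_to: Optional["State"] = None  # extra edge recording a rollback


@dataclass
class Entity:
    """The node being versioned; it keeps an edge to every state (the 'flower')."""
    name: str
    states: list = field(default_factory=list)  # all states, never deleted
    current: Optional[State] = None

    def update(self, props: dict) -> State:
        # A new version is a new snapshot linked to the one before it.
        state = State(props=dict(props), previous=self.current)
        self.states.append(state)
        self.current = state
        return state

    def rollback(self) -> State:
        # Assumption: rolling back adds a state that copies the previous
        # snapshot and points at it; no existing state is removed.
        if self.current is None or self.current.previous is None:
            raise ValueError("no earlier state to roll back to")
        target = self.current.previous
        state = State(props=dict(target.props), previous=self.current,
                      rolled_back_to=target)
        self.states.append(state)
        self.current = state
        return state

    def history(self) -> list:
        # Walk the linked list: this state, the previous, the previous...
        out, state = [], self.current
        while state is not None:
            out.append(state.props)
            state = state.previous
        return out
```

Calling `update` twice and then `rollback` on an `Entity` leaves three states attached to it, with the newest one carrying the properties of the first.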
RVB: 07:07.086 So does that allow them to travel through time and look at the state of a graph at a certain point in time? 
MF: 07:13.270 Exactly, yes, exactly. So, basically, right now the core module which is the Neo4j graph versioner is time-based. So we're thinking about creating different integration for it. For example making something different, not only time-based, but I don't know, like a sequence or something else because this way it will be more flexible for other kind of situation. I don't know, maybe for something like storing events on a domain-driven environment like an event store or something like that. 
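(note: a rough Python sketch of what "time-based" lookup can mean, assuming each state simply records the moment from which it is valid. The `TimeVersioned` class and its methods are hypothetical names for illustration, not the Versioner's API.)

```python
from bisect import bisect_right
from datetime import datetime


class TimeVersioned:
    """Keeps snapshots ordered by the moment each one became valid."""

    def __init__(self):
        self._starts = []  # start keys, strictly ascending
        self._props = []   # snapshot valid from the matching start key

    def update(self, props, at):
        # Assumption: versions are always appended in increasing order.
        if self._starts and at <= self._starts[-1]:
            raise ValueError("versions must be added in increasing order")
        self._starts.append(at)
        self._props.append(dict(props))

    def state_at(self, moment):
        # Find the last snapshot whose start is not after the given moment.
        i = bisect_right(self._starts, moment)
        return self._props[i - 1] if i else None


product = TimeVersioned()
product.update({"price": 10}, at=datetime(2017, 1, 1))
product.update({"price": 12}, at=datetime(2017, 6, 1))
snapshot = product.state_at(datetime(2017, 3, 1))  # the January snapshot
```

Nothing in the sketch depends on the keys being dates: any ordered key, such as an integer sequence number, would work the same way, which is roughly the kind of flexibility Marco mentions.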
RVB: 07:59.813 Well, I mean, what I'll do is I think in the transcription of the podcast, I'll include a link to the online meetup of you guys on the topic and a couple of links [crosstalk] repos and stuff like that. 
Versioner Core 
Versioner SQL
And then people can start looking at it because I've heard so many people ask for it. I think this is a really fantastic contribution that you guys are making. So could I ask a little bit, why [laughter]? Why did you guys-- well, you already mentioned why you got into graphs, but why did you get into this versioning topic? Why is that so important to you? 
MF: 08:33.413 So, basically, we were really curious about start developing procedures. So first of all, it started like an experiment for us. So I wanted really to create something that works inside Neo4j, not just using Neo4j. Not a web application or some REST API, for example. And then we just figured out that there were different data model implemented with the procedure or server extension, for example. But versioning wasn't one of them. So we just took one simple model, which is the entity state one, and said, "Okay. Are there any real use cases for that?" And there were entity states everywhere. Like graph are everywhere, entity states model were everywhere. So we picked that topic and started developing it. 
RVB: 09:37.743 That's good. This is just after hours, right? You're doing this in your own free time? 
MF: 09:42.349 Yes, yes. 
Ad'E: 09:42.984 Of course. Free time, of course. 
MF: 09:45.035 Nights and late afternoon time [laughter]. 
RVB: 09:48.602 Graphs are everywhere and they're addictive [laughter]. 
MF: 09:52.145 Yes, yes. 
Ad'E: 09:52.459 Oh, yeah. Yeah, a little addictive. 
RVB: 09:54.998 Very much so. Once you get into it, it's like they're everywhere really, right? Because you can't get them out of sight anymore. So, well, what's your plan with this? What's your plan with graphs in general, but also with this, the versioner that I can see [inaudible] already. What does the future hold there? 
Ad'E: 10:15.350 Actually when doing the online meetup with Marco, we found that we were missing a versioning of relationships. So probably this is going to be a topic we are going to manage. It's going to be new. And we also find it useful for the development of the other sub-projects that generated from this which is the SQL versioner database importer's tool, which allows you to look at your database in a graph way. And it also would benefit of the versioning of relationships. 
RVB: 10:55.476 Also there's a version of this for relational database as well? 
Ad'E: 10:59.469 No, it's a sub-project. There are the main core API, which is the graph versioner core, which is the project we talked about and we're implementing the entity state model and so on, unlike the graph versioner tool which is using the core versioner but allows you to do much more things database related, like import past graph database. 
MF: 11:29.972 It's like an expansion of the [crosstalk] developed. 
RVB: 11:37.346 Oh, wow I didn't know that. Okay. 
MF: 11:37.779 So, basically, this one and maybe-- 
Ad'E: 11:41.348 You also find your sources on GitHub. 
MF: 11:42.847 Yes, yes. They're both on GitHub on our organisation page. Yeah. 
RVB: 11:46.445 Oh, cool. Sweet. Any other huge projects on the horizon or is this keeping you up late enough? 
MF: 11:57.318 Ideas are everywhere. So both on my company, we looked at Neo4j and the graphs and all my colleagues were really excited to work with that. So who knows? 
RVB: 12:11.706 That's good to hear. Okay, fantastic. Well, what we'll do with this podcast, we'll make a webpage with links to both the project and the documentation and maybe some additional capabilities as well. And then I look forward to hearing more. Maybe you guys should do some talks around this at conferences and stuff like that. That would be great. 
Ad'E: 12:36.899 Oh, we'll do great. 
MF: 12:38.366 It will be great and we'll think about that. 
Ad'E: 12:40.421 Of course [laughter]. 
RVB: 12:41.680 All right, guys. Thank you so much for coming online. You know that we want to keep these podcast recordings fairly short. So I want to thank you for coming online and doing this and I look forward to meet you guys face-to-face sometime.
MF: 12:56.089 Thank you. We hope so. 
Ad'E: 12:56.630 We're grateful for the opportunity. 
MF: 12:57.308 And thank you Rik for having us. 
RVB: 12:58.915 Thanks guys. Talk to you later. Bye. 
MF: 13:01.058 Bye. 
Ad'E: 13:01.604 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
