Friday, 26 January 2018

Podcast Interview With Konstantin Lutovich, Neo4j

Just before 2017 came to an end, I read this blog post by my colleague Konstantin about how we were including an asynchronous API in our Java driver. That caught my attention: I have always found the non-blocking characteristics of asynchronous systems very interesting for lots of different use cases. Just think of the difference between email (async) and phone (sync) communication - there are great use cases for both of them, right?

So I decided to try and get Konstantin on the phone, and he very graciously accepted. We had a great chat - which is what this next podcast episode is all about. Hope you enjoy!


Here's the transcript of our conversation:
RVB: 00:00:03.511 Hello everyone. My name is Rik Van Bruggen from Neo4j and here I am again recording another episode to our Graphistania Neo4j podcast. And this morning I've got a wonderful colleague of mine in Malmö, Sweden, on the other side of this Skype call and that's Konstantin Lutovich. Hi, Konstantin. 
KL: 00:00:23.786 Hello, Rik. Thanks for having me here.
RVB: 00:00:28.410 Thank you for coming online. I appreciate it. Konstantin, you've been a software engineer at Neo4j, I think, for a couple of years, but you triggered my attention in a very peculiar way, right [laughter]? It was an email that you sent out announcing a new version of one of the Neo4j drivers that exposed some async APIs. And I told you this story about one of my old professors at university that basically said that "async works and sync does not". And I wanted to have a chat with you about that. Is that okay? 
KL: 00:01:06.339 Yeah, absolutely. So-- 
RVB: 00:01:06.550 So maybe you can introduce yourself first a little bit. Who are you, what do you do, and how you got to the wonderful world of graphs? 
KL: 00:01:16.366 Yep, absolutely. So my name is Konstantin. I've been a software engineer at Neo4j for slightly more than three and a half years, I think. I've spent most of my time, like two years or something, working in the kernel team. That's the team that's responsible for transaction handling, store files, indices. I also did a bit of HA-clustering maintenance there. So then I spent a short amount of time in the Cypher team, working on initial version of the compiled runtime, where we tried to generate Java code and byte code for query execution, which proved to work and give quite a significant performance improvement. And right now, I'm a member of the drivers team and I'm mainly responsible for Java driver, JavaScript driver, and the server-side component which handles network connections, Bolt, and everything. 
RVB: 00:02:17.326 That's a pretty impressive CV, my friend. You've been around [laughter]. 
KL: 00:02:22.174 Yeah. I find it quite fascinating to rotate once in a while and experience different parts of the company. 
RVB: 00:02:30.100 Yeah, absolutely. But it's like going from the bowels and the internals of the kernel all the way to basically one of the product [surfaces?], right? The thing that end users interact with, right? 
KL: 00:02:40.911 Yep. Yeah, that's correct. So in kernel team, everything is quite low level. And in drivers team, we are actually responsible for the surface that users interact with, so we get to decide what the API shape will be and everything. And also one would never experience JavaScript in the kernel team which is quite a special thing. 
RVB: 00:03:08.373 Good and bad, maybe. 
KL: 00:03:08.861 [crosstalk] drivers. Yep. 
RVB: 00:03:11.500 Hey, Konstantin, but you triggered my attention with this new async API in the Java driver, I think, right? What's that all about? Can you tell us a little more about that? 
KL: 00:03:21.473 Yep, absolutely. So before 1.5, the only driver that provided an async API was the JavaScript driver, and that's only because that's the nature of the language and that's the only way to do IO. So for 1.5 release, we added an async API to C# driver, which is actually really good with handling asynchronicity. And the language itself has really good support for async programming. And we're almost done adding support for the async API in the Java driver. We've released 1.5 RC release so the async API was quite a popular request because people are using async frameworks and async systems like Scala actors and Spring Reactive repositories. 
RVB: 00:04:22.546 And what does that offer, actually? What does the async functionality bring to a layman user? Can you explain that a little bit more? 
KL: 00:04:31.589 Yup. So I think with the async API, users are able to achieve more throughput with less resources, so they will need a smaller amount of threads to handle a large amount of network connections and interactions with the network, so less amount of threads will be able to execute more queries in parallel. 
RVB: 00:04:57.314 And that's why I-- basically because in a async paradigm, you can fire something off, some kind of a request, and just be notified when the response comes back instead of having to constantly monitor when the response comes back. Is that fairly accurate? 
KL: 00:05:13.663 Yup. That's pretty accurate. So thread will be able to execute the query and then it will be notified when query is executed and the response came back. So the thread won't have to be blocked, actually, waiting for a response to come back. 
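To make the non-blocking idea from this exchange a bit more concrete, here is a minimal Java sketch using only the JDK's `CompletableFuture`. It does not use the Neo4j driver: `runQueryAsync` and the "result of ..." strings are hypothetical stand-ins for a network round trip, and the 100 ms delay is an arbitrary assumption. The point it tries to show is that a single thread can have many requests in flight, because it gets notified on completion rather than waiting on each one.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class AsyncQuerySketch {

    // Hypothetical stand-in for sending a query over the network.
    // The future is completed later by the timer, so no thread sits blocked
    // while the simulated response is "in flight".
    static CompletableFuture<String> runQueryAsync(String query, ScheduledExecutorService timer) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> result.complete("result of " + query), 100, TimeUnit.MILLISECONDS);
        return result;
    }

    // Fires off all queries first, then collects the results in submission order.
    static List<String> runAll(int queryCount) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> pending = IntStream.range(0, queryCount)
                    .mapToObj(i -> runQueryAsync("query-" + i, timer))
                    .collect(Collectors.toList());
            // join() is only used here, at the very end, to gather results for the demo.
            return pending.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = runAll(50);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(results.size() + " simulated queries finished in about "
                + elapsedMs + " ms using one timer thread");
    }
}
```

In a blocking style, 50 such requests would need either 50 threads or 50 sequential waits; here one timer thread is enough because all the simulated waits overlap.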
RVB: 00:05:30.621 I mean, I was telling you about it earlier, right? It's this programming paradigm that I think has been really popular in like highly available telecoms and mainframe systems over the years. A very robust way of working, I think, right? 
KL: 00:05:48.833 Yup. 
RVB: 00:05:49.705 It seems like it. 
KL: 00:05:50.840 So it gained a lot of mainstream popularity in the last three, even five years, I think. 
RVB: 00:05:57.590 Yup, very cool. So how did you get into Neo4j, actually? What's your history with working on graphs and graph databases? So anything that you want to call out there? 
KL: 00:06:11.760 So my main exposure to graphs happened when I joined Neo4j. I did not have any production experience with graphs or graph databases or Neo4j before joining Neo. But I like it so far and I really like the model, the graph data-- database graph model and the expressiveness of this model. So I always felt like when you try to put your data in some other kind of database, you always have to kind of tweak your data to make it fit the model. But with graph databases, you can just align your data and it lays properly on the graph model which I find quite nice, and especially-- this is really hard with relationships in the data. So I guess, it's pretty easy to fit any entity in any kind of database but relationships are really hard. 
RVB: 00:07:16.942 Could not agree more, absolutely. And, now, you've been around and working on different parts of the product. Right? So what's the coolest part that you've worked on, so far. What do you like about the different parts? 
KL: 00:07:30.868 Right. So I really like kernel team and the fact that kernel team works on the really low-level stuff. So I guess, it's quite uncommon for Java developer to actually experience this kind of low-level programming. It's not a lot of companies that do this. Mostly, I guess, Java is about, like, enterprise kind of applications and it's quite rare to be involved with this kind of lowest level of tooling and platform.
RVB: 00:08:03.239 What you're saying is you're a real geek, right [laughter]? 
KL: 00:08:06.167 Yeah. I guess so [laughter]. 
RVB: 00:08:09.835 In a good way. In a good way. Right [laughter]? 
KL: 00:08:11.096 Yep. Yeah, yeah. At the same time, in the drivers team you get to experience all this kind of modern technologies, and you get to feel where the world is going, like, the world is going async. So you get to experience this immense JavaScript infrastructure and the amount of frameworks and the amount of technologies that arise there. And you get to feel how people want to interact with Neo4j and which kind of APIs they prefer and they would like to have. 
RVB: 00:08:47.269 Any kind of bizarre requests that you've had recently from end users where you thought that, "Wow, that's really something that I didn't expect"? 
KL: 00:08:57.272 I didn't have anything like that recently, but I definitely have one prominent example. So at some point, we've received a request to remove locks from the database because users were experiencing deadlocks. And they said, "Yeah, we do not want those deadlocks. So let's just remove locks." So that was quite an interesting request. 
RVB: 00:09:23.184 Yeah. Well, I can see how you would not want deadlocks, but that's not really the [laughter]-- that's not really the-- 
KL: 00:09:28.170 Yeah. That's probably not the best way to get there. Yeah. 
RVB: 00:09:32.339 Yeah. Exactly. Okay, okay. Well, I think that will have led to a good conversation [laughter]. 
KL: 00:09:39.261 Yeah. That's correct. 
RVB: 00:09:40.523 So, Konstantin, we always end these podcast recordings with looking at the future. So I was just wondering, where do you see this going? Where do you see your work on the drivers going? Where do you see our industry going? What's in your crystal ball? 
KL: 00:10:00.314 Yep. So I think with drivers, we'll be going more and more towards async and reactive. We'll definitely need to adjust the Bolt Protocol for better support of async and-- asynchronicity and reactivity. We will also-- 
RVB: 00:10:19.579 Is that something that's already in the works, you think? Or is that being planned? 
KL: 00:10:23.351 Yep. That's being planned, too, and I think we will try to address some parts of this in upcoming 2.0 release of the drivers, like a major release of drivers. So there, we'll hopefully be able to change protocol to support more async-related features like, for example, back pressure. 
RVB: 00:10:43.716 Okay, yeah. Absolutely. 
KL: 00:10:46.864 So I think drivers will be used more and more to process large results or large streams of data. And for these, we'll definitely need something like reactive streams or reactive APIs. As for the whole Neo4j, I guess we're going towards a platform solution. So Neo4j will not be just a standalone database but, instead, will provide like a framework. And within this framework, users will be able to solve their complicated data issues like storing this data, querying it, processing it to perform analytics, and figuring out insights from this data. 
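For readers unfamiliar with back pressure and reactive streams as mentioned above, the following is a rough sketch using the JDK's own `java.util.concurrent.Flow` API (Java 9 and later). It is not the Neo4j driver API or the Bolt protocol, and the class name and "record-N" items are made up for illustration. It only shows the general idea: the consumer tells the producer how many items it is ready for, so a large result stream cannot overwhelm it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

class BackPressureSketch {

    // A subscriber that asks for exactly one record at a time.
    // The publisher may not deliver more than has been requested.
    static class OneAtATimeSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // signal demand for the first record
        }

        @Override
        public void onNext(String item) {
            received.add(item);      // "process" the record
            subscription.request(1); // only now ask for the next one
        }

        @Override
        public void onError(Throwable error) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes recordCount hypothetical records and returns what the subscriber saw.
    static List<String> consume(int recordCount) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < recordCount; i++) {
                // submit() blocks if the subscriber's buffer is full, slowing the producer down
                publisher.submit("record-" + i);
            }
        } // close() signals onComplete once buffered records are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the stream to finish", e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println("Received " + consume(10).size() + " records, one request at a time");
    }
}
```

Requesting one item at a time is the simplest possible demand policy; a real consumer would more likely request in batches to reduce signalling overhead.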
RVB: 00:11:30.968 Exactly. That's what we're driving for. Right? I mean, we made some big announcements on that at the last GraphConnect conference. So 2018 will be a big year for that. 
KL: 00:11:41.027 Yep. I hope so. 
RVB: 00:11:42.064 Absolutely. All right. Konstantin, thank you so much for coming on the podcast. We'll wrap up here but, as usual, we'll put a bunch of links in the podcast transcription. It's been great having you as our guest. And thank you so much for being here. 
KL: 00:11:57.264 Yep. Thank you so much for your time, Rik. Thank you. 
RVB: 00:11:59.929 Yeah. Absolutely. And I'll look forward to seeing you, probably in the new year [laughter]. 
KL: 00:12:05.611 Yep. 
RVB: 00:12:06.326 Absolutely. Thank you. 
KL: 00:12:06.765 Happy New Year. Merry Christmas and Happy New Year. 
RVB: 00:12:09.122 Same to you. Bye. Cheers. 
KL: 00:12:10.756 Bye-bye. 
RVB: 00:12:11.428 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
