Tuesday 10 April 2018

Podcast interview with Johan Teleman, Neo4j

I had a great time chatting with my colleague Johan Teleman recently. Johan works on the Neo4j engineering team in Malmö, and has been doing some great work - on Cypher performance, among other things. As it turns out, there's a LOT that has been done already (look for some spectacular stuff in Neo4j 3.4), but there are plenty of interesting plans for the future as well. Here's our chat:


Here's the transcript of our conversation:
RVB: 00:01:39.301 All right. Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and here I am again recording another episode for our Graphistania podcast. And today I am very happy to have one of my Malmö colleagues on the other side of this call. That's Johan Teleman. Hi, Johan. 
JT: 00:01:59.902 Hi, Rik. Happy to be here.
RVB: 00:02:01.526 Hey. That's great to have you here. I know we had some scheduling issues [laughter], but thank you for taking the time. Johan, most people don't know you yet, so I know you work in Neo4j engineering, but you probably want to introduce yourself a little bit. Who are you? What do you do? And what's your relationship to graphs? 
JT: 00:02:19.527 Yeah. So I'm currently heading the Cypher performance effort in Neo4j on the engineering side. And that, essentially, means that we try to take the Cypher that we all know and love, and we want to make it faster in all kinds of ways. And, yeah, that's about it. That's what I do [laughter]. 
RVB: 00:02:45.346 That's been a long-standing project, right, to make Cypher faster again? 
JT: 00:02:51.304 It always is supposed to get faster. But we have a long-term effort. This has been going on since last whatever, this project since last Easter, I think, and it's going to go on for a very long time. Otherwise, we try to mix it up, and we do some spikes here and there, but this is really from the ground up, supposed to be fast. 
RVB: 00:03:19.468 Well, I remember Cypher from the early days when it didn't have write capabilities yet [laughter]. 
JT: 00:03:25.861 Yes. 
RVB: 00:03:26.422 So it has definitely come a very long way. But what are some of the most recent things that you guys have been working on? Can you help us understand that a little bit better? 
JT: 00:03:37.928 Yeah. So the interesting thing about performance is that all the things we're working on are invisible to the outside, except that it's faster. But we did a very big rework of the value system under the hood for 3.3, which improved throughput in highly concurrent situations by quite a lot, something like 40%. 
RVB: 00:04:03.438 And, Johan, what's the value system? 
JT: 00:04:05.736 Yes, the value system. That's like all the things which are interesting. That's your data. Well, it's not the nodes, and it's not the relationships, but it's everything else, all the things you can put as properties, as property values, and also things which you can have in Cypher, like lists, and maps, and paths. 
RVB: 00:04:26.016 So if I ask a question-- if I do a Cypher query, and I get a result set back, that result set is part of the value system or is using the value system? 
JT: 00:04:34.622 Yes. Yes, like all the things, all the nodes and the-- yeah, well, you actually get back node values, so to speak. Yeah. If you don't have values, then you have a pretty boring query, I would say. 
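A quick note from me, to make that a bit more concrete: a single Cypher query can return several kinds of values at once - node values, property values, lists and maps - and all of those live in the value system that Johan is describing. Something like the query below, for example. The :Person label and the :KNOWS relationship are just made up for the illustration, of course.

MATCH (p:Person)-[:KNOWS]->(friend:Person)
RETURN p AS node,                        // a node value
       p.name AS name,                   // a string property value
       collect(friend.name) AS friends,  // a list value
       {name: p.name} AS profile         // a map value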
RVB: 00:04:48.148 So, and that was in 3.3? I think you guys called that the slotted runtime, right? Is that what it was? 
JT: 00:04:54.565 Yeah, kind of. The slotted runtime is related, but it's not quite that. In the slotted runtime, which is a sort of intermediary runtime that sits like-- it's not as fast as compiled, but it's broader, so it covers more queries. And there we have an improvement where we essentially change the representation of the rows in Cypher from being maps to being arrays. And this makes it way faster. 
RVB: 00:05:26.755 I think I got it mixed up, right? You did the compiled runtime before, and now you're working on the slotted runtime. That's true, right? 
JT: 00:05:33.412 That's true, yes. 
RVB: 00:05:34.286 And slotted runtime would be part of the 3.4 release, which is currently going through its alpha stages, right? 
JT: 00:05:41.084 Yeah, exactly. We got a little bit of slotted into 3.3, but not so much. But in 3.4, it's going to be covering all possible Cypher queries. 
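Another quick note from me here: if you want to experiment with the runtimes yourself, Cypher lets you pick one with a query prefix, so you can compare the slotted runtime against the default on your own workload. If I remember correctly, the slotted and compiled runtimes are Enterprise Edition features, so your mileage may vary on Community. The query itself is just a made-up example:

CYPHER runtime=slotted
MATCH (p:Person)-[:KNOWS]->(friend:Person)  // made-up example data model
WHERE p.name = 'Johan'
RETURN friend.name AS friend

Running PROFILE on the same query with runtime=interpreted and then with runtime=slotted is a nice way to see the difference for yourself.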
RVB: 00:05:50.831 And what types of performance improvements are we looking at? Any guesstimates yet? 
JT: 00:05:57.087 Yeah, the best guesstimate currently is a 70% increase in throughput in a sort of realistic, concurrent query situation. 
RVB: 00:06:07.358 Oh, wow. This is being recorded. You know that, right? 
JT: 00:06:09.772 Yeah, sorry [laughter]. 
RVB: 00:06:13.039 Very cool. So, Johan, super interesting, but why did you get into it? What's driving your effort in graphs, and how did that come about? 
JT: 00:06:25.050 Yeah, actually, I don't know. I mean, I listened to some of the other podcasts, and people keep saying that, "Oh, that fantastic graph model," and, "I completely love it," and so on. But for me, it wasn't really like that. For me, it was more a question of a very attractive company with very competent people, and engineers, and sort of nice place to be, and then I sort of took graphs as a side dish to that [laughter]. But then as far as it comes to what I'm doing right now, I did a bit of security work for 3.1, and then I did some composite index work for 3.2. But I think working with compilers and performance, these are some extremely core computer science and programming parts, and they're super fascinating and complicated. 
RVB: 00:07:18.704 What you're saying is you're a hardcore nerd that wanted something difficult to do. 
JT: 00:07:22.867 Yes [laughter]. Of course. It's not fun if it's easy. 
RVB: 00:07:27.594 Yes, exactly [laughter]. Very, very cool. But have you done any experimentation on your own with graphs, or has it mostly been from the engineering side? 
JT: 00:07:39.489 I don't think I ever worked with graphs; I just build the database [laughter]. 
RVB: 00:07:46.327 Let us work with it, and we'll give you feedback. That works, too, right? 
JT: 00:07:51.664 Yeah, that's perfect. I mean, if you ask me to write a Cypher query, I get really stumped. It's [laughter]-- 
RVB: 00:08:00.177 You can come to one of my courses if you want. 
JT: 00:08:02.176 Oh, thank you [laughter]. 
RVB: 00:08:04.872 And then what's the most exciting thing that you've been working on? Is it the Cypher performance stuff that you're doing now, or what part of the engineering has attracted you most? 
JT: 00:08:16.813 Yeah, I think the performance work is really fascinating, and I think we've been doing a lot in terms of tooling and understanding performance and sort of what's fast and slow. And we have some very ambitious plans and things we want to get out of cutting-edge computer science research and try out and see if that works and so on. I think that's, for me, the most interesting part that I've been working on. 
RVB: 00:08:44.150 You're teasing me now. I want to talk a little bit about the future, so why don't we talk about that a little bit now? What's the story? How do you look at the future, and what are some of the exciting things that you want to be working on? 
JT: 00:09:02.753 Yeah. I think a very important part of what we're going to go for as soon as we get 3.4 ready is to start working on parallel execution of Cypher. And that means that one query will be executed over multiple cores, if they're available, but also that a single thread or core will execute multiple different queries via timesharing. That's a sort of goal, and that will greatly help the server cope with really massive query loads but also with more analytical queries which need more CPU. 
RVB: 00:09:45.214 Is that similar to what other distributed systems do? Like if you look at, like, map/reduce and those types of things. That's like a distributed computing framework, right? Is that what you're thinking with parallel Cypher? Is that something similar, or is it anything related? 
JT: 00:10:05.210 Well, I think it's a-- I mean, we don't want to distribute in that sense, I think, over multiple machines. But rather, we want to sort of use-- because Neo4j is really good at using the resources it has available. We're known to require way less hardware than many other systems, and we want to keep it that way. We want to be as good and as performant on a sort of low-spec environment as possible, but still, it would be very valuable if we could sort of take the cores we have, and we could apply all of them in the best way possible right now. So that's sort of it. And then also to handle very massive-- like thousands of concurrent transactions, for example, is not that easy to handle at the same time. You can't really execute all of them in parallel if you only have some fixed number of 64 cores or whatever. 
RVB: 00:11:05.617 Great. I think you still have a job to do [laughter]. 
JT: 00:11:09.201 There's a little bit left, yeah. 
RVB: 00:11:11.456 A little bit of work left. I'm looking forward to seeing the results. And, for now, I think what we'll do is we'll post a couple of links to some of the performance improvements that we've had in the past couple of releases on the transcription of the podcast, and I'm sure that people will be excited to learn more as things roll out. Thank you so much for helping us know more about that, Johan. 
JT: 00:11:41.328 Yeah. Thank you for having me. 
RVB: 00:11:43.814 Fantastic. Thank you so much. Have a good one. 
JT: 00:11:46.168 Right. Bye. 
RVB: 00:11:47.529 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
