Showing posts with label nigel small. Show all posts

Friday, 8 July 2016

BOLTing Podcast Interview with Nigel Small, Neo Technology

A little over a year ago, I had the chance to interview my friend and colleague Nigel Small for the Graphistania podcast. It was a great conversation, but time has gone by quickly - and Nigel has been very hard at work inside Neo4j's engineering team creating exciting new functionality for the world's leading graph database. Specifically, Neo4j's Bolt protocol has been one of the fruits of Nigel's labour - so it was a good time to have another chat and get another coffee-laden conversation in.

Here's the transcript of our conversation:
RVB: 00:02 Hello everyone. My name's Rik, Rik Van Bruggen, from Neo Technology, and I'm at the coffee shop of choice near our office in London, doing another interview for the Neo4j podcast. I'm interviewing someone who I have interviewed before [chuckles] a while ago - Nigel Small from our engineering team. Hi, Nigel. 
NS: 00:27 Hello. 
RVB: 00:28 Hey, it's good to have you here. The reason for inviting you, somewhat unexpectedly, to talk a little bit on this podcast interview is that I know you've been hard at work for the past, I don't know, 18 months or so on the Bolt interface to Neo4j and the new drivers for the different development languages. I wanted to talk a little bit about that. Can you tell us a little bit about what is Bolt and what are the new uniform drivers? Let's start there. 
NS: 01:01 So Bolt is the new approach for interfacing with Neo4j, to eventually probably replace the old REST interface, but for now it sits side by side with it. It's a binary protocol, it's developed entirely in-house, and we've built a set of four drivers initially that we've released, for Java, Python, JavaScript and .NET, to try to broaden the reach of the software that we're offering in-house and offer something that's supported for those platforms. 
RVB: 01:37 Absolutely. So Bolt, from what I understand, is a binary protocol as opposed to the REST interface which is ASCII-based, I suppose. It's clear text. 
NS: 01:49 It's text based. So the old interface - the HTTP interface - uses JSON to transmit its payloads, and as a data transfer format, JSON has challenges, let's say. There are certain limitations in what you can express. So we've developed a custom serialization format and it's very much in line with the Cypher type system. It's inspired heavily, let's say, by MessagePack, which is a similar set-up, but it was developed from scratch to, as I say, work with a type system and work efficiently in the way we want to transfer data to and fro. 
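To make the contrast with JSON concrete, here is a minimal Python sketch of a type-tagged binary encoding. It borrows a few marker bytes from the MessagePack specification, the format Nigel cites as inspiration. It is not Bolt's actual wire format, and `pack` is a hypothetical helper that covers only a handful of small values.

```python
import json


def pack(value):
    """Encode a value with a small subset of MessagePack markers.

    Illustrative only: handles None, bool, ints 0..127, strings up to
    31 bytes, and lists/dicts of up to 15 items. NOT Bolt's real format.
    """
    if value is None:
        return b"\xc0"                                  # nil
    if value is True:
        return b"\xc3"                                  # true
    if value is False:
        return b"\xc2"                                  # false
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])                           # positive fixint
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) > 31:
            raise ValueError("string too long for this sketch")
        return bytes([0xA0 | len(data)]) + data         # fixstr
    if isinstance(value, list):
        if len(value) > 15:
            raise ValueError("list too long for this sketch")
        return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("dict too long for this sketch")
        out = bytes([0x80 | len(value)])                # fixmap
        for key, item in value.items():
            out += pack(key) + pack(item)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Made-up sample record, used only to compare encoded sizes.
record = {"name": "Alice", "age": 30, "active": True}
packed = pack(record)
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
print(len(packed), "bytes packed vs", len(as_json), "bytes as compact JSON")
```

Each value carries its type in a leading marker byte, so the receiver needs no quotes, colons, or braces to delimit fields. That is where the byte savings come from.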
RVB: 02:34 So what was the primary goal of Bolt? What are some of the main advantages of using it with Neo4j 3.x going forward? 
NS: 02:46 As I say, the type system is certainly one of them, so you get a much more native type system. You're sending fewer bytes to and fro, and while we haven't focused very heavily on optimization at this stage - we want to get the feature set fully fleshed out - there are a lot of optimization ideas we have going forward. Already, generally speaking, you're going to have a much faster experience with Bolt than you have done with HTTP in the past. 
RVB: 03:13 As I understand it, it makes the server mode of Neo4j, as opposed to the embedded mode of Neo4j, a lot more feasible for high performance applications, or is that not really the case? 
NS: 03:29 No, there are certain advantages. You've got a stateful session that you know you use, as opposed to HTTP which uses a stateless set-up. So each time you make a request when using HTTP, you're sending often the same set of metadata across, you're sending your user agent, you're sending your authentication information with each request that you do. With Bolt, you send that information at the start of a session and then that's used throughout, so you don't need to resend the same data. So, yeah, you do get some efficiencies. 
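The saving Nigel describes can be sketched with a little arithmetic in Python. The metadata string and queries below are made-up placeholders, and the two functions are hypothetical helpers that only count bytes; nothing here talks to a real server or reflects actual HTTP or Bolt message sizes.

```python
# Hypothetical payloads, chosen only for illustration.
METADATA = b"user-agent: example-client/1.0; auth: user:secret"
QUERIES = [b"RETURN 1", b"RETURN 2", b"RETURN 3"]


def stateless_bytes(queries, metadata):
    """Every request repeats the metadata, as with per-request headers."""
    return sum(len(metadata) + len(query) for query in queries)


def stateful_bytes(queries, metadata):
    """Metadata is sent once when the session opens, then only queries."""
    return len(metadata) + sum(len(query) for query in queries)


stateless = stateless_bytes(QUERIES, METADATA)
stateful = stateful_bytes(QUERIES, METADATA)
print("stateless:", stateless, "bytes; stateful:", stateful, "bytes")
```

With N requests in one session, the stateful approach sends the metadata once instead of N times, so the gap grows with every additional query.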
RVB: 04:03 Very cool. But as I understand it, some of the side-effects from implementing the Bolt protocol have been more on the drivers side? There has been a lot of work that you've been doing also on the uniform drivers that we've been providing with 3.0. Can you talk to us a little bit about that? 
NS: 04:23 Yes. The uniformity has been a key part of this. We wanted to provide a clean uniform experience across different languages. We picked these four to start with because they were four of the most important ones to us. 
RVB: 04:37 Which four are those? 
NS: 04:37 The four are: Java, JavaScript, Python and .NET. And we wanted to make the experience as similar as possible. So we've made sure that we've unified the use of terminology and concepts that the drivers use across the board. So we have a session, we have a transaction, we have a result set. And they're all handled and described in the same way across all the drivers. In fact, the developer manual - the new developer manual - actually has one story it tells of how to use the driver, with simply a difference in the sample code that's embedded. You can just switch the tab and it shows you the same code in different languages. 
NS: 05:19 So we wanted to get this uniformity in place, but a lot of the difficulty has been around making sure that we get the right balance between uniformity and idiomatic language use. So we didn't want something that was exactly the same in every language, but felt alien to the developer in that language. And actually getting that balance right has been quite a lot of work, it's been a real challenge at times, and we've tried to respond to feedback where developers have come to us. And we had a particular instance of that with the .NET users telling us that the methods we were using for iterating through results didn't feel natural. So we've gone back and we've reassessed how we were doing that - this was still pre-release of 3.0 - and we went back and we reassessed and we redesigned and we actually shifted the balance there much more towards idiomatic and away from uniform. I think that's been something we couldn't do entirely in-house, we needed to talk to the users in those ecosystems and have them tell us how best we could fit that in. I think now we've got something that's pretty solid and should work well in most languages. 
RVB: 06:32 In the past, the language drivers around the REST API were mostly developed by the community, right? They were mostly developed by people like yourself, with other people from the community contributing. Has that changed as well now with the uniform driver set? 
NS: 06:50 We still have some driver authors. This is interesting actually. I've been involved with doing this for about five years or so now in terms of developing drivers, and seeing some drivers that have been born and had a life and then died off somewhere, other drivers that have carried on. But now we have some official drivers, actually we have to kind of work out how we want those drivers to sit alongside the community efforts. We don't want to go along and rid ourselves of any community efforts we have because the community's very very valuable and we don't want to build all these comprehensive idiomatic features in every single language. We want to provide a base, I think, is where we've left this. We want to build a base core driver that handles all the plumbing, means you don't have to worry about the type system and the protocol detail, provides a base API on which you can build other layers, build an OGM, build other things that are specific to the language that you're in. You've got LINQ in .NET, the things that are specific to the language. So ultimately, we're hoping that the community drivers will be something that will actually sit alongside the official drivers, perhaps as a set of plug-ins or something that can extend the official driver. 
RVB: 08:19 So then the official supported drivers will be like the infrastructure for more added features, feature-rich implementations by the community? 
NS: 08:27 Absolutely. Exactly how we go about that, we haven't entirely decided yet. As I'm still running the Py2neo Project, I've got a few ideas of how we can kind of combine the official Python driver with Py2neo with the extra features that I've added in in a way that I don't have to duplicate my efforts and actually do the same thing at home that I do in the office. There are a few challenges there and how we fit that in, but we want to make sure there's room for both, I think, ultimately. 
RVB: 08:56 Very cool. Well I think people can find a lot more information on Bolt and on how to write drivers and everything online. We'll include some of those links on the transcription of the podcast. I think it's a great evolution that we're really taking this forward. So thank you very much for taking the time and spending some time with me. Thank you for the coffee is what I wanted to say as well. And in the meanwhile, I think England has scored here. I heard a big roar outside, so that's probably good news. 
RVB: 09:31 All right. Thank you, Nigel, and I'll talk to you soon.
Subscribing to the podcast is easy: just add the rss feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik

Tuesday, 19 May 2015

Podcast Interview with Nigel Small, Neo Technology

Waw. Seems like I have recorded 22 (!) podcast episodes in the past 2 months - that's pretty sweet! So here's another one that will make you smile: great conversation with the inimitable Nigel Small, aka Technige, aka Neonige. You may know Nigel from his work on the superb Python language driver for Neo4j, py2neo. What you may not know is that he was one of the original (co)inventors of Graph Karaoke, and that he is a generally super sweet and smart guy. He's currently working on some super interesting stuff at Neo's engineering team - but let's have him explain that himself:


Here's the transcription of our conversation
RVB: Hello everyone. My name is Rik - Rik Van Bruggen from Neo Technology, and here we are again recording another podcast session. Today I am joined by Nigel Small all the way from the UK. Hi Nigel.
NS: Hello Rik. 
RVB: Hey. 
NS: How are you doing? 
RVB: I'm doing very well, and you? 
NS: Yeah, not too bad. Thank you. 
RVB: The sun is shining over here. I hope it is over there as well. 
NS: It's pretty bright here as well, actually. 
RVB: Fantastic. Nigel, welcome to the podcast. We always talk about a couple short things here. The first thing is, who are you? 
NS: Well, Nigel Small [chuckles]. I joined Neo Technology last year - last August. And that was after being a groupie for about three years prior to that. I built one of the Python drivers, so I've been hanging around the community for some time, gathering users for the driver and gradually getting more and more into the database itself. 
RVB: Absolutely. Well, you know py2neo is very popular it seems, right? That's a-- 
NS: It's definitely become a lot more popular than I ever expected. It kind of fell out; it was an accident really but [laughing] it's become reasonably successful. I'm quite pleased. 
RVB: Fantastic. Would you mind telling us a little bit how you got into graphs? And why you got into graphs and, of course, why did you get into py2neo? 
NS: All right. Well, it all started due to Jim Webber. 
RVB: Oh, no. Not Jim again. 
NS: Yes. His name keeps cropping up. I worked with Jim briefly when he was consulting in a previous life, and we stayed in touch, and I remember having a conversation with him at some point about databases and him telling me that the old relational type of table-based databases were a bit passé and I needed to look at [chuckles] these graph databases. So, having no knowledge really of what these were and no knowledge of graph theory at all - it's not something I'd ever come across - I spent some time looking into it and decided to try to apply it to my family tree, which was a hobby of mine at the time. So I started looking at how I could store some of my family tree data in a graph. Python was the language which I enjoyed using anyway, from a hobby point of view. So I played around with the REST interface, which was quite new at the time. Wrote a few bits of Python code to get some data in and out and ended up getting rather distracted by the mechanism for actually putting data in and out and forgot about the family tree side of it [chuckles]. And ended up developing those bits of code into what's now py2neo. 
RVB: Oh, wow. So it's basically a wrapper around the REST API that you built, right?
NS: Exactly. Yeah. It's-- 
RVB: What's called the language driver. 
NS: Absolutely. One of the early ones. I think because the REST interface was reasonably new at that time, I was one of the early pioneers I think of writing drivers. 
RVB: It's called a guinea pig, Nigel [laughter]. 
NS: [laughter] It's been rewritten several times since to correct a lot of the errors I made in the early days. 
RVB: Oh, okay. So what do you like about working with the graph database? At the first instance, what attracted you? 
NS: I think the fact that it was something different. It was good to get my head in something that was a lot different to anything else I'd used before. I'd worked very heavily with databases for some time. I'd worked as a DBA and programmer for about 15 years prior to that. But had only ever been exposed to standard tables. So it was nice to get my head in something else and see what it was like. It was a challenge to start with. Because as I say, I knew no graph theory at all. Didn't really, at first, see quite how this was going to apply to the vast majority of data that I'd ever worked with before, because I was still thinking very much in tables. It took quite some time to undo everything that I already knew and reapply it to graphs. But now I think I'm looking around at most bits of data-- I was recently putting together some political data for a session that I'm doing and realising that it actually fits very, very naturally when you're talking about politicians who belong to a particular party and who stood in a particular election. All of those things are very much objects you can represent as nodes with relationships between them. And a graph now feels very natural for most kinds of data modelling. 
RVB: Yeah. Absolutely. Yeah. To be honest, it's funny that you mention the family tree. Hierarchies are graphs, right? I actually did a family tree of my own one day and I discovered I was Dutch. [laughter] Which was a hilarious meetup presentation actually. 
NS: Was that a good or a bad thing? I don't have much opinion on-- 
RVB: Let's talk about something else [laughter]. 
NS: Okay [laughter]. 
RVB: So let's talk about where is it going, Nigel. I mean, you've been working on some really exciting stuff at Neo. Where do you see graph databases in general and some of the work that you've recently been doing as an engineer at Neo-- where do you see that going? 
NS: Well, the work I've been doing for the past few months has been on what we've loosely branded our new remoting project, so, given that I've come in with some knowledge of drivers and the interaction between the clients and server, it's been quite nice to fall into a project that's very closely related to that, to rebuild a lot of the protocol and the client-server capabilities for the database itself. So we're looking-- 
RVB: Is that an alternative to REST then? Or what is-- 
NS: It will be eventually, yes. We're looking at something that's going to be, hopefully, well-- more performant. Something that's much more in the order of magnitude of performance that we see with the embedded database. Ultimately, yes, replace a large number of the use cases for the REST interface. I don't know whether we'll end up replacing everything, because there are still some good uses of having an HTTP interface for a very low barrier of entry. But the vast majority of applications, I think, will end up using our new protocol. And one of the things I particularly want to do is to try to level out the experience across languages.
Traditionally, Neo's been very Java-centric for obvious reasons. This is where it came from. This is its back-- but coming from a Python world I want to make sure that we've got the same kind of performance capabilities in Python and then the same in PHP and Ruby and all the other languages that we want to be able to connect to Neo. You almost shouldn't have to know that the underlying database has been written in Java. It doesn't matter what you're using - what stack you're using - you're going to find that Neo performs blisteringly fast regardless. 
RVB: So that's the first point of evolution, right? Where we're going with the binary protocol like that. That's a big new thing, right? 
NS: Absolutely. 
RVB: Any other new things that you see coming up on the horizon that you think are really exciting? 
NS: There's a lot of work going on with the big graph side of things. So not only are we making access faster, but there's a team working on scalability as well and making sure that we can add new servers, make things perform [chuckles] in a linear way, faster with each server that you add. So I think the capabilities of the platform itself are growing very, very rapidly and I think we're going to see a lot more installs. It's going to become a much more mainstream product than it has been in the past. 
RVB: Yeah, absolutely. Well, thank you so much, Nigel, for taking your time to come on the podcast. It was a pleasure talking to you. 
NS: Thank you. 
RVB: I'm sure everyone will have a chance to meet you at GraphConnect, right? 
NS: Yes. I'm going to be hovering around London on the 6th, the 7th and the 8th, so we've got-- the 7th is the GraphConnect day but on the 6th, we have an ecosystem day where I'm doing a couple of sessions to talk about the new remoting project as well. It will be good to see as many people as possible. 
RVB: Yes, super. Thanks, Nigel. Talk to you soon man. 
NS: Great stuff, Rik. Thanks very much. Bye bye. 
RVB: Bye.
Subscribing to the podcast is easy: just add the rss feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik