Bruggen Blog: 2018

Friday, 21 December 2018

Podcast Interview with Emil Eifrem, Neo4j

We've been around the block a few times in this podcast. It has been a fantastic experience to develop this thing from a crazy idea at a booth at QCon (thanks again, Michael!) into something that has really gained a life of its own. I for one still enjoy making them time after time - I hope they are a good listen/read as well.

As part of that journey, I have of course also interviewed our "Fearless Leader" and "Chief Instigator", aka Emil Eifrem, a few times. We talked in the summer of 2015, and then again around this time of the year in 2016 - and swore expensive oaths not to let another episode be too long after. And we failed miserably. So it felt right to do a longer, more elaborate episode now, which we actually published with video on YouTube too:

Of course the Podcast feed and Soundcloud have the same show episode too:

Here's the transcript of our conversation:

RVB: 00:00:54.475 Hello, everyone. My name is Rik, Rik Van Bruggen, from Neo4j and here I am recording another podcast episode. And I think the other person on the other side of this call is going to agree that it's been way too long.

Podcast Interview with JEP, the Graph Database

Alright - here's something special for you. For the past couple of days, I have been listening to - no, I have been DEVOURING all the episodes published in the "Everything is alive" podcast series published by Radiotopia. It is such a great show. Funny, interesting, sad, thoughtful, and ... inspirational. Because - what would happen IF EVERYTHING WAS ALIVE??? I could not not think about that - and specifically, I thought about Neo4j instances... what if THEY were alive? What if I could interview a real, live Neo4j instance - what would that be like???

I decided to find out. Here's my lovely chat with imaginary JEP, the Graph Database. Turned out to be a fine chap, really. Here it goes:

Here's the transcript of our conversation:

RVB: Good morning everyone, my name is Rik, Rik Van Bruggen from Neo4j - and today I am doing a really special episode of the Graphistania podcast series. It's an episode that I got the inspiration for by listening to Everything is Alive, a podcast hosted by Radiotopia. They did some amazing work interviewing some of the most interesting characters ever - and I would love to continue that tradition here today on Graphistania.

Podcast interview with Will Lyon, Neo4j

Been a while since I have been able to publish more podcast episodes - sorry about that. Will try to keep up a regular pace - but no guarantees. However, I must say that the conversation that I had with my colleague Will Lyon made me think that I really should keep it up... the Neo4j ecosystem, and the company, is full of people with lots of interesting things to say - and talking to them is just a blast.

So this conversation was long overdue, because Will has done SO MUCH for the Neo4j Community in the past couple of years - it's pretty crazy. How do you start a conversation like that? Turns out it's really easy. So nice. Listen to it over here:

Here's the transcript of our conversation:

RVB: 00:00:00.615 Hello, everyone. My name is Rik Van Bruggen from Neo4j, and tonight I have a very long overdue guest on this podcast episode. Someone that I've been dying to talk to, actually, for quite some time because he's done such an amazing job in the Neo4j community over the past couple of years. And that's my colleague, Will Lyon. Hi, Will.

Working with the ICIJ Medical Devices dataset in Neo4j

Just last weekend our friends at the ICIJ published another really interesting case of investigative journalism - tracking down and publishing the quite absurd and disturbing practices of the medical devices industry. The entire case with all of the developing stories can be found at https://medicaldevices.icij.org/ - take a look as it really is quite fascinating. Of course that meant that I wanted to see what that data looked like in Neo4j, and if I could have a play. I didn't have time for a full detailed exploration yet - but hopefully this will also give others the opportunity to chime in. So let's see.

The Medical Devices dataset as a graph

This turned out to be surprisingly easy. Just download the Zip file from the ICIJ website: https://medicaldevices.icij.org/download/icij-imddb-2018-11-25.zip, unzip this, and then we get 3 comma-separated-values files:

one for the Devices that are being reported on
one for the Events that are being reported (whenever something happens to a device, e.g. a recall, that is logged and reported)
one for the Manufacturers of the medical devices.

That's easy enough.
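To make the shape of that data a bit more concrete, here is a minimal Python sketch of how three files like these could be tied together as a graph. The column names (`id`, `manufacturer_id`, `device_id`, `type`), the sample rows and the relationship names `MANUFACTURES` and `CONCERNS` are all assumptions made for illustration; the real ICIJ files have their own, richer headers.

```python
import csv
import io

# Tiny made-up samples; the real ICIJ files have different, richer columns.
DEVICES_CSV = """id,name,manufacturer_id
d1,Hip implant,m1
d2,Insulin pump,m2
"""
EVENTS_CSV = """id,device_id,type
e1,d1,Recall
e2,d1,Safety alert
e3,d2,Recall
"""
MANUFACTURERS_CSV = """id,name
m1,Acme Medical
m2,Globex Health
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_graph(devices, events, manufacturers):
    """Return (nodes, relationships) for a simple property-graph view."""
    nodes = {}
    rels = []
    for m in manufacturers:
        nodes[m["id"]] = ("Manufacturer", m["name"])
    for d in devices:
        nodes[d["id"]] = ("Device", d["name"])
        # Hypothetical relationship name: a manufacturer makes a device.
        rels.append((d["manufacturer_id"], "MANUFACTURES", d["id"]))
    for e in events:
        nodes[e["id"]] = ("Event", e["type"])
        # Hypothetical relationship name: an event concerns a device.
        rels.append((e["id"], "CONCERNS", e["device_id"]))
    return nodes, rels


nodes, rels = build_graph(
    read_rows(DEVICES_CSV), read_rows(EVENTS_CSV), read_rows(MANUFACTURERS_CSV)
)
for start, rel_type, end in rels:
    print(f"({nodes[start][1]})-[:{rel_type}]->({nodes[end][1]})")
```

The point of the sketch is only that the foreign-key style columns in the CSV files become relationships once the data is in a graph.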

Data Lineage in Neo4j - an elaborate experiment

For the past couple of years, I have had a LOT of conversations with users and customers of Neo4j that have been looking at graph databases for solving Data Lineage problems. Now, at first, that seemed like a really fancy new word used only by hipster technovangelists to try to appear interesting, but once I drilled into it, I found that it’s actually something really interesting and a really cool application of graph databases. Read more on the background of it on wikipedia (as always), or just live with this really simple definition:

“Data lineage is defined as a data life cycle that includes the data's origins and where it moves over time. It describes what happens to data as it goes through diverse processes. It helps provide visibility into the analytics pipeline and simplifies tracing errors back to their sources.”

That’s easy enough. Fact is that it’s a really big problem for large organisations - specifically financial institutions as they have to comply with regulations like the Basel Committee on Banking Supervision's standard number 239 - which is all about assuring data governance and risk reporting accuracy.
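As a rough illustration of why this is a graph problem, the "tracing errors back to their sources" part of the definition can be sketched as a traversal over "derived from" links. The dataset names and the `DERIVED_FROM` structure below are hypothetical, and this is plain Python rather than anything Neo4j-specific.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets it is derived from.
DERIVED_FROM = {
    "risk_report": ["risk_aggregates"],
    "risk_aggregates": ["trades_clean", "counterparties"],
    "trades_clean": ["trades_raw"],
    "counterparties": ["crm_export"],
    "trades_raw": [],
    "crm_export": [],
}


def upstream_sources(dataset, derived_from):
    """Return all datasets that feed into `dataset`, in breadth-first order."""
    seen = []
    queue = deque(derived_from.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        queue.extend(derived_from.get(current, []))
    return seen


def origins(dataset, derived_from):
    """Return only the upstream datasets that have no parents themselves."""
    return [d for d in upstream_sources(dataset, derived_from)
            if not derived_from.get(d)]


print(upstream_sources("risk_report", DERIVED_FROM))
print(origins("risk_report", DERIVED_FROM))
```

In a graph database the same question becomes a variable-length path query, which is a large part of the appeal for lineage use cases.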

Here’s a couple of really nice articles and videos that should really give you quite a bit of background.

Poring over Power Plants: Global Power Emissions Database in Neo4j

In the past couple of weeks, I have been looking at some interesting datasets for the Utility sector, where Networks or Graphs are of course in very, VERY abundant supply. We have Electricity Networks, Gas Networks, Water Networks, Sewage Networks, etc etc - that all form these really interesting graphs that our users can. Lots of users have specialised, GIS based tools to manage these networks - but when you think about it there are so many interesting things that we could do if ... we would only store the network as a network - in Neo4j of course.

So I started looking for some datasets, and maybe I am not familiar with this domain, but I did not really find anything too graphy. But I did find a different dataset that contained a lot of information about Power Plants - and their emissions. Take a look at this website:

and then you can download the Excel workbook from over here. It's not that big - and of course the first thing I did was to convert it into a Google Sheet. You can access that sheet over here:

There's two really interesting tabs in this dataset:

the sheet containing the fuel types: this gives you a set of classifications of the fuel that is used in the different power plants around the globe
the list of 30.5k power plants from around the world that generate different levels of power from different fuel types. While doing so, they also generate different levels of emissions, of course, and that data is also clearly mentioned in this sheet. Note that the dataset does not include any information on Nuclear plants - as they don't really have any "emissions" other than water vapour and... the nuclear waste of course.

So let's get going - let's import this dataset into Neo4j.
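Before any import, it can help to picture how the two tabs relate: every power plant row points at one fuel type. The sketch below uses made-up rows and hypothetical field names (`fuel`, `co2_tonnes`) and a hypothetical `USES_FUEL` link; the real sheet has its own columns.

```python
from collections import defaultdict

# Made-up rows standing in for the two tabs; real column names will differ.
FUEL_TYPES = {"COAL": "Coal", "GAS": "Natural gas", "HYDRO": "Hydro"}
PLANTS = [
    {"name": "Plant A", "country": "BE", "fuel": "GAS", "co2_tonnes": 120.0},
    {"name": "Plant B", "country": "DE", "fuel": "COAL", "co2_tonnes": 900.0},
    {"name": "Plant C", "country": "DE", "fuel": "COAL", "co2_tonnes": 650.0},
    {"name": "Plant D", "country": "NO", "fuel": "HYDRO", "co2_tonnes": 0.0},
]


def link_plants_to_fuels(plants, fuel_types):
    """Build (plant)-[:USES_FUEL]->(fuel type) pairs, skipping unknown codes."""
    return [(p["name"], fuel_types[p["fuel"]])
            for p in plants if p["fuel"] in fuel_types]


def emissions_by_fuel(plants, fuel_types):
    """Sum the emissions of all plants per fuel type."""
    totals = defaultdict(float)
    for p in plants:
        if p["fuel"] in fuel_types:
            totals[fuel_types[p["fuel"]]] += p["co2_tonnes"]
    return dict(totals)


links = link_plants_to_fuels(PLANTS, FUEL_TYPES)
totals = emissions_by_fuel(PLANTS, FUEL_TYPES)
print(links)
print(totals)
```

Once the plants and fuel types are nodes, an aggregation like this is a single pattern match plus a sum.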

Podcast Interview with Michael Simons, Neo4j

For this week's episode of our Graphistania podcast, I had the great pleasure of spending some time on the phone with Michael Simons - one of the talented Neo4j engineers that build our products. Michael only recently joined our team, and we actually got talking on our internal channels about something we both love dearly... Bikes. I did a ride in Belgium recently that Michael found interesting and then he rode it himself as well - and hey, we got talking. One thing led to another, and before you know it we are recording the conversation... Here it is:

Here's the transcript of our conversation:

RVB: 00:00:01.418 Hello, everyone. My name is Rik Van Bruggen from Neo4j, and here I am again recording another episode for our Graphistania podcast. And today, I have one of my dear colleagues on the other side of this Google Hangout again, and that's Michael Simons from Neo4j engineering. Hi, Michael.

MS: 00:00:19.623 Hi, Rik.

Podcast Interview with Michael McKenzie

Why spend my evenings/weekends/empty hours creating a podcast? Well that's very simple: I love talking to like-minded people in the graph community. There's something about this community that attracts people that are equally fond of "connections" and building relationships that is just too awesome to explain. I love it. So when Karin told me about this guy in Washington that was doing awesome things with Neo4j and was helping out with community activities (he wrote about it over here), I was all too keen to have a chat with him. Meet Michael McKenzie, from Washington DC - here's our chat:

Note: I recorded this with Michael before our fantastic GraphConnect conference in New York a few weeks ago - but did not have the time to publish it earlier... apologies...

Here's the transcript of our conversation:

RVB: 00:00:00.000 Hello, everyone. My name is Rik. Rik Van Bruggen from Neo4j and here I am again recording another Graphistania Neo4j podcast. And today, I have a wonderful community member on the other side of this Google hangout and that's Michael McKenzie from Washington, D.C. in the US. Hi, Michael.

Podcast Interview with Karin Wolok, Neo4j

Next week is GraphConnect New York City 2018, and that's of course a big highlight for all of us at Neo4j. You should really be there if you can :) ...

One of the reasons why GraphConnect is such a great event, is because it allows us to connect all the nodes in the graph and have a great couple of days of real-world conversations about this fascinating topic called graphs. Again, we are going to have a great line-up, not in the least because of all the great community content that we will be presenting and working on during the event.

On top of that, we have had a LOT going on in the Neo4j Community recently - with the launch of a new community site and more. That's a good enough reason for me to invite Karin Wolok, our Community Manager at Neo4j for a good chat. Here it is:

Here's the transcript of our conversation:

RVB: 00:00:00.819 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j. And here I am again recording another episode of our Graphistania Neo4j podcast. And today's a little bit of a special episode I think because it relates to something very dear to my heart and many people at Neo4j's heart, which is our Neo4j community. And for that, I've invited Karin Wolok on the podcast. Karin is our community manager. Actually, you have a very different and more expensive-sounding title, right, Karin? But maybe you can introduce yourself to our listeners.

Podcast Interview with Johannes Unterstein, Neo4j

A couple of months ago, we had a great Online Meetup that was all about scaling out Neo4j using containerisation and container orchestration technologies. You can see the recording over here:

That was really cool, and a great excuse to invite my nowadays *colleague* Johannes Unterstein to the podcast. Johannes has a really interesting history and a lot of expertise in these technologies, and could really talk about them for our audience. So here's our chat:

Here's the transcript of our conversation:

RVB: 00:00:00.399 Hello, everyone. My name is Rik Van Bruggen from Neo4j, and here I am again after the holiday period recording another Graphistania podcast. And today I have the pleasure of welcoming one of my dear engineering colleagues on this podcast episode, and that's Johannes Unterstein from Germany. Hi, Johannes.

ESCO database in Neo4j: Skills, Competencies, Qualifications and Occupations form a beautiful graph!

Just a few weeks ago, I was discussing with Neo4j users that are active in the domain of "labour", or work. While talking to these users, they mentioned that there are standards out there that classify different types of work into different buckets (a taxonomy, if you will), and that there are two competing standards to do so out there. There's

the ESCO standard: the European Skills, Competences, Qualifications and Occupations, and
the ROME standard: the "Répertoire opérationnel des métiers et des emplois (ROME)"

The ESCO seems to be promoted by the European Commission, and the latter seems to be a Belgian/French initiative of some sorts. Surely they overlap, but I am not sure by how much. As luck would have it I started looking at the ESCO material first, but I am sure we could have written this post about ROME as well. It's the principles that matter.

And in principle, I figured that using these standards would be a really cool thing to do in Neo4j. Skills/Competences and Occupations form really interesting graphy structures, and I could see how you could use a taxonomy like that to do some really interesting recommendations and other data workloads. So I wanted to give it a poke around.

Loading ESCO into Neo4j

The entire ESCO dataset can be downloaded from the European Commission's portal site: https://ec.europa.eu/esco/portal.

It's really easy: you just select the data that you are interested in - the topic, format, and the languages - and put together a download package.

In terms of format, you can choose between

an RDF format, which basically gives you a large (500MB) Turtle file. Turtle - the Terse RDF Triple Language, see https://www.w3.org/TR/turtle/ - is probably more comprehensive, as it contains everything. But it's also quite a bit more difficult to manipulate and get your head around. I was able to import the Turtle file really easily using Jesus' "neosemantics" plugin, and had it up and running in minutes. But I found it more difficult to use - most likely because I am not an RDF aficionado. Sorry.
CSV format. That's easy enough - we know how to import those. So all I needed to do was write a few Cypher scripts and import the data in a few minutes. I will put the scripts below, but you can also see them on github.

In any case, I opted to continue with the CSV files, and spent a little time importing the different files and connecting them together - in different languages. There's basically 5 files:

the Skills
the Skillsgroups, grouping the above together in groups
the Occupations
the ISCOgroups: this is a standard of the International Labour Organisation (ILO) that provides an International Standard Classification of Occupations.
and then a few files with relationships between Skills and Occupations, different ISCO groups, and different Skills/Skillsgroups.

I wrote the script pretty quickly - it's really not that hard - and then I ...

... ended up with a few Neo4j databases:

one full of RDF triples - complicated!
one with English Skills, Skillsgroups, Occupations and ISCOgroups.
one with Dutch Skills, Skillsgroups, Occupations and ISCOgroups.

In the Neo4j Desktop that looks a bit like this:

This is where the scripts are on Github.

Working with the ESCO database in Neo4j

Now that all that is imported, we can take a look at it. Let's start by looking at the model that we have imported. Pretty straightforward:

We can also just start looking at some data by just visually exploring it in the Neo4j Browser:

But it gets a lot more interesting when we put Cypher to it, and start querying the data. For example, let me grab these two nodes here:

And look at the paths between them:

As always, the things that are located on the path, tend to be pretty interesting. Even more so when I think a bit more about the data, and start looking for the ESSENTIAL FOR relationship links. Let's see what comes back when I look for the links between a "software developer" and a "beer sommelier", when I ONLY traverse the relationships that define really important / ESSENTIAL relationships between Skills and Occupations:

Interesting. I am sure that a domain expert could do lots of other things here, especially if we could give that expert some non-technical tool like Neo4j Bloom.
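The idea of only following the essential links can be sketched outside of Neo4j as a breadth-first search that ignores every other relationship type. The skills, the extra occupation and the relationship type names `ESSENTIAL_FOR` and `OPTIONAL_FOR` below are made up for illustration and are not taken from the ESCO files; this is a plain Python sketch, not the Cypher that was actually run.

```python
from collections import deque

# Made-up (skill, relationship type, occupation) triples for illustration.
RELATIONSHIPS = [
    ("analyse user needs", "ESSENTIAL_FOR", "software developer"),
    ("analyse user needs", "ESSENTIAL_FOR", "restaurant manager"),
    ("serve beverages", "ESSENTIAL_FOR", "restaurant manager"),
    ("serve beverages", "ESSENTIAL_FOR", "beer sommelier"),
    ("use spreadsheets", "OPTIONAL_FOR", "software developer"),
    ("use spreadsheets", "OPTIONAL_FOR", "beer sommelier"),
]


def shortest_path(start, goal, relationships, allowed_types=None):
    """Breadth-first search over undirected edges, optionally filtered by type."""
    neighbours = {}
    for source, rel_type, target in relationships:
        if allowed_types is not None and rel_type not in allowed_types:
            continue
        neighbours.setdefault(source, []).append(target)
        neighbours.setdefault(target, []).append(source)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


any_path = shortest_path("software developer", "beer sommelier", RELATIONSHIPS)
essential_path = shortest_path(
    "software developer", "beer sommelier", RELATIONSHIPS, {"ESSENTIAL_FOR"}
)
print(any_path)
print(essential_path)
```

In this toy data the unrestricted search takes the shortcut through an optional skill, while the restricted search has to go the long way round through skills that really matter for both occupations.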

All in all, this was a really easy and interesting experiment. I am sure there's a lot more to do here - but this was yet another example of a cool application of Neo4j in a surprising domain.

Hope this was useful.

Cheers

Rik

Thursday, 5 July 2018

Podcast Interview with Matt Casters, Neo4j & Kettle

A couple of years ago, I got to know another Belgian data aficionado that was doing quite a bit of work in the open source community, called Bart Maertens. For a while, we actually met at Antwerp Airport when we were both "commuting" to London City Airport for business - and we got a conversation going. Bart was organising a Pentaho Community Meeting in Antwerp, less than 500m from my home, and invited me to come along and talk a bit about my favourite subjects: beer and graphs :) ...

So one thing led to another, and Bart started to do some interesting work integrating his data integration tools with Neo4j. He wrote the code, and blogged about it in some detail.

Fast forward to early 2018. Neo4j is more and more in the Enterprise market, with very large organisations seeing the value of graph databases and the platform around it. But most of these environments are NOT greenfield environments - they almost always require some kind of data integration work to make the tools work effectively. So it became very natural for us to start looking for architects and experts that could help us... and that's effectively what brought my next Podcast guest to the Graph: Matt Casters has worked together with many other Neo4j people in a previous life, and is now the Chief Solutions Architect in our professional services organisation.

Here's my chat with Matt:

Podcast Interview with Estelle Joubert, Dalhousie University

One of the coolest things about Neo4j is just the sheer breadth and diversity of applications that we see for connected data and graph databases out there. I think I have said it before, but it truly continues to baffle me. Very frequently, I will have a morning conversation with a user about battling financial fraud, a lunch conversation about using graphs in biotech to fight world hunger, and an afternoon conversation about real time recommender systems in retail. And of course finish it off with a beergraph conversation in the evening :) ...

Really - it's just amazing. And the next podcast episode is a true testimony to that. I got to have a chat with a lovely lady all the way over in Canada recently, Estelle Joubert from Dalhousie University. She and her team have been using Neo4j in her amazing field of research, which is all about understanding how music and opera came to be what they are today in a historical perspective. She is best at explaining it herself - so here's our chat:

Here's the transcript of our conversation:

RVB: 00:01:20.209 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and tonight I am joined by a guest on our podcast all the way from Canada, someone that has been working with, and experimenting with, Neo4j for quite some time in a very interesting domain that I hadn't heard of before. And that's Estelle Joubert from Dalhousie University. Hi, Estelle.

Exploring new datatypes in Neo4j 3.4 and the Open Beer Database - part 2/2

In the previous blogpost I imported the Open Beer Database into Neo4j and added some new fancy spatial data to it. Now in this post I would like to explore that data. As a reminder, you can find the full

Let's take a look.

First we will just look at the basic OpenBeerDB data. The schema is quite straightforward:

Exploring new datatypes in Neo4j 3.4 and the Open Beer Database - part 1/2

Recently, I gave a talk at the Amsterdam, Brussels and London Neo4j meetups about some of the new and exciting features in Neo4j 3.4. While preparing for it, I was looking for material and I found some very cool stuff that powerfully explains the new features. The best resource is probably this post by Ryan Boyd, and the video that goes with it:

Ryan does a great job at explaining the new features, and goes into some detail on the new temporal and spatial data types that you can now use in Neo4j 3.4. You can explore these new features yourself by accessing the Neo4j Sandbox developed specifically for this purpose. Or you can just do what I did, and use the Neo4j Desktop to spin up a Neo4j instance, and access the "guide". You do that by typing

:play https://guides.neo4j.com/sandbox/3.4/index.html

into the Neo4j browser, and then you can access the entire guide, add some data to your dataset, and play around.
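To get a feel for what a spatial point type makes possible, here is a plain Python sketch of a great-circle (haversine) distance between two latitude/longitude points. Neo4j 3.4 offers this kind of calculation through its own point type and distance function; the code below is not Neo4j's implementation, and the function name, the Earth radius constant and the approximate city coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates for Brussels and Amsterdam.
brussels = (50.8503, 4.3517)
amsterdam = (52.3676, 4.9041)
print(round(great_circle_km(*brussels, *amsterdam), 1), "km")
```

With points stored as properties on nodes, a question like "which breweries are within 50 km of here?" becomes a simple filter on this kind of distance.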

Podcast interview with Jeffrey Miller, ICC

Here's a podcast episode that I have been wanting/needing to publish for a long time. Jeffrey A. Miller works as a Senior Consultant in Columbus, Ohio, consulting on effective software development practices with lots of organisations. Jeffrey has delivered presentations at regional technical conferences and user groups on topics including Neo4j technology, knowledge management, and humanitarian healthcare projects - and that of course became a great setup for our conversation.

Also - I found this really interesting: Jeffrey and his wife, Brandy, are aspiring adoptive parents and have written a fun children’s book called “Skeeters” with proceeds supporting adoption. Learn more about the project at http://skeeterbooks.com/.

Using Neo4j Bloom for fraud detection, discovery and visualization

Over the past couple of weeks, I have been discussing and showing Neo4j's new Bloom graph discovery and visualization product to everyone that would have a moment to spare. It's soooo much fun to show a tool you love, and Bloom is definitely one of those. And I have also recorded some of these demo-sessions - you can find part 1, part 2 and part 3 of these recordings on this blog. All of these recordings use my (in)famous Belgian Beergraph dataset - and that's all good fun...

But of course, exploring a beergraph is not really a "business-y" use case. So I decided I would record a Bloom demo using a realistic dataset that centers around using Neo4j for Fraud Detection purposes. You will find all of the important concepts of the Beergraph demos here as well:

navigating the graph using graph patterns
using nifty selection / deselection techniques to only show what you need in the graph
creating better visualizations with colours and icons
editing the graph straight from the Bloom interface
creating custom search phrases for business uses and giving them near-natural language graph search capabilities.

Here's the recording:

Graphs are blooming - again!

A few weeks ago, I wrote a two part blog post about Neo4j Bloom and how I was playing around with my BeerGraph and figuring out some cool features of the new Neo4j product. You can find these posts over here

It included a first little demo video,

that seems to have been liked by a bunch of people :) ... thanks for that.

Podcast Interview with Iryna Feuerstein, Prodyna

Finally. It seems I am getting increasingly bad at getting great podcast episodes that I actually recorded a while ago out there. This is one of them: I had a fantastic chat with Iryna Feuerstein from Prodyna some weeks ago. She has done some amazing talks on Neo4j, and on related subjects (like for example her work on toxicogenomics). So I am very happy to get this episode out there - and hope you will enjoy!

Here's the transcript of our conversation:

RVB: 00:00:02.736 Hello, everyone. My name is Rik Van Bruggen from Neo4j, and here I am on this wonderful Tuesday evening recording another podcast with someone from Germany, and that's Iryna Feuerstein from PRODYNA. Hi, Iryna.

Part 2/2: Graphs are Bloom-ing

Earlier I wrote about how I connected the newly announced preview version of Neo4j Bloom to my good old faithful Belgian BeerGraph. See part 1 of this 2-part series for that story. I actually split up the story into two parts, because I feel like there's a super interesting and powerful part to Bloom that deserves a bit more attention: the mechanism of the custom Search Phrases.

As we mentioned in the previous post, Bloom structures your exploration and discovery into specific "views" on the graph data, called "Perspectives". You can select the perspective you find most appropriate from a dropdown - and customize/tweak/create perspectives yourself if you are not happy with the auto-generated starting point.

Part 1/2: Graphs are Bloom-ing

Last week something happened that really excited me. We, Neo4j, finally announced Bloom and demonstrated our own Graph Visualisation and Discovery tool, Neo4j Bloom. This is a technology that we have long been pondering, have experimented with in a number of ways, and have long looked to find and develop an offering that would be interesting and differentiated in what is already a very well looked-after marketplace.

I am not exaggerating when I say that this is truly exciting. Not only do many of our customers want to be able to visualise the results of their graph queries, but the graph data model is also unique in the way that it provides such an intuitive, easy to understand data model that lends itself so well to a GRAPH-ical representation. It truly fits into the Graph Platform vision that Neo4j has been advocating since 2017.

Podcast Interview with Irene Iriarte Carretero, Gousto

This week's guest on our podcast is someone that has been writing and speaking about their use of Neo4j quite a bit. Irene Iriarte Carretero from Gousto has been writing really cool blogposts (like this one) and presenting the story at GraphConnect as well. Here's a recording of her presentation:

Some of her excellent slides are over here:

So it goes without saying that I wanted to interview Irene for the podcast, and at the London GraphTour event, I finally had the opportunity. Here's our chat:

Here's the transcript of our conversation:

RVB: 00:00:03.459 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j. And here I am recording a face-to-face podcast, which is the first in a long time. We're at our GraphTour conference in London. And I'm actually here joined by Irene, Irene, from Gousto. And Irene just did a presentation here at the GraphTour about how you guys have been using Neo4j, right?

Podcast interview with Johan Teleman, Neo4j

I had a great time chatting to my colleague Johan Teleman, recently. Johan works at the Neo4j Engineering team in Malmö, and has been doing some great work - on Cypher performance among other things. As it turns out, there's a LOT that has been done already (look for some spectacular stuff in Neo4j 3.4), but there are so many interesting plans for the future as well. Here's our chat:

Here's the transcript of our conversation:

RVB: 00:01:39.301 All right. Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and here I am again recording another episode for our Graphistania podcast. And today I am very happy to have one of my Malmö colleagues on the other side of this call. That's Johan Teleman. Hi, Johan.

JT: 00:01:59.902 Hi, Rik. Happy to be here.

Podcast Interview with Niklas Saers

I had the chance to have a chat with Niklas Saers recently. Unfortunately the recording quality was not always that good - but I think you will be able to make out most of what we said :) ... Niklas works for Unwire, and specialises in iOS development. He is the author and maintainer of a Swift driver for Neo4j (see below) and a mobile-based iOS browser for Neo4j data. Really good conversation - with a Flemish twist. Here it is:

Here's the transcript of our conversation:

RVB: 00:00:07.247 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j and here I am recording another episode for our Graphistania podcast. And today, I have, well, should I say a fellow Flemish countryman? I'm not sure really. There is a little bit of history there, right? I've got Niklas Saers on the phone here.

Podcast Interview with Dilyan Damyanov, Snowplow Analytics

Here's another great podcast for you: I had a chat with Dilyan Damyanov of Snowplow Analytics, chatting about how you can use a graph database for enhancing your event analytics, specifically for clickstream analysis. I wrote about this myself a while back, but of course there is so much more to it - and Snowplow has really done a great job at enabling it with their toolset.

Here's our chat:

Here's the transcript of our conversation:

RVB: 00:00:14.000 Hello everyone. My name is Rik Van Bruggen from Neo4j and here I am recording another Graphistania Neo4j podcast. And today, I've got someone from London on the phone. That's Dilyan Damyanov. Hi, Dilyan.

Podcast Interview with Jonathan Schmidt, Waykonect

Here's another cool user story for you. I had a great chat with Jonathan Schmidt, founder and CTO of a great French startup called Waykonect that offers intelligent vehicle management based on Neo4j. They have been doing some really smart stuff and the use case seems like such a great fit for a graph - it's like a hand in glove. Listen to his story - it's very cool.

Here's the transcript of our conversation:

RVB: 00:00:01.221 Good morning, everyone. My name is Rik. Rik Van Bruggen from Neo4j. And here I am recording another Graphistania Neo4j podcast. And this morning I have got, well, someone not too far away from me on the other side of this call, and that's Jonathan Schmidt from WayKonect. Hello, Jonathan.

JS: 00:00:23.314 Hello, Rik.

RVB: 00:00:23.952 Hi. Thank you for joining me.

JS: 00:00:26.779 Welcome. It's a pleasure to be here.

RVB: 00:00:28.886 Fantastic. Jonathan, we've been emailing back and forth, and you've been talking to me about your project with Neo4j, but most people probably don't know you, yet. So if you could perhaps introduce yourself a little bit. Who are you, and what do you do, and what's your relationship to the wonderful world of graphs?

JS: 00:00:48.982 All right. Well, I'm Jonathan Schmidt. I'm CTO and cofounder of WayKonect. We are a fleet management company. Telematics fleet management. Our focus is data analysis, actionable intelligence from your vehicles, and we take a very driver-oriented approach to the field. Our belief is that you have to engage drivers if you want any kind of savings, any kind of actions to be successful on your fleet. So that's what we do. Analysing data and engaging drivers.

RVB: 00:01:29.005 When you say fleet management that means lease cars or it means-- what is the fleet for you? Is it any kind of rolling material, or what is it?

JS: 00:01:38.584 We mostly focus on small, light vehicles. So cars, mostly, small trucks, that kind of thing.

RVB: 00:01:52.184 Okay. Very good. And then how does it work, and how does that work with the graph as well, potentially? Could you explain that to us?

JS: 00:01:59.999 Sure. Well, we have a telematics dongle that actually gives back a lot of flow data about vehicles. And we use graphs to map the relationship between the dongle, the vehicle, the account that manages the vehicle, the driver that drives the vehicle, the trips that are recorded, events that might happen on that trip, the maintenance of the vehicle. Basically, we use Neo4j as our metadata graph for information. Everything that we collect from the data is stored and recorded in Neo4j in a graph style, which actually allows us to analyse it very quickly, very efficiently because we can link multiple things together, multiple items together, and get very interesting intelligence from it. And it also gives us flexibility to innovate, to improve over time because it's a NoSQL model. So when we need another kind of item to track down, when we need another kind of-- or should I say--

RVB: 00:03:16.511 A property or a relationship or whatever.

JS: 00:03:17.789 A property, relationship, real-world object, a new feature that we want to track, we started checking and keeping track of maintenance for our vehicles. And that was just a new label, new relationships, and that's it. And from maintenance, came appointments, came markers on the map to-- so when was the appointment, where it's located, came reviews on these repair shops, came-- and all of this just flows out naturally in the graph. And it gives us flexibility to improve, and the flexibility to analyse very efficiently.
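
Editor's aside: the model Jonathan describes can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not WayKonect's actual schema or the Neo4j API. The `Graph` class, the node ids and every relationship name (`MANAGES`, `RECORDED`, `HAS_EVENT`, `HAS_MAINTENANCE`, `HAS_APPOINTMENT` and so on) are assumptions made up for the example. It shows how maintenance and appointments can be bolted on later as new labels and relationships without touching what is already there.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph: labelled nodes, typed directed relationships."""

    def __init__(self):
        self.nodes = {}                     # node id -> (label, properties)
        self.out_edges = defaultdict(list)  # node id -> [(relationship type, target id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def relate(self, source, rel_type, target):
        self.out_edges[source].append((rel_type, target))

    def follow(self, start, *rel_types):
        """Follow a chain of relationship types from start and return the node ids reached."""
        frontier = [start]
        for rel_type in rel_types:
            frontier = [
                target
                for node in frontier
                for (rel, target) in self.out_edges[node]
                if rel == rel_type
            ]
        return frontier


g = Graph()

# Initial model (all names hypothetical): account, vehicle, dongle, driver, trips, events.
g.add_node("acct-1", "Account", name="Example Fleet")
g.add_node("veh-1", "Vehicle", plate="HYPOTHETICAL-1")
g.add_node("dongle-1", "Dongle")
g.add_node("driver-1", "Driver")
g.add_node("trip-1", "Trip")
g.add_node("trip-2", "Trip")
g.add_node("event-1", "Event", kind="harsh_braking")

g.relate("acct-1", "MANAGES", "veh-1")
g.relate("dongle-1", "INSTALLED_IN", "veh-1")
g.relate("driver-1", "DRIVES", "veh-1")
g.relate("veh-1", "RECORDED", "trip-1")
g.relate("veh-1", "RECORDED", "trip-2")
g.relate("trip-1", "HAS_EVENT", "event-1")

# Later requirement: maintenance and appointments. Only new labels and
# new relationship types are added; nothing existing is altered.
g.add_node("maint-1", "Maintenance", kind="oil_change")
g.add_node("appt-1", "Appointment", shop="Example Garage")
g.relate("veh-1", "HAS_MAINTENANCE", "maint-1")
g.relate("maint-1", "HAS_APPOINTMENT", "appt-1")

# Traversals: events on any trip of any vehicle the account manages,
# and appointments reachable from a vehicle's maintenance records.
events = g.follow("acct-1", "MANAGES", "RECORDED", "HAS_EVENT")
appointments = g.follow("veh-1", "HAS_MAINTENANCE", "HAS_APPOINTMENT")
print(events, appointments)
```

In a real graph database the `follow` call would be a pattern match in a query language such as Cypher; the point here is only that each hop is a direct link from one node to the next.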

RVB: 00:04:02.049 Wow. That's really cool. So your model has been evolving a little bit as you had more requirements, agile development, those types of things? Is that what I'm hearing?

JS: 00:04:11.271 Indeed. Indeed. It's evolving constantly and basically every month or every two months we add one, two new labels to the graph, new relationships and new ways to actually get insight from the data.

RVB: 00:04:24.217 Wow. Cool. And are there any other data components to your application that you're using? Analytical components or other database stores, or is it mostly Neo4j?

JS: 00:04:36.157 Neo4j is our metadata store, so everything we get out from the data is new. For the raw data itself, we actually use InfluxDB as a time series repository. And we are also using Kafka as a messaging backbone for the whole infrastructure and so that all of our services can analyse the data as it comes in.

RVB: 00:05:04.584 Yeah. That sounds like a great idea. Great architecture. So can you talk to us a little bit about how you got to Neo4j and why you started using a graph for this? What's the main advantage to it for you?

JS: 00:05:18.219 Well, our first proof of concept was obviously [laughter] built on SQL. That panned out good for a few vehicles, for the 100 or so vehicles we had in the test, and but my belief was that it was untenable at scale, storing telemetries, storing raw data, and the metadata in SQL was going to be a nightmare whenever joins would be involved. And I mean, whenever you want a trip, you have to join the accounts that the vehicle belongs to, the vehicle itself, the trip, some events that might be off-scoring but may be related to it, and it's three, four, five-part joins on every request. And that would quickly become problematic. And when I started devising our data model, I had an engineer with me who said, "Actually, this makes me think of graphs and maybe we should check out the field of graph databases." And it basically started this way. We checked a few graph databases, settled on Neo, because it seemed to be the best fit for architecture and in terms of maturity, in terms of functionality. And we did a first proof of concept with the new project client in C# because our whole architecture's on .NET and it was just so easy. You create a class, you put in properties, and you can [inaudible] in one message, and relationships comes at almost no additional costs.
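
Editor's aside: to make the "three, four, five-part joins on every request" point concrete, here is a small sketch using Python's built-in `sqlite3` module. The table and column names are hypothetical and not taken from WayKonect's proof of concept. It only illustrates the shape of the relational query needed to fetch trips together with their account, vehicle and events.

```python
import sqlite3

# In-memory database with a hypothetical relational version of the model.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY,
                          account_id INTEGER REFERENCES account(id),
                          plate TEXT);
    CREATE TABLE trip    (id INTEGER PRIMARY KEY,
                          vehicle_id INTEGER REFERENCES vehicle(id),
                          started_at TEXT);
    CREATE TABLE event   (id INTEGER PRIMARY KEY,
                          trip_id INTEGER REFERENCES trip(id),
                          kind TEXT);

    INSERT INTO account VALUES (1, 'Example Fleet');
    INSERT INTO vehicle VALUES (10, 1, 'HYPOTHETICAL-1');
    INSERT INTO trip    VALUES (100, 10, '2018-01-15T08:00:00');
    INSERT INTO trip    VALUES (101, 10, '2018-01-15T17:30:00');
    INSERT INTO event   VALUES (1000, 100, 'harsh_braking');
    """
)

# Fetching trips with their context already takes a four-table join.
# Every new kind of related item (maintenance, appointments, reviews)
# would add another JOIN clause here.
rows = conn.execute(
    """
    SELECT a.name, v.plate, t.id, e.kind
    FROM trip AS t
    JOIN vehicle AS v ON v.id = t.vehicle_id
    JOIN account AS a ON a.id = v.account_id
    LEFT JOIN event AS e ON e.trip_id = t.id
    WHERE a.id = ?
    ORDER BY t.id
    """,
    (1,),
).fetchall()

for row in rows:
    print(row)
```

The `LEFT JOIN` keeps trips that have no events, which is why the second trip still appears with an empty event column.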

RVB: 00:07:19.950 That's fantastic to hear. I mean, it sounds like a great [crosstalk].

JS: 00:07:23.237 Yeah. It was just so easy, so fitting, it stuck. And it's a choice I've never ever regretted making. Our transition to Neo4j was, I believe today, one of the best decisions we ever made.

RVB: 00:07:43.778 Wow. There must be something wrong with it [laughter].

JS: 00:07:47.867 Well, we had a few, of course, there were a few problems along the way, but mostly it came from our data model. We had a few issues with transactions and HAProxy because we are on an enterprise high-availability cluster. So routing transactions correctly from slave to master or keeping a transaction on the same server is a bit of a challenge sometimes. But other than that most of our troubles were because we didn't understand the technology at first and we actually built it as we learned. And we've made a few mistakes in the model along the way and it was actually incredibly easy to smooth them out at a later time to refactor the graph. And that was also something that clinched it for me. I mean, you identify and isolate the problem, you refactor in one or two codes and a few changes in your code and that's it, problem solved. That's just so easy.

RVB: 00:09:07.273 Excellent. Well, that was kind of the past, right? So what does the future look like? How do you plan to use it in the future? How do you plan to evolve your application and also maybe any perspectives on what the industry is going to be doing on this?

JS: 00:09:28.361 Well, as for us, my next step is [inaudible] clustering and I'm just a few steps away from having it work correctly [laughter]. Currently having a bit of dependency problems with .NET. But [laughter] that's something that should be easily solved and [crosstalk].

RVB: 00:09:50.075 I hope so. Otherwise, you know where to find us, right [laughter]?

JS: 00:09:52.654 Exactly. And I think [inaudible] clustering will be the next big step and what will take us from really scalable to infinitely scalable [laughter], I guess. As far as the industry, well, it's a bit-- as much as you-- when you start using graph databases, you start seeing applications for them everywhere. Even up to research in ancient languages. You can probably index and cross-reference hundreds of texts in minutes or hours just because you can actually cross-reference words and phrases from it. And the language doesn't matter as long as you have a [unique alphabet?] for it. And this is so powerful and it's just one application off the top of my head because I dabble in that on the side. But--

RVB: 00:11:00.404 Graphs are everywhere, right?

JS: 00:11:01.883 Graphs are everywhere.

RVB: 00:11:02.063 Once [laughter] you start getting into the mindset you start to see them everywhere and that's so fascinating.

JS: 00:11:10.033 And you start to see also what good they could do. You have so many people building complicated software just to solve graph-related questions that could be solved in a few Cypher queries and at the time port, so.

RVB: 00:11:28.278 Couldn't agree more, Jonathan. I think we're on the same page there. Absolutely. Thank you for sharing your experience with us. I think that was super nice and super useful for lots of people. We'll put some links to your company and your experience in the transcription of the podcast. But for now, this is I guess where we're going to be wrapping up. Thank you so much for coming online, really appreciate it. And I look forward to meeting you soon sometime.

JS: 00:11:58.764 You're welcome. It was a pleasure. And please, feel free to drop by just give me a holler one point. And, yeah, we can meet definitely.

RVB: 00:12:07.894 Fantastic. Thank you, Jonathan. Have a nice day.

JS: 00:12:10.741 Have a nice day. And thank you, Rik.

RVB: 00:12:12.138 Bye.

Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik

Friday, 2 February 2018

Podcast interview with Laura Drummer, Novetta Technologies

Here's another wonderful interview that I had at the end of 2017 with Laura Drummer from Novetta. Laura had presented her work with Neo4j at GraphConnect and was super nice to come online and talk about her work in the middle of her pregnancy leave. It was a great chat, I really enjoyed it - and am of course super happy to share. Here it is:

Here's the transcript of our conversation:

RVB: 00:00:03.935 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and tonight I'm here again recording another Graphistania Neo4j podcast episode. And on the other side of this Skype call, I've got a wonderful community member from Ellicott City, Maryland. And that's Laura Drummer from Novetta Technologies. Hi, Laura. How are you?

LD: 00:00:26.915 Hi, I'm great. How are you?

Podcast Interview With Konstantin Lutovich, Neo4j

Just before 2017 came to an end, I read this blogpost by my colleague Konstantin that was talking about how we were including an asynchronous API in our Java driver. That triggered my attention - I have always thought of the non-blocking characteristics of asynchronous systems to be very interesting for lots of different use cases. Just think of the difference between email (async) and phone (sync) communication - there's great use cases for both of them, right?

So I decided to try and get Konstantin on the phone, which he very graciously accepted. We had a great chat - which is what this next podcast episode is all about. Hope you enjoy!

Here's the transcript of our conversation:

RVB: 00:00:03.511 Hello everyone. My name is Rik Van Bruggen from Neo4j and here I am again recording another episode to our Graphistania Neo4j podcast. And this morning I've got a wonderful colleague of mine in Malmö, Sweden, on the other side of this Skype call and that's Konstantin Lutovich. Hi, Konstantin.

KL: 00:00:23.786 Hello, Rik. Thanks for having me here.

Podcast Interview with Jesus Barrasa, Neo4j

Finally - FINALLY. After a very happy and successful end to 2017, we are back in full swing for making 2018 rock at least as much as last year. And that also means getting back into the habit of publishing this lovely little Neo4j podcast. To do that, I asked my friend and colleague Jesus to spend a few minutes talking to me about all the great graphy stuff that he has been working on to make Neo4j succeed even more in the Telecommunications industry. Jesus is leading a "Telecoms Practice" in Neo4j, and is leveraging his domain expertise to create even more value for Neo4j users and clients. So here's a little chat:
