Monday 6 March 2017

Podcast Interview with Kristof Van Tomme, Pronovix

Last month I had one of those cool encounters of the graph kind at the Belgian Beerfest that we have been organising a couple of times in the last few years on the occasion of FOSDEM - the amazing open source conference that takes place in Brussels every year. This year, I got talking to a fellow countryman who has been doing some amazing work on integrating the Drupal content management system with Neo4j - something that has a lot of potential in a lot of areas, I think. So - we just HAD TO have a chat :) ...


Here's the transcript of our conversation:
RVB: 00:03.346 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo Technology. And here I am again the third time in two days, this is wonderful, I'm on a roll here, recording another podcast for our Neo4j Graphistania podcast. And today I have a fellow Belgian on the other side of this Skype call, and that's Kristof Van Tomme from Pronovix. Hi, Kristof. 
KVT: 00:27.466 Good morning Rik. How are you? 
RVB: 00:29.593 I'm really well, and I hope the Skype gods bear with us, because we've had some trouble in the past couple of minutes, but I'm sure it will be fine. Hey, Kristof, we met each other at the FOSDEM conference, which was a great experience, and I loved the Beer Fest afterwards [laughter]. But yeah, you told me about some really great stuff that you guys are doing with graph databases. So, first of all, let's start from the beginning, who are you, what do you do and what's your relationship to the wonderful world of graphs?
KVT: 01:07.202 So I'm a bit of a weird duck because I'm actually a bioengineer who ended up in IT through a biotech startup that did research in schizophrenia. It's a whole other life. But I got involved in the Drupal community a little over 10 years ago when we started making websites for biotech companies. 
RVB: 01:35.332 Okay. Drupal is like a content management system, right? 
KVT: 01:38.557 Yes, Drupal the open source content management system. The other really good Belgian product after beer and chocolates [laughter]. And I got really strongly involved in that community 10 years ago. I helped organise one of the big European conferences, and then we built a consultancy around that. Then, about five years ago, I got really excited about documentation, and reuse of documentation specifically, and how to deliver it and reuse bits and pieces so that you could build deliverables that can easily be reused between different channels. And that's how I got excited about graph databases, and Neo specifically.
RVB: 02:32.949 When you say documentation, you mean technical documentation for software, right?
KVT: 02:35.667 Yes. Yes, I do. The thing that everybody's like, "Ooh, documentation." 
RVB: 02:41.017 Ah, damn it. Yeah, exactly. 
KVT: 02:44.417 So that's how I got involved in-- because we had one of our colleagues, a long time ago, I think six years ago or something, started playing with graph databases, and actually, he built a first connector for Drupal for Neo. And he's like, "Kristof, I did this thing, and I'm really excited about graph databases, and I think it's cool. Can we do something with this?" And I was like, "I have no idea." So that was the first connector for Neo for Drupal, and then that kind of died because there was-- technically it was there, but then there were no further implementations, and I was not sold, and people didn't figure out how to use it. But then because of the documentation thing, I actually started seeing what you would use a graph database for and that's when I got really excited. 
RVB: 03:46.370 Super cool. Because documentation, I don't know if you noticed, but this is where Neo4j started as well, as an open source project, 15 years ago, Viking hackers in a garage. They were all about content management at the time as well because they were working for a media company that was managing digital assets. So it's funny that there's this convergence or link between the two worlds, right? What is the use case all about? How does it work?
KVT: 04:20.696 So I've been thinking-- I've got this DITA, which is another of those words. It's a standard that's fairly popular in the technical writing community for writing reusable documentation. It's like an XML standard. Some people scratch their heads when they hear about it, and other people are raving mad about it. So in the DITA community, I've been doing talks about content management systems and open source and things like that. I think two years ago, I started thinking about personalisation and embedding information. What I dream about is this: instead of having a manual, the documentation system knows who you are and serves you the right information when you need it. I did a talk about that at the DITA conference here, I think it was in Europe, and I was thinking, "So how would you do that?" And then I started thinking yeah, actually, probably it wouldn't really work with a relational database because you need to start collecting a whole lot of information and start analysing for patterns. And that's how I started thinking about Neo and graph databases more in general.
RVB: 05:48.382 So as a personalisation engine for documentation, right? So you wouldn't need to search for documentation as much, but you would have a recommended set of documentations that would be served to you semi-automatically. 
KVT: 06:04.195 Yeah. So it's the idea that, for example, you're in an application, you're in a web app, and you can't find that one damn button that you know is somewhere-- 
RVB: 06:16.996 We've all been there.
KVT: 06:18.043 Yeah, we've all been there. So you're clicking around, and you're going through settings, and I don't know, connections, so you keep going circles and circles and circles because you can't find the damn button. And at that point, the system would say, "This looks a lot like what people do when they're looking for this thing," and then you would get a little pop-up saying, "Are you maybe looking for this?" And similarly, if you're using a certain feature and you're doing something really weird and other people have done that, and then they went through the documentation and found some other feature, then you could shortcut that and skip a few jumps in that graph and immediately serve them the information that they're looking for. So it's kind of like analysing patterns of behaviour that people have inside of a web application and then serving them-- that's patterns of behaviour that they normally do just before going to documentation sites and then serving them that documentation that people normally will find when they go to documentation site after they've done a certain thing, and then serving that information to them. So that's one of the really cool things that I would like to do. 
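The idea Kristof describes here, spotting a click pattern that usually ends on a particular documentation page and then offering that page early, can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not what Pronovix built: the session log, the action names, the page paths and the helpers `build_index` and `suggest` are all hypothetical, and a real system would keep this data in a graph rather than in an in-memory dictionary.

```python
from collections import Counter, defaultdict

# Hypothetical session logs: the in-app actions a user performed, followed by
# the documentation page they eventually opened. All names are made up.
SESSIONS = [
    (["settings", "connections", "settings", "connections"], "docs/add-integration"),
    (["settings", "connections", "settings"], "docs/add-integration"),
    (["settings", "profile", "settings"], "docs/change-password"),
    (["dashboard", "reports", "export"], "docs/export-csv"),
]


def build_index(sessions, window=2):
    """Map each run of `window` consecutive actions to the docs pages that followed."""
    index = defaultdict(Counter)
    for actions, doc in sessions:
        for i in range(len(actions) - window + 1):
            index[tuple(actions[i:i + window])][doc] += 1
    return index


def suggest(index, recent_actions, window=2):
    """Return the docs page most often reached after the latest actions, or None."""
    key = tuple(recent_actions[-window:])
    pages = index.get(key)
    if not pages:
        return None
    return pages.most_common(1)[0][0]


index = build_index(SESSIONS)
# A user bouncing between settings and connections gets the integration page.
print(suggest(index, ["dashboard", "settings", "connections"]))
```

With the sample data above, a user whose last two actions were "settings" and "connections" is offered the integration page, which is the "Are you maybe looking for this?" pop-up in miniature.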
RVB: 07:32.834 Yeah, I understand. So why is that such a good use case for a graph database? Is that because of the pattern recognition, or what's the secret sauce? 
KVT: 07:44.973 So it's the pattern recognition. So I think CMSs are really good at storing data in a-- storing similarly structured information because most CMSs use SQL databases and they're pretty good at that, just building up a content model and then reusing that over and over again. But being able to recognise behaviour-- well, that's not something that we are normally doing in the CMS space. We have some very basic things, like there's some recommendation based on the content and shared keywords and things like that, but behaviour analysis is not one of the things that you normally find in the CMS. So for that, we need different technology because in a SQL database you would have to do so many joins to even figure out what's going on, yeah, that I don't think that it would make sense to do it that way. And ideally, it would be a system that you don't have to program everything but that it can start looking for patterns on its own eventually. And that you build this graph of interactions and content and kind of like a graph that combines those two to do things with that. So yeah.
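To make the "graph of interactions and content" a little more concrete, here is a minimal sketch, assuming a tiny in-memory edge list in place of a real graph database. The node names, the `DID` and `READ` relationship types and the `recommend_docs` helper are hypothetical. The traversal goes from a user to their actions, to other users who did the same thing, to the documentation those users read. In a relational schema with a single edge table, the same question would typically need that table joined to itself several times.

```python
from collections import Counter, defaultdict

# Hypothetical edges combining behaviour (DID) and content (READ) in one graph.
EDGES = [
    ("user:ann", "DID", "action:open-settings"),
    ("user:bob", "DID", "action:open-settings"),
    ("user:cat", "DID", "action:open-settings"),
    ("user:bob", "READ", "doc:add-integration"),
    ("user:cat", "READ", "doc:add-integration"),
    ("user:cat", "READ", "doc:export-csv"),
    ("user:dan", "DID", "action:export"),
    ("user:dan", "READ", "doc:export-csv"),
]

out_edges = defaultdict(list)  # (source node, relationship) -> target nodes
in_edges = defaultdict(list)   # (target node, relationship) -> source nodes
for src, rel, dst in EDGES:
    out_edges[(src, rel)].append(dst)
    in_edges[(dst, rel)].append(src)


def recommend_docs(user):
    """Walk user -> action <- other user -> doc and rank docs by number of paths."""
    already_read = set(out_edges[(user, "READ")])
    scores = Counter()
    for action in out_edges[(user, "DID")]:
        for other in in_edges[(action, "DID")]:
            if other == user:
                continue
            for doc in out_edges[(other, "READ")]:
                if doc not in already_read:
                    scores[doc] += 1
    return [doc for doc, _ in scores.most_common()]


print(recommend_docs("user:ann"))
```

Each hop in the loop corresponds to following one relationship in the graph, which is the kind of query a graph database is designed to answer without the join bookkeeping.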
RVB: 09:04.602 So where are you guys with this? How far along that path are you? I know you've done some prototyping already, right? 
KVT: 09:11.567 Yeah. So we are very, very early. So our main business right now is developer portals. So two years ago we started working-- well, a year and a half ago we started working with Apigee, that's now part of Google, and they have a developer portal that we are customising for their customers. And we built this whole business around documentation, specifically about APIs, so that's where our core focus is right now. And so the AI and personalised documentation is something that we're doing research on. So the thing we've done currently is we've built a connector for Drupal for Neo - I did a talk about that at FOSDEM - and that was--
RVB: 10:02.991 I went to that one, yeah. 
KVT: 10:04.291 Yeah. So that talk was not just about this use case. It was about what could you do if you combine a CMS and a graph database and looking at it from an added-value perspective, rather than a replacement perspective. Because I know that in the DO community people are like, "Just get rid of the stupid SQL databases [laughter]. They're worthless and graph databases can do everything so much better." I think--
RVB: 10:37.056 That's a pipe dream in my opinion. 
KVT: 10:38.368 Probably. You could build a CMS graph database, and I think that could work. But I think that there's so much existing technology already where it's a large amount of extensions and huge communities that it would make more sense to create an add-on instead of a replacement because if you replace it then you have to rewrite everything. 
RVB: 11:05.120 I couldn't agree more. 
KVT: 11:06.124 Yeah. So that's why I think there's a sweet spot for Neo in the CMS community, but I think there's two facets to this. One is the sweet spot for Neo in the CMS community, and that could be recommendation and pattern finding. But then there's also the inverse that you could think about and that's what if you were to put an open source CMS like Drupal in front of a graph database and use it as an interface to manipulate the graph and to add, maybe, some structured objects into your graph? And then use the CMS to build reports about those objects and the graph to find out which ones you're going to put into your reports. So that's what my talk was about.
RVB: 11:57.922 Well, you've already touched on my last question, which is what does the future hold [laughter]? What could we do in the future? And I know that we'll be doing some meet-ups together and I'm really looking forward to those, but where does this go, Kristof? What's in your crystal ball? 
KVT: 12:21.174 So I love thinking about the future. I really love Kevin Kelly's book, The Inevitable. And in that book, he talked about-- I think this is the basic pattern that got me thinking about this, also. He talks about flowing and it's a very, very interesting concept that we're moving from an Internet where we used to have documents to an Internet where we have pages today, where we'll have flows of information tomorrow. And this idea of going from having an object that's structured and it has a context-- has a manual context, or a book context, or a document context where you put all the information in context of the rest of the book into a very rigid structure. That's how we used to do things. That's how books and manuals were built, even when printing press-- even before the printing press was invented. And what the Internet has been doing, and what search engines have been doing, is that we've been moving towards pages where you can just dive into any object-- sorry, any document, any book, and just find out one page where a certain concept is explained. So you can just jump in. You don't have to read the whole book to be able to understand something. And that's where we are today. But I think the next step in this process, and it's also what Kevin Kelly talks about, is flows, where you have a flow of information that's much more personalised, and we're just constantly dipping in and out of these information flows around us that are serving us the documentation that we need at a certain time to be able to do what we need to do and that are aware of our contexts so that we don't have to adjust to the context of the documentation, but the documentation adjusts to our own personal context, and I think-- yeah?
RVB: 14:31.872 So what I'm hearing is you see this graph database integration and everything that you guys are building as a means to that end, to get there somewhere, somehow, to get closer to it. 

KVT: 14:44.560 Yeah. So we have a first customer where I've been talking about this concept, and-- they're a SaaS company. So what I imagine is that we could track users, the administrators, as they're interacting with the software, and then basically serve them the content this way where you look at their whole experience inside of your tool, and then you serve them the information they need to be able to interact better and get more value out of your system. So it's kind of like the idea-- the way that I describe it is going from the context of the manual to the context of the one, like the one person, one single user and how they are interacting with the system. This is very, very-- there's a lot of work to get here [laughter]. But I think that we can take baby steps, start with a first implementation. Start with building a graph of the behaviour and how people interact with documentation and with the tools that are documented by the documentation and then use that to start recommending content. And yeah, I'm really excited about it. We started a mailing list about it at one of the meet-ups where I was presenting. We actually had one of the people that worked on Clippy years and years ago at Microsoft who was also really excited about the idea. Because I think this is actually what Clippy wanted to do, or wanted to be, but it was not possible. And I think that graph databases could be the piece of technology that enables the dream of Clippy [laughter].
RVB: 16:40.452 Well, I think on that bombshell [laughter], I think that's a great time to kind of wrap up this podcast. Thank you so much for coming online, Kristof, and we'll be publishing some more details around your work and also the talks that you've been doing with the transcription of the podcast so people can read up about it. And I look forward to seeing you at one of our meet-ups, right? Because we'll be doing some community work together in the next couple of months as well. So really looking forward to that. 
KVT: 17:12.653 Likewise. 
RVB: 17:13.552 Thank you so much. Have a nice day, Kristof. 
KVT: 17:16.253 Yeah, you too. 
RVB: 17:16.990 Bye. 
KVT: 17:17.383 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
