However, the downside of all this fun has been that I have really not had the time or inclination to publish more podcast episodes. In fact, I have to apologize to the guest on this episode that I am publishing today, the super-smart and fun Chief Software Engineer of Tom Sawyer Software, Kevin Madden - because I actually recorded this episode back in June already!!! Seems like an eternity ago - but at the end of June I was just really running short on time, did not find it possible to publish the interview then, and then... summer sunshine got in the way.
But hey - better late than never! So here's a great interview with Kevin - as you would expect, he has many great and interesting perspectives (pun intended!!!).
Here's the transcript of our conversation:
RVB: 00:02.650 Hello, everyone. My name is Rik, Rik from Neo Technology. And here I am again recording another podcast episode, and this one I've been looking forward to for quite some time. We're going to have a chat with Kevin Madden. Kevin is the chief architect of Tom Sawyer Software, right, Kevin?
KM: 00:19.972 Yes. I'm the chief software engineer at Tom Sawyer Software.
RVB: 00:23.260 So great to have you on the call. Thank you for making the time. Kevin, why don't you introduce yourself to our listeners? Who are you, what do you do, and what's your relationship to the wonderful world of graphs?
KM: 00:36.973 My name is Kevin Madden. I'm the chief software engineer at Tom Sawyer Software, a California-based graph visualization company. I've been doing graph theory since the early '90s. My brother is Brendan Madden, and with my other brother, Patrick Madden, we started the company that revolved around graph layout and graph visualization. So I've been involved with graph visualizations since the early days of [crosstalk]. Yeah, so--
RVB: 01:15.717 Early '90s, you said? That's a long time [laughter].
KM: 01:17.179 Yeah. It is, it is.
RVB: 01:19.529 Wow, that's fantastic.
KM: 01:20.581 We had several toolkits that we released through the '90s, and we used to be embedded in larger applications like ERwin, for designing relational databases, and ER/Studio, another tool for building relational databases. And we mainly built network visualization platforms for the early days of networking systems.
RVB: 01:50.197 That's fantastic.
KM: 01:51.437 Yeah, that's pretty much how we got started. And that was basically the early days of networking and network visualization.
RVB: 01:57.816 I remember when I actually-- I think I used it at one of my first jobs in the mid-'90s. Yeah, sorry.
KM: 02:08.250 Yeah, it's funny that we use graphs to define relational databases. And we did UML modeling toolkits and all kinds of visualizations for software: class diagrams, ER diagrams, UML modeling; all those kinds of products where we ended up being embedded in much larger software architectures. So we were like an OEM, or original equipment manufacturer. So Tom Sawyer was considered like the secret sauce, and basically we would sign agreements that we wouldn't say that we were in the products. And so, most people never heard of Tom Sawyer. It was like, "I've never heard of you." And we've been in many enterprise deployments over the years at Hewlett Packard, IBM, Cisco, these kinds of companies. So, heavily involved in graph theory over the decades now.
RVB: 03:04.093 Interesting. So interesting. And as I understand it right now, you guys are really like a graph visualization software that you can plug onto Neo4j these days, right? You just say, "Here's your Neo4j server," point your software to it and start exploring the network. Is that correct?
KM: 03:22.035 We have a product we call Tom Sawyer Perspectives, which lets you build web applications without any code. You can connect to Neo4j and define multiple views of the same graph. So you might have a lot of data in the graph database that you don't want to see, and we allow you to provide Cypher queries, which we use to populate the web application. We've done many applications over the years with Neo4j, like for instance, the Panama Papers Visualization. We did the Cinea Apps. We've done crime network visualizations. Many, many different applications. A commodity flow application, which allows us to track the movement of industrial commodities across the United States. But the Panama Papers was, by far, our most ambitious endeavour with Neo4j. You can incrementally crawl the graph model, which allows us to do different types of graph analytics, centralities, and social network analysis using graphs extracted from Neo4j.
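To make the "centralities" Kevin mentions a little more concrete, here is a minimal sketch of degree centrality in plain Python. It assumes the relationships have already been extracted from the database as (source, target) pairs; the function name `degree_centrality` and the sample data are made up for illustration and are not part of Tom Sawyer Perspectives or Neo4j.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return each node's degree divided by (n - 1), treating edges as undirected."""
    neighbors = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue  # ignore self-loops
        neighbors[source].add(target)
        neighbors[target].add(source)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical sample: one intermediary linked to three entities.
edges = [
    ("intermediary", "entity_a"),
    ("intermediary", "entity_b"),
    ("intermediary", "entity_c"),
    ("entity_a", "officer_x"),
]
scores = degree_centrality(edges)
most_central = max(scores, key=scores.get)
print(most_central, scores[most_central])
```

In this toy graph the node with the most direct connections scores highest, which is the simplest way of flagging a hub worth drilling into.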
RVB: 04:43.587 Yeah. Well, I mean, I guess you know, right? Panama Papers has been such a game changer for Neo4j, but for our industry, I guess. Everyone knows about it. Don't they?
KM: 04:54.370 Yes. Panama Papers obviously was-- yes, political implications and tax implications. And it's an interesting subject because it ties in WikiLeaks and all these kinds of social phenomena that are going on around that whole release of a massive dataset for tracking the movement of people offshoring money and hiding money. And how do you use a graph database to decipher intelligence from something like that?
RVB: 05:26.426 That's interesting. So hey, Kevin, how did you guys get into this then? What's the story behind that? Why did you guys get into this in the first place? Is there a story there? Other than Mark Twain's?
KM: 05:40.197 There is a story there actually. I met Philip Rathle at a SenTech conference where we were demonstrating our Tom Sawyer Perspectives product. And Philip spent several hours in my booth, which was just a stand at that time. It was really small. And we had these very interesting discussions. And I had never heard of Neo4j at the time. I think it was maybe five years ago. It was very early days for Neo. And he was looking at some of the graph visualizations that we were producing in our layout platform. And he was highly intrigued because we had different types of analytics: reachability, shortest paths, the traversals. All the kind of stuff that you need to basically build up a visualization application on the Web. And he's like, "You guys should download and try out our Neo4j. You'll love it." So we started using the REST API, the Neo4j REST API, and we started integrating with Neo4j to get visualizations. So I wrote the first Neo4j connector into the Tom Sawyer platform. And so we can log in and get the object types and the property graph, and all that kind of stuff, and then be able to interactively visualize it on the Web outside of the Neo4j browser that comes by default. So it's a standalone application that gets extracted. And then we started doing joint marketing efforts in San Francisco. We started doing meet-ups and we would do joint visualization seminars and all sorts of stuff. We had business relationship meetings with Neo4j. Then we became a visualization solution provider for Neo4j. So we kind of have grown along with Neo4j.
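For readers who have not met the analytics Kevin lists here, the following is a small sketch of reachability and fewest-hops shortest path using breadth-first search in plain Python. The adjacency dictionary, the node names, and the function names are all hypothetical; this is a generic textbook traversal, not the Tom Sawyer or Neo4j implementation.

```python
from collections import deque


def reachable(adjacency, start):
    """Return the set of nodes that can be reached from start (including start)."""
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbor in adjacency.get(queue.popleft(), ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def shortest_path(adjacency, start, goal):
    """Return a fewest-hops path from start to goal as a list, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we arrived from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers from goal to start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical directed graph given as node -> list of outgoing neighbors.
adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "f": ["a"]}
print(sorted(reachable(adjacency, "a")))
print(shortest_path(adjacency, "a", "e"))
print(shortest_path(adjacency, "a", "f"))
```

Breadth-first search visits nodes in order of hop count, so the first time it touches the goal it has found a path with the fewest relationships, which is the usual meaning of "shortest path" on an unweighted graph.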
RVB: 07:34.450 Was there a particular use case that you guys are most focused on or most interested in in those early days? Or was it just more of a generic form of relationship?
KM: 07:48.225 We had done some spatial analysis, so we had-- no, it's more around marketing. Neo, in the early days, was open source. You can download the community edition and just have it up and running in a matter of minutes, and getting people to understand graphs. And so there's a lot of training around "What is a graph?" We would meet people and they'd be like, "I don't get it." And then you'd get these other people who'd be like, "Oh my God. This is what I've been searching for." And over the years we have worked with multiple graph database vendors, but none of them have gotten the market penetration that Neo has.
KM: 08:30.533 So we were more than happy to do these joint meetups because we were trying to sell our visualization solutions and inform the customers about the possibilities when you tie it to a large-scale graph database that can store these graphs in the billions and manage billions and trillions of relationships. And then at the end, you end up with a graph from your queries, so you still need a way to visualize on the client side the resulting graphs that come out of the queries that you are able to-- that's not said very well, but you still get graphs out. And so, you still have to be able to do some kind of client-side analytics on those graphs to be able to present them to the user in a powerful form.
RVB: 09:19.246 [crosstalk] Otherwise you get a big, fat hairball, right [laughter]? If you don't [inaudible] [laughter].
KM: 09:23.294 Yeah. You don't want the hairball. The hairball is like-- we found that when you tie together a very powerful visualization platform that enables the human mind to understand patterns that are in the graphs along with the graph analytics, you get a very powerful paradigm because the human can start to grasp the complexities of the interconnections of the data. And so it's a very powerful metaphor when you tie together animation, advanced graph layout, and analytics where you can partition the data and hide complexity. The idea is to hide complexity from the user to see how they get the result. And so when you reduce the complexity-- but there's a lot of hidden complexity that's being processed by the graph database and the visualization platforms, it allows abstractions to be created and high-level meaning and understanding within the data set.
RVB: 10:23.095 I think these [crosstalk]. Yeah, really, really true. And I've seen that a bunch of times in different use cases actually.
KM: 10:29.725 Yes. Yes, it's very important to be able to make sure that the user has a concept of context and that they have a drill point where they can start, and then they can follow along, like Hansel and Gretel and the breadcrumbs, to understand how they got to the answer.
RVB: 10:47.204 Yeah. Well, I mean, for Neo4j right now, there's a handful of really sweet-spot use cases where we see this happening in your cell phone, right? With things like fraud analytics, and user recommendations, and those types of things, or IT and network management. Is that kind of the same for you guys today? Is that where you guys are most active as well?
KM: 11:09.605 Actually, we're seeing most of our adoption in high-end engineering and on a global scale. We're seeing airplane manufacturing. We're seeing auto manufacturers. We're seeing big industrial corporations that have millions and millions of parts that go into products, and we're seeing that they want to be able to put in all their engineering design specifications, all the tracking data of where the parts are manufactured, who manufactured them. The whole industrial process is being pushed into the graph. And then the engineers on the design side want to look at them as graphs, what we call SysML or Systems Modeling Language. You can think of it as like a UML for complex systems engineering. Satellite manufacturers, rocket manufacturers, these kinds of things we're starting to see a lot of growth in. And drug interaction diagramming, proteomics, there's all sorts of different graph structures that we've been dealing with over the years that are starting to migrate their data into the graph database.
RVB: 12:23.972 Really cool. So where is this going then, Kevin? Where do you think our industry, but also maybe you guys and your product is moving towards? Any ideas about that crystal ball in the future?
KM: 12:41.111 Oh, I think the growth is-- I think we're just scratching the surface. I think you're starting to see a lot of the standard NoSQLs. We saw the big data boom and then the NoSQL, and all of this is migrating towards the graph database. So all those other platforms, you're seeing stuff like Spark, and you're starting to see the proliferation of Neo4j. And then when you couple these all together, I think there's going to be a big migration from relational into the graph database because it's easy to use and the tooling is getting a lot better. The security systems behind it are getting a lot stronger and the connectivity layers are getting very high-performance, the redundancy, all these kinds of things that were built into the relational database. So the things that made people go with the relational database are starting to get support in graph database land. And so we're really going to start seeing the adoption grow as it has proven itself as a technology. [crosstalk] We're very close to prime time, if not right now, we're ready for prime time. I mean, if you're running a Fortune 1000 company and you have to make tough decisions as CIO, you always go to your relational because it was proven that it was fault tolerant. It was secure. All these kinds of things are being addressed. Language standardizations, vendor support. All these kinds of things are critical when you're making these kinds of critical decisions as an enterprise, and I believe that all these things are lining up beautifully. And I think that we're just literally scratching the surface of what's going to happen in the later 21st century.
RVB: 14:34.511 I am going to look forward to that together with you, I guess. And this is, I guess, where we wrap up the podcast for now. We'll put a bunch of links to your software and stuff that you guys have been making in the transcription of the podcast so that people can find it easily. But for now, I think I'm going to thank you for coming online and spending your time with me, talking about the wonderful world of graphs and graph visualizations. And hopefully, we'll see each other at one of the future GraphConnects or another conference, right?
KM: 15:08.033 Great. Thank you, Rik. I appreciate it. I wish it could have been longer. I can go on and on.
RVB: 15:15.250 It's always something that I struggle with as well but I know that this is just to get a couple of pointers and this is a great way of doing it. So, thank you Kevin. I look forward to seeing you.
KM: 15:29.508 Okay, great, Rik. Thanks.
RVB: 15:30.525 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!
All the best