Tuesday, 19 September 2017

Podcast Interview with Chuck Calio, IBM

Last year at GraphConnect San Francisco, we had a great announcement: one of IBM's most senior leaders, Doug Balog, talked about what IBM was doing together with Neo4j to make the graph database perform like crazy on the POWER8 hardware platform:


Doug came on stage and talked to Emil and the audience about all the hard work that was going on there, and now, just before GraphConnect New York - it felt like the right time to check in with friends at IBM to talk about their work with Neo4j and how that might affect the Graph community. So we got Chuck Calio to spend some time with us on the podcast - and here's our chat:

Here's the transcript of our conversation:
RVB: 00:03.212 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo Technology. And it's been a long summer. It's been a really long summer and I've enjoyed it a lot, but it's time to get this podcast show on the road again. And so today I've invited and got a wonderful person on the other side of this Skype call from our dear friends at IBM, and that's Chuck Calio. Hi, Chuck. 
CC: 00:32.966 Hey, Rik! How we doing? 
RVB: 00:34.711 Good. 
CC: 00:34.668 Summer's come to an end and here we go with a podcast to kick things off in September. 
RVB: 00:40.038 Exactly. That's the way it is. And thank you for making the time, Chuck. I really do appreciate it. And as always, many people may know you but other people may not. So I'd like you to introduce yourself a little bit. Who are you, what do you do, and what's your relationship to the wonderful world of graphs? 
CC: 00:58.212 Okay. Thanks, Rik. So I'm Chuck Calio. I work in IBM. I'm based out of Poughkeepsie, New York, which is 100 kilometers north of New York City. And I'm actually the worldwide offering manager for IBM's Neo4j on the Power Hardware solution. And so my role is to do everything from develop new Neo4j on Power Systems hardware offerings, to help sell and market the solution. And spend a lot of time working with individual clients on responding to opportunities for Neo4j on Linux and Power, to help them understand Neo4j on Linux and Power, to guide them through the process of learning about graph and how graph would complement their existing relational database environment, and/or their other NoSQL environments, many of which I've deployed like the MongoDB or Redis. And now, they're looking at the exciting growth opportunities that graph and connected data also mean and present to them. So basically, my role is to really be the individual leading advocate for Neo4j on Linux and Power, and it's very easy because there's an incredible demand for Neo4j on Linux and Power. And so that makes my job easy. It's a fascinating job. I typically, on every day, will get requests from all over the world to do just about anything. So that's what makes it fun. I drive into work in the morning and I don't know what's going to happen, and I kind of enjoy that. I'm pretty well-- 
RVB: 02:25.884 Super cool. 
CC: 02:26.986 --[inaudible] kind of environment. So-- 
RVB: 02:29.614 So, Chuck, I mean, I think many people may not know exactly what we do together, right? I mean, Neo4j and IBM, we've been integrating those two environments quite a bit. And our chief scientist, Jim Webber, talked about it at GraphConnect San Francisco last year, and in London. But maybe it will be useful to kind of repeat that. What's the story there? What's the vision behind this integration between graph databases and Power? 
NOTE: some of the audio in this next section was unfortunately lost in recording. We have tried to represent/save as much as possible - but some parts of this next part of the conversation are sadly missing. 
CC: 02:59.535 Yeah, sure. So Neo4j runs on different types of hardware. And in particular, on the Power hardware, we started out with Neo4j a couple years ago, where we just basically ported Neo4j to Linux on the Power Systems hardware and sort of-- that gave a kind of a solution that would allow Neo4j to inherit the quality of service that the larger memory, more threads, faster CPUs, faster memory to CPU, and the basically better I/O of Power Systems. So that was sort of the first phase of the work. And then we did find that Neo4j and graph databases in particular respond very well to larger amounts of memory and faster bandwidth. So then we worked on further optimizations with Neo4j on the IBM POWER8 Systems [inaudible] accelerator [inaudible] also extend the memory. So we started out with just the basic [inaudible] solution, then we worked out on more optimized [inaudible] look at larger memory sizes to enable Neo4j to scale [inaudible] users and transactions and graph sizes and such. So that was sort of the second step in the process. Then the third step of the process is the actual hardware designers [inaudible] next generation of IBM Power Hardware which is called POWER9. That will come out starting in the fourth quarter of 2017, and then more in 2018. We actually had our electronic design engineering team actually start to use Neo4j to better optimize chip design and timing design. So based on that, we sort of had a next step beyond that, where we could actually do some hardware traces of the Neo4j software running on the IBM Power hardware, and now we're even identifying further enhancements to Neo4j software based on the traces that we did on the IBM POWER8 hardware, which is based on running Neo4j on IBM POWER8, which was designed with Neo4j. 
CC: 05:07.774 So we have a recursive kind of thing going on here. We have some incredibly valuable use cases that we're finding between the two companies, but more importantly, creating a kind of an innovation, a one plus one equals three solution that our clients will benefit from greatly going forward in the future. And that's the most important thing to me. 
RVB: 05:28.047 That's so cool. Are there any kind of indicative advantages? Like quantitative results in terms of what types of systems we can deploy on POWER8 using Neo4j? Have we done some tests there? 
CC: 05:45.127 Yeah. We do. We have some performance data. So we're typically 80% better price performance than our competitors. And we enable up to 56 terabytes of either RAM and/or near RAM speed memory. So you can have very, very large graphs all on memory with Neo4j on Linux and Power. It's a very unique part of our solution that makes it very, very scalable. And very large clients appreciate that part of the solution. And so going forward, we're looking at further ways to optimize the transfer rates between memory and the CPU. And further looking at exploiting accelerator technologies. Because in general purpose hardware the advancements are being slowed a little bit due to Moore's law's limitations and such. But the big thing happening in hardware nowadays is to exploit accelerators, both GPUs and FPGAs. And we're more aligned with the CAPI technology that we have at IBM, which is essentially larger memory in an FPGA, and working closely with Neo4j's engineering team around trying to see if some of the algorithms that Neo4j uses today that are used a lot by their clients can benefit from really deeper optimization and acceleration. So the thing going on in hardware now is really all about accelerators, both GPUs and FPGAs. And you need these for the really heavy duty use cases, like AI, machine learning, deep learning, graph, other areas that benefit greatly from it. So really exciting stuff. We're finding hardware does matter in some of these newer growth solutions that really challenge the hardware much more robustly than the traditional relational database models which-- 
RVB: 07:29.064 It's a little bit like what Jim Webber was saying at GraphConnect. Basically, this new kind of a wave where software engineers and hardware engineers are going to have to work together much more intimately in order to get these really big workloads to perform, right? The collaboration between POWER8 and Neo4j seems to be in line with that, I guess. 
CC: 07:58.083 Yeah, that's absolutely right. I think the very specialized expertise and really hardware that steps up and runs certain workloads in use cases much, much better, orders of magnitude better, than just general purpose hardware I think is a very common approach. And a lot of the growth solutions-- in particular a lot of the growth solutions in the analytics including artificial intelligence, machine learning and deep learning, that specific area where I would put graph and Neo4j into seems to benefit greatly from the latest levels of hardware and accelerated hardware. So really exciting area to work on. It's really stepping up to meet the big challenges that our clients have of us. Of course price is very important and always does matter, so that's also something we have to keep our eyes on. 
RVB: 08:49.563 Totally. Hey, Chuck, and so where is this going, you think? What does the future hold, you know? Do you see this accelerating in the next couple of years, or what's in store for POWER9 and maybe beyond that and [crosstalk] databases like ours? 
CC: 09:06.731 Yeah, I think we're going to continue to see the interfacing and the interplay between graph and artificial intelligence, machine learning, and deep learning. I think that's a given. I think that that's an important area that we see an expansion on. I think super advanced cyber security solutions is also an area that we're both really interested in and focusing on. So those kinds of things I think are where I see it going. The other thing I would like to mention is expansion to areas like the Asian markets, China in particular, Japan, countries like that. We're seeing a lot of big step up in interest from those areas. Quite a big recent increase in Neo4j on Linux and on Power in the Asian market. So that's another trend I'd like to mention, which I'm very, very pleased with. And then inside of IBM, I feel the direction now is a natural expansion to areas beyond my subject matter of expertise. So for example, the IBM Watson cognitive team is now using Neo4j. Like I said before, our cyber security teams, other hardware teams are looking closely at Neo4j. And I anticipate that many other areas, including potentially software development teams inside of IBM, would also look at using Neo4j and use cases around software development and graph to identify areas to review its defects, to increase productivity, to be more agile and that kind of stuff, so.
RVB: 10:29.003 Lots of stuff happening. It's a very exciting time. And I guess we'll be hearing a lot more of that at GraphConnect New York, right? I'm assuming you'll be there.
CC: 10:37.579 Yes. I will be a featured speaker. And you know what I'd like to see is anybody who wants to come on over and meet me, meet us at the IBM booth. We're a gold sponsor of GraphConnect, and we are very pleased to be at the New York City event. New York City is absolutely a wonderful town to come and visit. Please come and see us. We will be there. A number of my colleagues from IBM will be there with a diverse set of background. I think you'll find it fascinating to stop by. Meet us. Like I said, we are a gold sponsor and we'll have a booth and some feature sessions. And please stop on by in GraphConnect New York City. We'd love to see you. It's a great town as well, so. And I'm sure the conference will be absolutely fabulous in terms of a broad variety of speakers with a lot of subject matter expertise, skills, ability and experience. Yeah, really ranging from smaller start ups, up to the biggest firms showcasing how Neo4j is bringing value to them. 
RVB: 11:34.526 That's it. And I think that's what we're all looking forward to. I'll be there as well for a full week. So I'm looking forward to meeting you there face to face, Chuck. And I'll continue the conversation then.
CC: 11:49.546 Very good. Very good. 
RVB: 11:50.550 Thank you so much for coming online. It was great talking to you and I'll see you in about a month. 
CC: 11:55.724 Okay. Thank you. See you soon. 
RVB: 11:57.541 Thank you. See you soon. Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik

Tuesday, 5 September 2017

Podcast Interview with Kevin Madden, Tom Sawyer Software

OMG has summer flown by. It has been a fantastic season over here in Europe, with lots of great family time and lovely trips to different destinations across Europe - I had a blast.

However, the downside of all this fun has been that I have really not had the time or inclination to publish more podcast episodes. In fact, I have to apologize to the guest on this episode that I am publishing today, the super-smart and fun Chief Software Engineer of Tom Sawyer Software, Kevin Madden - because I actually recorded this episode back in June already!!! Seems like an eternity ago - but at the end of June I was just really running short on time, did not find it possible to publish the interview then, and then... summer sunshine got in the way.

But hey - better late than never! So here's a great interview with Kevin - as you would expect, he has many great and interesting perspectives (pun intended!!!). Here's our chat:



Wednesday, 14 June 2017

Podcast Interview with Sébastien Heymann, Linkurious

As I am coming up on my 5th anniversary working for Neo4j, I am increasingly happy, proud and thankful for the journey that we had - and the many great people that I have met along the way. One of these people is FINALLY appearing on this podcast, and has a history with this blog ever since the VERY first article that I wrote in January 2013: in this article, I showed folks how to load the Belgian Beer Graph into Neo4j using a tool that was actually not intended for this use: Gephi. Many beer (related article)-s later, I am now finally talking to Sébastien Heymann, founder and CEO of Linkurio.us, and one of the main people behind Gephi at the time. Here we go:



Thursday, 1 June 2017

Podcast Interview with Steven Baker, Neo Technology

About a year and a half ago, at some meeting in Malmö at Neo's Swedish HQ, I bumped into a new colleague there who was... a little different than most Swedes. He was a bit louder, a bit more outspoken, unapologetically sarcastic, VERY funny - and a big fan of all kinds of (Belgian and other) beers. So we started talking - over a beer. I think it's fair to say we hit it off - and I got to know this Canadian-gone-Swedish guy a bit more, talked more, drank more, ... and decided to invite him to talk about some of the interesting stuff that he works and worked on for our podcast. His name? Señor Baker. Senior Baker. Srbaker. Also known as Steven Baker - working for Neo. Here's our chat:


Wednesday, 17 May 2017

Podcast Interview with Darko Križić, Prodyna

Another stupidly late podcast publication on my behalf. Sometime in early March (yes, I KNOW - dammit!!!) I had a great conversation with one of our prime Neo4j partners in Germany and across Europe these days, called Prodyna. We did a couple of events together, and I found that some of their thinking and case studies really aligned very well with my own. So we got together for a chat. It's a bit annoying because both of us were referring to and looking forward to GraphConnect - and I clearly missed that deadline/timeline. But I still wanted to share the conversation... Here it is:
As per usual, here's the transcript of our conversation:
RVB: 00:02.689 Hello, everyone. My name is Rik, Rik van Bruggen from Neo Technology, and here we are again recording another podcast, a little bit closer to home. It's actually a really special podcast for me because it's exactly two years ago since we started it on request or instigation of my dear friend Michael Hunger. And this week we've invited someone from Germany in order to talk a little bit about the wonderful things that PRODYNA is doing with Neo4j. And that's Darko Križić from PRODYNA. Hey Darko, how are you?

Tuesday, 9 May 2017

Part 2/2: looking at the Web of Belgian Public Companies in Neo4j

Yesterday, I published part 1 of this short little blogpost on how we could load the dataset of a great newspaper article in De Tijd (our local financial/economic newspaper) into Neo4j. Of course, the whole point of that loading process (all of which is easily copied from github, btw) is to be able to do some additional querying on the dataset - just because we can :) ... So let's do some simple queries here, and then you can of course explore this some more yourself!

Start with some simple queries

In the article above, one of the key figures in the web of public companies is Luc Bertrand, the CEO of Ackermans & Van Haaren - a former dredging company that turned into a holding company. Let's explore the network around him - by walking the paths from his node for up to three hops.
//network around Luc Bertrand
match path = (m:Male)-[r*..3]-(n)
where m.name contains "Bertrand"
return path
That query gives us a nice little graph that we can explore.
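To make the "three hops" part a bit more concrete: the `*..3` in the pattern asks for every path of one, two or three relationships starting at the matched node, in either direction, without reusing a relationship inside a single path. Below is a minimal Python sketch of that idea on a tiny toy graph. The node and relationship names (`CompanyA`, `DirectorB`, `r1`, ...) are made up for illustration and do not come from the De Tijd dataset, and this is only a sketch of the traversal idea, not how Neo4j executes the query internally.

```python
# Hypothetical toy graph: (node, relationship id, node) triples.
# All names except "Bertrand" are invented for this illustration.
EDGES = [
    ("Bertrand", "r1", "CompanyA"),
    ("CompanyA", "r2", "DirectorB"),
    ("DirectorB", "r3", "CompanyC"),
    ("CompanyC", "r4", "DirectorD"),
]


def build_adjacency(edges):
    """Index every relationship from both ends, since -[r]- ignores direction."""
    adjacency = {}
    for a, rel, b in edges:
        adjacency.setdefault(a, []).append((rel, b))
        adjacency.setdefault(b, []).append((rel, a))
    return adjacency


def paths_up_to(adjacency, start, max_hops):
    """Return every path of 1..max_hops relationships starting at `start`.

    A relationship is never used twice within the same path.
    """
    results = []

    def walk(node, nodes_so_far, used_rels):
        if len(used_rels) == max_hops:
            return  # hop limit reached, stop expanding this path
        for rel, neighbour in adjacency.get(node, []):
            if rel in used_rels:
                continue
            extended = nodes_so_far + [neighbour]
            results.append(extended)
            walk(neighbour, extended, used_rels | {rel})

    walk(start, [start], frozenset())
    return results


if __name__ == "__main__":
    for path in paths_up_to(build_adjacency(EDGES), "Bertrand", 3):
        print(" - ".join(path))
```

In this toy graph `DirectorD` sits four hops away from the start node, so it never shows up in the result. That is the same cut-off that the hop limit applies in the Cypher query above.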




Monday, 8 May 2017

Part 1/2: looking at the Web of Belgian Public Companies in Neo4j

Just a few days ago I came across an interesting article in Belgium's premier economic newspaper (De Tijd, the local equivalent of the Financial Times or the Wall Street Journal), which you can find over here:

The title of the article is "The Spider's web of publicly traded Belgium", referring to the web of companies, CEOs, chairmen and directors for the 126 public companies that Belgium still has.