Thursday, 12 November 2020

Graphistania 2.0 - Episode 10 - This Month in Neo4j

Hi everyone

Hope you are all well, keeping safe, and finding some time to relax and enjoy life in this wonderful rollercoaster that is 2020. Think of it this way - we will never forget this ride, EVAH! 

As you can imagine, things have been evolving at warp speed in the wonderful world of graphs as well. So me and my partner in crime Stefan had another chat about all the things we have seen pop up, mostly through the awesome This Week in Neo4j (Twin4J) newsletter. Here's the chat we recorded:

Here's the transcript of our conversation:

RVB:00:00:01.448 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and here I am again recording another episode of our Graphistania Neo4j podcast. Wonderful time of the day to start with this type of conversation because I have my dear friend, Stefan, on the other side of this call. Hi, Stefan. How are you?
SW:00:00:21.484 Hi, Rik, and hi, all of the rest of you. I'm doing good. Autumn is coming up, or winter I would say here in Stockholm, but this is also one of the highlights of the month, right, getting together, talking about all the magical things that are happening within the graph community. So yeah, super excited.
RVB:00:00:42.412 Yeah, absolutely. Well, I mean, we've prepared a couple of things here to talk about, and obviously we'll share all of that in the transcription, but maybe we'll start a little bit with some of, let's call it, the little bit the corporate-y things that we have seen happen in the graph community.
SW:00:00:56.382 Oh, corporate.
RVB:00:00:59.201 I mean, obviously, those are the things that we work on as well on a day to day basis at Neo4j. I was really excited by the Nodes conference that we had two weeks ago. I'm assuming you attended as well.
SW:00:01:11.672 This is now where I'm going to get fired. I was actually working with some clients, so I had a very hard time. I did a double delivery, so one in [inaudible] time and one in West Coast time. So I was completely booked. But for me also one of the good parts-- and this is also the perk of working within Neo, that we kind of recorded a lot of it, or recorded all of it, to the point of what we can share, but the amount of amazing content-- I think it's kind of mind-blowing. [crosstalk]--
RVB:00:01:45.579 Six days of content. Six days of content.
SW:00:01:48.857 This is fucking-- I don't know. I said fucking. Sorry, I was so excited. Maybe we're going to need to blur that out. No, but it's like the Woodstock of graphs without any drugs, right? It's a huge thing, and I think it's so cool to see how this is kind of put together from all parts of the world. Young, old, use cases, people are just coming together. I think it's such a beautiful thing to really see how this is kind of exploding almost [crosstalk].
RVB:00:02:18.637 I mean, I introduce it as a corporate thing obviously because it's organized by Neo4j, but I think at least 80% of the contributions have nothing to do with Neo4j as a company. It's the community that's driving all the content. So it's really kind of special. We've also--
SW:00:02:37.180 It's a network, right? Yeah?
RVB:00:02:39.514 Yeah, sorry. No, we also had some really interesting stuff happening. A new release of the Graph Data Science Library. You're familiar with that, right? It's kind of so new, so innovative. I've never seen anything like it with the graph embeddings and stuff like that. What do you think of it?
SW:00:03:00.522 Yeah. Yeah. No, but all these embeddings and stuff, and I think, for me, I have been fortunate enough to put together a sprint working with this with our GDS team. And every time we go in there, I'm kind of mind-blown by the result. I usually have this saying-- I tend to talk a lot to the business people and you know how-- sorry, business people, but sometimes you actually just go like, "And then I sprinkle some AI on top of this, and then I have magical results," right? I always say that that sprinkle AI is somebody taking your job, right? Why would I ever want to have somebody saying sprinkle? There's always a person behind it.
SW:00:03:39.900 But with these embeddings, there was a moment when me and ?? kind of tried it out, and first we did it using only the algorithms and not embeddings. And the results were extremely cool. We did customer segmentation and then recommendation. And then we did one using embeddings and doing centralities and similarities with that. That result was so much better. But the tricky part was that we couldn't really backtrack what happened. We knew that the result was better, but it was very hard to kind of explain. So it was actually a little bit of that magical black-box feeling, which is always cool when we enter into this kind of sphere or the place of unknown. Of course, this will be known for us in the future. We'll kind of have the mental model to accept it and understand it. Yeah, it's super, super cool. I'm losing track of words, and I guess that's a first, so yeah.
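
For readers who want a feel for what "doing similarities with embeddings" can look like, here is a minimal sketch. It is not the Graph Data Science Library API and not the code used in the sprint Stefan mentions: the customer names and the three-dimensional vectors are invented for illustration, and the embeddings are assumed to have been computed already by some embedding algorithm. The only technique shown is plain cosine similarity followed by a nearest-neighbour lookup.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(target, embeddings, top_k=2):
    """Return the top_k other keys ranked by similarity to `target`."""
    scores = [
        (other, cosine_similarity(embeddings[target], vector))
        for other, vector in embeddings.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical, hand-made embeddings: one vector per customer node.
embeddings = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.8, 0.2, 0.1],
    "carol": [0.0, 0.9, 0.4],
    "dave": [0.1, 0.8, 0.5],
}

if __name__ == "__main__":
    for name, score in most_similar("alice", embeddings):
        print(f"{name}: {score:.3f}")
```

The sketch also hints at the "black box" feeling from the conversation: the ranking is easy to compute, but the individual vector dimensions carry no obvious business meaning, so explaining why two customers end up close together is much harder than computing that they do.
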
RVB:00:04:37.056 Well, I mean, people can read about it a little bit more on the website, and there's a new Graph Academy course about it as well, so there's some really cool stuff that we can refer to. The other thing that I'm quite proud of and-- yeah, we talked about it on the previous episode of this podcast, which is the little book that me and Jim Webber created, which is-- I don't want to repeat myself too much, but I think it's kind of special because it's a sign of the times in many ways, right? We're expanding into a mass market where lots of multiple people and additional people are going to be exposed to graphs, and I think that's super, super cool. That's a really big evolution for all of us.
SW:00:05:24.033 Yeah, I mean, this is one of those famous books that when you're trying to catch up on things you're like-- and I had no idea. You went and bought this, right? I have so many memories of this series like for dummies, right? So I think, again, as you're saying, it's super cool with these super forefront niche things, but what I really see and what I think is so cool coming back to the community is that now it's becoming a paradigm shift. Now, this is a democratization of graphs and graph data science for the masses, right? And this is where I think we now start to see the big chunk of the bell curve, right, the mass coming. So it's super cool and super good that you did a book. I'm going to force you to sign it for me and put it in my bookshelf with all of the other ones. I have some really embarrassing ones here for dummies, but I will not put it on here. But when you visit Stockholm, surely you will see it.
RVB:00:06:25.224 Super cool. Super cool. Stefan, then let's talk about a couple of those like super interesting cases that we saw, right? I mean, the big one for me was-- I think it was end of September, early October, when the ICIJ, The International Consortium of Investigative Journalists, they released the FinCEN files. Have you taken a look at those?
SW:00:06:48.553 Yes, yes, yes, yes. It's once again an example of people putting their hands in the cookie jar where they did not belong or hiding things, right? And I think, again, coming back to it, this idea that-- before, we didn't really see that this happens. And then a lot of us go in good faith, but then these ideas and this work of ICIJ and [inaudible] enabling that to uncover or demystify these kinds of shady corners-- and I think this is also one of the things that I'm really proud of working at a company like Neo because being part of kind of unfolding and putting the spotlight on these things makes me really proud.
SW:00:07:35.687 And of course this is also like a shift because people have been able to kind of hide and trick systems and people I guess because of the limitations of those systems, right, but now with graphs, that's kind of changing. Of course, criminal activity will also change, but I think at least we're clearing up this space, and I feel really proud about it.
RVB:00:08:00.458 Me too. I mean, I think it's one of the coolest things about Neo4j is that-- obviously, it's a commercial company, and it's an open-source movement. Those are already two really, really cool things being successful as we are. But then you see these use cases like Panama Papers, Paradise Papers, and now the FinCEN files where you feel the impact of this stuff is just way broader than the technical open-source community or the business reality of a company. This is society. This is us effecting change at a much bigger scale, and I think that just energizes me. It's super cool to see that.
SW:00:08:44.932 Yeah, you did some stuff with Apache Zeppelin around this, right? I just saw it in our [inaudible]. I didn't have time to read it.
RVB:00:08:52.663 No, I mean, obviously, since I don't have a social life and people know this by now [laughter]-- I'm not allowed anymore anyway. But when the FinCEN files came out, I started tinkering with it myself. And I'd always known about this technology called Apache Zeppelin, which is a notebook-type data analysis and data wrangling infrastructure. And a couple of friends of ours in the community at Larus in Italy, they have created like a connector for Apache Zeppelin and Neo4j. So it's super easy to use that notebook paradigm where you go from query to the next query and you build up a story really analyzing the data. And so I applied that to the FinCEN files, which was a lot of fun to do. And you know what? It kept me off the streets. So everyone's happy.
SW:00:09:56.899 So you did survive yet another month. Good or bad? Nobody knows, right? But I think it was a great reading. I got it in at least one of my lists to catch up on, so. It looks very, very interesting, so.
RVB:00:10:12.542 I thought you would be talking about the Spotify playlist builder that Niels [crosstalk].
SW:00:10:16.665 Yeah, yeah, yeah. I'm just trying to play it cool. How it is. You don't want to spoil all of your good cards [crosstalk].
RVB:00:10:24.401 You must get excited about that.
SW:00:10:26.445 Yeah, yeah, yeah. Also like my background within music, DJing, producing, but also my other foot within I guess the business part and this kind of understanding personas because I think the way he did it is-- it is like a very, very smart recommendation, right, instead of viewing the traditional kind of buckets of things where-- if you do a client example, segmentation of customers, right? I would be a middle aged I guess - that's hard to say - man wearing a turtleneck. I don't got any hair left. I work with innovation, right? Then we have the stereotype of me, but it says very little about my actual behavior in different scenarios.
SW:00:11:15.425 Now, I think what's great about this is that it's the same kind of thing [inaudible] sorting out basically playlists and how songs are related to other songs based upon the actual similarities of the song itself rather than the artist name or the artist category it belongs to. So as one of those examples, I think Niels did an amazing job. I showed it to a lot of friends in the music companies here, in the famous Swedish one. I don't know if we do commercial stuff in here, but I'm used to Swedish television, and in that, you can't say the brand name, but you all know what company it was most likely.
SW:00:11:59.540 But I think it's such a cool thing. I think also the walkthrough with Niels is-- it's super neat. It's explainable, and I think this is one of the use cases that every single business that has any persons relating to it can have a use for. Just go copy it. Try it out. It's amazing. And also [crosstalk]--
RVB:00:12:23.984 I uploaded it to my own-- I applied it to my own favorites playlist in Spotify, and it's amazing what you get out of it. So this script that [inaudible] created basically reads from the Spotify API, and then it categorizes your bucket of favorites into all of these kinds of like sub-playlists, which is super cool to do. One of the things that I found out about that which I didn't know about is the valence score of Spotify songs. Have you ever heard of that?
SW:00:12:55.316 Yeah. Yeah, I heard about it. It's super cool.
RVB:00:12:57.336 It's the happiness score. I think it's fantastic. They give a happiness score to every song which you can use to kind of build playlists and do recommendations as well. So really, really cool. [crosstalk].
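
To illustrate the idea of splitting a bucket of favorites into sub-playlists by "happiness", here is a small sketch. It is not Niels's script, which by the description above works from richer song similarities, and it does not call the Spotify API: the track list is made up, and the two thresholds are arbitrary assumptions. The only thing taken from Spotify is that valence is a number between 0.0 and 1.0, with higher values meaning a more positive-sounding track.

```python
from collections import defaultdict

# Hypothetical thresholds; valence itself ranges from 0.0 to 1.0.
LOW_VALENCE = 0.33
HIGH_VALENCE = 0.66


def mood_bucket(valence):
    """Map a valence score to a coarse, made-up mood label."""
    if valence < LOW_VALENCE:
        return "mellow"
    if valence < HIGH_VALENCE:
        return "neutral"
    return "upbeat"


def split_into_sub_playlists(tracks):
    """Group track names into sub-playlists keyed by mood label."""
    playlists = defaultdict(list)
    for track in tracks:
        playlists[mood_bucket(track["valence"])].append(track["name"])
    return dict(playlists)


# Invented sample data standing in for a favorites playlist.
favorites = [
    {"name": "Track A", "valence": 0.12},
    {"name": "Track B", "valence": 0.81},
    {"name": "Track C", "valence": 0.47},
    {"name": "Track D", "valence": 0.93},
]

if __name__ == "__main__":
    for mood, names in split_into_sub_playlists(favorites).items():
        print(mood, names)
```

A single score with fixed cut-offs is the crudest possible version of this; grouping on several audio features at once would sit closer to the song-to-song similarity approach discussed in the conversation.
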
SW:00:13:11.241 So may I ask you, did you take any of the newly recommended-- or found or whatever-- created maybe is the word, or sorted-out playlists and then take those, and then make that playlist into a radio station, because that would then turn that into a nice recommendation into the recommendation of that and then backwards. That could be actually kind of fun and see where we end up. Maybe we're just playing too much with algorithms here, but it could also be finding those-- I think this is the beauty of these algorithms. It's not hard to kind of just recommend what you're supposed to listen to. I mean, that's a simple equation, right?
SW:00:13:53.442 But the one thing where this touches upon is these kind of cool things like I didn't even know that I would like to listen to this song. This is where it gets spooky. It's when you learn things you didn't know that you wanted to learn, and I think this is that whole kind of-- we talk about the Wikipedia sinkhole, the YouTube sinkhole, and now I think [inaudible] uncovered the Spotify sinkhole in one sense, right, which is the part that I know they are working a lot with because the more time you spend on there, the better it is for their business.
RVB:00:14:26.458 Exactly. Well, Stefan, there's a bunch of other use cases that we saw pop up. I'll put some of them on the transcription, but we want to keep these podcast recordings quite short-ish. So I think we'll wrap up here, and want to thank you for helping out and talking about all of this wonderful stuff with me, and I'll see you very soon.
SW:00:14:55.899 Yes. Bye-bye. Again, strangers of the internet, give me your recommendations of graphs and connect. I would love that because we're all a graph, right?
RVB:00:15:06.525 Exactly. All right. Thank you, Stefan. Have a nice day. Bye.
Subscribing to the podcast is easy: just add the RSS feed, find the show on Spotify, or add us in iTunes! Hope you'll enjoy it!

All the best

