Oh man - things are heating up in the graph space, and keeping me super busy. After announcing our Series D last week (read more over here) I barely found the chance to publish this interview with Evelina Gabasova about graphs, Star Wars and biotech. Listen or read the full interview below.
Here's the transcription of our conversation:
RVB: 00:03.805 Hello, everyone. My name is Rik - Rik Van Bruggen from Neo Technology, and here we are again. It's been a while. We're recording another episode of our Neo4j Graphistania podcast, and today I have a wonderful lady from the beautiful lands around Cambridge, on the other side of this Skype call. And that's Evelina Gabasova. Hello, Evelina.
EG: 00:26.929 Hello, Rik.
RVB: 00:27.991 Hey. Good to have you on the call. Thank you for making the time.
EG: 00:30.946 Thanks for having me.
RVB: 00:31.940 Yeah, fantastic. So, Evelina, I have learned from our conversations that you are a postdoc researcher at the University of Cambridge, but maybe you might want to introduce yourself a little bit and tell our listeners who you are and what's your relationship to the wonderful world of graphs.
EG: 00:49.666 Well, I originally started as a programmer and then I got interested more in machine learning, so I went on to do a PhD in machine learning actually, and statistics in mathematics. And now I'm working as a postdoc in biomedical research, so I don't have any biological background at all, but I'm a quantitative person and I help biologists analyse their data. So I work in like statistical genomics and bioinformatics at the moment. And I got interested in graphs because they are very useful in modeling quite a lot of biological phenomena because there are these protein-protein interaction networks, et cetera, and gene interactions. So graphs are a very natural way of modeling these kinds of things.
RVB: 01:33.283 Wow. You know this is a funny story because my first exposure to graphs was also about protein-protein interactions at University of Ghent here in Belgium. Is that metaproteomics? Is that the kind of field that you're talking about here?
EG: 01:46.540 I'm not working with proteins most of the time. I'm working on a bit lower level with the individual genes and different DNA variations, et cetera. But, still, they interact with each other, and the thing with biology is that it's a very multi-layered process and the different layers interact with each other, so it's extremely complex. I'm still-- my mind is exploding whenever I think about it, to be honest.
RVB: 02:12.786 Oh, my god [chuckles].
EG: 02:14.342 Because whenever you look closer it's just much more complex, and some aspects of it are very well modelled by graphs. Some are not, but we are just trying to integrate all the types of information that we have, and graphs are very helpful in that.
RVB: 02:32.241 Fantastic. And you're a big Star Wars fan, right? Because I read that [chuckles] GraphGist about Star Wars [laughter].
EG: 02:40.119 Yeah, I am a big Star Wars fan [laughter]. Yeah, that's one thing that I did. Social network-based analysis is very nice for just playing with things, so last year before Christmas when the new Star Wars movie was coming out, I just decided, "Okay, let's play with it a little," and I extracted social networks from all the scripts, all the movies, and then I just played with it. And it's a wonderful data set because you can understand what's happening there. Because sometimes when I look at biological data sets and see, okay, so this gene interacts with this gene without actually consulting quite a lot of literature and biologists, I have no idea what it means properly. But if I see that these two characters interact with each other, it makes sense because I've seen all the movies [laughter].
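[Editor's note: the interview does not describe how the social networks were extracted from the scripts. The following is only a minimal sketch under one possible assumption, namely that two characters "interact" whenever they both speak in the same scene. The scene data and the function names are hypothetical and made up for illustration.]

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each scene is the list of characters who speak in it.
scenes = [
    ["LUKE", "LEIA", "HAN"],
    ["LUKE", "OBI-WAN"],
    ["HAN", "LEIA"],
    ["LUKE", "LEIA", "HAN"],
]


def interaction_counts(scenes):
    """Count, for each unordered pair of characters, the scenes they share."""
    counts = Counter()
    for speakers in scenes:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for a, b in combinations(sorted(set(speakers)), 2):
            counts[(a, b)] += 1
    return counts


def degree(counts):
    """Number of distinct characters each character interacts with."""
    partners = {}
    for a, b in counts:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {name: len(others) for name, others in partners.items()}


counts = interaction_counts(scenes)
degrees = degree(counts)
```

[In this toy data the weighted edge between HAN and LEIA is the strongest, and LUKE has the most distinct partners, which is the kind of result that is easy to sanity-check if you have seen the movies.]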
RVB: 03:33.382 I won't ask you for your opinion about the last movie [laughter]. So what do you think is so nice about using graphs for these different fields? You know, what do you like about it? Why is it so interesting for you?
EG: 03:50.762 Well, I find graphs a very natural way of structuring information and a very nice way of analysing very complex data where I just know how-- maybe I don't know that much about the data, but I know something about interactions, and graphs are just great for that. [crosstalk] Also, if you are looking at a graph, it doesn't have to be like direct interaction. The interactions in a graph can mean whatever you decide they should mean and that's a very flexible framework for approaching complex problems.
RVB: 04:26.904 Is that something you encounter in biology a lot, or are you talking more about the social networking stuff or both?
EG: 04:33.988 No, I was talking more about biology, probably.
RVB: 04:36.439 Yeah, it's more like pathfinding to see if there is a path between different genes, for example. Does that sound like an example?
EG: 04:48.423 Yeah, for example, or it doesn't have to be like directly pathways, it can be like if genes are related. For example, we are also working with some colleagues on a system that does data mining on academic papers. And it can be, for example, if two genes are mentioned in the same paper, which is not a very direct interaction between them, but it tells me that they are probably related in some way.
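[Editor's note: a minimal sketch of the co-mention idea, not the system mentioned in the interview. It assumes each paper has already been reduced to the set of genes it mentions. The paper ids, the placeholder gene names and both function names are hypothetical. The breadth-first search shows the kind of "is there a path between two genes" question raised just before.]

```python
from collections import Counter, deque
from itertools import combinations

# Hypothetical toy data: paper id -> placeholder gene names mentioned in that paper.
papers = {
    "paper_1": {"GENE_A", "GENE_B"},
    "paper_2": {"GENE_A", "GENE_C"},
    "paper_3": {"GENE_C", "GENE_D"},
    "paper_4": {"GENE_A", "GENE_B"},
}


def co_mention_edges(papers):
    """Count how many papers mention each unordered pair of genes."""
    edges = Counter()
    for genes in papers.values():
        for a, b in combinations(sorted(genes), 2):
            edges[(a, b)] += 1
    return edges


def shortest_path(edges, start, goal):
    """Breadth-first search over the undirected co-mention graph."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two genes are not connected


edges = co_mention_edges(papers)
path = shortest_path(edges, "GENE_B", "GENE_D")
```

[Here GENE_B and GENE_D never appear in the same paper, yet a path links them through GENE_A and GENE_C. The edge weight, meaning the number of shared papers, is only a rough hint that two genes are related, not evidence of a direct interaction.]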
RVB: 05:15.520 Absolutely, yeah. Actually, I did a podcast recording a couple of months ago with someone from University of California who was writing about molecular interactions. And he called it HETnets. Really interesting. I'll look it up for you [crosstalk].
EG: 05:34.317 Yeah, that sounds very interesting. But the interactions can mean anything. It can be on the bio-- like, chemical level. It can be on physical interaction level. It can be on what we know about those genes, et cetera. And I like to play with social networks because that is a very interpretable way of dealing with networks. So, these are my hobby projects, and at work it's much more complex [laughter].
RVB: 06:05.747 Well graphs are everywhere [laughter], right?
EG: 06:08.607 Yeah, that's true.
RVB: 06:09.181 So, that's kind of the tagline of Neo4j and it's been-- it's so true, right? Once you get into it, it's almost impossible not to see things as a graph [laughter].
EG: 06:21.159 That's true [laughter].
RVB: 06:24.220 Do you have any plans for other use cases right now, Evelina?
EG: 06:29.661 Not at the moment because, well, the use cases that we are already working on with my colleagues are complex enough [chuckles] to be honest. So, we are playing with some like--
RVB: 06:39.938 Enough to keep you busy.
EG: 06:41.580 Yeah, definitely [laughter].
RVB: 06:44.227 I always say, "It keeps me off the streets" [laughter]. Exactly.
EG: 06:49.666 Yeah, but what I'm working on mostly is how to integrate the information from different layers in biological processes. So, I'm looking at like the very low-level gene level or like the DNA level, and if there are any changes there, and how does it integrate with the RNA changes that are in the cell and how does it integrate with protein changes, et cetera? So, these are many different levels--
RVB: 07:14.369 What's RNA? I don't know what RNA is.
EG: 07:17.103 Sorry?
RVB: 07:17.552 What's RNA? I have no idea what that is.
EG: 07:20.178 Oh sorry [chuckles]. It's like an intermediate product between the DNA and a protein. So, it's how the DNA is transcribed into RNA, and that is then changed into protein. So it's sort of like an intermediate product, and if you are looking at that, you can see what's actually happening in a cell in a specific moment, because it's telling you which genes are being actively changed into proteins.
RVB: 07:50.302 You know what?
EG: 07:50.913 Did it make sense?
RVB: 07:50.917 This is why I like doing these podcasts. I learn something every day, you know, it's [chuckles] very good, thank you [crosstalk].
EG: 08:00.830 Sorry, I just wanted to add something, that the DNA is basically a stable structure, the RNA tells you what's actually happening in a cell in a specific moment, and the protein level tells you basically what are those processes that were happening over some time ago, or over some longer time, because the proteins are just in the cell produced and then they are doing their roles.
RVB: 08:27.269 Interesting. So where do you think this is going, you know, both for you personally, Evelina, in your job or in your play time, but also looking at the IT industry - at the end of the day you're an IT professional - where do you think this is going, what does the future hold for the world of graphs?
EG: 08:50.072 Well, I think the future for our graphs is bright, because we have a lot of unstructured data and graphs are a great way to represent that. And it allows us to mine quite a lot of very complex data sets that would be impossible to structure in any other way, or maybe we don't have any other good way to structure those data sets. So, I think graphs will continue being quite successful in modeling, in quite a lot of domains, and for me personally, well, I hope I will get to play with graphs even more, because it's quite a lot of fun.
RVB: 09:27.334 Excellent.
EG: 09:28.508 At least in my free time analyzing some more movies, et cetera.
RVB: 09:33.443 Well, I look forward to seeing the results of that, and I wish you all the best for the professional use cases. I want to thank you for coming online, Evelina, it's been great talking to you. And I'm sure our listeners will enjoy listening to and reading about your story as well.
EG: 09:51.603 Thanks for having me.
RVB: 09:52.591 Thank you and have a nice day.
EG: 09:55.145 Bye, thank you.
RVB: 09:56.020 Bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!
All the best