Bruggen Blog: Podcast Interview with Javier de la Rosa, SylvaDB

Monday, 29 June 2015

Podcast Interview with Javier de la Rosa, SylvaDB

I recently had a great conversation with Javier de la Rosa. Javier is an active member of our community, and one of the driving forces behind the SylvaDB project. I first learned about Sylva when I had just started to work for Neo, and when I was trying to acquaint myself with the basic concepts in more detail. The project describes itself as follows:

Sylva ["silva", a book to organize knowledge during the Renaissance] is an easy-to-use, flexible, and scalable database management system that helps you collect, collaborate, visualize and query large data sets. In Sylva, all your data is connected using a graph, and you will see the connections all the way through.

And it really does deliver on that vision. It's really simple to use - even if you don't know anything about graphs, databases, query languages, etc. It's totally open source, and a great topic of conversation.

Here's the transcript of our conversation:

RVB: Hello everyone, my name is Rik Van Bruggen from Neo Technology, and here we are again doing an evening podcast recording all the way across the Atlantic. And my guest today is Javier de la Rosa, all the way from Ontario, in Canada. Hi Javier.

JDLR: Hello, how are you Rik?

RVB: I am very, very well. Thanks for coming on the podcast, really appreciate it.

JDLR: Thank you for inviting me actually.

RVB: Yeah, great. Javier, do you mind introducing yourself? I always ask the same question, but who are you and what's your relationship to the wonderful world of graph databases?

JDLR: Sure, no problem. So, as you said, my name is Javier de la Rosa. I'm originally from Spain. I have a background in computer science and artificial intelligence, but then I decided to switch to a rather different field. I moved to Canada, and I live in Ontario now, in the city of London, ironically, which is close to Toronto. Actually, I'm getting my PhD in Literature, working in a field called Digital Humanities. Because they have a lot of problems, we are now using graph databases to try to tackle all those different problems that they have.

RVB: That's interesting. So how did you get into graph databases, Javier, and why did you get into it?

JDLR: When I first came to the lab that I'm working for, I saw that they were all using databases somehow - let's say they were using Microsoft Access, or Excel, stuff like that. Then, because I have a background in computer science, I thought maybe it's a good idea to use something that allows them to model the problems more freely. Then I discovered neo4j, and I started working on my own REST client the first time that they released the REST endpoint. Then, on top of that, we decided to build a tool for all the humanities to actually use, and get all the power that neo4j has to provide.

RVB: So basically, you've written a bunch of tools on top of neo4j that allow actual language specialists, digital humanities specialists to do their jobs in a graphy kind of way. Is that what I'm hearing [chuckles]?

JDLR: Yes, that's exactly it. Because they usually do a lot of analysis. They also work with networks. For example, the typical example is social network analysis, but sometimes they have to work in an isolated environment, and they only have like the email to share the stuff, so I thought that maybe like a cloud-based solution for them would be better. So that's why I thought of neo4j in the first place.

RVB: Super cool. And the second question I always ask is why did you get into the graph? You've sort of started answering it, but what is it that makes it such a good fit for this digital humanities? What makes the graph and the graph database such a good fit?

JDLR: The main thing is that the world of humanities is such a mess. So they start working on a problem and then in two months they decide that they have to change the schema. And then a week later they have to change it again. Then again, and again, and again. So having something schema-free is actually the best solution for all of them, so that's why we decided to use something that allows us to have a flexible schema, or at least a schema-less model.

RVB: Can you give me an example of a domain that really benefited from that? Some kind of a project that you were able to solve using a graph database?

JDLR: Yeah. For example, we have a colleague now who is working on analyzing like 13 million books written in Spanish, and he has to model how the transformation of knowledge actually happened in the 17th and 18th centuries. So he started creating a schema, he started modeling the problem. And then, as the research advanced, he had to modify the schema several times to actually fit his data. So the thing is that instead of having your schema first and then trying to feed your data into the schema, you actually modify the schema as you need it. It was a natural option for us.
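
The idea of letting the schema follow the data can be illustrated with a minimal sketch. This is not SylvaDB or neo4j code: `TinyGraph`, the labels, the property names and the placeholder values are all hypothetical. It only shows that, in a property-graph style model, a new property or a new node type can appear without any migration step, and that the schema can be derived from whatever data is present.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical in-memory property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}  # node id -> {"label": str, "props": dict}
        self.edges = []  # (source id, relationship type, target id)

    def add_node(self, node_id, label, **props):
        # Any set of properties is accepted; nothing is declared up front.
        self.nodes[node_id] = {"label": label, "props": props}

    def add_edge(self, source, rel_type, target):
        self.edges.append((source, rel_type, target))

    def schema(self):
        """Derive the schema from the data: label -> property names seen so far."""
        seen = defaultdict(set)
        for node in self.nodes.values():
            seen[node["label"]].update(node["props"])
        return dict(seen)


graph = TinyGraph()

# First version of the model: books only have a title (placeholder data).
graph.add_node("book:1", "Book", title="Title A")
schema_before = graph.schema()

# Later in the research: a new property and a new node type are needed.
# Existing nodes are left untouched; no migration is run.
graph.add_node("book:2", "Book", title="Title B", printed_in="City X")
graph.add_node("printer:1", "Printer", name="Printer P")
graph.add_edge("printer:1", "PRINTED", "book:2")
schema_after = graph.schema()

print(schema_before)
print(schema_after)
```

In this sketch the first book simply lacks the `printed_in` property, which is the trade-off of such a model: code that reads the data has to tolerate missing properties.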

RVB: As I understand it, Javier, you've also done a lot of work to sort of put this into the hands of the researchers, right? That's what the SylvaDB is all about, isn't it?

JDLR: Exactly. Even if I love neo4j, and I really like the Cypher language, we have to acknowledge that it's not for everyone. If you're not a programmer or an analyst, it's hard to learn it, especially if you are from the humanities and all you have done in your life is just read books and do a lot of critical thinking. So we put SylvaDB on top of neo4j for them to get all the power that you can actually get from using neo4j.

RVB: So anyone can use it, right? I've registered, for example, and I was playing around with it, but-- so anyone can use this to create their own graph database?

JDLR: Yeah. It is free to use. We have a public website called sylvadb.com, so you can go there. But because we are running it as an academic thing, if you feel that it's not enough, you can actually go to GitHub, download all the code, and put it on your own machine, and that's good. That's good for us. It's a GPL license, so very good with that too.

RVB: Super cool, yeah. I mean that's the beauty of open source, right [chuckles]?

JDLR: Yeah.

RVB: Absolutely. Yeah.

JDLR: You have to contribute.

RVB: Exactly, yeah. Absolutely. So in terms of open source, one of the things is obviously that there are a lot of top people developing new stuff around the open source project. That brings me to my last question - where is it going? Where do you think or where do you want it to go? Or what does the future have in store for graph databases and maybe also for SylvaDB?

JDLR: As we see it, for example here in Canada, there is now a huge debate about whether the government should provide a nationwide infrastructure to support research tools. One of the ideas is to actually push for the government to have an instance of, or something close to, a graph database: massive graph database support for all the data sets that researchers are using. That's one thing, but that would take a long time to become a reality. In terms of our short-term goals, we want to create another tool in SylvaDB that allows you to create projections. That's now easily done using Cypher, but in SylvaDB everything is visual. You don't need any programming knowledge. We want the researchers to be able to say, "Okay, I have a book and an author and then a CD," and then create a new graph which is the result of projecting that relationship, which spans three different types, into one single relationship that only goes from one type to the other. These are usually called projections, but right now they are only available through Cypher.
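
A projection of the kind described here can be sketched in a few lines. Everything in this example is an assumption made for illustration: the relationship types `WROTE`, `BASED_ON` and `RELATED_TO`, the node identifiers, and the `project` helper are hypothetical and are not taken from SylvaDB. The Cypher string shows roughly how the same idea might be written against a graph that used those labels; it is not executed here.

```python
from itertools import product

# Hypothetical Cypher equivalent (labels and types are assumptions, not run here).
CYPHER_SKETCH = """
MATCH (a:Author)-[:WROTE]->(:Book)<-[:BASED_ON]-(c:CD)
MERGE (a)-[:RELATED_TO]->(c)
"""

# Hypothetical typed edges: (source, relationship type, target).
edges = [
    ("author:1", "WROTE", "book:1"),
    ("author:2", "WROTE", "book:1"),
    ("cd:1", "BASED_ON", "book:1"),
    ("author:1", "WROTE", "book:2"),
]


def project(edges, first_rel, second_rel, new_rel):
    """Collapse source -[first_rel]-> middle <-[second_rel]- target
    into a direct source -[new_rel]-> target relationship."""
    sources_by_middle = {}
    targets_by_middle = {}
    for source, rel, target in edges:
        if rel == first_rel:
            sources_by_middle.setdefault(target, set()).add(source)
        elif rel == second_rel:
            targets_by_middle.setdefault(target, set()).add(source)

    projected = set()
    for middle, sources in sources_by_middle.items():
        # Pair every source with every target that shares the same middle node.
        for s, t in product(sources, targets_by_middle.get(middle, ())):
            projected.add((s, new_rel, t))
    return sorted(projected)


author_cd = project(edges, "WROTE", "BASED_ON", "RELATED_TO")
print(author_cd)
```

The projected graph contains only authors and CDs; the books that connected them are dropped, so `book:2`, which has no CD attached, contributes nothing to the result.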

RVB: That sounds really interesting. And is that something that you think you'll be able to release in the next couple of months or is that long-term?

JDLR: Let's see. I'm finishing my thesis now, and I have to defend it, so let's see how much time I have for that. I would love to.

RVB: Okay, well very cool. Well, Javier, thank you so much for sharing that with us.

JDLR: Thank you.

RVB: I think it was very interesting and I'm sure you're going to play around a little bit more with SylvaDB, so I thank you for sharing that with the community as well. And yeah--

JDLR: Thank you very much.

RVB: I look forward to seeing you at one of the conferences, maybe at GraphConnect in October or something. That would be great.

JDLR: Sure, I will try.

RVB: Okay, thank you, Javier. Have a nice evening.

JDLR: Thank you very much, Rik.

RVB: Bye.

Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
