Friday 22 June 2018

Podcast Interview with Estelle Joubert, Dalhousie University

One of the coolest things about Neo4j is just the sheer breadth and diversity of applications that we see for connected data and graph databases out there. I think I have said it before, but it truly continues to baffle me. Very frequently, I will have a morning conversation with a user about battling financial fraud, a lunch conversation about using graphs in biotech to fight world hunger, and an afternoon conversation about real-time recommender systems in retail. And of course finish it off with a beergraph conversation in the evening :) ...

Really - it's just amazing. And the next podcast episode is a true testimony to that. I got to have a chat with a lovely lady all the way over in Canada recently, Estelle Joubert from Dalhousie University. She and her team have been using Neo4j in her amazing field of research, which is all about understanding how music and opera came to be what they are today in a historical perspective. She is best at explaining it herself - so here's our chat:


Here's the transcript of our conversation:
RVB:  00:01:20.209 Hello, everyone. My name is Rik, Rik Van Bruggen from Neo4j, and tonight I am joined by a guest on our podcast all the way from Canada, someone that has been working with, and experimenting with, Neo4j for quite some time in a very interesting domain that I hadn't heard of before. And that's Estelle Joubert from Dalhousie University. Hi, Estelle.

EJ:  00:01:50.299 Hi.
RVB:  00:01:51.214 Hey. Thank you for joining me. It's always nice when people make the time to do that. Thank you so much.
EJ:  00:01:58.246 I'm glad to chat.
RVB:  00:01:59.518 Fantastic. Estelle, I've been reading a little bit about your academic work, which is super interesting but, to be honest, a little bit out of my depth. So would you mind introducing yourself, and your field of research, and how you use Neo4j for that?
EJ:  00:02:21.410 Sure. So, yeah, my name is Estelle Joubert. I am a musicologist at Dalhousie University, and musicology is basically the study of music and culture. So we think in terms of music history, music theory, music culture. And, as a number of musicologists are also doing at the moment, I, too, am turning to technology to be able to answer some of the questions and to be able to pose new questions that musicologists haven't been able to answer in the past. My area of specialty includes opera and political theory, and also the question of opera in the musical canon, as well as music in the global, early modern period. So the project for which I'm using Neo4j is Opera and the Musical Canon, and this is basically the question of an imaginary body of works that emerged around 1800 called the musical canon, and it's a collection of the works by famous composers. So these are just the sort of list of masterpieces that everyone agreed upon were good pieces to study. And that sort of canon that emerged around 1800 tended to be focused around instrumental repertories. So we think of the-- or the symphonies of Haydn, of Mozart, of Beethoven, of Brahms. And opera didn't play a major role in that development, or so scholars have believed. So my research project asks the question of what role does opera play in that formation of this idea of the musical classics, this collection of classics. And, of course, we're dealing with visualizing operatic fame, and that's also the name of my Neo4j project. So, yeah.
RVB:  00:04:24.531 Wow. So how long have you been working on that Estelle? Has this been something that's been years in the works? Or how long have you been studying this?
EJ:  00:04:34.147 Not that many years. So the Opera and the Musical Canon project started in 2015, I believe, and the digital project has been underway since January in full swing.
RVB:  00:04:46.997 Wow. And then what are some of the tools and techniques that you use for that? Because you talked a little bit about the computational component to that and Neo4j's part of it, but what else is part of it, and how do you do that?
EJ:  00:05:01.149 Well, I can take you through a brief tour. First of all, we have a number of primary sources that we're working with, and these are old documents found in rare book rooms mostly in Europe but a few also in North America. So we work with opera scores, and we work with opera reviews, so old reviews published in the 18th century, publishers' catalogs, and then also calendars of performances. So we're trying to get a sense of what made opera famous back in the day. Is it the composer? Is it just the sheer number of performances? Is it having performances across a broad geographic realm? Is it music criticism even if there are no performances of this piece? So what makes an opera famous is really the question that we're trying to answer. Thus far, we have been working with a database in Airtable that is a relational database. The trouble with relational databases is that our questions are very limited, so I actually can't answer the question of what makes opera famous just with a relational database. So we have been importing things into Neo4j. And one of the great things about Neo4j is that you can change the data model every time we want to tweak results and tweak the question. So it is very flexible. It is allowing us to chart opera performances in given cities. I'm hoping that we'll be able to eventually map how opera was mobilized throughout a specific geographic realm and within a given time frame using Neo4j as well. And it allows us to focus on relationships. So if there was one music critic that reviewed a whole lot of operas, that relationship automatically shows up as being stronger as compared to a critic that was much less known and perhaps far away. So having a sense of which connections are strong and which connections are weak is actually crucial to our project.
RVB:  00:07:28.094 Wow. My next question was going to be why did you start using Neo4j, but I think you've just explained all of that, Estelle.
EJ:  00:07:35.465 Yeah. I think it--
RVB:  00:07:35.728 Yeah, it was really, really cool. But how did you get into it? How did you find Neo4j? Did you just bump into it somewhere, or how does that work?
EJ:  00:07:44.807 Well, I was looking at a number of sort of graph-based solutions, and a lot of academics tend to go with R. I wanted to go with a community that is large, that is flexible, and that is actually commercial as well [laughter]. So I decided to go with Neo4j.
RVB:  00:08:12.541 Wow. Very good. Very good. And has it been a good experience for you? Has it been working the way you wanted it to work or any major lessons learned through there?
EJ:  00:08:23.574 It has mostly been a really good experience. I have made it a requirement for my team members. And I have a post-doctoral fellow, a master's student working on a thesis on the project, an undergraduate research assistant, and a technical lead, and all four of them have had to learn some basic Neo4j. So they have done some of the basic training that's online through your website, actually, which I think is a really good experience for students these days. I think they should have more opportunity to gain digital skills. So far, it's been really good. The one thing that has tripped us up, I have to say, is the fact that there is no date type in Neo4j. And for a project that traces things from one day to the next, from a premiere of an opera to subsequent performances, that we've had to work around a little bit. But we have found some workarounds.
RVB:  00:09:26.014 Glad to hear that. I mean, it is coming. Next release you'll have a date type in Neo4j [laughter].
EJ:  00:09:31.051 Oh. Oh, that's fantastic. I'm excited.
RVB:  00:09:35.205 So it's been a very common request. And yes, there have been some good workarounds, but the next version Neo4j 3.4 will have a date type that you can use. So you should definitely take a look at that. Can I ask you a little bit about the future, Estelle? Where is this going for you? What's in store? And how do you see this evolve, both your own project and the graph space that you've been using? What's your view?
EJ:  00:10:05.631 Well, this project is actually a pilot project for a bigger research project that I would like to get underway possibly as early as the fall. And the bigger project would be to trace the dissemination and distribution of European music throughout the globe from about 1500 to 1800. So right now, we know that European music traveled all over the world via a whole lot of means, but we don't know where it went and what people did with it. We don't have a sense of those connections. And there's simply too much data for one researcher to really make sense of it, so it would be a team-based project, possibly with international collaborators as well. So that's where it's going. I forget the other part of the question.
RVB:  00:11:07.063 Well, the other part of the question is how do you look at the graph space in that context? Is that something where you would also use a graph you think or--?
EJ:  00:11:14.172 I think you have to use a graph because the focus is on relationships. It's not just the output of the sort of table in a graph, right, or a table in a database. So because it's focused on relationships and often asymmetrical relationships-- so when we're dealing with European explorers going to China or going to Indonesia, for example, those are asymmetrical relationships, often. And they need it-- and we need software to be able to accommodate that.
RVB:  00:11:49.967 Yup. Fantastic. Well, I mean, there's a lot of people in the humanities space and in humanities academics, I would say, that have been experimenting with Neo4j. And I suggest that I'll put some links to your research and to the other research, actually, in the transcription of this conversation so that people can find their way around and maybe get some more ideas for those future projects. Yeah?
EJ:  00:12:16.939 Sure. Thank you.
RVB:  00:12:18.030 Fantastic. Estelle, thank you so much for joining me. It's been a really interesting chat, and I really appreciate you making the time. And I wish you so much success in the future. It will be great to hear more about it in the future.
EJ:  00:12:34.091 Thanks very much.
RVB:  00:12:35.262 Thank you. Bye-bye.
EJ:  00:12:36.617 Bye-bye.
Subscribing to the podcast is easy: just add the RSS feed or add us in iTunes! Hope you'll enjoy it!

All the best

Rik
