Wednesday, 19 June 2013

FUN with FACEBOOK in NEO4J


Ever since Facebook promoted its “graph search” methodology, lots of people in our industry have been waking up to the fact that graphs are über-cool. Thanks to their powerful query possibilities, companies like Facebook, Twitter, LinkedIn and, let us not forget, Google have been providing us with some of the most amazing technologies. The power of the “social network” in particular is tempting many people to get their feet wet and start using graph technology. And they should: graph databases are fantastic at storing, querying and exploiting social structures.

So how would that really work? I am a curious, “want to know” but “not very technical” kind of guy, and I decided to get my hands dirty (again), and try some of this out by storing my own little part of Facebook - in neo4j. Without programming any kind of production-ready system - because I don’t know how - but with enough real world data to make us see what it would be like.

Give me my data!


The first step was to get access to my own Facebook data. Obviously there is the Facebook Graph API, but I hope that by now you realise that is just not my cup of tea. It sounded too exhausting :) … So I found myself a little tool that would allow me to download the data from Facebook in a workable format. Give me my data provides a number of different export options, but I chose the “Mutual Friends Network Graph”, as it would give me the most info about my actual social network.


After a few seconds, the app presented me with a CSV document that I could copy and paste into a text file:



And then, I was very quick to start using the neo4j and spreadsheets method to get data into neo4j: in this particular case I created a Google Spreadsheet.

In the spreadsheet I then used some spreadsheet wizardry to generate the Cypher statements that would allow us to import the data.
  • first we create the nodes:
    • make sure that all the people are uniquely named (using the “unique()” spreadsheet function)
    • create the cypher statements:
      • using the following spreadsheet formula
="create (n{name:'"&E2&"', type:'Facebook'});"
  • I ended up with a cypher statement that looks like this:
create (n{name:'Rik Van Bruggen', type:'Facebook'});

  • Once we have the nodes, we can then create the relationships between them:
    • Column A contains the start nodes
    • Column B contains the end nodes
    • the relationships are all of the type “IS_A_FRIEND_OF”
    • So we can create the cypher statements:
      • using the following spreadsheet formula
="start n1=node:node_auto_index(name='"&A2&"'),
n2=node:node_auto_index(name='"&B2&"') 
CREATE n1-[:IS_A_FRIEND_OF]->n2;"
  • I ended up with the following cypher statement:
start n1=node:node_auto_index(name='Pieter-Jan Boone'), 
n2=node:node_auto_index(name='Jason Goode') 
CREATE n1-[:IS_A_FRIEND_OF]->n2;
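
For those who would rather script this than use a spreadsheet, the steps above can be sketched in a few lines of Python. This is only a rough sketch and not the method used in this post: it assumes the export has already been read into a list of (friend, friend) name pairs, and the function names node_statement, relationship_statement and build_statements are made up for illustration. Like the spreadsheet formulas, it does no escaping, so a name containing a single quote would still need to be handled by hand.

```python
def node_statement(name):
    # Mirrors the node formula from the spreadsheet: one node per person.
    # Note: no escaping is done, so names containing ' would break the statement.
    return "create (n{name:'%s', type:'Facebook'});" % name


def relationship_statement(start, end):
    # Mirrors the relationship formula: look both people up in the
    # auto index by name, then link them with IS_A_FRIEND_OF.
    return (
        "start n1=node:node_auto_index(name='%s'), "
        "n2=node:node_auto_index(name='%s') "
        "CREATE n1-[:IS_A_FRIEND_OF]->n2;" % (start, end)
    )


def build_statements(pairs):
    # Collect every name once, in first-seen order
    # (the job the unique() spreadsheet function does).
    names = []
    seen = set()
    for start, end in pairs:
        for name in (start, end):
            if name not in seen:
                seen.add(name)
                names.append(name)
    # Nodes first, then relationships, so the lookups can find their nodes.
    nodes = [node_statement(name) for name in names]
    rels = [relationship_statement(start, end) for start, end in pairs]
    return nodes + rels


# Illustrative input only: two friendship pairs.
pairs = [
    ("Pieter-Jan Boone", "Jason Goode"),
    ("Rik Van Bruggen", "Jason Goode"),
]
statements = build_statements(pairs)
for statement in statements:
    print(statement)
```

The node statements come out before the relationship statements on purpose: the relationship statements can only find nodes that already exist and are indexed.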



So now we have the cypher statements. All we need to do now is execute them in a transaction (wrap them with begin / commit), which makes creating the graph really fast and easy. Before you do that, please note two important points:
  • don’t forget to configure “auto-indexing” in neo4j before you execute the statements - otherwise your nodes/relationships will not be indexed properly, and the relationship statements, which look nodes up by name in node_auto_index, will not find them.
  • you have to use the neo4j-shell for copying/pasting in the statements. There is a shell-like environment in the web-admin tool as well, but copying/pasting into it did not work - at least not on my machine.
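
The begin / commit wrapping can also be done by a tiny script instead of by hand. The following is a minimal sketch, assuming the shell accepts “begin” and “commit” on lines of their own as described above; the function name wrap_in_transaction is hypothetical, and the two statements are just the examples from this post.

```python
def wrap_in_transaction(statements):
    # Put "begin" before and "commit" after the statements,
    # one per line, ready to paste into the shell.
    lines = ["begin"] + list(statements) + ["commit"]
    return "\n".join(lines) + "\n"


# Example statements taken from this post.
example_statements = [
    "create (n{name:'Rik Van Bruggen', type:'Facebook'});",
    "create (n{name:'Jason Goode', type:'Facebook'});",
]
script = wrap_in_transaction(example_statements)
print(script)
```

The resulting text can then be copied and pasted into the neo4j-shell in one go.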


And that’s about it. Now you can just surf to the webadmin and start exploring: http://localhost:7474/webadmin/ … Of course there are better visualisations possible - just look at neo4j.org for more info.



Just to show you how easy it is, I have also created a little video that explains the process for you - you will see that it’s dead easy.


Hope this is useful, and that we will be seeing many more neo4j-based social network applications in the near future.

Monday, 10 June 2013

Graphs for Bunnies


Over the past couple of months, I have been doing lots of "Intro to Graphs" talks across Europe. What started out as an idea in our London User Group has now become a standard monthly introduction that has been extremely well attended. And that's not where it stops - many other people want to start doing these sessions in different cities as well.

During these talks, lots of people have been asking really smart and important questions - and that's why I have created a small screencast to share the answers to these questions. Here is the screencast:





And here are some links that could be useful info to go with each of the questions:
  1. What are Graph Databases? 
  2. How do Graph Databases work? 
  3. What are Graph Databases good for?
  4. What are Graph Databases bad for? 
  5. How do Graph Databases perform?
  6. How do Graph Databases scale?
  7. How do I model my Graph Database?
  8. How do I ensure data integrity in my Graph Database?
  9. How do I fit a Graph Database into my enterprise architecture?
  10. How do I fit a Graph Database in with other DB tech?
In general, please use neotechnology.com or neo4j.org as your starting point. We also run a number of cool meetup groups in Europe and beyond.

I hope this is useful for all of you. If you want to take a look at the slides, then please visit the Google doc or contact me! I am happy to deliver or help deliver this talk in your city or at your company if that would be of interest.