Wednesday, 19 June 2013

FUN with FACEBOOK in NEO4J


Ever since Facebook promoted its “graph search” methodology, lots of people in our industry have been waking up to the fact that graphs are über-cool. Thanks to the powerful query possibilities, companies like Facebook, Twitter, LinkedIn and, let us not forget, Google have been providing us with some of the most amazing technologies. Specifically, the power of the “social network” is tempting many people to get their feet wet and start using graph technology. And they should: graph databases are fantastic at storing, querying and exploiting social structures.

So how would that really work? I am a curious, “want to know” but “not very technical” kind of guy, and I decided to get my hands dirty (again) and try some of this out by storing my own little part of Facebook - in neo4j. Not by programming any kind of production-ready system - because I don’t know how - but with enough real-world data to let us see what it would be like.

Give me my data!


The first step was to get access to my own Facebook data. Obviously there is the Facebook Graph API, but I hope that by now you realise that is just not my cup of tea. It sounded too exhausting :) … So I found myself a little tool that would allow me to download the data from Facebook in a workable format. Give me my data provides a number of different export options, but I chose the “Mutual Friends Network Graph”, as it would give me the most info about my actual social network.


After a few seconds, the app presented me with a CSV document that I could copy and paste into a text file.



And then, I was very quick to start using the neo4j and spreadsheets method to get data into neo4j: in this particular case I created a Google Spreadsheet.

In the spreadsheet I then used some spreadsheet wizardry to generate the Cypher statements that would allow us to import the data:
  • first we create the nodes:
    • make sure that all the people are uniquely named (using the “unique()” spreadsheet function)
    • create the cypher statements:
      • using the following spreadsheet formula
="create (n{name:'"&E2&"', type:'Facebook'});"
  • I ended up with a cypher statement that looks like
create (n{name:'Rik Van Bruggen', type:'Facebook'});

  • Once we have the nodes, we can then create the relationships between them:
    • Column A contains the start nodes
    • Column B contains the end nodes
    • the relationships are all of the type “IS_A_FRIEND_OF”
    • So we can create the cypher statements:
      • using the following spreadsheet formula
="start n1=node:node_auto_index(name='"&A2&"'),
n2=node:node_auto_index(name='"&B2&"') 
CREATE n1-[:IS_A_FRIEND_OF]->n2;"
  • I end up with the following Cypher statement:
start n1=node:node_auto_index(name='Pieter-Jan Boone'), 
n2=node:node_auto_index(name='Jason Goode') 
CREATE n1-[:IS_A_FRIEND_OF]->n2;
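
For readers who would rather script this than use a spreadsheet, the same two steps can be sketched in a few lines of Python. This is only an illustration, not the method used above. It assumes the export is a plain two-column CSV (person, friend) without a header row, which may not match what the tool actually produces. The function names are hypothetical, and the second sample row is invented. Names containing a single quote are rejected rather than escaped, because such a name would break the generated statement in exactly the same way it would break the spreadsheet formula.

```python
import csv
import io


def read_pairs(csv_text):
    """Parse two-column CSV text into (person, friend) tuples, skipping short rows."""
    rows = csv.reader(io.StringIO(csv_text))
    return [(row[0].strip(), row[1].strip()) for row in rows if len(row) >= 2]


def unique_names(pairs):
    """Return every name once, in first-seen order (the role of unique() in the sheet)."""
    return list(dict.fromkeys(name for pair in pairs for name in pair))


def check_name(name):
    """Reject names with a single quote, which would break the quoted Cypher string."""
    if "'" in name:
        raise ValueError("name contains a single quote: " + name)
    return name


def node_statement(name):
    """Build the same node-creating statement as the first spreadsheet formula."""
    return "create (n{name:'" + check_name(name) + "', type:'Facebook'});"


def relationship_statement(start, end):
    """Build the same relationship-creating statement as the second formula."""
    return (
        "start n1=node:node_auto_index(name='" + check_name(start) + "'), "
        "n2=node:node_auto_index(name='" + check_name(end) + "') "
        "CREATE n1-[:IS_A_FRIEND_OF]->n2;"
    )


if __name__ == "__main__":
    # Sample rows for illustration only; the second row is made up.
    sample = "Pieter-Jan Boone,Jason Goode\nRik Van Bruggen,Pieter-Jan Boone\n"
    pairs = read_pairs(sample)
    statements = [node_statement(name) for name in unique_names(pairs)]
    statements += [relationship_statement(a, b) for a, b in pairs]
    print("\n".join(statements))
```

All node statements are emitted before any relationship statement, because the relationship statements look their start and end nodes up by name.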



So now we have the Cypher statements. All we need to do now is execute them in a transaction (wrap them with begin / commit), which makes creating the graph really fast and easy. Before you do that, please note two important things:
  • don’t forget to configure your “auto-indexing” in neo4j before you execute the statements - otherwise your nodes/relationships will not be indexed properly, and the node_auto_index lookups in the relationship statements will find nothing.
  • you have to use the neo4j-shell for copying/pasting in the statements. There is a shell-like environment in the web-admin tool as well, but copying/pasting into it did not work - at least not on my machine.
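
The “wrap them with begin / commit” step can also be sketched in Python. This is a minimal sketch with a hypothetical helper name. It assumes the statements are already available as a list of strings and that the shell accepts begin and commit on lines of their own, as described above.

```python
def wrap_in_transaction(statements):
    """Return the text to paste into neo4j-shell: begin, the statements, commit."""
    lines = ["begin"] + list(statements) + ["commit"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two statements shown earlier in the post, used here as sample input.
    statements = [
        "create (n{name:'Rik Van Bruggen', type:'Facebook'});",
        "start n1=node:node_auto_index(name='Pieter-Jan Boone'), "
        "n2=node:node_auto_index(name='Jason Goode') "
        "CREATE n1-[:IS_A_FRIEND_OF]->n2;",
    ]
    print(wrap_in_transaction(statements), end="")
```

As for the auto-indexing note: in neo4j releases of this era the setting typically lived in conf/neo4j.properties as node_auto_indexing=true together with node_keys_indexable=name. Treat those property names as an assumption and check the manual for your version.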


And that’s about it. Now you can just surf to the webadmin and start exploring: http://localhost:7474/webadmin/ … Of course there are better visualisations possible - just look at neo4j.org for more info.



Just to show you how easy it is, I have also created a little video that explains the process for you - you will see that it’s dead easy.


Hope this is useful, and that we will be seeing many more neo4j-based social network applications in the near future.
