Wednesday, 10 June 2015

Dates as Numbers

This will be a short blogpost probably, but I am hoping it will be a good/useful one.

Something that I recently had to do was to store DATE information on nodes in my Neo4j database. I was loading the data from a CSV file, and in that CSV file the date information was formatted as
dd-mm-yyyy
In other words, as a string. Now that is not very helpful in any database, but in Neo4j particularly because we do not have a date-datatype primitive. The next best thing then, would be to store the date as a number - so I asked myself the question how I could load the data straight from the CSV file, transform the string into an integer, and then store the integer as a property. Not that difficult, you would think. Until I - with my limited skills and tiny reptile brain - gave it a try and immediately hit some issue. So what then - of course I called a friend, again!

So let's try this. I have put the data and the sample query on github. So let's take a look

If I run the following query
 load csv with headers   
 from "https://gist.githubusercontent.com/rvanbruggen/39be473ef3d69ce9a316/raw/24e9d5470abbfbe60d7fedb2005117a5259090d9/source.csv" as csv    
 return *  

I can clearly see that the date is in the above format:

So then I would run a simple load script to read the CSV, transform the dates, and then create a node for every row in the CSV:

load csv with headers  
from "https://gist.githubusercontent.com/rvanbruggen/39be473ef3d69ce9a316/raw/24e9d5470abbfbe60d7fedb2005117a5259090d9/source.csv" as csv  
create (p:Person {name: csv.Name, birthdate: toInt(substring(csv.Birthdate,0,2))+toInt(substring(csv.Birthdate,3,2))*100 + toInt(substring(csv.Birthdate,6,4))*10000});  

Look at what happens with the csv.Birthdate field. by using the "substring" string function, I am selecting the dd first, adding the mm*100 to it, and then adding the yyyy*10000 - so we get something like this:

yyyy0000+mm00+dd

which will give us a number equal to

yyyymmdd

Isn't that sweet? As you can see below the 5 nodes were created:

And indeed, the properties were set as integers, not strings:

So there we are. A nice and simple way to import data and do some simple but important transformations on the go.

Hope this was useful.

Cheers

Rik

No comments:

Post a Comment