Showing posts with label aggregations. Show all posts
Showing posts with label aggregations. Show all posts

Monday, 22 October 2018

Poring over Power Plants: Global Power Emissions Database in Neo4j

In the past couple of weeks, I have been looking to some interesting datasets for the Utility sector, where Networks or Graphs are of course in very, VERY abundant supply. We have Electricity Networks, Gas Networks, Water Networks, Sewage Networks, etc etc - that all form these really interesting graphs that our users can. Lots of users have specialised, GIS based tools to manage these networks - but when you think about it there are so many interesting things that we could do if ... we would only store the network as a network - in Neo4j of course.
So I started looking for some datasets, and maybe I am not familiar with this domain, but I did not really find anything too graphy. But I did find a different dataset that contained a lot of information about Power Plants - and their emissions. Take a look at this website:
and then you can download the Excel workbook from over here. It's not that big - and of course the first thing I did was to convert it into a Google Sheet. You can access that sheet over here:

There's two really interesting tabs in this dataset:
  1. the sheet containing the fuel types: this gives you a set of classifications of the fuel that is used in the different power plants around the globe
  2. the list of 30,5k power plants from around the world that generate different levels of power from different fueltypes. While doing so, they also generate different levels of emissions, of course, and that data is also clearly mentioned in this sheet. Note that the dataset does not include any information on Nuclear plants - as they don't really have any "emissions" other than water vapour and... the nuclear waste of course.
So let's get going - let's import this dataset into Neo4j.