
Finding a data source
That was easy. I quickly found a dataset on Kaggle that looked really interesting. It's a comma-separated file, about 110k lines long and 10MB in size, that holds all the lines that Shakespeare wrote for his plays. It's just an amazing dataset - not too complicated, but terribly interesting.
The file has the following column headers:
| Dataline | Play | PlayerLinenumber | ActSceneLine | Player | PlayerLine |
|---|---|---|---|---|---|
| abc | def | ghi | jkl | mno | pqr |
Of course you can find the dataset on Kaggle yourself, but I also imported it into a Google Sheet that you can access as well. The sheet is shared publicly and can be downloaded as a CSV at any time from this URL. This URL is what we will use for importing the data into Neo4j.
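
Before pointing Neo4j at the file, it can be worth checking that the downloaded CSV really has the six columns listed above. Here is a minimal Python sketch of such a check. The function name `read_lines` and the `SAMPLE` text are my own illustrations, and the sample row just reuses the placeholder values from the table, not real data from the file.

```python
import csv
import io

# Column names as listed in the header table above.
EXPECTED_HEADERS = [
    "Dataline", "Play", "PlayerLinenumber",
    "ActSceneLine", "Player", "PlayerLine",
]


def read_lines(csv_text):
    """Parse CSV text into a list of dicts, after checking the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_HEADERS:
        raise ValueError(f"unexpected headers: {reader.fieldnames}")
    return list(reader)


# Placeholder row mirroring the table above (hypothetical, not real data).
SAMPLE = (
    "Dataline,Play,PlayerLinenumber,ActSceneLine,Player,PlayerLine\n"
    "abc,def,ghi,jkl,mno,pqr\n"
)

rows = read_lines(SAMPLE)
print(rows[0]["Player"])
```

In practice you would pass in the text of the downloaded CSV instead of `SAMPLE`. If the header row ever changes, the check fails loudly instead of letting a broken import through.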