I am an economist/engineer. I studied "Commercial Engineering" in Belgium in the nineties, and was quite an avid learner of economic theories large and small at the time. I did, however, always find myself uneasy at economists' insistence on the rationality of homo economicus, as I knew, and observed all around me, that people were far from rational. That's why, ever since I learned of its existence, I have been a big fan of the field of behavioral economics - which actually tries to formulate economic theories that are realistic and that, oftentimes, allow for irrational behavior. I fondly remember first reading Dan Ariely's Predictably Irrational, and learning about some of the crazy biases that he observed and described. And Nobel-prize-winning Daniel Kahneman has been a hero of mine for decades. I think about the Framing Effect and Prospect Theory almost on a daily basis.
It all started with a tweet
So you can imagine my excitement when I learned about this tweet:
Should be taught to all at a young age pic.twitter.com/GlVkjcdhah
— Elon Musk (@elonmusk) December 19, 2021
This tweet includes this particular infographic:
I then looked into the source of the graphic, and found that it was featured in more detail on this page. Not much later, I was thinking about how I could do something cool and graphy with this wonderful little piece of data. And not much later still, I came up with the blog post below.
First: get the Cognitive Bias data into a spreadsheet
Call me strange, but I love a good little Google sheet. I put the data in there, and took the trouble of actually putting the category data in there as well. That was 10 minutes of manual work that I will never get back.
Once I had the spreadsheet nailed, I could easily download the data as a .csv file. That file is then of course ready for import.
Import that data into a Neo4j graph
Import file from .csv
As usual, it takes only a second to import that data into Neo4j, with a very simple query:
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vQDDO_Fqewk1OSR7qrW-2XR7AHhy1MiWmFGwZgLhptaislLP6JLmXgDkR0F331WClserKQz61UDjG8n/pub?gid=0&single=true&output=csv" AS csv
CREATE (b:Bias)
SET b = csv;
The result:
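A quick sanity check after the import can be sketched as below. It assumes nothing beyond the Bias label created by the query above; the aliases biasCount and properties are just illustrative names. Note that keys() only lists properties that are actually set on the node it inspects, so a bias that leaves some category columns blank will show fewer keys than the sheet has columns.

```cypher
// Count the imported nodes; the infographic lists 50 biases,
// so this should return 50 if every row of the sheet was loaded.
MATCH (b:Bias)
RETURN count(b) AS biasCount;

// Inspect which CSV headers ended up as properties on one sample node.
MATCH (b:Bias)
RETURN keys(b) AS properties
LIMIT 1;
```

Run the two statements one at a time if your client does not accept multiple statements in a single submission.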
Next, I wanted to extract the Category information from the Bias nodes.
Creating the HAS_CATEGORY relationship
From the spreadsheet, we know that there are 6 different categories:
- Memory
- Social
- Learning
- Belief
- Money
- Politics
Every Bias has one or more categories, and some are actually pertinent to all six.
So let's run the following queries to CREATE the Category nodes, and connect the Bias nodes to these using the HAS_CATEGORY relationship:
MATCH (b:Bias)
WHERE b.Memory IS NOT NULL
MERGE (c:Category {name: "Memory"})
CREATE (b)-[:HAS_CATEGORY]->(c);
MATCH (b:Bias)
WHERE b.Social IS NOT NULL
MERGE (c:Category {name: "Social"})
CREATE (b)-[:HAS_CATEGORY]->(c);
MATCH (b:Bias)
WHERE b.Learning IS NOT NULL
MERGE (c:Category {name: "Learning"})
CREATE (b)-[:HAS_CATEGORY]->(c);
MATCH (b:Bias)
WHERE b.Belief IS NOT NULL
MERGE (c:Category {name: "Belief"})
CREATE (b)-[:HAS_CATEGORY]->(c);
MATCH (b:Bias)
WHERE b.Money IS NOT NULL
MERGE (c:Category {name: "Money"})
CREATE (b)-[:HAS_CATEGORY]->(c);
MATCH (b:Bias)
WHERE b.Politics IS NOT NULL
MERGE (c:Category {name: "Politics"})
CREATE (b)-[:HAS_CATEGORY]->(c);
Here's the result of those queries:
The graph now looks like this:
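The six category queries above differ only in the category name, so they could also be collapsed into a single statement. The following is a sketch of that alternative, not what I actually ran: it relies on Cypher's dynamic property lookup (b[categoryName]) and introduces categoryName as an illustrative variable. It also uses MERGE for the relationship instead of CREATE, so running it again (or after the six queries above) should not create duplicate HAS_CATEGORY relationships.

```cypher
// Loop over the six category column names from the spreadsheet.
UNWIND ["Memory", "Social", "Learning", "Belief", "Money", "Politics"] AS categoryName
// Keep only the biases that have a value in that column.
MATCH (b:Bias)
WHERE b[categoryName] IS NOT NULL
// Create the Category node once, then link the bias to it.
MERGE (c:Category {name: categoryName})
MERGE (b)-[:HAS_CATEGORY]->(c);
```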
And then we can immediately see that some biases are more interesting than others - if only because they impact more "categories":
MATCH (b:Bias)-->(c:Category)
RETURN b.Title, count(c) AS numberofcategories
ORDER BY numberofcategories DESC;
The result of this looks like this:
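The same pattern can be turned around to look at it from the category side. This is a small sketch that only uses the Category nodes and HAS_CATEGORY relationships created earlier; the aliases category and numberOfBiases are illustrative names.

```cypher
// Count how many biases are attached to each of the six categories.
MATCH (b:Bias)-[:HAS_CATEGORY]->(c:Category)
RETURN c.name AS category, count(b) AS numberOfBiases
ORDER BY numberOfBiases DESC;
```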
My summary would be: interesting, but a bit boring... So I thought about how to make this little graph a little bit more interesting.
NLP on the Bias descriptions
I decided to apply a method that I have used a few times before: the Bias nodes all have a Description property, which nicely summarizes the meaning of each and every one of the 50 biases. If we run
MATCH (b:Bias)
RETURN b.Title, b.Description
ORDER BY b.Title ASC;
then we see that we could actually do some useful Natural Language Processing on these descriptions:
So, after having installed the required NLP .jar file in the plugins directory of the Neo4j server, we can start analysing the descriptions using the Google Cloud NLP service. Here's how that works:
:param apiKey => "some fake key here";
MATCH (b:Bias)
CALL apoc.nlp.gcp.entities.graph(b, {
key: $apiKey,
nodeProperty: "Description",
scoreCutoff: 0.01,
writeRelationshipType: "HAS_ENTITY",
writeRelationshipProperty: "gcpEntityScore",
write: true
})
YIELD graph AS g
RETURN "Success!";
This returns after a few seconds:
Now you can see a much richer graph.
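To get a feel for what the NLP step added, one could list the entities that are shared by the most biases. This is a hedged sketch: it assumes that the entity nodes written by the procedure carry their extracted text in a property called text, which is how I understand the APOC NLP output, so adjust the property name if your nodes look different. The HAS_ENTITY relationship type is the one configured in the call above.

```cypher
// Find the entities mentioned in the descriptions of the most biases.
// Assumption: entity nodes store the extracted phrase in a `text` property.
MATCH (b:Bias)-[:HAS_ENTITY]->(e)
RETURN e.text AS entity, count(DISTINCT b) AS numberOfBiases
ORDER BY numberOfBiases DESC
LIMIT 10;
```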
And of course we can do some interesting querying on this, like for example exploring the paths between two Biases.
MATCH path =
shortestPath((b1:Bias {Title: "Reactance"})-[*]-(b2:Bias {Title: "Automation Bias"}))
RETURN path;
Gives us this:
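A variation on the path query above, sketched under a couple of assumptions: it returns all equally short paths rather than just one, restricts the traversal to the two relationship types created in this post, and caps the path length at 6 hops. That limit is an arbitrary choice for illustration, not something I tuned.

```cypher
// Look up both endpoints first, then search between them.
MATCH (b1:Bias {Title: "Reactance"}), (b2:Bias {Title: "Automation Bias"})
// Return every shortest path, using only category and entity links,
// and give up on paths longer than 6 relationships.
MATCH path = allShortestPaths((b1)-[:HAS_CATEGORY|HAS_ENTITY*..6]-(b2))
RETURN path;
```

Bounding the length keeps the search cheap on larger graphs, at the cost of returning nothing when the two biases are further apart than the limit.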
Wrapping up
So, in conclusion: thanks to Elon's tweet, I had another bit of fun with Neo4j, Bloom, and Google NLP. Hope you liked this example as much as I did - let me know your thoughts regardless!
Cheers
Rik
PS: you can import this little graph in no time, without running the NLP yourself, via this Cypher script.