{"id":10340,"date":"2023-07-10T21:52:46","date_gmt":"2023-07-11T02:52:46","guid":{"rendered":"https:\/\/www.rushworth.us\/lisa\/?p=10340"},"modified":"2023-09-01T13:55:44","modified_gmt":"2023-09-01T18:55:44","slug":"neo4j-the-importance-of-the-data-model","status":"publish","type":"post","link":"https:\/\/www.rushworth.us\/lisa\/?p=10340","title":{"rendered":"Neo4J \u2013 The Importance of the Data Model"},"content":{"rendered":"<p>While I am certain table-based SQL databases required planning to establish a reasonable data model \u2013 optimizing storage, defining foreign keys, indexing \u2026 I have found it more challenging to create a <em>good<\/em> data model in Neo4j. Maybe that\u2019s because I normally populate SQL tables with custom scripts that can be modified to handle all sorts of edge cases. Maybe I\u2019m still thinking in tables, but there seems to be more trial and error in defining the data model than I\u2019ve ever had in SQL databases.<\/p>\n<p>In the import-from-html-table example, a candidate often is associated with multiple elections. Storing candidates as nodes and elections as other nodes that contain results (electoral college votes for winner &amp; loser and popular votes for winner &amp; loser) then associating candidates with elections allowed me to store data about US elections in the Graph. I know who ran, who won and who lost, and what the results were for each election.<\/p>\n<p>Associating the results with candidates didn\u2019t work because Franklin D Roosevelt only has one property for \u201cEC_VOTES\u201d \u2026 which election does that reflect? I could also have added the vote totals to the <em>relationship<\/em> but that would either separate the data (the loser\u2019s votes are stored on LOST relationships and the winner\u2019s are stored on WON relationships) or data duplication (both WON and LOST relationships contain the same vote numbers).<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1419\" height=\"719\" class=\"wp-image-10341\" src=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1.png\" srcset=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1.png 1419w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1-300x152.png 300w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1-1024x519.png 1024w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1-768x389.png 768w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2023\/09\/word-image-10340-1-750x380.png 750w\" sizes=\"auto, (max-width: 1419px) 100vw, 1419px\" \/><\/p>\n<p>Query used to populate the data:<\/p>\n<pre>CALL apoc.load.html(\"https:\/\/www.iweblists.com\/us\/government\/PresidentialElectionResults.html\",\r\n{electionyear: \"#listtable tbody tr td:eq(0)\"\r\n, winner: \"#listtable tbody tr td:eq(1)\"\r\n, loser: \"#listtable tbody tr td:eq(2)\"\r\n, electoral_win: \"#listtable tbody tr td:eq(3)\"\r\n, electoral_lose: \"#listtable tbody tr td:eq(4)\"\r\n, popular_win: \"#listtable tbody tr td:eq(5)\"\r\n, popular_delta: \"#listtable tbody tr td:eq(6)\" }) yield value\r\nWITH value, size(value.electionyear) as rangeup\r\n\r\nUNWIND range(0,rangeup) as i WITH value.electionyear[i].text as ElectionYear\r\n, value.winner[i].text as Winner, value.loser[i].text as Loser\r\n, value.electoral_win[i].text as EC_Winner, value.electoral_lose[i].text as EC_Loser\r\n, value.popular_win[i].text as Pop_Vote_Winner\r\n, value.popular_delta[i].text as Pop_Vote_Margin\r\n\r\nMERGE (election:Election {year: coalesce(ElectionYear,\"Unknown\")})\r\nSET election.EC_Votes_Winner = coalesce(EC_Winner,\"Unknown\")\r\nSET election.EC_Votes_Loser = coalesce(EC_Loser,\"Unknown\")\r\nSET election.Pop_Votes_Winner = apoc.text.replace(Pop_Vote_Winner, \",\", \"\")\r\nSET election.Pop_Votes_Loser = apoc.number.exact.sub(apoc.text.replace(Pop_Vote_Winner, \",\", \"\"), apoc.text.replace(Pop_Vote_Margin, \",\", \"\"))\r\n\r\nMERGE (ew:CANDIDATE {name: coalesce(Winner,\"Unknown\")})\r\nMERGE (el:CANDIDATE {name: coalesce(Loser,\"Unknown\")})\r\n\r\nMERGE (ew)-[:WON]-&gt;(election) MERGE (el)-[:LOST]-&gt;(election);<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While I am certain table-based SQL databases required planning to establish a reasonable data model \u2013 optimizing storage, defining foreign keys, indexing \u2026 I have found it more challenging to create a good data model in Neo4j. Maybe that\u2019s because I normally populate SQL tables with custom scripts that can be modified to handle all &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1896],"tags":[1902,1897],"class_list":["post-10340","post","type-post","status-publish","format-standard","hentry","category-neo4j","tag-cypher","tag-neo4j"],"_links":{"self":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10340","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=10340"}],"version-history":[{"count":1,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10340\/revisions"}],"predecessor-version":[{"id":10342,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/10340\/revisions\/10342"}],"wp:attachment":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=10340"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=10340"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=10340"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}