{"id":9111,"date":"2022-07-28T14:30:31","date_gmt":"2022-07-28T19:30:31","guid":{"rendered":"https:\/\/www.rushworth.us\/lisa\/?p=9111"},"modified":"2022-07-28T14:30:31","modified_gmt":"2022-07-28T19:30:31","slug":"postgresql-wraparound","status":"publish","type":"post","link":"https:\/\/www.rushworth.us\/lisa\/?p=9111","title":{"rendered":"PostgreSQL Wraparound"},"content":{"rendered":"<p>We had a Postgres server go into read-only mode &#8212; which provided a stressful opportunity to learn more nuances of Postgres internals. It appears this &#8220;read only mode&#8221; something Postgres does to save it from itself. Transaction IDs are assigned to each row in the database &#8212; the ID values are used to determine what transactions can see. For each transaction, Postgres increments the last transaction ID and assigns the incremented value to the current transaction. When a row is written, the transaction ID is stored in the row and used to determine whether a row is visible to a transaction.<\/p>\n<p>Inserting a row will assign the last transaction ID to the xmin column. A transaction can see all rows where xmin is less than its transaction ID. Updating a row actually creates a <em>new<\/em> row &#8212; the old row then has an xmax value and the new row has the same number as its xmin &#8212; transactions with IDs <em>newer<\/em> than the xmax value will not see the row. Similarly, deleting a row updates the row&#8217;s xmax value &#8212; older transactions will still be able to see the row, but newer ones will not.<\/p>\n<p>You can even view the xmax and xmin values by specifically asking for them in a select statement: <tt>select *, xmin, xmax from TableName;<\/tt><\/p>\n<p>The transaction ID is stored in a 32-bit number &#8212; making the possible values 0 through 4,294,967,295. Which can become a problem for a heavily I\/O or long-running database (i.e. even if I only get a couple of records an hour, that adds up over years of service) because &#8230; what happens when we get to 4,294,967,295 and need to write another record? To combat this,\u00a0 Postgres does something that reminds me of the &#8220;doomsday&#8221; Mayan calendar &#8212; this number range isn&#8217;t aligned on a straight line where one eventually runs into a wall. The numbers are arranged in a circle, so there&#8217;s always a new cycle and numbers are issued all over again. In the <a href=\"https:\/\/github.com\/postgres\/postgres\/blob\/70a437aa45b6dcacc2ad894f95ef5bb46b26035f\/src\/include\/access\/transam.h#L222\" target=\"_blank\" rel=\"noopener\">Postgres source<\/a>, the wrap limit is &#8220;where the world ends&#8221;! But, like the Mayan calendar &#8230; this isn&#8217;t actually <em>the end<\/em> as much as it&#8217;s a new beginning.<\/p>\n<p><a href=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-xidInCode.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-9207 size-full\" src=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-xidInCode.png\" alt=\"\" width=\"640\" height=\"397\" srcset=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-xidInCode.png 640w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-xidInCode-300x186.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>How do you know if transaction 5 is &#8216;old&#8217; or &#8216;new&#8217; if the number can be reissued? The database considers half of the IDs in the real past and half for future use. When transaction ID four billion is issued, ID number 5 is considered part of the &#8220;future&#8221;; but when the current transaction ID is one billion, ID number 5 is considered part of the &#8220;past&#8221;. Which could be problematic if one of the first records in the database has never been updated but is still perfectly legitimate. Reserving in-use transaction IDs would make the re-issuing of transaction IDs more resource intensive (not just assign ++xid to this transaction, but xid++;is xid assigned {if so, xid++ and check again until the answer is no}; assign xid to this transaction). Instead of implementing more complex logic, rows can be &#8220;frozen&#8221; &#8212; this is a special flag that basically says &#8220;I am a row from the past and ignore my transaction ID number&#8221;. In versions 9.4 and later, both committed and aborted hint bits are set to freeze a row &#8212; in earlier versions, used a special FrozenTransactionId index.<\/p>\n<p>There is a minimum age for freezing a row &#8212; it generally doesn&#8217;t make sense to mark a row that&#8217;s existed for eight seconds as frozen. This is configured in the database as the <a href=\"https:\/\/www.postgresql.org\/docs\/current\/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE\" target=\"_blank\" rel=\"noopener\">vacuum_freeze_min_age<\/a>. But it&#8217;s also not good to let rows sit around without being frozen for too long &#8212; the database could wrap around to the point where the transaction ID is reissued and the row would be lost (well, it&#8217;s still <em>there<\/em> but no one can see it). Since vacuuming doesn&#8217;t look through every page of the database on every cycle, there is a <a href=\"https:\/\/www.postgresql.org\/docs\/current\/runtime-config-client.html#GUC-VACUUM-FREEZE-TABLE-AGE\" target=\"_blank\" rel=\"noopener\">vacuum_freeze_table_age<\/a> which defines the age of a transaction where vacuum will look through an entire table to freeze rows instead of relying on the visibility map. This combination, hopefully, balances the I\/O of freezing rows with full scans that <em>effectively<\/em> freeze rows.<\/p>\n<p>What I believe led to our outage &#8212; most of our data is time-series data. It is written, never modified, and eventually deleted. Auto-vacuum will skip tables that don&#8217;t <em>need<\/em> vacuuming. In our case, that&#8217;s <em>most<\/em> of the tables. The <a href=\"https:\/\/www.postgresql.org\/docs\/current\/runtime-config-autovacuum.html#GUC-AUTOVACUUM-FREEZE-MAX-AGE\" target=\"_blank\" rel=\"noopener\">autovacuum_freeze_max_age parameter<\/a> sets an &#8216;age&#8217; at which vacuuming is forced. If these special vacuum processes don&#8217;t complete fully &#8230; you eventually get into a state where the server stops accepting writes in order to avoid potential data loss.<\/p>\n<p>So <em>monitoring<\/em> for transaction IDs approaching the wraparound and emergency vacuum values is important. I set up a task that alerts us when we approach wraparound (fortunately, we&#8217;ve not gotten there again) as well as when we approach the emergency auto-vacuum threshold &#8212; a state which we reach a few times a week.<\/p>\n<p>Using the following query, we monitor how close each of our databases is to both the auto-vacuum threshold and the &#8216;end of the world&#8217; wrap-around point.<\/p>\n<pre>WITH max_age AS ( SELECT 2000000000 as max_old_xid\r\n                        , setting AS autovacuum_freeze_max_age FROM pg_catalog.pg_settings \r\n                        WHERE name = 'autovacuum_freeze_max_age' )\r\n         , per_database_stats AS ( SELECT datname , m.max_old_xid::int \r\n                        , m.autovacuum_freeze_max_age::int \r\n                        , age(d.datfrozenxid) AS oldest_current_xid \r\n                        FROM pg_catalog.pg_database d \r\n                        JOIN max_age m ON (true) WHERE d.datallowconn ) \r\n\r\nSELECT max(oldest_current_xid) AS oldest_current_xid \r\n      , max(ROUND(100*(oldest_current_xid\/max_old_xid::float))) AS percent_towards_wraparound \r\n      , max(ROUND(100*(oldest_current_xid\/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac FROM per_database_stats<\/pre>\n<p>If we are approaching either point, e-mail alerts are sent.<\/p>\n<p><a href=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-9209 size-full\" src=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning.png\" alt=\"\" width=\"811\" height=\"301\" srcset=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning.png 811w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning-300x111.png 300w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning-768x285.png 768w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/07\/pg-wraparound-warning-750x278.png 750w\" sizes=\"auto, (max-width: 811px) 100vw, 811px\" \/><\/a><\/p>\n<p>When a database approaches the emergency auto-vacuum threshold, we freeze data manually &#8212;\u00a0 <tt>vacuumdb --all --freeze --jobs=1 --echo --verbose --analyze<\/tt> (or &#8211;jobs=3 if I want the process to hurry up and get done).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We had a Postgres server go into read-only mode &#8212; which provided a stressful opportunity to learn more nuances of Postgres internals. It appears this &#8220;read only mode&#8221; something Postgres does to save it from itself. Transaction IDs are assigned to each row in the database &#8212; the ID values are used to determine what &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1522],"tags":[953,1348,1349],"class_list":["post-9111","post","type-post","status-publish","format-standard","hentry","category-postgresql","tag-database","tag-postgres","tag-postgresql"],"_links":{"self":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/9111","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9111"}],"version-history":[{"count":5,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/9111\/revisions"}],"predecessor-version":[{"id":9211,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/9111\/revisions\/9211"}],"wp:attachment":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9111"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}