{"id":8818,"date":"2022-03-29T16:29:14","date_gmt":"2022-03-29T21:29:14","guid":{"rendered":"https:\/\/www.rushworth.us\/lisa\/?p=8818"},"modified":"2022-03-30T13:50:31","modified_gmt":"2022-03-30T18:50:31","slug":"tracking-down-which-pod-is-exhausting-ip-connections","status":"publish","type":"post","link":"https:\/\/www.rushworth.us\/lisa\/?p=8818","title":{"rendered":"Tracking Down Which Pod is Exhausting IP Connections"},"content":{"rendered":"<p>We\u2019ve been seeing an error that prevents clients from connecting to Postgresql servers \u2013 basically that all available connections are in use and the remaining connections are reserved for superuser and replication activity.<\/p>\n<p>First, we need to determine what the connection limit <em>is<\/em> &#8212;<\/p>\n<p><tt>SELECT setting, source, sourcefile, sourceline FROM pg_settings WHERE name = 'max_connections';<\/tt><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"573\" height=\"107\" class=\"wp-image-8819\" src=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/03\/word-image-72.png\" srcset=\"https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/03\/word-image-72.png 573w, https:\/\/www.rushworth.us\/lisa\/wp-content\/uploads\/2022\/03\/word-image-72-300x56.png 300w\" sizes=\"auto, (max-width: 573px) 100vw, 573px\" \/><\/p>\n<p>And if there are any per-user connection limits \u2013 a limit of -1 means unlimited connections are allowed.<\/p>\n<p><tt>SELECT rolname, rolconnlimit FROM pg_roles<\/tt><\/p>\n<p>The next step is to identify what connections are exhausting available connections \u2013 are there a lot of long-running queries? Are there just more active queries than anticipated? Are there a bunch of idle connections?<\/p>\n<pre>SELECT pid, usename, client_addr, client_port \r\n ,to_char(pg_stat_activity.query_start, 'YYYY-MM-DD HH:MI:SS') as query_start\r\n , state, query \r\nFROM pg_stat_activity\r\n-- where state = 'idle'\r\n-- and usename = 'app_user'\r\norder by query_start;<\/pre>\n<p>In our case, there were over 100 idle connections using up about 77% of the available connections. Auto-vacuum, client read operations, and replication easily filled up the remaining available connections.<\/p>\n<p>Because the clients keeping these idle connections open are an app running in a Kubernetes cluster, there\u2019s an extra layer of complexity identifying <em>where<\/em> the connection is actually sourced. When you view the list of connections from the Postgresql server\u2019s perspective, \u201cclient_addr\u201d is the worker hosting the pod.<\/p>\n<p>On the worker server, use <em>conntrack<\/em> to identify the actual source of the connection \u2013 the IP address in \u201c-d\u201d is the IP address of the Postgresql server. To isolate a specific connection, select a \u201cclient_port\u201d from the list of connections (37900 in this case) and grep for the port. You will see the src IP of the individual POD.<\/p>\n<pre>lhost1750:~ # conntrack -L -f ipv4 -d 10.24.29.140 -o extended | grep 37900\r\nipv4 2 tcp 6 86394 ESTABLISHED src=<strong>10.244.4.80<\/strong> dst=10.24.29.140 sport=37900 dport=5432 src=10.24.29.140 dst=10.24.29.155 sport=5432 dport=37900 [ASSURED] mark=0 use=1\r\nconntrack v1.4.4 (conntrack-tools): 27 flow entries have been shown.<\/pre>\n<p>Then use kubeadm to identify which pod is assigned that address:<\/p>\n<pre>lhost1745:~ # kubectl get po --all-namespaces -o wide | grep \"10.244.4.80\"\r\nkstreams kafka-stream-app-deployment-1336-d8f7d7456-2n24x 2\/2 Running 0 10d 10.244.4.80 lhost0.example.net &lt;none&gt; &lt;none&gt;<\/pre>\n<p>In this case, we\u2019ve got an application automatically scaling up that can have 25 connections help open and idle \u2026 so there isn\u2019t really a solution other than increasing the number of available connections to a number that\u2019s appropriate given the number of client connections we plan on leaving open. I also want to enact a connection limit on the individual account \u2013 if there are 250 connections available on the Postgresql server, then limit the application to 200 of those connections.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We\u2019ve been seeing an error that prevents clients from connecting to Postgresql servers \u2013 basically that all available connections are in use and the remaining connections are reserved for superuser and replication activity. First, we need to determine what the connection limit is &#8212; SELECT setting, source, sourcefile, sourceline FROM pg_settings WHERE name = &#8216;max_connections&#8217;; &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1522,30],"tags":[1542,615,1294,1349],"class_list":["post-8818","post","type-post","status-publish","format-standard","hentry","category-postgresql","category-system-administration","tag-k8s","tag-kubernetes","tag-networking","tag-postgresql"],"_links":{"self":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/8818","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8818"}],"version-history":[{"count":1,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/8818\/revisions"}],"predecessor-version":[{"id":8823,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/8818\/revisions\/8823"}],"wp:attachment":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8818"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8818"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8818"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}