{"id":11332,"date":"2025-01-06T11:55:22","date_gmt":"2025-01-06T16:55:22","guid":{"rendered":"https:\/\/www.rushworth.us\/lisa\/?p=11332"},"modified":"2025-01-07T19:49:05","modified_gmt":"2025-01-08T00:49:05","slug":"parsing-har-file","status":"publish","type":"post","link":"https:\/\/www.rushworth.us\/lisa\/?p=11332","title":{"rendered":"Parsing HAR File"},"content":{"rendered":"\n<p>I am working with a new application that doesn&#8217;t seem to like when a person has multiple roles assigned to them &#8230; however, I first need to prove that is the problem. Luckily, your browser gets the SAML response and you can actually see the Role entitlements that are being sent. Just need to parse them out of the big 80 meg file that a simple &#8220;go here and log on&#8221; generates!<\/p>\n\n\n\n<p>To gather data to be parsed, open the Dev Tools for the browser tab. Click the settings gear icon and select &#8220;Persist Logs&#8221;. Reproduce the scenario &#8211; navigate to the site and log in. Then save the dev tools session as a HAR file. The following Python script will analyze the file, extract any SAML response tokens, and print them in a human-readable format. 
<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n################################################################################\n# This script reads a HAR file, identifies HTTP requests and responses containing\n# SAML tokens, and decodes &quot;SAMLResponse&quot; values.\n#\n# The decoded SAML assertions are printed out for inspection in a readable format.\n#\n# Usage:\n# - Update the str_har_file_path with your HAR file\n################################################################################\n# Editable Variables\nstr_har_file_path = &#039;SumoLogin.har&#039;\n\n# Imports\nimport json\nimport base64\nimport urllib.parse\nfrom xml.dom.minidom import parseString\n\n################################################################################\n#  This function decodes SAML responses found within the HAR capture\n# Args: \n#   saml_response_encoded(str): URL encoded, base-64 encoded SAML response\n# Returns:\n#   string: decoded string\n################################################################################\ndef decode_saml_response(saml_response_encoded):\n    url_decoded = urllib.parse.unquote(saml_response_encoded)\n    base64_decoded = base64.b64decode(url_decoded).decode(&#039;utf-8&#039;)\n    return base64_decoded\n\n################################################################################\n#  This function finds and decodes SAML tokens from HAR entries.\n#\n# Args:\n#   entries(list): A list of HTTP request and response entries from a HAR file.\n#\n# Returns:\n#   list: List of decoded SAML assertion response strings.\n################################################################################\ndef find_saml_tokens(entries):\n    saml_tokens = &#x5B;]\n    for entry in entries:\n        request = entry&#x5B;&#039;request&#039;]\n        response = entry&#x5B;&#039;response&#039;]\n        \n        if request&#x5B;&#039;method&#039;] == 
&#039;POST&#039;:\n            request_body = request.get(&#039;postData&#039;, {}).get(&#039;text&#039;, &#039;&#039;)\n            \n            if &#039;SAMLResponse=&#039; in request_body:\n                saml_response_encoded = request_body.split(&#039;SAMLResponse=&#039;)&#x5B;1].split(&#039;&amp;&#039;)&#x5B;0]\n                saml_tokens.append(decode_saml_response(saml_response_encoded))\n        \n        response_body = response.get(&#039;content&#039;, {}).get(&#039;text&#039;, &#039;&#039;)\n        \n        if response.get(&#039;content&#039;, {}).get(&#039;encoding&#039;) == &#039;base64&#039;:\n            response_body = base64.b64decode(response_body).decode(&#039;utf-8&#039;, errors=&#039;ignore&#039;)\n        \n        if &#039;SAMLResponse=&#039; in response_body:\n            saml_response_encoded = response_body.split(&#039;SAMLResponse=&#039;)&#x5B;1].split(&#039;&amp;&#039;)&#x5B;0]\n            saml_tokens.append(decode_saml_response(saml_response_encoded))\n    \n    return saml_tokens\n\n################################################################################\n#  This function parses an XML string and returns it pretty-printed across\n# multiple lines with hierarchical indentation\n#\n# Args:\n#   xml_string (str): The XML string to be pretty-printed.\n#\n# Returns:\n#   str: A pretty-printed version of the XML string.\n################################################################################\ndef pretty_print_xml(xml_string):\n    dom = parseString(xml_string)\n    return dom.toprettyxml(indent=&quot;  &quot;)\n\n# Load HAR file with UTF-8 encoding\nwith open(str_har_file_path, &#039;r&#039;, encoding=&#039;utf-8&#039;) as file:\n    har_data = json.load(file)\n\nentries = har_data&#x5B;&#039;log&#039;]&#x5B;&#039;entries&#039;]\n\nsaml_tokens = find_saml_tokens(entries)\nfor token in saml_tokens:\n    print(&quot;Decoded SAML Token:&quot;)\n    print(pretty_print_xml(token))\n    print(&#039;-&#039; * 
80)\n<\/pre><\/div>\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I am working with a new application that doesn&#8217;t seem to like when a person has multiple roles assigned to them &#8230; however, I first need to prove that is the problem. Luckily, your browser gets the SAML response and you can actually see the Role entitlements that are being sent. Just need to parse &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[30],"tags":[664,1701,326],"class_list":["post-11332","post","type-post","status-publish","format-standard","hentry","category-system-administration","tag-python","tag-saml","tag-sso"],"_links":{"self":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/11332","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=11332"}],"version-history":[{"count":1,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/11332\/revisions"}],"predecessor-version":[{"id":11333,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=\/wp\/v2\/posts\/11332\/revisions\/11333"}],"wp:attachment":[{"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=11332"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=11332"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rushworth.us\/lisa\/index.php?rest_route=%2Fw
p%2Fv2%2Ftags&post=11332"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}