Category: Technology

ElasticSearch to OpenSearch: Local User Migration

One of the trickier bits of migrating from ElasticSearch to OpenSearch has been the local users — most of our users are authenticated via OAuth, but programmatic access is done with local user accounts. Fortunately, you appear to be able to get the user password hashes from the .opendistro_security index if you authenticate using an SSL cert.

This means the DN of the certificate being used must be registered in elasticsearch.yml as an admin DN:

plugins.security.authcz.admin_dn:
  - 'CN=admin,O=LJRTest,ST=Ohio,C=US'
  - 'CN=ljradmin,O=LJRTest,ST=Ohio,C=US'

Provided the certificate is an admin_dn, the account can be used to search the .opendistro_security index and return local user info — including hashes. Information within the document is base64 encoded, so the value needs to be decoded before you’ve got legible user information. Once the user record has been obtained, the information can be used to create a matching user through the OpenSearch security API.

import json
import requests
import base64
from requests.auth import HTTPBasicAuth

clientCrt = "./certs/ljr-mgr.pem"
clientKey = "./certs/ljr-mgr.key"
strOSAdminUser = 'something'
strOSAdminPass = 'something'

# Search the security index on the old ElasticSearch cluster, authenticating with the admin cert
r = requests.get("https://elasticsearch.example.com:9200/.opendistro_security/_search?pretty", verify=False, cert=(clientCrt, clientKey))
if r.status_code == 200:
    dictResult = r.json()

    for item in dictResult.get('hits').get('hits'):
        # The "internalusers" document holds the local user definitions, base64 encoded
        if item.get('_id') == "internalusers":
            strInternalUsersB64 = item.get('_source').get('internalusers')
            strUserJSON = base64.b64decode(strInternalUsersB64).decode("utf-8")
            dictUserInfo = json.loads(strUserJSON)
            for strUserName, dictUserRecord in dictUserInfo.items():
                # Skip reserved accounts (admin, kibanaserver, etc.) -- only migrate our own local users
                if dictUserRecord.get('reserved') == False:
                    dictUserDetails = {
                        "hash": dictUserRecord.get('hash'),
                        "opendistro_security_roles": dictUserRecord.get('opendistro_security_roles'),
                        "backend_roles": dictUserRecord.get('backend_roles'),
                        "attributes": dictUserRecord.get('attributes')
                    }

                    if dictUserRecord.get('description') is not None:
                        dictUserDetails["description"] = dictUserRecord.get('description')

                    # Create the matching user on the new OpenSearch cluster
                    reqCreateUser = requests.put(f'https://opensearch.example.com:9200/_plugins/_security/api/internalusers/{strUserName}', json=dictUserDetails, auth=HTTPBasicAuth(strOSAdminUser, strOSAdminPass), verify=False)
                    print(reqCreateUser.text)
else:
    print(r.status_code)
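
To spot-check the migration, a created user can be read back from the OpenSearch security API. This is just a quick sketch using the same placeholder hostname and admin credentials as above; the username is hypothetical:

import requests
from requests.auth import HTTPBasicAuth

strUserToCheck = "svc-logstash"  # hypothetical local account name

# Read the user back from the OpenSearch security API to confirm it was created
r = requests.get(f"https://opensearch.example.com:9200/_plugins/_security/api/internalusers/{strUserToCheck}",
                 auth=HTTPBasicAuth('something', 'something'), verify=False)
print(r.status_code)
print(r.json())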

ElasticSearch to OpenSearch Migration: Remote Reindex to Move Data

Since we cannot do an in-place upgrade of our ElasticSearch environment, I need to move everything to the new servers. The biggest component is moving the data, which can easily be done using remote reindex. Use the ElasticSearch API to get a list of all indices, then tell the OpenSearch API to reindex each index from the ElasticSearch remote (note that the ElasticSearch host needs to be allowlisted on the OpenSearch side via reindex.remote.allowlist before the remote reindex will be accepted). Re-running the reindex brings the destination index up to date with documents added since the previous run, so my plan is to spend a few days seeding the initial data, then perform delta updates leading up to the scheduled change.

import requests
from requests.auth import HTTPBasicAuth

f = open("results.txt", "a")

listIndexNames = []

# Get the list of indices from the source cluster, skipping system indices (names starting with '.')
reqGetIndexes = requests.get('https://elasticsearch.example.com:9200/_cat/indices?format=json', auth=HTTPBasicAuth('something', 'something'), verify=False)
for jsonIndex in reqGetIndexes.json():
    if jsonIndex.get('index')[0] != '.':
        listIndexNames.append(jsonIndex.get('index'))

for strIndexName in listIndexNames:
    jsonReindexItem = {
        "source": {
            "remote": {
                "host": "https://elasticsearch.example.com:9200",
                "username": "something",
                "password": "something"
            },
            "index": strIndexName
        },
        "dest": {
            "index": strIndexName
        }
    }

    r = requests.post('https://opensearch.example.com:9200/_reindex', json=jsonReindexItem, auth=HTTPBasicAuth('something', 'something'), verify=False)
    jsonResponse = r.json()
    print(jsonResponse)

    # Example parse failure returned with a 400:
    # {'error': {'root_cause': [{'type': 'x_content_parse_exception', 'reason': '[1:2] [reindex] unknown field [key]'}], 'type': 'x_content_parse_exception', 'reason': '[1:2] [reindex] unknown field [key]'}, 'status': 400}
    listFailures = jsonResponse.get('failures') or []
    if listFailures and "mapping set to strict" in listFailures[0].get('cause').get("reason"):
        # Destination index has strict dynamic mapping -- switch it to dynamic and retry the reindex
        print(listFailures[0].get('cause').get("reason"))
        print("I need to set dynamic mapping")
        r2 = requests.put(f'https://opensearch.example.com:9200/{strIndexName}/_mapping', json={"dynamic": "true"}, auth=HTTPBasicAuth('something', 'something'), verify=False)
        print(r2.json())
        r3 = requests.post('https://opensearch.example.com:9200/_reindex', json=jsonReindexItem, auth=HTTPBasicAuth('something', 'something'), verify=False)
        print(r3.json())
        print(f"{strIndexName}\t{r3.status_code}\t{r3.json()}\n")
        f.write(f"{strIndexName}\t{r3.status_code}\t{r3.json()}\n")

    elif r.status_code == 200:
        print(f"{strIndexName}\t{r.status_code}\t{jsonResponse}\n")
        f.write(f"{strIndexName}\t{r.status_code}\t{jsonResponse}\n")
    else:
        print(f"HTTP Error: {r.status_code} on web call")
        print(f"{strIndexName}\t{r.status_code}\t{jsonResponse}\n")
        f.write(f"{strIndexName}\t{r.status_code}\t{jsonResponse}\n")

f.close()
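
Before the cutover, I want to verify the destination has actually caught up. A rough sketch (same placeholder hostnames and credentials as above) that compares document counts between the two clusters for each index:

import requests
from requests.auth import HTTPBasicAuth

listIndexNames = ["index-one", "index-two"]  # hypothetical names; reuse the list built above

for strIndexName in listIndexNames:
    # _count returns {"count": <number of documents>, ...} on both clusters
    iSourceCount = requests.get(f"https://elasticsearch.example.com:9200/{strIndexName}/_count",
                                auth=HTTPBasicAuth('something', 'something'), verify=False).json().get('count')
    iDestCount = requests.get(f"https://opensearch.example.com:9200/{strIndexName}/_count",
                              auth=HTTPBasicAuth('something', 'something'), verify=False).json().get('count')
    print(f"{strIndexName}\t{iSourceCount}\t{iDestCount}\t{'OK' if iSourceCount == iDestCount else 'MISMATCH'}")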

ElasticSearch to OpenSearch Migration: Creating Index Templates

Prior to creating the indices, I need to create the index templates.

import requests
from requests.auth import HTTPBasicAuth
import json
from time import sleep

def serialize_sets(obj):
        if isinstance(obj, set):
                return list(obj)
        return obj

listIgnoredTemplates = ['.watch-history', '.watch-history-1', '.watch-history-2', '.watch-history-3', '.watch-history-4', '.watch-history-5', '.watch-history-6', '.watch-history-7', '.watch-history-8', '.watch-history-9', '.watch-history-10', '.watch-history-11', 'ilm-history', 'ilm-history_2', 'tenant_template', '.monitoring-logstash']

# Get all index templates from the ElasticSearch cluster
r = requests.get("https://elasticsearch.example.com:9200/_template", auth=HTTPBasicAuth('something', 'something'), verify=False)

dictAllTemplates = r.json()

for strTemplateName, dictTemplate in dictAllTemplates.items():
        if strTemplateName not in listIgnoredTemplates:
                # Carry over shard/replica counts where the template defines them, otherwise default to 3 shards / 1 replica
                if dictTemplate.get('settings').get('index'):
                        iShards = dictTemplate.get('settings').get('index').get('number_of_shards')
                        iReplicas = dictTemplate.get('settings').get('index').get('number_of_replicas')
                else:
                        iShards = 3
                        iReplicas = 1
                if iShards is None:
                        iShards = 3
                if iReplicas is None:
                        iReplicas = 1
                # Templates with an ILM rollover alias get that alias carried over; everything else is just patterns, settings, and mappings
                if dictTemplate.get('settings').get('index') and dictTemplate.get('settings').get('index').get('lifecycle'):
                        jsonAddTemplate = {
                                "index_patterns": dictTemplate.get('index_patterns'),
                                "template": {
                                        "aliases": {
                                                dictTemplate.get('settings').get('index').get('lifecycle').get('rollover_alias'): {}
                                        },
                                        "settings": {
                                                "number_of_shards": iShards,
                                                "number_of_replicas": iReplicas
                                        },
                                        "mappings": dictTemplate.get('mappings')
                                }
                        }
                else:
                        jsonAddTemplate = {
                                "index_patterns": dictTemplate.get('index_patterns'),
                                "template": {
                                        "settings": {
                                                "number_of_shards": iShards,
                                                "number_of_replicas": iReplicas
                                        },
                                        "mappings": dictTemplate.get('mappings')
                                }
                        }
                # Create the template as a composable index template on the OpenSearch side
                r2 = requests.put(f"https://opensearch.example.com:9200/_index_template/{strTemplateName}", json=jsonAddTemplate, auth=HTTPBasicAuth('something', 'something'), verify=False)
                print(r2.text)
                print(r2.status_code)
                sleep(2)
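
To spot-check a migrated template, it can be read back from the OpenSearch side. A quick sketch, same placeholder hostnames and credentials as above; the template name is hypothetical:

import requests
from requests.auth import HTTPBasicAuth

strTemplateName = "logstash"  # hypothetical template name

r = requests.get(f"https://opensearch.example.com:9200/_index_template/{strTemplateName}",
                 auth=HTTPBasicAuth('something', 'something'), verify=False)
# The response wraps each template in an "index_templates" list
for dictTemplate in r.json().get('index_templates', []):
    print(dictTemplate.get('name'))
    print(dictTemplate.get('index_template').get('index_patterns'))
    print(dictTemplate.get('index_template').get('template', {}).get('settings'))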

Another pretty buggy bug

We spent a morning trying to figure out why containers in a new installation of Swarm just couldn’t talk to each other. Overlay network looked fine. Firewall looked fine. You could get from the host to the container, just not from the container to a container on the other server. So … here’s a bug where your swarm (i.e. the thing you do when you want docker stuff to run across more than one server) cannot actually, ya know, talk to the other servers. Sigh!

https://github.com/moby/moby/issues/41775

Communicating With Kafka Server Using SSL

Create a Trust Store

Use the keytool command to create a trust store with the CA chain used in your certificates. I am using Venafi, so I need to import two CA public keys:

keytool -keystore kafka.truststore.jks -alias SectigoRoot -import -file "Sectigo RSA Organization Validation Secure Server CA.crt"
keytool -keystore kafka.truststore.jks -alias UserTrustRoot -import -file "USERTrust RSA Certification Authority.crt"

Update the Client Configuration

Create a producer-ssl.properties or consumer-ssl.properties based on your current producer/consumer properties file. Update the port – 9095 is used for SSL – and append the following lines

security.protocol=SSL
ssl.truststore.location=/path/to/kafka.truststore.jks
ssl.truststore.password=<WhateverYouSetInThePreviousStep>

Using the CLI Client Tools

Once you have a properly configured properties file, you can invoke either the kafka-console-consumer.sh or kafka-console-producer.sh script, pointing it at your new properties file:

/kafka/bin/kafka-console-consumer.sh --bootstrap-server kafka1586.example.net:9095 --topic LJRTest --consumer.config /kafka/config/consumer-ssl.properties --group LJR5

/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka1586.example.net:9095 --topic LJRTest --producer.config /kafka/config/producer-ssl.properties

To debug SSL communication, set the following KAFKA_OPTS prior to invoking the command line producer/consumer utilities:

export KAFKA_OPTS="-Djavax.net.debug=ssl,handshake"
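
If you are producing from Python rather than the CLI tools, the equivalent client setup with the kafka-python library looks roughly like this. This is a sketch, not something from the environment above; note that kafka-python reads PEM files, so point ssl_cafile at the CA chain certificates rather than the JKS trust store:

from kafka import KafkaProducer

# ssl_cafile is a PEM bundle of the same CA chain imported into the JKS trust store (path is a placeholder)
producer = KafkaProducer(
    bootstrap_servers=["kafka1586.example.net:9095"],
    security_protocol="SSL",
    ssl_cafile="/path/to/ca-chain.pem",
)
producer.send("LJRTest", b"hello from python over SSL")
producer.flush()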

Adding SSL To Kafka Server

Obtain SSL Certificates for Each Server

The following process was used to enable SSL communication with the Kafka servers. Firstly, generate certificates for each server in the environment. I am using a third-party certificate provider, Venafi. When you download the certificates, make sure to select the “PEM (OpenSSL)” format and check the box to “Extract PEM content into separate files (.crt, .key)”

Upload each zip file to the appropriate server under /tmp/ named in the $(hostname).zip format. The following series of commands creates the files needed in the Kafka server configuration. You will be asked to set passwords for the keystore and truststore JKS files. Don’t forget what you use — we’ll need them later.

# Assumes Venafi certificates downloaded as OpenSSL zip files with separate public/private keys are present in /tmp/$(hostname).zip
mkdir /kafka/config/ssl/$(date +%Y)
cd /kafka/config/ssl/$(date +%Y)
mv /tmp/$(hostname).zip ./
unzip $(hostname).zip

# Create keystore for Kafka
openssl pkcs12 -export -in $(hostname).crt -inkey $(hostname).key -out $(hostname).p12 -name $(hostname) -CAfile ./ca.crt -caname root
keytool -importkeystore -destkeystore $(hostname).keystore.jks -srckeystore $(hostname).p12 -srcstoretype pkcs12 -alias $(hostname)

# Create truststore from CA certs
keytool -keystore kafka.server.truststore.jks -alias SectigoRoot -import -file "Sectigo RSA Organization Validation Secure Server CA.crt"
keytool -keystore kafka.server.truststore.jks -alias UserTrustRoot -import -file "USERTrust RSA Certification Authority.crt"

# Fix permissions
chown -R kafkauser:kafkagroup /kafka/config/ssl

# Create symlinks for current-year certs
cd ..
ln -s /kafka/config/ssl/$(date +%Y)/$(hostname).keystore.jks /kafka/config/ssl/kafka.keystore.jks
ln -s /kafka/config/ssl/$(date +%Y)/kafka.server.truststore.jks /kafka/config/ssl/kafka.truststore.jks

By creating symlinks to the active certs, you can renew the certificates by creating a new /kafka/config/ssl/$(date +%Y) folder and updating the symlink. No change to the configuration files is needed.

Update Kafka server.properties to Use SSL

Append a listener prefixed with SSL:// to the existing listeners – as an example:

#2024-03-27 LJR Adding SSL port on 9095
#listeners=PLAINTEXT://kafka1587.example.net:9092
#advertised.listeners=PLAINTEXT://kafka1587.example.net:9092
listeners=PLAINTEXT://kafka1587.example.net:9092,SSL://kafka1587.example.net:9095
advertised.listeners=PLAINTEXT://kafka1587.example.net:9092,SSL://kafka1587.example.net:9095

Then add configuration values to use the keystore and truststore, specify which SSL protocols will be permitted, and set whatever client auth requirements you want:

ssl.keystore.location=/kafka/config/ssl/kafka.keystore.jks
ssl.keystore.password=<WhateverYouSetEarlier>
ssl.truststore.location=/kafka/config/ssl/kafka.truststore.jks
ssl.truststore.password=<WhateverYouSetForThisOne>
ssl.enabled.protocols=TLSv1.2,TLSv1.3
# Or whatever client auth setting you require
ssl.client.auth=none

Save the server.properties file and use “systemctl restart kafka” to restart the Kafka service.

Update Firewall Rules to Permit Traffic on New Port

firewall-cmd --add-port=9095/tcp
firewall-cmd --add-port=9095/tcp --permanent
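
A quick way to confirm the new listener is up and serving the expected certificate is to open a TLS connection to port 9095 and inspect the peer certificate. A sketch using Python's standard library; the PEM CA bundle path is a placeholder:

import socket
import ssl

strBroker = "kafka1587.example.net"
iPort = 9095

# Validate against the same CA chain used to build the trust store (PEM bundle path is hypothetical)
ctx = ssl.create_default_context(cafile="/kafka/config/ssl/ca-chain.pem")
with socket.create_connection((strBroker, iPort)) as sock:
    with ctx.wrap_socket(sock, server_hostname=strBroker) as tls:
        cert = tls.getpeercert()
        print("Negotiated:", tls.version())
        print("Subject:", cert.get("subject"))
        print("Expires:", cert.get("notAfter"))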

OpenSearch Proof of Concept In-Place Upgrade from ElasticSearch 7.7.0 to OpenSearch 2.12.0

I need to migrate my ElasticSearch installation over to OpenSearch. From reading the documentation, it isn’t really clear if that is even possible as an in-place upgrade or if I’d need to use a remote reindex or snapshot backup/restore. So I tested the process with a minimal data set. TL;DR: Yes, it works.

Create a docker instance of ElasticSearch 7.7.0

mkdir /docker/es/esdata
chmod -R g+rwx /docker/es/esdata
chgrp -R 0 /docker/es/esdata

mkdir /docker/es/esconfig

Populate configuration info into ./esconfig; ./esdata remains an empty directory.

docker run --name es770 -dit -v /docker/es/esdata:/usr/share/elasticsearch/data -v /docker/es/esconfig:/usr/share/elasticsearch/config -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.7.0

Populate Data into ElasticSearch Sandbox

Use curl to populate an index with some records – you can create lifecycle policies, customize the fields, etc … this is the bare minimum to validate that data in ES 7.7 can be ingested by OS 2.12:

curl -X POST "localhost:9200/ljrtest/_bulk" -H "Content-Type: application/x-ndjson" -d'
{"index": {"_id": "1"}}
{"id": "1", "message": "Record one"}
{"index": {"_id": "2"}}
{"id": "2", "message": "Record two"}
{"index": {"_id": "3"}}
{"id": "3", "message": "Record three"}
{"index": {"_id": "4"}}
{"id": "4", "message": "Record four"}
{"index": {"_id": "5"}}
{"id": "5", "message": "Record five"}
{"index": {"_id": "6"}}
{"id": "6", "message": "Record six"}
{"index": {"_id": "7"}}
{"id": "7", "message": "Record seven"}
{"index": {"_id": "8"}}
{"id": "8", "message": "Record eight"}
{"index": {"_id": "9"}}
{"id": "9", "message": "Record nine"}
{"index": {"_id": "10"}}
{"id": "10", "message": "Record ten"}
'

Shut Down ElasticSearch

docker stop es770

Bring Up an OpenSearch 2.12 Host

mkdir /docker/es/osconfig

Populate the configuration data for OpenSearch in ./osconfig

docker run --name os212 -dit -v /docker/es/esdata:/usr/share/opensearch/data -v /docker/es/osconfig:/usr/share/opensearch/config -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=P@s5w0rd-123" opensearchproject/opensearch:2.12.0

Verify Data is Still Available in OpenSearch

[root@docker es]# curl -k -u "admin:P@s5w0rd-123" https://localhost:9200/ljrtest
{"ljrtest":{"aliases":{},"mappings":{"properties":{"id":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"message":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"settings":{"index":{"creation_date":"1710969477402","number_of_shards":"1","number_of_replicas":"1","uuid":"AO5JBoyzSJiKZA9xeA2imQ","version":{"created":"7070099","upgraded":"136337827"},"provided_name":"ljrtest"}}}}

Conclusion

Yes, a very basic data set in ElasticSearch 7.7.0 can be upgraded in-place to OpenSearch 2.12.0 — in the “real world” compatibility issues will crop up (flatten!!), but the idea is fundamentally sound.

Problem, though, is compatibility issues. We don’t have exotic data types in our instance, but Kibana uses “flatten” … so those rare people who use Kibana to access and visualize their data really cannot just move to OpenSearch. That’s a huge caveat. I can recreate everything manually after deleting all of the Kibana indices (and possibly some more — I haven’t gone this route to see). But if I’m going to recreate everything, why wouldn’t I recreate everything and use remote reindex to move data? I can do this incrementally — take a week to move all the data slowly, do a catch-up reindex t-2 days, another t-1 days, another the day of the change, heck even one a few hours before the change. Then the change is a quick delta reindex, stop ElasticSearch, and swap over to OpenSearch. The backout is to just swing back to the fully functional, unchanged ElasticSearch instance.

Office 365 Activation Failure

We’ve been working to lock down our workstations … not “so secure you cannot use it”, but just this side of the functional/nonfunctional line. Everything went surprisingly well except I use the Office 365 suite for work. Periodically, it has to “phone home” and verify my work account is still valid. And that didn’t seem to go through the proxy well. The authentication screen would pop up and immediately throw an error:

No internet connection. Please check your network settings and try again [2604]

I spent a whole bunch of time playing around with the firewall rules, the proxy rules … and finally went so far as to just turn off the firewall and remove the proxy. And it still didn’t work. Which was nice because it means I didn’t break it … but also meant it was going to be a lot harder to fix!

Finally found the culprit — a new Windows installation, for some reason, uses really old SSL/TLS versions. Turned on TLS 1.2 and, voila, I’ve got a sign-on screen. Sigh! Turned the firewall & proxy back on, and everything works beautifully. I think I’m going to add these settings to the domain policy so I don’t have to configure this silliness every time.

Firewall Settings: Local Network Access Plus Skype

I’m playing around with blocking all outbound connections on our computers and running most traffic through a proxy … Skype, however, won’t make voice/video calls with the HTTPS proxy set. We had to add a lot of subnets to the ruleset before the called party would get a ring. But it finally worked. This is the nftables ruleset, but I’ve got the same subnets added to the Windows Firewall too.

table inet filter {
        chain WIFI-FILTERONLYLOCAL {
                type filter hook output priority filter; policy accept;
                ip protocol tcp ip daddr 10.0.0.0/8 accept
                ip protocol udp ip daddr 10.0.0.0/8 accept
                ip protocol tcp ip daddr 13.64.0.0/11 accept
                ip protocol tcp ip daddr 13.96.0.0/13 accept
                ip protocol tcp ip daddr 13.104.0.0/14 accept
                ip protocol tcp ip daddr 13.107.0.0/16 accept
                ip protocol tcp ip daddr 13.107.6.171/32 accept
                ip protocol tcp ip daddr 13.107.18.15/32 accept
                ip protocol tcp ip daddr 13.107.140.6/32 accept
                ip protocol tcp ip daddr 20.20.32.0/19 accept
                ip protocol tcp ip daddr 20.180.0.0/14 accept
                ip protocol tcp ip daddr 20.184.0.0/13 accept
                ip protocol tcp ip daddr 20.190.128.0/18 accept
                ip protocol tcp ip daddr 20.192.0.0/10 accept
                ip protocol tcp ip daddr 20.202.0.0/16 accept
                ip protocol udp ip daddr 20.202.0.0/16 accept
                ip protocol tcp ip daddr 20.231.128.0/19 accept
                ip protocol tcp ip daddr 40.126.0.0/18 accept
                ip protocol tcp ip daddr 51.105.0.0/16 accept
                ip protocol tcp ip daddr 51.116.0.0/16 accept
                ip protocol tcp ip daddr 52.108.0.0/14 accept
                ip protocol tcp ip daddr 52.112.0.0/14 accept
                ip protocol tcp ip daddr 52.138.0.0/16 accept
                ip protocol udp ip daddr 52.138.0.0/16 accept
                ip protocol tcp ip daddr 52.145.0.0/16 accept
                ip protocol tcp ip daddr 52.146.0.0/15 accept
                ip protocol tcp ip daddr 52.148.0.0/14 accept
                ip protocol tcp ip daddr 52.152.0.0/13 accept
                ip protocol tcp ip daddr 52.160.0.0/11 accept
                ip protocol tcp ip daddr 52.244.37.168/32 accept
                ip protocol tcp ip daddr 138.91.0.0/16 accept
                ip protocol udp ip daddr 138.91.0.0/16 accept
                ip protocol icmp accept
                ip protocol udp ct state { established, related } accept
                limit rate over 1/second log prefix "FILTERONLYLOCAL: "
                drop
        }
}

Python Script: Alert for pending SAML IdP Certificate Expiry

I got a rather last minute notice from our security department that the SSL certificate used in the IdP partnership between my application and their identity provider would be expiring soon and did I want to renew it Monday, Tuesday, or Wednesday. Being that this was Friday afternoon … “none of the above” would have been my preference to avoid filing the “emergency change” paperwork, but Wednesday was the least bad of the three options. Of course, an emergency requires paperwork as to why you didn’t plan two weeks in advance. And how you’ll do better next time.

Sometimes that is a bit of a stretch — next time someone is working on the electrical system and drops a half-inch metal plate into the building wiring, I’m probably still going to have a problem when the power drops. But, in this case, there are two perfectly rational solutions. One, of course, would be that the people planning the certificate renewals start contacting partner applications more promptly. But that’s not within my purview. The thing I can do is watch the metadata on the identity provider and tell myself when the certificates will be expiring soon.

So I now have a little python script that has a list of all of our SAML-authenticated applications. It pulls the metadata from PingID, loads the X509 certificate, checks how far in the future the expiry date is. In my production version, anything < 30 days sends an e-mail alert. Next time, we can contact security ahead of time, find out when they’re planning on doing the renewal, and get the change request approved well in advance.

import requests
import xml.etree.ElementTree as ET
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from datetime import datetime

strIDPMetadataURLBase = 'https://login.example.com/pf/federation_metadata.ping?PartnerSpId='
listSPIDs = ["https://tableau.example.com", "https://email.example.com", "https://internal.example.com", "https://salestool.example.com"]

for strSPID in listSPIDs:
    objResults = requests.get(f"{strIDPMetadataURLBase}{strSPID}")
    if objResults.status_code == 200:
        try:
            root = ET.fromstring(objResults.text)

            # Find every X509Certificate element under the IdP's SSO descriptor
            for objX509Cert in root.findall("./{urn:oasis:names:tc:SAML:2.0:metadata}IDPSSODescriptor/{urn:oasis:names:tc:SAML:2.0:metadata}KeyDescriptor/{http://www.w3.org/2000/09/xmldsig#}KeyInfo/{http://www.w3.org/2000/09/xmldsig#}X509Data/{http://www.w3.org/2000/09/xmldsig#}X509Certificate"):
                # Wrap the bare base64 blob in PEM headers so the cryptography library can parse it
                strX509Cert = f"-----BEGIN CERTIFICATE-----\n{objX509Cert.text}\n-----END CERTIFICATE-----"

                cert = x509.load_pem_x509_certificate(bytes(strX509Cert, 'utf8'), default_backend())
                deltaUntilExpiry = cert.not_valid_after - datetime.today()
                print(f"{strSPID}\t{deltaUntilExpiry.days}")
        except Exception:
            print(f"{strSPID}\tFailed to decode X509 Certificate")
    else:
        print(f"{strSPID}\tFailed to retrieve metadata XML")