Category: System Administration

Signing Kernel Modules

The new servers being built at work use SecureBoot — something that you don’t even notice 99% of the time. But that 1% where you are doing something “strange” like trying to use OpenZFS … well, you’ve got to sign any kernel modules that you need to use. Just installing them doesn’t work — they won’t load.

To sign a kernel module, first you need to create a signing key and use mokutil to import it into the machine owner key store.

cd /root
mkdir signing
cd signing
openssl req -new -x509 -newkey rsa:2048 -keyout MOK.priv -outform DER -out MOK.der -nodes -days 36500 -subj "/CN=Windstream/"

mokutil --import MOK.der

When you run mokutil, you will set a password. This password will be needed to complete importing the key to the machine.
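
Before rebooting, you can confirm the key is queued for enrollment. A quick check (a sketch; mokutil ships with this option):

mokutil --list-new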

Get access to the console — out of band management, vSphere manager, stand in front of the server. Reboot, and there will be a “press any key” screen for ten seconds that begins the import process. Press any key!

Select “Enroll MOK”

View the key and verify it is the right one, then use ‘Continue’ to import it

Enter the password used when you ran mokutil

Then reboot

To verify your key has been successfully enrolled:

mokutil --list-enrolled
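
With the key enrolled, the modules themselves get signed using the kernel’s sign-file script. A minimal sketch, assuming the matching kernel-devel package is installed; the module path is a placeholder for whatever module you need to load (here an OpenZFS module):

# Sign the module with the key pair created above
/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 /root/signing/MOK.priv /root/signing/MOK.der /lib/modules/$(uname -r)/extra/zfs.ko

# Confirm the module now carries a signature
modinfo -F signer /lib/modules/$(uname -r)/extra/zfs.ko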

Postgresql with File System Compression – VDO and ZFS

Our database storage is sizable. To reduce the financial impact of storing so much data, we opted to use a compressed file system. This allows us to maintain, for example, 8TB of data in under 2TB of space. Unfortunately, the ZFS file system we use to compress our data is no longer “built in” with newer versions of RedHat.

There are alternatives. BTRFS is a long-standing option; however, it’s got reliability issues, and our pilot didn’t go well (we ran BTRFS on one of the read-only replicas, and the compression ratio was nowhere near as good: the 2TB of ZFS data filled the 10TB BTRFS disk even using the better compression option. And I/O was so slow there was a continual replication backlog). RedHat introduced Virtual Data Optimizer (VDO) to replace ZFS. In theory, it’s better since it also deduplicates data (e.g. if every one of us saved the same PPT presentation to the disk, only one copy would actually be stored). That’s great for email and file shares where a lot of people are likely to store the same information. Not so useful on a database server where there’s little to de-duplicate. It does, however, compress data … so we decided to try it out.

The results, unfortunately, are not spectacular. VDO does not allow you to do much customization of the compression. It’s on or off. I’ve found some people tweaking it in unsupported ways, but the impetus behind trying VDO was that it’s supported by RedHat; making unsupported changes to it defeats that purpose. And the compression that we’re seeing is far less than we get with ZFS. Our existing servers run between 4.5x and 6x compression.

In VDO, however, we don’t even get a 2x compression factor. 11TB of information is stored in 8TB of space! That’s 1.4x.
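
For what it’s worth, both figures are easy to check on the servers themselves. A quick sketch, with the pool name as a placeholder:

# ZFS: overall compression ratio for the pool
zfs get compressratio pgpool

# VDO: logical vs. physical usage and the space saving percentage
vdostats --human-readable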

So, while we found the performance of VDO to be satisfactory and it’s really easy to use in newer RedHat releases … we’d have to increase our 20TB LUNs to 80TB to continue storing the data we store today. That seems like A Really Bad Idea(tm).

Seems like I’m going to have to sort out using OpenZFS on the new servers.

Tableau — Who Deleted That Workbook?!?

While Tableau doesn’t have anything nice like a ‘dumpster’ from which you can restore a deleted workbook, it does at least keep tables for historic events like workbook deletion. The following query finds records where a workbook with FOOBAR in its name was deleted. It lists all of the event info as well as info on the user who deleted it. Near as I can tell, the “created” date for the historical_events table is the date the workbook was deleted (from my restore, I know the workbook itself was created last year!)

SELECT historical_events.*, hist_workbooks.*, hist_users.*
FROM historical_events
left outer join historical_event_types on historical_event_types.type_id = historical_events.historical_event_type_id 
left outer join hist_workbooks on hist_workbooks.id = historical_events.hist_workbook_id 
left outer join hist_users on hist_users.id = historical_events.hist_actor_user_id 
WHERE historical_event_types.name = 'Delete Workbook'
and hist_workbooks.name like '%FOOBAR%'
;

Oracle SQLNET.ORA Active Directory Anonymous Binds

A very, very long time ago (2002-ish), we moved to using AD to store our Oracle connections — it’s far easier to edit the one connection entry in Active Directory than to distribute the latest connection file to every desktop and server in the company. Frankly, they never get to the servers. Individuals enter the connections they need … and update them when something stops working and they find the new host/port/etc. Unfortunately, Oracle used an anonymous connection to retrieve the data. So we’ve had anonymous binds enabled in Active Directory ever since. I no longer support AD, so haven’t really kept up with it … until a coworker asked why this huge security vulnerability was specifically configured for our domain. And I gave him the whole history. While we were chatting, a quick search revealed that Oracle 21c and later clients actually can use a wallet for credentials in the sqlnet.ora file:

NAMES.LDAP_AUTHENTICATE_BIND = TRUE
NAMES.LDAP_AUTHENTICATE_BIND_METHOD = LDAPS_SIMPLE_AUTH
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = /path/to/wallet.file))
  )

From https://www.oracle.com/a/otn/docs/database/oracle-net-active-directory-naming.pdf

 

Postgresql and Timescale with RedHat VDO

RedHat is phasing out ZFS – there are several reasons for this move, but primarily ZFS is a closed source Solaris (now Oracle) codebase. While OpenZFS exists, it’s not quite ‘the same’. RedHat’s preferred solution is Virtual Data Optimizer (VDO). This page walks through the process of installing PostgreSQL, creating a database cluster on VDO, and installing the TimescaleDB extension on that cluster, all on RedHat Enterprise Linux 8 (RHEL8).

Before we create a VDO disk, we need to install it

yum install vdo kmod-kvdo

Then we need to create a VDO volume – here a VDO named ‘PGData’ is created on /dev/sdb, a 9TB physical volume on which we will present a 16TB logical volume

vdo create --name=PGData --device=/dev/sdb --vdoLogicalSize=16T

Check to verify that the object was created – it is /dev/mapper/PGData in this instance

vdo list

Now format the volume using xfs.

mkfs.xfs /dev/mapper/PGData

And finally add a mount point

# Create the mount point folder
mkdir /pgpool
# Update fstab to mount the new volume to that mount point
cat /etc/fstab
/dev/mapper/PGData /pgpool xfs defaults,x-systemd.requires=vdo.service 0 0
# Load the updated fstab
systemctl daemon-reload
# and mount the volume
mount -a

It should now be mounted at ‘/pgpool/’.

The main reason for using VDO with Postgres is its compression feature – compression is enabled automatically, although we may need to tweak settings as we test it.
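
Once data starts landing on the volume, it’s worth confirming compression is on and checking the effective savings. A quick sketch using the vdo tooling installed above:

# Confirm compression is enabled on the PGData volume
vdo status --name=PGData | grep -i compression

# Logical vs. physical usage and the space saving percentage
vdostats --human-readable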

We now have a place in our pool where we want our Postgres database to store its data, so let’s go ahead and install PostgreSQL. Here we are using RHEL8 and installing PostgreSQL 12:

# Install the repository RPM:
dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
dnf clean all
# Disable the built-in PostgreSQL module:
dnf -qy module disable postgresql
# Install PostgreSQL:
dnf install -y postgresql12-server

Once the installation is done, we need to initialize the database cluster and start the server. Since we want Postgres to store its data on our VDO volume, we need to initialize the cluster into our custom directory; we can do that in a couple of ways.

In all cases, we need to make sure that the data directory on the VDO volume, i.e. ‘/pgpool/pgdata/’, is owned by the ‘postgres’ user which is created when we install PostgreSQL. We can do that by running the commands below before the steps for starting the Postgres server:

mkdir /pgpool/pgdata
chown -R postgres:postgres /pgpool

Customize the systemd service by editing the postgresql-12 unit file and updating the PGDATA environment variable:

vdotest-uos:pgpool # grep Environment /usr/lib/systemd/system/postgresql-12.service
# Note: avoid inserting whitespace in these Environment= lines, or you may
Environment=PGDATA=/pgpool/pgdata
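
Editing the packaged unit file works, but an RPM update can overwrite it; a drop-in override accomplishes the same thing. A sketch:

# Creates an override under /etc/systemd/system/postgresql-12.service.d/
systemctl edit postgresql-12
# add these two lines in the editor that opens:
#   [Service]
#   Environment=PGDATA=/pgpool/pgdata
systemctl daemon-reload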

and then initialize, enable, and start our server as below

/usr/pgsql-12/bin/postgresql-12-setup initdb
systemctl enable postgresql-12
systemctl start postgresql-12

Here ‘/usr/pgsql-12/bin/’ is the bin directory of the Postgres installation; substitute your own bin directory path if it differs.

or

We can also give the data directory value directly while initializing the database, using the command below (the PGDATA setting in the systemd unit still needs to point at the same directory for systemctl to manage the server):

/usr/pgsql-12/bin/initdb -D /pgpool/pgdata/

and then start the server using

systemctl start postgresql-12
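
Either way, a quick sanity check that the cluster is really running out of the VDO-backed directory (a sketch):

# Should report /pgpool/pgdata
sudo -u postgres psql -c "SHOW data_directory;"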

Now that we have installed PostgreSQL and started the server, we will install the Timescale extension for Postgres.

Add the Timescale repo with the command below:

tee /etc/yum.repos.d/timescale_timescaledb.repo <<EOL
[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/8/\$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
EOL
sudo yum update -y

Then install it using the command below:

yum install -y timescaledb-postgresql-12

After installing, we need to add ‘timescaledb’ to shared_preload_libraries in our postgresql.conf. Timescale provides ‘timescaledb-tune’, which can do this and also configure various settings for our database. Since we initialized our PG database cluster in a custom location, we need to point timescaledb-tune at our postgresql.conf; it also requires the path to our pg_config file. We can do both with the following command:

timescaledb-tune --pg-config=/usr/pgsql-12/bin/pg_config --conf-path=/pgpool/pgdata/postgresql.conf

After running the above command, we need to restart our Postgres server:

systemctl restart postgresql-12

After restarting, connect to the database in which you want to use Timescale hypertables and run the statement below to load the Timescale extension:

CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

You can check whether Timescale is loaded by running the ‘\dx’ command in psql, which lists the installed extensions.

In order to configure PostgreSQL to allow remote connections, we need to make a couple of changes.
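
The standard changes, sketched here with the subnet and auth method as placeholders to adapt to your environment, are to have Postgres listen on the network in postgresql.conf, allow the remote clients in pg_hba.conf, and restart:

# /pgpool/pgdata/postgresql.conf – listen on all interfaces (or a specific address)
listen_addresses = '*'

# /pgpool/pgdata/pg_hba.conf – allow password auth from, e.g., an application subnet
host    all    all    10.0.0.0/8    scram-sha-256

# restart to pick up the changes
systemctl restart postgresql-12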

Office 365 Activation Failure

We’ve been working to lock down our workstations … not “so secure you cannot use it”, but just this side of the functional/nonfunctional line. Everything went surprisingly well, except that I use the Office 365 suite for work. Periodically, it has to “phone home” and verify my work account is still valid. And that didn’t seem to go through the proxy well. The authentication screen would pop up and immediately throw an error:

No internet connection. Please check your network settings and try again [2604]

I spent a whole bunch of time playing around with the firewall rules, the proxy rules … and finally went so far as to just turn off the firewall and remove the proxy. And it still didn’t work. Which was nice because it means I didn’t break it … but also meant it was going to be a lot harder to fix!

Finally found the culprit — a new Windows installation, for some reason, has only really old SSL/TLS versions enabled. Turned on TLS 1.2 and, voila, I’ve got a sign-on screen. Sigh! Turned the firewall & proxy back on, and everything works beautifully. I think I’m going to add these settings to the domain policy so I don’t have to configure this silliness every time.
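
Assuming the fix was the Internet Options protocol checkboxes, which map to the WinINET ‘SecureProtocols’ registry value, the per-user equivalent for a policy or login script would look something like this sketch:

REM 2688 (0xA80) = TLS 1.0 + TLS 1.1 + TLS 1.2 enabled for WinINET
reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings" /v SecureProtocols /t REG_DWORD /d 2688 /f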

Firewall Settings: Local Network Access Plus Skype

I’m playing around with blocking all outbound connections on our computers and running most traffic through a proxy … Skype, however, won’t make voice/video calls with the HTTPS proxy set. We had to add a lot of subnets to the ruleset before the called party would get a ring. But it finally worked. This is the nftables ruleset, but I’ve got the same subnets added to the Windows Firewall too.

table inet filter {
        chain WIFI-FILTERONLYLOCAL {
                type filter hook output priority filter; policy accept;
                ip protocol tcp ip daddr 10.0.0.0/8 accept
                ip protocol udp ip daddr 10.0.0.0/8 accept
                ip protocol tcp ip daddr 13.64.0.0/11 accept
                ip protocol tcp ip daddr 13.96.0.0/13 accept
                ip protocol tcp ip daddr 13.104.0.0/14 accept
                ip protocol tcp ip daddr 13.107.0.0/16 accept
                ip protocol tcp ip daddr 13.107.6.171/32 accept
                ip protocol tcp ip daddr 13.107.18.15/32 accept
                ip protocol tcp ip daddr 13.107.140.6/32 accept
                ip protocol tcp ip daddr 20.20.32.0/19 accept
                ip protocol tcp ip daddr 20.180.0.0/14 accept
                ip protocol tcp ip daddr 20.184.0.0/13 accept
                ip protocol tcp ip daddr 20.190.128.0/18 accept
                ip protocol tcp ip daddr 20.192.0.0/10 accept
                ip protocol tcp ip daddr 20.202.0.0/16 accept
                ip protocol udp ip daddr 20.202.0.0/16 accept
                ip protocol tcp ip daddr 20.231.128.0/19 accept
                ip protocol tcp ip daddr 40.126.0.0/18 accept
                ip protocol tcp ip daddr 51.105.0.0/16 accept
                ip protocol tcp ip daddr 51.116.0.0/16 accept
                ip protocol tcp ip daddr 52.108.0.0/14 accept
                ip protocol tcp ip daddr 52.112.0.0/14 accept
                ip protocol tcp ip daddr 52.138.0.0/16 accept
                ip protocol udp ip daddr 52.138.0.0/16 accept
                ip protocol tcp ip daddr 52.145.0.0/16 accept
                ip protocol tcp ip daddr 52.146.0.0/15 accept
                ip protocol tcp ip daddr 52.148.0.0/14 accept
                ip protocol tcp ip daddr 52.152.0.0/13 accept
                ip protocol tcp ip daddr 52.160.0.0/11 accept
                ip protocol tcp ip daddr 52.244.37.168/32 accept
                ip protocol tcp ip daddr 138.91.0.0/16 accept
                ip protocol udp ip daddr 138.91.0.0/16 accept
                ip protocol icmp accept
                ip protocol udp ct state { established, related } accept
                limit rate over 1/second log prefix "FILTERONLYLOCAL: "
                drop
        }
}

Python Script: Alert for pending SAML IdP Certificate Expiry

I got a rather last minute notice from our security department that the SSL certificate used in the IdP partnership between my application and their identity provider would be expiring soon and did I want to renew it Monday, Tuesday, or Wednesday. Being that this was Friday afternoon … “none of the above” would have been my preference to avoid filing the “emergency change” paperwork, but Wednesday was the least bad of the three options. Of course, an emergency requires paperwork as to why you didn’t plan two weeks in advance. And how you’ll do better next time.

Sometimes that is a bit of a stretch — next time someone is working on the electrical system and drops a half-inch metal plate into the building wiring, I’m probably still going to have a problem when the power drops. But, in this case, there are two perfectly rational solutions. One, of course, would be that the people planning the certificate renewals start contacting partner applications more promptly. But that’s not within my purview. The thing I can do is watch the metadata on the identity provider and tell myself when the certificates will be expiring soon.

So I now have a little python script that has a list of all of our SAML-authenticated applications. It pulls the metadata from PingID, loads the X509 certificate, checks how far in the future the expiry date is. In my production version, anything < 30 days sends an e-mail alert. Next time, we can contact security ahead of time, find out when they’re planning on doing the renewal, and get the change request approved well in advance.

import requests
import xml.etree.ElementTree as ET
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from datetime import datetime

strIDPMetadataURLBase = 'https://login.example.com/pf/federation_metadata.ping?PartnerSpId='
listSPIDs = ["https://tableau.example.com", "https://email.example.com", "https://internal.example.com", "https://salestool.example.com"]

for strSPID in listSPIDs:
    # Pull the IdP metadata XML for this SP partnership
    objResults = requests.get(f"{strIDPMetadataURLBase}{strSPID}")
    if objResults.status_code == 200:
        try:
            root = ET.fromstring(objResults.text)

            # Find every X509Certificate element under the IdP's KeyDescriptor entries
            for objX509Cert in root.findall("./{urn:oasis:names:tc:SAML:2.0:metadata}IDPSSODescriptor/{urn:oasis:names:tc:SAML:2.0:metadata}KeyDescriptor/{http://www.w3.org/2000/09/xmldsig#}KeyInfo/{http://www.w3.org/2000/09/xmldsig#}X509Data/{http://www.w3.org/2000/09/xmldsig#}X509Certificate"):
                strX509Cert = f"-----BEGIN CERTIFICATE-----\n{objX509Cert.text}\n-----END CERTIFICATE-----"

                # Parse the certificate and compute how long until it expires
                cert = x509.load_pem_x509_certificate(bytes(strX509Cert,'utf8'), default_backend())
                iDaysUntilExpiry = cert.not_valid_after - datetime.today()
                print(f"{strSPID}\t{iDaysUntilExpiry.days}")
        except Exception:
            print(f"{strSPID}\tFailed to decode X509 Certificate")
    else:
        print(f"{strSPID}\tFailed to retrieve metadata XML")