Category: System Administration

Listing Unit Files

I usually know the name of the unit file for a service, but sometimes you just need to ask what’s there, or search for one that isn’t showing up under the expected name.

linux1505:~ # systemctl list-unit-files | grep zfs
zfs-import-cache.service                   enabled
zfs-import-scan.service                    disabled
zfs-import.service                         masked
zfs-load-key.service                       masked
zfs-mount.service                          enabled
zfs-scrub@.service                         static
zfs-share.service                          enabled
zfs-volume-wait.service                    enabled
zfs-zed.service                            enabled
zfs-import.target                          enabled
zfs-volumes.target                         disabled
zfs.target                                 enabled
zfs-scrub-monthly@.timer                   disabled
zfs-scrub-weekly@.timer                    disabled
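
When grep is not enough (say, you only want the units in a particular state), the same output can be parsed with a few lines of Python. This is a minimal sketch, assuming the unit name and state are the first two whitespace-separated columns; newer systemd releases add a third preset column, which is ignored here. The function name parse_unit_files is my own, not part of any library.

```python
def parse_unit_files(output: str) -> dict:
    """Map unit file name -> state from `systemctl list-unit-files` output."""
    units = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the trailing "N unit files listed." line
        if len(fields) < 2 or fields[0] == "UNIT" or "." not in fields[0]:
            continue
        units[fields[0]] = fields[1]
    return units


# Sample rows taken from the listing above
sample = """\
zfs-import-cache.service                   enabled
zfs-import-scan.service                    disabled
zfs-import.service                         masked
zfs-scrub@.service                         static
zfs.target                                 enabled
"""

units = parse_unit_files(sample)
enabled = sorted(name for name, state in units.items() if state == "enabled")
print(enabled)
```

In practice the input would come from running systemctl list-unit-files and capturing its output rather than from a pasted sample.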

SSSD LDAP Schema

I lost access to all of my Linux servers at work. And, unlike the normal report where nothing changed but xyz is now failing, I knew exactly what happened. A new access request had been approved about ten minutes previously. Looking at my ID, adding the new group membership had, for some reason, changed my account’s gid number to that new group. Except … that shouldn’t have actually dropped my access. If I needed a different group to be my primary group, I should have been able to use newgrp to switch contexts. Instead, I got prompted for a group password (which, yes, is a thing. No, no one uses it).

The hosts were set up to authenticate to AD using LDAP, and very successfully let me log in (or not, if I mistyped my password). They, however, would only see me as a member of my primary group. Well, today, I finally got a back door with sufficient access to poke around.

Turns out I was right: something was improperly configured, so groups were not being read from the directory but rather implied from the gid value. I added the configuration parameter ldap_schema to instruct the server to use member instead of memberUid for memberships. I used rfc2307bis as that’s the value I was familiar with. I expect “AD” could be used as well, but figured we were well beyond AD 2008r2 and didn’t really want to dig further into the nuanced differences between the two settings.

From https://linux.die.net/man/5/sssd-ldap

ldap_schema (string)

Specifies the Schema Type in use on the target LDAP server. Depending on the selected schema, the default attribute names retrieved from the servers may vary. The way that some attributes are handled may also differ.

Four schema types are currently supported:

  • rfc2307
  • rfc2307bis
  • IPA
  • AD

The main difference between these schema types is how group memberships are recorded in the server. With rfc2307, group members are listed by name in the memberUid attribute. With rfc2307bis and IPA, group members are listed by DN and stored in the member attribute. The AD schema type sets the attributes to correspond with Active Directory 2008r2 values.
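
To make the difference concrete, here is a toy sketch of how a client would resolve memberships under each schema. It is not how SSSD is implemented; the user, the group entries, and the function name groups_for_user are all made up for illustration, and DN matching is simplified to a case-insensitive string comparison.

```python
def groups_for_user(user, groups, schema):
    """Return names of groups that list the user, following the given schema.

    user   -- dict with 'uid' (login name) and 'dn' (full distinguished name)
    groups -- list of dicts with 'cn' plus 'memberUid' and/or 'member' lists
    schema -- 'rfc2307' (match login name in memberUid) or
              'rfc2307bis' (match full DN in member)
    """
    if schema == "rfc2307":
        attr, needle = "memberUid", user["uid"]
    elif schema == "rfc2307bis":
        attr, needle = "member", user["dn"].lower()
    else:
        raise ValueError(f"unsupported schema: {schema}")

    matched = []
    for group in groups:
        values = group.get(attr, [])
        if attr == "member":
            # Simplification: treat DNs as case-insensitive strings
            values = [value.lower() for value in values]
        if needle in values:
            matched.append(group["cn"])
    return matched


# Hypothetical directory contents
user = {"uid": "jdoe", "dn": "CN=jdoe,OU=Users,DC=example,DC=com"}
groups = [
    {"cn": "linux-admins", "member": ["CN=jdoe,OU=Users,DC=example,DC=com"]},
    {"cn": "legacy-ops", "memberUid": ["jdoe"]},
]

print(groups_for_user(user, groups, "rfc2307"))
print(groups_for_user(user, groups, "rfc2307bis"))
```

A directory that records memberships only as DNs in member looks empty to a client searching memberUid, which matches the symptom of seeing nothing but the primary group.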

 

Sumo Logic: Running Queries via API

This is my base script for using the Sumo Logic API to query logs and analyze data. This particular script finds hosts sending syslog data successfully through our firewall, looks up who owns the netblock (they weren’t all internal!), and checks our configuration management database (cmdb) to see if we have a host registered with the destination IP address of the syslog traffic.

import requests
from requests.auth import HTTPBasicAuth
import time
from collections import defaultdict
import cx_Oracle
import pandas as pd
import ipaddress
from datetime import datetime
from ipwhois import IPWhois
from ipwhois.exceptions import IPDefinedError

# Import credentials from a config file
from config import access_id, access_key, oracle_username, oracle_password

# Initialize Oracle Client
cx_Oracle.init_oracle_client(lib_dir=r"C:\Oracle\instantclient_21_15")
oracle_dsn = cx_Oracle.makedsn('cmdb_db.example.com', 1521, service_name='cmdb_db.example.com')

# Function to query Oracle database
def query_oracle_cmdb(strIPAddress):
    with cx_Oracle.connect(user=oracle_username, password=oracle_password, dsn=oracle_dsn) as connection:
        cursor = connection.cursor()
        query = """
            SELECT HOSTNAME, FRIENDLYNAME, STATUS, COLLECTIONTIME, RETIREDBYDISPLAYNAME, 
                    RETIREDDATETIME, SERVERAPPSUPPORTTEAM, SERVERENVIRONMENT
            FROM NBIREPORT.CHERWELL_CMDBDATA_FULL
            WHERE IPADDRESS = :ipaddy
        """
        cursor.execute(query, [strIPAddress])
        result = cursor.fetchone()
        cursor.close()
        return result if result else ("",) * 8

# Function to determine IP ownership
def get_ip_ownership(ip):
    # Define internal IP ranges
    internal_networks = [
        ipaddress.IPv4Network("10.0.0.0/8"),
        ipaddress.IPv4Network("172.16.0.0/12"),
        ipaddress.IPv4Network("192.168.0.0/16")
    ]
    
    # Check if the IP is internal
    ip_obj = ipaddress.IPv4Address(ip)
    if any(ip_obj in network for network in internal_networks):
        return "INTERNAL"
    
    # For external IPs, use ipwhois to get ownership info
    try:
        obj = IPWhois(ip)
        result = obj.lookup_rdap(depth=1)
        ownership = result['network']['name']
    except IPDefinedError:
        ownership = "Reserved IP"
    except Exception as e:
        print(f"Error looking up IP {ip}: {e}")
        ownership = "UNKNOWN"
    
    return ownership

# Base URL for Sumo Logic API
base_url = 'https://api.sumologic.com/api/v1'

# Define the search query
search_query = '''
(dpt=514)
AND _sourcecategory = "observe/perimeter/firewall/logs"
| where !(act = "deny")
| where !(act = "timeout")
| where !(act = "ip-conn")
| where (proto=17 or proto=6)
| count dst, act
'''

# Function to create and manage search jobs
def run_search_job(start_time, end_time):
    search_job_data = {
        'query': search_query,
        'from': start_time,
        'to': end_time,
        'timeZone': 'UTC'
    }

    # Create a search job
    search_job_url = f'{base_url}/search/jobs'
    response = requests.post(
        search_job_url,
        auth=HTTPBasicAuth(access_id, access_key),
        json=search_job_data
    )

    if response.status_code != 202:
        print('Error starting search job:', response.status_code, response.text)
        return None

    # Get the search job ID
    job_id = response.json()['id']
    print('Search Job ID:', job_id)

    # Poll for the search job status
    job_status_url = f'{search_job_url}/{job_id}'
    while True:
        response = requests.get(job_status_url, auth=HTTPBasicAuth(access_id, access_key))
        status = response.json().get('state', None)
        print('Search Job Status:', status)
        if status in ['DONE GATHERING RESULTS', 'CANCELLED', 'FAILED']:
            break
        time.sleep(5)  # Reasonable delay to prevent overwhelming the server

    return job_id if status == 'DONE GATHERING RESULTS' else None

# Function to retrieve results of a search job
def retrieve_results(job_id):
    dst_counts = defaultdict(int)
    results_url = f'{base_url}/search/jobs/{job_id}/messages'
    offset = 0
    limit = 1000

    while True:
        params = {'offset': offset, 'limit': limit}
        try:
            response = requests.get(results_url, auth=HTTPBasicAuth(access_id, access_key), params=params, timeout=30)
            if response.status_code == 200:
                results = response.json()
                messages = results.get('messages', [])
                
                for message in messages:
                    message_map = message['map']
                    dst = message_map.get('dst')
                    if dst:
                        dst_counts[dst] += 1
                
                if len(messages) < limit:
                    break

                offset += limit
            else:
                print('Error retrieving results:', response.status_code, response.text)
                break
        except requests.exceptions.RequestException as e:
            print(f'Error during request: {e}')
            time.sleep(5)
            continue

    return dst_counts

# Main execution
if __name__ == "__main__":
    # Prompt for the start date
    start_date_input = input("Enter the start date (YYYY-MM-DD): ")
    try:
        start_time = datetime.strptime(start_date_input, "%Y-%m-%d").strftime("%Y-%m-%dT00:00:00")
    except ValueError:
        print("Invalid date format. Please enter the date in YYYY-MM-DD format.")
        exit()

    # Use today's date as the end date
    end_time = datetime.now().strftime("%Y-%m-%dT00:00:00")

    # Create a search job
    job_id = run_search_job(start_time, end_time)
    if job_id:
        # Retrieve and process results
        dst_counts = retrieve_results(job_id)

        # Prepare data for Excel
        data_for_excel = []

        print("\nDestination IP Counts and Oracle Data:")
        for dst, count in dst_counts.items():
            oracle_data = query_oracle_cmdb(dst)
            ownership = get_ip_ownership(dst)
            # Use only Oracle data columns
            combined_data = (dst, count, ownership) + oracle_data
            data_for_excel.append(combined_data)
            print(combined_data)

        # Create a DataFrame and write to Excel
        df = pd.DataFrame(data_for_excel, columns=[
            "IP Address", "Occurrence Count", "Ownership",
            "CMDB_Hostname", "CMDB_Friendly Name", "CMDB_Status", "CMDB_Collection Time", 
            "CMDB_Retired By", "CMDB_Retired Date", "CMDB_Support Team", "CMDB_Environment"
        ])

        # Generate the filename with current date and time
        timestamp = datetime.now().strftime("%Y%m%d-%H%M")
        output_file = f"{timestamp}-sumo_oracle_data.xlsx"
        df.to_excel(output_file, index=False)
        print(f"\nData written to {output_file}")
    else:
        print('Search job did not complete successfully.')

AD passwordLastSet Times

I’m doing “stuff” in AD again, and have again come across Microsoft’s wild “100-nanosecond intervals elapsed since 1601” reference time, AKA “Windows file time”. In previous experience, I was just looking to calculate deltas (how long since that password was set), so figuring out now, subtracting then, and converting the elapsed intervals into something a little less specific (days, for example) was fine. Today, though, I need to display a human-readable date and time in Excel. Excel, which has its own peculiar way of storing date-time values. Fortunately, I happened across a formula that works:

=((C2-116444736000000000)/864000000000)+DATE(1970,1,1)

Voila!
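
The same arithmetic works outside Excel. Windows file time counts 100-nanosecond ticks since 1601-01-01 UTC, and the constant 116444736000000000 in the formula is the tick count at the Unix epoch (1970-01-01). A minimal Python sketch of the conversion, with a function name of my own choosing:

```python
from datetime import datetime, timedelta, timezone

# Windows file time starts counting at midnight UTC on 1601-01-01
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a Windows file time (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    # Integer-divide by 10 to get microseconds, the finest unit timedelta supports
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# The constant from the Excel formula lands exactly on the Unix epoch
print(filetime_to_datetime(116444736000000000))
```

The result is timezone-aware UTC; the Excel formula also yields UTC, so add your offset if you want local time.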

Quick sed For Sanitizing Config Files

When sending configuration files to other people for reference, I like to redact any credential-type information … endpoints that allow you to post data without creds, auth configs, etc. Sometimes I replace the string with REDACTED and sometimes I just drop the line completely.

Make a copy of the config files elsewhere, then run sed against the copies:


# Retain parameter but replace value with REDACTED
sed -i 's|http_post_url: "https://.*"|http_post_url: "REDACTED"|' *.yaml

# Remove line from config
sed -i '/authorization: Basic/d' *.yaml
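
If the redaction needs to live inside a larger script, the same two rules are easy to express in Python. This is a sketch rather than a drop-in tool: it keeps the parameter name and replaces only the value, drops Basic authorization lines, and assumes the same key names as the sed commands. The sample config text is invented.

```python
import re

# Patterns mirror the two sed rules; adjust them to the keys in your own files
URL_PATTERN = re.compile(r'(http_post_url: )"https://.*"')
AUTH_PATTERN = re.compile(r"authorization: Basic")


def sanitize(text: str) -> str:
    """Redact the post URL value and drop any Basic authorization lines."""
    kept = []
    for line in text.splitlines(keepends=True):
        if AUTH_PATTERN.search(line):
            continue  # drop the whole line, like sed's /pattern/d
        # Keep the key, swap the quoted URL for a placeholder
        kept.append(URL_PATTERN.sub(r'\1"REDACTED"', line, count=1))
    return "".join(kept)


# Invented sample config for illustration
sample = (
    "output:\n"
    '  http_post_url: "https://collector.example.com/in?token=abc"\n'
    "  authorization: Basic dXNlcjpwYXNz\n"
    "  timeout: 30\n"
)
print(sanitize(sample))
```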

SNMPWalk

I’ve been doing a lot of testing with SNMP this week, and it is helpful to have an ad hoc SNMP client that can retrieve data before you go about trying to retrieve and parse data in your own code. I’m a lot more confident telling someone they gave me a bad community string if someone else’s “known working” program fails! Enter snmpwalk.

Some of our devices return data out of order, so I need the -Cc option (which turns off the check for increasing OID numbers). The following command walks the 1.3.6.1.2.1.2.2.1.2 (ifDescr) tree for host 10.13.115.82 using the community string C0mmun1tyStr1ngH3r3:

snmpwalk -v 2c -c C0mmun1tyStr1ngH3r3 -Cc "10.13.115.82" .1.3.6.1.2.1.2.2.1.2
IF-MIB::ifDescr.1 = STRING: eth 6/0
IF-MIB::ifDescr.2 = STRING: eth 7/0
IF-MIB::ifDescr.1000072 = STRING: XGige 6/0
IF-MIB::ifDescr.1000073 = STRING: XGige 6/1
IF-MIB::ifDescr.1000074 = STRING: XGige 6/2
IF-MIB::ifDescr.1000075 = STRING: XGige 6/3
IF-MIB::ifDescr.1000076 = STRING: XGige 6/4
IF-MIB::ifDescr.1000077 = STRING: XGige 6/5
IF-MIB::ifDescr.1000078 = STRING: XGige 6/6
IF-MIB::ifDescr.1000079 = STRING: XGige 6/7
IF-MIB::ifDescr.1000080 = STRING: Gige 6/0

PowerShell to Uninstall an Application

I was curious if, instead of getting prompted for the local admin account for each application I want to remove, I could run PowerShell “as a different user” then use it to uninstall an application or list of applications. In this case, all of the .NET 6 stuff. Answer: absolutely.

# List all installed applications containing string "NET"
# Get-WmiObject -Class Win32_Product | Where-Object { $_.Name -like '*NET*' } | Select-Object -Property Name

# Define a static list of application names to uninstall
$appsToUninstall = @(
    'Microsoft .NET Runtime - 6.0.36 (x86)',
    'Microsoft .NET Host FX Resolver - 6.0.36 (x64)',
    'Microsoft .NET Host - 6.0.36 (x64)',
    'Microsoft .NET Host FX Resolver - 6.0.36 (x86)',
    'Microsoft .NET Host - 6.0.36 (x86)',
    'Microsoft .NET Runtime - 6.0.36 (x64)',
    'Microsoft.NET.Workload.Emscripten.net6.Manifest (x64)'
)

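# Query installed products once (Win32_Product enumeration is slow)
$installedApps = Get-WmiObject -Class Win32_Product

# Loop through each application name in the static list
foreach ($appName in $appsToUninstall) {
    # Find the application object by name
    $app = $installedApps | Where-Object { $_.Name -eq $appName }

    # Check if the application was found before attempting to uninstall
    if ($app) {
        Write-Output "Uninstalling $($app.Name)..."
        $result = $app.Uninstall()
        # ReturnValue 0 means the uninstall succeeded
        if ($result.ReturnValue -eq 0) {
            Write-Output "$($app.Name) has been uninstalled successfully."
        }
        else {
            Write-Output "Uninstall of $($app.Name) returned code $($result.ReturnValue)."
        }
    }
    else {
        Write-Output "Application $appName not found."
    }
}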

Fedora 40: NFTables not logging

We upgraded Anya’s laptop to Fedora 40, and Skype has evidently moved from an installable RPM to a snap package. The snap didn’t work with the firewall rules we built earlier in the year (video and audio calls would not connect) and, worse, nothing was logged. It looks like netfilter kernel logging isn’t enabled.

Enabled the logging:

echo 1 | sudo tee /proc/sys/net/netfilter/nf_log_all_netns

And, voila, we’ve got log records from nftables. And now Skype works … so I don’t know what to add. Sigh!

Removing and Recreating a ZFS Pool

In testing out various ways to achieve disk compression on our PostgreSQL servers, I ended up with a server built with a version of ZFS newer than the one in the package distribution. Which means I needed to recreate the pool to use an older version of ZFS that would be updated as part of the routine patching. Beyond backing up and restoring the data …

# Get rid of existing pool

zpool export pgpool
zpool destroy pgpool
zpool list # this still shows a pool on sdb

# Clear the label

zpool labelclear /dev/sdb

# Didn’t work, so blow away everything on sdb

dd if=/dev/zero of=/dev/sdb bs=1M count=10
wipefs -a /dev/sdb

# Uninstall custom built zfs

cd /root/zfs
make uninstall

# Install new ZFS

yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
yum install kernel-devel

yum install https://zfsonlinux.org/epel/zfs-release-2-3$(rpm --eval "%{dist}").noarch.rpm
dnf config-manager --disable zfs
dnf config-manager --enable zfs-kmod
yum install zfs

# Sign kernel modules

/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 /root/signing/MOK.priv /root/signing/MOK.der /lib/modules/$(uname -r)/extras/zfs/avl/zavl.ko

/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 /root/signing/MOK.priv /root/signing/MOK.der /lib/modules/$(uname -r)/extras/zfs/zfs/zfs.ko

# Reboot

init 6

# And start over — recreate the pool

zpool create pgpool sdb
zfs create pgpool/pgdata
zfs set compression=lz4 pgpool/pgdata
df -h /pgpool/pgdata/

Verifying Public and Private Keys Go Together

I have no idea how exactly I managed this, but I was renewing certificates on a group of servers and had one that would not work. It’s a Java app, and it just threw a generic handshake error. Even adding debugging didn’t add any useful information. It just didn’t work. Turns out my public key and private key files didn’t go together. I didn’t bother figuring out which one I got wrong; I just downloaded the zip file from our cert provider again.

I used openssl to check the modulus of the cert and key; piping the value through an md5 checksum makes it a little easier to compare by eye. This public/private key pair goes together: they’ve got the same modulus. My original files? Not so much: two different values!

linux1570:certs # openssl x509 -noout -modulus -in /opt/elk/opensearch_config/certs/20240722/$(hostname).pem | openssl md5
(stdin)= 52ca3e85fa7cb564dd395a8f801f9bdf
linux1570:certs # openssl rsa -noout -modulus -in /opt/elk/opensearch_config/certs/20240722/$(hostname)-nopass.key | openssl md5
(stdin)= 52ca3e85fa7cb564dd395a8f801f9bdf
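
To run the same check across a group of servers, the comparison can be scripted. In code the md5 step is unnecessary, since the two modulus lines can be compared directly. This sketch shells out to the same openssl subcommands used above and, like them, only applies to RSA keys; the function names are my own.

```python
import subprocess


def read_modulus(kind: str, path: str) -> str:
    """Return the 'Modulus=...' line openssl prints for a cert ('x509') or key ('rsa')."""
    result = subprocess.run(
        ["openssl", kind, "-noout", "-modulus", "-in", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def moduli_match(cert_modulus: str, key_modulus: str) -> bool:
    """Compare two modulus lines, ignoring case and surrounding whitespace."""
    return cert_modulus.strip().upper() == key_modulus.strip().upper()


# Usage with your own paths (cert_path and key_path are placeholders):
#   ok = moduli_match(read_modulus("x509", cert_path), read_modulus("rsa", key_path))
```

If the key file is passphrase-protected, openssl will prompt for the passphrase, so a no-passphrase copy like the one above is easier to script against.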