Category: System Administration

Verifying Connectivity From Locked Down Windows Desktop or Server

We frequently encounter individuals who cannot use something from their server or desktop — but their IT group has Windows locked down so they cannot just telnet to the destination on the port and check if it’s connecting. Windows doesn’t have a whole lot of useful tools of its own. Fortunately, I’ve found that nmap.org publishes a local install zip file for Windows.

Download latest Win32 zip file from https://nmap.org/dist/ — on 8/8/2023, that is https://nmap.org/dist/nmap-7.92-win32.zip

 

Extract the zip file contents somewhere (I use tmp, right in downloads works, whatever)
Open command prompt and change directory (cd) into the folder where nmap was extracted — e.g. cd /d c:\tmp\nmap-7.92

— A quick trick for opening a command prompt to a directory location: If you have a file explorer window open to the location, click into the header where the file path is shown and remove the text that appears there

Type cmd and hit enter

And voila — a command prompt opened to the same location you were viewing

In the command prompt, run an map command to test a specific port (-p) and host. Since some hosts do not return ICMP requests, I also include -P0 instructing nmap not to attempt pinging the host first. This example verifies we have connectivity to google.com on port 443 (https)

 

Linux: Disabling Wild Local DNS Server Thing (i.e. systemd-resolved)

I am certain there is some way to configure systemd-resolved to actually use internal DNS servers so you can resolve your local hostnames. But nothing I’ve tried have worked, and I don’t actually need this wild local DNS thing.

Here’s the problem — systemd-resolved creates an /etc/resolv.conf file that uses a localhost address as the nameserver — and that may very well forward requests out to Internet DNS servers. Which don’t have any clue about your internal DNS zones — thus you can no longer resolve local hostnames. Whenever I see 127.0.0.53 in /etc/resolv.conf, I know systemd-resolved is at work.

[lisa@linux ~]# cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search example.com

To disable this local name resolution, stop and disable systemd-resolved, unlink the /etc/resolv.conf file it created, and restart NetworkManager

[lisa@linux ~]# systemctl stop systemd-resolved.service
[lisa@linux ~]# systemctl disable systemd-resolved.service
[lisa@linux ~]# unlink /etc/resolv.conf
[lisa@linux ~]# systemctl restart NetworkManager
Removed /etc/systemd/system/multi-user.target.wants/systemd-resolved.service.
Removed /etc/systemd/system/dbus-org.freedesktop.resolve1.service.

Voila, /etc/resolv.conf is now populated with reasonable internal DNS servers, and you can resolve local hostnames.

[lisa@linux ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search example.com
nameserver 10.1.2.33
nameserver 10.1.2.66

Cisco Aironet — Unable to Access Wireless Device That Moved to Different Access Point

I think this is a fairly esoteric issue — something that happens frequently enough but doesn’t actually impact functionality so anyone notices. We got a new WiFi doorbell that we set up inside (and could access it), took outside (and could access it) … but, when we went back inside? We could not access the doorbell. No HTTPS, no RTSP, no ICMP. Nothing.

Cisco access points maintain a list of associated wireless clients. These may also be kept in an arp table, although arp caching appears to be disabled by default. So device was on AP1, moved to AP2. Clients on AP2 (or AP3, or AP4) were able to access it since the switch has it registered on the port for AP2. Anything on AP1, however, cannot access the device. The MAC address still appeared in the associations table for AP1. You can set a lower activity timeout — the default was one day — to clear devices out more promptly. But … if the device communicates outside of its new WAP, how frequently is it going to be talking to a device on its old WAP? Generally, we’re talking to our servers (wired) or the Internet (also wired). So technically … Scott’s cell phone couldn’t reach my cell phone when I go from the bedroom to the office. But we never notice because we have minimal peer to peer communication. It’s not like doorbells are going to go walking about normally … but it was good to know a quick AP reboot would allow our cell phones to pull up the doorbell’s video feed.

SSH’ing to Older Cisco Access Points

Trying to ssh into our Cisco access points, we get an error saying “no matching key exchange method found. Their offer: diffie-hellman-group1-sha1” … to one-off enable older, deprecated algorithms, we added a cisco.conf to /etc/ssh/ssh_config.d (/etc/ssh/ssh_config includes /etc/ssh/ssh_config.d/*.conf)

Host <IP>
     Ciphers 3des-cbc,blowfish-cbc,aes128-cbc,aes128-ctr,aes256-ctr
     KeyAlgorithms diffie-hellman-group1-sha1

And restart sshd — voila*, you can SSH into the router / access point / etc.

* — you may get an invalid key length error. In this case, you need to regenerate the key on the Cisco device using a 2048-bit key:

config term
crypto key zeroize rsa
crypto key generate rsa modulus 2048
end

Cisco Catalyst 2960-S: Capturing All Traffic Sent Through a Port

We had an issue where an IOT device was not able to establish the connection it wanted — it would report it couldn’t connect to the Internet. I knew it could connect to the Internet in general; but, without knowing what tiny part of the Internet it used to determine ‘connected’ or ‘not connected’, we were stuck. Except! We recently upgraded the switch in our house to a Cisco Catalyst 2960S — which allows me to do one of the cool things I’d seen the network guys at work do but had never been able to reproduce at home: using SPAN (Switched Port ANalyzer). When we’d encounter strange behavior with a network device where we couldn’t just install Wireshark and get a network capture, the network group would basically clone all of the traffic sent to the device’s port to another switch port where we could capture traffic. They would send me a capture file, and it was just like having a Wireshark capture.

You can set up SPAN from the command line configuration, but I don’t have a username/password pair to log into SSH (and can only establish this from the command line configuration). Before breaking out the Cisco console cable, I tried running Cisco Network Assistant (unfortunately, a discontinued product line). One of the options under “Configure” => “Switching” is SPAN:

Since there was no existing SPAN session, I had to select a session number.

Then find the two ports — in the Ingress/Egress/Destination column, the port that is getting the traffic you want needs to either have Ingress (only incoming traffic), Egress (only outgoing traffic), or Both (all traffic). The port to which you want to clone the traffic is set to Destination. And the destination encapsulation is Replicate. Click apply.

In the example above, the laptop plugged in to GE1/0/24 gets all of the traffic traversing GE1/0/5 — running tshark -w /tmp/TheProblem.cap writes the packet capture to a file for later analysis. Caveat — the destination port is no longer “online” — it receives traffic but isn’t sending or receiving its own traffic … so make sure you aren’t using remote access to control the device!

To remove the SPAN, change the Ingress/Egress/Destination values back to “none”, change the destination encapsulation back to select one, and apply.

Since the source port is connected to one of our wireless access points, the network capture encompasses all wireless traffic through that access point.

And we were easily able to identify that this particular device uses the rule “I can ping 8.8.8.8” to determine if it is connected to the Internet. We were able to identify a firewall rule that prevented ICMP replies; allowing this traffic immediately allowed the devices to connect as expected.

*Un*Registering SysInternals ProcMon as Task Manager Replacement

I like the sysinternals tools — I use them frequently at work. But, generally? When trying to look at the running Windows processes or how much memory is being used … I need a really small, simple tool that doesn’t add to the bogging that’s already happening. Which is why I hate when people replace taskmgr.exe with the SysInternals task manager on steroids. It’s too much information. The worst part is that the menu option to replace task manager doesn’t un-replace it.

To accomplish that — to revert to the real Windows task manager — you need to edit the registry. Navigate to HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\taskmgr.exe and remove the key named “Debugger” which points to the SysInternals binary.

Tableau Upgrade Failure

Attempting to upgrade Tableau from 2021.something to 2022.3.5 repeatedly failed with the following error in the upgrade log:

Failed with the error "Detected the old version of Tableau Server takes precedence on the system PATH".

 

And there were all sorts of things online to try — most of which involved changing the environment variables Tableau sets. But they weren’t wrong … or, rather, it was impossible to tell which was “right”: the new version I was trying to install or the old version that was still running. On a whim, I thought I’d try an admin-level command prompt. Opening the command prompt with the “Run as administrator” option allowed me to upgrade the server.

Except … I upgraded another server the next night. I knew to launch the command prompt in administrator mode, but I was incredibly dismayed to encounter the exact same error. Even using an admin level command prompt. Then it occurred to me — Windows evaluates the environment variables when you launch the shell (in fairness, every Unix/Linux variant I’ve encountered does the same thing). The installer must be changing the Tableau environment variables. If you use a command prompt that was open prior to running the installer … you don’t have the new values. You have the old ones — so the environment variable, as seen by the upgrade script, is pointing to an older version of Tableau. Launch a new command prompt, re-evaluate the environment variables, and the script now sees the proper version. Hopefully my next upgrade won’t include a panic inducing “yeah, that’s not gonna happen” error!

Tableau PostgreSQL Query: Finding All Datasources of Name or Type

I frequently need to find details on a data source based on its name and find all data sources of a particular type. Particularly, the Microsoft Graph permissions required to use Sharepoint and OneDrive data within Tableau changed — I needed to reach out to individuals who use those data types to build a business case for the Security organization to approve the new permissions be added to our tenant.

-- Query to find all data sources of a specific type or name 
select system_users.email, datasources.id, datasources.name, datasources.created_at, datasources.updated_at, datasources.db_class, datasources.db_name
, datasources.site_id, sites.name as SiteName, projects.name as ProjectName, workbooks.name as WorkbookName
from datasources
left outer join users on users.id = datasources.owner_id
left outer join system_users on users.system_user_id = system_users.id
left outer join sites on datasources.site_id = sites.id
left outer join projects on datasources.project_id = projects.id
left outer join workbooks on datasources.parent_workbook_id = workbooks.id
-- where datasources.name like '%Sheet1 (LJR Sample%'
where datasources.db_class = 'onedrive'
order by datasources.name
;

Tableau PostgreSQL Query: Stats About Data Source Types

I wanted to report on the different types of data sources used in our Tableau instance — as well as show how many of each type are in use.

-- Query to found how many of each data source type
select datasources.db_class, count(datasources.db_class) as count
from datasources
left outer join users on users.id = datasources.owner_id
left outer join system_users on users.system_user_id = system_users.id
left outer join sites on datasources.site_id = sites.id
left outer join projects on datasources.project_id = projects.id
left outer join workbooks on datasources.parent_workbook_id = workbooks.id
group by datasources.db_class 
order by count desc ;

Answer: About half of them are Oracle!

Unable to use JStat with Cassandra

We have been having some problems with a Cassandra cluster, so I wanted to look at the java heap space. Unfortunately, jstat cannot find the pid. And, yes, it is the right PID!

Looking in /tmp/hsperfdata_cassandra/, there’s no file! Reading through the whole line where Cassandra is running, I noticed +PerfDisableSharedMem … that’d do it!

It looks like they intentionally set +PerfDisableSharedMem in the Cassandra startup script. I assume their rational is still reasonable … so wouldn’t remove the parameter for day-to-day operation. But, when there’s a problem … restarting Cassandra without this parameter allows us to check how garbage collection is going.