Category: System Administration

Cisco Catalyst 2960-S: Capturing All Traffic Sent Through a Port

We had an issue where an IoT device was not able to establish the connection it wanted: it would report that it couldn’t connect to the Internet. I knew it could reach the Internet in general but, without knowing what tiny part of the Internet it used to decide between ‘connected’ and ‘not connected’, we were stuck. Except! We recently upgraded the switch in our house to a Cisco Catalyst 2960-S, which allows me to do one of the cool things I’d seen the network guys at work do but had never been able to reproduce at home: using SPAN (Switched Port ANalyzer). When we’d encounter strange behavior with a network device where we couldn’t just install Wireshark and get a network capture, the network group would basically clone all of the traffic passing through the device’s port to another switch port where they could capture it. They would send me a capture file, and it was just like having a Wireshark capture.

You can set up SPAN from the command line configuration, but I don’t have a username/password pair to log in over SSH (and one can only be created from the command line configuration). Before breaking out the Cisco console cable, I tried running Cisco Network Assistant (unfortunately, a discontinued product line). One of the options under “Configure” => “Switching” is SPAN:

Since there was no existing SPAN session, I had to select a session number.

Then find the two ports. In the Ingress/Egress/Destination column, set the port carrying the traffic you want to Ingress (only incoming traffic), Egress (only outgoing traffic), or Both (all traffic). Set the port to which you want to clone the traffic to Destination, and set its destination encapsulation to Replicate. Click Apply.

In my setup, the laptop plugged in to GE1/0/24 gets all of the traffic traversing GE1/0/5; running tshark -w /tmp/TheProblem.cap writes the packet capture to a file for later analysis. Caveat: the destination port is no longer “online”. It receives the mirrored traffic but isn’t sending or receiving its own traffic … so make sure you aren’t using remote access to control the device!

To remove the SPAN, change the Ingress/Egress/Destination values back to “none”, change the destination encapsulation back to “select one”, and click Apply.

Since the source port is connected to one of our wireless access points, the network capture encompasses all wireless traffic through that access point.

And we were easily able to see that this particular device uses the rule “I can ping 8.8.8.8” to determine whether it is connected to the Internet. We then found a firewall rule that blocked the ICMP replies; allowing this traffic immediately let the device connect as expected.

*Un*Registering SysInternals Process Explorer as Task Manager Replacement

I like the Sysinternals tools, and I use them frequently at work. But, generally? When trying to look at the running Windows processes or how much memory is being used … I need a really small, simple tool that doesn’t add to the bogging that’s already happening. Which is why I hate when people replace taskmgr.exe with Process Explorer, the Sysinternals task manager on steroids. It’s too much information. The worst part is that the menu option to replace Task Manager doesn’t un-replace it.

To accomplish that (to revert to the real Windows Task Manager), you need to edit the registry. Navigate to the key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\taskmgr.exe and remove the value named “Debugger”, which points to the Sysinternals binary.
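If you would rather script the cleanup than click through regedit, here is a minimal sketch using Python’s standard winreg module. The function name is my own, it assumes an elevated (administrator) Python session on Windows, and it only touches the “Debugger” entry described above.

```python
import sys

# Registry path from the post, relative to HKEY_LOCAL_MACHINE
IFEO_TASKMGR_KEY = (
    r"SOFTWARE\Microsoft\Windows NT\CurrentVersion"
    r"\Image File Execution Options\taskmgr.exe"
)


def remove_taskmgr_debugger():
    """Delete the Debugger entry; return True if removed, False if it was not there."""
    if sys.platform != "win32":
        raise RuntimeError("The Task Manager registry entry only exists on Windows")
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE, IFEO_TASKMGR_KEY, 0, winreg.KEY_SET_VALUE
        ) as key:
            winreg.DeleteValue(key, "Debugger")
    except FileNotFoundError:
        # Either the taskmgr.exe key or the Debugger entry is already gone
        return False
    return True

# Usage (from an elevated prompt on Windows):
#     remove_taskmgr_debugger()
```

Nothing runs at import time on purpose, since this edits HKLM.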

Tableau Upgrade Failure

Attempting to upgrade Tableau from 2021.something to 2022.3.5 repeatedly failed with the following error in the upgrade log:

Failed with the error "Detected the old version of Tableau Server takes precedence on the system PATH".

And there were all sorts of things online to try, most of which involved changing the environment variables Tableau sets. But the variables weren’t wrong … or, rather, it was impossible to tell which was “right”: the new version I was trying to install or the old version that was still running. On a whim, I thought I’d try an admin-level command prompt. Opening the command prompt with the “Run as administrator” option allowed me to upgrade the server.

Except … I upgraded another server the next night. I knew to launch the command prompt in administrator mode, but I was incredibly dismayed to encounter the exact same error. Even using an admin level command prompt. Then it occurred to me — Windows evaluates the environment variables when you launch the shell (in fairness, every Unix/Linux variant I’ve encountered does the same thing). The installer must be changing the Tableau environment variables. If you use a command prompt that was open prior to running the installer … you don’t have the new values. You have the old ones — so the environment variable, as seen by the upgrade script, is pointing to an older version of Tableau. Launch a new command prompt, re-evaluate the environment variables, and the script now sees the proper version. Hopefully my next upgrade won’t include a panic inducing “yeah, that’s not gonna happen” error!
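The stale-environment behavior is easy to reproduce. Below is a small sketch, not anything Tableau ships: DEMO_TABLEAU_VERSION is a made-up variable name, and the “old”/“new” values stand in for whatever the installer really changes. A copy of the environment taken before the change plays the part of the command prompt that was already open.

```python
import os
import subprocess
import sys

VAR = "DEMO_TABLEAU_VERSION"  # hypothetical variable name, for illustration only


def value_seen_by_child(env=None):
    """Launch a child process and return the value of VAR that it sees.

    env=None means "inherit the current environment", like a newly opened prompt.
    """
    code = f"import os; print(os.environ.get({VAR!r}, 'unset'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Before the "installer" runs, the variable points at the old version.
os.environ[VAR] = "old"

# A prompt opened now takes a snapshot of the environment at launch time.
stale_prompt_env = os.environ.copy()

# The "installer" updates the variable afterwards.
os.environ[VAR] = "new"

print("prompt opened before the change sees:", value_seen_by_child(stale_prompt_env))
print("prompt opened after the change sees: ", value_seen_by_child())
```

Anything launched from the old snapshot keeps seeing the old value, which is exactly what the upgrade script was tripping over.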

Tableau PostgreSQL Query: Finding All Datasources of Name or Type

I frequently need to find details on a data source based on its name, or find all data sources of a particular type. In particular, the Microsoft Graph permissions required to use SharePoint and OneDrive data within Tableau changed, and I needed to reach out to the individuals who use those data types to build a business case for the Security organization to approve adding the new permissions to our tenant.

-- Query to find all data sources of a specific type or name 
select system_users.email, datasources.id, datasources.name, datasources.created_at, datasources.updated_at, datasources.db_class, datasources.db_name
, datasources.site_id, sites.name as SiteName, projects.name as ProjectName, workbooks.name as WorkbookName
from datasources
left outer join users on users.id = datasources.owner_id
left outer join system_users on users.system_user_id = system_users.id
left outer join sites on datasources.site_id = sites.id
left outer join projects on datasources.project_id = projects.id
left outer join workbooks on datasources.parent_workbook_id = workbooks.id
-- where datasources.name like '%Sheet1 (LJR Sample%'
where datasources.db_class = 'onedrive'
order by datasources.name
;

Tableau PostgreSQL Query: Stats About Data Source Types

I wanted to report on the different types of data sources used in our Tableau instance — as well as show how many of each type are in use.

-- Query to find how many of each data source type
select datasources.db_class, count(datasources.db_class) as count
from datasources
left outer join users on users.id = datasources.owner_id
left outer join system_users on users.system_user_id = system_users.id
left outer join sites on datasources.site_id = sites.id
left outer join projects on datasources.project_id = projects.id
left outer join workbooks on datasources.parent_workbook_id = workbooks.id
group by datasources.db_class 
order by count desc ;

Answer: About half of them are Oracle!

Unable to use JStat with Cassandra

We have been having some problems with a Cassandra cluster, so I wanted to look at the java heap space. Unfortunately, jstat cannot find the pid. And, yes, it is the right PID!

Looking in /tmp/hsperfdata_cassandra/, there’s no file! Reading through the whole command line of the running Cassandra process, I noticed -XX:+PerfDisableSharedMem … that’d do it!

It looks like they intentionally set -XX:+PerfDisableSharedMem in the Cassandra startup script. I assume their rationale is still reasonable … so I wouldn’t remove the parameter for day-to-day operation. But, when there’s a problem … restarting Cassandra without this parameter allows us to check how garbage collection is going.

Java Heap Stats with JStat

While there are plenty of third-party utilities for looking at the Java heap space, I just use jstat (in OpenJDK, this means installing java-<Version>-openjdk-devel).

JStat will display the following columns:

--------------------------------------------------------------------------------
S0C: Survivor space 0 size in K
S1C: Survivor space 1 size in K

S0U: Survivor space 0 usage in K
S1U: Survivor space 1 usage in K

--------------------------------------------------------------------------------

EC: Eden space size in K
EU: Eden space usage in K

--------------------------------------------------------------------------------

OC: Old space size in K
OU: Old space usage in K

--------------------------------------------------------------------------------

MC: Metaspace size in K
MU: Metaspace usage in K

--------------------------------------------------------------------------------

CCSC: Compressed class space size in K
CCSU: Compressed class space usage in K

--------------------------------------------------------------------------------

YGC: Young generation garbage collection count
YGCT: Young generation garbage collection total time in seconds

FGC: Full garbage collection count
FGCT: Full garbage collection total time in seconds

CGC: Concurrent garbage collection count
CGCT: Concurrent garbage collection time in seconds
GCT: Total garbage collection time in seconds

--------------------------------------------------------------------------------

https://stackoverflow.com/questions/13660871/jvm-garbage-collection-in-young-generation/13661014#13661014 does a good job of explaining the nomenclature & how stuff gets moved around in the heap space

Sample output: this command is for Java PID 19356 and will list 100 lines 2 seconds apart (2000 ms). Note that the jstat on this server does not print the CGC and CGCT columns.

server01:bin # jstat -gc 19356 2000 100
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
68096.0 68096.0 0.0 64207.5 545344.0 319007.2 30775744.0 19221750.2 137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815
68096.0 68096.0 0.0 64207.5 545344.0 386674.5 30775744.0 19221750.2 137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815
68096.0 68096.0 0.0 64207.5 545344.0 457055.4 30775744.0 19221750.2 137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815
68096.0 68096.0 0.0 64207.5 545344.0 485538.8 30775744.0 19221750.2 137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815
68096.0 68096.0 0.0 64207.5 545344.0 505893.4 30775744.0 19221750.2 137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815

And this is a time where a third-party tool would be helpful, but I never really ‘get’ what is and what is not OK to install on servers, so I try not to install things. The *useful* bit of information in any of this is really the percent utilization: usage divided by size.
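Short of installing anything, the percentages can be worked out from the raw jstat text. This is a minimal sketch rather than a polished tool: the function name and the short labels are mine, and it assumes the whitespace-separated header and value lines that jstat -gc prints, like the sample above.

```python
# Usage/size column pairs as they appear in the `jstat -gc` header line
SPACES = {
    "S0": ("S0U", "S0C"),
    "S1": ("S1U", "S1C"),
    "E": ("EU", "EC"),
    "O": ("OU", "OC"),
    "M": ("MU", "MC"),
    "CCS": ("CCSU", "CCSC"),
}


def percent_utilization(header_line, value_line):
    """Pair the jstat column names with one row of values; return percent used per space."""
    columns = header_line.split()
    values = [float(v) for v in value_line.split()]
    row = dict(zip(columns, values))

    result = {}
    for space, (used_col, size_col) in SPACES.items():
        size = row.get(size_col, 0.0)
        if size > 0:  # skip spaces jstat did not report or that have zero size
            result[space] = 100.0 * row[used_col] / size
    return result


header = "S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT"
sample = (
    "68096.0 68096.0 0.0 64207.5 545344.0 319007.2 30775744.0 19221750.2 "
    "137452.0 124322.4 18860.0 15380.6 324697 14589.985 228 45.830 14635.815"
)

for space, pct in percent_utilization(header, sample).items():
    print(f"{space:>4}: {pct:5.1f}%")
```

Paste in the header and any one row of output, and the old generation and Eden percentages are the ones I would look at first.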

That last grouping of stuff: I look at those vs. how long the PID has been running. If you’ve gotten a billion GCs and the PID has only been running for eight seconds, that is a crazy amount of memory churn. If I’ve only had 3 GCs and the PID has been running for seven years, it hasn’t been doing anything. In between? I don’t really find the numbers useful unless I’ve got a baseline from normal operation.

Maintaining an /etc/hosts record

I encountered an oddity at work — there’s a server on an internally located public IP space. Because it’s public space, it is not allowed to communicate with the internal interface of some of our security group’s servers. It has to use their public interface (not technically, just a policy on which they will not budge). I cannot just use a DNS server that resolves the public copy of our zone because then we’d lose access to everything else, so we are stuck making an /etc/hosts entry. Except this thing changes IPs fairly regularly (hey, we’re moving from AWS to Azure; hey, let’s try CloudFlare; nope, that is expensive so change it back) and the service it provides is application authentication so not something you want randomly falling over every couple of months.

So I’ve come up with a quick script to maintain the /etc/hosts record for the endpoint.

# requires: dnspython (subprocess is part of the standard library)

import dns.resolver
import subprocess

strHostToCheck = 'hostname.example.com' # PingID endpoint for authentication
strDNSServer = "8.8.8.8"         # Google's public DNS server
listStrIPs = []

# Get current assignment from hosts file
with open('/etc/hosts') as objHostsFile:
        listCurrentAssignment = [ line for line in objHostsFile if strHostToCheck in line]

if len(listCurrentAssignment) >= 1:
        strCurrentAssignment = listCurrentAssignment[0].split("\t")[0]

        # Get actual assignment from DNS
        objResolver = dns.resolver.Resolver()
        objResolver.nameservers = [strDNSServer]
        objHostResolution = objResolver.resolve(strHostToCheck)   # resolve() replaces the deprecated query() in dnspython 2.x

        for objARecord in objHostResolution:
                listStrIPs.append(objARecord.to_text())

        if len(listStrIPs) >= 1:
                # Fix /etc/hosts if the assignment there doesn't match DNS
                if strCurrentAssignment in listStrIPs:
                        print(f"Nothing to do -- hosts file record {strCurrentAssignment} is in {listStrIPs}")
                else:
                        print(f"I do not find {strCurrentAssignment} here, so now fix it!")
                        # Pass sed its arguments as a list so no shell quoting is involved
                        subprocess.call(["sed", "-i", "-e", f"s/{strCurrentAssignment}\t{strHostToCheck}/{listStrIPs[0]}\t{strHostToCheck}/g", "/etc/hosts"])
        else:
                print("No resolution from DNS ... that's not great")
else:
        print("No assignment found in /etc/hosts ... that's not great either")
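The sed call only matches when the IP and hostname are separated by exactly one tab. As an alternative, here is a rough sketch of doing the rewrite in Python on the file’s text, which copes with any whitespace and leaves comments alone. The function name update_hosts_text is hypothetical, and writing the result back to /etc/hosts is left out.

```python
def update_hosts_text(hosts_text, hostname, valid_ips):
    """Return (new_text, changed) with hostname's entry pointed at valid_ips[0].

    Lines whose IP is already one of valid_ips are left untouched.
    """
    if not valid_ips:
        return hosts_text, False  # no DNS answer: do not touch the file

    changed = False
    new_lines = []
    for line in hosts_text.splitlines(keepends=True):
        # Ignore anything after a '#' when looking for the IP and names
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:] and fields[0] not in valid_ips:
            # fields[0] is the first token on the line, so replace its first occurrence
            line = line.replace(fields[0], valid_ips[0], 1)
            changed = True
        new_lines.append(line)
    return "".join(new_lines), changed


example = "127.0.0.1\tlocalhost\n10.1.1.1   hostname.example.com  # auth endpoint\n"
new_text, changed = update_hosts_text(example, "hostname.example.com", ["10.2.2.2"])
print(changed)
print(new_text, end="")
```

Keeping the logic as a pure function on text also means it can be tried against a copy of the hosts file before letting it near the real one.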

Tableau Query — Data Sources and the Workbooks Where They Are Used

I have found Tableau’s views of data sources to be … lacking. To provide a report of data sources, the database type, and where it is being used, I put together a query that locates all data sources (or filters by database type — specifically, I was trying to see who was using Snowflake) and lists the site, project, and workbook using the data source.

-- Data sources and what workbook they are used in
SELECT system_users.email , datasources.id, datasources.name, datasources.created_at,  datasources.updated_at, datasources.db_class, datasources.db_name
, datasources.site_id, sites.name AS SiteName, projects.name AS ProjectName, workbooks.name AS WorkbookName
FROM datasources 
LEFT OUTER JOIN users ON users.id = datasources.owner_id
LEFT OUTER JOIN system_users ON users.system_user_id = system_users.id
LEFT OUTER JOIN sites ON datasources.site_id = sites.id
LEFT OUTER JOIN projects ON datasources.project_id = projects.id
LEFT OUTER JOIN workbooks ON datasources.parent_workbook_id = workbooks.id
-- WHERE datasources.db_class = 'snowflake' 
ORDER BY datasources.created_at;


Web Redirection Based on Typed URL

I have no idea why I am so pleased with this simple HTML code, but I am! My current project is to move all of our Tableau servers to different servers running a newer version of Windows. When I first got involved with the project, it seemed rather difficult (there was talk of manually recreating all of the permissions on each item!!) … but some review of the vendor’s documentation led me to believe one could build a same-version server elsewhere (newer Windows, out in magic cloudy land, but the same Tableau version), back up the data from the old server, restore it to the new one, and be done. It’s not quite that simple: I had to clear out the SAML config & manually reconfigure it so the right elements get added into the Java keystore, access to the local PostgreSQL database needed to be manually configured, a whole bunch of database drivers needed to be installed, and the Windows registry of ODBC connections needed to be exported/imported. But the whole process was a lot easier than what I was first presented.

Upgrading the first production server was mostly seamless, except users appear to have had the server’s actual name. Instead of accessing https://tableau.example.com, they were typing abcwxy129.example.com. And passing that link around as “the link” to their dashboard. And, upon stopping the Tableau services on the old server … those links started to fail. Now, I could have just CNAMEd abcwxy129 over to tableau and left it at that. But letting users continue to do the wrong thing always seems to come back and haunt you (if nothing else, the OS folks own the namespace of servers & are allowed to re-use or delete those hostnames at will). So I wanted something that would take whatever https://abcwxy129.example.com/#/site/DepartMent/workbooks/3851/Views kind of URL a user provided and give them the right address. And, since this was Windows, to do so with IIS without the hassle of integrating PHP or building a C# project. Basically, I wanted to do it within basic HTML. Which meant JavaScript.

And I did it — using such a basic IIS installation that the file is named something like iisstart.htm so I didn’t have to change the default page name. I also redirected 404 to / so any path under the old server’s name will return the redirection landing page.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
		<title>This Tableau server has moved</title>
		
		<style type="text/css">
			<!--
			body {
				color:#000000;
				background-color:#eeeeee;
				margin:0;
			}
			-->
		</style>
	</head>
	<body>
		<div style="margin-left: 2em">
		<h2>The Tableau server has moved. </h2>
		<p>The location you accessed, <span style="white-space: nowrap" id="oldurl"></span>, is no longer available.<br /><br /> Please update your link to use <span style="white-space: nowrap" id="newurl"></span></p>
		</div>
	
		<script>
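			let strOldURL = window.location.href;

			// Swap either old hostname for the new one. The leading // and the escaped dots
			// keep "hostname" from matching inside "otherhostname" or other look-alikes.
			let strNewURL = strOldURL.replace(/\/\/(hostname|otherhostname)\.example\.com/i,"//tableau.example.com");

			// Use textContent and DOM calls instead of innerHTML so nothing in the typed URL is parsed as markup
			document.getElementById("oldurl").textContent = strOldURL;
			let objLink = document.createElement("a");
			objLink.href = strNewURL;
			objLink.textContent = strNewURL;
			document.getElementById("newurl").appendChild(objLink);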
		</script>
	
	</body>
</html>
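Before pointing users at the landing page, it can help to check the hostname mapping against a handful of real links. The sketch below does the same swap in Python with urllib.parse. It is an illustration and not part of the page: rewrite_url is my own name, and the hostnames are the same placeholders used in the script above.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder hostnames, matching the ones in the landing page script
OLD_HOSTS = {"hostname.example.com", "otherhostname.example.com"}
NEW_HOST = "tableau.example.com"


def rewrite_url(old_url):
    """Return old_url with an old server hostname swapped for NEW_HOST."""
    parts = urlsplit(old_url)
    # .hostname is already lower-cased, so the comparison is case-insensitive
    if parts.hostname not in OLD_HOSTS:
        return old_url
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    # Path, query string, and the #/site/... fragment are carried over unchanged
    return urlunsplit(parts._replace(netloc=netloc))


print(rewrite_url("https://otherhostname.example.com/#/site/DepartMent/workbooks/3851/Views"))
```

Comparing whole hostnames rather than substrings means one old name can safely contain the other.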