ELK Monitoring
We have a number of logstash servers gathering data from various filebeat sources. We recently experienced a problem where the pipeline stopped getting data from some of those sources. Not all of them — and restarting a non-functional filebeat source got data flowing again for ten minutes or so. We were able to rectify the immediate problem by restarting our logstash services (IT troubleshooting step #1: we restarted all of the filebeats and, when that didn't help, moved on to restarting the logstashes).
But we need a way to ensure this isn't happening — losing days of log data from some sources is really bad. So I put together a Python script to verify that something is coming in from each of the filebeat sources. Some of these servers forward along data from various log files, including their own /var/log/messages, so the script's query also filters out what is, to me, extraneous data. Good thing adding “NOT” to a query works!
pip install elasticsearch==7.13.4
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Suppress the urllib3 warnings emitted because we are not verifying SSL certificates
import requests
requests.packages.urllib3.disable_warnings()

from elasticsearch import Elasticsearch
import time

# Modules for email alerting
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Config variables
strSenderAddress = "devnull@example.com"
strRecipientAddress = "me@example.com"
strSMTPHostname = "mail.example.com"
iSMTPPort = 25
listSplunkRelayHosts = ['host293', 'host590', 'host591', 'host022', 'host014', 'host135']
iAgeThreshold = 3600  # Alert if the newest document is more than an hour old (3600 seconds)

strAlert = None
elastic_client = Elasticsearch("https://elasticsearchhost.example.com:9200", http_auth=('rouser', 'r0pAs5w0rD'), verify_certs=False)

for strRelayHost in listSplunkRelayHosts:
    iCurrentUnixTimestamp = time.time()
    # Newest document for this host, excluding the host's own /var/log/messages records
    query_body = {
        "sort": {
            "@timestamp": {
                "order": "desc"
            }
        },
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "host.hostname": strRelayHost
                    }
                },
                "must_not": {
                    "term": {
                        "source": "/var/log/messages"
                    }
                }
            }
        }
    }
    result = elastic_client.search(index="network_syslog*", body=query_body, size=1)
    all_hits = result['hits']['hits']

    iDocumentAge = None
    for doc in all_hits:
        # The sort value is the document's @timestamp in epoch milliseconds
        iDocumentAge = ((iCurrentUnixTimestamp * 1000) - doc.get('sort')[0]) / 1000.0

    if iDocumentAge is not None:
        if iDocumentAge > iAgeThreshold:
            if strAlert is None:
                strAlert = f"<tr><td>{strRelayHost}</td><td>{iDocumentAge}</td></tr>"
            else:
                strAlert = f"{strAlert}\n<tr><td>{strRelayHost}</td><td>{iDocumentAge}</td></tr>"
            print(f"PROBLEM - For {strRelayHost}, document age is {iDocumentAge} second(s)")
        else:
            print(f"GOOD - For {strRelayHost}, document age is {iDocumentAge} second(s)")
    else:
        print(f"PROBLEM - For {strRelayHost}, no recent record found")

if strAlert is not None:
    msg = MIMEMultipart('alternative')
    msg['Subject'] = "ELK Filebeat Alert"
    msg['From'] = strSenderAddress
    msg['To'] = strRecipientAddress

    strHTMLMessage = f"<html><body><table><tr><th>Server</th><th>Document Age</th></tr>{strAlert}</table></body></html>"
    strTextMessage = strAlert

    part1 = MIMEText(strTextMessage, 'plain')
    part2 = MIMEText(strHTMLMessage, 'html')
    msg.attach(part1)
    msg.attach(part2)

    s = smtplib.SMTP(strSMTPHostname, iSMTPPort)
    s.sendmail(strSenderAddress, strRecipientAddress, msg.as_string())
    s.quit()
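To notice a stalled source quickly, the script can run on a schedule. A minimal sketch, assuming it is saved (and made executable) at a hypothetical /opt/scripts/checkFilebeatSources.py, running the check every half hour from cron:
*/30 * * * * /opt/scripts/checkFilebeatSources.py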
Debugging Filebeat
# Run filebeat from the command line and add debugging flags to increase verbosity of output
# -e directs output to STDERR instead of syslog
# -c indicates the config file to use
# -d indicates which debugging items you want -- * for all
/opt/filebeat/filebeat -e -c /opt/filebeat/filebeat.yml -d "*"
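If output with all of the debugging items is too noisy, you can select specific items instead of using the asterisk. The "publish" selector, for instance, limits the output to the events actually being shipped:
/opt/filebeat/filebeat -e -c /opt/filebeat/filebeat.yml -d "publish"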
Python Logging to Logstash Server
Since we are having a problem with some of our filebeat servers actually delivering data to logstash, I put together a quick Python script that connects to the logstash server and sends a log record. I can then run tcpdump on the logstash server and, hopefully, see what is going wrong.
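The logstash handler used below comes from the python-logstash module:
pip install python-logstash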
import logging
import logstash
import sys
strHost = 'logstash.example.com'
iPort = 5048
test_logger = logging.getLogger('python-logstash-logger')
test_logger.setLevel(logging.INFO)
test_logger.addHandler(logstash.TCPLogstashHandler(host=strHost,port=iPort))
test_logger.info('May 22 23:34:13 ABCDOHEFG66SC03 sipd[3863cc60] CRITICAL One or more Dns Servers are currently unreachable!')
test_logger.warning('May 22 23:34:13 ABCDOHEFG66SC03 sipd[3863cc60] CRITICAL One or more Dns Servers are currently unreachable!')
test_logger.error('May 22 23:34:13 ABCDOHEFG66SC03 sipd[3863cc60] CRITICAL One or more Dns Servers are currently unreachable!')
Using tcpdump to capture traffic
I like tshark (command-line Wireshark), but some of our servers don't have it installed (and won't). So I'm re-learning tcpdump!
List data from a specific source IP
tcpdump src 10.1.2.3
List data sent to a specific port
tcpdump dst port 5048
List data sent from an entire subnet
tcpdump net 10.1.2.0/26
And add -X or -A to see the whole packet.
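The filters can be combined too. As an example, this captures and prints everything a suspect source (the IP and port here are placeholders) sends to the logstash listener:
tcpdump -A 'src 10.1.2.3 and dst port 5048'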
PostgreSQL Logical Replication – Row Filter
While researching something else, I came across a commit message about row filtering in logical replication. From the date of the commit, I expect this will be included in PostgreSQL 15.
Adding a WHERE clause after the table name limits the rows that are included in the publication — you could publish only employees in Vermont or only completed transactions.
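I haven't tested it yet, but from the commit the syntax should look something like the following, with a hypothetical employees table:
CREATE PUBLICATION vt_employees FOR TABLE employees WHERE (state = 'VT');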
Using urandom to Generate Password
Frequently, I’ll use password generator websites to create some pseudo-random string of characters for system accounts, database replication, etc. But sometimes the Internet isn’t readily available … and you can create a decent password right from the Linux command line using urandom.
If you want pretty much any “normal” character, use tr to strip out all of the other characters:
tr -dc '\11\12\40-\176' < /dev/urandom
Or remove anything outside of upper-case, lower-case, and number characters:
tr -dc 'a-zA-Z0-9' < /dev/urandom
Pass the output to head to grab however many characters you actually want. Voilà — a quick password.
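Putting it all together, this produces a 20-character alphanumeric password (adjust the number passed to head for a different length):
tr -dc 'a-zA-Z0-9' < /dev/urandom | head -c 20; echo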