Tag: Linux

Logstash, JRuby, and Private Temp

There’s a long-standing bug in logstash where the private temp folder created for jruby isn’t cleaned up when the logstash process exits. To avoid filling up the temp disk space, I put together a quick script to check the PID associated with each jruby temp folder, see if it’s an active process, and remove the temp folder if the associated process doesn’t exist.

When the PID has been re-used, this means we’ve got an extra /tmp/jruby-### folder hanging about … but each folder is only 10 meg. The impacting issue is when we’ve restarted logstash a thousand times and a thousand ten meg folders are hanging about.

This script can be cron’d to run periodically or it can be run when the logstash service launches.

import subprocess
import re

from shutil import rmtree

strResult = subprocess.check_output(f"ls /tmp", shell=True)

for strLine in strResult.decode("utf-8").split('\n'):
        if len(strLine) > 0 and strLine.startswith("jruby-"):
                listSplitFileNames = re.split("jruby-([0-9]*)", strLine)
                if listSplitFileNames[1] is not None:
                        try:
                                strCheckPID = subprocess.check_output(f"ps -efww |  grep {listSplitFileNames[1]} | grep -v grep", shell=True)
                                #print(f"PID check result is {strCheckPID}")
                        except:
                                print(f"I am deleting |{strLine}|")
                                rmtree(f"/tmp/{strLine}")

Using urandom to Generate Password

Frequently, I’ll use password generator websites to create some pseudo-random string of characters for system accounts, database replication,etc. But sometimes the Internet isn’t readily available … and you can create a decent password right from the Linux command line using urandom.

If you want pretty much any “normal” character, use tr to pull out all of the other characters:

'\11\12\40-\176'

Or remove anything outside of upper case, lower case, and number characters using

a-zA-Z0-9

Pass the output to head to grab however many characters you actually want. Voila — a quick password.

Linux Disk Utilization – Reducing Size of /var/log/sa

We occasionally get alerted that our /var volume is over 80% full … which generally means /var/log has a lot of data, some of which is really useful and some of it not so useful. The application-specific log files already have the shortest retention period that is reasonable (and logs that are rotated out are compressed). Similarly, the system log files rotated through logrotate.conf and logrotate.d/* have been configured with reasonable retention.

Using du -sh /var/log/ showed the /var/log/sa folder took half a gig of space.

This is the daily output from sar (a “daily summary of process accounting” cron’d up with /etc/cron.d/sysstat). This content doesn’t get rotated out with the expected logrotation configuration. It’s got a special configuration at /etc/sysconfig/sysstat — changing the number of days (or, in my case, compressing some of the older files) is a quick way to reduce the amount of space the sar output files consume).

Linux – Clearing Caches

I encountered some documentation at work that provided a process for clearing caches. It wasn’t wrong per se, but it showed a lack of understanding of what was being performed. I enhanced our documentation to explain what was happening and why the series of commands was redundant. Figured I’d post my revisions here in case they’re useful for someone else.

Only clean caches can be dropped — dirty ones need to be written somewhere before they can be dropped. Before dropping caches, flush the file system buffer using sync — this tells the kernel to write dirty cache pages to disk (or, well, write as many as it can). This will maximize the number of cache pages that can be dropped. You don’t have to run sync, but doing so optimizes your subsequent commands effectiveness.

Page cache is memory that’s held after reading a file. Linux tends to keep the files in cache on the assumption that a file that’s been read once will probably be read again. Clear the pagecache using echo 1 > /proc/sys/vm/drop_caches — this is the safest to use in production and generally a good first try.

If clearing the pagecache has not freed sufficient memory, proceed to this step. The dentries (directory cache) and inodes cache are memory held after reading file attributes (run strace and look at all of those stat() calls!). Clear the dentries and inodes using echo 2 > /proc/sys/vm/drop_caches — this is kind of a last-ditch effort for a production environment. Better than having it all fall over, but things will be a little slow as all of the in-flight processes repopulate the cached data.

You can clear the pagecache, dentries, and inodes using echo 3 > /proc/sys/vm/drop_caches — this is a good shortcut in a non-production environment. But, if you’re already run 1 and 2 … well, 3 = 1+2, so clearing 1, 2, and then 3 is repetitive.

 

Another note from other documentation I’ve encountered — you can use sysctl to clear the caches, but this can cause a deadlock under heavy load … as such, I don’t do this. The syntax is sysctl -w vm.drop_caches=1 where the number corresponds to the 1, 2, and 3 described above.

 

2>/dev/null

A few times now, I’ve encountered individuals with cron jobs or bash scripts where a command execution ends in 2>/dev/null … and the individual is stymied by the fact it’s not working but there’s no clue as to why. The error output is being sent into a big black hole never to escape!

The trick here is to understand file descriptors — 1 is basically a shortcut name for STDOUT and 2 is basically a shortcut name for STDERR (0 is STDIN, although that’s not particularly relevant here).  So 2>/dev/null says “take all of the STDERR stuff and redirect it to /dev/null”.

Sometimes you’ll see both STDERR and STDOUT being redirected either to a file or to /dev/null — in that case you will see 2>&1 where the ampersand prior to the “1” indicates the stream is being redirected to a file descriptor (2>1 would direct STDOUT to a file named “1”) — so >/dev/null 2>&1 is the normal way you’d see it written. Functionally, >/dev/null 1>&2 would be the same thing … but redirecting all output into error is, conceptually, a little odd.

To visualize all of this, use a command that will output something to both STDERR and STDOUT — for clarify, I’ve used “1>/dev/null” (redirect STDOUT to /devnull) in conjunction with 2>&1 (redirect STDERR to STDOUT). As written in the text above, the number 1 is generally omitted and just >/dev/null is written.

 

 

Useful Bash Commands

Viewing Log Files

Tailing the File

When the same file name is used when logs are rotated (i.e. app.log is renamed to app.yyyymmdd.log and a new app.log is created), use the -F flag to follow the name instead of the file descriptor

tail -F /var/log/app.log

Tailing with Filtering

When you are looking for something specific in the log file, it often helps to run the log output through grep. This example watches a sendmail log for communication with the host 10.5.5.5

tail -F /var/log/maillog | grep "10.5.5.5"

Handling Log Files with Date Specific Naming

I alias out commands for viewing commonly read log files. This is easy enough when the current log file is always /var/log/application/content.log, but some active log files have date components in the file name. As an example, our Postgresql servers have the short day-of-week string in the log. Use command substitution to get the date-specific elements from the date executable. Here, I tail a file named postgresql-Tue.log on Tuesday. Since logs rotate to a new name, tail -F doesn’t really do anything. You’ll still need to ctrl-c the tail and restart it for the next day.

tail -f /pgdata/log/postgresql-$(date +%a).log