Category: Technology

November 9, 2017

Strange spam

We have been getting spam messages with the subject “top level quality of paint bucket” both at home and at work. I get that it costs essentially nothing to send a million junk e-mail messages, so it doesn’t take a lot of sales for a campaign to be profitable. But are there seriously people who buy their paint buckets from cold e-mails? Especially e-mails that I thought were trying to sell me buckets of paint.

And how lazy is a spam campaign that uses static strings in the subject field?

November 4, 2017

Containerized Development v/s Microservices

While both monolithic and microservice applications can be deployed in containers, there is a significant difference. Understanding that difference can save time/money/effort decomposing an application into microservices when the benefits you desire can be gained through simple containerized deployments.

One of the touted benefits of microservices — the ability for different teams to use different internal practices, different coding standards, hell even different languages and still have a functioning application because the interface is static and well documented … well, that sounds like a nightmare to me.

A company with which I worked a decade ago had teams of developers devoted to different components of the application — essentially your team owned a class or set of functions. The class/functions were had well documented and static interfaces — you wouldn’t change void functionX(int iVariable, string strOtherVariable) to return boolean values. Or to randomly add inputs (although functions were overloaded). Developers were tasked with ensuring backwards compatibility of their classes and functions. The company had a “shared libraries” development team who worked on, well, shared libraries. Database I/O stuff, authentication frameworks, GUI interfaces. A new project would immediately pull in the relevant shared functions, then start developing their code.

Developers were able to focus on a small component of the application, were able to implement code changes without having to coordinate with other teams, and consumers of their resource were able to rely on the consistent input and output of the functions as well as consistent representation of class objects.

When a specific project encountered resource shortfalls (be that family emergencies reducing workers or sales teams making overly optimistic commitments), the dozens of C# programmers could be shifted around to expand a team. In a team with an outstanding team lead, employees could easily move to other groups to progress their career.

What happens in a microservices environment? You’ve got a C# team, a Java team, a Python team. You get some guy in from Uni and he’s starting up a LISP team because Lisplets will get his code delivered through Tomcat. The next guy who comes in starts the F90 team because why not? Now I’m not saying someone with a decade of experience in Java couldn’t learn LISP … but you go back to “Google up how to do X in LISP” programming speed. There are language nuances of which you are not aware and you introduce inefficiency and possibly bugs to the code.

What’s my point? Well, (1) business practices (we program in this language, here’s our style guide, etc) are going to negate some of the perceived benefits of microservices. The small gain to be had by individual teams picking their own way are going to be outweighed by siloing (some guy from the Java team isn’t going to move into a lead role over on the C# team) and resource limitations (I cannot reallocate resources temporarily). But (2) you can architect your project to provide, basically, the same benefits.

Microservices make sense where an application has different components with different utilization rates. A product that runs a Super Bowl commercial may see a huge spike in web traffic — but scaling up thousands of complete web servers to handle the load is an inefficient use of resources. There’s a lot of product browsing, but shipping quotes, new account creations, and check-outs are not all scaling linearly to web hits. Adding tens of thousands of browsing components and only expanding the new-account-creation or checkout services as visitors decide to make purchases can be done more quickly to respond in real-time to traffic increases.

Applications where each component gets about the same amount of use … I use Kubernetes to manage a cluster of sendmail servers. As mail traffic increases, additional PODs are brought online. It’s a configuration I’d like to mirror at work — we currently have nine sendmail servers — to provide physical and site redundancy for both employee mail traffic and automated system traffic. With Kubernetes, three servers in each of the two sites (six total) would provide ample resources to accommodate mail flow. Automated systems send a lot of mail at night, and the number of pods servicing that VIP would increase. User mail flow increases during the day, so while automated mailflow pods would be spun down … user mail flow ones would be spun up. With a 33% reduction in servers, I’ve created a solution with more capacity for highly used functions (this function could be the primary usage of all six servers) that is geo-redundant (one of the current systems is *not* geo-redundant as the additional two servers in the alternate site couldn’t be justified). But I didn’t need to decompose sendmail into microservices to achieve this. Simply needed to build a containerized sendmail.

November 2, 2017

Load Runner And Statistical Analysis Thereof

I had offhandedly mentioned a statistical analysis I had run in the process of writing and implementing a custom password filter in Active Directory. It’s a method I use for most of the major changes we implement at work – application upgrades, server replacements, significant configuration changes.

To generate the “how long did this take” statistics, I use a perl script using the Time::HiRes module (_loadsimAuthToCentrify.pl) which measures microsecond time. There’s an array of test scenarios — my most recent test was Unix/Linux host authentication using pure LDAP authentication and Centrify authentication, so the array was fully qualified hostnames. Sometimes there’s an array of IDs on which to test — TestID00001, TestID00002, TestID00003, …., TestID99999. And there’s a function to perform the actual test.

I then have a loop to generate a pseudo-random number and select the test to run (and user ID to use, if applicable) using that number

my $iRandomNumber = int(rand() * 100);
$iRandomNumber = $iRandomNumber % $iHosts;
my $strHost = $strHosts[$iRandomNumber];

The time is recorded prior to running the function (my $t0 = [gettimeofday];) and the elapsed time is calculated when returning from the function (my $fElapsedTimeAuthentication = tv_interval ($t0, [gettimeofday]);). The test result is compared to an expected result and any mismatches are recorded.

Once the cycle has completed, the test scenario, results, and time to complete are recorded to a log file. Some tests are run multi-threaded and across multiple machines – in which case the result log file is named with both the running host’s name and a thread identifier. All of the result files are concatenated into one big result log for analysis.

A test is run before the change is made, and a new test for each variant of the change for comparison. We then want to confirm that the general time to complete an operation has not been negatively impacted by the change we propose (or select a route based on the best performance outcome).

Each scenario’s result set is dropped into a tab on an Excel spreadsheet (CustomPasswordFilterTiming – I truncated a lot of data to avoid publishing a 35 meg file, so the numbers on the individual tabs no longer match the numbers on the summary tab). On the time column, max/min/average/stdev functions are run to summarize the result set. I then break the time range between 0 and the max time into buckets and use the countif function to determine how many results fall into each bucket (it’s easier to count the number under a range and then subtract the numbers from previous buckets than to make a combined statement to just count the occurrences in a specific bucket).

Once this information is generated for each scenario, I create a summary tab so the data can be easily compared.

And finally, a graph is built using the lower part of that summary data. Voila, quickly viewed visual representation of several million cycles. This is what gets included in the project documentation for executive consideration. The whole spreadsheet is stored in the project document repository – showing our due diligence in validating user experience should not be negatively impacted as well as providing a baseline of expected performance should the production implementation yield user experience complaints.

October 20, 2017

New (To Me) WordPress Spam Technique

In the past week, one particular image that I posted has received about a hundred comments. Not real comments from people who enjoyed the image, unfortunately. Spam-bot comments. I get a few spam comments a month, easily just dropped. But exponentially increasing numbers of comments were showing up on this page. The odd thing, though, is it wasn’t a page or a post. It was an image embedded in a post.

Evidently embedded pictures have their own “attachment page” — a page that includes a comment dialogue. I guess that’s useful for someone … maybe an artist who uses a gallery front-end to their media can still get comments on their pictures if their gallery doesn’t provide commentary. Not a problem I need solved. WordPress includes a comments_open filter that allows you to programmatically control where comments are available (provided your theme uses the filter).

How do you add a function to WordPress? I find a lot of people editing WordPress or theme files directly. Not a good idea — next upgrade is going to blow your changes away. If you use an upgrade script, you could essentially ‘patch’ the theme during the upgrade process (append your function to the distributed file). Or you can just add your function as a plug-in. In your wp-content/plugins folder, make a folder with a good descriptive name of your plugin (i.e. don’t call it myPlugin if you have any thoughts of distributing it). In that folder make a PHP file with the same name (i.e. my filterCommentsByType folder has a filterCommentsByType.php file.

For what I’m doing, the comment header is longer than the code! The comment header is used to populate the Plugins page in your admin console. If you omit the header component, your plugin will not show up to be activated. Add your function and save the file:

<?php
/**
* Plugin Name: Filter Comments By Type
* Plugin URI: http://lisa.rushworth.us
* Description: This plugin allows commenting to be disabled based on post type
* Version: 1.0.0
* Author: Lisa Rushworth
* Author URI: http://lisa.rushworth.us
* License: GPL2
*/
add_filter( ‘comments_open’, ‘remove_comments_by_post_type’, 10 , 2 );
function remove_comments_by_post_type( $boolInitialStatus, $iPostNumber) {
$post = get_post($iPostNumber);
if( $post->post_type == ‘attachment’ ){ return false; }
else{ return $boolInitialStatus; }
}
?>

When you go to your admin console’s plugins section, your filter will appear in the list and be deactivated. Click to active it.

Voila, no more comments on attachment posts. Or whatever other type of post on which you wish to restrict commenting.

October 12, 2017

Exchange 2013 Calendar Events In OpenHAB (CalDAV)

We’ve wanted to get our Exchange calendar events into OpenHAB — instead of trying to create a rule to determine preschool is in session, the repeating calendar event will dictate if it is a break or school day. Move the gymnastics session to a new day, and the audio reminder moves itself. Problem is, Microsoft stopped supporting CalDAV.

Scott found DAVMail — essentially a proxy that can translate between CalDAV clients and the EWS WSDL. Installation was straight-forward (click ‘next’ a few times). Configuration — for Exchange 2013, you need to select the “EWS” Exchange protocol and use your server’s EWS WSDL URL. https://yourhost.domain.cTLD/ews/exchange.asmx … then enable a local CalDAV port.

On the ‘network’ tab, check the box to allow remote connections. You *can* put the thumbprint of the IIS web site server certificate for your Exchange server into the “server certificate hash” field or you can leave it blank. On the first connection through DAVMail, there will be a pop-up asking you to verify and accept the certificate.

On the ‘encryption’ tab, you can configure a private keystore to allow the client to communicate over SSL. I used a PKCS12 store (Windows type), but a java keystore should work too (you may need to add the key signing key {a.k.a. CA public key} to the ca truststore for your java instance).

On the advanced tab, I did not enable Kerberos because the OpenHAB CalDAV binding passes credentials. I did enable KeepAlive – not sure if it is used, the CalDAV binding seems to poll. Save changes and open up the DAVMail log viewer to verify traffic is coming through.

Then comes Scott’s part — enable the bindings in OpenHAB (there are two of them – a CalDAVIO and CalDAVCmd). In the caldavio.cfg, the config lines need to be prefixed with ‘caldavio’ even though that’s not how it works in OpenHAB2.

caldavio:CalendarIdentifier:url=https://yourhost.yourdomain.gTLD:1080/users/mailbox@yourdomain.gTLD/calendar
caldavio:CalendarIdentifier:username=mailbox@yourdomain.gTLD
caldavio:CalendarIdentifier:password=PasswordForThatMailbox
caldavio:CalendarIdentifier:reloadInterval=5
caldavio:CalendarIdentifier:disableCertificateVerification=true

Then in the caldavCommand.cfg file, you just need to tell it to load that calendar identifier:

caldavCommand:readCalendars=CalendarIdentifier

We have needed stop openhab, delete the config file from ./config/org/openhab/ related to this calendar and binding before config changes are ingested.

Last step is making a calendar item that can do stuff. In the big text box that’s where a message body is located (no idea what that’s called on a calendar entry):

BEGIN:Item_Name:STATE
END:Item_Name:STATE

The subject can be whatever you want. The start time and end time are the times for the begin and end events. Voila!

October 5, 2017

Setting Up DNSSEC

Last time I played around with the DNS Security Extensions (DNSSEC), the root and .com zones were not signed. Which meant you had to manually establish trusts before there was any sort of validation happening. Since the corporate standard image didn’t support DNSSEC anyway … wasn’t much point on either the server or client side. I saw ICANN postponed a key rollover for root a few days ago, and realized hey, root is signed now. D’oh, way to keep up, huh?

So we’re going to sign the company zones and make sure our clients are actually looking at zone signatures when they exist. Step #1 – signing our test zone. I do this in a screen session because it can take a long time to generate a key. If the process gets interrupted for whatever reason, you get to start ALL OVER. I am using ISC Bind – how to do this on any other platform, well LMGTFY 🙂

# Start a screen session
screen -S LJR-DNSSEC-KeyGen
# Use dnssec-keygen to create a zone signing key (ZSK) – bit value is personal preference
dnssec-keygen -a NSEC3RSASHA1 -b 2048 -n ZONE rushworth.us
# Then use dnssec-keygen to create a key signing key (KSK) – bit value is still personal preference
dnssec-keygen -f KSK -a NSEC3RSASHA1 -b 4096 -n ZONE rushworth.us

Grab the content of the *.key files and append them to your zone

October 4, 2017

Apache HTTP Sandbox With Docker

I set up a quick Apache HTTPD sandbox — primarily to test authentication configurations — in Docker today. It was an amazingly quick process.

Install an image that has an Apache HTTPD server: docker pull httpd
Create a local file system for Apache config files (c:\docker\httpd\httpd.conf for main config, c:\docker\httpd\conf.d for all of the extras like ssl.conf and php.conf, plus web sites), and c:\docker\httpd\vhtml for the web site content)
Launch the container: docker run -detach –publish 80:80 –publish 443:443 –name ApacheWebServer –restart always -v /c/docker/httpd/httpd.conf:/etc/httpd/conf/httpd.conf:ro -v /c/docker/httpd/conf.d/:/etc/httpd/conf.d/:ro -v /c/docker/httpd/vhtml/:/var/www/vhtml/:ro httpd

Shell into it (docker exec -it ApacheWebServer bash) to look around, or just access http://localhost from the Docker host.

September 25, 2017

PPM Via Windows Authenticated Proxy

The office proxy used to use BASIC authentication. Which was terrible: transmission was done over clear text. Some years ago, they implemented a new proxy server that was capable of using Kerberos tickets for authentication (actually the old one could have done it too – I’ve set up the Kerberos realm on another implementation of the same product, but it wasn’t a straight forward clickity-click and you’re done). Awesome move, but it did break everything that used the HTTP_PROXY environment variable with creds included (yeah, I have a no-rights account with proxy access and put that in clear text all over the place). I just stopped using wget and curl to download files. I’d pull them to my Windows box, then scp them to the right place. But every once in a while I need a new perl module that’s available from ActiveState’s PPM. I’d have to fetch the tgz file and install it manually.

Until today — I was configuring a new Fiddler installation. Brilliant program – it’s just a web proxy that you can use for debugging purposes, but it can insert itself into HTTPS communications and provide clear text rendering of encrypted sessions too. It also proxies proxy credentials! There’s a config to allow remote hosts to connect – it’s normally bound to 127.0.0.1:8888, but it can bind to 0.0.0.0:8888 as well. If you have your web browser open & visit a site through the proxy server (i.e. you make sure the browser is authenticating fine) … set your HTTP_PROXY to http://127.0.0.1:8888 (or whatever means the specific program uses to configure a proxy). Voila, PPM hits Fiddler. Fiddler relays the request out to the proxy using the Kerberos token on your desktop. Package installs. Lot of overhead just to avoid unzipping a file … but if you are installing a package with a dozen dependencies … well, it’s a lot quicker than failing your install a dozen times and getting the next prereq!

September 19, 2017

PHP: Windows Authentication to MS SQL Database

I’ve encountered several people now how have followed “the directions” to allow their IIS-hosted PHP code to authenticate to a MS SQL server using Windows authentication … only to get an error indicating some unexpected ID is unable to log into the SQL server.

Create your application pool and add an identity. Turn off fastcgi.impersonate in your php.ini file. Create web site, use custom application pool … FAIL.

C:\Users\administrator.RUSHWORTH<%windir%\system32\inetsrv\appcmd.exe list config "Exchange Back End" /section:anonymousAuthentication
<system.webServer>
  <security>
    <authentication>
      <anonymousAuthentication enabled="true" userName="IUSR" />
    </authentication>
  </security>
</system.webServer>

The web site still doesn’t pick up the user from the application pool. Click on Anonymous Authentication, then click “Edit” over in the actions pane. Change it to use the application pool identity here too (why wouldn’t it automatically do so when an identity is provided?? no idea!).

C:\Users\administrator.RUSHWORTH<%windir%\system32\inetsrv\appcmd.exe list config "Exchange Back End" /section:anonymousAuthentication
<system.webServer>
  <security>
    <authentication>
      <anonymousAuthentication enabled="true" userName="" />
    </authentication>
  </security>
</system.webServer>

I’ve always seen the null string in userName, although I’ve read that the element may be omitted entirely. Once the site is actually using the pool identity, PHP can authenticate to SQL accounts using Windows authentication.

September 16, 2017

Facebook’s Offensive Advertising Profiles

As a programmer, I assumed Facebook used some sort of statistical analysis to generate advertising categories based on user input rather than employing a marketing group. A statistical analysis of the phrases being typed is *generally* an accurate reflection of what people type, although I’ve encountered situations where their code does not appropriately weight adjectives (FB thought I was a Trump supporter because incompetent, misogynist, unqualified, etc didn’t clue them into my real beliefs). But I don’t think the listings causing an uproar this week were factually wrong.

Sure, the market segment name is offensive; but computers don’t natively identify human offense. I used to manage the spam filtering platform for a large company (back before hourly anti-spam definition updates were a thing). It is impossible to write every iteration of every potentially offensive string out there. We would get e-mails for \/|@GR@! As such, there isn’t a simple list of word combinations that shouldn’t appear in your marketing profiles. It would be quite limiting to avoid ‘kill’ or ‘hate’ in profiles too — a group of people who hate vegetables is a viable target market. Or those who make killer mods to their car.

FB’s failing, from a development standpoint, is not having a sufficiently robust set of heuristic principals against which target demo’s are analysed for non-publication. They may have considered the list would be self-pruning: no company is going to buy ads to target “kill all women”. Any advertising string that receives under some threshold of buys in a delta-time gets dropped. Lazy, but I’m a lazy programmer and could *totally* see myself going down that path. And spinning it as the most efficient mechanism at that. To me, this is the difference between a computer science major and an information sciences major. Computer science is about perfecting the algorithm to build categories from user input and optimizing the results by mining purchase data to determine which categories are worth retaining. Information science teaches you to consider the business impact of customers seeing the categories which emerge from user input.

There are ad demo’s for all sorts of other offensive groups, so it isn’t like the algorithm unfairly targeted a specific group. Facebook makes money from selling advertisements to companies based on what FB users talk about. It isn’t a specific attempt to profit by advertising to hate groups; it’s an attempt to profit by dynamically creating marketing demographic categories and sorting people into their bins.

This isn’t limited to Facebook either – any scenario where it is possible to make money but costs nothing to create entries for sale … someone will write an algorithm to create passive income. Why WOULDN’T they? You can sell shirts on Amazon. Amazon’s Marketplace Web Service allows resellers to automate product listings. Custom write some code to insert random (adjectives | nouns | verbs) into a template string then throw together a PNG of the logo superimposed on a product. Have a production facility with an API to order, make the product once it has been ordered, and you’ve got passive income. And people did. I’m sure some were wary programmers – a sufficiently paranoid person might even have a human approve the new list of phrases. Someone less paranoid might make a banned word list (or even a banned word list and source one’s words from a dictionary and look for the banned words in the definition too). But a poorly conceived implementation will just glom words together and assume something stupid/offensive just won’t sell. Works that way sometimes. Bad publicity sinks the company other times.

The only thing that really offends me about this story is that unpleasant people are partaking in unpleasant conversations. Which isn’t news, nor is it really FB’s fault beyond creating a platform to facilitate the discussion. Possibly some unpleasant companies are targeting their ads to these individuals … although that’s not entirely FB’s fault either. Buy an ad in Breitbart and you can target a bunch of white supremacists too. Not creating a marketing demographic for them doesn’t make the belief disappear.