Category: Technology

OpenSSL As A Trusted CA

There are wrappers for OpenSSL that provide certificate authority functionality, but I found myself spending a lot of time trying to get any of them to work. Since I only wanted to generate a few internal certificates (i.e. not something that needed a simple interface for non-techies), I set up an OpenSSL certificate authority and used it to sign certificates.

First, generate a public/private keypair for your CA (use however many days you want; this is ten years):

openssl genrsa -aes256 -out ca.key 2048
openssl req -new -x509 -key ca.key -out ca.cer -days 3652 -sha256

Take ca.cer and publish it in your domain GPO as a trusted root certificate authority (Computer Configuration => Policies => Windows Settings => Security Settings => Public Key Policies => Trusted Root Certification Authorities).

If you are impatient, force clients to update group policy (gpupdate /force). Otherwise wait. Eventually you will see your CA in the Windows computer’s certificate store as a trusted root certification authority.

Now generate certificate(s) against the CA (again use whatever value for days is reasonable for your purpose):

openssl genrsa -aes256 -out gitlab.rushworth.us.key 2048
openssl req -new -key gitlab.rushworth.us.key -out gitlab.rushworth.us.req
openssl x509 -req -in gitlab.rushworth.us.req -out gitlab.rushworth.us.cer -days 365 -CA /ca/ca.cer -CAkey /ca/ca.key -sha256 -CAcreateserial

On subsequent requests, you can omit the “-CAcreateserial” option.
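
To sanity-check a signed certificate, you can verify it against the CA public key (a quick optional check, not part of the original procedure):

openssl verify -CAfile /ca/ca.cer gitlab.rushworth.us.cer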

In-domain clients will trust your certificate. Non-domain clients will need to import the CA public key into their trust store.
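
For Linux clients, a minimal sketch assuming a Fedora/RHEL-family host (Debian-family systems use /usr/local/share/ca-certificates/ and update-ca-certificates instead):

cp ca.cer /etc/pki/ca-trust/source/anchors/ca.cer
update-ca-trust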

Git Deployment

I ‘inherited’ the Git server at work — which means I had to learn how the back-end component of Git works (beyond my file-system based implementation where there are just clients and a disk location). It is not as complicated as I feared. The chap who had deployed the Git back end at work chose Bonobo — since he no longer works for the company, I cannot just ask why he chose this particular implementation. It is Windows based and priced within our $0 budget, and I am certain these were selling points. It seems quite stripped down compared to GitHub too — none of the issue tracking / wiki / chat features. Which, for what my department does, is fine. We are not software developers. We have a lot of internal code for task automation, we have some internal code for departmental web sites, and we have some sample code we hand out to other developers (i.e. someone who wants to start using LDAP or ADFS authentication can get a sample implementation in their language). There aren’t feature requests. Generally speaking, there aren’t simultaneous development tasks on a project.

Since I deciphered the server implementation at work, I wanted to set up a Git server at home too. The limited feature set of Bonobo was off-putting; I wanted integrated issue tracking. Looking at the available open-source and free options, I selected GitLab. As a sandbox — poke around the server, see how it works and what features it offers — I wanted something ready-to-go, and I noticed that there is a Docker container for the project. I had helped a few friends who were testing Docker as a development and deployment methodology (I’ve even suggested it for my employer’s internal development staff … being able to develop and run an application with an integrated web server *without* needing the Windows permissions and configuration for a web server, and without doing it all over again when your computer is replaced, seemed efficient). But I’d never actually used a Docker container before. It is incredibly easy.

Install Docker — a bit obvious, but that was the most time-consuming part of the process. I elected to install it on my Windows laptop for expediency: if we decide not to use GitLab, I haven’t thrown a bunch of unnecessary binaries on the server. Lenovo, by default, does not enable virtualisation, and getting into the BIOS config tool (shift-click the power button, then keep holding shift whilst you click Restart) was the slowest bit of the installation.

Once Docker is installed, pull the container from the Docker store and run it (you can remap ports, e.g. --publish 8443:443, if needed):

docker pull gitlab/gitlab-ce
docker run --detach --hostname gitlab.rushworth.us --publish 443:443 --publish 80:80 --publish 22:22 --name gitlab --restart always --volume /srv/gitlab/config://c/gldata/etc --volume /srv/gitlab/logs:/var/log/gitlab --volume /srv/gitlab/data://c/gldata/data --volume /svr/docker/gitlab/gitlab://c/gldata/gitlab gitlab/gitlab-ce:latest

Not quite there yet — you’ve got to edit the container config (docker exec -it gitlab vi /etc/gitlab/gitlab.rb) for your environment. Set a valid external URL (external_url 'http://gitlab.rushworth.us'). I also enabled LDAP authentication to test that out.


gitlab_rails['ldap_enabled'] = true

###! **remember to close this block with 'EOS' below**
gitlab_rails['ldap_servers'] = YAML.load <<-'EOS'
  main: # 'main' is the GitLab 'provider ID' of this LDAP server
    label: 'LDAP'
    host: 'ADHostname.rushworth.us'
    port: 636
    uid: 'sAMAccountName'
    method: 'ssl' # "tls" or "ssl" or "plain"
    bind_dn: 'cn=UserID,ou=SystemAccounts,dc=domain,dc=ccTLD'
    password: 'AccountPasswordGoesHere'
    active_directory: true
    allow_username_or_email_login: false
    block_auto_created_users: false
    base: 'ou=ResourceUsers,dc=domain,dc=ccTLD'
    user_filter: '(&(sAMAccountName=*))' # An attribute value can be added to restrict GitLab access to authorized users; we leave it open to all valid user accounts in the OU. Authorization based on group membership should also work, using a linked attribute value like (&(memberOf=cn=group,ou=groupOU,dc=domain,dc=ccTLD))
    attributes:
      username: ['uid', 'userid', 'sAMAccountName']
      email: ['mail', 'email', 'userPrincipalName']
      name: 'cn'
      first_name: 'givenName'
      last_name: 'sn'
EOS


The default is to retain a lot of log files — 30 days! This might be reasonable in a corporate environment, but even for production at home … that’s a lot of space dedicated to log files.


logging['logrotate_frequency'] = "daily" # rotate logs daily
logging['logrotate_rotate'] = 3 # keep 3 rotated logs
logging['logrotate_compress'] = "compress" # see 'man logrotate'
logging['logrotate_method'] = "copytruncate" # see 'man logrotate'


And finally configure SMTP for outbound mail. We don’t use authentication on our SMTP server; it controls relay based on source IP. We do use starttls, but the certificate is not going to be trusted without additional configuration … so I set the ssl verify mode to none.


gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = "smtp.hostname.ccTLD"
gitlab_rails['smtp_port'] = 25
# gitlab_rails['smtp_user_name'] = "smtp user"
# gitlab_rails['smtp_password'] = "smtp password"
# gitlab_rails['smtp_domain'] = "example.com"
# gitlab_rails['smtp_authentication'] = "login"
gitlab_rails['smtp_enable_starttls_auto'] = true
# gitlab_rails['smtp_tls'] = false

###! **Can be: 'none', 'peer', 'client_once', 'fail_if_no_peer_cert'**
###! Docs: http://api.rubyonrails.org/classes/ActionMailer/Base.html
gitlab_rails['smtp_openssl_verify_mode'] = 'none'


Once the config has been updated, restart the container (docker restart gitlab).
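
If a setting does not seem to take effect after the restart, you can also run the omnibus reconfigure inside the container as a sanity check:

docker exec -it gitlab gitlab-ctl reconfigure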

Access the web site and you’ll be prompted to set a password for the admin user, root. You can click the ‘ldap’ tab and log in with Active Directory credentials. Fin.

If we deploy this for a production system, I would set up SSL on the web site and possibly externalize the GitLab database to MySQL. The external database is more of an academic experiment because we already use MySQL (and I still don’t want to learn about vacuuming PostgreSQL).

Apache Airflow — No Backfill

A lot of software seems to be designed to save the user from themselves. This is great 90% of the time when you mess up and really want their help (or when the software’s help is cosmetic … my gripe against auto-correcting smart quotes, as an example). But I seem to fall into the other 10% a lot. And I mean a LOT. Apache Airflow jobs try to grab new information all.of.the.time. It’s a feature called “backfill”, and I’m sure it helps all sorts of people do exactly what they really wanted done. Not me 🙁

Having updated to 1.8, though, I now see a configuration parameter to instruct a DAG not to do me any favors. Just do what you’re asked when you’re asked to do it: catchup = False

DAG('testjob', default_args=default_args, schedule_interval='0 * * * *', catchup=False)

Android Mail Client Malfunction On FierceXL

Both Scott and I have an odd issue with our FierceXL using the stock mail client to communicate with Exchange 2013 over the OWA interface. Randomly, one or more of the connected accounts stops receiving e-mail. We know OWA still works and is available from the phone — we can go into Chrome on the phone and log into OWA. There is absolutely no traffic coming across the reverse proxy / Exchange server from the phone IP. Switching between the cellular network and home WiFi has no impact.

I had just been rebooting my phone. Upon startup, communication is again seen on the reverse proxy server. A few seconds later, the backlog of new mail starts popping into the mail client. Scott recently discovered that you can resolve the issue by closing the mail app (bringing up the recent/running application list and swiping mail off of the screen) and re-opening it.

It appears something within the mail client is getting a thread hung — not the mail client in toto, as I often cease receiving messages to just one of three accounts. Ending the process and re-spawning it clears whatever is hung. Unfortunately, we have not had any updates for these phones since November of last year, so there’s not a quick software fix that can be applied to resolve the issue.

The Peril Of Hosting Your Own Services

I love hosting my own services — home automation, file shares, backups, e-mail, web servers, DNS … bit of paranoia, a bit of control freak, and a bit of pride. But every now and again, hosting my own services causes problems because, well, vendors don’t develop processes around someone with servers in their house.

We got a new cable modem. Scott went to a web page (happened to be Google) and got redirected to the TWC activation page. Went through whatever, ended up calling into support, and finally our account was sorted. Woohoo! Everything works … umm, except I cannot search Google.

Turns out TWC manages their activation redirection by serving up bogus DNS info — their server IP instead of the real one. Which then got cached on our DNS server. No idea what TTL TWC set on their bogus data, but it was more than a minute or two. Had to clear the DNS server cache before we were able to hit Google sites again.
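
For reference, this is roughly what clearing the cache looks like on a BIND-based resolver (an assumption — your DNS server and its flush command may differ):

rndc flush                  # dump the entire cache
rndc flushname google.com   # or just flush the poisoned name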

OK, Google

Chrome 58 was released last month – and since then, I’ve gotten a LOT of certificate errors. Especially internally (Windows CA signed certs @ home and @ work). It’s really annoying – yeah, we don’t have SAN dNSName values defined. And I know the RFC says falling back to CN is deprecated (seriously, search https://tools.ietf.org/html/rfc2818 for subjectAltName), but the same text was in there in 1999 … so not exactly a new innovation in SSL policy. Fortunately there’s a registry key that will override this for now.
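
If I recall correctly, the Chrome policy in question is EnableCommonNameFallbackForLocalAnchors (treat the name as an assumption and check Google’s policy list before relying on it); something like this sets it machine-wide:

reg add "HKLM\SOFTWARE\Policies\Google\Chrome" /v EnableCommonNameFallbackForLocalAnchors /t REG_DWORD /d 1 /f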

The problem I have with SAN certificates is exemplified in Google’s cert on the web server that hosts the chromium changes site:

Seriously – this certificate asserts that the web site is any of a hundred-odd wild-carded hostnames … and the more places you use a certificate, the greater the possibility of it being compromised. I get why people like wildcards — UALR was able to buy one cert & use it across the entire organisation. Cost effective and easy. The second through nth guy who wanted an SSL cert didn’t need to go about establishing his credentials within the organisation. He didn’t have to figure out how to make a cert request or how to pay for it. Just ask the first guy for a copy of his public/private key pair. Or run everything through your load balancer on the wildcard certificate & trust whatever backend cert happens to be in place.

But the point of security design is not trusting large groups of people to act properly. To secure their data appropriately. To patch their systems, configure their systems to avoid attacks, to replace the certificate EVERYWHERE every TIME someone leaves the organisation, and otherwise prevent a certificate installed on dozens of servers from being accessed by a malicious party. My personal security preference would be seeing a browser flag every time a cert has a wildcard or more than one SAN.

Exchange Online

We’re moving users to the magic in-the-cloud Exchange. Is this a cost effective solution? Well – that depends on how you look at the cost. The on prem cost includes a lot of money to external groups that are still inside the company. If the SAN team employs ten people … well, that’s a sunk cost if they’re administering our disk space or not. If we were laying people off because services moved out to magic cloud hosted locations … then there’s a cost savings. But that’s not reality. Point being, there’s no good comparison because the internal “costs” are inflated. Microsoft’s pricing to promote cloud adoption means EOL is essentially free with purchase too. I’m sure the MS cost will go up in the future — I remember them floating “leased” software back in the late 90’s (prelude to SaaS) and thinking that was a total racket. You move all your licensing to this convenient “pay for what you use” model. And once a plurality of customers have adopted the licensing scheme, start bumping up rates. It’s a significant undertaking to migrate over – but if I’m saving hundreds of thousands of dollars a year … worth it. Rates go up, and the extra fifty grand a year isn’t worth the cost and time for migrating back to on prem. And next year that fifty grand more isn’t worth it either. Economies of scale say MS (or Amazon, or whomever) can purchase ten thousand servers and petabytes of disk space for less money than I can get two thousand servers and a hundred terabytes … but they want to make a profit too. There might be a small cost savings in the long term, but nothing like the hundreds of thousands we’re being sold up front.

Regardless – business accounting isn’t my thing. A lot of it seems counter-productive if not outright nonsensical. There are actually features in Exchange Online that do not exist in the on prem solution. The one I discovered today is subaddressing. At home, we use the virtusertable in sendmail to map entire subdomains to a single mailbox. This means I can provide a functional e-mail address, on the fly, to a new company and have mail delivered into my mailbox. Works fine for a small number of people, but it is not a scalable solution. Some e-mail providers started using a delimiter after which any string is ignored. This means I could have a GMail account of DevNull@gmail.com but get mail as DevNull+SomeRandomString@gmail.com or DevNull+CompanyNameHere@gmail.com … great for identifying who is leaking your e-mail address out in Internet-land. It is also somewhat trivial to write a rule that takes +SomeCompromisedAddress and moves it to trash. EOL lets us do that.
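
Roughly what those virtusertable entries look like (the subdomain and mailbox names here are made up, and the subdomains also need to be listed as domains sendmail accepts mail for); rebuild the map after editing:

@somestore.rushworth.us      mailboxuser
@anothervendor.rushworth.us  mailboxuser

makemap hash /etc/mail/virtusertable < /etc/mail/virtusertable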

Another interesting feature that is available on prem but not convenient is free busy federation (now termed an “organisational relationship”). In previous iterations, both parties needed to establish firewall rules (and preferably a B2B connection) to transfer the free busy data. But two companies with MS tenants should be able to link up without having to enact firewall changes. We still connect to the tenant. The other party still connects to the tenant. It’s our two tenants that communicate via MS’s network. Something I’m interested in playing around with … might try to see if we can link our sandbox tenant up to the production one just to see what exactly is involved.

Git, Version Management, Branches, and Sub-modules

As we have increased in staff, we’ve gained a few new programmers. While it was easy enough for us to avoid stepping on each other’s toes, we have experienced several production problems that could be addressed by rethinking our repository configuration.

Current state: We have a monolithic repository for different batch servers. Each server has a clone of the repository, and the development equivalent has a clone of the same repository. The repository has top-level folders for each independent script. There is a SharedTools top-level folder for reusable functions.

Changes are made on forks located both on the development server and individuals’ computers, tested on the development server, then pushed to the repo. Under a CRQ, a pull is performed from the production server to elevate the new code. Glomming dozens of scripts into a single repository was simple and quick; but, with new people involved with development efforts, we have experienced challenges with changes being lost, unintentional elevation of code, and having UAT run against under-development code.

Pitfalls: Four people working on four different scripts are working in the same repository. We have had individuals developing on their laptop overwrite changes (force push is dangerous, even if force-with-lease is used), we have had individuals developing on the dev server commit other people’s edits (git add * isn’t a good idea in a shared environment – specifically add changed files to your commit), and we’ve had duplication of effort (which is certainly a problem outside of development efforts, and one that can be addressed outside of git).

We could address the issues we’ve seen through training and communication – ensure anyone contributing code to the repository adequately understands what force push means, appreciates what wildcards include, and generally has a more nuanced understanding of git than the one-hour training I provided last year. But I think we should consider the LOE and advantages of using a technical solution to ensure less experienced git users are able to successfully use our repositories.

Proposal – Functional Splits:

While we have a few more individuals with development experience, they are quite specifically Windows script developers (PowerShell, VBScript, etc). We could just stop using the Windows batch server and let the two or three Microsoft guys figure it out for themselves. This limits individual growth – I “don’t do” PowerShell development, the Windows guys don’t learn Linux. And, as the group changes over time, we have not addressed the underlying problem of multiple people working on the same codebase.

Proposal – Git Changes:

We can begin using branches for development efforts and reserve “master” for ready-for-deployment code. Doing so, we eliminate the possibility of inadvertently elevating code before it is ready – only commands targeted to “origin master” will be run on production servers.

Using descriptive branch names (Initials-ScriptFolderName-SummaryOfChange) will help eliminate duplicated efforts. If I notice we need to send a few mass mails with inline images, seeing “TJR-sendMassMail-AddInlineImages” in the branch list lets me know you’ve got it covered. And “TJR-sendMassMail-RecipientListFromLiveLDAPQuery” lets me know you’re working on something else and I’m setting myself up for merge challenges by working on my change right now. If both of our changes are high priority, we might choose to work through a merge operation. This would be an informed, upfront decision instead of a surprise message indicating that fast-forward merging is not possible.

In large development projects, branch management can become a full-time pursuit. I do not think that will be an issue in our case. Minimizing the number of branches used, and not creating branches based on branches, makes branch management a simpler task. We should be able to perform fast-forward merges to push code into master because our branches modify different files in the repository.

To begin a development effort, create a branch and push it to the git server. Make your changes within that branch, and keep your branch in sync with master – if your branch falls behind, the final merge into master cannot be a simple fast-forward. Once you are finished with your development, merge your branch into master and delete your branch. This approach will require additional training to ensure everyone understands how to create, rebase, merge, and delete branches (and not to just force operations because it lets you complete your task).
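
A sketch of that workflow, using the branch name from the earlier example (the remote name and repository layout are assumptions):

git checkout -b TJR-sendMassMail-AddInlineImages
git push -u origin TJR-sendMassMail-AddInlineImages
# ... edit, git add, git commit as usual ...
git fetch origin
git rebase origin/master                               # keep the branch current with master
git checkout master
git pull origin master
git merge --ff-only TJR-sendMassMail-AddInlineImages   # fast-forwards because the branch is current
git push origin master
git branch -d TJR-sendMassMail-AddInlineImages
git push origin --delete TJR-sendMassMail-AddInlineImages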

Instead of using ‘master’ for production code, the inverse is equally viable: create a “stable” branch that is for production code and only pull that branch to PROD servers. I believe this approach is done to prevent accidental changes to prod code – you’ve got to intentionally target “origin stable” with an operation to impact production code.

Our single repository configuration is a detriment to using branches if development is performed on the DEV server. To illustrate the issue, create a BranchTesting repo and add a single file to master. Create a Branch1 branch in one command window and check it out. Create a Branch2 in a second command window and check it out. In your first command window, add a file and commit it. In your second command window, add a file and commit it. You will find that both files have been committed to Branch2.
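
If you want to see it for yourself, the commands look something like this (both “windows” are shell sessions in the same working copy, which is exactly the situation on a shared DEV server):

mkdir BranchTesting && cd BranchTesting
git init
echo "readme" > readme.txt && git add readme.txt && git commit -m "initial commit"
git checkout -b Branch1          # window 1
git checkout -b Branch2          # window 2: HEAD for the shared working copy is now Branch2
echo "one" > file1.txt && git add file1.txt && git commit -m "add file1"   # window 1
echo "two" > file2.txt && git add file2.txt && git commit -m "add file2"   # window 2
git log --oneline Branch2        # both commits landed on Branch2; Branch1 has neither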

How can we address this issue?

Develop on our individual workstations instead of the DEV server. Not sharing a file set for our development efforts eliminates the branch context switching problem. If you clone the repo to your laptop, Gary clones the repo to his laptop, and I clone the repo to my laptop … you can create TJR-sendMassMail-AddInlineImages on your computer, write and test the changes locally, commit the changes and pull them to the DEV server for more robust testing, and then merge your changes into master when you are ready to elevate the code. I can simultaneously create LJR-monitorLDAPReplication-AddOUD11Servers, do my thing, commit changes and pull them to the DEV server (first using “git branch” to determine if someone else is already testing their branch on the DEV server), and merge my stuff into master when I’m ready to elevate. Other than remembering to verify that DEV has master checked out (i.e. no one else is testing, so the resource is free), we do not have resource contention.

While it may not be desirable to fill up our laptop drives with the entire code set from six different application servers, sparse-checkout allows you to select the specific folders that will come down to your fork.
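
A minimal sketch of a sparse checkout using the core.sparseCheckout config (newer git versions also have a dedicated sparse-checkout command); the repository URL and folder selection here are illustrative:

git clone --no-checkout https://gitserver.example.com/batch/windowsbatch.git windowsbatch
cd windowsbatch
git config core.sparseCheckout true
echo "sendMassMail/" >> .git/info/sparse-checkout
echo "SharedTools/" >> .git/info/sparse-checkout
git checkout master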

The advantage of this approach is that it has no initial LOE beyond training and process change. The repositories are left as-is, and we start using them differently.

Unfortunately, this approach may not be viable in some instances – when access to data sources is restricted by IP ACL, you may not be able to do more than linting on your laptop. It may not even be possible to configure a Windows laptop to run some of our code – some Linux requirements are difficult to address in Windows (the PKI website’s cert info check, for instance), and testing code on Windows may not ensure successful operation on the Linux hosts.

Break the monolithic repositories into discrete repositories and use submodules to allow the multiple independent repositories to be “rolled up” into a top-level repository. Development is done in the submodule repositories. I can clone monitorLDAPReplication, you can clone sendMassMail, etc. Changes can be made within our branches of these completely different repositories and merged into the individual repository’s master branch for release to the production environment. Release can be done for the superset (“--recurse-submodules”) or individual sub-modules.
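
A rough sketch of how the roll-up would look (URLs and repository names are hypothetical):

# build the top-level repository once
git init batchserver && cd batchserver
git submodule add https://gitserver.example.com/batch/sendMassMail.git sendMassMail
git submodule add https://gitserver.example.com/batch/monitorLDAPReplication.git monitorLDAPReplication
git commit -m "register submodules"
git push -u origin master      # assumes a remote named origin has already been configured

# on a production server, release the superset ...
git clone --recurse-submodules https://gitserver.example.com/batch/batchserver.git
# ... or pull just one submodule forward
git submodule update --remote sendMassMail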

This would require splitting a repository into its individual components and configuring the sub-module relationships. This can be a scripted operation, and it is an incremental change to the script I used to create the repositories and ingest code; but the LOE for implementation is a few days of script writing / testing. Training will be required to ensure individuals can register their submodules within the top-level repo, and we will need to accustom ourselves to maintaining individual repos.

Or just break monolithic repositories into discrete repositories. The level of effort is about the same initially, but no one needs to learn how to set up a new submodule. We lose single-repo conveniences, but there’s literally no association between our different script folders where someone working in X could inadvertently impact Y.

GoFCCYourself(.com)

You know what you find when you drain a swamp? A whole bunch of rotting detritus. I’m not going to pretend astonishment that a former Associate General Counsel from Verizon thinks net neutrality is a terrible idea. I remember getting an e-mail message from my employer, another network provider, detailing how this terrible proposal was going to drive us all out of business. Or something similarly over-dramatic.

Facilitating public comment on Executive branch proceedings, as GoFCCYourself.com does, is an interesting idea. Take a circuitous government web site that ostensibly allows individuals to post comments on issues, and circumvent the terrible user interface with a memorable URL that (I assume) includes the appropriate POST headers to drop individuals in exactly the right place to submit their comments.

I’ve used this short-cut to submit my opinion to the FCC, but I also forwarded the same message to my rep in the House and my two state Senators:

I have submitted this to the FCC for Docket 17-108 but wanted to include you as well. If the FCC does roll back net neutrality, as their chairman indicates is his desire, I beseech you to ready legislative controls to prevent ISPs from using speed controls to essentially censor Internet content.

I am writing to express my support for “net neutrality” — while you want to claim it reduces carrier investment or innovation, customer acquisition and retention drives carrier investment and innovation. Lowered cost of operations, creating a service that allows a higher price point, or offering a new service unavailable through a competitor drive innovation. Allowing a carrier to create a new revenue stream by charging content providers for faster access is not innovation – QoS has been around for decades. And it isn’t like the content is being delivered to the Internet for free. Content providers already pay for bandwidth — and a company like Netflix probably paid a LOT of money for bandwidth at their locations. If Verizon didn’t win a bid for network services to those locations, that’s Verizon’s problem. Don’t create a legal framework for every ISP to profit from *not* providing network services for popular sites; the network provider needs to submit a more competitive bid.

What rolling back net neutrality *does* is stifle customers and content providers. If I, as a customer, am paying 50$ a month for my Internet service but find the content that I *want* is de-prioritized and slowed … well, in a perfect capitalist system, I would switch to the provider who ‘innovates’ and goes back to their 2017 configurations. But broadband access – apart from some major metro areas – is not a capitalist system. Where I live, outside of the Cleveland suburbs, I have my choice of the local cable company or sat – sat based Internet introduces a lot of latency and is quite expensive for both the customer and the operator (and has data limits, which themselves preclude a lot of network-intensive traffic that ISPs wish to de-prioritize). That’s not a real choice — pay 50$ to this company who is going to de-prioritize anyone who doesn’t pay their network bandwidth ransom or pay 100$ to some other company that is unable to provide sufficiently low latency to allow me to work from home. So add a hour of commute time, fuel, vehicle wear, and reduced family time to that 100$ bill.

Rolling back net neutrality stifles small businesses — it’s already difficult to compete with large corporations who have comparatively unlimited budgets for advertising and lawyers. Today, a small business is able to present their product online with equal footing. In 1994, I worked at a small University. One of my initiatives was to train departmental representatives on basic HTML coding so the college would have an outstanding presence on the Internet. First hour of the first day of the training session included a method for checking load times off campus without actually having to leave the campus network. On campus, we were 10 meg between buildings and the server room and anything loaded quite quickly. At home, a prospective student was dialing in on a 28.8 modem. If your content is a web page for MIT, a prospective engineering student may be willing to click your site, go eat dinner, and come back. Load time isn’t as much of a problem for an organisation with a big name and reputation. Unknown little University in Western PA? Click … wait … wait, eh, never mind. The advent of DSL was amazing to me because it provided sufficient bandwidth and delivered content with parity that allowed an unknown Uni to offer a robust web site with videos of the exciting research opportunities available to students and the individual attention from professors that small class sizes allow. No longer did we need to restrict graphics and AV on our site because we weren’t a ‘big name’ University. That there ever was a debate about removing this parity astonished me.

Aside from my personal opinion, what is the impact of non-neutral networks on free speech? Without robust legal controls, ISPs engage in a form of quasi-censorship. How do you intend to prevent abuse of the system? Is a large corporation going to be able to direct “marketing” dollars to speeding up their page to the harm of their competitors? Can the Coca-Cola Company pay millions of dollars to have their content delivered faster than PepsiCo’s? Is the ISP then the winner in a bidding war between the two companies? What about political content? Does my ISP now control the speed at which political content is delivered? What happens when Democrats raise more money in the Cleveland metro area and conservative views are relegated to the ‘slow’ lane? What happens when the FCC gets de-prioritized because ISPs want even less regulation??

I would still worry about the legal controls to prevent quasi-censorship, but I would object less if the FCC were to implement net neutrality requirements the way some telco regulations applied where CLECs had not come in to compete with the ILECs — where there is no or limited competition, net neutrality is a requirement. Where there are a dozen different ISP options, they can try selling the QoS’d packages. Polls and voting aside, the ISP will find out exactly how many customers or content providers support non-neutral networks.

Owntracks Stuck In “Connecting” To MQTT When Using WebSockets

Our home automation presence is maintained through an Android app, OwnTracks, which updates a Mosquitto server via a WebSockets reverse proxy. Mosquitto runs on a Fedora 25 server and was installed from the default RPM repository.
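
For context, the WebSockets listener in mosquitto.conf looks roughly like this (the port numbers are examples; the Apache reverse proxy forwards the wss traffic to the WebSockets listener):

# plain MQTT for the internal network
listener 1883
# WebSockets listener behind the Apache reverse proxy
listener 9001
protocol websockets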

Recently, we stopped receiving location updates – both of our Android clients were stuck “Connecting” to the MQTT server. Nothing appeared in the Apache access or error logs, and capturing network traffic only got a small number of packets (TCP session overhead ‘stuff’). Even bypassing the reverse proxy and using the internal network to communicate directly to the Mosquitto server only created a couple of packets. Using a test client (http://www.hivemq.com/demos/websocket-client/), I saw strange connection failures — so I knew the problem was not specific to the OwnTracks client.

It seems there was a bug in libwebsockets v2.1.1 (and possibly others) — when we updated our Fedora installation, the new libwebsockets broke our MQTT over WebSockets. Currently, the Fedora repository still contains an impacted version of libwebsockets. To resolve the issue, I built the latest stable libwebsockets and built mosquitto against this updated library.

Process: The first step is to remove the dnf managed packages (rpm -e libwebsockets libwebsockets-devel mosquitto). Then build libwebsockets and mosquitto.

To Build LibWebSockets:

wget https://github.com/warmcat/libwebsockets/archive/master.zip
unzip master.zip
cd libwebsockets-master/
mkdir build
cd build
cmake ..
make
make install
cp libwebsockets.pc /usr/lib/
cp lws_config.h /usr/include/
cp ../lib/libwebsockets.h /usr/include/
cp ./lib/libwebsockets.so /usr/lib/

To Build Mosquitto:

wget https://github.com/eclipse/mosquitto/archive/master.zip
unzip master.zip
cd mosquitto-master/
vi config.mk # set "WITH_WEBSOCKETS:=yes"
make
make install

Start the Mosquitto server and try again. Voila, presence works again!