Month: December 2023

Cranberry Crumble

I made a cranberry crumble that Scott found in the local newspaper.

FILLING:

  • 2 cups cranberries
  • Zest and juice of 1/2 orange
  • 1/2 cup maple syrup
  • 1 1/2 tablespoons cornstarch
  • 1/4 teaspoon ground cinnamon

CRUST:

  • 1 1/2 cups all-purpose flour
  • 1 1/2 cups almond flour
  • 1/2 cup granulated sugar
  • 1 teaspoon baking powder
  • 1/4 teaspoon salt
  • 1/4 teaspoon ground nutmeg
  • Zest of 1/2 orange
  • 4 tablespoons cold unsalted butter, cubed
  • 2 large egg whites
  • 1 1/2 teaspoons vanilla extract

Preheat oven to 375F.

Toss cranberries with cornstarch and cinnamon. Add orange zest and mix. Then stir in orange juice and maple syrup.

Combine dry crust ingredients. Cut in butter, then add egg whites and vanilla to form a crumble. Press some into bottom of pan. Cover with cranberry mixture, then crumble crust mixture on top.

Bake at 375F for 40 minutes. Allow to cool.

Lamb Shank Tagine

Cooked lamb shanks in the tagine today.

  • 4 smoked ancho chillies
  • 5 dried apricots
  • 4 lamb shanks
  • olive oil
  • 1 red onion
  • 8 cloves of garlic
  • 5 fresh red chillies
  • 1 heaped teaspoon smoked paprika
  • 3 fresh bay leaves
  • 2 sprigs of fresh rosemary
  • 400 g tin of plum tomatoes
  • 1 liter chicken stock

Coat shanks with salt and pepper. Sear shanks in olive oil, then add onion and garlic and cook for a few minutes until fragrant. Remove from pan.

Add stock, plum tomatoes, spices, and peppers. Add shanks back, cover, and simmer at low heat for about 3 hours.

Rose Hip Jelly

Anya and I picked more rose hips today — we’ve got a LOT of the little things. About half a pound.

We boiled them (plus an apple) in enough water to cover by 1″, added a little lemon juice, then drained the juice. Added a whole bunch of sugar to make jelly (14 oz for every pint of juice), and let it cool and congeal overnight.

ElasticSearch — Too Many Shards

Our ElasticSearch environment melted down in fairly spectacular fashion. Evidently (at least in older versions), there is an unhandled Java exception when one server tries to send data to another server that refuses it because accepting would put the receiver over the shard limit. So we didn’t just have a server or three go into read-only mode — we had a cascading failure where Java would except out and the process was dead. Restarting the ElasticSearch service temporarily restored functionality, so I quickly increased the max shards per node limit to keep the system up whilst I cleaned up whatever I could:

curl -X PUT http://uid:pass@`hostname`:9200/_cluster/settings -H "Content-Type: application/json" -d '{ "persistent": { "cluster.max_shards_per_node": "5000" } }'
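
To confirm the change took effect, the same settings endpoint can be read back. Something like this, where filter_path just trims the response down to the one setting:

curl -s "http://uid:pass@`hostname`:9200/_cluster/settings?filter_path=persistent.cluster.max_shards_per_node"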

There were two requests against the ES API that were helpful in cleaning ‘stuff’ up — GET /_cat/allocation?v returns a list of each node in the ES cluster with a count of shards (plus disk space) being used. This was useful in confirming that load across ‘hot’, ‘warm’, and ‘cold’ nodes was reasonable. If it was not, we would want to investigate why some nodes were under-allocated. We were, however, fine.

The second request: GET /_cat/shards?v=true, which dumps out all of the shards that comprise the stored data. In my case, a lot of clients create a new index daily — MyApp-20231215 — and then proceed to add absolutely nothing to that index. Literally 10% of our shards were devoted to storing zero documents! Well, that’s silly. I created a quick script to remove any zero-document index that is older than a week. A new document coming in will create the index again, and we don’t need to waste shards not storing data.
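
The script boiled down to something like this. A sketch, assuming the same uid:pass endpoint and GNU date; _cat/indices reports creation.date in epoch milliseconds:

ES="http://uid:pass@`hostname`:9200"
# anything created more than a week ago (epoch milliseconds)
CUTOFF=$(date -d '7 days ago' +%s%3N)
curl -s "$ES/_cat/indices?h=index,docs.count,creation.date" | while read INDEX DOCS CREATED; do
    if [ "$DOCS" -eq 0 ] && [ "$CREATED" -lt "$CUTOFF" ]; then
        echo "Deleting empty index $INDEX"
        curl -s -X DELETE "$ES/$INDEX"
    fi
done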

Once you’ve cleaned up the shards, it’s a good idea to drop your shard-per-node configuration down again. I’m also putting together a script to run through the allocated shards per node data to alert us when allocation is unbalanced or when total shards approach our limit. Hopefully this will allow us to proactively reduce shards instead of having the entire cluster fall over one night.
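
The approaching-the-limit half of that monitoring is simple enough. A sketch, with the limit, 90% threshold, and alert address (ops@example.com here) as placeholders for whatever makes sense in your environment:

ES="http://uid:pass@`hostname`:9200"
LIMIT=1000     # whatever cluster.max_shards_per_node is set to
curl -s "$ES/_cat/allocation?h=node,shards" | while read NODE SHARDS; do
    if [ "$SHARDS" -gt $((LIMIT * 90 / 100)) ]; then
        echo "$NODE has $SHARDS of $LIMIT shards" | mail -s "ES shard alert" ops@example.com
    fi
done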

DIFF’ing JSON

While a locally processed web tool like https://github.com/zgrossbart/jdd can be used to identify differences between two JSON files, regular diff can be used from the command line for simple comparisons. Using jq to sort the JSON keys first, diff will highlight (pipe bars between the two columns, in this example) where differences appear between the two files. Since the keys are sorted, key order doesn’t matter — but note that --sort-keys only sorts object keys, not array elements, so a list holding 1,2,3 in one file and 2,1,3 in the other still registers as a difference.

[lisa@fedorahost ~]# diff -y <(jq --sort-keys . 1.json) <(jq --sort-keys . 2.json )
{                                                               {
  "glossary": {                                                   "glossary": {
    "GlossDiv": {                                                   "GlossDiv": {
      "GlossList": {                                                  "GlossList": {
        "GlossEntry": {                                                 "GlossEntry": {
          "Abbrev": "ISO 8879:1986",                                      "Abbrev": "ISO 8879:1986",
          "Acronym": "SGML",                                  |           "Acronym": "XGML",
          "GlossDef": {                                                   "GlossDef": {
            "GlossSeeAlso": [                                               "GlossSeeAlso": [
              "GML",                                                          "GML",
              "XML"                                                           "XML"
            ],                                                              ],
            "para": "A meta-markup language, used to create m               "para": "A meta-markup language, used to create m
          },                                                              },
          "GlossSee": "markup",                                           "GlossSee": "markup",
          "GlossTerm": "Standard Generalized Markup Language"             "GlossTerm": "Standard Generalized Markup Language"
          "ID": "SGML",                                                   "ID": "SGML",
          "SortAs": "SGML"                                    |           "SortAs": "XGML"
        }                                                               }
      },                                                              },
      "title": "S"                                                    "title": "S"
    },                                                              },
    "title": "example glossary"                                     "title": "example glossary"
  }                                                               }
}                                                               }
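
If array ordering is noise in your comparison, jq can sort the arrays as well. A sketch, assuming jq 1.6 or later, where walk is a builtin:

diff -y <(jq --sort-keys 'walk(if type=="array" then sort else . end)' 1.json) <(jq --sort-keys 'walk(if type=="array" then sort else . end)' 2.json)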

Bulk Download of YouTube Videos from Channel

Several years ago, I started recording our Township meetings and posting them to YouTube. This was very helpful — even our government officials used the recordings to refresh their memory about what happened in a meeting. But it also led people to ask “why, exactly, are we relying on some random citizen to provide this service? What if they are busy? Or move?!” … and the Township created their own channel and posted their meeting recordings. This is a great way to promote transparency; however, they’ve got retention policies. Since we have absolutely been at meetings where it would be very helpful to know what happened five, ten, forty!! years ago … my expectation is that these videos will be useful far beyond the allotted document retention period.

We decided to keep our channel around with the historic archive of government meeting recordings. There’s no longer time criticality — anyone who wants to see a current meeting can just use the township’s channel. We have a script that lists all of the videos from the township’s channel and downloads them — once I complete back-filling our archive, I will modify the script to stop once it reaches a video series we already have. But this quick script will list all videos published to a channel and download the highest quality MP4 file associated with that video.

# API key for my Google Developer project
strAPIKey = '<CHANGEIT>'

# Youtube account channel ID
strChannelID = '<CHANGEIT>'

import os
from time import sleep
import urllib.request
import json
from pytube import YouTube
import datetime

from config import dateLastDownloaded

os.chdir(os.path.dirname(os.path.abspath(__file__)))
print(os.getcwd())

strBaseVideoURL = 'https://www.youtube.com/watch?v='
strSearchAPIv3URL= 'https://www.googleapis.com/youtube/v3/search?'

iStart = 0		# Not used -- included to allow skipping first N files when batch fails midway
iProcessed = 0		# Just a counter

strStartURL = f"{strSearchAPIv3URL}key={strAPIKey}&channelId={strChannelID}&part=snippet,id&order=date&maxResults=50"
strYoutubeURL = strStartURL

while True:
    inp = urllib.request.urlopen(strYoutubeURL)
    resp = json.load(inp)

    for i in resp['items']:
        if i['id']['kind'] == "youtube#video":
            iDaysSinceLastDownload = datetime.datetime.strptime(i['snippet']['publishTime'], "%Y-%m-%dT%H:%M:%SZ") - dateLastDownloaded
            # If video was posted since last run time, download the video
            if iDaysSinceLastDownload.days >= 0:
                strFileName = (i['snippet']['title']).replace('/','-').replace(' ','_')
                print(f"{iProcessed}\tDownloading file {strFileName} from {strBaseVideoURL}{i['id']['videoId']}")
                # Need to retrieve a youtube object and filter for the *highest* resolution otherwise we get blurry videos
                if iProcessed >= iStart:
                    yt = YouTube(f"{strBaseVideoURL}{i['id']['videoId']}")
                    yt.streams.filter(progressive=True, file_extension='mp4').order_by('resolution').desc().first().download(filename=f"{strFileName}.mp4")
                    sleep(90)
                iProcessed = iProcessed + 1
    try:
        next_page_token = resp['nextPageToken']
        strYoutubeURL = strStartURL + '&pageToken={}'.format(next_page_token)
        print(f"Now getting next page from {strYoutubeURL}")
    except KeyError:        # the last page has no nextPageToken
        break

# Update config.py with last run date
today = datetime.datetime.now()
with open("config.py", "w") as f:
    f.write("import datetime\n")
    f.write(f"dateLastDownloaded = datetime.datetime({today.year},{today.month},{today.day},0,0,0)")

Honeycomb Spice Storage

There are a few different places selling hexagonal-bottom spice containers with magnetic lids … and maybe the really expensive ones (over a hundred bucks for like a dozen containers?!?) are more than just a magnet under the lid. But the rest? Are literally a hexagonal spice jar with a small neodymium disc magnet attached to the underside of the lid.

Voila — a lot of little spice jars that don’t cost a lot of money. But arranging them all in a solid honeycomb pattern doesn’t work so well. Invariably, you need the spice jar right in the middle of the mass. Instead, I am making a giant circle.

Tableau: Upgrading from 2022.3.x to 2023.3.0

A.K.A. I upgraded and now my site has no content?!? When I tested the upgrade to 2023.3.0 in our development environment, the site was absolutely empty after the upgrade completed. No errors, nothing indicating something went wrong. Just nothing in the web page where I would expect to see data sources, workbooks, etc. The database still had a lot of ‘stuff’, and the disk still had hundreds of gigs of ‘stuff’. But nothing showed up. I experienced this problem starting from both 2022.3.5 and 2022.3.11 and upgrading to 2023.3.0; upgrading to 2023.1.x instead left the site content in place.

I wasn’t doing anything peculiar during the upgrade:

  • Run TableauServerTabcmd-64bit-2023-3-0.exe to upgrade the CLI
  • Run TableauServer-64bit-2023-3-0.exe to upgrade the Tableau binaries
  • Once installation completes, open a new command prompt with Run as Administrator and launch “.\Tableau\Tableau Server\packages\scripts.20233.23.1017.0948\upgrade-tsm.cmd” --username username

The upgrade-tsm batch file upgrades all of the components and database content. At this point, the server will be stopped. Start it. Verify everything looks OK — site is online, SSL is right, I can log in. Check out the site data … it’s not there!

Reportedly this is a known bug that only impacts systems that have been restored from backup. Since all of my servers were moved from Windows 2012 to Windows 2019 by backing up and restoring the Tableau sites … that’d be all of ’em! Fortunately, it is easy enough to make the data visible again. Run tsm maintenance reindex-search to recreate the search index. Refresh the user site, and there will be workbooks, data, jobs, and all sorts of things.

If reindexing does not sort the problem, tsm maintenance reset-searchserver should do it. The search reindex sorted me, though.
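
For reference, the whole recovery is run from that same administrator command prompt:

tsm maintenance reindex-search
REM only if reindexing alone does not restore the site content:
tsm maintenance reset-searchserver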

Kanban for Kids

I find it interesting that Anya’s school had very good “lessons” for taking notes — the teacher had a class where she would talk and the kids took notes. The kids then submitted the notes, and she basically graded them. “This was just a funny story, no need to include it in notes” or “when I mention something three times, it needs to be in your notes. Add ‘whatever got mentioned repeatedly in class’ here”. I won’t say that Anya loved note-taking class, but she did it. And, since kids got to use their notes for their tests … she saw the benefit of having decent notes.

Time management, though? The school seems to take the “throw them in the water; if you don’t drown … well, you can swim!” approach. They assign a bunch of stuff, generally due around the same time for extra fun. And then they don’t say anything to the kid if they’re a month behind. So I found myself explaining Kanban boards to Anya.

We do digital boards at work, but she needs something everyone can see just walking by. So paper cards, magnets, and a white board it is! You make a column for “stuff I am going to need to do” — we call this a backlog. New assignments go here first. We thought color-coding the classes would be cool — so take a square of paper in the class’s color, write the name of the assignment, the date it is due, and how long you guess it will take to complete (1 hour, 1 day, 1 week).

You then have other columns for “in progress”, “done”, and “stuck” — we have additional columns at work because they make sense for what we do: “UAT testing”, “Awaiting Feedback”. You may find there are other columns that make sense for your classes too — I used to have “Researching”, “Draft”, “Editing”, and “Final Draft” columns because everything was a research paper.

At work, we plan a week or two of work — you pick enough cards to fill up the week, and that’s what you are working on. This means our cards could represent a week of work — I’m only going to finish one card this week, but that’s a full week of work. For school work, picking the cards daily kind of makes sense because new assignments pop in all the time. It would be difficult to shoehorn new assignments into an already planned out week.

If the “in progress” items will be picked daily, then a card shouldn’t represent more than a day of work. So “Write the paper” is too generic; it needs to be broken out. “Select topic”, “start research”, “continue research”, “finish research”, “start draft”, “continue draft”, “finish draft”, “review draft”, “edit draft”, and “finalize report” might all be reasonable items to accomplish in a day.

Using this method, you can see when things are due, realize when you have two or three big things due at the same time (so some are going to need to be finished early), and can keep track of anything where you are stuck (had to ask the teacher for clarification, waiting for a book to be available from the library, etc).

If nothing else, she seems happy about the “moving it to the completed column” bit!

Amazon Luna Black Friday Scam

This is really silly — we ordered a Luna controller on Black Friday because it was on sale, with the idea of it being a gift for when Anya finishes school for the half. I specifically did not associate the controller with our account. Today, we noticed that the “free trial” had already been activated.

Trying to talk to Amazon didn’t really go anywhere — I couldn’t even get the fellow to understand why it was problematic to automatically start a free trial (before the item even arrives) during a gift-giving season, when people are generally buying something weeks before it will be opened. Evidently there’s no such thing as a Luna controller plus 30 days of Luna+ bundle — there’s a controller and a free trial. And the free trial starts when you place your order — not when you get the controller, not when you start using it. Ultimately, they say you’ve got to call back during M-F 9-5 kind of hours, when someone from Luna is available.

Not something I’m available to deal with today, but a very odd practice for a company that size. Like no one thought through the logic on this one?!? Buy it on Black Friday, 30 days later is December 27th. So people getting Christmas gifts have three days to check it out?

If they’re offering a free trial to anyone who signs up, why bundle it with the controller??? I could have gotten the controller and just signed up for the trial the day the present would be given.