QZ qz thoughts
a blog from Eli the Bearded
Tag search results for 2020,~blog Page 2 of 4

Sitemap Plugin


About a month ago I posted about finding a lot of Blosxom plugins on github. I've been looking at some of them. There is a family of them, one original and a few modifications to the original, for enabling comments. I have not gotten those to work: the best I've managed is that I can leave comments, but not see them on pages. I may end up writing my own, so that it follows my idea of what is needed for comments.

But also in that batch of plugins was one for Google Sitemap. The documentation is nonexistent in the repo. Searching the web I did find blog posts from the author, in Japanese. From those I gather the version in the github repo is the old, memory-intensive way to build a sitemap. I didn't find the new improved version.

I decided to make do. The gsitemap plugin is dead simple. It just sets a few variables to use in templates, and then when the sitemap flavour is desired, disables pagination. The rest of the magic happens in the flavour templates.

As part of making do, I'm not going to reference or link to the URL that generates the output (if you are reading and want to use this for your own site, the gsitemap plugin in the original configuration would generate it for https://example.com/blosxom/index.xml, assuming /blosxom/ is the root of your Blosxom blog).

Instead I've reconfigured the templates to generate a fragment of a sitemap XML file, changed the flavour to sitemap from xml, and have scripted up a sitemap builder for the whole qaz.wtf site that curls the proper URL and includes() the xml fragment. Blosxom then can remain the source of truth for blog permalinks while find and some per-directory configuration can build URLs for other parts of the site.
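A minimal sketch of what the wrapping half of such a builder could look like. This is not the actual script: the function name, the fragment URL, and the output path are all made up for illustration, and it assumes the fragment is nothing but a run of <url> elements.

```shell
#!/bin/sh
# Hypothetical helper: reads sitemap <url> fragments on stdin and
# writes a complete sitemap document to stdout.
wrap_sitemap() {
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    cat    # pass the fragment(s) through unchanged
    printf '%s\n' '</urlset>'
}

# Assumed usage; the URL and the output path are placeholders:
#   curl -fsS 'https://example.com/blosxom/index.sitemap' | wrap_sitemap > sitemap.xml
```

Fragments from other sources (say, a find over static pages) could be concatenated onto the same stdin, which keeps Blosxom responsible only for the blog's own permalinks.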

I decided I should run that script from SAVE-DATES.sh under the theory that any time I save post timestamps is a likely time I want to rebuild the sitemap. This works for qaz.wtf because the blog is the only thing updating more frequently than monthly, and I typically run SAVE-DATES.sh shortly after posting an entry.

This is all prompted by looking (again) at just how much bot traffic the site gets. I figure a sitemap will stop well-behaved bots from crawling as much or as frequently. And for non-well-behaved bots, I've belt and suspendered things by adding entries to robots.txt and more user-agents to my browser_block plugin.

Similarly, in the name of improving search engine interaction, I've got a new (trivial) plugin called extrameta. It gets used by other plugins, namely the newly modified tags plugin and pagination plugin, to add a <meta name="robots" content="noindex"> header (in a naive way) to search result pages, to avoid duplicated content.

May Updates


A bunch of small changes this first week of May.

  1. The qzpostfilt tool has a bug fix with inline <code> and a new close & open paragraph operator: .pp
  2. There is a search box on the blog to run tag searches
  3. The aaa_tags and paginateqz plugins have slight changes so that search results and later pages are marked as such
  4. There are flavour template and css changes to go with that
  5. I have finished going through all old blog posts and fixing the easily fixable links.
  6. I have gone through all the old blog posts and fixed non-UTF-8 characters and title formatting.

Landfill


Landfill: Notes on Gull Watching and Trashpicking in the Anthropocene by Tim Dee. Copyright 2018, first printing February 2019.

I've read a bit about garbage, most recently Waste and Want by Susan Strasser (1999, but apparently used as a textbook, so easy to find new), so I thought this might be good to get some fresher information. The title, and subtitle, certainly pulled me in.

No. This is a British author writing a lot about his personal experiences, often as a reporter following more serious bird watchers than him. There are, it seems, a fair number of bird watchers who specialize in watching sea gulls. In many cases these people hang around landfills and transfer stations because the gulls like the easy pickings.

Every chapter is essentially a self-contained essay with at least some tangential connection to gulls. There's one, for example, that compares and contrasts Hitchcock's The Birds to the original short story, with some attention devoted to the gulls in each. It's not what I wanted, but it's not a bad book.

When I found this in a bookstore (the famous "City Lights Books" in San Francisco, which I was visiting with some out-of-town house guests), I was drawn to the title and picked it up to read a couple of pages from inside. I happened upon chapter eight "London Labour and London Poor". This is one of the least gull-ish chapters, but also one of the most interesting to my tastes.

That chapter is about Henry Mayhew's London Labour and the London Poor, published in three volumes in 1851 (based on 1840s work writing a newspaper column; volume four came out in 1862). It is at Wikipedia, with volume 1, volume 2, and volume 3 at Gutenberg, but apparently no volume four. Mayhew interviewed and wrote about the most marginal people of the time. The excerpt that made me buy the book:

Trash has a deep and determining place in Mayhew's cosmology. Waste management, in its widest sense, is vital to the story. This begins with the lowest class (Mayhew calls them low but was clearly sympathetic to such people). They endeavoured to eke out scraps for a penny or two from what others had decided was useless. Contemplating such lives and such labour makes Mayhew ask big questions. When do objects — or people — cease to have value?

There are dustmen in Mayhew — men in the vanguard of professional waste collection. But they were far outnumbered by informal rubbish collectors. On these people Mayhew performs a kind of rescue anthropology. He describes them as if they were members of a ramshackle federation:

  • Bone grubbers and rag-gatherers
  • Pure-finders
  • Cigar-end finders
  • Old wood gatherers
  • Dredgers, or river finders
  • Sewer-hunters
  • Mudlarks
  • Dustmen, nightmen, sweeps, and scavengers

"Pure" is dog shit. Its name alone indicates our classificatory anxiety about its status. It was sold to tanneries, where it was used to cleanse and purify leather. In London, 200 to 300 men were "engaged solely in this business." A covered basket and a glove were required, though many dispensed with the glove, "as they say it is much easier to wash their hands than to keep the glove fit for use." There were even those who worked fakes and passed "mortar" off as pure.

That's great reading. The connection to gulls for this chapter? How the presence of so many and so varied human trash pickers squeezed the gulls out of the easy trash-pickings niche.

Cherry Picking


One thing I have seen time and time again is people using two different branches of code for staging and production and then letting those branches slowly drift further and further apart.

I can fully understand staging should exist for testing stuff before you put it in production, but there should be a steady march of stuff that goes into staging and then maybe a few pieces get backed out, then all of staging syncs to production. Without the regular sync-ups, staging ceases to be an effective place to test things because it's not "production plus just these small changes" but instead "kinda like production, but with this, that, and some other differences."
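One rough way to put a number on that drift, as a sketch: the helper name below is invented, and it assumes both branches exist locally. It leans on git rev-list with --left-right --count, which reports how many commits are reachable from only one side of the pair.

```shell
#!/bin/sh
# Hypothetical helper: count commits unique to each of two branches.
# Prints "<only-in-first><TAB><only-in-second>".
branch_drift() {
    git rev-list --left-right --count "$1...$2"
}

# Assumed usage, with prod and master as the two branches:
#   branch_drift prod master
```

This counts by commit ID, so a change that was cherry-picked across still shows up on both sides; git cherry compares by patch content instead, which is why it is the better tool for finding what is genuinely missing.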

But sometimes you have to deal with the hand you're dealt and can't ask for a reshuffle. Which brings us to git cherry-pick.

xkcd 1597

If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of "It's really pretty simple, just think of branches as..." and eventually you'll learn the commands that will fix everything.

I understand the "graph view" of git quite well. I still find the command line view of git to be a PITA. But it's a truism that people explaining Git get caught up in that graph. Here's a page that came up in the first page of search results for git cherry-pick and provides a perfect example.

Once executed our Git history will look like:
    a - b - c - d - f   Master
         \
           e - f - g Feature

Yes, but how do I actually use git cherry-pick?

So with the background that my current most common environment has an "upstream" that represents the real (in-use) repo, a personal fork called "origin" (author dev), and a main (staging) branch of "master" and a production branch of "prod", here's a workable example of using cherry picking.

# If you have the same sort of setup, but haven't yet added prod
# locally, start with:
git checkout --track origin/prod

# Now switch to the prod branch
git checkout prod

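# Bring this prod branch up to date with the upstream prod.
# The rebase replays any local-only prod commits on top of
# upstream's; it does not discard them. If you really want an
# exact copy of upstream prod, use "git fetch upstream" followed
# by "git reset --hard upstream/prod" instead.
git pull --rebase upstream prod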

# Now find some commit IDs that are in master but not prod
# and filter those with grep (the grep is optional, but may
# be very helpful). "master" here is a branch you've actually
# committed stuff to, and tested those commits there.
git cherry -v prod master | grep commit-description

# Armed with commits you want to apply, create a branch off
# your current prod and switch to it
git checkout -b PROD-CHERRYPICK

# Then use "git cherry-pick" to apply the commits to that branch.
# If the diffs can be applied cleanly, this will completely
# "commit" the differences. If there are conflicts, you'll
# need to manually edit the files and select which set of
# changes is correct.
git cherry-pick commithash

# Changes applied, you can push that branch of prod to origin
# and make a pull request for origin PROD-CHERRYPICK to
# upstream prod
git push -u origin PROD-CHERRYPICK

Your workflow might not involve branching your origin "prod" for the changes, but it's a little cleaner and allows you to put a meaningful name on the branch. At $WORK, those meaningful names are by convention the Jira ticket names. It's a little awkward because I'll have a branch for PROJECT-NNN, apply that to master, then have to delete that branch and create a new one for my prod changes. The policy exists for the benefit of the computer code tracking my work, though. And poor simple computers are not great at subtlety, so that's that.
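To try the flow without touching a real repo, here is a throwaway sandbox sketch. Everything in it is illustrative: the temp directory, the file names, the commit messages, and the lack of any origin or upstream remote, so the fetch and push steps are left out.

```shell
#!/bin/sh
# Throwaway demo of picking one commit from master onto prod.
# All names here are made up for the example.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name
git config user.email demo@example.com
git config user.name demo

# A shared starting point, then prod splits off from master.
echo base > base.txt
git add base.txt
git commit -q -m 'base'
git branch prod

# Two commits land on master (staging) only.
echo one > one.txt; git add one.txt; git commit -q -m 'feature one'
echo two > two.txt; git add two.txt; git commit -q -m 'feature two'

# List what master has that prod lacks, and keep the ID of the
# one commit we want ("git cherry -v" prints: sign, ID, subject).
pick=$(git cherry -v prod master | grep 'feature two' | cut -d' ' -f2)

# Branch off prod, switch to the new branch, apply just that commit.
git checkout -q -b PROD-CHERRYPICK prod
git cherry-pick "$pick" >/dev/null

# two.txt arrived, one.txt did not.
ls
```

If a pick hits a conflict, git stops partway: after editing the files and running git add on them, git cherry-pick --continue finishes the commit, and git cherry-pick --abort backs out entirely. Adding -x to the cherry-pick appends a "(cherry picked from commit ...)" line to the message, which can help when tracing a prod commit back to master later.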