QZ qz thoughts
a blog from Eli the Bearded
Tag search results for plugin Page 1 of 3

Sitemap Plugin

About a month ago I posted about finding a lot of Blosxom plugins on GitHub. I've been looking at some of them. There is a family of them, one original and a few modifications to the original, for enabling comments. I have not gotten those to work: the best I've gotten is that I can leave comments, but not see them on pages. I may end up writing my own, so that it follows my idea of what is needed for comments.

But also in that batch of plugins was one for Google Sitemap. The documentation is nonexistent in the repo. Searching the web I did find blog posts from the author, in Japanese. From those I gather the version in the GitHub repo is the old, memory-intensive way to build a sitemap. I didn't find the new improved version.

I decided to make do. The gsitemap plugin is dead simple. It just sets a few variables to use in templates, and then when the sitemap flavour is desired, disables pagination. The rest of the magic happens in the flavour templates.

As part of making do, I'm not going to reference or link to the URL that generates the output (if you are reading and want to use this for your own site, the gsitemap plugin in the original configuration would generate it for https://example.com/blosxom/index.xml, assuming /blosxom/ is the root of your Blosxom blog).

Instead I've reconfigured the templates to generate a fragment of a sitemap XML file, changed the flavour to sitemap from xml, and have scripted up a sitemap builder for the whole qaz.wtf site that curls the proper URL and includes() the xml fragment. Blosxom then can remain the source of truth for blog permalinks while find and some per-directory configuration can build URLs for other parts of the site.
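The builder script isn't shown here, but its general shape can be sketched. This is a Python illustration only, not the actual script: the function names, the example.com URLs, and the assumption that the Blosxom fragment is a run of <url> elements are all made up for the example.

```python
from urllib.request import urlopen
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def fetch_fragment(url):
    """Fetch the sitemap fragment that Blosxom generates (the "curl" step)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def url_entry(loc, lastmod=None):
    """Build one <url> element for a page that lives outside the blog."""
    parts = ["  <url>", "    <loc>%s</loc>" % escape(loc)]
    if lastmod:
        parts.append("    <lastmod>%s</lastmod>" % escape(lastmod))
    parts.append("  </url>")
    return "\n".join(parts)


def build_sitemap(blog_fragment, other_urls):
    """Wrap the blog fragment plus other site URLs in a single <urlset>.

    other_urls is a list of (loc, lastmod) pairs; lastmod may be None.
    ASSUMPTION: blog_fragment is already well-formed <url> elements.
    """
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="%s">' % SITEMAP_NS]
    lines.extend(url_entry(loc, lastmod) for loc, lastmod in other_urls)
    lines.append(blog_fragment.strip("\n"))
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"
```

In this sketch the (loc, lastmod) pairs for the non-blog parts of the site would come from something like a find over the document root, and fetch_fragment() would be pointed at whatever URL serves the sitemap flavour.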

I decided I should run that script from SAVE-DATES.sh under the theory that any time I save post timestamps is a likely time I want to rebuild the sitemap. This works for qaz.wtf because the blog is the only thing updating more frequently than monthly, and I typically run SAVE-DATES.sh shortly after posting an entry.

This is all prompted by looking (again) at just how much bot traffic the site gets. I figure a sitemap will stop well-behaved bots from crawling as much or as frequently. And for non-well-behaved bots, I've belt and suspendered things by adding entries to robots.txt and more user-agents to my browser_block plugin.

Similarly, in the name of improving search engine interaction, I've got a new (trivial) plugin called extrameta that gets used by other plugins, namely the newly modified tags and pagination plugins, to add a <meta name="robots" content="noindex"> header (in a naive way) to search result pages, to avoid duplicated content.
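The extrameta idea is small enough to sketch. This is a Python illustration rather than the real (Perl) plugin, and every name in it (ExtraMeta, add_meta, render, the is_search and page parameters) is hypothetical: other plugins register extra header lines, and the head template interpolates whatever was collected.

```python
NOINDEX = '<meta name="robots" content="noindex">'


class ExtraMeta:
    """Collects extra <head> lines that other plugins ask for."""

    def __init__(self):
        self._lines = []

    def add_meta(self, line):
        # Naive on purpose: only exact duplicates are skipped,
        # with no parsing or merging of the tags themselves.
        if line not in self._lines:
            self._lines.append(line)

    def render(self):
        """Text to interpolate into the head template."""
        return "\n".join(self._lines)


def tags_plugin(extrameta, is_search):
    # A tag search result page repeats content found on permalink pages.
    if is_search:
        extrameta.add_meta(NOINDEX)


def pagination_plugin(extrameta, page):
    # ASSUMPTION: "later pages" means page 2 and up.
    if page > 1:
        extrameta.add_meta(NOINDEX)
```

Both callers can ask for the same tag without it being emitted twice, which is about all the coordination a naive approach needs.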

May Updates

A bunch of small changes this first week of May.

  1. The qzpostfilt tool has a bug fix with inline <code> and a new close & open paragraph operator: .pp
  2. There is a search box on the blog to run tag searches
  3. The aaa_tags and paginateqz plugins have slight changes to make search results and later pages marked as such
  4. There are flavour template and css changes to go with that
  5. I have finished going through all old blog posts and fixing the easily fixable links.
  6. I have gone through all the old blog posts and fixed non-UTF-8 characters and title formatting.

Plugin Bonanza

I just noticed someone made an effort to hunt down every Blosxom plugin ever (seemingly ever) and stick them in one GitHub repo.

The current official site for Blosxom has a long list, now filled with 404 errors, but not as long as this one. I see someone else wrote a tags plugin. Unfortunately the documentation for that appears to be in Japanese, so I can't be sure how it works without studying the code. There are also a few commenting plugins. I will be looking at those, too.

Improved Tags Plugin

First major revision to the aaa_tags plugin. Several new features with this.

  1. AND search for multiple tags. Use ^ to separate tags.
  2. OR search for multiple tags. Use | to separate tags.
  3. NOT search for excluding tags. Prefix with ~ to exclude.
  4. All of those can be (crudely) combined for a complex search. Use , to combine them.
  5. Multiple uses of the tag= CGI parameter are allowed; if each is an individual word, they will be considered an OR list.
  6. UTF-8 tags are now allowed.
    Tags can be ASCII letters, numbers, hyphen and underscore, plus any Unicode codepoints above U+00A0 (A0 is non-breaking space; that is not allowed in tags).
  7. If there are stories before filter() but the tag search has removed them all by the end of filter(), then story() will (attempt to) display an error page. Due to the way date filtering works in blosxom, story() may not get the chance. But it works for date unfiltered stuff.
  8. New template for that error: aaa_nothing_left
  9. New interpolatable variables.
    • Configuration parameters $aaa_tags::top_count and $aaa_tags::threshold
    • $aaa_tags::frequent_tags has a list of tags, much like the old $aaa_tags::top_tags, but alpha-sorted and a list of all tags with more than $aaa_tags::threshold uses.
    • $aaa_tags::this_search_terse has a value suitable for stuffing into a tag= CGI parameter for repeating the current search. Can be used anywhere.
    • $aaa_tags::this_search_verbose has a text description of the search, only for use in the error template.
    • $aaa_tags::this_search_table has a table of usage frequencies of the tags in the current search, only for use in the error template.
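To make the search syntax concrete, here is a rough sketch of how such a query could be parsed and matched. It is Python for illustration, not the plugin's Perl, and the function names are invented. The biggest guess is labeled in the code: the list above only says "," combines searches, so treating comma-separated clauses as conditions that must all hold is an assumption, as is handling ~ only as a prefix on a single tag.

```python
import re

# ASCII letters, digits, underscore, hyphen, plus anything above U+00A0.
TAG_RE = re.compile(r"[A-Za-z0-9_\u00A1-\U0010FFFF-]+")


def is_valid_tag(tag):
    return TAG_RE.fullmatch(tag) is not None


def parse_clause(clause):
    """Turn one clause into (mode, tags); mode is 'all', 'any', or 'not'."""
    if clause.startswith("~"):
        return ("not", [clause[1:]])
    if "^" in clause:
        return ("all", clause.split("^"))
    return ("any", clause.split("|"))


def parse_search(values):
    """values: the tag= parameter values, in the order they were given."""
    # Several single-word tag= values are treated as one OR list.
    if len(values) > 1 and all(is_valid_tag(v) for v in values):
        return [("any", list(values))]
    clauses = []
    for value in values:
        # ASSUMPTION: comma-separated clauses must all hold.
        clauses.extend(parse_clause(c) for c in value.split(",") if c)
    return clauses


def matches(post_tags, clauses):
    """True if a post with these tags satisfies every clause."""
    tags = set(post_tags)
    for mode, wanted in clauses:
        if mode == "all" and not tags.issuperset(wanted):
            return False
        if mode == "any" and tags.isdisjoint(wanted):
            return False
        if mode == "not" and not tags.isdisjoint(wanted):
            return False
    return True
```

Under these assumptions a value like 2003|2004,~deadlink would mean "tagged 2003 or 2004, and not tagged deadlink". The character class in TAG_RE starts at U+00A1, so the non-breaking space at U+00A0 is rejected along with ordinary spaces.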

This is a lot of new code. There are probably bugs lurking in all of this. I've only done cursory testing, but I've been very happy with the results.

Some example searches:

  • Reviews from 2020
  • 2003, 2004, 2005, 2006, 2007, and 2008 posts without "deadlink" tag (but see below)
  • Posts from 2020 not tagged with "blog"
  • Blosxom or "administrivia" posts, as two tags
  • UTF-8 tag, unique to this specific post.
  • A search that will return no results.

Testing this has made me realize (a) I've got a real problem with bad "deadlink" tagging, as in posts that should have that tag don't; and (b) the deadlink test was kinda flawed in that it only looked for 200 responses, so sites that redirected http: to https: got mistakenly flagged. I need to fix both of those issues before the results of the example deadlink search reflect what really should be called deadlinks.
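One way the second problem could be handled is to classify responses instead of just testing for 200. The sketch below is Python and purely illustrative: the function name and the category labels are made up, and it assumes the status code and Location header come from a request that did not follow redirects.

```python
from urllib.parse import urljoin, urlsplit

REDIRECTS = (301, 302, 303, 307, 308)


def classify(url, status, location=None):
    """Classify one HTTP response (redirects not followed) for a link check."""
    if 200 <= status < 300:
        return "ok"
    if status in REDIRECTS and location:
        old = urlsplit(url)
        # Location may be relative, so resolve it against the original URL.
        new = urlsplit(urljoin(url, location))
        same_page = (old.hostname == new.hostname
                     and (old.path or "/") == (new.path or "/")
                     and old.query == new.query)
        if old.scheme == "http" and new.scheme == "https" and same_page:
            # Same page, just moved to https: the link is alive.
            return "https-upgrade"
        # Redirected somewhere else: could be alive or a parked domain,
        # so it probably deserves a second look rather than a verdict.
        return "redirect"
    # Everything else, including a redirect with no Location, counts as dead.
    return "dead"
```

With something like this, only the "dead" results would earn a deadlink tag, and the "redirect" ones could be followed or reviewed by hand.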