QZ qz thoughts
a blog from Eli the Bearded
Tag search results for blosxom Page 1 of 4

So, Why Blosxom?


Although it was moderately popular when new, calling Blosxom a dead blog tool now is fairly accurate. No one is using it for new sites and many former power users (people dedicated and involved enough to write plugins) have abandoned the platform. So it probably bears answering "Why do I still want it?"

Here are some of the things Blosxom has going against it:

  1. No active development to lean on for community improvements.
  2. Somewhat simplistic hook model for plugins.
  3. Very rudimentary interpolation engine.
  4. Very easy to accidentally change posting time on posts.
  5. Without plugins, lacks many features considered standard now:
    • Comments
    • Post composer
    • Search
    • Search engine tools like sitemap support
    • Cookies for analytics, user preferences, and/or user logins

The main selling points Rael Dornfest had for Blosxom, as I remember it, were:

  1. Edit posts without a post composer.
  2. Import and export of posts is trivial since they are all just individual text files.
  3. Small code base with easy install on your own server.
  4. Simple to create plugins.

Most of those are not things I think people appreciate. GUI composers are very common these days, some more WYSIWYG than others, but having buttons for bolding, dialogs for links, etc., seems to be a thing people want. And maintaining code and installing things on an Internet server seem to be things people don't want. You can get started in Tumblr in seconds after getting an email address and an Internet connection. Finding somewhere to install a Perl script, configuring it to work with a web server, and then asking "how do you add images?" is too much.

So you've got deliberate features people don't care about, and drawbacks people will quickly notice. Blosxom is a hard sell these days.

But for me, it is what I have always done. My first forays into web page construction were composed in vi, served by NCSA HTTPd, and viewed in Mosaic on university computers. From there, I moved to a Unix shell account on an ISP by 1996. I had my own personal colocated server serving content on my own personal domain name by 1997, and was saturating a T1 at times by 1998. All of that original work was 100% my HTML and CGI coding. I wrote a CGI library in 1999 that I still use for some personal projects.

For work, I've used blogging tools like Movable Type and WordPress. I've used content management systems like Plone and Drupal. I've stored content in Berkeley DB files, MySQL, and Postgres. I've worked with content accelerators like Varnish, Fastly (which is basically Varnish managed by someone else), and Akamai. I understand how and why to scale horizontally or vertically in Internet deployments.

I like coming back to the simplicity of knowing how every bit of the page gets transmitted from the first line of the HTTP request to the closing </BODY> tag in the HTML. That's what I get out of Blosxom. It is tiny and knowable. I have to do more work to enable things, but I know what that work is or how to find out. Until I did it for the sitemap tool last week, I had never actually built a sitemap, only parsed them for site scraping. It wasn't a hard task to get from deciding to do it to having it completed, even if it was a task that wouldn't be necessary with other tools.

Last month, May 2020, qaz.wtf moved 21,650,398,268 bytes in web content. That's without headers, and works out to an average of 8083 bytes per second all month long. Most of that (68.9%) was from the Unicode Toys, which are configured to send compressed-on-the-fly 100% text HTML generated by CGI scripts I've 100% written. The second biggest top level item was /favicon.ico, sadly a binary file with a crappy name because Microsoft invented that concept. Third was this blog (including images and CSS). If I were using WordPress, I doubt the pages would be as small, and if I were using Medium the JavaScript alone per page would be more than my blog HTML content put together.
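
The per-second average can be checked with a few lines of arithmetic. This is only a sketch in Python (not a language the site itself uses); the byte total and the 68.9% share are the figures quoted above, and the variable names are made up for the example.

```python
# Figures quoted in the post: bytes of web content served in May 2020,
# not counting HTTP headers.
total_bytes = 21_650_398_268

# May has 31 days.
seconds_in_may = 31 * 24 * 60 * 60

# Average transfer rate over the whole month.
average_bytes_per_second = total_bytes / seconds_in_may

# Portion attributed to the Unicode Toys (68.9% per the post).
unicode_toys_bytes = total_bytes * 0.689

print(f"{seconds_in_may} seconds in the month")
print(f"{int(average_bytes_per_second)} bytes per second on average")
print(f"{unicode_toys_bytes / 1e9:.1f} billion bytes from the Unicode Toys")
```

Dividing the total by the 2,678,400 seconds in a 31-day month gives a little over 8083 bytes per second, which matches the quoted average.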

I'm happy controlling it all and knowing where my byte budget goes.

Sitemap Plugin


About a month ago I posted about finding a lot of Blosxom plugins on github. I've been looking at some of them. There is a family of them, one original and a few modifications to the original, for enabling comments. I have not gotten those to work: the best I've gotten is that I can leave comments, but not see them on pages. I may end up writing my own, so that it follows my idea of what is needed for comments.

But also in that batch of plugins was one for Google Sitemap. The documentation is nonexistent in the repo. Searching the web I did find blog posts from the author, in Japanese. From those I gather the version in the github repo is the old, memory-intensive way to build a sitemap. I didn't find the new improved version.

I decided to make do. The gsitemap plugin is dead simple. It just sets a few variables to use in templates, and then when the sitemap flavour is desired, disables pagination. The rest of the magic happens in the flavour templates.

As part of making do, I'm not going to reference or link to the URL that generates the output (if you are reading and want to use this for your own site, the gsitemap plugin in the original configuration would generate it for https://example.com/blosxom/index.xml, assuming /blosxom/ is the root of your Blosxom blog).

Instead I've reconfigured the templates to generate a fragment of a sitemap XML file, changed the flavour to sitemap from xml, and have scripted up a sitemap builder for the whole qaz.wtf site that curls the proper URL and includes() the xml fragment. Blosxom then can remain the source of truth for blog permalinks while find and some per-directory configuration can build URLs for other parts of the site.
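
For readers who want to try something similar, here is a minimal sketch of that kind of builder. It is written in Python purely for illustration and is not the actual script: the function names and the example.com URLs are hypothetical, and it assumes the blog's sitemap flavour returns a ready-made run of <url> elements. Only the urlset wrapper and its namespace come from the sitemap protocol itself.

```python
from urllib.request import urlopen
from xml.sax.saxutils import escape

# Namespace required on <urlset> by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def fetch_fragment(url):
    """Fetch the <url>...</url> fragment rendered by the blog (hypothetical URL)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def url_entry(loc):
    """Render one <url> element for a page that lives outside the blog."""
    return f"  <url><loc>{escape(loc)}</loc></url>"


def build_sitemap(blog_fragment, other_urls):
    """Wrap the blog's fragment and the other site URLs in a single <urlset>."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    lines.extend(url_entry(u) for u in other_urls)
    lines.append(blog_fragment.strip("\n"))
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Offline demonstration with a canned fragment instead of a network fetch.
fragment = "  <url><loc>https://example.com/blog/post.html</loc></url>"
sitemap = build_sitemap(
    fragment,
    ["https://example.com/", "https://example.com/tools/?a=1&b=2"],
)
print(sitemap)
```

In real use the fragment would come from fetch_fragment() pointed at the blog's sitemap flavour, and the list of other URLs from whatever walks the rest of the site. Keeping the blog's entries as an opaque fragment means the builder never has to know how permalinks are formed.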

I decided I should run that script from SAVE-DATES.sh under the theory that any time I save post timestamps is a likely time I want to rebuild the sitemap. This works for qaz.wtf because the blog is the only thing updating more frequently than monthly, and I typically run SAVE-DATES.sh shortly after posting an entry.

This is all prompted by looking (again) at just how much bot traffic the site gets. I figure a sitemap will stop well-behaved bots from crawling as much or as frequently. And for non-well-behaved bots, I've belt and suspendered things by adding entries to robots.txt and more user-agents to my browser_block plugin.

Similarly, in the name of improving search engine interaction, I've got a new (trivial) plugin called extrameta that gets used by other plugins, namely the newly modified tags and pagination plugins, to add a <meta name="robots" content="noindex"> header (in a naive way) to search result pages, to avoid duplicated content.
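
The idea can be sketched in a few lines. This is a Python illustration under stated assumptions, not the extrameta plugin itself (which is a Blosxom plugin in Perl): the tag query parameter matches the blog's tag search URLs, but the page parameter, the $extrameta placeholder, and the function names are all hypothetical.

```python
from urllib.parse import parse_qs

NOINDEX_META = '<meta name="robots" content="noindex">'


def wants_noindex(query_string):
    """Return True for tag searches and for any page after the first."""
    params = parse_qs(query_string)
    is_search = "tag" in params
    # "page" is an assumed parameter name for pagination.
    page = params.get("page", ["1"])[0]
    is_later_page = page.isdigit() and int(page) > 1
    return is_search or is_later_page


def add_extra_meta(head_template, query_string):
    """Naively fill a placeholder in the head template with the meta tag."""
    extra = NOINDEX_META if wants_noindex(query_string) else ""
    return head_template.replace("$extrameta", extra)


print(add_extra_meta("<head>$extrameta</head>", "tag=blosxom"))
print(add_extra_meta("<head>$extrameta</head>", ""))
```

A plain string substitution like this is about as naive as it gets, but it is enough to keep search results and later pages out of an index while leaving permalinks and the front page alone.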

May Updates


A bunch of small changes this first week of May.

  1. The qzpostfilt tool has a bug fix with inline <code> and a new close & open paragraph operator: .pp
  2. There is a search box on the blog to run tag searches
  3. The aaa_tags and paginateqz plugins have slight changes to make search results and later pages marked as such
  4. There are flavour template and css changes to go with that
  5. I have finished going through all old blog posts and fixing the easily fixable links.
  6. I have gone through all the old blog posts and fixed non-UTF-8 characters and title formatting.

UTF-8 Fixes


It looks like there will only be documentation fixes to the bug I filed about the open pragma not working with the FileHandle module. The pragma lets you specify the expected encoding of all files being open()ed. The FileHandle module provides an alternative syntax for open(). The original blosxom code used that alternative syntax. So depending on the code path, the expected encoding was one thing or another.

I've patched blosxom again to not use FileHandle and to specify UTF-8 for all input. Net result is two fewer lines of code, because I can use "goes out of scope" to automatically close file handles now. That didn't work with one global $fh. One of Rael Dornfest's original points of pride about Blosxom was how few lines of code were used to implement it, so I feel good about this patch.

(Most people don't care about how many lines of code are used; particularly when it means skimping on features. But it certainly worked for me.)

Along with this change, my /qz/plugins/flavour_dir was modified significantly to match. It originally was based on Blosxom's own template reader, so it had the same use FileHandle issue. I made slight changes to a couple of other plugins to ensure UTF-8 handling, too, but those are minor.

Now I can have a link to the UTF-8 post like so:

/qz/blosxom?tag=utf-8✓
without having the UTF-8 check character come out as mojibake.
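
To see what the mojibake looks like, the sketch below takes the same tag and decodes its UTF-8 bytes with the wrong encoding. It is a Python illustration rather than anything from the Blosxom code, and the choice of Windows-1252 as the wrong encoding is an assumption made for the example.

```python
from urllib.parse import quote, unquote

# The tag from the link above: "utf-8" followed by U+2713 CHECK MARK.
tag = "utf-8\u2713"

# The check mark takes three bytes in UTF-8 (E2 9C 93).
utf8_bytes = tag.encode("utf-8")

# How the tag travels in a URL, and how it decodes when read back as UTF-8.
percent_encoded = quote(tag)
round_trip = unquote(percent_encoded)

# The same bytes read as Windows-1252 instead: one character per byte.
mojibake = utf8_bytes.decode("cp1252")

print(percent_encoded)
print(round_trip == tag)
print(mojibake)
```

When every layer agrees on UTF-8 the tag round-trips cleanly. When one layer reads the bytes as a single-byte encoding, the three bytes of the check mark turn into three unrelated characters.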