Here's my five latest blog posts - or you can browse a complete archive of all my posts since 2008.

Migrating from Blogger to GitHub Pages

So, you might have noticed I’ve done a bit of decorating… welcome to the 2019 incarnation of dylanbeattie.net. If you’re interested in how I migrated ten years’ worth of blog posts onto a Jekyll site hosted on GitHub Pages, read on. If you’re not, there’s funny videos over on /music that you might enjoy.

Still here? Cool! OK, so the very first version of dylanbeattie.net was a site I built in PHP back in 1999 or so, which I hosted on my own physical server – way back in the days when both of those things were still considered a pretty neat idea. About ten years ago I started using Blogger, and before long I scrapped the old PHP site, pointed the main domain at my Blogger site, and just sort of got on with it.

Blogger’s been a pretty good platform, but these days I find I actually need a website to do a lot more than just host blog posts and the occasional ‘about me’ page, so a few months back I started looking around for alternatives. I’d considered setting up something like Umbraco, or a custom WordPress site, but even those felt a bit like over-engineering for what I actually needed to do. My requirements basically boiled down to:

  • A really easy way to post articles – text and links with some simple formatting and the occasional image
  • The option to use rich HTML pages and custom layouts
  • Preserving the URIs of all my existing posts and pages – remember, cool URIs don’t change. Besides, I’ve got a decade’s worth of Google-fu invested in those pages. It’d suck if all those links and bookmarks stopped working.
  • Something that looked good, and a responsive layout that worked well across devices.

There’s also a couple of things I decided I definitely didn’t want:

  • Comments. It’s been years since I saw a decent discussion in the comments thread of a blog post. We have Twitter and Reddit for that now.
  • Bootstrap. I’m sure it’s lovely. I just don’t like it.

Back in January, I was chatting with Todd Gardner about how he created the PubConf website, and he suggested I take a look at GitHub Pages and a thing called Jekyll. I spent an evening playing around with it, and was basically hooked.

Jekyll and GitHub Pages

Note: the code for this site is available on GitHub at github.com/dylanbeattie/dylanbeattie.net

So here’s how it works. You write your site in HTML or Markdown, and create datasets in YAML. Jekyll compiles them into a static HTML site. It supports a templating language called Liquid, which lets you create templates, conditionals, navigation – you can actually do some pretty sophisticated stuff with it, but everything happens at build time. Which is actually very cool.

The really great part is that GitHub Pages has built-in support for Jekyll – so you build your site locally, using bundle exec jekyll serve to view it, then you just push the repo to GitHub and it’ll build and deploy it for you. It’s a beautifully simple workflow, but one that affords a surprising amount of flexibility once you get the hang of it.

Here’s some fun little things I learned along the way, and a couple of gotchas to watch out for if this inspires you to try something similar.

Design and layout

The design for my new site is the Arcana template by HTML5 UP, which is not only beautiful, responsive and really well put together, but is also released – like all HTML5 UP’s templates – under a Creative Commons Attribution 3.0 License:

“…which means you can use them for personal stuff, use them for commercial stuff, change them however you like … all for free, yo. In exchange, just give HTML5 UP credit for the design and tell your friends about it :)”.

Which is lovely. You see, I’m quite happy doing my own design work – and that’s the problem. I enjoy it. I get sidetracked. I spend hours – days – tweaking layouts and playing around with things, when I should be writing content and fixing bugs… and I end up with something that’s OK, but nothing like as good as some of the amazing templates and layouts that are available online, for free. So credit (and thanks!) to HTML5 UP for the design. I’ve made a couple of tweaks and added a few extra bits, but the layouts, grid, responsive design, navigation, typography – that’s all them. Oh, and it’s all built on SASS – which is also supported by GitHub Pages. Did I mention how much I love this platform?

Migrating old blog posts

This turned out to be a lot easier than I expected. There’s a Ruby gem designed to do exactly this – check out the Jekyll documentation and this blog post from Kris Rice which goes into a bit more detail about it.

Once I’d migrated all the posts from my old Blogger site, I wanted to make sure all the old page URLs would still work. I didn’t want to mess too much with Jekyll’s conventions about structuring posts, though. Jekyll names post files in a YYYY-MM-DD-title format and serves them at /YYYY/MM/DD/title addresses, whereas Blogger uses /YYYY/MM/title. So I wanted to adopt the Jekyll convention for new posts, to make things easier, while preserving the addresses of all my old posts so bookmarks and links still work.

Turns out the jekyll-migrate gem adds some lines to the top of each migrated HTML file – what Jekyll refers to as ‘front matter’ – that can help:

---
layout: post
title: How to *really* break the internet.
date: '2016-03-23T12:08:00.000Z'
author: Dylan Beattie
tags: 
modified_time: '2016-03-23T14:06:37.095Z'
blogger_id: tag:blogger.com,1999:blog-7295454224203070190.post-8996121207980120854
blogger_orig_url: http://www.dylanbeattie.net/2016/03/how-to-really-break-internet.html
---

There’s also a Jekyll plugin called redirect-from, which is part of the github-pages bundle (and, yes, it works a little like the infamous COMEFROM flow control statement beloved of sadistic esolang designers). So all I had to do was trawl through every file in the _posts folder, find that line with the blogger_orig_url in it, parse the path part out of the URL, and add a line underneath like this:

blogger_orig_url: http://www.dylanbeattie.net/2016/03/how-to-really-break-internet.html
redirect_from: "/2016/03/how-to-really-break-internet.html"

(Sorry, awk fans – I actually did this with a global search & replace in VS Code…)
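If search & replace isn't your thing, the same edit is easy to script. Here's a minimal Python sketch of the idea, assuming every migrated post has a single blogger_orig_url line in its front matter like the one above. The names add_redirect_from and process_posts are made up for this example; they're not part of Jekyll or the redirect-from plugin.

```python
from pathlib import Path
from urllib.parse import urlparse


def add_redirect_from(post_text: str) -> str:
    """Add a redirect_from line under the blogger_orig_url line.

    Hypothetical helper: assumes the front matter layout shown above.
    Posts that already have a redirect_from line are left unchanged.
    """
    lines = post_text.splitlines()
    already_done = any(line.startswith("redirect_from:") for line in lines)
    out = []
    for line in lines:
        out.append(line)
        if line.startswith("blogger_orig_url:") and not already_done:
            # Everything after the first colon is the original Blogger URL.
            url = line.split(":", 1)[1].strip()
            # Keep only the path part, e.g. /2016/03/some-title.html
            out.append(f'redirect_from: "{urlparse(url).path}"')
    return "\n".join(out) + "\n"


def process_posts(folder: str = "_posts") -> None:
    """Rewrite every migrated HTML post in the folder in place."""
    for path in Path(folder).glob("*.html"):
        original = path.read_text(encoding="utf-8")
        path.write_text(add_redirect_from(original), encoding="utf-8")
```

Running process_posts() from the site root would rewrite the files in place, so it's worth committing first so you can diff the result.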

That’s it – Jekyll now hosts all those pages at their new /2016/03/23/how-to-really-break-internet.html addresses, and requesting the original URL returns this:

<!DOCTYPE html>
<html lang="en-US">
<meta charset="utf-8">
<title>Redirecting&hellip;</title>
<link rel="canonical" href="/2016/03/23/how-to-really-break-internet.html">
<script>
  location="/2016/03/23/how-to-really-break-internet.html"
</script>
<meta http-equiv="refresh" content="0; url=/2016/03/23/how-to-really-break-internet.html">
<meta name="robots" content="noindex">
<h1>Redirecting&hellip;</h1>
<a href="/2016/03/23/how-to-really-break-internet.html">Click here...</a>
</html>

It’s not quite perfect – an HTTP 301 Moved Permanently would strictly speaking be a better way of handling this – but hey, perfect is the enemy of good enough. It works. Ship it.

Events schedule and flags

You see that little sidebar there, with all the events I’m speaking at and the flags in it of the countries I’m gonna be visiting? The flag images are from GoSquared, and available under an MIT license. To display them in the schedule, I’ve used a SASS list to generate CSS rules for each flag in the set:

/* _flags.scss: generate .flag-xx CSS classes for elements with flag backgrounds */

$countries: _abkhazia, _basque-country, _british-antarctic-territory, _commonwealth, 
  _england, _gosquared, _kosovo, _mars, _nagorno-karabakh, _nato, _northern-cyprus, 
  _olympics, _red-cross, _scotland, _somaliland, _south-ossetia, _united-nations, _wales, 
  AD, AE, AF, AG, AI, AL, AM, AN, AO, AQ, AR, AS, AT, AU, AW, AX, AZ, BA, BB, BD, BE, BF, 
  BG, BH, BI, BJ, BL, BM, BN, BO, BR, BS, BT, BW, BY, BZ, CA, CC, CD, CF, CG, CH, CI, CK, 
  CL, CM, CN, CO, CR, CU, CV, CW, CX, CY, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, EH, ER, 
  ES, ET, EU, FI, FJ, FK, FM, FO, FR, GA, GB, GD, GE, GG, GH, GI, GL, GM, GN, GQ, GR, GS, 
  GT, GU, GW, GY, HK, HN, HR, HT, HU, IC, ID, IE, IL, IM, IN, IQ, IR, IS, IT, JE, JM, JO, 
  JP, KE, KG, KH, KI, KM, KN, KP, KR, KW, KY, KZ, LA, LB, LC, LI, LK, LR, LS, LT, LU, LV, 
  LY, MA, MC, MD, ME, MF, MG, MH, MK, ML, MM, MN, MO, MP, MQ, MR, MS, MT, MU, MV, MW, MX, 
  MY, MZ, NA, NC, NE, NF, NG, NI, NL, NO, NP, NR, NU, NZ, OM, PA, PE, PF, PG, PH, PK, PL, 
  PN, PR, PS, PT, PW, PY, QA, RO, RS, RU, RW, SA, SB, SC, SD, SE, SG, SH, SI, SK, SL, SM, 
  SN, SO, SR, SS, ST, SV, SY, SZ, TC, TD, TF, TG, TH, TJ, TK, TL, TM, TN, TO, TR, TT, TV, 
  TW, TZ, UA, UG, US, UY, UZ, VA, VC, VE, VG, VI, VN, VU, WF, WS, YE, YT, ZA, ZM, ZW;

@each $country in $countries {
  .flag-#{to-lower-case($country)} {
    background-image: url(images/flags/flat/64/#{$country}.png);
  }
}

The actual schedule is a YAML file, and Jekyll generates the markup with the correct CSS classes based on the country codes from schedule.yml. It’s easy to add new dates. The only thing it won’t do is automatically archive old ones – ‘cos hey, static content, remember? – but I reckon I can live with that for now.

(And yes, you can absolutely invite me to speak at your event by editing schedule.yml and sending me a PR. That would be really cool. :) )

Speaker bios

I’ve got a bunch of different speaker bios over at my about me page, and I really wanted a way to pick one and copy it to the clipboard as either HTML, Markdown or plain text – different events use different formats when you’re submitting to their CFP or filling out speaker details, and it’s a little thing that would make life easier.

The bios themselves are stored as multiline Markdown snippets in /_data/speaker_bios.yml. Instead of allowing Jekyll to just render Markdown > HTML automatically, I’ve got this little loop in the code for the about me page:

{% for bio in site.data.speaker_bios %}
<article>
  <hr />
  <h3>
    <span class="clipboard-links">
      <a href="#" data-src-id="{{bio.id}}-html">copy html</a>
      <a href="#" data-src-id="{{bio.id}}-markdown">copy markdown</a>
      <a href="#" data-src-id="{{bio.id}}-text">copy text</a>
    </span>
    {{ bio.word_count }} Word Bio ({{ bio.char_count }} characters)
  </h3>
  {{ bio.content | markdownify }}
  <input type="hidden" id="{{bio.id}}-markdown"
    value="{{ bio.content }}" />
  <input type="hidden" id="{{bio.id}}-html"
    value="{{ bio.content | markdownify | escape_once }}" />
  <input type="hidden" id="{{bio.id}}-text"
    value="{{ bio.content | markdownify | strip_html }}" />
</article>
{% endfor %}

So each snippet is rendered to the page as HTML (via the markdownify filter), and also captured in three hidden inputs – one Markdown, one HTML, one plain text. Finally, there’s some JavaScript attached to the ‘copy xxx’ links that’ll put the matching hidden input’s value into an invisible textarea and copy it to the clipboard.

Syntax Highlighting

One of the great things about writing posts and pages in Markdown is that it’s so easy to embed code samples - just wrap them in three backticks either side, something known as a fenced code block. Jekyll has a code formatting plugin called Rouge, that’s included with the github-pages bundle, which allows you to specify a language for your code snippets and it’ll highlight them for you:

``` html
<p>This HTML snippet will get <a href="http://rouge.jneen.net/">highlighted</a></p>
```

To get this to work, I had to enable highlighting in _config.yml:

markdown: kramdown
# enable rouge syntax highlighting
highlighter: rouge

I also had to add a stylesheet to the site with the various coloring rules - all Rouge does is wrap all the code keywords, etc. in <span> tags with CSS classes on them, so I found this syntax.css file on GitHub, dropped it into my site’s CSS, and it worked.

Gotchas

Pretty much everything I tried to do with Jekyll and GitHub Pages worked as documented, but I did hit two weird gotchas that caused a fair bit of head-scratching until I figured out what was going on…

Case sensitive filesystems

I’ve done most of the dev work for this site locally, on macOS 10.14, using bundle exec jekyll serve to preview and test things. And when I finally pushed the whole thing up to GitHub and switched on the GitHub Pages feature, everything worked – except the flag images in the schedule sidebar. And it took me a good couple of hours to figure out what was going on.

My original flags.scss file looked like this:

$countries: ad, ae, af, ag, ai, al...

@each $country in $countries {
  .flag-#{$country} {
    background-image: url(images/flags/flat/64/#{$country}.png);
  }
}

which generated a bunch of CSS rules that looked like this:

.flag-ad { background-image: url(images/flags/flat/64/ad.png); }
.flag-ae { background-image: url(images/flags/flat/64/ae.png); }

Now, if you look closely at the files in the GoSquared flagset I’m using, you’ll notice the filenames are:

AD.png
AE.png
AF.png

I’d dropped their entire flag set into my project, and pushed the whole thing to GitHub.

Now, I’m guessing that GitHub Pages is hosted on Linux. And Linux uses case-sensitive filesystems, whereas macOS – where I’d been testing everything – by default uses a filesystem that’s case-insensitive but ‘case preserving’. In other words, if you ask macOS for ae.png, it’ll happily give you AE.png, but on Linux, ae.png and AE.png are different files.

So my CSS rule was telling my browser to ask GitHub Pages for ae.png, and GitHub’s Linux servers are going “nope. Not found.” – ‘cos the only file they’ve got is AE.png, which is COMPLETELY DIFFERENT. Obviously.

I could see two ways to fix this. One was to rename all the files in the GoSquared collection… but it turns out it’s surprisingly fiddly to rename a file from FILENAME to filename on macOS. The other, easier way is just to uppercase all the ISO country codes in the SCSS list. I threw a to-lower-case($country) into the class name because, well, I think uppercase CSS rules are vulgar. The result looks like this:

$countries: AD, AE, AF, AG, AI, AL...

@each $country in $countries {
  .flag-#{to-lower-case($country)} {
    background-image: url(images/flags/flat/64/#{$country}.png);
  }
} 
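
One way to catch this sort of thing before pushing is to compare the filenames your CSS asks for against the exact names on disk. The following is a minimal Python sketch of that check; find_case_mismatches is a made-up name for illustration, not something Jekyll or GitHub Pages provides.

```python
def find_case_mismatches(requested, on_disk):
    """Return (requested, actual) pairs that differ only by letter case.

    These load fine from a case-insensitive filesystem, but come back
    as "not found" from a case-sensitive one.
    """
    # Index the real filenames by their lowercased form.
    by_lower = {name.lower(): name for name in on_disk}
    mismatches = []
    for name in requested:
        actual = by_lower.get(name.lower())
        if actual is not None and actual != name:
            mismatches.append((name, actual))
    return mismatches
```

The on_disk list could come from os.listdir("images/flags/flat/64"), which reports each name with the case it was saved with, even on macOS.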

DNS, HTTPS, Cloudflare and GitHub Pages

The last piece of the puzzle was to get the whole site using dylanbeattie.net (no www) as the canonical URL, and to get everything running over HTTPS. I host all my DNS with Cloudflare, because it’s free and works really well – and, like a lot of people, I’d relied on Cloudflare’s HTTP+DNS proxy service for a while to provide HTTPS for my old Blogger site (check out Troy Hunt’s Here’s Why Your Static Website Needs HTTPS for more on why this is a good idea.)

I wanted to use GitHub Pages to enforce HTTPS, but when I switched the Cloudflare DNS for dylanbeattie.net to point to GitHub’s servers, I couldn’t switch on the option to enforce HTTPS – instead, I got this error:

Enforce HTTPS — Unavailable for your site because your domain is not properly configured to support HTTPS

and when browsing my new site, I got a warning that:

This server could not prove that it is dylanbeattie.net; its security certificate is from www.github.com. This may be caused by a misconfiguration or an attacker intercepting your connection.

After about a day of head-scratching – change a DNS setting, wait six hours to see if anything happens, change something else, repeat – I contacted GitHub Support.

Turns out that Cloudflare’s DNS+HTTP Proxy feature actually interferes with the certificate issuing mechanism used to support HTTPS on GitHub Pages. Before it issues a certificate, GitHub looks up your domain’s DNS records to check that they point at GitHub Pages. But if you’ve enabled Cloudflare’s HTTP proxy feature (which is on by default, even on their free plan), Cloudflare answers that DNS query with its own server addresses, so GitHub can’t see that your domain is pointing at GitHub Pages.

Logged into Cloudflare, switched the records for dylanbeattie.net over to DNS only, and boom – certificate was issued within the hour and everything was up and running.
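
If you want to see what the rest of the world gets when it looks up your domain, a few lines of Python will do it. This is a minimal sketch, not anything GitHub provides: the function names are made up for illustration, and the four addresses are the A records GitHub's documentation lists for apex domains at the time of writing, so treat them as an assumption and check the current docs.

```python
import socket

# Assumption: GitHub Pages A records for apex domains (check GitHub's docs).
GITHUB_PAGES_IPS = {
    "185.199.108.153",
    "185.199.109.153",
    "185.199.110.153",
    "185.199.111.153",
}


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def points_at_github_pages(addresses):
    """True if there is at least one address and all of them are GitHub Pages IPs."""
    found = set(addresses)
    return bool(found) and found <= GITHUB_PAGES_IPS
```

With the proxy switched on, you'd expect points_at_github_pages(resolve_ipv4("your-domain.example")) to come back False, because the lookup returns Cloudflare's addresses rather than GitHub's.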

Posted by Dylan Beattie on 14 August 2019 • permalink

Saint-Petersburg, Russia - Tips for Tech Travellers

As I'm heading back to Saint-Petersburg for DotNext 2019, this seemed like a nice moment to repost something I wrote after my first visit to Russia back in 2017.

1. As a guest in Russia, it is vitally important to keep moving at all times. This is because if you stop moving for more than, say, fifteen seconds, your hosts will assume this is because you have run out of roast pork and sausages. It is impossible to sit down at a table in Saint Petersburg without somebody serving you a massive plate of meat. My hosts explained that the Russian phrase for 'no, thank you' is 'nyet, spasiba', but based on my experience I think this literally translates as "please bring me roast duck and cabbage now, and after that some more sausages".

2. Sausages in Russia occupy the culinary niche that is normally reserved for, say, some carrots in Western cuisine. You will find sausages chopped in a salad, boiled, fried, steamed, and served as accompaniments to all sorts of things.

3. Saint Petersburg is BIG. The average street here is wider than most London postcodes. If you have to walk three blocks, wear good shoes and take a bottle of water with you. Monuments and war memorials here are built to such a scale I can only assume they are intended to leave civilisations in neighbouring star systems in no doubt as to the nobility and sacrifice of the Russian military. We should give a special mention to the churches, which are not only huge, but we can conclude from the style of their decorated domes and minarets that the builders thought God had a bit of a thing for cupcakes.

4. Riding the Saint Petersburg metro will seem uncannily familiar to anybody who has read Jules Verne's "Journey to the Centre of the Earth". After buying your metro token from the ticket machine and passing through one of the metal detector arches - for which you are not required to empty your pockets or anything, meaning that they go off constantly and are consequently ignored by everybody including the police - you step onto an escalator approximately fourteen miles in length. The tunnels and station interchanges suggest that tunnelling machines in the former Soviet Union were available in two sizes - Extra Large and Stupidly Extra Large - and the interchanges are so big that the connected stations have different names. If, say, the Red Bull Air Race ever needed to be held indoors due to inclement weather, the pedestrian interchange tunnel at Spasskaya/Sadovaya would provide an ideal venue.

5. For typography enthusiasts, the Cyrillic alphabet is an absolute delight... except when it comes to handwriting. Cursive Cyrillic is a minefield of hilarity and ambiguity. Any doctors who feel they have exhausted the possibilities of the Latin alphabet when it comes to writing illegible prescriptions will find Cyrillic a rich seam of possibility.

6. The Russian people are lovely and friendly... once you get used to the fact that a Russian telling you a joke will initially sound like they're interrogating you about some war crimes you may or may not have committed. It helps to keep a Polish person with you, since they seem to know the correct point to start laughing, thus giving a handy cue to the slightly baffled English speakers participating in the conversation.

7. An English person trying to speak Russian is the funniest thing that has ever happened. The Russian equivalent of the Edinburgh Festival consists entirely of English people attempting to pronounce the names of Saint Petersburg metro stations whilst the audience drink vodka and roar with laughter.

8. Vodka must be served no warmer than -273.1499 degrees Celsius. To offer someone vodka that is merely refrigerated could cause a serious diplomatic incident.

9. Most consonants in Russian have a 'hard' and a 'soft' pronunciation, which, like tonal Cantonese or the tongue-clicks of the Khoisan language family, is completely impenetrable to foreigners. It is very important, however: based on my attempts to speak Russian to waiters, I have concluded that the elusive hard vs soft 'T' sound must be the difference between saying "no thank you, I have eaten so much food I think I need to go to the hospital" and "could I please have some more roast pork and boiled sausages".

10. There are a lot of Chinese tourists in Saint Petersburg. Local regulations prohibit them from travelling in groups of fewer than fifty. If you arrive to check in to your hotel moments after a Chinese tour party has arrived, you may wish to pass the time whilst you wait by reading the collected works of Dostoyevsky or walking to Vladivostok and back.

11. Every single car in Russia has a dashboard camera recording video footage of the journey, presumably so that when your cab gets cut up by another one and causes a six-car pile-up, the driver can pay the repair bills by sharing the crash footage on YouTube and hoping it goes viral. Most cab drivers keep their radio tuned to the local high-energy Europop station, so when they do inevitably have a massive crash, the resulting YouTube footage already has the appropriate soundtrack.

In short... it was AWESOME. The city is beautiful and vast and unlike anywhere I have ever been, the people could not have been more welcoming and friendly; arrival and departure was an absolute breeze thanks to the brand new, hyper-modern airport terminal building, and the metro is a great way to get around (and clearly signed in English throughout.) And if you don't fancy jumping through the administrative hoops of getting a Russian visa, you can visit SPB on a cruise ship from Tallinn or Helsinki and stay for up to 72 hours without having to get a visa, which is kinda cool.

Just don't stay up drinking vodka the night before your 5am departure. Trust me on this. :)

Posted by Dylan Beattie on 14 May 2019 • permalink

A release notes bookmarkdownlet for Pivotal Tracker

One of the best ways to keep the rest of your team up to speed with what your dev teams are doing is release notes - even if all you're doing is gently reassuring the rest of the organisation that yes, you are patching security vulnerabilities, fixing bugs and quietly making things better.

I love using Slack for this - set up a channel where you post a friendly summary of everything that's being released whenever you deploy to production. Now, here at Skills Matter we're not quite doing continuous deployment yet - we work off short-lived branches that merge to master several times a day, but then once master has passed regression testing on our staging environment, the actual deployment to production is a manual process - we open a master > production pull request in GitHub, merge it, and Heroku does the rest.

We track work in progress using Pivotal Tracker, and this is what the various ticket state transitions mean for us:

  • Start > a developer has created a branch and begun coding the features
  • Finish > the code is done; time for code review
  • Deliver > the code has been reviewed, merged to master and deployed to the staging environment
  • Accept > the code has been tested on staging; stakeholders know it's ready, and we're good to go live.

Now, this definitely isn't the best branch/merge strategy in the world, but it's the one we've inherited and the one we're using until we've made enough changes to the codebase to be able to deploy PRs directly to review environments.

So when we do a production release, one of the things I do is to check which features are included in that release - that'll form a note that's part of the master > production pull request, and we'll also share it with the company via Slack. And this is a bit tedious, so today I threw together a little JavaScript bookmarklet that'll automate it for you.

Select the stories you want in Pivotal Tracker, click the bookmarklet, and it'll copy them to your clipboard as Markdown-formatted bullet points with the story IDs linked to your Pivotal project.
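
The formatting step itself is simple. As a rough illustration, here it is in Python rather than the bookmarklet's JavaScript. This is not the bookmarklet's actual code: the function name release_notes is made up, and the story URL pattern is an assumption about how Pivotal Tracker links to individual stories.

```python
def release_notes(stories):
    """Format (story_id, title) pairs as Markdown bullet points.

    Each story ID is linked to its story page. The URL pattern below is
    an assumption; adjust it to match your own project.
    """
    lines = []
    for story_id, title in stories:
        url = f"https://www.pivotaltracker.com/story/show/{story_id}"
        lines.append(f"* [#{story_id}]({url}) {title}")
    return "\n".join(lines)
```

The bookmarklet does the extra work of reading the selected stories out of the page and putting the result on the clipboard; the sketch only covers turning a list of stories into Markdown.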

The JS code is here - add a bookmark, paste this whole lot (including the javascript:) into the URL field:


And here's what it looks like in action:






Posted by Dylan Beattie on 22 March 2019 • permalink

PostgreSQL, Heroku, .NET Core and npgsql

I've been having fun this week building data visualisation dashboards that pull information straight out of our PostgreSQL databases and use ASP.NET Core to do some LINQ transformations and aggregation on the data. All our data is hosted on Heroku, I'm using Npgsql as an ADO.NET Data Provider, and it works beautifully - once you've worked out the exact connection string syntax needed to connect to a Heroku PostgreSQL database using ASP.NET Core.

So here it is. You'll need to get the host, port, username, password and database from your Heroku dashboard - and don't forget that if you're connecting from apps that aren't attached to Heroku directly, you'll need to manually update the configuration if you rotate your Heroku database credentials.

const string herokuConnectionString = @"
    Host=<host.domain.com>;
    Port=<port>;
    Username=<user>;
    Password=<password>;
    Database=<database>;
    Pooling=true;
    Use SSL Stream=True;
    SSL Mode=Require;
    TrustServerCertificate=True;
";
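
Heroku also exposes the same details as a single postgres:// URL in the DATABASE_URL config var, so another option is to build the connection string from that instead of copying each value by hand. Here's a minimal sketch of the conversion, written in Python purely for illustration (the same logic ports easily to C#). The function name is made up, and the fixed options simply mirror the connection string above.

```python
from urllib.parse import unquote, urlparse


def npgsql_connection_string(database_url: str) -> str:
    """Turn postgres://user:password@host:port/database into Npgsql syntax.

    Hypothetical helper: the SSL and pooling options are copied from the
    connection string shown above, not looked up from anywhere.
    """
    parts = urlparse(database_url)
    fields = [
        ("Host", parts.hostname),
        ("Port", parts.port or 5432),
        # Credentials in a URL may be percent-encoded, so decode them.
        ("Username", unquote(parts.username or "")),
        ("Password", unquote(parts.password or "")),
        ("Database", parts.path.lstrip("/")),
        ("Pooling", "true"),
        ("Use SSL Stream", "True"),
        ("SSL Mode", "Require"),
        ("TrustServerCertificate", "True"),
    ]
    return "; ".join(f"{key}={value}" for key, value in fields) + ";"
```

One thing this sketch doesn't handle is a password containing a semicolon, which would need quoting in the connection string.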

Happy querying!

Posted by Dylan Beattie on 20 November 2018 • permalink

The Horrors Lurking in your Legacy Codebase

We’ve all come across design patterns, right? Common solutions to common problems, more specific than a language or a platform, less prescriptive than a component or a framework. The pattern movement originated in building architecture, and in the years since the Gang of Four published their groundbreaking work Design Patterns: Elements of Reusable Object-Oriented Software, we’ve seen patterns embraced right across the spectrum of software development. We’ve got architectural patterns and infrastructure patterns and organisational patterns. We’ve even got anti-patterns.

Patterns are intentional. They have purpose. Even anti-patterns generally involve some element of premeditation. But patterns aren’t the only things in software where you’ll see familiar structures and characteristics playing out across different systems and organisations… There are also the emergent phenomena. The creeping horrors that we’ve all unwittingly summoned in the course of our illustrious careers. Things that don’t happen on purpose, that nobody ever set out to do, and yet which keep spontaneously manifesting in organisations all over the world. Here, for your education and entertainment, I present a bestiary: a field guide to the monsters and the creeping horrors that are lurking somewhere in your IT systems.

1. The Reliquary

The reliquary is that one repository full of really good ideas. Clean code. Brilliant algorithms. The OpenID implementation that you optimised until it shone. Classes so beautifully designed and perfectly documented that they’d make a senior architect weep.

You remember the big rewrite? The project that was going to fix everything, only you never worked out how to actually launch the thing, or get any revenue from it? The reliquary is where you’ve preserved it, pickled in revision control like a fabulous museum specimen. A treasury of good code and good ideas; maybe even an entire codebase that was “a couple of weeks” away from shipping before somebody finally looked at the number of critical features the team had somehow forgotten to include and discovered — to everybody’s surprise — that validated XHTML, normalised data models and 95% test coverage are not actually features any of your end users cared about. Like Buran or the Spruce Goose, the surviving artefacts stand as a testament to the quality of your engineering… and a poignant reminder of just how much fun engineers can have building high-quality stuff that nobody actually wants to use.

2. The Doctor Gonzo

Named for the attorney in Hunter S. Thompson’s ‘Fear and Loathing in Las Vegas’, the Doctor Gonzo is that application that’s “too weird to live, and too rare to die”. It’s written in Visual Basic 6, or Delphi, or maybe even Microsoft Access. Your dev team has to keep a couple of antique-grade virtual machines around to fix the occasional show-stopping bug — with instructions that the machines are absolutely not to be Windows Updated on pain of immediate defenestration.

Of COURSE you’ve tried to replace it. Team after team, project after project has come up with a plan, hired some contractors, captured some requirements, and shipped a couple of prototypes. And, as inevitably as night follows day, they have failed… all of their engineering brilliance powerless against the unholy triumvirate of bureaucracy, Stockholm syndrome and undocumented use cases.

In fifty years’ time when we’re all running genetic algorithms on bioengineered quantum hardware that eschews physical user interfaces in favour of superimposing consciousness patterns directly into our brains by inducing cross-dimensional electrical fields in a neighbouring parallel universe, at least one company will have made a fortune creating a post-singularity hosting environment for running Visual Basic 6 line-of-business applications in the quantum realm.

3. The Epic of Gilgamesh

You know this one. It started out as a simple database query — something that pulled out the sales figures for the last quarter. Then somebody tweaked it to account for currency fluctuations. Somebody else cross-referenced it against website traffic logs. Somebody else added a half-dozen LEFT OUTER JOIN statements so you could find out which web browsers the customers who created the accounts who raised the invoices that generated the revenue were using.

Sometime around 2008, the SQL query in question surpassed Queen’s Bohemian Rhapsody in length and scope. By 2012 it was longer than Beowulf - and about as readable. It now stands as one of the great literary epics of our generation, a heartbreaking work of insane genius that is as incomprehensible as it is breathtaking.

4. The Chasm of Compliance

“It is a truth universally acknowledged that a profitable organisation in possession of a rigorous security policy must be ignoring a few things”
— Jane Austen.

Some friends of mine used to work in the software division of a company that made scientific instruments. Big government clients, universities, hospitals, research laboratories. As part of the conditions of doing business with these kinds of clients, they had a very strict policy that forbade sending email attachments — which was backed up with a set of firewall rules that would block incoming and outgoing mail with any files attached to it.

If you needed to share a file with a colleague who was working from home or away on a business trip, standard operating procedure was to attach it to a draft email in a Hotmail mailbox and then invite your colleague to download it and delete it when they were done. I believe they learned this particular trick from al-Qaeda. It satisfied the letter of the policy — after all, nobody ever SENT an email with any confidential material attached to it — and it got around the firewall blocks and restrictions.

You see that chasm? That gulf between ‘playing by the rules’ and ‘getting caught’? In there is a rich, thriving ecosystem of free-tier AWS accounts, Dropbox folders full of Excel spreadsheets and SSH proxies running on port 53 so the firewall thinks they’re DNS queries.

It would terrify you how much of your operating revenue comes out of that chasm.

5. The Shibboleth

Once upon a time, there was The Password. Being entrusted with The Password was a rite of passage. The Password was the nuclear launch codes, the keys to the city. Maybe it was the root password to the production web server. Maybe it was the sa password to the main database stack, or the master account password for the COBOL mainframe that handled all the financial records.

Of course, in these enlightened times, we have much, much better ways to restrict access to our systems. Federated authentication, PEM keys and one-time codes and 2FA. Single sign-on, cross-linked to Sarbanes-Oxley auditing mechanisms so sensitive that if you so much as exhale whilst you’re logged in to production, it’ll analyse your breath content and record the fact that you had a beer with lunch, just in case they ever need to throw you under a bus in court.

Naturally, when you rolled out your new GDPR-compliant single sign-on, you changed the password. Of course you did. And then, when every single piece of software in your organisation went into a screaming panic, you immediately changed it back. And somewhere, there’s a backlog of the applications that need to be updated before you can change The Password. You’ve done the easy ones, obviously. But you haven’t got around yet to remoting into the VM where the Doctor Gonzo (qv) resides, and even if you could, you’ve no idea what algorithm the developers used to encrypt the connection string before pasting it into the INI file that’s eventually got transplanted into the Windows registry. And besides, you’re going to be replacing the Doctor Gonzo any day now, so you can change the password once that’s done.

The shibboleth is a powerful incentive to ensure that your tech staff leave on good terms. They know full well that if they give you any reason to doubt their integrity and trustworthiness following their departure, it’ll be a lot cheaper and easier just to have them killed than it will be to change the master password.

6. The Masquerade Column

This one’s a doozy. Somewhere in your organisation there’s a text column in a database. It was designed for your staff to make notes. It’s probably called something like ‘Comment’ or ‘Description’. 250+ characters of beautiful, unvalidated text. And, for many years, that’s exactly what it was used for… until that one fateful afternoon, when a couple of developers were sat around trying to plan a feature. And one of them said ‘can we add a field to the database to store the order type?’ And somebody else said ‘we COULD — but then we’d need to update the stored procedures and regression test any apps that are using column indexes and… it’ll turn a three-point ticket into a couple of weeks of work’.

At which point some bright spark says ‘hey, what if we use the Comment field? We can parse it, and if there’s a pipe character in there, anything after the pipe indicates the order type. And we’ll just tell the sales team not to edit anything in there that looks funny.’

Of course, it’s not always a pipe. Sometimes it’s a comma-separated list of product IDs that your code knows to split, parse and translate back into orders, which it then uses to populate the invoices. Or maybe you decided a comma was too obvious so you used a backtick instead. Clever, huh? Sometimes, if you’re REALLY smart, you’ll use a newline character because you know full well that the UI element that’s bound to the field is a single-line textbox and so your users won’t see the hidden data you’ve stashed on line 2.

Note that if you’ve gone as far as storing actual JSON or XML in a free text field, that’s not really a masquerade any more, that’s just creative repurposing. The whole point of the masquerade is that your code has to run half-a-dozen different parsing and validation routines on that little piece of text before deciding whether it’s doing anything special or not.
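If you've never met one in the wild, here's roughly what the pipe version of the trick looks like. This is only a sketch: the names `ParsedComment` and `parse_comment` are made up for illustration, and I'm assuming that only the first pipe counts and that an empty tail means "just an ordinary comment". Real masquerades are rarely this tidy.

```python
from typing import NamedTuple, Optional


class ParsedComment(NamedTuple):
    """Hypothetical result type: the visible note plus the smuggled order type."""
    note: str
    order_type: Optional[str]


def parse_comment(raw: str) -> ParsedComment:
    """Split a free-text comment into the human note and a hidden order type.

    Assumption: anything after the FIRST pipe character is the order type.
    """
    note, pipe, hidden = raw.partition("|")
    if not pipe or not hidden.strip():
        # No pipe, or nothing after it: treat the whole thing as a plain comment.
        return ParsedComment(raw.strip(), None)
    return ParsedComment(note.strip(), hidden.strip())


# Usage: one ordinary comment, one that is quietly carrying an order type.
plain = parse_comment("Customer prefers morning delivery")
masked = parse_comment("Leave with neighbour|RUSH")
print(plain.order_type)   # None
print(masked.order_type)  # RUSH
```

Note that every app reading this column now needs its own copy of this logic, which is how the half-a-dozen parsing routines come about.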

Back in the glorious days of the first dotcom bubble, I once proposed using the description field on a table to store a SQL statement that needed to be generated when an order was raised, but not actually executed until the order was confirmed. I may have used the phrase ‘continuation passing’ to make it sound impressive. Fortunately the client pulled the plug on the entire project before it went anywhere.

7. The Folly

This is probably linked from your homepage. It definitely features prominently in your sales literature. Maybe you even ran a campaign about it. It’s a massively complex feature that was designed, built, shipped… and in the five years since it went live, it’s been used by exactly nine people. The folly is normally built as a sweetener. Like the senator who won’t sign off on a nuclear power bill unless they put a clause in it about reducing the excise duty on liquor sold in golf clubs, it was some weirdly specific scenario that one of your key stakeholders was absolutely hellbent on pushing through to production. Eventually, your team gives up trying to sell them on the idea of an MVP, and builds the whole damn thing just to shut them up.

To be a genuine folly, the system in question needs to be so inextricably wired into the rest of your software platform that shutting it down would be a major undertaking in its own right, and so it quietly purrs along in production, using up a couple of hundred bucks a month of cloud hosting and not really hurting anybody.

8. The Slow Loris

The slow loris is a nocturnal primate that’s indigenous to south-east Asia. Slow lorises have big round eyes and curious furry faces, and grow to about 30 cm long. They’re small, they’re cute, they’re not remotely threatening. And if you touch them, they can literally kill you.

Does that remind you of anything in your software? Sure it does. You know. That one innocent-looking database table that if somebody adds a row to it the entire website crashes. The button in your intranet dashboard that says something like ‘download CSV’ and if anybody clicks it the main database server instantly hits 100% CPU and stops accepting any more connections for a good twenty minutes or so. The DLL that has to be installed into C:\Users\Temp\ (because Reasons) and if it’s not there it’s a toss-up as to whether the phone starts ringing before the website goes down, or immediately afterwards.

That file that can’t possibly be doing anything? Probably best not to touch it. No matter how cute and fluffy it might look.

9. The Phantasm

Somewhere in your organisation, there’s a piece of critical physical infrastructure that is so well hidden it’s indistinguishable from magic. We call this the phantasm. And a true phantasm always starts out with somebody trying to make things look nice.

Back in the days of wired networks, a really good phantasm was hard to accomplish, because you could always follow the wires. But in these days of wireless networks, it’s the easiest thing in the world to stick a wi-fi access point up above a ceiling tile somewhere and forget about it. Ten years later, your successors will thank you when they get asked to find out why the internet has stopped working. After furiously arguing for three or four hours that the internet NEVER worked because there are clearly no wi-fi points anywhere on that floor, they end up ripping the entire ceiling down in a desperate attempt to work out what’s going on, and find a dust-bunny the size of a basketball with the last few inches of a wi-fi antenna forlornly poking out of it.

10. The Paradox

The paradox is a rare and beautiful artefact in modern software systems. It’s something that can’t possibly exist according to its own rules, and yet it does. The classic paradox is a table full of customers with no email address, in a system that (a) defines a customer as invalid unless the email address is populated, and (b) rejects any update to objects that are not valid. This can lead to HOURS of fun chasing your own tail up and down the aggregate graph trying to work out why you’ve managed to get seventeen validation errors without changing a single value.
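To make the classic version concrete, here's a minimal sketch of how rules (a) and (b) can lock each other up. Everything in it is hypothetical (`Customer`, `update_customer`, `InvalidCustomerError`), and it assumes rule (b) is enforced by validating the object as it currently stands before any changes are applied, which is one plausible way to end up here rather than the only one.

```python
from dataclasses import dataclass, replace
from typing import List


class InvalidCustomerError(Exception):
    """Raised when somebody tries to update a customer that is not valid."""


@dataclass(frozen=True)
class Customer:
    name: str
    email: str = ""  # legacy rows were imported before email was mandatory

    def validation_errors(self) -> List[str]:
        # Rule (a): a customer is invalid unless the email address is populated.
        return [] if self.email.strip() else ["email address is required"]


def update_customer(customer: Customer, **changes) -> Customer:
    # Rule (b): reject any update to an object that is not valid.
    # The check runs against the EXISTING state, before the changes are applied.
    errors = customer.validation_errors()
    if errors:
        raise InvalidCustomerError("; ".join(errors))
    return replace(customer, **changes)


# Usage: the one update that would fix the record is the one that gets rejected.
legacy = Customer(name="Ada")
try:
    update_customer(legacy, email="ada@example.com")
except InvalidCustomerError as err:
    print(f"Cannot update: {err}")
```

The record can't be saved until it's valid, and it can't become valid until it's saved. Checking the proposed new state instead of the current one would break the loop, assuming you can find every place the check lives.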

There are other paradoxes as well. There’s the server that appears to be responding to pings despite the fact you’ve switched it off, removed the power supply and disconnected all the hard drives. The misconfigured IAM security policy that eventually turns out to serve absolutely no purpose other than preventing it from knowing about itself. The dashboard that’s reporting 100% availability because every single other system in the organisation has failed, including the systems that are supposed to report downtime to the dashboard aggregator.

A really good paradox will vanish without trace when you begin investigating it, like some sort of quantum phenomenon that can only exist until an observer attempts to measure it. This shouldn’t be confused with a Heisenbug, which is a bug that only becomes invisible whilst actively being observed. No, the paradox doesn’t just hide — it vanishes, leaving you sat in a retrospective meeting insisting that you really saw it whilst your team-mates look at you with raised eyebrows and wonder whether it’s time you took some holiday. The experienced paradox hunter doesn’t even run a database query until they’ve loaded Camtasia with silver bullets and started a screen recording.

Survival Tips for Aspiring Code Archaeologists

As you explore the canyons and catacombs of an unfamiliar codebase, keep your eyes peeled, for these are just a few of the weird and wonderful creatures you’ll find lurking in the dark places. You may seek to understand them. You may even seek to catalogue a few of your own — but be wary as you study their ways, for to familiarise yourself with these beasts is to walk a narrow and dangerous path. For as the great systems architect Nietzsche once said:

“Beware that, when fighting monsters, you yourself do not become a monster… for when you gaze long into the abyss, the abyss gazes also into you, and when you truly understand the legacy codebase, only then will you realise you yourself have become part of the legacy.”
Posted by Dylan Beattie on 29 August 2018