The Horrors Lurking in your Legacy Codebase
Posted by Dylan Beattie on 29 August 2018
We’ve all come across design patterns, right? Common solutions to common problems, more specific than a language or a platform, less prescriptive than a component or a framework. The pattern movement originated in building architecture, and in the years since the Gang of Four published their groundbreaking work Design Patterns: Elements of Reusable Object-Oriented Software, we’ve seen patterns embraced right across the spectrum of software development. We’ve got architectural patterns and infrastructure patterns and organisational patterns. We’ve even got anti-patterns.
Patterns are intentional. They have purpose. Even anti-patterns generally involve some element of premeditation. But patterns aren’t the only things in software where you’ll see familiar structures and characteristics playing out across different systems and organisations… There are also the emergent phenomena. The creeping horrors that we’ve all unwittingly summoned in the course of our illustrious careers. Things that don’t happen on purpose, that nobody ever set out to do, and yet which keep spontaneously manifesting in organisations all over the world. Here, for your education and entertainment, I present a bestiary: a field guide to the monsters and the creeping horrors that are lurking somewhere in your IT systems.
1. The Reliquary
The reliquary is that one repository full of really good ideas. Clean code. Brilliant algorithms. The OpenID implementation that you optimised until it shone. Classes so beautifully designed and perfectly documented that they’d make a senior architect weep.
You remember the big rewrite? The project that was going to fix everything, only you never worked out how to actually launch the thing, or get any revenue from it? The reliquary is where you’ve preserved it, pickled in revision control like a fabulous museum specimen. A treasury of good code and good ideas; maybe even an entire codebase that was “a couple of weeks” away from shipping before somebody finally looked at the number of critical features the team had somehow forgotten to include and discovered — to everybody’s surprise — that validated XHTML, normalised data models and 95% test coverage are not actually features any of your end users cared about. Like Buran or the Spruce Goose, the surviving artefacts stand as a testament to the quality of your engineering… and a poignant reminder of just how much fun engineers can have building high-quality stuff that nobody actually wants to use.
2. The Doctor Gonzo
Named for the attorney in Hunter S. Thompson’s ‘Fear and Loathing in Las Vegas’, the Doctor Gonzo is that application that’s “too weird to live, and too rare to die”. It’s written in Visual Basic 6, or Delphi, or maybe even Microsoft Access. Your dev team has to keep a couple of antique-grade virtual machines around to fix the occasional show-stopping bug — with instructions that the machines are absolutely not to be Windows Updated on pain of immediate defenestration.
Of COURSE you’ve tried to replace it. Team after team, project after project has come up with a plan, hired some contractors, captured some requirements, and shipped a couple of prototypes. And, as inevitably as night follows day, they have failed… all of their engineering brilliance powerless against the unholy triumvirate of bureaucracy, Stockholm syndrome and undocumented use cases.
In fifty years’ time when we’re all running genetic algorithms on bioengineered quantum hardware that eschews physical user interfaces in favour of superimposing consciousness patterns directly into our brains by inducing cross-dimensional electrical fields in a neighbouring parallel universe, at least one company will have made a fortune creating a post-singularity hosting environment for running Visual Basic 6 line-of-business applications in the quantum realm.
3. The Epic of Gilgamesh
You know this one. It started out as a simple database query — something that pulled out the sales figures for the last quarter. Then somebody tweaked it to account for currency fluctuations. Somebody else cross-referenced it against website traffic logs. Somebody else added a half-dozen LEFT OUTER JOIN statements so you could find out which web browsers the customers who created the accounts who raised the invoices that generated the revenue were using.
Sometime around 2008, the SQL query in question surpassed Queen’s Bohemian Rhapsody in length and scope. By 2012 it was longer than Beowulf, and about as readable. It now stands as one of the great literary epics of our generation, a heartbreaking work of insane genius that is as incomprehensible as it is breathtaking.
4. The Chasm of Compliance
“It is a truth universally acknowledged that a profitable organisation in possession of a rigorous security policy must be ignoring a few things”
— Jane Austen.
Some friends of mine used to work in the software division of a company that made scientific instruments. Big government clients, universities, hospitals, research laboratories. As part of the conditions of doing business with these kinds of clients, they had a very strict policy that forbade sending email attachments — which was backed up with a set of firewall rules that would block incoming and outgoing mail with any files attached to it.
If you needed to share a file with a colleague who was working from home or away on a business trip, standard operating procedure was to attach it to a draft email in a Hotmail mailbox and then invite your colleague to download it and delete it when they were done. I believe they learned this particular trick from al-Qaeda. It satisfied the letter of the policy — after all, nobody ever SENT an email with any confidential material attached to it — and it got around the firewall blocks and restrictions.
You see that chasm? That gulf between ‘playing by the rules’ and ‘getting caught’? In there is a rich, thriving ecosystem of free-tier AWS accounts, Dropbox folders full of Excel spreadsheets and SSH proxies running on port 53 so the firewall thinks they’re DNS queries.
It would terrify you how much of your operating revenue comes out of that chasm.
5. The Shibboleth
Once upon a time, there was The Password. Being entrusted with The Password was a rite of passage. The Password was the nuclear launch codes, the keys to the city. Maybe it was the root password to the production web server. Maybe it was the sa password to the main database stack, or the master account password for the COBOL mainframe that handled all the financial records.
Of course, in these enlightened times, we have much, much better ways to restrict access to our systems. Federated authentication, PEM keys and one-time codes and 2FA. Single sign-on, cross linked to Sarbanes-Oxley auditing mechanisms so sensitive that if you so much as exhale whilst you’re logged in to production, it’ll analyse your breath content and record the fact that you had a beer with lunch, just in case they ever need to throw you under a bus in court.
Naturally, when you rolled out your new GDPR-compliant single sign-on, you changed the password. Of course you did. And then, when every single piece of software in your organisation went into a screaming panic, you immediately changed it back. And somewhere, there’s a backlog of the applications that need to be updated before you can change The Password. You’ve done the easy ones, obviously. But you haven’t got around yet to remoting into the VM where the Doctor Gonzo (qv) resides, and even if you could, you’ve no idea what algorithm the developers used to encrypt the connection string before pasting it into the INI file that’s eventually got transplanted into the Windows registry. And besides, you’re going to be replacing the Doctor Gonzo any day now, so you can change the password once that’s done.
The shibboleth is a powerful incentive to ensure that your tech staff leave on good terms. They know full well that if they give you any reason to doubt their integrity and trustworthiness following their departure, it’ll be a lot cheaper and easier just to have them killed than it will be to change the master password.
6. The Masquerade Column
This one’s a doozy. Somewhere in your organisation there’s a text column in a database. It was designed for your staff to make notes. It’s probably called something like ‘Comment’ or ‘Description’. 250+ characters of beautiful, unvalidated text. And, for many years, that’s exactly what it was used for… until that one fateful afternoon, when a couple of developers were sat around trying to plan a feature. And one of them said ‘can we add a field to the database to store the order type?’ And somebody else said ‘we COULD — but then we’d need to update the stored procedures and regression test any apps that are using column indexes and… it’ll turn a three-point ticket into a couple of weeks of work’.
At which point some bright spark says ‘hey, what if we use the Comment field? We can parse it, and if there’s a pipe character in there, anything after the pipe indicates the order type. And we’ll just tell the sales team not to edit anything in there that looks funny.’
Of course, it’s not always a pipe. Sometimes it’s a comma-separated list of product IDs, that your code knows to split, parse, translate back into orders and use those to populate the invoices. Or maybe you decided a comma was too obvious so you used a backtick instead — clever, huh? Sometimes, if you’re REALLY smart, you’ll use a newline character because you know full well that the UI element that’s bound to the field is a single-line textbox and so your users won’t see the hidden data you’ve stashed on line 2.
Note that if you’ve gone as far as storing actual JSON or XML in a free text field, that’s not really a masquerade any more, that’s just creative repurposing. The whole point of the masquerade is that your code has to run half-a-dozen different parsing and validation routines on that little piece of text before deciding whether it’s doing anything special or not.
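For anyone lucky enough never to have met one in the wild, here is a minimal sketch of what the code on the other side of a masquerade column tends to look like. Everything in it is hypothetical: the `ParsedComment` type, the `parse_comment` function, and the two conventions it honours (an order type after a pipe, and a comma-separated list of product IDs hidden on a second line) are assumptions made up for illustration, not anybody's real schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedComment:
    """Hypothetical result of untangling one masquerade column value."""
    note: str                          # what the sales team actually typed
    order_type: Optional[str] = None   # smuggled in after a pipe, if present
    product_ids: List[int] = field(default_factory=list)  # hidden on line 2


def parse_comment(raw: str) -> ParsedComment:
    # Assumed convention 1: anything after the first newline is a
    # comma-separated list of product IDs that the single-line textbox hides.
    visible, _, hidden = raw.partition("\n")
    product_ids = [int(p) for p in hidden.split(",") if p.strip().isdigit()]

    # Assumed convention 2: anything after the first pipe is the order type.
    note, _, order_type = visible.partition("|")

    return ParsedComment(
        note=note.strip(),
        order_type=order_type.strip() or None,
        product_ids=product_ids,
    )
```

Note how quietly it fails: a malformed product ID is silently dropped rather than reported, and a salesperson who types a pipe into an honest comment has just invented a new order type. That is rather the point.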
Back in the glorious days of the first dotcom bubble, I once proposed using the description field on a table to store a SQL statement that needed to be generated when an order was raised, but not actually executed until the order was confirmed. I may have used the phrase ‘continuation passing’ to make it sound impressive. Fortunately the client pulled the plug on the entire project before it went anywhere.
7. The Folly
This is probably linked from your homepage. It definitely features prominently in your sales literature. Maybe you even ran a campaign about it. It’s a massively complex feature that was designed, built, shipped… and in the five years since it went live, it’s been used by exactly nine people. The folly is normally built as a sweetener. Like the senator who won’t sign off on a nuclear power bill unless they put a clause in it about reducing the excise duty on liquor sold in golf clubs, it was some weirdly specific scenario that one of your key stakeholders was absolutely hellbent on pushing through to production. Eventually, your team gives up trying to sell them on the idea of an MVP, and builds the whole damn thing just to shut them up.
To be a genuine folly, the system in question needs to be so inextricably wired into the rest of your software platform that shutting it down would be a major undertaking in its own right, and so it quietly purrs along in production, using up a couple of hundred bucks a month of cloud hosting and not really hurting anybody.
8. The Slow Loris
The slow loris is a nocturnal primate that’s indigenous to south-east Asia. Slow lorises have big round eyes and curious furry faces, and grow to about 30 cm long. They’re small, they’re cute, they’re not remotely threatening. And if you touch them, they can literally kill you.
Does that remind you of anything in your software? Sure it does. You know. That one innocent-looking database table that if somebody adds a row to it the entire website crashes. The button in your intranet dashboard that says something like ‘download CSV’ and if anybody clicks it the main database server instantly hits 100% CPU and stops accepting any more connections for a good twenty minutes or so. The DLL that has to be installed into C:\Users\Temp\ (because Reasons) and if it’s not there it’s a toss-up as to whether the phone starts ringing before the website goes down, or immediately afterwards.
That file that can’t possibly be doing anything? Probably best not to touch it. No matter how cute and fluffy it might look.
9. The Phantasm
Somewhere in your organisation, there’s a piece of critical physical infrastructure that is so well hidden it’s indistinguishable from magic. We call this the phantasm. And a true phantasm always starts out with somebody trying to make things look nice.
Back in the days of wired networks, a really good phantasm was hard to accomplish — you could always follow the wires. But in these days of wireless networks, it’s the easiest thing in the world to stick a wi-fi access point up above a ceiling tile somewhere and forget about it. Ten years later, your successors will thank you when they get asked to find out why the internet has stopped working. After furiously arguing for three or four hours that the internet NEVER worked because there’s clearly no wi-fi points anywhere on that floor, they end up ripping the entire ceiling down in a desperate attempt to work out what’s going on — and find a dust-bunny the size of a basketball with the last few inches of a wifi antenna forlornly poking out of it.
10. The Paradox
The paradox is a rare and beautiful artefact in modern software systems. It’s something that can’t possibly exist according to its own rules, and yet it does. The classic paradox is a table full of customers with no email address, in a system that (a) defines a customer as invalid unless the email address is populated, and (b) rejects any update to objects that are not valid. This can lead to HOURS of fun chasing your own tail up and down the aggregate graph trying to work out why you’ve managed to get seventeen validation errors without changing a single value.
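The classic paradox is easier to appreciate once you see how little code it takes to build one. The sketch below is an assumption-laden toy, with a hypothetical `Customer` class and `CustomerStore` standing in for whatever your real domain model looks like: rule (a) lives in `is_valid`, rule (b) lives in `update`, and the legacy rows are loaded without anybody checking either.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    email: str = ""

    def is_valid(self) -> bool:
        # Rule (a): a customer with no email address is invalid.
        return bool(self.email.strip())


class CustomerStore:
    def __init__(self, legacy_rows):
        # Legacy rows are loaded as-is; nothing validates them on the way in.
        self._rows = {c.name: c for c in legacy_rows}

    def get(self, name: str) -> Customer:
        return self._rows[name]

    def update(self, name: str, **changes) -> None:
        customer = self._rows[name]
        # Rule (b): reject any update to an object that is not valid.
        if not customer.is_valid():
            raise ValueError(f"{name} is not valid; update rejected")
        for attr, value in changes.items():
            setattr(customer, attr, value)


store = CustomerStore([Customer("Ada")])  # a customer that cannot exist
try:
    # The one change that would make Ada valid is the one the rules forbid.
    store.update("Ada", email="ada@example.com")
except ValueError as err:
    print(err)
```

In this toy version the check runs against the stored object rather than the proposed new state, which is exactly the kind of detail that keeps a paradox alive for years.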
There are other paradoxes as well. There’s the server that appears to be responding to pings despite the fact you’ve switched it off, removed the power supply and disconnected all the hard drives. The misconfigured IAM security policy that eventually turns out to serve absolutely no purpose other than preventing it from knowing about itself. The dashboard that’s reporting 100% availability because every single other system in the organisation has failed, including the systems that are supposed to report downtime to the dashboard aggregator.
A really good paradox will vanish without trace when you begin investigating it, like some sort of quantum phenomenon that can only exist until an observer attempts to measure it. This shouldn’t be confused with a Heisenbug, which is a bug that only becomes invisible whilst actively being observed. No, the paradox doesn’t just hide — it vanishes, leaving you sat in a retrospective meeting insisting that you really saw it whilst your team-mates look at you with raised eyebrows and wonder whether it’s time you took some holiday. The experienced paradox hunter doesn’t even run a database query until they’ve loaded Camtasia with silver bullets and started a screen recording.
Survival Tips for Aspiring Code Archaeologists
As you explore the canyons and catacombs of an unfamiliar codebase, keep your eyes peeled, for these are just a few of the weird and wonderful creatures you’ll find lurking in the dark places. You may seek to understand them. You may even seek to catalogue a few of your own — but be wary as you study their ways, for to familiarise yourself with these beasts is to walk a narrow and dangerous path. For as the great systems architect Nietzsche once said:
“Beware that, when fighting monsters, you yourself do not become a monster… for when you gaze long into the abyss, the abyss gazes also into you, and when you truly understand the legacy codebase, only then will you realise you yourself have become part of the legacy.”