In the german language, there is a proverb that means “being alert” or “being on guard”. It’s called “auf der Hut sein” and would mean, if translated without context, “being on hat”. That doesn’t make sense, even to germans. But it’s actually directly explainable if you know that the german word “Hut” has two meanings. It most of the time means the hat you put on your head. But another form of it means “shelter”, “protection” or “guard”. It turns up in quite a few derived german words like “Obhut” (custody) or “Vorhut” (vanguard). So it isn’t so strange for germans to think of a hat when they need to stay alert and vigilant.
Vigilant developers
Being mindful and careful is a constant state of mind for every developer. The computer doesn’t accept even the slightest fuzziness of thought. But there is a moment when a developer really has to take care and be very very precautious: When you operate on a live server. These machines are the “real” thing, containing the deployed artifacts of the project and connecting to the real database. If you make an error on this machine, it will be visible. If you accidentally wipe some data, it’s time to put the backup recovery process to the ultimate test. And you really should have that backup! In fact, you should never operate on a live server directly, no matter what.
Learning from Continuous Delivery
One of the many insights in the book “Continuous Delivery” by Jez Humble and David Farley is that you should automate every step that needs to take place on a live server. There is an ever-growing list of tools that will help you with this task, but in its most basic form, you’ll have to script every remote action, test it thoroughly and only then upload it to the live server and execute it. This is the perfect state your deployment should be in. If it isn’t yet, you will probably be forced to work directly on the live server (or the real database) from time to time. And that’s when you need to be “auf der Hut“. And you can now measure your potential for improvement in the deployment process area in “hat time”.
We ain’t no cowboys!
In our company, there is a rule for manual work on live servers: You have to wear a hat. We bought several designated cowboy hats for that task, so there’s no excuse. If you connect to a server that isn’t a throw-away test instance, you need to wear your hat to remind you that you’re responsible now. You are responsible for the herd (the data) and the ranch (the server). You are responsible for every click you make, every command you issue and every change you make. There might be a safety net to prevent lethal damage, but this isn’t a test. You should get it right this time. As long as you wear the “live server hat”, you should focus your attention on the tasks at hand and document every step you make.
Don’t ask, they’ll shoot!
But the hat has another effect that protects the wearer. If you want to ask your collegue something and he’s wearing a cowboy hat, think twice! Is it really important enough to disturb him during the most risky, most stressful times? Do you really need to shout out right now, when somebody concentrates on making no mistake? In broadcasting studios, there is a sign saying “on air”. In our company, there is a hat saying “on server”. And if you witness more and more collegues flocking around a terminal, all wearing cowboy hats and seeming concerned, prepare for a stampede – a problem on a live server, the most urgent type of problem that can arise for developers.
The habit of taking off the hat after a successful deployment is very comforting, too. You physically alter your state back to normal. You switch roles, not just wardrobe.
Why cowboy hats?
We are pretty sure that the same effects can be achieved with every type of hat you can think of. But for us, the cowboy hat combines ironic statement with visual coolness. And there is no better feeling after a long, hard day full of deployments than to gather around the campfire and put the spurs aside.
How do you get everyone on the team to understand the “gravity” of working on a live system? Aside from getting them some cool looking hats 🙂
Hi Rodnee, thanks for your question. First, we are a small company, so we can achieve the miraculous state of the whole team being like-minded. And then, there is the pedagogically valuable effect of failure. We aren’t perfect, but we try to learn (as a team) from our mistakes. This is where retrospectives (the metaphorical campfire after a deployment) facilitate our learning.
Oh, and one more thing: We let our newbies deploy their changes as soon as possible. That’s risky, but sharpens the sense of gravity a lot.
Hi Schneide-Team,
glad to see the idea of the Live-System-Hat being put into action. Albeit I’m a little sad, that you have chosen cowboy hats over a captain’s hat 😉
“Oh, and one more thing: We let our newbies deploy their changes as soon as possible. That’s risky, but sharpens the sense of gravity a lot.”
From first-hand experience, they really put newbies in the front lines as soon as possible, sometimes within the first month of working on a project. And that certainly sharpens the sense of gravity.