Peter Gillard-Moss

Transient opinion made permanent.

It Won’t Stay That Way

Every good developer knows that trying to design your system around future requirements is wasteful. You Ain’t Gonna Need It tells us that we should focus on the functionality we need now and not that which may occur in the future.

Every good developer knows that we should only model what we are certain about and we can only be absolutely certain about now.

To keep the code simple, to keep the code lean, every good developer works to model what we know now as accurately as possible.

But what if I was to tell you that this is wrong? What if I was to tell you that designing your code around now is potentially as futile and wasteful as designing your code around a hypothetical future state?

Now is always invalid

The problem is that now, as a state, is continuously shifting. You cannot model now with any degree of accuracy. The best you can do is attempt to represent an interpretation of the information you have at a specific point in time. And it’s going to be wrong.

Any model of now is constantly being invalidated. Invalidated by the quality of information you have, the next line of code, the next story, the next bug, the next product owner’s meeting, the next user’s action, the next security bug in Rails, the next learning.

Trying to produce code that represents now is a losing battle under such a barrage.

Model change

What you can model, however, is a system under change.

Good systems, the ones that just keep going and going while staying cheap to maintain and extend, don’t model now; they model change.

Future capable

Modelling change is about placing the capability of delivering future versions at the heart of the system.

Each delivery is not only a representation of now but opens paths to deliver the next set of changes and those changes have the capability to deliver the changes after that.

Use the future to test your design

The future is an inevitability even if its form is not. But those hypothetical future scenarios still have their uses.

They are great ways of testing your design. Throw future scenarios at it and see how it will cope. What could happen next?

You don’t want it to have solved the hypothetical future scenarios but you do want the design to have the capacity to adapt to them.

If you throw a likely future change at your system design and you discover that it will be expensive, cumbersome or even dangerous under those circumstances then you have failed to design for change.

This is, of course, a flawed process. We’re always going to miss something. We’re always going to get something wrong. But it is the best process we have.

Don’t over-engineer, don’t under-engineer

Both over- and under-engineering are symptoms of placing too much emphasis on now.

If we believe that our designs are going to last indefinite periods of time we become tempted to over-engineer. We create abstractions that aren’t designed to change but are based on the assumption that today’s knowledge and understanding are both accurate and permanent. This creates a system which is brittle and difficult to change.

If we consider now the ultimate goal, without considering the future change, we become tempted to under-engineer. We dismiss the future out of hand and accept overly naive solutions to problems that are soon to be invalidated. The result is a code base that becomes rapidly unwieldy, overcome with technical debt and expensive to maintain and change.

Source control embraces change

Source control exists purely in recognition of change. It exists purely to allow change.

It achieves this by modelling change, by giving you a means to represent and capture change. It does not model a concept of now. Instead it allows the creation of a representation at a point in time.

Distributed VCS actively embrace this and become all the more powerful for it.

Continuous delivery

Continuous delivery is a great catalyst for ensuring that you design your system for change. By its very nature it is a mechanism for delivering change.

Teams that truly embrace continuous delivery cannot ignore the future because the future is immediately upon them.

Teams that practice continuous delivery think in terms of change. Every checkin, every architectural decision is not simply about now but also about how to deliver the changes after that. To achieve this they try to keep changes as small, but as functional, as possible.

It won’t stay that way

When going to add that line of code tell yourself You Ain’t Gonna Need It, but also tell yourself It Won’t Stay That Way.

When you write your code be aware that the next story, the next production event, the next standup, has the potential to invalidate that line of code. Within moments that line could be deleted, it could be modified, its behaviour could be changed by something that gets put above it.

Think of your code as cutting a path to the next set of changes rather than just delivering what is in your current story. This way you will build systems that will quietly adapt and evolve over time.

Monitor Don’t Log

Look at the market and you see a bunch of products springing up around monitoring, alerting and logging. Graphite, Logstash, Logster, Graylog2, Riemann and Splunk, to name a few.

To my mind there’s a whole lot of confusion going on. I’m sending logs here, stats there, filtering in this place, alerting over that way somewhere. Logs go this way, that way. Apps send stuff to logs and to alerting systems, then to monitoring dashboards.

It’s a bloody mess that’s what it is. #monitoringsucks

They all solve the same problem

But from different angles. The problem, though, is the same: understanding your system. Looking after it in production. Checking it’s working OK and diagnosing and fixing it when it isn’t.

So why so many different solutions?

The problem is logs

Ask a room of developers what they’d do to help diagnose a live bug in their app. Most of them would say ‘add some logging’. Ask them how and they’d reply ‘use log4x to write to a log file’. They’d write some text out to a disk. There’s your problem.

What only a very few would say is add some monitoring. Rather than write out a line to a file that says ‘Connecting to the database’ they’d record a key value pair for active database connections. Rather than writing out ‘Home page requested’ they’d record a meaningful data structure detailing the request received. Rather than outputting ‘Integration point unreachable’ they’d send out a tick with the integration point’s status.
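
To make that concrete, here’s a rough sketch of the difference in code. It assumes a Graphite-style plaintext feed on a made-up host and port, with made-up metric names; an illustration, not a prescription:

import logging
import socket
import time

logging.basicConfig(level=logging.INFO)

def log_it():
    # The logging reflex: write a human readable sentence to a file or stdout.
    logging.info("Connecting to the database")

def monitor_it(active_connections, host="graphite.example.com", port=2003):
    # The monitoring reflex: record a key value pair a machine can act on.
    # Graphite's plaintext protocol is simply "<metric> <value> <timestamp>\n".
    line = "myapp.db.connections.active %d %d\n" % (active_connections, int(time.time()))
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))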

I’m not saying writing stuff to disk is wrong…

Just that it’s a side effect. It’s far from your primary concern. It’s a story way down the priority list. It’s the last thing you should do, not the first.

Observation, checking and recording

To be a caretaker of any running system your first priority is to check. To check any running system you need to be able to observe it; observe how it’s performing, to check it’s not in trouble.

First you observe, then you record what you observe.

Monitoring vs Logging

Monitoring is observation, checking and recording. Logging is recording.

If you were a doctor and had a patient who was complaining of heart problems what would you do: use an ECG or read their Twitter feed?

Logging isn’t monitoring

Logging is primarily about recording. It’s not about observation. You can’t watch masses of stdout pouring out and derive anything useful.

In the past we have employed logging as a means to retrospectively diagnose a system. In the present we have attempted to retrofit monitoring by massaging logs that were never intended to convey meaningful information for machines to make decisions on.

The problem is that logging was never originally intended to do anything more than output, usually to disk, a record of arbitrary activities in an arbitrary format for human consumption at some other point in time (the future).

In fact logging has historically been designed to enable a disconnect from the system: send me your logs so I can see what’s going on. Monitoring is the exact opposite: it is connected and engaged.

State vs events

Good monitoring provides two things. The state of the system at any point in time and a notification of events.

The state of the system helps you understand the current condition. Technical things like the number of database connections, the number of http connections, the amount of memory consumed, the amount of disk space used, whether a connection is open or closed etc. Domain things like the number of orders being processed, the number of customers logged on. From this information you can check if the system is working well or not.

Events enable you to understand what is taking place within the system. Technical things like an http request has been received, an error occurred. Domain things like an order has been placed, a new customer has been created.

Monitoring is about providing the ability to observe, check and record these two distinct things.
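
As a sketch, the two kinds of data might look something like this (the field names are mine, purely for illustration):

import time

# State: a snapshot of the system's condition at a point in time.
state = {
    "timestamp": int(time.time()),
    "db_connections_active": 12,
    "http_connections_open": 48,
    "orders_in_progress": 7,
    "customers_logged_on": 131,
}

# Event: a notification that something has just taken place.
event = {
    "timestamp": int(time.time()),
    "type": "order_placed",
    "order_id": "A1234",
    "value_pence": 4999,
}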

Prefer state

Logs are typically event streams. They rarely output state.

However state is often a more robust method for understanding the system. It is more reliable to record the total number of orders the system has taken and act on a change to that state rather than respond to a single notification per order which you may or may not receive.

The problem is that the traditional paradigm around logging encourages an event model (by outputting a description of an activity) rather than reporting the more useful state of the system.

State at a point in time vs state change

There are also two ways of reporting state: at a regular point in time or at the point the state changes.

The first method has latency (as you need to regularly poll/push the data at intervals). The second reduces latency but is essentially an event with stateful metadata so suffers from the same pitfalls as events.

Both can be used together.
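
Roughly, the two reporting styles look like this; the report function and the metric name are placeholders for whatever transport you actually use:

import time

def report(metric, value):
    # Placeholder for your transport of choice (Graphite, Riemann, etc.).
    print("%s %s %d" % (metric, value, int(time.time())))

def report_at_intervals(read_total_orders, interval_seconds=60):
    # State at a regular point in time: latency is bounded by the polling interval.
    while True:
        report("myapp.orders.total", read_total_orders())
        time.sleep(interval_seconds)

def report_if_changed(previous, current):
    # State change: lower latency, but each report is effectively an event
    # carrying stateful metadata, so it shares the pitfalls of events.
    if current != previous:
        report("myapp.orders.total", current)
    return current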

Alerts can be on state or events

We can alert on state (or a combination of states), such as high CPU plus low memory. Or we can alert on events, such as application errors or a new order coming in.

However we can derive events from state change (for example if the number of orders increase we can send out a New Order alert).
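
For example, deriving New Order alerts from a change in the order count might look like this (a sketch; the alert callback stands in for whatever your alerting system provides):

def derive_new_order_alerts(previous_total, current_total, alert):
    # If the order total has gone up, emit one New Order alert per extra
    # order observed since the last check.
    for _ in range(max(0, current_total - previous_total)):
        alert("New Order")
    return current_total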

Structured logging

Structured logging provides a solution to the limitations of human readable logs. By logging machine readable data rather than human readable text, a number of the problems of traditional logging implementations can be overcome.
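
A minimal sketch of the idea: emit one machine readable record per line (JSON here) instead of an arbitrary sentence. The field names are just an example:

import json
import sys
import time

def log_event(event_type, **fields):
    # One machine readable record per line rather than prose.
    record = {"timestamp": int(time.time()), "type": event_type}
    record.update(fields)
    sys.stdout.write(json.dumps(record) + "\n")

# Instead of "Home page requested":
log_event("http_request", path="/", method="GET", status=200, duration_ms=12)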

Design your system to be monitored, not logged

When building and designing your system stop thinking about logging. Think about monitoring. That’s monitoring not logging.

This doesn’t mean that a form of logging can’t be employed in monitoring (such as structured logging) or even that traditional human readable logging doesn’t have its uses (for debugging in development). It’s just that your goal is to monitor and monitoring is far more important than logging.

So, before introducing log4X think about how best to monitor your system and implement that in your design in whatever way is best (status pages, push notifications, structured logging). For other problems consider more appropriate first class, machine readable, solutions such as event feeds which can be leveraged more effectively and cheaply.
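
As an illustration of the status page option, here’s a bare bones sketch using nothing but the Python standard library; the path, the port and the metrics it reports are all assumptions:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def current_state():
    # In a real system these values would be read from the running application.
    return {"db_connections_active": 12, "orders_in_progress": 7, "healthy": True}

class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/status":
            self.send_error(404)
            return
        body = json.dumps(current_state()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), StatusHandler).serve_forever()

Whatever monitors the environment can then poll that endpoint and alert on what it sees.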

Then, if you find you really, genuinely still need it, you can write out some human readable stuff to disk somewhere.

Weighing the Cost of Expediency

Here is a situation familiar to us all: you’re working hard towards a release and a story comes up that is essential, but its implementation looks expensive given the time frames. One of the devs on the team who prides themselves on their pragmatism offers a cheap workaround. It’s slightly ugly, not something you’d want in the long term, but, the team reasons, it’ll do the trick, and everyone agrees to clean it up at the first opportunity, or even better, when the next story touches it. It sounds like the sensible, pragmatic thing to do, so the team agrees.

Except something bugs you about it. You’re not sure what, but you’re flicking through Bob Martin’s Clean Code feeling that this isn’t right. Yet the pragmatic dev’s reasoning is irrefutable; it’s the only sane thing to do.

A few months later, perhaps even longer, you pick up a story in a similar area and realise that the workaround is still there. Not only is it still there but it’s grown, and it’s going to take you a long time to pick it apart. So, the pragmatic member of the team pipes up and suggests a workaround…

The above is a textbook case of tech debt. The team wants to get value quickly so it deliberately accumulates some debt in order to achieve it. Given the situation the value outweighs the debt, so it sounds like a good trade off.

Where the team went wrong

Unfortunately the team’s original reasoning is flawed. When they weighed up the expedient solution vs the right solution they incorrectly attributed cost. They did this by deferring the cost of doing ‘the right thing’ and assuming that nothing changes. This leads to the following incorrect assumptions:

  1. The opportunity to do ‘the right thing’ will still be available or will resurface in the very near future
  2. The lifespan of the expedient implementation will be very short
  3. There will be little or no side effects from the expedient implementation
  4. The cost of doing ‘the right thing’ will be the same
  5. The cost to remove the expedient implementation will be low

The problem is that very rarely are all (or any) of those assumptions correct. In fact, the only point at which any of those assumptions can be correct is immediately after the expedient implementation has been released. From that point on each of those assumptions becomes increasingly invalid.

The first assumption to go out of the window is that the opportunity will still be available. The reality is that the only safe assumption is that the opportunity is available only at this moment. If you don’t take the opportunity while the need exists then the need will be removed and, with it, the opportunity (see how this works?). You would have to rely on another story to create a new opportunity and you cannot count on that: priorities change, demand changes. The only assumption you can make is that there may or may not be another opportunity in the future; if there is, you cannot predict when it will be and it is highly unlikely to be soon. It is also highly likely that the pressures will be just as great. So even if another opportunity arises (say another story in the same area) the team may feel that it would be an inappropriate time to fix up old debt. In fact it is reasonable to assume that if the team takes the expedient option this time round then they are equally likely to make the same decision next time, unless a major variable has changed.

This then invalidates the second assumption, that the lifespan of the expedient implementation will be short. With any changes you make to software you must always assume that they will be long lived. Why? Refer to the first refutation.

You must consider the expedient implementation part of your system design, because it is. Most people don’t; they file it in a different part of their brain, away from the real system. The reality is that it becomes a part of your system equal to any other part. This means that all future design decisions in related areas will be affected, to varying degrees, by that implementation. This invalidates the third assumption: the expedient implementation will create definite side effects.

As the expedient implementation is long-lived and has side effects on the system design, changing the code to do the right thing carries an increased cost. All those design decisions that were made to compensate will require change to align with the right design. This invalidates not only the fourth assumption but also the fifth: the expedient implementation must be detangled from the system design and that will be costly.

The result

Interest on the technical debt increases rapidly over time, mainly because tech debt works on a compound interest basis (due to the highly interwoven nature of software). Tech debt attracts other tech debt; it is like a cancer. Once you make one expedient decision you then find yourself placing another expedient implementation on top to compensate. Repeat over and over.

The correct reasoning

So, what are the correct assumptions to make when weighing up these decisions?

  1. It is unlikely the team will have the opportunity to fix this anytime soon
  2. The expedient implementation will be long lived
  3. The system’s design will change to reflect the expedient implementation and will affect the quality of the design in related areas
  4. The expedient implementation will become increasingly more costly to remove
  5. The correct design will become increasingly more expensive to implement

Of course, it may be perfectly valid to still implement the expedient solution especially if the value far outweighs the cost. Just be sure that you’ve done your maths correctly otherwise you’ll be in for a nasty surprise :)

Install Files Using CloudInit

Cloud-init is one of those killer apps that makes working with Ubuntu a breeze on the cloud (or even other virtualisations such as lxc).

Two of the most basic but awesome features of CloudInit are that it supports multi-part data and custom part handlers. This allows you to do two things: separate your user data into multiple files and deal with those files in whatever way you please. So you could upload a shell script and execute it, or upload an SQL script to be run against MySQL, for example.

When setting up a new box you’ll undoubtedly have to upload quite a few files (application configuration files etc.) and put them in the right place on the filesystem. Although multi-part data helps you get the files onto the box, you end up writing a shell script to copy them to their correct destinations.

Well there’s an easier way: tarballs.

1. Fakeroot

Start by creating a fake root. You don’t need to use fakeroot to do this but the intent is the same. Instead replicate the parts of the directory structure you want.

As an example I want to drop my-app-defaults into /etc/default:

/home/me/my-app/fake-root/
|-- etc
|   |-- default
|   |   |-- my-app-defaults

Remember to get your permissions right too.

2. Tarball your fake root

This bit’s the easy bit. Simply move to your fake-root and make a tarball:

(
  cd fake-root
  tar --create --file /home/me/my-app/out/config.tar .
)

3. Add a part handler

Perhaps the best thing about CloudInit is that it’s written in Python :). So it’s really simple to write a part handler to extract any tarballs you’ve uploaded.

#part-handler
import tarfile

def list_types():
  # Tell cloud-init which MIME types this handler deals with.
  return(['application/x-tar'])

def handle_part(data, ctype, filename, payload):
  # cloud-init calls the handler with __begin__ and __end__ markers; skip them.
  if ctype == '__begin__' or ctype == '__end__':
    return

  target = "/root/%s" % filename
  print("[tarball-file-handler] %s %s" % (ctype, target))

  # Write the raw tarball payload to disk (binary mode), then unpack it over /.
  with open(target, 'wb') as f:
    f.write(payload)
  tarfile.open(target).extractall('/')

4. Add the part handler and tarball to userdata

How you do this is down to you. You can follow these instructions or you can use something like Ruby’s mail gem:

require 'mail'

# files is an Array of paths to include in the user data
mail = Mail.new
files.each do |file|
  mail.attachments[File.basename(file)] = {
    :encoding => '7bit',
    # `file --mime --brief` prints a trailing newline, so strip it
    :content_type => `file --mime --brief #{file}`.strip,
    :content => File.read(file)
  }
end
# cloud-init expects Unix line endings in the user data
mail.to_s.gsub("\r\n", "\n")

And that’s it. Now cloud-init will do all the hard work for you. Just give it the tarball and job done!

Bonus tip:

Gzip your user data. CloudInit will automatically unzip it on the other side. This allows you to squeeze more into the user data’s 16KB limit and also keeps things nice and simple as you don’t have to worry about compressing that tarball.
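
For example, the assembled user data could be compressed with a couple of lines of Python (the file names here are just placeholders):

import gzip
import shutil

# Compress the multi-part user data before handing it to the cloud provider;
# cloud-init will transparently decompress it on first boot.
with open("user-data.mime", "rb") as src, gzip.open("user-data.mime.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)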

Extreme Architecture

Not touched by human hands

Here’s a rule: you can’t ssh on to your production boxes. Not just you, don’t feel like you’re being singled out, nobody else can do it, especially not if they’re human. Not even if they are a monkey or a dog come to think of it.

If you really, really, have to ssh then you definitely, definitely won’t have root access, or sudo, or even the ability to edit a single file. You certainly, certainly won’t be allowed to install anything.

Sound a bit harsh? Well, you wouldn’t let me ssh onto the production web server and start fiddling with the website’s Ruby code in vi would you? So why the hell do we think it’s acceptable to do the same with /etc/passwd?

Everything about your infrastructure should be in source control. Everything about your infrastructure should be repeatable and reliable.

Failure means death

If a server fails, kill it. Even if it fails a little bit. Zero tolerance. If it gets a little slow, well give it the chop. Be merciless.

In fact, sometimes just kill it anyway for the hell of it.

Fast

A single server should be built in minutes. You should be able to build your whole system, from scratch, really, in small multiples of that. So, let’s say twenty minutes? With everything installed. With everything working.

Small

No big things allowed. Only small things. Small servers, small amounts of storage, small amounts of RAM. If you need a big thing then you build it out of lots of small things.

Even environments should be small, which brings me nicely on to:

Autonomous

Build autonomous, small environments around your capabilities. They have their own rules, their own security, their own services, their own monitoring. If you need to share capabilities (say a Credit Card payment gateway), then you do so via services that sit in their own firewalled, protected, autonomous environment.

Funnily, this is kinda how the internet works.

Then, blow your SSO environment away and replace it with another one. Go on, no one will notice.

Production is a state, not an environment

There’s no such thing as a production environment, there are only environments in production. So create an environment and put it in test, then put it in production, then put it in test again if you want to, or dev, or wherever; it’s just a state.

Share nothing

Be a selfish individualist, demand your own of everything. Own DNS, own LDAP, own firewall, own reverse proxy etc. Then refuse, absolutely, point blank, unconditionally, no special cases, preferential treatment, zero tolerance, to share anything. Anything.

Forward and replicate over centralisation

It would be madness to have multiple sources of truth or a dozen places to go to do administration for cross cutting concerns across capabilities, especially for things like security, or monitoring. Equally it’s madness sharing all infrastructure in one big environment (or across multiple environments).

Sharing nothing is great, but for cross cutting concerns you need some repetition, so replicate data or forward it or shard it or whatever, just don’t share infrastructure.

So, each environment could have its own LDAP which is replicated from a master. But each environment is free to add its own users or security concepts that no other environment needs.

Stateless

It’s easier to remember the important things if you don’t bother trying to remember the unimportant ones. Not having to remember anything at all is even easier.

So make your servers really, really forgetful. So forgetful that if you turned them off then everything you’d done since you turned them on would be gone. Forever.

Of course there’s important stuff that you’re going to absolutely need to remember, like all your customers’ details, but that stuff should be provided by services on your infrastructure like clustered, networked, high availability, failsafe filesystems or databases. Servers and their configs just aren’t the kind of important that we should worry about remembering.

Immutable

Servers don’t change. Whatever way you built them you leave them that way. You don’t upgrade them, you don’t modify their configs, you don’t patch them, you just leave them exactly the way you found them.

Disposable

Throw everything away. Don’t bother trying to fix it, upgrade it, renew it. Just trash it and get another one. Do this regularly.

Apply this principle to as much as you can. Don’t bother remembering the ssh keys your app needs, just revoke the old ones and dish out a load of new ones, it’s much easier.

Short lived

Nothing lasts forever so don’t try to pretend it does. Kill it before it attempts to. A day’s a long time, every day should be a new day and a new day means a new server.

Self diagnosing

It’s much easier to know what’s wrong if the thing that’s wrong can tell you what’s wrong with it.

Pretty obvious really.

Self healing

If something in the environment is wrong, and that thing knows what’s wrong, then why doesn’t the environment just fix it itself? Database has gone down? Bring it back up. Server’s crashed? Throw it away and replace it.

Not so obvious but still fairly obvious really.

N* environments

Creating a new environment should be as easy as asking for a new environment. Which should be as easy as running a script and giving it the name of your new environment.

Environments aren’t an Enum of ‘Production’, ‘Test’, ‘Dev’, they’re an array full of whatever labels you happened to come up with that day. And you’ll probably find you can come up with a lot of different labels in just one day, and lots and lots of environments to go with them.
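
If it helps, the script I have in mind is nothing grander than this kind of thing; create_environment and everything behind it is hypothetical:

import argparse

def create_environment(name):
    # Hypothetical: provision DNS, LDAP, monitoring and servers for a brand
    # new, fully autonomous environment known only by its label.
    print("provisioning environment '%s'..." % name)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Create a new environment.")
    parser.add_argument("name", help="any label you like: kevin, blue, thursday...")
    create_environment(parser.parse_args().name)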

We’re Not Special

“A key differentiator of highly productive teams is the ability to identify what is core to their domain, and thus brings them competitive value, and what is commodity, in order to focus their energies on solving core problems and not commodity problems.”

Most of us, at some point in our career - especially those that have laboured in ‘enterprise’ - will have experienced working in teams where most of the energy seems to go into debugging, patching and developing the highly customized, specialist ORM, or web framework, or messaging system, or unit testing framework, or… To the sane amongst us it’s an exercise full of futility and waste. These custom packages lag dramatically behind what’s available, often for free, on the internet and seem to suck up velocity like a child with a milkshake, seemingly immune to the inevitable ‘brain freeze’.

All too often, the reason teams invest in solving commodity problems is the misconception that they are actually solving a core problem. It is argued that a custom ORM is required because the way the team do databases is fundamentally different and therefore a unique solution, particular to their own situation, is required. Whereas the truth is often that the system has been built in such an unnecessarily arcane way that it gives all the appearance of a special situation when in fact it is simply an inappropriate one. Solving problems made by special thinking requires special thinking and it’s turtles all the way down.

Identifying the commodity isn’t always as trivial as choosing an ORM and building your system around it. Sometimes a team finds itself working in a new field of technology that it has little experience in and, through ignorance and the nature of empirical discovery, it is only after months of investment into a custom solution for a commodity problem that the truth is discovered.

Sometimes, however, you may find that you are special after all. In those rare cases, even where commodity solutions exist, it may be to the team’s advantage to ditch them and build their own. This may be because:

  • it doesn’t meet your design/architectural principles - if simple testing is important and all the solutions on the market fail to make that simple then it may be worthwhile rolling your own
  • it gives you a business advantage - pushing the technological boundaries in a commodity area improves your brand somehow
  • it enables you to learn - if your system is heavily tied to a particular technology then rolling your own may allow you to make discoveries about that tech that provide value to your product
  • the effort to adapt is greater than building something ourselves - if you are working on a legacy app that was never built to accommodate the approaches used by commodity packages, you may find that adapting it costs more than rolling your own
  • the commodity doesn’t yet exist - if you are very cutting edge, a commodity solution has yet to appear and you are stuck rolling your own. After all, at one point there were no CI servers, no unit testing frameworks, no mocking libraries
  • you genuinely believe you can push the concept forward - JMock existed but that didn’t stop Szczepan coming up with Mockito and improving the world for the rest of us.

The truth is, for ninety percent of projects none of the above will apply. The commodity problem of MVC web frameworks has been solved, logging has been solved, deploying Rails apps to the cloud has been solved. Is your web framework/logging framework/cloud solution genuinely going to compete? If you truly believe it is then start by commoditizing it now - either by building a product or open sourcing it - otherwise it’s just a pipe dream and you are doing nothing more than developing another custom commodity solution.

In some cases you may genuinely discover that what’s on the shelves isn’t quite right; a solution always seems to leave you one step away from reaching your goal. For example, you’re using gdash to pull in all your Graphite graphs but you wish to mix in AWS CloudWatch graphs as well. Rather than building a separate, brand new dashboard solution, consider forking gdash and enhancing it to pull from CloudWatch as well, then contribute back. For the majority of problems this will be less effort than writing your own solution from scratch.

Grade Delusion

There is a terrible failure in reasoning amongst politicians and the media that exam grades and standards have a causal link, that somehow a rise or fall in grades denotes the opposite movement in standards. Not only is this reasoning flawed, and the argument both invalid and unsound, but the premise that grade inflation or deflation is somehow critical is itself irrelevant.

The media have been whipping up a storm about grade inflation for years now. As more and more students obtain higher grades they’ve deduced that the only reasonable cause is that exams are getting easier. Apart from the fact that correlation does not imply causation, grades are not even a measurement of difficulty anyway. Grades are orthogonal to difficulty. Two people could take the same exam with the same difficulty of questions and give the same answers to those same questions. The difficulty of the individual questions is fixed. However, you could use two different marking schemes and the two individuals would get different grades. This could be achieved by simply adjusting a single variable, such as the grade boundaries. This is, essentially, what Ofqual did.

Where grades relate to difficulty is in terms of the difficulty in obtaining the grade, not in the difficulty of the exam itself. So, reusing our above example, the questions are the same level of difficulty, but by moving the grade boundaries up or down we make it harder or easier to achieve a particular grade. This is an important distinction as what this means is you can have an easy exam where it is hard to get a top grade (because you’d have to get 100%) or a hard exam where it is easy to get a top grade (because you’d only need 5%).

Incidentally this relationship is one of the many reasons why fixed percentages in grade boundaries (that is to say the top 10% get an A, the top 20% a B etc.) are fundamentally flawed. If you change the difficulty of the exam the grades adjust so that the two exams appear equal in difficulty, and this breaks the critical purpose of the grading system: to compare peers.

Ofqual really, really cocked up when they moved the grade boundaries suddenly. They should know better, this is their business, they need their heads placed on a chopping block. The single most crucial purpose of exam grades (grades, not exams) is to enable the comparison of peers. That is to say I advertise a job for someone with nine GCSEs, some recently qualified students apply for the job, and I need to be able to tell which one is better qualified. I will use their grades to compare them. Ofqual screwed that up, they broke the grades. If I had two people who took their exams in Spring 2012 and two who took them before I would not be able to compare their grades side by side. The same would be true for University, College or Sixth Form applicants.

What Ofqual forgot was that peers are compared across a time period. So two applicants who took their exams a year apart need to be distinguishable. This is why there is a preference for calculating grades by attempting to keep the exams at the same level of difficulty and fixing the boundaries around marks. This way, students who sit two separate exams at different times can be compared. Using fixed percentages breaks this, moving the grade boundaries breaks this.

The problem is there is a risk of grade creep. The only way of protecting against grade inflation is to make performance something both measurable and repeatable. Because we can do the former people make the mistake of believing we have the latter, which is not the case. Athletics is a good example of where both can be achieved: an indoor 100m sprint, for example, where the biggest influences are the length of the track and the runners themselves, can be run again and again fairly consistently, letting the runners make direct comparisons with each other. But exams need to protect themselves against cheating, which means that inherently they cannot be repeatable. If we can’t make exams repeatable we can’t measure difficulty in an absolute way, and because of a number of statistical and rounding errors you have to accept that inherent in this system is a risk of grade inflation or deflation.

But grade inflation and deflation are irrelevant. Firstly, as already demonstrated, the grades don’t reflect the difficulty of the exam. Even if you’re trying your hardest to link the difficulty of the questions and the difficulty in obtaining the grades (apparently Gove can achieve the impossible however) it’s not the end of the world if you let a little drift in. Why? Because we don’t break the core purpose of exams: to compare peers.

Now this is where the media gets really riled. They’ll point to the number of students who got A grades in the 1980s and the number now and say look, massive grade inflation, exams are getting easier (repeat: correlation is not causation). But it is irrelevant. No sane person is trying to compare someone’s English GCSE from nearly thirty years ago with someone’s now - apart from those students who want to gloat that they did better than their parents. It’s absolutely irrelevant. Those exam grades are only useful for a set period of time. Once your GCSE grades get you into college and you’ve done your A levels nobody cares about them anymore; likewise once your A levels get you into university and you have your degree no-one looks at them, and furthermore after a few years of work experience people forget your degree. Do you care what GCSEs your mechanic got? Or your lawyer or doctor? No, it’s absolutely irrelevant, they are measured on other things.

The other statistic is our position on international tables such as the OECD’s Programme for International Student Assessment (PISA). People point out that while our grades go up we fall in the tables. Again a bucket load of faulty reasoning trying to link two complex variables together and extract causation (please stop doing this, it’s getting boring). Apart from the fact we could be getting better, just not as fast as everyone else, or that the criteria are biased toward certain education systems, or a host of other causes, table positions are a stupid measure of performance. Here’s an example: in the 2012 Olympic men’s 100m final, apart from Asafa Powell, all runners finished under 10 seconds. An amazing achievement: the difference between Usain Bolt, in first place, and Richard Thompson, in seventh, was a paltry 0.35 seconds. Yet if you look at the tables Richard Thompson came second to last. This is the same error we are in danger of making by looking at positions in tables; instead we should be using more sophisticated statistical methods to determine whether we are the educational equivalent of 0.35 seconds away from the fastest man in the world or several.

What is important is that our exams are robust, challenge our children, instil knowledge and prepare them for the future. But we must first acknowledge that grades are absolutely orthogonal to this, otherwise we risk ruining our education system in pursuit of a worthless goal. At the moment it is worrying that our politicians, the media, Ofqual and even Ofsted are obsessing over the wrong things.

Layering the Cloud

One of the great things about the cloud is the way you can just run a bit of code or a bash script and before a Windows admin can open their GUI you’ve got a running box.

This opens up a host of opportunities and new patterns. Martin Fowler recently posted about the Phoenix Server pattern where machines are simply thrown away rather than upgraded. But these things require looking at the way you architect infrastructure slightly differently.

To help you can split your cloud architecture into three layers:

  • Visible
  • Volatile
  • Persistent

Visible

This is the layer between the cloud and the rest of the world. It is mainly public DNS entries, load balancers, VPN connections etc. These things are fairly static and consistent. If you have a website you will want the same DNS A records pointing at your static IP.

Things in the visible layer rarely change or if they do they are for very deliberate reasons.

Volatile

This is where the majority of your servers are. It’s called volatile because it tends to change a lot. The known state (how many servers, what versions of which software they run etc.) is changing frequently, perhaps even several times per hour.

Even in a traditional data centre your servers are being upgraded, having security patches applied, new versions of software deployed, extra servers added etc. On the cloud this is even more volatile when you use patterns such as the Phoenix server and machines are rebuilt with new IP addresses etc.

You should be able to destroy the entire volatile layer and rebuild it from scratch without incurring any data loss.

Persistent

This is where all the important stuff is kept, all the stuff you can’t afford to lose.

Ideally only the minimum infrastructure to guarantee your persisted state is here, so for example, the DB server itself would be in the volatile layer but the actual state of the DB server, its transaction files etc. would be kept on some sort of robust storage that is considered ‘permanent’.

By organising your infrastructure around these three layers you are able to apply different qualities to each layer, for example the persistent layer would require large investment into things like backup and redundancy to protect it whilst this is completely unnecessary for the volatile layer. Instead the volatile layer should be able to accommodate high rates of change while you will want to maintain a considerably more conservative attitude towards the persistent and visible layers.
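
One way to make the layering explicit is to record which layer each piece of infrastructure belongs to, so tooling can apply the right policies to each. A sketch, with made-up resource names:

# Backups and redundancy for persistent, rebuild-from-scratch for volatile,
# careful change control for visible. Resource names are purely illustrative.
LAYERS = {
    "visible": ["public-dns-zone", "load-balancer", "office-vpn"],
    "volatile": ["web-server-01", "app-server-01", "db-server-01"],
    "persistent": ["db-data-volume", "transaction-log-volume", "uploads-bucket"],
}

def layer_of(resource, layers=LAYERS):
    # Look up which layer a named resource belongs to.
    for layer, resources in layers.items():
        if resource in resources:
            return layer
    return "unknown"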

Later: A Story Graveyard?

In a previous post I discussed how we divide our backlog into Now, Next and Later. One of the things we’ve observed is how Later seems to be the place where stories go to die. That’s not to say that no stories eventually make it into Next and then Now, but for the vast majority of them sitting in Later is a terminal condition.

This raises a number of interesting questions: what should we do with stories that are likely to live out their lives in Later? And why do we generate so many stories that will never be implemented anyway?

It makes sense to start by looking at why we generate so many stories that are unlikely to get implemented. Looking at a recent project, nearly half of all stories were unimplemented at the time the project was considered complete. In a recent article Joel on Software made a similar observation.

So what’s going on? Why does it appear that we create a story that doesn’t get implemented with every story that does?

There are two opposing forces at play: the need to brainstorm and capture as much information as possible so we understand what we’re building, and the need to keep scope as small as possible. Both activities are continuous: as we discover we learn about all the things we’ve missed, and as we discover we learn about all the things we don’t need.

These opposing forces come together to form a decision point. When writing software we are continuously rationalising all the different options: can we get away with PayPal only and forget about all those credit/debit card stories for now? What if we save to the filesystem and not worry about a db, can we get away with that? We plot out all the different options (and to do this we create cards) and, treating each one as a hypothesis to be proven, we attack them from different angles until we find the one that stands up best. The ones that stand the test get implemented; those that don’t go to the Later column to die.

Except they might not.

The nature of software means that those decisions don’t stand indefinitely, they only stand while the propositions that proved them hold. Using PayPal only made sense when the goal was to test the market with a small set of users; if we change the goal to maximising sales then we’ll have to go back and revisit the decision. That means that those stories we stuck in Later may, at some point in the future, come back to life when we revisit those decisions.

A key question is whether we need to keep those cards or whether we should simply regenerate them when we re-evaluate our decision. The world of electronic data, where storage is cheap, has fostered a conservative attitude where preservation is the default. We really don’t like to delete things.
But is there any value in having the original reasons to hand when we re-evaluate a decision? What influence will they have, or should they even have any?

There are problems with keeping information in the hope that we will make use of it later on. The first problem is that having the information doesn’t mean we will necessarily unearth it again when we come to challenge our decisions. To even know it existed is to rely on memory, and the team may have lost that memory one way or another; even if the team does remember its existence it may be buried so deep under stacks of noise that digging it out may not be so simple.

Another question is whether the original information is even relevant. The decisions represent a point in time and therefore capture the state of the world at that time. It is unlikely that the entire state still holds. So, for example, a decision rejected on the cost of hardware may be made irrelevant by cloud providers. People often describe this phenomenon as a form of decay.

There is another variable and that is discovery cost. Some information will be cheap to discover. Researching a mocking framework, for example, is a few dozen minutes trawling links brought up by Google searches. A spike into the suitability of MongoDB, however, may be a week’s work for a pair.

In which case I would argue that we should capture decisions that are likely to provide a legacy and/or are expensive to rediscover. But these decisions should be presented in a different format so they can both be found easily and revisited. Cards that represent cheap decision points, on the other hand, we should leave to their fate. Then we can simply archive all old cards to the dark depths and not worry ourselves about them again, except if we wished to go on a nostalgia trip or use their attributes for statistical analysis.

Now, Next and Later

When starting a project, or a new phase of an existing project, it is common for teams to try and capture a decent breadth of stories and prioritize and estimate them to form a backlog. From that backlog the team can then start to organize the stories to form some sort of plan and an idea of the overall size.

This is often an intense process. Teams usually size projects using the entire story list, forcing them to analyse stories in detail and make decisions at the point when they know the least. This low level of knowledge has teams struggling to achieve the impossible task of forming a complete picture. It also requires that they undertake a number of time consuming activities - prioritization, estimation etc. - on stories that may not be played for many months, if at all. Not only does this give them a false sense of confidence but it is wasteful.

The team I work with in Thoughtworks Studios devised a method of organizing our backlog that tackled these issues.

We start by gathering the stories we believe we need for the project. We then divide our backlog into three columns: Now, Next and Later. Each column is based on when we plan to deliver them: so the goal we are currently working towards (Now), the goal we believe we’ll work on once we’ve completed this one (Next) and anything Later than that. We then move our stories into these columns.

This organization gives us two things:

  1. We have a prioritized workload (after all, if we’re not working on it Now it can’t be the highest priority).
  2. We understand clearly what we have to focus on (i.e. Now).

Once we have roughly sorted our stories we start by deeply analysing Now; we do a little bit of analysis about Next (but to a far lesser degree than Now); and we do no analysis of stories in Later as we know very, very little. Therefore, we can estimate the size of the work in Now with high confidence; with less, but still some, confidence for Next; but any estimation of Later is considered futile and valueless.

We also apply the same thinking to completeness: so Now should be a fairly accurate picture of what we will deliver, Next is a starting point but likely to be quite volatile, and Later has no guarantees at all.

A side effect of this organisation is we consider anything in Later a pipe dream. The result is that Later becomes a dumping ground for anything we suspect may be required in the future. That doesn’t mean some stories won’t move into Next - and eventually into Now - but chances are the majority will live out their days in the Later column.