*“What do we want? Scientific Certainty! When do we want it? Within a certain timeframe!”*

The public, the media and especially politicians like to make a big thing about scientific uncertainty. For scientists, it’s just a fact of life. So what is this ‘uncertainty’ and how does this affect our lives?

We scientists perform research so that we can understand the world around us. To do so, we use various scientific and statistical techniques, and the statistical ones in particular produce ‘measures of confidence’ in the data (and thus in the conclusions drawn from them). It means that we present data with ‘error bars’, which are designed to show a range of values within which the ‘reality’ may lie. These error bars represent upper and lower limits that are determined on the basis of our confidence in the results. This is largely a statistical calculation, and it results in mind-bending statements such as “plus or minus 6% with 95% confidence”. What does this all mean?

Three concepts: Confidence, Error and Likelihood.

Imagine this scenario: you decide to determine whether the morning light is a result of the rising of the Sun in the morning (hear me out, this is going to be scientific!).

You’ve noticed that it seems to get quite light at around about the same time as when the Sun rises, but you’re not sure that it’s actually related to the Sun rising (stay with me!). So you hypothesise that the morning light is due to the Sun rising. To test this, you take a series of measurements over numerous days – the amount of light, the time of day, and the position of the Sun with respect to the horizon.

Your data looks a bit like a curve when you plot it – that is to say, there is no definite point at which dark becomes light (anyone who’s been up before it’s light will know this, but for the benefit of an undergraduate audience…).

So you do the statistics on it (yes, there is a point to paying attention to those stats classes!). This shows that there is a correlation between the position of the Sun and the amount of light (durrr, I know…), but wait! There is variation in the data. Not every day is the same! How could this be? Well, it could be that your instrument is near some artificial light sources, it could be that the very light of God is shining upon your scientific research (hey, appealing to all audiences here). How do you decide?

To the rescue – the null hypothesis!

For this, you generate a completely random version of the sunlight data (even your phone could do that these days). Then you compare, statistically, the random set to the experimental set. Sure enough, it tells you that only a small percentage of the pattern in your data could be explained by random chance. The rest can be considered explainable by the hypothesis (that the amount of light is a result of the position of the Sun).
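One common way to build that "random version" is to shuffle your own measurements, which is sometimes called a permutation test. Here is a minimal sketch of the idea, not a definitive method. The simulated readings, the function names and the number of shuffles are all assumptions made up for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often shuffled (random) data correlate as strongly as the real data."""
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= abs(observed):
            hits += 1
    # The +1 counts the observed arrangement itself, so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)

# Made-up measurements: Sun elevation in degrees, light level in arbitrary units.
rng = random.Random(42)
sun_elevation = [rng.uniform(-10, 10) for _ in range(200)]
light = [(e + 10) * 5 + rng.gauss(0, 8) for e in sun_elevation]

observed_r, p_value = permutation_p_value(sun_elevation, light)
print(f"correlation = {observed_r:.2f}, share of shuffles that match it = {p_value:.4f}")
```

A tiny share means that chance alone almost never produces a pattern as strong as the one you measured.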

Now, just say you decide you want to know what 95% of the data is saying. It is telling you that the light patterns match the Sun position patterns to within, say, plus or minus 2 minutes every day. That is to say, the middle 95% of the light data matches within the same times plus or minus 2 minutes *every day*. What have you learnt? Well, you’ve probably confirmed that the position of the Sun is the dominant factor in the amount of light in a given place at a particular time of day (yes, yes, assuming you are outside, etc). That is your 95% confidence interval.
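To make "the middle 95% of the data" concrete, here is a rough sketch that picks out that range from a set of timing offsets. The offsets are simulated (an assumed spread of about one minute) and `middle_interval` is a hypothetical helper. Strictly speaking this is a percentile range of the measurements, used here the same way as the interval described above.

```python
import random

def middle_interval(values, coverage=0.95):
    """Return (low, high) bounds enclosing the central `coverage` share of values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    last = len(ordered) - 1
    lo_idx = int(round(tail * last))
    hi_idx = int(round((1.0 - tail) * last))
    return ordered[lo_idx], ordered[hi_idx]

# Made-up data: minutes between "it got light" and "the Sun came up", one value per day.
rng = random.Random(7)
offsets = [rng.gauss(0.0, 1.0) for _ in range(1000)]

low, high = middle_interval(offsets, coverage=0.95)
inside = sum(low <= x <= high for x in offsets) / len(offsets)
print(f"middle 95% of days fall between {low:.1f} and {high:.1f} minutes")
```

With a spread of about one minute, the bounds land close to plus or minus 2 minutes, which is the kind of statement the example above is making.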

But why all the scary stats and numbers? Why should we be only 95% confident of this match, plus or minus 2 minutes? Well, because we have measured things in the real world, with human-made devices and their associated problems; nothing is infallible. But also because there might actually be other factors at work – street lights, machine error, etc. But if we take that null hypothesis test we did before, we’ll see a pattern. In the above example, had we taken the middle 99% of data, we may have had a result that was plus or minus 30 minutes in the time data. That’s starting to sound a bit dodgy. Had we taken the middle 68% of the data, we may have been within plus or minus a few seconds, but that would have left about a third of the data unexplained. What’s going on here?

Well, fortunately, these numbers I’ve been picking relate to ‘standard deviations’ (SDs), a highly statistical term that essentially measures how much the data show ‘weirdness’, that is, how far they spread from the average. A small SD means the data is pretty tight – it’s all showing the one thing. For data that follow the classic bell curve, the range within 1 SD of the average covers around 68% of the data, which we’ve agreed is a pretty poor test. Within 2 SDs, however, is about 95% of the data, “almost all” in most people’s parlance. 3 SDs puts you at about 99.7%, which is ridiculously definite!
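If you'd like to check those SD percentages yourself, they can be worked out for an idealised bell curve (a normal distribution, which is an assumption that real data only approximately meet). A short sketch, with `share_within` as a made-up helper name:

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} SD: {share:.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```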

Imagine a diagram of confidence versus error; the Y-axis shows Error, measured as a percentage deviation (that is, how much it differs from the average), while the X-axis shows the confidence level, measured in those Standard Deviations. Remember, we choose our confidence level first, and then see where the error margins plot. This will give you an idea of how strong your result is. That is, the likelihood that you have made an observation of reality; your science has revealed a ‘truth’ about the world around us.
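That diagram can also be tabulated rather than drawn. The sketch below picks a confidence level first and then reads off the error margin, assuming bell-curve data with a spread (SD) of one minute. Both the spread and the `error_margin` helper are assumptions for illustration only.

```python
from statistics import NormalDist

def error_margin(confidence, sd):
    """Half-width of the central interval covering `confidence` of a normal distribution."""
    # Convert the two-sided confidence level into a number of SDs.
    z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
    return z * sd

sd_minutes = 1.0  # assumed spread of the timing data
levels = [0.68, 0.95, 0.997]
margins = [error_margin(c, sd_minutes) for c in levels]

for c, m in zip(levels, margins):
    print(f"{c:.1%} confidence -> plus or minus {m:.2f} minutes")
```

The higher the confidence level you ask for, the wider the error margin has to be. A strong result is one where the margin stays small even at a high confidence level.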

So, those studies that have low error margins at high levels of confidence, those are the ones we can be pretty darn certain are *likely to represent* the real world. The ‘Nobel Committee’ area of certainty represents experiments that start to demonstrate ‘theory’ – that summit of science where things are considered to be the closest thing that science has to ‘fact’.

Examples of things that fall into that ‘Nobel’ area: gravity being responsible for the apple falling from the tree; the Sun rising in the east and causing ‘daytime’; human influence on climate causing global warming. Yes, I said it. Ask the climate scientists – this is where the data lies.

So I hope this has helped you pick out the problem we scientists have in communicating ‘certainty’ to you. We’re never certain, we’re just certain within certain error bounds, at a confidence level of X.