People are quick to point out that statistics can be deceptive, malleable, or unreliable. We hear this so much, I think people start to doubt that statistics have much value at all.

This is extremely unfortunate. Statistics are one of the most valuable tools at a marketer’s disposal.

Statistics free you from having to rely on guesswork, rules-of-thumb, hearsay, and the marketing advice echo chamber. You don’t have to look at industry benchmarks that may or may not hold true for you. You don’t have to shoot from the hip.

With statistics, you can answer questions you couldn’t otherwise answer. You can conduct your own experiments, come to your own conclusions, and learn about the specific things that matter to your business based on data from your unique customers.

I think it’s fair to say that many of us didn’t become marketers because we love math (though some did!). Some of us are intimidated by it and do pretty much everything we can to avoid it.

Trust me, I’m not naturally gifted at math. But, I was able to calculate the statistics I discuss here (by hand) in college. Some very informative statistics are actually very, very simple to understand and calculate. Where it’s not quick and easy to calculate, people have created tools to help.

There is little to fear and so much to be gained. Even if you have no intention of jumping in and using statistics to understand and improve your marketing processes, it’s still worth reading through this guide so that you understand what people are saying when they use these statistics. As much as possible, I used simple language and avoided getting caught up in unnecessary detail. That also means you should look at this post as an introduction to these topics, and I urge you to learn more about the statistics that seem valuable to you.

Averages are used constantly in marketing. Everyone understands them and they’re a great way to summarize a lot of data. But an average can be misleading. So much information is lost when many numbers are boiled down to a single number, particularly when those numbers have a wide range.

Averages become more informative when they’re combined with another statistic called a standard deviation. Knowing the standard deviation tells you how much variability there is.

Variability matters. If you hear that a company’s average starting salary is $90k, you might get excited. But, if you learn that starting salary can vary by $50k, you might be a little more hesitant to apply.

A standard deviation is a way of saying, “On average, there is this much variability.” Or, put another way, the deviation from the average is usually this much.

If I tell you Company A’s average order size is $200, you get a very different picture than if I say Company A’s average order size is $200 with a standard deviation of $175.

That’s roughly another way of saying, “Company A’s orders usually range from $25 to $375 and the average is $200,” but it gives you a much more accurate picture of what those orders look like day to day. They get small orders and they get big orders. The orders average out to $200, but they’re all over the place.

Company B might have an average order of $200 with a standard deviation of $5. Their order size barely varies at all.

Both Company A and Company B have an average order of $200 but their orders are vastly different and only the standard deviation reveals that.
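To make the comparison concrete, here is a minimal Python sketch. The order values are made up for illustration (they are not real data from any company) and were chosen so that both lists average exactly $200. The variable names are hypothetical too.

```python
from statistics import mean, stdev

# Hypothetical order sizes in dollars, invented for illustration only.
company_a_orders = [25, 50, 400, 375, 150, 200]    # small and large orders mixed together
company_b_orders = [195, 205, 200, 198, 202, 200]  # orders cluster tightly around $200

for name, orders in [("Company A", company_a_orders), ("Company B", company_b_orders)]:
    # stdev() is the sample standard deviation (it divides by n - 1).
    print(f"{name}: average ${mean(orders):.2f}, standard deviation ${stdev(orders):.2f}")
```

Both lines of output report the same average, but the standard deviation for the first list is far larger than for the second. That gap is the part of the story the average alone hides.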

Standard deviation is useful for understanding:

- Time on page – Time on page is communicated as an average but if there is a large standard deviation, it tells you that visitors are engaging with that content in very different ways. Some people might be leaving shortly after landing there while others are staying an inordinate amount of time. A large standard deviation might prompt you to investigate why visitors are behaving so differently on that page.
- Conversion rate by source – Our conversion rate is usually given as an average across traffic sources, but you get a much clearer understanding when you know how much variability there is between those sources. Some sources might convert at 6% and others at 0.5%, so stating that your average conversion rate is 3% with a standard deviation of 1.5 percentage points provides a lot more context.

Calculating standard deviation can be cumbersome but there are online calculators that make this much easier.
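If you are curious what those calculators are doing, the by-hand steps fit in a few lines of Python. This is a sketch under a couple of assumptions: it computes the sample standard deviation (dividing by n - 1, which is the usual choice when your data is a sample of a larger population), and the time-on-page readings are invented for illustration. The function name is hypothetical.

```python
import math
from statistics import stdev


def standard_deviation_by_hand(values):
    """Sample standard deviation, spelled out step by step."""
    n = len(values)
    average = sum(values) / n                            # 1. find the average
    squared_gaps = [(v - average) ** 2 for v in values]  # 2. square each value's gap from the average
    variance = sum(squared_gaps) / (n - 1)               # 3. average the squared gaps (n - 1 for a sample)
    return math.sqrt(variance)                           # 4. take the square root


# Hypothetical time-on-page readings in seconds, invented for illustration.
seconds_on_page = [12, 15, 240, 18, 310, 9, 205]

print(round(standard_deviation_by_hand(seconds_on_page), 1))
print(round(stdev(seconds_on_page), 1))  # Python's built-in version, for comparison
```

The two printed numbers should match, since `statistics.stdev` follows the same steps. In practice you would use the built-in function (or a spreadsheet) rather than writing your own.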

We often hear that correlation doesn’t equal causation. This leads to a lot of people rejecting correlation data wholesale, which is very unfortunate. Just because there isn’t a perfect causal relationship doesn’t mean there is no relationship at all, or that a correlation is meaningless.

Correlation is an immensely important statistic for a marketer to understand because it indicates whether there’s a relationship between two variables *and* the strength of that relationship. It’s true that a correlation doesn’t prove causation, but it is still an immensely useful indication of where you should investigate to find the cause.

Correlation can help you answer questions like:

- Are people who attend a particular webinar more likely to purchase than people who don’t?
- Is the length of your content related to how many social media shares it receives?
- Are customers who get in touch with support more or less likely to be customers after a year?

You may think these are valuable, but unanswerable, questions. Something only a data scientist could tackle. In reality, calculating correlations is relatively simple (we would calculate them by hand in college) and it’s made even easier by Excel which includes a CORREL function for just this purpose.

Correlations are measured on a scale of -1 to 1. A negative number indicates a negative correlation (when one thing increases, the other decreases) and a positive number indicates a positive correlation (when one thing increases, so does the other). The closer the correlation coefficient is to -1 or 1, the stronger that relationship is. A correlation of 1 or -1 means that the two sets of data are 100% correlated (it also means you probably made a mistake in your calculations).
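For readers who would rather see the arithmetic than a spreadsheet formula, here is a minimal Python sketch of the Pearson correlation coefficient, which is the quantity Excel's CORREL returns. The function name and all four data lists are hypothetical and invented for illustration. The sketch assumes each list contains at least two values that are not all identical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their averages.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own (zero if all values are identical).
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)


# Hypothetical data: longer posts tend to get more shares here.
word_counts = [400, 650, 900, 1200, 1500, 2100]
social_shares = [12, 20, 18, 35, 41, 60]

# Hypothetical data: more emails per week go with lower open rates here.
emails_per_week = [1, 2, 3, 4, 5, 6]
open_rates = [0.42, 0.39, 0.33, 0.30, 0.24, 0.21]

print(round(pearson_correlation(word_counts, social_shares), 2))  # positive
print(round(pearson_correlation(emails_per_week, open_rates), 2))  # negative
```

The first result is positive because both variables rise together. The second is negative because one rises as the other falls.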

If I found that the correlation between images in emails and clickthroughs was 0.85, that would indicate a relatively strong positive relationship between those variables. It suggests that including images in my emails is worth testing further.

Correlation is useful for:

- Understanding the factors related to conversion events. Are customers who are exposed to certain resources, email campaigns, web pages, and other content more likely to make a purchase?
- Making data-backed decisions rather than guesses or following the crowd. There’s a lot of advice and rules of thumb we accept on face value. With correlation, you’re able to see whether that advice holds true for you (and how true it is).
- Avoiding confirmation bias. The human mind isn’t “built” to make sense of massive amounts of data. We’re prone to rules of thumb and oversimplification. What we observe, and the conclusions we come to, can mislead us. Correlation gives us a way to test whether what we’ve observed is actually “in the data.”

To get started with calculating correlation, I recommend this guide to Excel’s CORREL function.

Let’s say you’ve calculated a strong correlation but you have a relatively small data set. Can you draw a conclusion based on what 20 or 30 customers did? How can you know whether you’re drawing conclusions based on random events or if there’s actually a relationship there?

With statistical significance, you can put a number on how confident you should be in a statistic.

Statistical significance is often expressed as a percentage. In an academic setting, a less than 1 in 20 (less than 5%) chance that what you’ve observed is the product of randomness is considered acceptable. The closer that chance is to 0%, the more confident you can be that what you’re observing is not random.

If that chance is more than 5%, you probably shouldn’t rely on the statistic as fact. The higher that number is, the less you should trust it. In business settings, there’s often more tolerance for error than in academia so, in general, you shouldn’t dismiss a statistic just because it has a more than 5% chance of error.
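One way to estimate that chance for a correlation from a small data set is a permutation test: shuffle one of the two columns many times and count how often the shuffled data produces a correlation at least as strong as the one you actually observed. The sketch below is only one possible approach, chosen here because it needs nothing beyond Python's standard library. It is not necessarily the method a given guide or calculator uses. The function names, the 2,000 shuffles, the fixed seed, and the twelve-customer data set are all assumptions made for illustration.

```python
import math
import random


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)


def permutation_p_value(xs, ys, shuffles=2000, seed=0):
    """Estimate how often pure chance yields a correlation this strong."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(pearson_correlation(xs, ys))
    shuffled_ys = list(ys)
    at_least_as_strong = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled_ys)  # break any real link between the columns
        if abs(pearson_correlation(xs, shuffled_ys)) >= observed:
            at_least_as_strong += 1
    # Adding 1 to both counts keeps the estimate from ever being exactly zero.
    return (at_least_as_strong + 1) / (shuffles + 1)


# Hypothetical data for 12 customers, invented for illustration.
support_tickets = [0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 7]
months_retained = [24, 22, 23, 20, 18, 17, 15, 14, 10, 11, 8, 5]

p_value = permutation_p_value(support_tickets, months_retained)
print(f"Estimated chance this is random: {p_value:.2%}")
```

A result below 5% would clear the academic bar described above. Whether a higher number is still good enough is a business judgment.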

On MarketingLand, Benny Blum relays an excellent anecdote around this point:

A few years ago, I got into a conversation with an old professor about business statistics. I asked him “what’s my target confidence interval for making a good business decision based on statistics?” His response has stuck with me for years as a reminder that business isn’t academia because there’s money to be made on each decision.

He said, “In academics we strive for high confidence intervals because our success hinges on proving a point. In business, if you tell me I’ll have a 51% likelihood of achieving more success with option A over option B then I’m likely to prefer option A.”

Basically, your tolerance for error is a personal thing. You have to decide how much faith to put in the numbers based on the risks involved with the decision you’re making. I think most of us would be better off going where our statistics guide us than dismissing statistics that don’t meet the stringent standards set by academia.

To get started with calculating statistical significance, I recommend this excellent WikiHow guide.

In spite of their negative reputation, statistics are valuable tools for helping us make sense of data that is otherwise difficult to interpret. The more you understand about statistics, the more comfortable you’ll be leveraging them in your business. I hope that this was a useful primer to some common statistics. If you’d like us to cover more of this content in the future, please let us know in the comments.