Online Advertisements and Statistical Analysis

Quite a few years ago, I was in the online advertising business. My team and I created banner ads to run alongside web site content, to entice viewers to click on ads and find out more about advertiser offers. We scheduled ads to run alongside specific content. We targeted ads towards users in specific geographic regions, thanks to a browser cookie that told us their ZIP code. And we constantly managed inventory.

Although we made animated ads, we avoided anything that blinked. There were no monkeys to punch.

Click-Through Rate (or Click-Thru Rate or CTR) was a key measurement of an ad’s success. At first we would sometimes see CTRs between 1% and 2% (meaning that an ad was clicked between 10 and 20 times out of every 1000 views, or impressions). But as online advertising proliferated, and as our systems got better at filtering out false impressions and clicks from various robots, crawlers, and spiders, CTRs trended much lower: 0.25% suddenly looked good, and 0.10% was not uncommon in some cases. That’s 1 click for every 1000 impressions.

That’s why we were insanely interested in a blog post we found, now presumably lost to the ages, that ran a set of 6 banner ads, which varied only slightly, and analyzed the results to determine what aspects of the ads could improve CTRs. Did including the phrase “click here” really help? If the words “click here” were in blue and underlined, like a typical web link, would that improve the CTR?

They ran 30,000 impressions of each ad. I don’t recall if they set a frequency cap (wherein you limit the number of impressions a specific viewer sees); I don’t believe they did. I don’t recall exact details of each ad or the results, but it looked something like this:

Ad for amazing offer: 0.27% CTR
Same ad with “click here” in black on the left side of the ad: 0.28% CTR
“Click here” in black on the right side of the ad: 0.30% CTR
“Click here” in blue on the right side of the ad: 0.32% CTR
“Click here” underlined in blue on the right side of the ad: 0.33% CTR
“Click here” in blue, inside a button on the right side of the ad: 0.36% CTR

Naturally, the result of this was that all of our ads soon had a gray button in the lower-right corner with the words “Click Here” in it, underlined and in blue:

Example banner ad

I was a bit skeptical, though. Could we really say from 30,000 impressions that a 0.33% CTR is significantly different from 0.36%? It’s a difference of 9 clicks. Could that be attributed to random chance?
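
For reference, the click counts behind those percentages can be recovered with a little arithmetic. This is a minimal Python sketch, assuming the approximate CTRs listed above and exactly 30,000 impressions per ad; the function name is made up for illustration.

```python
def clicks_from_ctr(ctr_percent, impressions):
    """Convert a CTR given in percent into a whole number of clicks."""
    return round(ctr_percent / 100 * impressions)

IMPRESSIONS = 30_000  # impressions per ad, as recalled in the post

underlined = clicks_from_ctr(0.33, IMPRESSIONS)  # underlined blue "click here"
button = clicks_from_ctr(0.36, IMPRESSIONS)      # blue "click here" in a button

# Print both click counts and the gap between them
print(underlined, button, button - underlined)
```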

I don’t have a background in statistics, so I asked my father, a scientist. He handed me a 1000-page epidemiology textbook and said–and I love this part–that a banner ad click is a lot like a disease state: an individual either has the disease (a click) or does not have the disease. Needless to say, I didn’t make a lot of headway into the world of epidemiology, but the question still troubled me.

Now I am taking an introductory class on statistical analysis, and although my analysis may oversimplify things greatly, I think it is safe to say that we should not have concluded that every ad needed a button-like box with the words “click here” in blue in the lower-right corner.

If we look at any one of the banners in isolation, the CTR is really just a sample ~~mean~~ proportion [thanks to Patrick for the correction]. We could run millions of impressions of the same banner–would it have the same CTR? What is the standard error? To find the confidence interval for the best performing ad, we can run it through this equation:

p +/- z*sqrt((p(1 - p))/n)

where p = 0.0036, z = 1.96 (for 95% confidence), and n = 30,000.

The result? 0.36% +/- 0.07%. We are 95% confident that the true population CTR is somewhere between 0.29% and 0.43%. Well–yikes! I’m 95% confident that our measurement isn’t very precise. When we’re dealing with such low proportions, we could really use more precision. We would need to run a test with more than 30,000 impressions.
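
The same calculation can be repeated for all six ads. Here is a rough Python sketch of the formula above, assuming the approximate CTRs I listed earlier and 30,000 impressions each; the function name is my own invention, not something from the original study.

```python
import math

def ctr_confidence_interval(p, n, z=1.96):
    """Return (low, high) for p +/- z*sqrt(p(1 - p)/n)."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

N = 30_000  # impressions per ad (assumed identical for every ad)
ctrs = [0.0027, 0.0028, 0.0030, 0.0032, 0.0033, 0.0036]  # approximate CTRs from the list

for p in ctrs:
    low, high = ctr_confidence_interval(p, N)
    # Show each observed CTR next to its 95% confidence interval
    print(f"{p:.2%}: {low:.2%} to {high:.2%}")
```

For the best-performing ad, the interval is wide enough to contain the observed CTRs of the next three ads (0.30%, 0.32%, and 0.33%).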

What if we wanted to run a test where the 95% confidence interval was just one one-hundredth of a percent (0.01%) wide? In other words, CTR +/- 0.005%? We have an equation for that too:

n = (z^2*p(1-p))/e^2

where n = desired number of impressions, z = 1.96 (for 95% confidence), p = 0.0036, and e = 0.00005 (the margin of error, 0.005%).

To be 95% confident that our sample CTR is within 0.005% of the population CTR, we would need to run about 5.5 million impressions (5,511,990 with the values above).
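
The sample-size equation is easy to check in code. This is a minimal Python sketch under the same assumptions (p = 0.0036, z = 1.96, and a margin of error e = 0.00005); the function name is hypothetical, and rounding up to a whole impression is my own choice.

```python
import math

def required_impressions(p, e, z=1.96):
    """Smallest whole n for which z*sqrt(p(1 - p)/n) is at most e."""
    return math.ceil(z**2 * p * (1 - p) / e**2)

# Best-performing ad's CTR, with a margin of error of 0.005%
print(required_impressions(p=0.0036, e=0.00005))
```

With these inputs the answer comes out to roughly 5.5 million impressions. Because e is squared in the denominator, halving the margin of error quadruples the number of impressions needed.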

Although the data presented in that blog post from years ago seemed compelling, I think I was right to be skeptical. As I said, this is based on what I’ve learned from an introductory course on statistical analysis. If you think I’m way off base, feel free to enlighten me in the comments.

3 thoughts on “Online Advertisements and Statistical Analysis”

Chris Herdt says:

16 May 2011 at 9:30 pm

On the advice of a friend, I also analyzed these data using a chi-square test in “Online Advertising Click-Thru Rates, Revisited.”
Chris Herdt says:

26 Jun 2021 at 6:45 pm

I believe the journal article that inspired this was “The Impact of Content and Design Elements on Banner Advertising Click-through Rates” in the December 2003 edition of the Journal of Advertising Research, authors Ritu Lohtia, Naveen Donthu, and Edmund K. Hershberger.

However, the analysis above is not based on the data presented in the journal article and should not be taken as such.
