You Should NOT Only Buy Cheap Liquor

by Neal MacDonald, Editor

Business Insider published an article called “Why You Should Only Buy Cheap Liquor,” written by Ben Taylor of FindtheBest. The article came to the following conclusion (our emphasis):

We compiled ratings and awards from the San Francisco World Spirits Competition, Wine Enthusiast Magazine, and the Beverage Testing Institute. We rated each bottle based on its performance at these competitions. Unlike grocery-store consumers, who tend to believe more expensive liquor tastes better, the experts at these competitions taste each bottle blind. They have no idea what the price of the liquor is, and can give a more unbiased rating as a result…

There's essentially zero correlation between price and quality.

They even made a nice chart graphing out scores to prove their point.

This is an entirely inappropriate conclusion. Here at Proof66, we have read every result from these judging institutions for the last 8 years. We have interviewed the judges. We have run our own statistical analyses on a variety of subjects. We have run the numbers to hone a very complex algorithm to address issues like consistency in scores and biases towards one particular style of spirit versus another. We have spent hours every single day for the last 5 years thinking about these numbers and what they mean. In short, we feel there are only a handful of people on this planet who know more about the scores from these judging institutions than we do.

So with that as context, here is why this conclusion is entirely inappropriate.

1)      Their numbers don’t work. The “smart rating” that Business Insider is using is based upon the numbers from FindtheBest… a review aggregation service. We don’t know anything about the algorithm that they run - we’re sure it’s proprietary - but at a glance we can determine that it doesn’t correct for issues that we find important: that is, consistency in scoring and spirit type. Why does this matter? Because they appear to make no distinction as to when an award is given, by whom, or how the bell curve operates within the statistical set of judging results. They treat a double-gold medal result from San Francisco in 2013 as worth the same as one from 2012, or 2011, or any other year; but it’s not really the same. They treat a double-gold medal awarded in scotch as the same as one awarded for vodka or rum; but they’re not really equal. High scores from different institutions are given similar weight; but those judging criteria are not really equal. They also appear to be taking a general average of reviews, which means spirits are punished for scoring high in one competition but low in another, when they could have rested on their laurels after the first result. In short, these figures are treated far too lightly, almost like batting averages, with little understanding of how the subjective opinions of the judges come through in the numbers.

2)      Even if their numbers did work, the conclusion still doesn’t. Even if they used our algorithm at Proof66 they would find only a weak correlation between price and quality. But theirs is still an inappropriate conclusion. If it were true, we’d have made similar proclamations ourselves; but we don’t. And here’s where our own algorithm has to yield to some objective statistical and marketing issues, and why we go to such pains to color it with user reviews:

  • Marketing Stunts. These judging competitions are not like the federal NHTSA that crash tests every single car every year. These judging entities are marketing companies where producers voluntarily submit their products for assessment. They do this at their whim and at their discretion. If they get a good result, they may never submit again. If they get a bad result, they may never submit again. If they sell a lot of their stuff (say, Grey Goose) they probably feel very little need to submit to judging in the first place… why would they? Business Insider has published a statistical analysis that pretends to broad and universal coverage when that’s not at all the case. The population of all spirits is not even remotely properly sampled.
  • Not Recommended. In addition to the poor sampling, because these are marketing companies acting as judging institutions, when they run across a spirit that they wouldn’t recommend drinking - the “not recommended” category - they simply decline to give it a score and it’s as if it never happened. In 2013, something on the order of 10% of all spirits entered in the competition simply went unacknowledged in the results. In essence, there is an entire class of critical scores entirely unrepresented in the statistical analysis that Business Insider published… it’s as if anyone who took the SAT and scored below an 800 combined were simply lopped off and the average reported only on those who scored above 800. It’s foolish and misleading.
  • Limited Editions and Expensive Editions. Every year, there are a number of limited edition bottlings. They may come from a specific warehouse, be in a specific style, or use a specific aging technique, or any of a hundred other ways that liquor might be produced. These bottlings are seldom entered in a judging competition - due to limited supply - or if they are, they are entered a single time within a single year. This makes them eligible for a single year’s analysis but entirely inappropriate for a universal analysis. Pile onto that the fact that expensive spirits are seldom entered into competition precisely because these competitions require a number of bottles to sample. Why would you send $10,000 worth of liquor to a judging outfit routinely? Only if you thought that there was a commercial payoff… and this is often far from the case. What this means is that some of the very best and/or most expensive spirits in the world are not represented at all in this analysis that is purported to test correlation between price and quality.
  • Other Reasons to Buy. People buy spirits for any number of reasons and they don’t always have to do with quality… that’s OK. There are cool looking bottles (Crystal Head Vodka); celebrity endorsed labels (Casamigos Tequila); spirits from a particular locale that bring excitement or nostalgia (Reyka Vodka). They may involve gold flakes (Goldschläger); promise aphrodisiac qualities (Schwartzhog); or involve certain charitable work (Heroes Vodka). They may even invoke childhood icons (Three Olives Marilyn Monroe Vodka) or childhood tastes (Tartz Vodka). It may even be a little bit of history (Mackinlay’s Shackleton’s Scotch Whisky) or just remind you of a favorite golf course (Bruichladdich Links St Andrews Scotch Whisky). Suggesting one buy spirits based only on price vs quality is to completely ignore an entire dimension of spirits: the theatrical. Theatrics is a big part of why people drink and why it is entertaining in the first place.
  • Unusual Flavor Profiles. Finally, judges judge based upon their experience and how they’re trained. There is constant tension between the judges’ respect for the “old” and traditional and their resistance to the nouveau. We see this all the time with new flavor profiles (the modern gin movement) and interesting ingredients (just about any liqueur from the Far East). Many judges have an idea of what a spirit is “supposed” to taste like and subjectively demote spirits that don’t fall within those constraints. Think how hard it is for an art critic to accept a new style of painting or a sports professional to accept a new style of play. They always resist. Liquor is no different. New spirits are always facing an uphill battle in these kinds of aggregation exercises.
In short, there’s nothing that makes sense about the conclusion from Business Insider when you look a little more closely at what’s going on.
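
To make the “Not Recommended” problem concrete, here is a minimal Python sketch of what happens to an average when the lowest results are never published. The ten scores and the cutoff of 80 are made-up numbers chosen only for illustration (one entry in ten falls below the cutoff, roughly matching the 10% figure above); they are not real competition results, and `published_scores` is a hypothetical helper of ours, not anything the judging institutions or FindtheBest actually run.

```python
from statistics import mean


def published_scores(scores, cutoff):
    """Keep only scores at or above the cutoff.

    This mimics a competition that simply publishes nothing for
    entries it would not recommend (a hypothetical rule, for illustration).
    """
    return [s for s in scores if s >= cutoff]


# Hypothetical scores for ten entries; one of them (62) is "not recommended".
all_scores = [62, 80, 82, 84, 85, 86, 88, 90, 92, 95]

published = published_scores(all_scores, cutoff=80)

true_mean = mean(all_scores)        # average over everything that was judged
published_mean = mean(published)    # average over what the public gets to see

print(f"entries judged:    {len(all_scores)}")
print(f"entries published: {len(published)}")
print(f"true average:      {true_mean:.1f}")
print(f"published average: {published_mean:.1f}")
```

With these invented numbers the published average comes out higher than the true one (about 86.9 against 84.4) simply because the weakest entry vanished. Any analysis built only on the published results inherits that tilt, whatever it then tries to say about price.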

So why do we do it here at Proof66?

Partly, because we believe in judging and critical assessment as a guide. We think that by raising the profile of these competitions they gain more market impact and encourage more participation. More participation will help supply a sample that truly represents the population… never perfectly, but better. We also think it happens to be better than the alternative, which is guesswork, advertising, and label appeal. Finally, we go through a lot of effort to try and balance out how those aggregated results appear so that they showcase the spirit in the most appealing way.

But most importantly, these scores are meant to guide an exploration of spirits. These results are designed primarily for the middle-market: those spirits in the $30 - $100 category that serve as the next rung up for people reaching out from big market brands. These scores are excellent at finding value in the middle market but a very poor assessment of the truly elite spirits. That’s the next level of connoisseurship when one steps beyond the middle-market bracket.

Business Insider did this because it’s easy. But it’s also very sloppy. A better conclusion would’ve been “How judging can help find great value in the middle-priced brands.” But that’s not at all as sexy a title as “Why You Should Only Buy Cheap Liquor.”

Here, Business Insider fell prey to the siren call of the sexy but untrue headline.