IO9 Says that “Wine Tasting is Bullshit.” Really?

by Neal Duncan MacDonald, Editor

IO9 is (primarily) a technology and science-fiction blog that entered the beverage world with an article by Robert Gonzalez proclaiming: “Wine tasting is bullshit. Here’s why.” In fact, Gonzalez openly wonders whether wine tasting and the ratings that go with it are “bullshit” or “complete and utter bullshit.” To clarify his position, Gonzalez asks the rhetorical question: can a person consistently rate the quality of wine? “No. You can’t.”

Mr. Gonzalez: thank you for that kick in the nuts! After 4+ years and hundreds of hours diligently monitoring spirits ratings—not to mention tasting and re-tasting hundreds of different styles of spirits and interviewing experts from distillers to press—we feel compelled to respond.

First, we grant that we’re dealing with spirits rather than wine. Maybe there’s more opportunity for bullshit in wine ratings, but we rather think not. So while we speak from a spirits perspective, we feel our argument applies across the board.

The IO9/Gonzalez argument runs on the following points:

  • Experts often differ. As IO9 faithfully reports via the Wall Street Journal, Robert Hodgson (statistician) ran the numbers and saw a high degree of volatility between expert panels from one competition to the next. The analysis also showed high volatility for the same label across different years of expert panels. Even when the same bottle was given to blindfolded experts, their scores varied by +/- 4 pts on an 80-100 pt scale. In his mind, all this volatility = bullshit.
  • Experts can be easily duped in embarrassing ways. Frédéric Brochet played a trick on 54 wine experts by dyeing a white wine red and presenting the same wine in colored and un-colored versions, asking for a judgment between the two. No one noticed that they were the same wine. Similar experiments were repeated by other reporters in other years. These duped individuals are experts and therefore all expert opinion is bullshit.
  • At least one expert admits it’s bullshit. Joe Power of Another Wine Blog is quoted as saying “[Our reviews] are bullshit too. It is just the nature of the beast.” One guilty expert = all experts guilty of being bullshit artists.
  • Gonzalez takes exception to a few other facets as well, but this is the gist: score volatility, high-profile gotchas, and self-admissions.

    We take this rather seriously because it calls into question an entire industry of ratings and expert panels, not to mention our own personal judgments. The Wall Street Journal article is actually fascinating and in-depth, filled with quotes from the wine experts themselves, many of whom agree that there is some variability. This supports our own interviews with judges, who often express worry about palate fatigue and generally getting things right. There is room for subjective opinion; there is room for concern. But “concern” is a long, long way from bullshit and certainly not in the realm of complete and utter bullshit.

    Gonzalez is drawing bad conclusions from inconclusive data and here’s why. (Or, in other words, spread your legs Robert because we’re going to try and kick you right back!)

    Technical Defects versus Subjective Taste

    Even if blogger Joe Power disagrees, the vast majority of people we’ve personally interviewed take spirits classes and try very hard to get things right. The classes teach taste recognition, not opinion. They train students to look for elements of flavor from the base distillation (say corn in bourbon or grape in cognac) as well as the background flavors offered by fermentation. We’ve been to laboratories where they’ve isolated different kinds of alcohol and put them in vials: some reek of banana, some chocolate, some literally of cow manure… but all mix together in interesting ways and create different aromas and flavors that form the background of a spirit.

    The colored wine stunt aside, anybody who’s ever tasted the German dessert wine Gewürztraminer would never, in a million years, mistake it for a dry Pinot Noir red wine. Nor would they mistake an Islay scotch for a blanco tequila if they’d ever had one before. It’s absurd. Beyond these extreme examples, experts train themselves to detect more finely graded flavors, particularly those signifying styles of distillation or quality of ingredient.

    Or, to be more blunt, defects.

    Distillation is the art of claiming particular kinds of alcohol molecules in the distillate. It is very easy for the trained palate to taste defects if the spirit includes the wrong kind of molecule. We’ve done it. We’ve seen others do it. You, too, can witness this in action if you look up the MythBusters episode in which spirits expert Anthony Dias Blue absolutely nailed a vodka challenge that the MythBusters themselves failed. It’s not because he’s rating tastes… he’s merely recognizing the defects.

    This has little to do with the expert and whether or not they like something. Technical competence is worth the first (say) 10 pts on an 80 – 100pt scale but after that you start getting into subjective territory. This explains a lot of the variability between experts.


    Where Mr. Gonzalez would have you believe ratings are worthless, the consumer should feel confident that a decent rating signals a product free of obvious defects, which gives some confidence in the purchase. When the experts start rhapsodizing about different flavors and how much you should like it? Here you should expect some play in the numbers. The 80-90 point range is for objective judgment; the 90-100 point range is for subjective opinion, and that is variable for any number of reasons but need not necessarily equal bullshit.


    Maybe it’s not the experts who are variable

    We’ve noticed volatility in spirits results. In fact, we specialize in it. For example, Buffalo Trace Bourbon is all over the map at the San Francisco World Spirits Competition. According to our tracking of the competition’s results, you can see a huge degree of volatility between the bronze, silver, gold, and double-gold medals from year to year (roughly corresponding to each of the 5pt groups in a more typical 80 – 100 point scale):


    [Chart: Buffalo Trace Bourbon, SFWSC performance by year]


    But there’s a lot going on in bourbon production. Different barrels create different flavors in aging whiskey. Different batches of distillation yield different mixes of alcohol molecules. The grain used from different seasons will have slightly different flavors. All this to say: one barrel of whiskey can taste vastly different from another even when produced at the same distillery from the same ingredients, overseen by the same master distiller, and aged the same number of years. Barrels of whiskey are not widgets. It’s just like tomatoes coming out of your backyard: they can taste different from year to year, plant to plant, or even vine to vine.

    The art in making a good whiskey product is bottling that product consistently. A master distiller blends an inventory of hundreds of different barrels to try to approach a consistent flavor profile. It’s an incredibly daunting task when you have to fill thousands of bottles to satisfy thousands of retailers every quarter. Is it any wonder there’s some variation from year to year or batch to batch?

    In our opinion, then, it is not correct to rail at the inconsistency or pay too much attention to one year’s scores but rather look for consistency across several years (sometimes regardless of the score). Check out the volatility of the Yamazaki 18yr whisky from the same competition, same judges, over the same years compared to that same Buffalo Trace.


    [Chart: Yamazaki 18yr compared to Buffalo Trace, SFWSC performance by year]


    Way more consistent! For 6 out of the 8 years of competition results, the Yamazaki 18yr was not only very consistent but it was consistently rated with the highest accolades. (In 3 of those years, it won best overall for international whiskies.) That’s a stunning performance, and one that would-be consumers looking for a new whiskey to try should take note of. If you look at only one year’s results (say 2009), you’ll be fooled into thinking they’re equal. If you look at 2006, you’d say Buffalo Trace is better. But looking at the whole history of scores, you see strikingly consistent high results from one and a hit-or-miss product with the other.

    In short, it is wrong in the extreme to instantly assume that volatility in results is the fault of the expert rather than the producer.

    Mr. Gonzalez would have you believe high ratings are inconsistent and worthless. In reality, some small number of spirits regularly receive high marks or at least very consistent marks… these are extremely helpful to consumers looking to buy a special edition or prestige spirit with confidence. We have discovered spirits we would never have otherwise tried but for high marks from competitions (for example, St Germain Elderflower liqueur).

    The alternative is marketing and advertising

    The very reason Proof66 exists and carefully tracks competitive results is to try and make them meaningful to consumers. We do this because the alternative, as far as we can tell, is getting a rap star to mention a label in a piece of “music.” Or, in other cases, getting a slinky model to casually hold a bottle strategically in front of a daring mini-skirt. Or to blast us with advertising campaigns about pirates. Or to put it in a bottle that is very, very fancy or very, very unusual. Or (our favorite) just raise the price enough to make people think they’re drinking quality.

    That’s just marketing. That, Mr. Gonzalez, is true bullshit.

    Being confronted with a wall full of vodka (or wine or whatever) with nothing to go on but Yelp and Amazon ratings, plus whatever celebrity the marketing spin could costume up to help you make a decision, takes any semblance of sanity out of the buying process. Expert opinions in the wine and spirits world aren’t perfect. That’s because the judges aren’t perfect; the producers aren’t perfect; and Mother Nature herself isn’t perfect. But it’s the same case for:

  • Cheese
  • Barbecue sauce
  • Star Trek episodes
  • James Bond movies
  • Eggs
  • American Idol performances
  • Salmon
  • Ralph Lauren perfume
  • Calvin Klein jeans
  • Pablo Picasso paintings

    Anything in the artistic world produced by real people is going to have some variation. If you want absolute perfect consistency, then wait until McDonald’s hamburgers achieve 100% synthetic ingredients, made by 3D printers using protein as raw material and made to order by robots. But the act of assessing is part of the fun. Discovering new flavors is fun. Having experts tell you that a bottle of whiskey is good, just as they might tell you about a movie or a book or a painting or a pizza, allows you to explore, react, and decide for yourself.

    At the end of the day, experts are only good at helping you, the individual, discover a spirit. Their opinion is worth something only until and exactly at the point you taste it for yourself. Then you decide and all the critical opinion in the world isn’t going to change yours. And that’s fine! The point is, you’ve been given a reason to discover and a better way to filter than random chance or a rap star’s music video. 
