Part 3 of our three-part assessment of the three critical institutions we follow brings us to Wine Enthusiast. (See Part 1 for our analysis of the San Francisco World Spirits Competition and Part 2 for the Beverage Testing Institute.) Once again, we ask: how stable and accurate are the scores from critics over time? One of the interesting things about Wine Enthusiast is that we move from a panel (the Beverage Testing Institute) and a rotating set of judges working over a concentrated period (the San Francisco World Spirits Competition) to the opinions of a single critic: Kara Newman (who took over from Paul Pacult in, we believe, 2010). This suggests that where panels and groups might disagree, we should expect some robust consistency from a single person. One thing we'll note upfront: Newman's reviews are a pleasure to read. They're accessible, often funny, peppered with actual drink recommendations, and generally more useful than many other reviews we see.
For the distillers, bottlers, and blenders of the world, it might suggest a bit more predictability in the results as well.
As before, we want to examine the body of critical work over time and see how the statistics match up.
Wine Enthusiast Ratings
Wine Enthusiast publishes its ratings of wine and spirits in a monthly print publication that is also available on its website. Kara Newman, a noted New York author, currently holds down the job of Spirits Editor (she has her own website as well). The magazine uses a point system running from 80 to 100 points: 80–84 points is regarded as "average"; 85–89 points is considered "very good / recommended"; 90–95 points is "superb / highly recommended"; and 96–100 points is "classic / highest recommendations." This follows a four-part structure similar to the other institutions.
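For readers who like to see the bands spelled out mechanically, the cutoffs above can be sketched as a small lookup. The cutoffs come from Wine Enthusiast's published scale as described here; the function name and structure are purely our own illustration, not anything the magazine uses.

```python
def we_band(score: int) -> str:
    """Map an 80-100 Wine Enthusiast point score to its descriptor band.

    Band boundaries follow the magazine's published four-part scale;
    this helper is a hypothetical illustration only.
    """
    if not 80 <= score <= 100:
        raise ValueError("Wine Enthusiast scores run from 80 to 100")
    if score >= 96:
        return "classic / highest recommendations"
    if score >= 90:
        return "superb / highly recommended"
    if score >= 85:
        return "very good / recommended"
    return "average"

print(we_band(92))  # superb / highly recommended
```

Note that the top band spans only five points (96–100) while the "superb" band spans six (90–95), which matters later when we look at how scores cluster.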
We’ll reiterate a caveat before we get into the business of stereotyping and grouping: Wine Enthusiast’s methodology allows, of course, a finely grained 21-point scale running from 80 to 100. While the magazine may group scores into bands for simplicity’s sake (and we follow suit), there is a big difference between a 90-point score and a 95-point score… yet for this article they appear virtually identical. That is misleading in its way, and we’d be remiss not to point it out. Onward we go!
Looking back at the last four years of scores, you can see the ratio of scores show a very high level of consistency.
Aside from a burst of generosity in 2010, only a very small percentage of spirits clear the 96–100 point threshold, making Wine Enthusiast the stingiest of all the competitions at the top. However, the next rung down shows that over half of all spirits make it into the 90–95 point category. That’s a lot, far more than at the other competitions. Meanwhile, roughly a quarter of the spirits fall within the 85–89 category, with a correspondingly tiny percentage in the 80–84 point category. In the parlance of Wine Enthusiast, well over half of all spirits they’ve reviewed rate at least “superb.”
Let’s turn our attention to individual spirits.
One of the interesting things is that the percentages are very consistent across classes. Though flavored vodka (always the unloved oddball child of the lot) scores dramatically lower in proportion to the other spirits, the ratios of scores are very close across classes. The other competitions tended to show more favorable scores and more deviation between classes. While one might argue that the Wine Enthusiast scores are inflated, one cannot argue that they give preferential treatment to one type over another. Another thing to note: while it might not be very difficult to get into the “superb” class of 90–95 points, it is very, very difficult to break into the hallowed territory of “classic.” One should pay attention to those highest ratings.
Normally we would look at volatility as well by examining ratings of identical spirits over time. Sadly, there were only 23 examples of repeated samples in the last four years of Wine Enthusiast scoring, not enough of a sample to comment on. For the record, and for what it’s worth, all 23 spirits were internally consistent save one.
What about comparing the judges from San Francisco with the Spirits Editor from Wine Enthusiast?
When we compared the judging panels at San Francisco with those from Beverage Testing Institute, we were struck with the sharp disagreement between the two judging bodies in many examples. Would the judges of San Francisco match up with the more generous ratings from Wine Enthusiast?
We looked back across the last four years at every label submitted to both the San Francisco competition and Wine Enthusiast (303 total bottles). Here we assume that the four bands of Wine Enthusiast scores correspond to the bronze, silver, gold, and double gold medals awarded at San Francisco. This is a bit dubious, since we’ve seen that 60% of Wine Enthusiast bottles land in the 90–95 band while only 20% of San Francisco entrants end up in the corresponding category. One would expect wide variance given the relative inflation between the two, and that’s exactly what we see. Here, “perfect consistency” is a match on the categories while “significant inconsistency” is two groups apart.
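The comparison logic just described can be sketched as an ordinal distance: align each body's four categories, then measure disagreement as how many groups apart the two ratings land. The band-to-medal alignment is the article's own assumption, and the names below are purely illustrative.

```python
# Ordered low-to-high; the pairing of Wine Enthusiast bands with
# San Francisco medals is the (admittedly dubious) assumption made above.
WE_BANDS = ["80-84", "85-89", "90-95", "96-100"]
SF_MEDALS = ["bronze", "silver", "gold", "double gold"]

def disagreement(band: str, medal: str) -> int:
    """Return how many categories apart the two ratings are.

    0 = "perfect consistency"; 2 or more = "significant inconsistency".
    """
    return abs(WE_BANDS.index(band) - SF_MEDALS.index(medal))

print(disagreement("90-95", "gold"))     # 0: perfect consistency
print(disagreement("96-100", "silver"))  # 2: significant inconsistency
```

Treating the categories as equally spaced ordinals is a simplification, of course, but it is the same simplification the category-matching comparison itself makes.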
Oddly, despite the notional inflation, the disagreement we see, while stark, is not quite as stark as that between San Francisco and the Beverage Testing Institute. Still, the fact that 1 out of every 5 labels tasted betrays such a big difference of opinion (while the two agree on only 1 in 4) is noteworthy. This is still a subjective game.
To us at Proof66, this furthers the notion that we’ve been maintaining: that when you do see agreement between critics, take note!
What this all means to you!
What does this mean for the consumer? Volatility and disagreement among critics does not mean you should ignore them. But it should add some color to the idea that a number is infallible; clearly, it is not. That you disagree with a number is fine, because the professionals themselves exhibit all kinds of disagreement. Do not make the mistake of thinking the numbers are meaningless… but understand that they’re just a guide.
What does this mean for the distiller or producer? If you want something in the 90s, send your bottle to Wine Enthusiast! If history is any guide, it’s like the easy class at college: hard to get the “A” but not so hard to get the “B.”
What does this mean for Proof66? Publicizing and highlighting the results of leading critical institutions will continue to be a passion for us. Using analyses like these will, we hope, help maintain the integrity of subjective insights from editors like Kara Newman. It should also help drive more producers to submit more frequently.
by Neal MacDonald, Editor
[Disclosures and notes: we are an independent, limited liability company with no affiliation with Wine Enthusiast or any other critical body; our opinions are our own. All scores noted here were compiled from the results made public at Winemag.com. While we believe our data are complete and accurate, any errors or omissions are unintentional and ours alone.]