Bracketeering Finale: Much ado about nothing or A tale of four regions

By: Richard W. Sharp. Graphics: Patrick W. Zimmerman.

Let us sit upon the ground and tell sad stories of the death of brackets. April is the month when trash talk ends and the cold reality of reversion to the mean takes hold. In the end, the Infallible Brackulator did OK, in spite of the Wild West and Rebellious South, because the powerhouses of the East and Midwest marched relentlessly to the inevitable conclusion.

In what was almost an afterthought, Villanova trounced Michigan, but this will be forgotten. 2018 will be remembered because David slew Goliath.[1] It is an event destined to spawn a million foolhardy brackets in the years to come.[2] What do future brackets really think of 1-135 odds? I asked one and it said: “Let me be that I am and seek not to alter me.”

Me: Huh?

Bracket: Less talk, more chalk!



D’oh!

First things first, it’s time for a serious mea culpa. The algorithm used to produce our published brackets was wrong. Not wrong in the “it didn’t produce a perfect bracket” way, but wrong because it didn’t follow the rules we said it did. Through happy circumstance, it was a case where two wrongs made a right. Both the streak feature and the injury penalty against Virginia were implemented incorrectly, and together this led to… a penalty against Virginia that paved the way for Villanova.

The fix is in: for this final installment of 2018 madness measuring we’ve fixed the bugs and generated 800 new brackets per scenario. The charts and conclusions below stem from these new brackets.


So how’d we do?  

Not too bad. Better than chalk (72 points), better than Obama (56 points – ouch!), and 4th in the local pool (3rd loser?). The chart below shows how the brackets in each of our scenarios placed. The majority fall in a band from 60 to 100 points. The high noise scenarios all have median scores very close to chalk, while the light noise scenarios all did better. This reflects the pattern we observed in previous years’ data while tuning the model: some noise is necessary to get a range of outcomes, but as you add more and more your bracket gets closer to a truly random one (which would only score a lowly 31.5 points on average). 
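The 31.5-point figure for a truly random bracket can be checked with a few lines of arithmetic. The sketch below assumes the common 1-2-4-8-16-32 points-per-round scoring (an assumption: the scoring rule is not spelled out here, though it does reproduce the 31.5 quoted above) and a bracket filled in entirely by fair coin flips. The function name is made up for illustration.

```python
from fractions import Fraction

# Assumed scoring: points per correct pick in rounds 1..6 (hypothetical,
# chosen because it is a common pool format and matches the 31.5 figure).
POINTS_PER_ROUND = [1, 2, 4, 8, 16, 32]


def expected_random_score(points_per_round=POINTS_PER_ROUND):
    """Expected score of a bracket filled in by fair coin flips."""
    n_rounds = len(points_per_round)
    total = Fraction(0)
    for rnd, points in enumerate(points_per_round, start=1):
        games = 2 ** (n_rounds - rnd)  # 32, 16, 8, 4, 2, 1
        # A round-`rnd` pick is right only if the true winner of that game
        # survived `rnd` coin flips in the random bracket.
        p_correct = Fraction(1, 2 ** rnd)
        total += games * points * p_correct
    return total


print(float(expected_random_score()))  # 31.5
```

Each round contributes half as much as the one before it (16, 8, 4, 2, 1, 0.5), so under this assumed scoring a coin-flip bracket earns most of its points in the opening weekend.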

Our best performing group,[3] light noise with streak length 6, has a median score of 84 (good for about 3rd place in the small pool we actually entered) and a 90th percentile at 112 points (good enough to win that pool). In other words, we had 1-in-10 odds of picking a winner this year by selecting from this group. Of course we wouldn’t expect to come out on top among the 17.3 million brackets entered in ESPN’s mega-pool, but glory and trash talk are best experienced face to face.
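The 1-in-10 reading follows directly from the 90th percentile: if the score needed to win the pool sits at that percentile, about a tenth of the group's brackets reach it. A minimal sketch of that calculation, where `win_probability` and the toy scores are illustrative stand-ins (not our actual bracket data):

```python
def win_probability(scores, winning_score):
    """Share of brackets in a scenario that score at least `winning_score`."""
    if not scores:
        raise ValueError("need at least one bracket score")
    return sum(s >= winning_score for s in scores) / len(scores)


# Made-up scores for ten brackets, for illustration only.
toy_scores = [62, 70, 75, 80, 84, 84, 90, 95, 101, 112]
print(win_probability(toy_scores, winning_score=112))  # 0.1
```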


The other score to consider is the points you did NOT earn due to overconfidence. These were your bad picks, the Cinderella stories whose coaches turned back into pumpkins at 11:00 because somebody forgot about daylight saving time. The histograms below show how many points were lost due to overconfidence. These are games in which you were wrong and your chosen winner was a lower seed than the winner predicted by selecting straight seeds: you picked an underdog where the smarties at tourney headquarters picked the big dog. To win the pool you have to pick some upsets, but you also have to know when to quit them. Did you have UMBC in the Sweet Sixteen or Syracuse in the Elite Eight? That kind of overconfidence is going to cost you.
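The bookkeeping behind this score can be sketched as follows. The `Game` record and its field names are hypothetical, "lower seed" is taken to mean a numerically larger seed number, and the sample games are made up; the code behind the charts may differ in its details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    points: int         # points the game is worth in the pool
    pick_seed: int      # seed of the team this bracket picked
    chalk_seed: int     # seed of the team a straight-seed bracket picks
    pick_correct: bool  # did the bracket's pick actually win?


def overconfidence_points(games):
    """Points lost on wrong picks that were bolder than chalk."""
    return sum(
        g.points
        for g in games
        if not g.pick_correct and g.pick_seed > g.chalk_seed
    )


# Made-up games, for illustration only.
toy_games = [
    Game(points=1, pick_seed=12, chalk_seed=5, pick_correct=False),  # failed upset pick
    Game(points=2, pick_seed=1, chalk_seed=1, pick_correct=False),   # chalk pick that lost
    Game(points=4, pick_seed=11, chalk_seed=3, pick_correct=True),   # upset pick that hit
    Game(points=8, pick_seed=7, chalk_seed=2, pick_correct=False),   # failed upset pick
]
print(overconfidence_points(toy_games))  # 9
```

Only the two failed upset picks count. A chalk pick that loses costs points too, but it is bad luck rather than overconfidence, so it is left out of this tally.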

Again, our preferred scenario has a distinct profile: its histogram is top heavy compared to the purely random scenarios. It is producing brackets that are qualitatively distinct (and given that they earned a decent total score, that’s different in a good way).


Putting these two scores together scatter plot style makes the difference pop. Where purely random brackets produce a cluster of low scoring, low risk brackets, adding the streak feature produces brackets that go for broke and often win.



What did we learn?

Well, upsets are important, but in the end one of the top teams always wins. In fact, the lowest seed to ever win was #8 Villanova in 1985. The pattern was certainly no different this year, and the table below shows that after the shock of the opening weekend, the upsets dried up, with only one in the fourth round (if you consider #11 Loyola over #9 Kansas St. an upset), and none thereafter. Yeah, it’s exciting, but don’t get carried away when there’s money on the line.


Since our model took a poll-of-polls approach to establish a baseline, upsets did have an impact on its performance. The histograms below show how many brackets lost a significant number of points for a particular upset (select the losing team for each upset with the filter at the bottom of the chart). One of the most painful (especially because there was money riding on a bracket in which they were supposed to go all the way) was #2 Cincinnati’s loss to #7 Nevada.


Happily, streaks did their job, at least qualitatively. In cases where the streak factor was the difference in an outcome, hot teams were promoted and slumping teams were halted in their tracks. For the 6-game streaks, it promoted Clemson over Auburn and Michigan over UNC; it also put the brakes on Xavier. Unfortunately, since both were perfect, it failed to stop Gonzaga at Michigan’s expense. The streak feature produced good brackets with a distinct profile, and we’ll be incorporating it in the future.


How about next year? 

You gotta keep playing if you want to win, and we’ll continue to focus on winning the pool over the perfect bracket. Bracket picking is big business, but in the end it boils down to guessing the outcome of a sequence of 63 coin flips, one for each game in a 64-team bracket. Lucky for you, the coins in the first couple rounds are pretty far from fair.

We’ll look for strategies that improve our odds of winning the office pool, but with a focus on explainability. This is hard of course because most explanatory variables should already be baked into the seeding process. What have the experts missed? 

We’re glad you asked. One thing that they’ve missed, or rather, that the seeding process cannot take into account is that the performance of a team depends on their opponent. By ranking all teams we don’t allow for the possibility that a lower ranked team might have an advantage against a higher ranked one – perhaps their one strength (say, perimeter defense) perfectly neutralizes the dominant opponent’s greatest advantage (say, 3pt shooting), opening a path to victory. Of course, we’re not the only ones interested, and focusing on head-to-head matchups instead of overall rankings is one of the suggestions made by Tim Chartier, bracketeer extraordinaire.

Also, new comic relief. We need some new material since there are only a finite number of Shakespeare quotes and Mel Brooks clips on the internet.

Well that’s a wrap for 2018.

You don’t have to go sane, but you can’t stay here.
Time is come to scheme for next year.


Notes:
[1] Of course, David had the good sense not to press his luck again against Kansas St. the next day.
[2] “The probability of a rare event will (often, not always) be overestimated, because of the confirmatory bias of memory.” Daniel Kahneman, Thinking, Fast and Slow, New York: Farrar, Straus and Giroux, 2011, p. 333.
[3] “Best” is subjective, of course, and in this case it means a combination of absolute performance (points) and reliability (a tight band of outcomes).

About The Author

Richard is a Seattle area data scientist who builds predictive models and the services that deliver them. He earned a PhD in Applied and Computational Math from Princeton University, and left academia for the dark side of science (industry) in 2010, following his wife to the land of flannel. Fan of coffee, beer, backpacking and puns. Enjoys a day on the lake fishing, and, better, cooking up the catch for a crowd.
