Coders Bracket 2015

shawn posted a pic said Mon, Mar 16th 2015 at 1:51pm ET

A year ago, I filled out my March Madness bracket using code: https://mediocre.com/forum/topics/code-bracket--because-science

The algorithm was based on the statistical winning percentage over the past 29 years of each seed broken down by round:

Seemed like a good idea in theory, but the results were just awful. I basically finished last in just about every pool I entered.

I still think the approach is valid, so I'm sticking with it this year but I'm making a small tweak that also takes into account the team's winning percentage (adjusted for strength of schedule using the Rating Percentage Index). Here's an example:

In the first round, 5th seeded Arkansas is playing 12th seeded Wofford. We know that in the first round the 5th seed wins 68% of the time. We also know that Arkansas has a winning percentage of 76.47% and an RPI of 0.6123. Wofford has a winning percentage of 81.25% and an RPI of 0.5744.

We calculate Arkansas' percentage chance of winning at:

68 + (((0.7647 * 0.6123) - (0.8125 * 0.5744)) * 100) = 68.152581%
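
For anyone who wants to poke at the numbers, here is a minimal Python sketch of the calculation above. The function name `adjusted_win_chance` and its parameter names are made up for illustration; this is not the code from the generated bracket, just the same arithmetic.

```python
def adjusted_win_chance(seed_win_pct, win_pct_a, rpi_a, win_pct_b, rpi_b):
    """Seed-based win percentage for team A plus the (win% x RPI) bonus.

    seed_win_pct is in percent (e.g. 68); win percentages and RPIs are
    fractions between 0 and 1. All names here are hypothetical.
    """
    # Bonus: difference of the two teams' win% x RPI products, scaled to percent.
    bonus = (win_pct_a * rpi_a - win_pct_b * rpi_b) * 100
    return seed_win_pct + bonus


# Arkansas (5 seed) vs Wofford (12 seed), first round.
arkansas = adjusted_win_chance(68, 0.7647, 0.6123, 0.8125, 0.5744)
print(f"{arkansas:.6f}%")  # prints 68.152581%
```

With these inputs the bonus is only about 0.15 percentage points, because the two win% x RPI products are nearly equal.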

Let's see if this improves the results this year. Generated bracket and code available here: https://www.codersbracket.com/code_bracket/5506e5e468f310590410eca4

If you think you can code up a better bracket than me, you should probably apply for one of our jobs: https://meh.com/jobs

7 comments, 4 replies

OldCatLady said Mon, Mar 16th 2015 at 3:27pm ET:

Never mind that, it's going to be UNF Ospreys! SWOOP!

0
Lotsofgoats said Mon, Mar 16th 2015 at 3:34pm ET:

so does the tournament selection process produce a consistent-enough result set? it's an interesting question. rpi is definitely a good add.

0
fishzine said Mon, Mar 16th 2015 at 3:39pm ET:

By strict definition, your small tweak has a theoretical max possible adjustment of +/-1%, and that scenario is impossible because it would require one team to have a 100% winning percentage and a 1.000 RPI, and the other team to have a zero in either of the two (and somehow still make it to the tournament).
But extreme cases show us the flaws: your small tweak would only give a 1% bonus to an unbeaten team (one that only played opponents with unbeaten records too, to get a perfect RPI) playing against a team that hasn't won a game all season but got into the tournament on some strange technicality....
I do like the approach, however. I would make a couple of changes to the math. I would change the Tweak% to be an estimate of a team winning based on the two teams' win% and RPI, such that
Tweak%=(TeamAWin%*TeamARPI)/(TeamAWin%*TeamARPI + TeamBWin%*TeamBRPI)
In your case above:
TweakTeamA%=(.7647*.6123)/(.7647*.6123+.8125*.5744)=50.08%

Giving a final solution of
TeamAOddsOfWinning = (SeedWin%*.5)+(TweakTeamA%*.5)
Which yields a winning % of 59.04% for the 5th seed...

Still not 100% happy with that math, will have to think about it a little more.
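
A rough Python sketch of the blend described in this comment, for comparison with the original formula. The names `tweak_share` and `blended_odds`, and the `weight` parameter (fixed at the 0.5 used above), are hypothetical and only there for illustration.

```python
def tweak_share(win_pct_a, rpi_a, win_pct_b, rpi_b):
    """Team A's share of the combined (win% x RPI) strength, as a fraction."""
    a = win_pct_a * rpi_a
    b = win_pct_b * rpi_b
    # Assumes at least one product is non-zero; a + b == 0 would divide by zero.
    return a / (a + b)


def blended_odds(seed_win_share, tweak, weight=0.5):
    """Weighted average of the seed-based share and the tweak share.

    weight=0.5 reproduces the 50/50 split in the comment; other values
    are an assumption added here for experimentation.
    """
    return seed_win_share * weight + tweak * (1 - weight)


# Arkansas (5 seed, 68% historical first-round win rate) vs Wofford.
tweak = tweak_share(0.7647, 0.6123, 0.8125, 0.5744)
odds = blended_odds(0.68, tweak)
print(f"tweak = {tweak:.2%}, blended = {odds:.2%}")
# prints: tweak = 50.08%, blended = 59.04%
```

Unlike the additive bonus, this version always stays between 0% and 100% as long as the seed share does, since it is an average of two fractions.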

3
shawn said Mon, Mar 16th 2015 at 4:20pm ET:

@fishzine if you take the Kentucky vs Manhattan matchup:

Kentucky has a winning percentage of 100% and an RPI of 0.678. Manhattan has a winning percentage of 59.38% and an RPI of 0.5038. So the bonus would be:

(((1 * 0.678) - (0.5938 * 0.5038)) * 100) = 37.884356

So in many cases it adjusts things by more than the +/- 1% you were suggesting, although I did find myself wishing the bonus was amplified a bit more when the teams were more evenly matched.
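
To check how large the bonus can actually get, here is a small Python sketch that evaluates just the bonus term for this matchup and for the theoretical extremes. The function name `bonus` is made up for illustration.

```python
def bonus(win_pct_a, rpi_a, win_pct_b, rpi_b):
    """Bonus term only: difference of win% x RPI products, in percentage points."""
    return (win_pct_a * rpi_a - win_pct_b * rpi_b) * 100


# Kentucky vs Manhattan, using the figures quoted above.
kentucky = bonus(1.0, 0.678, 0.5938, 0.5038)
print(f"Kentucky bonus: {kentucky:+.6f}")  # prints Kentucky bonus: +37.884356

# Theoretical extremes: a perfect team against a team with a zero product.
print(bonus(1.0, 1.0, 0.0, 0.0))  # prints 100.0
print(bonus(0.0, 0.0, 1.0, 1.0))  # prints -100.0
```

So the bonus term on its own ranges from -100 to +100 percentage points, which means the seed percentage plus the bonus can land above 100 or below 0. Clamping the result to the 0 to 100 range would be one way to handle that; that is a suggestion here, not something the original algorithm is known to do.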

You should totally make a code bracket so we can compare results.

2
- @shawn this is awesome, but fatally flawed. Your constant is the seed placement, but humans place those teams in those spots, and there is a massive amount of subjectivity to seeding. Is there a way to modify the algorithm to account for these stats independent of seed? possibly by schedule? disclaimer: I manage a team of business analysts, so I'm completely comfortable making declarations regarding how something should work without having any fucking idea whether it's going to work or not.
  
  marklog said Fri, Mar 20th 2015 at 10:54am ET
  - 4
- @marklog this, basically. it's a model based on another model. there's a certain amount of consistency (e.g. 1-seeds never lose to 16-seeds), but that gets muddy real quick.
  
  Lotsofgoats said Fri, Mar 20th 2015 at 11:03am ET
  - 0
- @marklog makes an elegant statement of self-awareness. I've been on both sides of that fence and you are rare. Well done: "disclaimer: I manage a team of business analysts, so I'm completely comfortable making declarations regarding how something should work without having any fucking idea whether it's going to work or not."
  
  RedOak said Fri, Mar 20th 2015 at 11:22am ET
  - 0
- @marklog I think I know where you work 'cause you just described our analysts!
  
  Mehrocco_Mole said Sat, Mar 21st 2015 at 1:12am ET
  - 0
jqubed said Fri, Mar 20th 2015 at 4:17pm ET:

What's annoying is 3 of the 4 leaders currently just used some example functions, and their random numbers just happened to work out so that they're in the lead.

0
clarinetbob said Fri, Mar 20th 2015 at 11:06pm ET:

I had a taco for dinner. That is all.

1
Smitty2525 said Fri, Mar 20th 2015 at 11:21pm ET:

Um, I believe it was God Richard that said for a successful technology, public relations cannot be substituted for facts. And sadly, the facts of the matter are that your code idea is just another dogmatic religion, much like Libertarianism, only stupider.

0