THE BOOK--Playing The Percentages In Baseball


Tuesday, June 28, 2011

“Paradox”: Expectation of being favored to win - vs - Expectation of winning

By Tangotiger, 11:13 AM

I’m not sure that my title description is clear enough.  And if someone wants to propose a better title to be clearer, please do so.

Someone sent me something like what I’m about to post, and he called it a “paradox”, but it is not at all a paradox.  It’s a question of whether you average out binary numbers or average out the rates.

Suppose that Roy Halladay’s true talent level is such that the Phillies win .601 of their games with him on the mound against an average team at a neutral site.  At home, the odds go up by +.050 (and on the road, it goes down .050).  Against good teams, the odds go down by .050 (and up by .050 against bad teams).  Against great teams, the odds go down by .100 (and up by .100 against terrible teams).  So, Phillies with Halladay starting at home against a terrible team gives us odds of .751 that the Phillies will win.  And on the road against a great team gives us .451 that the Phillies will win.

Count as “1” any time the Phillies have a greater than 50% chance of winning with Roy Halladay on the mound.

What percentage of the games are the Phillies favored to win?  Is it exactly 60%?  Or more than 60%?  It’s not a trick question.


#1    Ken      (see all posts) 2011/06/28 (Tue) @ 11:48

They’ll be favored to win far more than 60% of games. Just as an example, drop the great/terrible teams - the Phillies with Halladay would be favored to win every game they play. Even against a good team on the road, they would be 0.501 to win.

But I don’t really understand the question about binary numbers/rates.


#2    Michael      (see all posts) 2011/06/28 (Tue) @ 12:38

Clearly more than 60%.  You’ve got 8 possibilities: Phillies at home or on the road against great, good, bad or terrible teams.  The only time the Phillies aren’t favored is when playing a great team on the road.  Assuming an equal number of home and road games against each type of team, the Phillies are favored in 87.5% of the games.


#3    Xeifrank      (see all posts) 2011/06/28 (Tue) @ 12:39

What does their schedule look like?
vr. Xei


#4    Tangotiger      (see all posts) 2011/06/28 (Tue) @ 13:09

They play each team an equal number of times, and equally at home and away.


#5    Lee      (see all posts) 2011/06/28 (Tue) @ 13:20

It depends on his schedule, and how you define “good” and “excellent”. It won’t always be >60%.

Here are the categories:

R=Road=-.05
H=Home=+.05

G=Good=-.05
VG=Very Good=-.1

B=Bad=+.05
VB=Very Bad=+.1

R/G=.501***
R/B=.601
R/VG=.451***
R/VB=.651

H/G=.601
H/B=.701
H/VG=.551***
H/VB=.751

Here are a few assumptions to start with:

- Mathematically average league, where there is a nice bell curve of talent/win%
- Good teams are defined as win% >= .501 (but below VG threshold)
- Bad teams are defined as win% <= .499 (but above VB threshold)
- No teams are at exactly .500 (this just closes a loophole and makes it neater)

There will be a winning percentage threshold for VG teams that, depending on Halladay’s personal schedule, could put him below 60%.

Start here:

Imagine him pitching all of his VG games at home (or as many as possible), and all of his G games on the road. This maximizes the number of games in which he will be below .601. If you can get that number to be 40% of his total starts, which should be doable considering 50% of his starts ON AVERAGE will be G/VG, it’s easily possible. Throw in the fact that he could draw a really tough schedule, or one that has him on the road a lot, and you’ve got more than enough without doing out all of the math.


#6    Lee      (see all posts) 2011/06/28 (Tue) @ 13:23

There’s a lot of hand waving there. I’ll revisit that in a few if people don’t buy it.


#7    Xeifrank      (see all posts) 2011/06/28 (Tue) @ 13:33

I am afraid to make too many assumptions on what this question is asking.

>> What percentage of the games are the Phillies favored in?

I think Michael in #2 has it right.  7/8=0.875 or 87.5% of the games. 

I assume you mean games in which Halladay is starting.

.501+.601+.451+.651+.601+.701+.551+.751
= 4.808 wins every 8 games
or 60.1% expected wins.  But this is a different question than what you are asking.

Favored 87.5% of the time.  Expected winning percentage of 60.1%


#8    Tangotiger      (see all posts) 2011/06/28 (Tue) @ 13:41

Xei: Not that it matters to the final answer (it’ll be more than 87.5%), but the talent distribution of teams faced should be normal, not uniform.

If you need a practical number, presume:
4 Average,
3 Good, 3 Bad,
2 Great, 2 Terrible
teams, plus the Phillies, in the league.


#9    Lee      (see all posts) 2011/06/28 (Tue) @ 13:53

Well, my entire original post is wrong, I missed a key detail...It does depend on the classifications you just posted though. So assuming an average schedule:

4 games...R/A=.551
3 games...R/G=.501
3 games...R/B=.601
2 games...R/VG=.451***
2 games...R/VB=.651

4 games...H/A=.651
3 games...H/G=.601
3 games...H/B=.701
2 games...H/VG=.551
2 games...H/VB=.751

28 games total, 2 of which he isn’t favored.

~93% of the time the Phillies are favored.
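
The tally above can be reproduced with a short sketch. It assumes the .601 baseline and the home/road and opponent adjustments from the original post, plus the 14-opponent league from comment #8, with each opponent faced once at home and once on the road. The names `BASE`, `SITE`, `OPPONENT`, `TEAMS` and `summarize` are made up for this illustration.

```python
# Baseline and adjustments taken from the post; the names are illustrative only.
BASE = 0.601
SITE = {"home": 0.050, "road": -0.050}
OPPONENT = {"great": -0.100, "good": -0.050, "average": 0.0,
            "bad": 0.050, "terrible": 0.100}
# Number of opponents of each type in the illustrative league (comment #8).
TEAMS = {"great": 2, "good": 3, "average": 4, "bad": 3, "terrible": 2}


def summarize(teams):
    """Return (share of games favored, expected win%) over one home and one road game per opponent."""
    games = 0
    favored = 0
    expected_wins = 0.0
    for site_adj in SITE.values():
        for kind, count in teams.items():
            p = BASE + site_adj + OPPONENT[kind]
            games += count
            expected_wins += count * p   # averaging the rates
            if p > 0.5:
                favored += count         # averaging the binary "favored" flag
    return favored / games, expected_wins / games


favored_share, expected_win_pct = summarize(TEAMS)
print(f"favored in {favored_share:.1%} of games")
print(f"expected win% = {expected_win_pct:.3f}")
```

Under these assumptions the first number is the average of the 0/1 "favored" flags and the second is the average of the rates, which is the distinction the post is drawing.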


#10    Xeifrank      (see all posts) 2011/06/28 (Tue) @ 13:53

#8. Ok, that changes things.  I will let someone else do the math, but when I get home I will look up the empirical numbers from this year and see how many games the Phillies were actually favored in when Halladay started (unless someone else beats me to it).
vr, Xei


#11    Tangotiger      (see all posts) 2011/06/28 (Tue) @ 14:03

Xei: that would be cool.  Please note that I just did this as an illustration, to show that the answer will not be .601.

The right answer is what Lee said: 2 of 14 times on road, Phillies underdog, and 0 of 14 times at home underdog.  So, 2 of 28 as underdog = 93% of time Phillies favored.  (In this illustration.)


#12    Lee      (see all posts) 2011/06/28 (Tue) @ 14:03

Just as a note, even if the Phils have a perfectly average schedule, it doesn’t mean Halladay draws the same. If he draws all of his VG teams on the road, and happens to draw more VG teams than the average pitcher, he could realistically end up as the underdog in maybe 6-7 games out of 28 (using that for roundness with Tango’s classifications).

So I’d put the lower boundary at 75% (21/28)


#13    Xeifrank      (see all posts) 2011/06/28 (Tue) @ 14:14

One thing that could skew the empirical numbers is that (at least in the beginning of the season, before off days and injuries change rotation turns) #1 starters often face off against the other team’s #1 starter.  I would think this would affect the lesser #1 starters like Livan Hernandez more than Roy Halladay.
vr, Xei


#14          (see all posts) 2011/06/28 (Tue) @ 14:57

On a related note, let’s look at a sport without the starting pitcher phenomenon (so team talent level won’t vary so much from one game to another).  The best team in the league might very well be the favorite in 100% of their games but a huge underdog to actually go undefeated.  Consider a college football team so dominant as to be 90% favorites in every one of their games.  They would actually be about a 3-1 underdog (0.9^13) to actually go the season undefeated...even though they were enormous favorites every single time they took the field.

For sports like football where the number of games is small, fans often like to look through the schedule and identify each future game as a win or a loss to predict the future results.  In college football, this could cause many fans to greatly overestimate their national title hopes.  In the NFL, this could easily cause someone to expect say a 14-2 record (they would be favored in 14 of their games!) while only being maybe a 10 or 11 win true talent.
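
The arithmetic in this comment can be checked with a short sketch. The 90% per-game figure and the 13-game season are the comment's own numbers. Treating the games as independent is an added assumption, and the variable names are illustrative.

```python
p_win = 0.90    # chance of winning each individual game (from the comment)
n_games = 13    # season length implied by 0.9^13 in the comment

# Probability of winning every game, assuming the games are independent.
p_undefeated = p_win ** n_games
# Odds against going undefeated, expressed as "x to 1".
odds_against = (1 - p_undefeated) / p_undefeated
# Expected number of wins, even though the team is favored in all 13 games.
expected_wins = p_win * n_games

print(f"P(undefeated) = {p_undefeated:.3f}")
print(f"odds against  = {odds_against:.2f} to 1")
print(f"expected wins = {expected_wins:.1f} of {n_games}")
```

The team is favored in 13 of 13 games, yet its expected win total is below 13 and an undefeated season is roughly a 3-to-1 shot against.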


#15    Tangotiger      (see all posts) 2011/06/28 (Tue) @ 15:08

mickey: great point.  Yes, if this were a sport like NFL (or even basketball, though the high home-site advantage adds a wrinkle), it would probably make the explanation clearer.

I don’t like the college football example, because don’t a lot of teams come close to going undefeated?

The purpose of my illustration here is to show as big a gap as possible between expected wins (0.601) and percentage of time favored to win (93%).


#16          (see all posts) 2011/06/28 (Tue) @ 15:27

Yes there is often at least one undefeated team, so perhaps college football is not an ideal example.  However, I was thinking more about a team like West Virginia University.  Over the last 6-7 years or so, they’ve fairly consistently been the best team in a relatively weak conference.  I don’t keep up with the Vegas odds, but I’m guessing they were probably the underdogs only about a dozen or so times during that duration (they might very well have been the favorite in every one of their home games during that span).  I also imagine they were the favorites in each game during 3 or 4 of those seasons.  Yet they were still unlikely to go undefeated (and they never actually did).  There are probably many other examples of good (not great) teams that are favored to win each game but are longshots to win all their games.


#17    Xeifrank      (see all posts) 2011/06/29 (Wed) @ 02:18

Here are the empirical numbers for Roy Halladay.

Starts: 17
Favored: 17
Underdog: 0
Percentage: 100%
Maximum Win Exp: 70.59%
Minimum Win Exp: 51.22%
Average Win Exp: 63.85%

I looked at some other notables in a blog post linked to in my name above.  Of the pitchers I looked at, there were four that were favored to win each and every start so far this season.  Can you guess who they are before looking?
vr, Xei


#18    Tangotiger      (see all posts) 2011/06/29 (Wed) @ 07:18

Great job, thanks!


#19    Tangotiger      (see all posts) 2011/06/29 (Wed) @ 08:05

Xei: one thing if you can add it (at least for the group totals): what was the ACTUAL win% for those games?


#20    Xeifrank      (see all posts) 2011/06/29 (Wed) @ 10:47

#19. Done.

For Halladay it was 82.35%

For the others it is listed at the same site.


#21    Tangotiger      (see all posts) 2011/06/29 (Wed) @ 10:55

Thanks…

You’re blocked at the office.  Can you give me the final number:
Average Win Actual: ??%


#22    Xeifrank      (see all posts) 2011/06/29 (Wed) @ 11:08

Hopefully, this is what you are looking for.

Pitcher            Avg Win Exp (%)   Actual Win (%)
Roy Halladay       63.85             82.35
C.C. Sabathia      61.09             70.59
Jon Lester         60.79             56.25
Cliff Lee          60.34             64.71
Justin Verlander   57.09             64.71
Cole Hamels        59.57             68.75
Tommy Hanson       58.1              62.5
Tim Lincecum       57.94             62.5
Jered Weaver       56.36             61.11
Jair Jurrjens      53.82             71.43
Clayton Kershaw    53.99             58.82
Josh Beckett       55.43             66.67
Ian Kennedy        50.88             64.71
James Shields      53.51             75.00
Anibal Sanchez     51.38             50.00
Ricky Romero       50.15             38.89


#23    Tangotiger      (see all posts) 2011/06/29 (Wed) @ 11:22

The simple actual average is 63.7%, which is an almost perfect match to what Vegas said.

Great job!


#24    Xeifrank      (see all posts) 2011/06/30 (Thu) @ 02:27

FWIW, Jon Lester is the underdog for the first time this year Thursday.

Boston vs Philadelphia
J.Lester vs C.Hamels

Money Line
PHI -119 (53.60)
BOS +114 (46.40)


#25    Peter Jensen      (see all posts) 2011/06/30 (Thu) @ 08:10

>> The simple actual average is 63.7%, which is an almost perfect match to what Vegas said.

My math has 56.5% as the simple average of what Vegas said for the pitchers listed.  Were you comparing the average of the actual win percentage of all the pitchers listed to Halladay’s 63.85 expected win percentage?
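
Both simple averages can be checked directly from the numbers posted in #22. This is only a sketch: it weights every pitcher equally rather than by number of starts, and the variable names are illustrative.

```python
# (Vegas average win expectancy %, actual win %) per pitcher, copied from comment #22.
rows = {
    "Roy Halladay": (63.85, 82.35),
    "C.C. Sabathia": (61.09, 70.59),
    "Jon Lester": (60.79, 56.25),
    "Cliff Lee": (60.34, 64.71),
    "Justin Verlander": (57.09, 64.71),
    "Cole Hamels": (59.57, 68.75),
    "Tommy Hanson": (58.1, 62.5),
    "Tim Lincecum": (57.94, 62.5),
    "Jered Weaver": (56.36, 61.11),
    "Jair Jurrjens": (53.82, 71.43),
    "Clayton Kershaw": (53.99, 58.82),
    "Josh Beckett": (55.43, 66.67),
    "Ian Kennedy": (50.88, 64.71),
    "James Shields": (53.51, 75.00),
    "Anibal Sanchez": (51.38, 50.00),
    "Ricky Romero": (50.15, 38.89),
}

# Unweighted (simple) averages across the 16 pitchers.
vegas_avg = sum(vegas for vegas, _ in rows.values()) / len(rows)
actual_avg = sum(actual for _, actual in rows.values()) / len(rows)

print(f"simple average, Vegas:  {vegas_avg:.1f}%")
print(f"simple average, actual: {actual_avg:.1f}%")
```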


#26    Tangotiger      (see all posts) 2011/06/30 (Thu) @ 10:16

Thanks for the correction Peter.  I was working blind, because blogspot is blocked at the office.

So, we have 56.5% Vegas, but 63.7% actually.  How many games is that?  About 200 games?  This is two SD.  There might be something there.  Worth looking at past seasons…
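
The "two SD" estimate can be sketched with a plain binomial standard deviation. The 200-game figure is the rough guess from this comment, not a counted total. The sketch also assumes every game is an independent trial with the same 56.5% win probability, which is a simplification.

```python
import math

vegas_p = 0.565    # simple average of the Vegas expectations
actual_p = 0.637   # simple average of the actual win rates
n_games = 200      # rough guess from the comment, not a counted total

# Standard deviation of a win rate over n_games independent games.
sd = math.sqrt(vegas_p * (1 - vegas_p) / n_games)
# Size of the gap between actual and expected, measured in SDs.
z = (actual_p - vegas_p) / sd

print(f"one SD = {sd:.3f}, gap = {z:.1f} SD")
```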


#27    Xeifrank      (see all posts) 2011/06/30 (Thu) @ 10:54

fwiw, many of the pitchers I looked at other than Halladay were taken off of the WPA leaderboard, which would bias the results to show the actual higher than Vegas.  I figured those pitchers would have a more interesting narrative.  If I had known there was going to be an actual vs Vegas comparison, I would’ve grabbed a more random sample.
vr, Xei


#28    Tangotiger      (see all posts) 2011/06/30 (Thu) @ 11:25

Ah, ok.  I thought you had taken the top Vegas pitchers, and not the top(-ish) actual pitchers.

So, forget what I said.


#29          (see all posts) 2011/07/14 (Thu) @ 09:47

I am a little late to the conversation, but here is the work I did on how often an NFL and NCAA team won depending on the line:

http://www.beyondtheboxscore.com/2009/8/29/1003957/chance-of-a-football-team-winning


#30    Tangotiger      (see all posts) 2011/07/14 (Thu) @ 10:51

Great stuff Jeff.

Roughly speaking, to get the odds of winning given the spread (I’ll call that S), you would need two more things:

1. average number of points scored per team per game.  I’ll call that P.

2. pythag exponent.  I’ll call that E.

You then do this:
(P + S/2) / (P - S/2)

So, if the spread is 10 points, and the average number of points per game is 20, then we have:
25 / 15

That is, the expected score is 25 points for and 15 points allowed.  That matches our spread of 10.

Then we take that and turn it into odds:
(25/15) ^ E

The exponent would be 2 or 3 or whatever is used in football.

Let’s say E=3, so in this case, (25/15) ^ 3 = 4.63, or 4.63 to 1.

That gives us a chance of winning of 4.63/5.63 = 82%.
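
The steps above can be sketched as a small function. It takes expected points for as P + S/2 and points against as P - S/2, which is what the 25 / 15 worked example implies. The function name `win_prob_from_spread` and its parameter names are made up for this illustration, and E=3 is only the example value used here.

```python
def win_prob_from_spread(spread, avg_points, exponent):
    """Pythag-style win probability for the favorite.

    spread     -- S, the point spread in the favorite's favor
    avg_points -- P, average points scored per team per game
    exponent   -- E, the pythag exponent (assumed, sport-specific)
    """
    points_for = avg_points + spread / 2
    points_against = avg_points - spread / 2
    odds = (points_for / points_against) ** exponent   # "x to 1" odds of winning
    return odds / (1 + odds)


# Worked example from the comment: S=10, P=20, E=3.
print(f"{win_prob_from_spread(10, 20, 3):.1%}")
```

A spread of zero gives even odds, so the function returns 50% in that case regardless of the exponent.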

***

I don’t know if this process I’m laying out is better or worse than the best-fit equation you got.  But, at least it’s consistent with the way we normally do pythag expectations.


#31    Kung Pao      (see all posts) 2011/07/14 (Thu) @ 10:53

Jeff Z - any regression formula that maps a football point spread to a probability of winning has one major flaw: the value of a half-point in the spread varies significantly depending on where it falls.  This is because football points are scored (for the most part) in 3s and 7s.

So the difference between these comparisons is quite large:

a. the probability of a 3.5-point favorite winning compared to a 3-point favorite winning

b. the probability of a 1.5-point favorite winning compared to a 1-point favorite winning

This “step” problem means any simple formula runs into trouble when put into practice.  Academically it’s fine, but academics don’t win or lose money on their ideas :)


#32    Tangotiger      (see all posts) 2011/07/14 (Thu) @ 11:02

Kung Pao: great point!

