THE BOOK--Playing The Percentages In Baseball

Saturday, September 30, 2006

More than you probably ever wanted to know about the pythagorean method

By Tangotiger, 11:45 AM

This is one of those great articles that deserves to be linked every now and then.  Ben Vollmayr-Lee is one of those few guys who can merge love of baseball with statistical savvy.


#1    David Smyth      (see all posts) 2006/09/30 (Sat) @ 13:01

Ben is most concerned with predicting the W/L of actual MLB teams (as opposed to the theoretical 0 to 1.0 W/L % range), with the best subjective combination of accuracy and simplicity.

But those parameters are Ben’s personal preference. My preference, as in BsR and PythagoPat, is for a method which places a different value on each of those 3 considerations.

It’s interesting that Ben VL, with all of his comfort with complex math, would fall so much on the side of simplicity, even though I can easily use PythagoPat with my $15 ‘scientific’ calculator.


#2    David Smyth      (see all posts) 2006/09/30 (Sat) @ 13:05

BTW, whatever happened to Ben VL? He certainly could have become a well-known analyst had he desired. Maybe he felt that saber is not even worth his spare time, compared to advanced physics.


#3          (see all posts) 2006/10/02 (Mon) @ 05:07

There’s a difference between being good at math and being a good analyst, I think.  Most of the people who read this blog, and I like to think myself included, are pretty good at math relative to the population.  But this linked article confused the shit out of me, and I really didn’t walk away with much more knowledge than I went into it with.  If you mean well-known analyst in terms of the sabermetric community, I’d say he would have to get significantly better at explaining stuff first… otherwise, he’ll be well-known to the 10 people out there that understand him, and anonymous to everyone else.  Likewise if you meant well-known in terms of working for a ballclub… even a savvy GM, I’m guessing, would probably not be too happy with getting this article on his desk as a response to some question.

Or maybe I’m just not as bright as I wish I was grin


#4    tangotiger      (see all posts) 2006/10/02 (Mon) @ 06:41

I have no doubt that Ben is exceptional as a data analyst and being able to convey it to his predefined target audience.  If you didn’t get his article, then you were not part of his audience.  But, I’ve seen enough of Ben’s posts at Fanhome to know that he’d be flexible to his audience.

As well, he “gets” baseball, and doesn’t treat it as just another subject to analyze.


#5    Ben VL      (see all posts) 2006/10/10 (Tue) @ 19:04

Thanks for the kind words Tango.  It’s nice to have had an impact.  And it would be fun to do more work - I’ve got some ideas and even some results.  But to answer David’s question, the sad reality is that I have a rather demanding day job, as does my wife, and we’ve got two young kids that fill up the rest of our time.  While baseball is fun, it doesn’t get me invited talks and the kind of exposure that I need to remain competitive for grants.  So it goes.

Tango, your dedication and enthusiasm are motivating.  Maybe in a few years I can participate again.


#6    Ben VL      (see all posts) 2006/10/10 (Tue) @ 19:13

David, I meant to respond to your comment about my preference for simplicity.  I like the linear formula with a coefficient of 2 (not far off in today’s run environment) because I can use it to estimate in my head, for example, how many wins a shortstop would be worth if they convert 1 double to a single each game, or 1 single to an out each game.  Linearity is really nice in this way when you can get away with using it.


#7    Tangotiger      (see all posts) 2007/10/18 (Thu) @ 13:25

Since this is all the rage, I am bumping this thread forward for those who want insight into Pythag and baseball.

***

Here was a thread that I had a few years ago…
http://www.battersbox.ca/article.php?story=20040923122101999

...in response to Clay’s data here…
http://www.baseballprospectus.com/article.php?articleid=3490
... where he:
“...used the Retrosheet game logs to get the records for every team that has played 150 games or more in a season, not quite 1900 teams going into 2004. At intervals of 10 games, I pulled out the team’s record to that point in the season, along with how many runs they had scored and allowed. That allowed me to set up some simple regression tests between current actual record and current Pythagorean record as the predictors, and rest of season record (not final record!) as the predictand.”

It looks like Pizza did the same thing:
http://mvn.com/mlb-stats/2007/10/14/the-triumph-of-pythagoras/

***

Bill James contributed his piece:
http://sabermetricresearch.blogspot.com/2007/10/new-bill-james-study-on-pythagoras.html

To which Pizza also responded:
http://mvn.com/mlb-stats/2007/10/15/still-more-pythagorean-musings/


#8    Tangotiger      (see all posts) 2008/02/07 (Thu) @ 12:54

My point 1. said this: “When given two variables, offense and defense, offense is 50% of the game and defense is 50%.”  Nowhere did I say that a run saved is not equal to a run earned, nor do I agree with such an assertion.

A run does have more impact the fewer of them there are, and its value can be determined by PythagenPat.


#9          (see all posts) 2008/02/07 (Thu) @ 14:44

“Nowhere did I say that a run saved is not equal to a run earned, nor do I agree with such an assertion.”

Here’s why I think a run saved is worth more than a run scored: A .500 team that gives up 100 fewer runs is going to be better off than if it scores 100 more runs. Therefore it follows that a greater percentage of value, or value above replacement level, should be allocated to efforts that prevent runs—pitching and defense. I don’t know what that percentage is, but your numbers suggest that it’s 58 percent.


#10    Tangotiger      (see all posts) 2008/02/07 (Thu) @ 15:29

No, you are mixing things up.  The 58% ONLY applies when you look at three variables: offense, pitching, and fielding.  When you look at two variables (offense, defense), it’s 50/50.

A team that scores 5.00 runs per game and allows 4.90 will have a .5096 win%.  If they instead give up 5.00 runs, they will have a .5000 win%.  So, 0.10 more runs allowed per game leads to .0096 fewer wins per game, or 10.4 runs per win.

A team that scores 4.90 and allows 5.00 will have a .4904 win%.  Again, scoring .10 more runs will lead to a .5000 win%, or a gain of .0096 wins per game.  Again, the same 10.4 runs per win.

A team that is at 5/5, and scores 0.10 more runs will have a win% of .5095.  A team that is at 5/5 and allows .10 fewer runs will be at .5096 (as previously noted).

For all intents and purposes, a run saved is a run earned.  You can show a bigger gap if you manipulate the run environment.
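The arithmetic in this comment can be checked with a short sketch. It assumes the PythagenPat form, where the exponent is (runs per game)^z, and it assumes z = 0.28. The thread never states the constant; values around 0.28 to 0.29 are commonly quoted, and 0.28 was picked here only because it lands close to the figures above. The function name is made up for illustration.

```python
def pythagenpat_win_pct(rs_per_game, ra_per_game, z=0.28):
    """Estimate win% from runs scored and allowed per game.

    z = 0.28 is an assumed constant, not one given in the thread.
    """
    # The exponent rises with the run environment (total runs per game).
    exponent = (rs_per_game + ra_per_game) ** z
    rs_term = rs_per_game ** exponent
    ra_term = ra_per_game ** exponent
    return rs_term / (rs_term + ra_term)


baseline = pythagenpat_win_pct(5.00, 5.00)  # the 5/5 team
saved = pythagenpat_win_pct(5.00, 4.90)     # same team, 0.10 runs saved
scored = pythagenpat_win_pct(5.10, 5.00)    # same team, 0.10 runs scored

# Runs per win: runs changed per game divided by win% gained per game.
runs_per_win = 0.10 / (saved - baseline)

print(f"saved:  {saved:.4f}")
print(f"scored: {scored:.4f}")
print(f"runs per win: {runs_per_win:.1f}")
```

Under that assumed z, the two one-tenth-of-a-run changes come out within about a ten-thousandth of each other in win%, which is the point being made: at a 5/5 run environment a run saved and a run earned are nearly interchangeable.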


#11          (see all posts) 2008/02/08 (Fri) @ 09:04

Let me try to explain what I’m thinking: Last year in the 1,123 games lost by American League teams, the losing teams scored 3,445 runs and allowed 7,413. Converting those numbers to a 162-game season, these losing teams scored 497 runs and allowed 1,069. That’s in compiling zero wins and 162 losses.

The ratio of runs (winners:losers) is about 2.15 to 1. From the losers’ standpoint, it’s about 0.46:1. A team that took an all-offense approach to solving its problem would need to add 1,802 runs to get its ratio from 0.46:1 to 2.15:1. A team that took an all-defense approach would have to eliminate 837 runs.

The numbers get smaller but the situation doesn’t change if you start with a .500 team (715 runs scored and allowed). It takes more than 700 runs added to get from 81-81 to 162-0 but less than 400 subtracted to get there.

In the real world, most teams are within 20 wins of the mean of 81. And the difference is obviously a lot less in the middle, but it still exists. A 715-run .500 team could get to 91 wins by adding 104 runs or by subtracting 91. That’s a 13 percent difference.

The further from the mean you are, the larger this difference is. Whether you’re starting at the zero-win level or the 47-win replacement level, a run saved is measurably more valuable than a run scored. If you consider the relative values by starting at the 81-win level and using fractional numbers of runs, you’re going to find the smallest difference between a marginal offensive run and a marginal defensive run. But it’s more appropriate to consider this issue on a team-season level.


#12    Tangotiger      (see all posts) 2008/02/08 (Fri) @ 09:23

rfs/31: I agree that the higher the run environment, the less impact each run has.  But, that has nothing to do with how much impact a run earned and a run saved have in the exact same run environment.


#13    Anthony      (see all posts) 2008/02/08 (Fri) @ 11:21

The Yankees had 968 RS & 777 RA last year. That’s a pythag win% of .607. At that level, adding 12.4 runs scored nets them one win, while disallowing 10.5 runs nets them one win. It seems like an average 2-win player with the Yankees would be a 2.5-win player with the Padres, in what I’d guess are the most extreme run environments of 2007.


#14    Tangotiger      (see all posts) 2008/02/08 (Fri) @ 11:44

And if you REVERSE that (777 RS, 968 RA), then scoring 10.6 more runs will give you one more win, while allowing 12.3 fewer runs will give you one more win.

So, it depends where you are in your runs scored, runs allowed.  In both cases, it’s around 1.3% more or fewer runs from what you already have, to give you the extra win (and 1/81 = 1.23%).
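A sketch of the "runs needed for one more win" calculation from the last two comments, solved by bisection. It assumes PythagenPat with z = 0.287 and a 162-game season; neither is stated in the thread, and the function names are hypothetical. The results should land near the figures quoted above but are not guaranteed to match them exactly, since the posters did not say which exponent they used.

```python
def win_pct(rs, ra, games=162, z=0.287):
    """PythagenPat win% from season run totals (z is an assumed constant)."""
    exponent = ((rs + ra) / games) ** z
    return rs ** exponent / (rs ** exponent + ra ** exponent)


def runs_for_one_win(rs, ra, side, games=162, z=0.287):
    """Runs to add on offense ("scored") or remove on defense ("saved")
    so that the estimated win% rises by exactly one win."""
    target = win_pct(rs, ra, games, z) + 1.0 / games
    lo, hi = 0.0, 100.0  # search window in runs; wide enough for MLB totals
    for _ in range(60):
        mid = (lo + hi) / 2
        if side == "scored":
            w = win_pct(rs + mid, ra, games, z)
        else:
            w = win_pct(rs, ra - mid, games, z)
        if w < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for rs, ra in [(968, 777), (777, 968)]:
    scored = runs_for_one_win(rs, ra, "scored")
    saved = runs_for_one_win(rs, ra, "saved")
    print(f"RS={rs} RA={ra}: +{scored:.1f} scored ({scored / rs:.2%} of RS), "
          f"-{saved:.1f} allowed ({saved / ra:.2%} of RA)")
```

For the high-scoring team the defensive run is the cheaper one, and for the reversed team the offensive run is cheaper, so which side looks more valuable depends on where the team already sits.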


#15          (see all posts) 2008/02/10 (Sun) @ 00:30

Except that the Pythagorean starts to break down at extreme levels—basically once you’re outside the .400-to-.600 winning percentage zone. Really good teams tend to win more than Pythagoras says they should, and really bad teams underperform their already terrible projections.

So the runs scored/runs allowed situation matters very little—at the team-season level, a defensive marginal run is better than an offensive one in all situations.


#16    tangotiger      (see all posts) 2008/02/10 (Sun) @ 01:31

Your first paragraph: prove it.

Your second paragraph: I just showed you in post 39 this is not true.


#17    David Gassko      (see all posts) 2008/02/10 (Sun) @ 02:31

Your first paragraph: prove it.

***

And don’t do it by showing that teams with good records outperform their Pythagorean record. They do, but because those records are partly inflated by luck (and of course the opposite is true for really bad teams).


#18    rfs1962      (see all posts) 2008/02/10 (Sun) @ 11:39

OK, here is a list of all the teams that have been over .600 or under .400 for the past 10 years. There are 40 of them—21 good and 19 bad.

Of the 21 good, 17 outperformed their projections. As a group, the 21 teams outperformed by a total of 74 games.

Of the 19 bad, 14 underperformed their projections. One matched. These teams as a group underperformed by 60 games.

David, you’re correct that this could be a list of the luckiest and unluckiest teams of the past 10 years. Probably to analyze my idea properly, you’d have to find the teams with the best and worst run ratios and see how they performed. That data is more tedious to dig up, and I haven’t done that.

I think this is true, though, because teams in their losses get outscored by a little more than 2-to-1, and obviously the reverse is true in their wins. Pythagoras says a team with a 2-to-1 run ratio should win at .780, but in games actually won the ratio is just barely higher than that. That tells me Pythagoras is breaking down at the extremes.

Anyway, I made a chart that plots on the X-axis a team’s ratio of runs to runs allowed and on the Y-axis its winning percentage. I put a point at (0.46, 0) and another at (2.15, 1). Then I started putting in data from actual teams.

Then I drew a line through all the dots. I created a blog and posted it at:

http://rfs1962.wordpress.com/?p=3

but all that’s there is the points; you can see how the line would go—steeply upward, then flattening out. (The Pythagoras line is different, obviously.) Sorry about the draft quality, but I wasn’t thinking about publication. I was just fooling around to see if I could get the table to post.


#19    Patriot      (see all posts) 2008/02/10 (Sun) @ 15:38

That proves nothing.  Selective sampling.  You cannot select the teams by W%.

If you take the teams 1996-2006 and sort them by Pythpat W%, the top 20 undershot their projections by a COMBINED .77 games.  The bottom 20 exceeded them by a combined 4.31 games (so, around .2 wins/team).  Hardly a failure of the system.

Of course, no one is claiming that Pythpat is a perfect estimator.  I’m sure it could be improved, but it couldn’t be improved by much, and for teams in the normal range of major league performance, it is more than adequate.


#20    tangotiger      (see all posts) 2008/02/10 (Sun) @ 15:48

Of course they overperformed!  You selected based on the criteria you are looking at.

What you need to do is select based on their pythag estimate, and then compare that to their actual winning percentage.

If you want to see your method in an exaggerated fashion: select all pitchers with at least 20 wins in the last 20 years.  I’ll guarantee you that their win% will be higher than forecasted by their ERA.


#21          (see all posts) 2008/02/10 (Sun) @ 21:48

Sure, what you really need to do is plot actual performance against the Pythagorean curve, a project I’ve started but probably won’t complete.

The problem with taking the Pythagorean estimate first and then computing the difference with actual performance is this: The Pythagorean estimates don’t go to extremes as often as actual teams do.

In the past 12 years (I quit counting when I got to 1995 and had to deal with the short season), Pythagoras (and I do like to think of him as an actual Greek mathematician poring over reams of baseball statistics) estimates that 23 AL teams should have won either fewer than 66 games or more than 96. Actually, 30 have done so. In the NL, Pythagoras looks at the stats and sees 16 extreme teams, when there were actually 25. Some of the teams that are estimated as extreme weren’t extreme on the field, but the reverse situation is more common.

So Pythagoras projects a slightly tighter band than actually occurs. It’s not a huge deal, and Patriot is totally correct that for most teams it’s more than adequate. But if you’re computing marginal runs from a zero-win level or determining what’s a replacement level team, you’re well outside Pythagoras’ comfort zone.


#22    Patriot      (see all posts) 2008/02/10 (Sun) @ 22:04

Why would we expect otherwise?  Extreme W%s are a combination of a high or low expected W% and “luck”.  If seasons were infinitely long, then we would expect the distribution of true talent and actual W% to be equal.  But they’re not.


#23    tangotiger      (see all posts) 2008/02/11 (Mon) @ 00:05

rfs: you are missing the underlying point that

observation = true + luck

Of course the spread of the observed will be greater than the true.  That’s the case in everything.  There are several threads on this blog that deal with this issue.  Please take the time to read them.
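The selection effect being argued over in the last several comments can be illustrated with a small simulation. Everything here is an assumption for illustration: each team gets an expected win% drawn from a normal distribution (standing in for a run-based estimate), plays 162 independent games, and the spread of 0.06 is invented rather than fitted to real data.

```python
import random
import statistics


def simulate(n_teams=5000, games=162, talent_sd=0.06, seed=1):
    """Return (expected win%, observed win%) pairs for simulated teams."""
    rng = random.Random(seed)
    teams = []
    for _ in range(n_teams):
        # Expected win%, clipped to stay a valid probability.
        expected = min(max(rng.gauss(0.5, talent_sd), 0.05), 0.95)
        # Observed record = expected + luck (binomial noise over the season).
        wins = sum(rng.random() < expected for _ in range(games))
        teams.append((expected, wins / games))
    return teams


def mean_gap(teams):
    """Average of observed minus expected win%."""
    return sum(obs - exp for exp, obs in teams) / len(teams)


teams = simulate()
picked_by_record = [t for t in teams if t[1] >= 0.600]    # select on wins
picked_by_estimate = [t for t in teams if t[0] >= 0.560]  # select on estimate

print("spread of expected:", statistics.pstdev(e for e, _ in teams))
print("spread of observed:", statistics.pstdev(o for _, o in teams))
print("gap, selected by record:  ", mean_gap(picked_by_record))
print("gap, selected by estimate:", mean_gap(picked_by_estimate))
```

In this toy setup the teams picked by their record beat their expectation on average even though no team has any built-in tendency to do so, while the teams picked by their estimate show a gap near zero. The observed spread is also wider than the expected spread, as the comment says it must be.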


#24          (see all posts) 2008/02/11 (Mon) @ 12:41

I do actually understand those things, and appreciate all the explanations about why—of course!—the data at the ends doesn’t fit the curve. But if the number of games a team wins makes it miss the curve in a predictable direction, that’s not luck. I’d rather move the curve than explain why the data don’t fit it.


#25    Tangotiger      (see all posts) 2008/02/11 (Mon) @ 13:30

That’s what we are saying… there is no bias in terms of teams over- or under- fitting the Pythag based on the runs scored per game and runs allowed per game.

Here’s some data:
http://www.tangotiger.net/winactuals.html

If you look at the great teams (Quality=1), their actual win%, and their “tango” win% are virtually identical.  Same for the bad teams.

So, I reject the notion that good teams win more than their run distributions would say, without evidence to the contrary.

Once you select based on runs (not on wins), you will see virtually no bias.


#26    rfs1962      (see all posts) 2008/02/13 (Wed) @ 01:09

Let me take another shot at this. This link is to a table and chart showing runs scored vs. runs allowed and winning percentage for all MLB teams for the past eight years.

To help make my point, I’ve added two points to the graph at the far right that are on the Pythagorean curve, showing the projections for a team that doubles up its opponents and for a team that triples up its opponents. (The other point that looks out of place is the 2001 Mariners; the one on the far left is the 2003 Tigers.)

It’s obvious to me which two points are the outlying data here—they’re the ones on the Pyth curve. The rest of the points form a nearly straight line.

Now add the information from Post #31: In all of the games lost by AL teams last year, the winning teams outscored the losing teams by a ratio of about 2.2 to 1. (The ratios are similar for the 2006 and 2007 NL.) That data point, which is not on the chart, represents seven full team-seasons. The result is an almost straight progression once you get a few runs off the x-axis. If you displayed the x-axis logarithmically, with 0.5 as close to 1 as 1 is to 2, you might have a perfectly straight line.

This is why I believe the Pythagorean projection frays at the edges. Very good teams outperform their Pythagorean projections because they are out of Pythagoras’ comfort zone, not because they are lucky. This is also why I believe a run saved is better than a run scored, because saving runs is always the better way to improve your run ratio.


#27    rfs1962      (see all posts) 2008/02/13 (Wed) @ 01:12

Oops, the link didn’t show. My apologies.
http://spreadsheets.google.com/pub?key=pyrTJkVksp1LqQC0a3AdEDg


#28    tangotiger      (see all posts) 2008/02/13 (Wed) @ 08:46

There isn’t a straight-line relationship, otherwise the win% would exceed 1.0.

Try to plot (W/L) to (RS/RA)^2, and you’ll get close to a straight line.


#29          (see all posts) 2008/02/13 (Wed) @ 11:59

Yes, there’s a ceiling on performance. A floor, too. That’s not too insightful.

The article is helpful—I haven’t worked through all the steps but I have a couple of thoughts:

-- The data support a linear theory as well as they do a Pythagorean one. In the real world, it mostly doesn’t matter unless you think the cost of a win changes based on where you are in runs scored/runs allowed.

-- One way you could get more data, and more extreme data, would be to split seasons in half, or into thirds, and draw the same chart.


#30    Tangotiger      (see all posts) 2008/02/13 (Wed) @ 12:07

The data supports linear if the RS/RA is closer to 1.  It obviously can’t be linear if RS/RA gets too high.

I’d also suggest using the Tango Distribution, which is the best model out there.


#31    Tangotiger      (see all posts) 2008/02/13 (Wed) @ 12:14

To expand on the Tango Distribution: if you know the distribution of scoring 0, 1, 2, 3, 4… runs per game for a team that on average scores 6 and allows 3, then it’s pure probability math, from that point, as to how often the team will win.

Even if you don’t trust the Tango Distribution, you come up with your own frequency distribution of scoring 0, 1, 2, 3, 4… runs per game, where the average scored is 6 and the average allowed is 3, and you can *easily* come up with your own win% based on these two distributions.
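A minimal sketch of that "pure probability math", given any two per-game run distributions. The Poisson distribution below is only a stand-in so the example runs; it is not the Tango Distribution, and real run scoring is more spread out than Poisson. The sketch also assumes runs scored and runs allowed are independent, and that games tied after nine innings are won in the same proportion as the decided games. The function names are hypothetical.

```python
import math


def poisson_pmf(mean, max_runs=40):
    """Stand-in run distribution (NOT the Tango Distribution)."""
    pmf = [math.exp(-mean) * mean ** k / math.factorial(k)
           for k in range(max_runs + 1)]
    total = sum(pmf)  # renormalize after truncating the tail
    return [p / total for p in pmf]


def win_pct_from_distributions(scored_pmf, allowed_pmf):
    """Win% from P(score s runs) and P(allow a runs), assuming independence."""
    win = loss = 0.0
    for s, p_s in enumerate(scored_pmf):
        for a, p_a in enumerate(allowed_pmf):
            if s > a:
                win += p_s * p_a
            elif s < a:
                loss += p_s * p_a
    # Assumption: ties go to extra innings and split like decided games.
    return win / (win + loss)


scores_6 = poisson_pmf(6.0)
scores_3 = poisson_pmf(3.0)
print(win_pct_from_distributions(scores_6, scores_3))  # scores 6, allows 3
print(win_pct_from_distributions(scores_3, scores_6))  # scores 3, allows 6
```

Swapping in a better frequency table for `poisson_pmf` changes the answer but not the method: any list of probabilities indexed by runs per game will work.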

