THE BOOK--Playing The Percentages In Baseball


Tuesday, April 22, 2008

Small team sample size: Do I care that the Tigers are 7-13, the WS are 11-7, or that Flo is 12-7?

By , 02:07 AM

What do you think?  The answer, of course, is, “Nope!” I couldn’t care less.  Or with improper grammar, “I could care less” (which would mean sort of the opposite).

At least with players, as they accumulate current ("recent" actually, since there is no such thing as a “current” stat) performance, we use that to update their projections; with teams, the only things we can use to update their w/l projections are the projections for their players.  However, 20 games into the season, updating every player’s projection on each team is not going to make a lick of difference in terms of the team projections.  IOW, if I use pre-season projections for all of a team’s players, that team’s w/l projection from this point on is going to be almost exactly the same as if I used updated (including this year’s stats) projections for all the players.  The reason, and this is important, is that a collection of individual player stats is NOT the same as ONE player’s stats for the same number of PA.  Not even close.


For example, let’s say that we have a team composed of 1000 players and the collective performance of those players prior to a season was an OPS of .750, and thus a projection of .750 for each player, including any regression toward the mean, etc.  Now let’s say that after one game, where all 1000 players bat (in our hypothetical world), our team has a collective OPS of .850.  You might be tempted to think that our projection for this team has jumped tremendously since we now have another 4000 or so PA of .850 performance.  You would be wrong!  As long as the performance of each of our 1000 players is relatively independent of one another, the projection for our team is not going to change more than a tad, and I mean a tad.  In fact, the updated (after the one game) projection for our team is the updated projections of all our players combined.  The fact that all of our players happen to be on the same team means nothing!

Each player who had a projection of .750 going into the season now has a projection based on that .750 PLUS 4 more PA of .850 OPS.  That would make their new projection around .7501 or something like that.  So our new projection for our team is STILL .750, even though we have 4000 PA or so of .850 performance.
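To put rough numbers on that, here is a minimal Python sketch.  It assumes a projection behaves like a PA-weighted average of the prior projection and the new sample, and that each player's .750 projection carries the weight of about 4000 PA.  Both the weighting scheme and the 4000 figure are illustrative assumptions, not details of any actual projection system, and the function name is made up for the example.

```python
def updated_projection(prior_ops, prior_pa, new_ops, new_pa):
    """PA-weighted average of a prior projection and a new sample.

    Simplifying assumption: OPS is treated as a plain rate that can be
    averaged by plate appearances.
    """
    return (prior_ops * prior_pa + new_ops * new_pa) / (prior_pa + new_pa)


PRIOR_PA = 4000  # assumed weight behind each player's .750 projection

# Correct approach: update each player with HIS OWN 4 PA of .850.
per_player = updated_projection(0.750, PRIOR_PA, 0.850, 4)

# The team projection is just the average of 1000 identical player updates.
team = sum([per_player] * 1000) / 1000

# Tempting but wrong: treat the team's 4000 PA as if one player had them all.
one_big_player = updated_projection(0.750, PRIOR_PA, 0.850, 4000)

print(f"per player:     {per_player:.4f}")
print(f"team:           {team:.4f}")
print(f"one big player: {one_big_player:.4f}")
```

Under those assumptions each player, and therefore the team, moves from .7500 to about .7501, while lumping all 4000 PA onto a single player would move that one projection all the way to .800.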

The kicker of course is that it would be nearly impossible for 1000 (or even 100) players to play at an .850 clip if their true OPS was .750, unless of course they were playing against pitchers who were 100 points or so worse than league average.  I should have said that in that one game, we either came up with an “opponent adjusted” performance of .850, or that they were playing against a league average pitcher.

So, everything you read about why each team is doing considerably better or worse than they were supposed to do (unless that “supposed to” is just plain wrong) is pure, unadulterated hooey.  Unless, that is, they suffered some catastrophic, team-talent-changing injury or acquisition, for better or worse, which no team has, as far as I am aware.

In order to figure each team’s projected w/l record at the end of the season, simply take their current win/loss record and then “play out” the rest of the season, using the same, damn projections we used before the season started for each team, adjusted for whatever injuries, acquisitions (or releases), or playing time changes occurred since the start of the season.  None of these things (other than the current w/l records of course) will change our team projections very much.
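The “play out the rest of the season” step can be sketched in a few lines of Python.  This is a simplified version that assumes a 162-game schedule and converts the pre-season win projection straight into a winning percentage for the remaining games; it ignores schedule strength and any roster adjustments.  The function name and the example team are hypothetical.

```python
SEASON_GAMES = 162  # standard MLB schedule length


def projected_final_wins(wins, losses, preseason_wins):
    """Wins already banked plus expected wins over the remaining games.

    The pre-season projection is used unchanged as the team's talent level
    for the rest of the season (no injury or roster adjustments here).
    """
    talent_pct = preseason_wins / SEASON_GAMES
    remaining = SEASON_GAMES - wins - losses
    return wins + remaining * talent_pct


# Hypothetical team: projected for 81 wins pre-season, starts out 7-13.
final_wins = projected_final_wins(7, 13, 81)
print(final_wins)  # 78.0
```

In this sketch the bad start costs the hypothetical team exactly the 3 wins it has already fallen behind its expected pace (7 wins instead of 10 in 20 games), and nothing more.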

So without further ado, here are my current, as of Tuesday morning, final w/l projections for each team, and their chances of winning the WS, starting with all the teams that have “surprised us” so far.  I put “surprised” in quotations because what SHOULD be a surprise is when a performance distribution is NOT “normal” (bell shaped).  IOW, it SHOULD be a surprise when, after 21 games or so, 10 teams or so aren’t around 2.5 games or more better or worse than they are supposed to be, by sheer luck alone, and 3 teams aren’t 4 wins or more better or worse than they are supposed to be, and 1 or 2 teams aren’t 5 or more games better or worse than they are supposed to be.  Think about that.  We should be surprised if 1 or 2 GOOD teams or bad teams are not, like, worse than 7-14 (for the good teams), or better than 14-7 (for the bad teams)!
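Those counts can be checked with a quick normal approximation.  The Python sketch below assumes every game is an independent coin flip around the team's expected winning percentage (taken as .500 for simplicity), 30 teams, and 21 games played.  It ignores the fact that teams play each other, so treat the output as a ballpark.  The function name is invented for the example.

```python
import math


def expected_teams_off_by(wins_off, games=21, teams=30):
    """Expected number of teams at least `wins_off` wins above OR below
    expectation from luck alone, using a normal approximation to the
    binomial with p = 0.5 (an assumption made for simplicity)."""
    sd = math.sqrt(games * 0.5 * 0.5)            # about 2.3 wins after 21 games
    z = wins_off / sd
    two_tailed_prob = math.erfc(z / math.sqrt(2))  # P(|Z| >= z)
    return teams * two_tailed_prob


for gap in (2.5, 4, 5):
    print(f"{gap} wins or more off: {expected_teams_off_by(gap):.1f} teams")
```

Under these assumptions the approximation gives roughly 8 teams off by 2.5 wins or more, 2 to 3 teams off by 4 or more, and about 1 team off by 5 or more, which is in the same neighborhood as the counts quoted above.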

FLO 77 wins, .003, they suck on offense and they have little pitching, despite a good start.
WAS 70, 0 (0 means less than 1 in a 1000, rounded off to the nearest 1000th - all teams have a finite chance of winning the WS of course), they have a decent offense but horrendous pitching.  Still, they are extremely unlikely to lose 100 games, as I assume many people think they will given their bad start.
BAL 73, 0, they are still a very bad team, with little offense and decent pitching at best.
CWS 80, .01, Very weak offense with decent pitching.
CLE 86, .086 Still very strong all the way around.  If Sabathia ends up being injured then they will take a 3 or 4 win hit of course.
DET 82, .032 I did not like them that much before the season started - I had them at 88 wins - their offense is very good, but certainly NOT 1000 runs good, their defense is not good, and their pitching is just OK (I am NOT a big Verlander and Bonderman fan, and certainly not a fan of Rogers and Willis).
KCA 71, 0 Their offense is awful and they have a few good pitchers of course.  Still overall a bad team.
NYY 92, .139 I still love them despite all the bad things you hear about them.  Great lineup (better than Tigers and Boston) and good, but not great, pitching.
OAK 85, .042 I had them winning 80 games before the season started, more than most people thought.  If Harden pitches less than 120 IP or so, my projection will take a hit.  Decent lineup, decent pitching.
TBA 84, .027 I loved them before the season started.  I had them at 86 wins.  I may not have incorporated their manager enough (I don’t incorporate managers at all actually), who I think is one of the worst in baseball.  They have a good offense, a 1000-fold improved defense, and a decent to pretty good pitching staff.
TOR 81, .011 Bad lineup, decent pitching.
STL 82, .01 Decent lineup, bad pitching.
ATL 86, .052 I like them all the way around.  I had them at 87 wins before the season.
LAN 85, .044 Very good pitching.  Good lineup.  Had them at 85 wins before season.
SDN 84, .028 Like their lineup a lot, especially their defense, and like their pitching a lot.  After Peavy and Young, you don’t need a lot to have a good overall pitching staff.  Also had them at 85 wins pre-season.

The rest:

ARI 88, .053
CHN 89, .061
CIN 77, .003
COL 80, .014
HOU 71, 0
MIL 88, .061
NYN 90, .103
PHI 81, .01
PIT 73, 0
SFN 67, 0
WAS 70, 0
ANA 87, .091
BAL 73, 0
BOS 91, .113
MIN 78, .007
SEA 78, .006
TEX 74, .002
TOR 81, .011

BTW, if anyone wants to scream bloody murder about any of these projections, feel free to do so, and feel free to put your money where your mouth is.  If anyone wants to wager over or under any of these numbers, and wants to lay 3-2 odds, I’ll happily take the bet.  Of course that offer is for entertainment purposes only, in case it is a violation of any state or federal laws, and all parties must agree that all monies won will go to charity.  I also reserve the right to refuse any offers based on a possible typo above, since it is 1:00 in the AM!

(17) Comments • 2008/05/13 • SabermetricsForecasting