THE BOOK--Playing The Percentages In Baseball

Thursday, June 03, 2010

No, no, no, no, NO!  (Part 2)

By Tangotiger, 04:51 PM

I don’t remember what my last no,no,no,no,no thread was about.  It must have bothered me then, and this one bothers me now: running regressions of salary to wins.

I just don’t know what to say at this point any more.  I’ve got at least a dozen threads on this.  Correlation increases as the number of games increases.  It’s really that simple.  There’s a huge difference between running a regression against 70 games and against 700 games.  Every time I see one of these regressions, the implicit treatment of the OBSERVED winning percentage is that it’s a TRUE winning percentage. 

Even if God were to tell you the exact talent level of every single player in MLB, you will never be able to get r=0.9999 between talent and winning %.  Not unless you’ve got one million games played.
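To put rough numbers on this, here is a minimal sketch. It assumes that true team talent has a standard deviation of 0.06 in win% (an illustrative assumption, not a figure from this post) and that luck is pure binomial noise around a .500 team. The function names are made up for the example.

```python
import math

def talent_vs_observed_r(n_games, talent_sd=0.06, p=0.5):
    """Expected correlation between true talent and observed win% over n_games.

    talent_sd is an assumed spread of true win% talent (hypothetical value).
    """
    talent_var = talent_sd ** 2
    luck_var = p * (1 - p) / n_games  # binomial variance of an observed win%
    return math.sqrt(talent_var / (talent_var + luck_var))

def games_needed(target_r, talent_sd=0.06, p=0.5):
    """Games per data point required to reach target_r under the same assumptions."""
    talent_var = talent_sd ** 2
    luck_var = talent_var * (1 / target_r ** 2 - 1)
    return p * (1 - p) / luck_var

for n in (70, 162, 700, 2000):
    print(n, round(talent_vs_observed_r(n), 3))
print("games for r=0.9999:", round(games_needed(0.9999)))
```

Under these assumptions the correlation is about 0.71 at 70 games and about 0.95 at 700 games, and reaching r=0.9999 takes hundreds of thousands of games per data point, even with talent known perfectly.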

Please, guys, stop it with regression analysis.  Apologies to Hawkonomics for using his/her post as my target practice.  I otherwise enjoy that blog.


#1    MGL      (see all posts) 2010/06/03 (Thu) @ 21:29

Wow, this is just a disaster of a sentence (a criminal misuse of the concept of “statistical significance"):

“Second, we I run a regression (including a constant term), I find that payroll for the 2010 season is statistically insignificant. In other words in statistical terms payroll has zero effect on winning percentage at this point in the season. From that I would conclude that payroll for the 2010 MLB season has zero impact on winning percent in the 2010 season.”

So we really need a statistical test to tell us whether payroll is related to winning percentage?  If it weren’t teams would have to be randomly paying for wins, if you know what I mean. In other words, they would have to be valuing players by throwing darts at a dartboard with wedges of 0 to around 5 WAR.


#2    Guy      (see all posts) 2010/06/03 (Thu) @ 21:42

That’s nothing.  Here’s what Brook wrote in last year’s post on why payroll in MLB isn’t important:

“If you are still not convinced, suppose all MLB teams payroll decreased from one year to the next. If all teams payroll decreased would all teams performance decrease? It cannot since the average performance for the entire league will still be 0.500. It is thought experiments like this that allow us to see why even if team payroll and team performance are positively correlated, they have little to do with each other in a statistical sense.”

And this is someone who is paid to teach statistics (or at least econometrics) at the college level....


#3    J-Doug      (see all posts) 2010/06/04 (Fri) @ 01:57

I don’t understand why anybody would even attempt to run a regression simply based on the 70 games this year. Funny to see this, since I’m posting a less rigorous (but I think more interesting) post on my blog tomorrow. Feel free to tear it apart.

“Even if God were to tell you the exact talent level of every single player in MLB, you will never be able to get r=0.9999 between talent and winning %.  Not unless you’ve got one million games played.”

Well this is true about all models, right? The real question is what does r approach as the sample approaches infinity, and how this r compares to the r in _other sports_.

I’m not the first to run the data, but from what I can tell the r for payroll vs. performance doesn’t even approach 0.50 with two decades’ worth of data.


#4          (see all posts) 2010/06/04 (Fri) @ 02:56

Tango, would you have the same objection to someone graphing wins and payroll? I don’t know that the intuition would be any different between a graph and this regression. It’s just a simple way to digest data.

I’m not trying to defend these guys, although I do think reduced form correlations can be fun to look at in a very uninformative kind of way. However, these guys are obviously wildly misinterpreting their results.


#5    Guy      (see all posts) 2010/06/04 (Fri) @ 06:04

"from what I can tell the r for payroll vs. performance doesn’t even approach 0.50 with two decades’ worth of data.”

Not true.  Tango gets r=.75 with just 12 years of data:  http://www.insidethebook.com/ee/index.php/site/comments/how_many_wins_are_those_dollars_buying/


#6    Tangotiger      (see all posts) 2010/06/04 (Fri) @ 09:26

Guy has it right about my link.  It’s one thing to keep each data point to 162 games, so that if you go to 1 gazillion years, you’ll still get the same r=.4 or whatever it is.  All that increasing the number of data points does is reduce the uncertainty level.

But, by combining the data along team lines (and ignoring years), you are increasing each data point from a sample of 162 games to say 2000 games.  This is what causes the r to go up, because now the observed win% is much closer to the true win%.

***

To the person named after the most horrible character in moviedom:

“However, these guys are obviously wildly misinterpreting their results.”

My objections are purely on interpretation grounds, though ostensibly running a regression and charting a graph has the implication of opinion.


#7    J-Doug      (see all posts) 2010/06/04 (Fri) @ 10:09

@Guy and Tango: Good stuff. I’ll need to take a closer look at it. In my own work I was using a time-series analysis to get n = 1620 for a ten year span.


#8    J-Doug      (see all posts) 2010/06/04 (Fri) @ 10:15

Taking a closer look, my smaller R is due to my inclusion of data from 1988-1994. In those years it seems that payroll is far less a factor in winning than it is from 1994-2009. I’ll be posting my work in a few weeks or so.


#9    Tangotiger      (see all posts) 2010/06/04 (Fri) @ 10:30

J-Doug, you absolutely and positively need to adjust salaries.  Are you doing that?  The way I do it is simply scale everything to the league average for that year, and call it a Payroll Index.
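A minimal sketch of that scaling, assuming payrolls arrive as (year, team, dollars) rows. The function name, team labels, and dollar figures are hypothetical, made up for illustration.

```python
from collections import defaultdict

def payroll_index(rows):
    """Scale each team's payroll to the league average for its year.

    rows: iterable of (year, team, payroll) tuples.
    Returns {(year, team): payroll / league_average_for_that_year}.
    """
    rows = list(rows)
    by_year = defaultdict(list)
    for year, _team, dollars in rows:
        by_year[year].append(dollars)
    league_avg = {year: sum(v) / len(v) for year, v in by_year.items()}
    return {(year, team): dollars / league_avg[year] for year, team, dollars in rows}

# Hypothetical two-team league in two seasons (payroll in millions of dollars)
sample = [(1999, "A", 30.0), (1999, "B", 90.0),
          (2009, "A", 60.0), (2009, "B", 180.0)]
index = payroll_index(sample)
print(index[(1999, "A")], index[(2009, "A")])
```

In this toy data team A gets the same index in both seasons even though its dollar payroll doubled, which is the reason for scaling before pooling years.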

Do not do what others do and take the log of salaries.  This is a crutch that students and professors have been using because, well, that’s what they’ve been taught.  Considering that wins and salaries are linearly related to begin with, taking the logs of salaries introduces a second problem.

Minimizing the errors in the logs of salaries is not the objective.  For example, the ln(1) + ln(25) = 2 * ln(5).  So, 25MM$ is as far away from 5MM$ as 1MM$ is, according to loggers.  But being off by 20MM$ in one direction should not be equivalent to being off by 4MM$ in the other direction.

But, that’s what the flat loggers society deals with.
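The arithmetic in the example above can be checked directly. This is only a sketch of the 1MM$ / 5MM$ / 25MM$ case from the comment; the variable names are made up.

```python
import math

low, mid, high = 1.0, 5.0, 25.0  # salaries in millions of dollars

# Distances from the middle salary in log space
log_gap_down = math.log(mid) - math.log(low)
log_gap_up = math.log(high) - math.log(mid)

# Distances from the middle salary in dollars
dollar_gap_down = mid - low
dollar_gap_up = high - mid

print("log gaps:", log_gap_down, log_gap_up)          # equal, both ln(5)
print("dollar gaps:", dollar_gap_down, dollar_gap_up)  # 4 versus 20
```

A least-squares fit on log salaries treats the two misses as the same size, while in dollars one is five times the other.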


#10    J-Doug      (see all posts) 2010/06/04 (Fri) @ 10:38

Tango:

Yes, of course I am. I used the exact same method you did. Now that I see you already did it that way I feel much more confident in my results.

I absolutely abhor when researchers take the log of the variable when there’s no methodological case for it. It’s a great way of flattening out a variable so that it looks linear when it isn’t, and a great way of making linear relationships look insignificant when they aren’t.


#11    dq      (see all posts) 2010/06/04 (Fri) @ 10:55

#9/ Education time for the old guy. Why would someone use logs, or maybe where is it appropriate to use the logs for something like this?


#12    Depot      (see all posts) 2010/06/04 (Fri) @ 12:24

There are lots of problems with regressing Win% on Payroll, but “the implicit treatment of the OBSERVED winning percentage is that it’s a TRUE winning percentage” isn’t really one of them.  You’re essentially just saying that the winning percentage is measured with error (and classical measurement error seems like a decent assumption here).  But measurement error in the dependent variable doesn’t affect the consistency of the estimates.  Yeah, it makes them more imprecise and that should be explicitly discussed, especially since the standard errors are (apparently) large here.  But, in general, you’re not making that assumption in a regression.  Now, if we think that payroll is measured with error (or not representing exactly what we want it to), then that causes more serious problems.


#13          (see all posts) 2010/06/04 (Fri) @ 12:29

dq/11,

Researchers normally use logs of data for regressions in order to be able to use standard p-values (since regression coefficients can be estimated on non-normal distributions, just your inference will be screwy depending on the severity of it).  The log of the data transforms right or left skewed data.  Honestly, I think people use it too much because others demand the data be normal when that’s really an unrealistic constraint (and in small sample sizes, normal data can look like any shape you could think of).


#14    James Holzhauer      (see all posts) 2010/06/04 (Fri) @ 13:05

"And this is someone who is paid to teach statistics (or at least econometrics) at the college level....”

Two of my friends took a statistics course together at Illinois State University, and the professor taught them that the Martingale system (keep doubling your bet size until you win one, then start over) was a sure way to beat the roulette wheel in Vegas.  Apparently he sold the whole class on the concept, including my friends (it took me an hour to convince them that they would go broke doing this).

It can’t be very difficult to keep a job as a stats professor.


#15    J-Doug      (see all posts) 2010/06/04 (Fri) @ 13:35

James/14: External validity has never been the strong suit of quantitative social scientists and econometricians.


#16    MGL      (see all posts) 2010/06/04 (Fri) @ 13:44

"it took me an hour to convince them that they would go broke doing this”

Only if they played an infinite number of times would they definitely go broke.

The correct “answer” is that while they have a “negative expectancy,” it is extremely likely that they will win one “unit” in any given session, depending on the casino’s betting limit of course.  And if there were no betting limit and you were allowed to use credit to make your bet, you would have close to a 100% chance of making money, right?  In fact, for all practical purposes, it would be a sure bet.  So, that statistics professor was not necessarily off his rocker!

If I told you that I was going to pay you a million dollars if you won in a casino and you would pay me a million if you lost (it doesn’t matter the amount of win or loss), and that you had to make an even money bet, you bet your life you would use a Martingale system!

My point is that whether you have a positive or negative expectancy on a bet is hardly the whole story and the Martingale system is a good example of that.


#17    Tangotiger      (see all posts) 2010/06/04 (Fri) @ 13:52

"But, in general, you’re not making that assumption in a regression. “

Correct, but you are making that assumption in the magnitude of the correlation coefficient.

Basically, the regression equation will be virtually unchanged regardless of the data you choose (to a point anyway), but the correlation coefficient will continue to rise if the number of games in each data point increases.

And it makes perfect sense because the r measures the level of total variance that is not from the binomial.  I mean… big deal, right?  Who cares how high the r-squared can get, say 25%, if the reason for the other 75% is the binomial?  So, 25% of the variance is explained by salary… but 60% (or whatever) of the variance is explained by sample size!  So, there, virtually the entire relationship is explained, with the remainder explained by service class payouts and bad/good years.
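A toy simulation can illustrate the claim that the regression line barely moves while r climbs with games per data point. Everything here is an assumption made for the sketch: payroll index drawn with SD 0.3, a true slope of 0.10 win% per unit of payroll index, non-payroll talent noise with SD 0.04, and 1500 simulated teams. None of these figures come from real MLB data.

```python
import math
import random

def r_and_slope(xs, ys):
    """Pearson correlation and least-squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy), sxy / sxx

def simulate(n_games, n_teams=1500, true_slope=0.10, seed=1):
    team_rng = random.Random(seed)            # same simulated teams for every n_games
    game_rng = random.Random(seed + n_games)  # separate stream for game outcomes
    payroll, win_pct = [], []
    for _ in range(n_teams):
        x = max(0.3, team_rng.gauss(1.0, 0.3))  # hypothetical payroll index
        talent = 0.5 + true_slope * (x - 1.0) + team_rng.gauss(0.0, 0.04)
        talent = min(0.9, max(0.1, talent))     # keep true win% in a sane range
        wins = sum(game_rng.random() < talent for _ in range(n_games))
        payroll.append(x)
        win_pct.append(wins / n_games)
    return r_and_slope(payroll, win_pct)

results = {n: simulate(n) for n in (50, 162, 2000)}
for n, (r, slope) in results.items():
    print(f"{n:>5} games per data point: r={r:.2f}, slope={slope:.3f}")
```

In this setup the fitted slope should stay near the assumed 0.10 at every sample size, while r rises as the binomial noise in each observed win% shrinks.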


#18          (see all posts) 2010/06/04 (Fri) @ 13:53

MGL/16: Casinos have an upper betting limit.  If the maximum is $500 and the minimum is $1, then after nine straight losses starting at $1 you’ll be down $511 and be unable to bet the required $512.

On a 0/00 wheel, your chances of losing 9 straight are about 1 in 323.  So 322 times you’ll win $1, and the 170th time you’ll lose $511.


#19          (see all posts) 2010/06/04 (Fri) @ 13:54

Correction to 18, last sentence: the 323rd time you’ll lose $511.
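The numbers in #18 and #19 can be reproduced with a short sketch, assuming an even-money bet on a 0/00 wheel (20 losing pockets out of 38), a $1 minimum and a $500 maximum. The function name is made up for the example.

```python
def martingale_session(min_bet=1, max_bet=500, p_lose=20 / 38):
    """One doubling session: stop at the first win or when the table limit blocks the next bet."""
    bets = []
    bet = min_bet
    while bet <= max_bet:
        bets.append(bet)
        bet *= 2
    max_loss = sum(bets)              # lost if every allowed bet loses
    p_bust = p_lose ** len(bets)      # chance of losing every allowed bet
    # Any win recovers all earlier losses plus one minimum bet
    ev = (1 - p_bust) * min_bet - p_bust * max_loss
    return len(bets), max_loss, p_bust, ev

streak, max_loss, p_bust, ev = martingale_session()
print(f"{streak} bets allowed, bust costs ${max_loss}")
print(f"bust chance is about 1 in {1 / p_bust:.0f}")
print(f"expected value per session: ${ev:.2f}")
```

This squares with both sides of the exchange: the session is won more than 99% of the time, and the expected value per session is still negative.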


#20          (see all posts) 2010/06/04 (Fri) @ 13:57

>“It can’t be very difficult to keep a job as a stats professor.”

It seems like it’s hard to *get* one, though.  I applied for an advertised math teaching job at the local community college last week.  They prefer an M.Sc. and an education degree.  I doubt I’ll even get an interview.


#21    Guy      (see all posts) 2010/06/04 (Fri) @ 14:34

"And it makes perfect sense because the r measures the level of total variance that is not from the binomial.  I mean… big deal, right?”

Right, the problem here is Brook’s assumption that because he thinks .22 is “low” the relationship isn’t important.  But it might explain half or more of the non-random variance.

It’s similar to Berri deciding that because the coefficient of variation for goalies’ save% is so low (.002), “there simply is very little difference in the performance of most NHL goalies.” This is totally arbitrary, and tells us nothing about how many games a good goalie wins.  As Phil nicely showed, if the NHL happened to record “goal%” instead of save%, the CV would be 8x as large—but the talent spread would be identical.
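A small sketch of why the coefficient of variation depends on which side of the ledger is recorded. The save percentages below are hypothetical values chosen for illustration, not real NHL data.

```python
import statistics

# Hypothetical goalie save percentages (made-up numbers)
save_pct = [0.875, 0.882, 0.889, 0.896, 0.903]
goal_pct = [1 - s for s in save_pct]  # same goalies, measured as goals allowed per shot

def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

ratio = cv(goal_pct) / cv(save_pct)
print("CV of save%:", round(cv(save_pct), 4))
print("CV of goal%:", round(cv(goal_pct), 4))
print("ratio:", round(ratio, 2))
```

The spread between goalies is identical in both versions; only the mean in the denominator changes, so the ratio of the two CVs is just mean save% divided by mean goal%.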

As I’ve said before, the problem with the academics is not limited to a lack of subject matter expertise.  Many of them also need to learn statistics…



#23    MGL      (see all posts) 2010/06/04 (Fri) @ 18:39

"MGL/16: Casinos have an upper betting limit.”

Phil, thanks.  Considering that I’ve lived in Las Vegas for 29 years, I might know that… ;)


#24          (see all posts) 2010/06/04 (Fri) @ 18:45

MGL/23: I knew you knew that, even though I didn’t know you lived in Vegas, in part because you mentioned it in the post I was replying to.  :)

But, yeah, I misread the conversation ... I mis-thought it was a debate on whether the strategy worked, but you guys already knew the answer to that.


#25    MGL      (see all posts) 2010/06/04 (Fri) @ 20:06

NP :)


#26          (see all posts) 2010/06/04 (Fri) @ 21:57

On a related but unrelated note, I was just in Vegas for the 2nd time in my life.  18 guys for a bachelor party.  I didn’t know most of them, but a handful were what I believe are called ‘whales.’

I couldn’t go anywhere near the tables when they were playing.  It just hurt too much - one guy had a ‘system’ for roulette; they played craps and made seemingly random bets; they drank a ton while playing blackjack for hours.  Everybody claimed to be ‘hot’ or that the dice were ‘hot’.  The four biggest gamblers were down $30k+ between them over 2.5 days.  I tried to discourage my unemployed friend from joining the fray, but he said all my talk of ‘odds’ was a downer.

I made one bet all weekend - on a baseball game that ended up getting rained out.


#27          (see all posts) 2010/06/04 (Fri) @ 23:29

I’m glad I don’t have the gambler gene in me ... I imagine it gets very expensive to indulge a taste for gambling. 

When I go to Las Vegas, I spend most of my time in the Pinball Hall of Fame instead of the casino.  Not only do they have a lot of beautiful vintage machines, but they also have a slot machine with a 100% payout ratio.

Well, actually, it’s not a slot machine, it’s a change machine.  Still the best deal in town.  :)


#28    Vic Ferrari      (see all posts) 2010/06/05 (Sat) @ 14:58

@ Phil/18

Poor Lévy, I’m sure he presented the martingale system as an exercise to illustrate the folly of gambling systems. Now every gambling strategy snake oil salesman pimps him as the brilliant mathematician who developed the martingale system.

Is anyone here familiar with the use of martingale strategies in survival/censorship problems?  There is an introduction to the thinking here (about 2MB).

It’s akin to the bankruptcy probability you express, Phil.

Recently, on a Flames hockey blog that I read, a commenter pointed out Chris Higgins’ (a forward, and not an enforcer) pattern of shooting percentage in his career to date.  It was roughly 14%, 12%, 10%, 8%, 6% ... and asked what the writer expected it to be next year.  It was implied that the commenter was expecting about 4%.

Personally I think that either ‘10%’ or ‘playing in Europe’ are the two best guesses.  It’s not like the guy has a worsening injury or illness.

Now this player brings a lot of other things to the table, he’s hard on the puck and helps his team outchance, plus he’s still young.  And he should come at a reasonable price, plus he’s from Long Island and the local team usually likes local guys extra.  So I hope he gets another NHL deal somewhere.  There are some smart GMs in this league, plus some active coaches that know his game, so he’s in with a chance.  Though if he was a few years older he’d be out of the league for sure, or if he was European he would very likely head to the KHL or another Euro league.

The question is, what happens in parallel universes where guys like this don’t fall out of the league, or see their playing time cut back?  The same applies to any sport.  I’m sure that we simply don’t see enough downward trends followed by bounceback seasons, not as many as we’d expect by chance.  Simply because the opportunity is lost, and a big chunk of the reason for that is luck.  And the failure of the human mind to properly perceive streakiness.

Obviously this has a severe effect on aging curves.  As well, for almost any MLB hitting stat, if we use a sample that is randomly chosen and balanced for age, then calculate the ability distribution for that sample.  What happens to the survivors the next season?  Intuitively they should improve as a group, this as the weaker and diminishing older players are weeded out.  ie the ability distribution should move to the right a smidge and the left tail should thin out.  Makes sense, right?

For NHL goaltenders (using road close-score EVsave% and weighted road 5v4save%) the opposite actually happens, granted it’s a small sample size and I just had three years of this data handy.  The same occurs for every MLB batting measure I’ve checked.

I’m open to any and all reasoned explanations, though adjusting the aging curves for these batting measures to get a best-fit result ... that seems dodgy to me.  Anyone advancing that argument should ignore Berri, just generally stop picking on the smallest kids on the schoolbus :), and produce a critique of Jim Albert’s work on the subject.

Long story short, I suspect that stochastic counting processes, like the martingale methods so prevalent in biostatistics, are the key to unravelling this.  And if I was knowledgeable about this kind of math I wouldn’t be blathering on right now.  I’m not.  In fact I’m spectacularly ignorant.

Plus there has been so much work in this field, with increasing complexity, that I’m bewildered.  I don’t know where to start.

Is anyone here familiar with the mathematical expression of this type of reasoning?  i.e. the practical execution of this type of math?


#29    MGL      (see all posts) 2010/06/05 (Sat) @ 15:38

You lost me at “@”.


#30    Martin Monkman      (see all posts) 2010/06/06 (Sun) @ 17:54

I’ve re-run and confirmed the regression, and written a longer analysis that I hope shines some light on the results.

You can find it at my blog:
http://bayesball.blogspot.com/2010/06/cant-buy-me-wins.html


#31    MGL      (see all posts) 2010/06/06 (Sun) @ 18:08

Martin, same thing that Tango said. If your sample (the number of games in this case - not the number of teams - “sample” is not the right word) is small enough, no matter how strong the relationship is between the two variables, the R will be small.  So I am not sure of the point other than the correct conclusion that after 50 games, salary is not much of a predictor of WP.  After 100 games, it will be more of a predictor.  After 162 games, even more. After 2 seasons, even more. Etc.


#32    Tangotiger      (see all posts) 2010/06/06 (Sun) @ 19:10

Martin,

You didn’t address the two main things that I brought up:

1. The more games you use for each data point, the higher the r.  Extrapolating 50 games to 162 is not a good thing to do.  I get r=.75 when I use 2000 games for each data point.  Why we want to look at one model that has r=.22 when I’ve already shown a model that has r=.75 is beyond me.

The entire point of these studies is that you can never say “payroll does not link to wins”.  All you can say is “using this model, I can’t find a strong payroll-wins link”.  So, why would you keep looking for that needle in that haystack using the same methods?  I’ve already found the needle. 

“George is getting upset!”

2. The most important part is that part of the variance is explained by the binomial.  Really.  Why in the world is the number of games not being addressed?


#33    Vic Ferrari      (see all posts) 2010/06/06 (Sun) @ 20:38

Martin

I’ve added your blog to my feed reader, and I don’t have any argument with your assessment.  Unless I’m missing something, it is the same thing that Tango and Birnbaum have said, using the same reasoning.  In any case, that strikes me as sound.

I do find it ironic that a blog entitled Bayesball would use purely frequentist math with its first post, though.  :D


#34    Vic Ferrari      (see all posts) 2010/06/06 (Sun) @ 21:13

Tango said:

2. The most important part is that part of the variance is explained by the binomial.  Really.  Why in the world is the number of games not being addressed?

Agreed.  In fact I think everyone agrees, no?

What we have is a distribution of results, part of it is explained by binomial chance variation in game play, part of it by salary (for the moment we’ll knowingly ignore the fact that salary isn’t distributed anything close to Normal, and use r² to establish its variance).

The question is, what’s left in the remaining chunk, the bit explained by neither of the above.

Is it normally distributed and wholly independent of team salary?  If so the math becomes trivial, ANOVA or solving-DIPS or whatever ... I don’t think that’s likely, though.

Obviously I did originally, comment #3 at Phil Birnbaum’s blog expresses that.  I estimated an eventual r of 0.77.

Phil’s re-expression of the information in post 4 left me uncomfortable with that though.  Run the link there, Tango, the birnbaum.php thing.  I know it’s just one season, but the implication is that “being a big spending team over the long haul” and “spending big in one year” are very different things.  And counterintuitively, to me at least, it’s not the splurger that gets rewarded.  I say that cautiously, obviously it’s just one season tested and the model has significant edge effects at higher games played.


#35    Vic Ferrari      (see all posts) 2010/06/06 (Sun) @ 21:49

Tango,

Unrelated to this topic:

You post so frequently that I doubt you’d recall, but I posted here a while ago on an NHL save percentage thread.

Firstly, I did get back to reading Solving DIPS ... the first step is clean (with similar frame sizes, the impact of chance variance is consistent, regardless of the form of the ability distribution).

After that it’s pretty hairy.  Interaction between the different aspects could lead one wildly astray.  Do teams with hitter’s parks tend to acquire good hitters for the outfield at the expense of defense?  If they don’t ... my point is invalid.  If they do ... we have a problem.

Oddly enough, an NHL poster, a biostatistician using the pseudonym DoctorMyBrainHurts ran Solving DIPS on save percentage and concluded that scoring bias was irrelevant.  Had he been more honest he would have said that scoring bias is comprised of antimatter, causing the universe to draw in upon itself, and suggesting that the world is about to end.

More years of data would just enforce that viewpoint.

The less alarming theory would be that, by chance or otherwise (and it’s just 30 teams, for crying out loud) shot recorders for teams with bad goalies are generally a bit more stingy with the shot count.

As well there is empirical evidence (shot counting of individual games) and statistical evidence (persistence of shots on net/shots at net data) that confirms that.

My position is that assuming that all elements are normally distributed and wholly independent of one another is naive, and might result in the right answer, and might result in something wildly wrong.  Nothing more or less than that.


#36    Vic Ferrari      (see all posts) 2010/06/06 (Sun) @ 22:11

Just to add on the previous hockey goaltending post I mentioned, in the event that anyone remembers.

In defense of Sunny, who seemed like a condescending bully at times there, it would be like you going to Joe Morgan’s blog, Tango.  How do you think that would go?  It’s not easy when people don’t understand you.

And I don’t think he was really trying to grift that kid, the one who lifted his skirt and ran (he called it a skort and claimed he was merely walking away briskly in the direction he planned to go all along.  But it was what it was).

That kid couldn’t possibly have understood the overwhelming nature of the odds in Sunny’s proposal.

Personally I respect Mehta, but I consider him an adversary.  And I don’t know what he’s trying to accomplish with some of the tidbits he throws out.  And while I would actually enjoy seeing clever people ripping a strip off the guy ... it pains me to see fools attempting the same thing.

Just had to get that off my chest.  Carry on with your day, Tom.  :)


#37    Martin Monkman      (see all posts) 2010/06/07 (Mon) @ 15:29

Tango,

Re: #32.  Thanks for the response. As I step out of the sabermetric shadows, I guess I’m feeling a bit timid; I wanted to find a way to politely say “The Hawkonomics model isn’t worth the electrons it’s printed on.” I didn’t think it worthwhile delving into the things you’d already raised in your original post, but instead I set about to simply identify all of the self-contained weaknesses of the regression model as presented.  Next time I will be bolder, and use more forthright language.

Vic,

Re: #33.  Thanks.  We like irony. 

With that said, though, the next post will be a probabilistic look at the frequency of perfect games.

