Wednesday, August 10, 2011
Correlation is not causation (no matter how much you want to use regression to pretend otherwise)
The way your typical academician uses payroll and one-season win% to show there’s little relationship between the two is terribly misleading. It ignores the huge random variation that remains even over 162 games, so how can you ever hope to get a strong correlation to begin with? Not to mention that it completely ignores that salaries are tremendously discounted for arbitration-eligible (arb) and pre-arb players.
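To put a rough number on that random variation, here is a back-of-the-envelope sketch in Python. It treats every game as a coin flip for an average team, which is a simplification. The talent spread of 0.060 in win% is an illustrative assumption, not a figure from this post, and the function names are made up for the example.

```python
import math

GAMES = 162

def luck_sd(games: int = GAMES, p: float = 0.5) -> float:
    """Standard deviation of win% that comes from chance alone (binomial)."""
    return math.sqrt(p * (1 - p) / games)

def max_correlation(talent_sd: float, games: int = GAMES) -> float:
    """Correlation between true talent and observed win%.

    If observed win% = talent + independent luck, anything that drives wins
    only through talent cannot correlate with win% more strongly than this.
    """
    noise = luck_sd(games)
    return talent_sd / math.sqrt(talent_sd ** 2 + noise ** 2)

# Luck alone is worth roughly 0.039 in win% over 162 games.
print(f"luck SD over {GAMES} games: {luck_sd():.3f}")

# ASSUMPTION: a talent spread of 0.060 in win%, chosen only for illustration.
print(f"ceiling on correlation: {max_correlation(0.060):.2f}")
```

The point of the sketch is the ceiling: even a perfect measure of talent cannot correlate perfectly with one-season win%, so a noisy stand-in like payroll starts out well below that.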
Basically, the relationship goes like this:
talent + RBIs-saves-and-shit + service time -> salary
talent + luck -> wins
So, that’s the cause-effect relationship. We know that’s the cause-effect relationship. And you can’t argue against this cause-effect relationship. Your model must have this as the cause-effect relationship.
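As a minimal sketch of what that model implies, the simulation below generates team-seasons where talent drives both salary and wins, with service time and misvaluation bending the salary side and luck bending the wins side. Every number in it (the talent spread, the replacement level, the discount for arb and pre-arb players, the range of veteran share) is a hypothetical assumption picked for illustration, and payroll is in arbitrary units.

```python
import math
import random

GAMES = 162
REPLACEMENT = 0.300  # ASSUMPTION: win% of a replacement-level team
DISCOUNT = 0.30      # ASSUMPTION: share of market value paid to arb/pre-arb players

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate(n_teams=3000, seed=1):
    """Return (corr(talent, win%), corr(payroll, win%)) for simulated team-seasons."""
    rng = random.Random(seed)
    talent, payroll, win_pct = [], [], []
    for _ in range(n_teams):
        # talent: true win probability per game (hypothetical spread)
        t = min(max(rng.gauss(0.500, 0.060), 0.05), 0.95)
        # service time: fraction of the roster's value paid at free-agent rates
        vet_share = rng.uniform(0.3, 0.9)
        # talent + misvaluation (the RBIs-and-saves term) + service time -> salary
        perceived = (t - REPLACEMENT) + rng.gauss(0, 0.02)
        pay = perceived * (vet_share + (1 - vet_share) * DISCOUNT)
        # talent + luck -> wins
        wins = sum(rng.random() < t for _ in range(GAMES))
        talent.append(t)
        payroll.append(pay)
        win_pct.append(wins / GAMES)
    return pearson(talent, win_pct), pearson(payroll, win_pct)

r_talent, r_payroll = simulate()
print(f"talent vs win%:  r = {r_talent:.2f}")
print(f"payroll vs win%: r = {r_payroll:.2f}")
```

Nothing in this setup has payroll causing wins or wins causing payroll. Both hang off talent, and the payroll correlation comes out weaker than the talent correlation purely because of the extra terms on the salary side.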
We’re not necessarily going to have a causative relationship between wins and salary, but that’s exactly how the academicians treat it. (One can just as well argue that wins lead to higher salaries as that higher salaries lead to wins.)
This is why I keep harping on subject matter experts. We KNOW the cause-effect relationship, and we know it’s extremely strong.
If you somehow want to use salary as a proxy for talent, then you are going to get biased results. That’s because the Mets will have more service time among their players than the Twins, even if both teams have the same talent level, and more service time means a higher payroll for the same talent.
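Here is a small sketch of that bias. Two teams get identical talent but different service-time mixes, and then talent is backed out of payroll the naive way, as if every team had a league-average mix. The veteran shares, the discount, and the replacement level are all hypothetical numbers, and the function names are made up for the example.

```python
REPLACEMENT = 0.300  # ASSUMPTION: win% of a replacement-level team
DISCOUNT = 0.30      # ASSUMPTION: share of market value paid to arb/pre-arb players
LEAGUE_VET_SHARE = 0.6  # ASSUMPTION: league-average share of value paid at free-agent rates

def pay_multiplier(vet_share: float) -> float:
    """Fraction of a roster's market value that actually shows up as payroll."""
    return vet_share + (1 - vet_share) * DISCOUNT

def payroll(talent: float, vet_share: float) -> float:
    """Payroll in arbitrary units: wins above replacement, discounted by service time."""
    return (talent - REPLACEMENT) * pay_multiplier(vet_share)

def naive_talent_from_payroll(pay: float) -> float:
    """Back out talent while ignoring service time (the biased proxy)."""
    return pay / pay_multiplier(LEAGUE_VET_SHARE) + REPLACEMENT

TRUE_TALENT = 0.550  # same for both teams
mets_est = naive_talent_from_payroll(payroll(TRUE_TALENT, vet_share=0.8))
twins_est = naive_talent_from_payroll(payroll(TRUE_TALENT, vet_share=0.4))

print(f"true talent for both teams: {TRUE_TALENT:.3f}")
print(f"payroll-implied talent, veteran-heavy team: {mets_est:.3f}")
print(f"payroll-implied talent, younger team:       {twins_est:.3f}")
```

The veteran-heavy team looks better than it is and the younger team looks worse, even though their expected wins are identical.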

