Wednesday, August 23, 2006
Selective Sampling - How NOT to Choose Players
Cy Morong takes a look at establishing the replacement level. He says:
I ranked all NL pitchers from worst to best in RSAA per IP
And therein lies the problem. You cannot establish, after the fact, who is a replacement-level pitcher. Say you have a pitcher who gets thrown into the dustbin, I dunno, maybe Orlando Hernandez, between the 2003 and 2004 seasons. He was a free agent, and signed with the Yanks for $500,000. That’s the definition of replacement level. In 2004, he went 8-2, with a 3.30 ERA. However, he would fall outside Cy’s selection criteria (and those of pretty much everyone else who looks at this issue).
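This is a huge selection bias, and it invalidates any results presented: ranking on performance after the fact keeps only the pitchers who happened to pitch badly, and drops every replacement-level pitcher who happened to pitch well.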
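To make the bias concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration, not Cy's data or method: the function name `simulate`, the normal distributions for talent and luck, the 20% cutoff, and especially the use of true talent as a stand-in for "identified beforehand" (in real life you would use something like how the pitcher was acquired, since talent is not observable). It compares picking the bottom group by observed results with picking the bottom group before the season is played.

```python
import random
import statistics


def simulate(n_pitchers=2000, talent_sd=1.0, noise_sd=1.0,
             bottom_frac=0.2, seed=0):
    """Compare two ways of picking a 'replacement-level' group.

    All parameters are hypothetical. Values are in arbitrary
    runs-above-average units (higher is better).
    """
    rng = random.Random(seed)
    # True talent, plus one season of luck on top of it.
    talent = [rng.gauss(0.0, talent_sd) for _ in range(n_pitchers)]
    observed = [t + rng.gauss(0.0, noise_sd) for t in talent]
    k = int(n_pitchers * bottom_frac)
    idx = range(n_pitchers)

    # After the fact: the k worst pitchers by observed results.
    after = sorted(idx, key=lambda i: observed[i])[:k]
    # Before the fact: the k worst by talent (stand-in for
    # "freely available before the season started").
    before = sorted(idx, key=lambda i: talent[i])[:k]

    def mean_of(values, group):
        return statistics.mean([values[i] for i in group])

    return {
        "after_fact_observed": mean_of(observed, after),
        "after_fact_talent": mean_of(talent, after),
        "before_fact_observed": mean_of(observed, before),
        "before_fact_talent": mean_of(talent, before),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:22s} {value:6.2f}")
```

Under these assumptions, the after-the-fact group looks worse than its own talent, because it was selected partly on bad luck. The before-the-fact group performs at about its talent level, since its good and bad seasons both stay in the sample. That gap is regression toward the mean at work.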
I also recommend these articles, which discuss True Score Theory, Theory of Reliability, and Regression Toward the Mean:
http://www.socialresearchmethods.net/kb/truescor.htm
http://www.socialresearchmethods.net/kb/reliablt.htm
http://www.socialresearchmethods.net/kb/regrmean.htm