Friday, February 10, 2012
Jose Molina
Max gives us a component-by-component analysis of Jose Molina. Bottom-line: if you believe he controls the strike zone, then he’s a great catcher.
Pete:
When it’s said 3 years’ defensive data is needed to judge a player, what does that mean? I’ll use Biggio as an example (since they’re talking about him at Baseballthinkfactory) - from ‘92-’02, he was worth about -5 runs/year fielding except for ‘97 where he was +19. Was he (1) a generally poor fielder who had a good/lucky year, or (2) still a poor fielder in ‘97 that looks good only because of the noise in the numbers?
Me:
You never throw data away, unless you have a REALLY REALLY good reason to do so. And even then, it better be REALLY REALLY REALLY good.
The more data you have, the less you need to regress. So, you need two years of fielding data to tell you as much as one year of hitting data. Would you make conclusions based on one year of hitting data? No? Then, you need more than two years of fielding data.
Poz gives us an overview.
The only consideration you MAY have is that you have the peak in mid to late 20s because that’s when you have the most number of players to begin with (you can’t have 100 players at 6 WAR at age 18 if you don’t have 100 players to begin with, and you don’t have that, because they are not good enough at age 18).
Anyway, you could add the RATE of players who get 6 WAR, and that will give you… well… something. I don’t know what, but, something maybe.
http://sagarin.com/sports/wham_bam.pdf
Some familiar names, notably Sagarin and Wayne Winston. Football, more than baseball, is a great sport for using win expectancy charts. Whereas in baseball the pitcher has a large set of pitches and locations to choose from (and won’t even necessarily hit the target), and the batter has to choose to swing or not, and game theory and randomization will play a huge role here, in football you have a much more finite set of choices, and the play is over after the play (as opposed to baseball because of the count). The clock, penalties, the turnovers, etc, all add great variables that make the win expectancy really valuable for football.
Glove-slap: Kevin.
Not for the Expos, but the next best thing: the Jays. The one you want is probably “Intern - Baseball Operations”, but there are a few other non-tech jobs if you prefer those.
Tell ‘em you heard it from Tango, and it will help.
Glove-slap: NaOH.
Jeffrey S. Moorad, ’81, vice chairman and CEO of the San Diego Padres, headlines a panel of baseball and television executives at the Villanova Sports and Entertainment Law Journal Symposium, “Moneyball’s Impact on Business and Sports” on Friday, February 10 at noon.
Former Governor Edward G. Rendell, ’68, will be the moderator of the panel that also features Billy Beane, vice president and general manager of the Oakland A’s, Omar Minaya, senior vice president of baseball operations for the Padres and former general manager of the New York Mets, and Phil Griffin, president of MSNBC.
...
The symposium will be held from 12 until 2 p.m. on Friday, February 10, at the Pavilion on the campus of Villanova University. The event will be simulcast live to the Villanova Law website at http://www.law.villanova.edu. **Due to popular demand, the location of the event has been moved to The Pavilion**
There will be no cost for this event, and no CLE credits are offered. On-line registration is required; please click here to reserve a seat at the symposium.
I’ll be home in time to catch him on the 7:30pm broadcast. Should be fun!
This was quite a surprising claim from Colin:
These values are on a very different scale, since due to the lack of an intercept the values have to sum to one for the first regression and to three for the second regression, but they’re also very different in a more meaningful sense; recasting the first year to 1 (which is practically already done for us), we get weights of 1/.92/.90.
As you know, Marcel uses 5/4/3 for hitters (meaning 1, .8, .6) and 3/2/1 for pitchers (1, .67, .5). (I think it was 3/2/1… can’t confirm right now.)
I personally use .9994^daysAgo for hitters and .9990^daysAgo for pitchers, which has the effect of being 1, .8, .64, .512 (and so on, each 80% of the previous) for hitters, and 1, .7, .49 (and so on, each 70% of the previous) for pitchers.
Tests from other research make me think that it should be even more aggressive, so maybe 1, .7, .5 for hitters and 1, .5, .25 for pitchers. But, I haven’t researched that, so, I’ll just leave it there for now.
Colin has gone way to the other side, essentially going with a .9998 or .9999^daysAgo kind of model.
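To see how a daily decay rate turns into year-over-year weights, here is a minimal sketch. It assumes seasons are exactly 365 days apart and treats each season as a single point in time; the function name is made up for illustration.

```python
def yearly_weights(daily_decay, years=3, days_per_year=365):
    """Weight on each season, most recent first.

    Assumes each season sits exactly days_per_year behind the next one,
    which is a simplification of a true per-day weighting.
    """
    return [daily_decay ** (days_per_year * k) for k in range(years)]


hitters = yearly_weights(0.9994)    # roughly 80% retained per year
pitchers = yearly_weights(0.9990)   # roughly 70% retained per year
slow_decay = yearly_weights(0.9998) # close to the 1/.92/.90 scale

for label, weights in [("hitters", hitters),
                       ("pitchers", pitchers),
                       ("slow decay", slow_decay)]:
    print(label, [round(w, 3) for w in weights])
```

The yearly ratio is just the daily rate raised to the 365th power, so small changes in the fourth decimal place of the daily rate move the yearly weight a lot.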
Now, I agree with the framework for his testing, that you should and must include the PA component when establishing the weights. Frankly, this is an important step. When I did it for Marcel, I basically forced everyone in the system to have at least 300 PA, so that I didn’t have to worry about this portion too much (I should have worried a little about this, at least). Indeed, if you give everyone at least 500 PA for each of the three years, this step becomes basically unimportant (no worries at all). That’s because the weighting of each year (the PA of year 1 divided by the PA of years 1 + 2 + 3) will be the same for each wOBA of year 1, 2, and 3.
So, getting back to Colin’s important point: he’s saying that if you introduce the PA weighting component, we see that every year is important. I find this very hard to believe. I mean, it’s an exciting finding if true, and I’d like to see more research on this for sure. My guess at the moment is that there’s a selection bias issue, with guys who have a limited number of years, or with young guys.
Basically, does Colin’s finding apply across-the-board, or is it really limited to a subset of the population? I’d bet on the latter, and I’d bet that the Marcel 5/4/3 would still hold for players who are regulars. In any case, it’s an exciting prospect to consider.
***
A correction to Colin’s note here:
The third, and perhaps most important, takeaway has to do with regression to the mean. We can add a simplistic version of regression to the mean to our forecasting model by adding a TAv_REG of .260 (the league average) with a PA_REG of 1200. (The PA_REG comes from the Marcels; it’s included here mostly for the purposes of illustration. The regression component in PECOTA is a more rigorous model based on random binomial variance—again, the purpose here is only to illustrate the concepts.)
Consider a player with 650 PAs in three straight seasons, or 1950 total PA. Using the Marcel weighting of 1/.8/.6, that comes out to 1560 effective PA— in other words, throwing out 20 percent of a player’s PAs during that time period. That means 56 percent of a player’s forecast comes from his own performance, and 44 percent comes from the regression to the mean component. Using weights of 1/.92/.90 yields 1833 effective PA, throwing out only about six percent. Using the same regression component, that’s 60 percent of a player’s forecast coming from his own production and only 40 percent coming from regression to the mean. (And if you follow from the conclusions above and start using more years to forecast a player as well, even less regression to the mean is necessary.)
There’s a calculation error in there. Marcel uses a 5/4/3/2 model, with the 5/4/3 being the weights for years T, T-1, T-2, and the 2 being the weight for regression toward the mean (using 600 PA as the seasonal number). So, if you had, say, 700 PA in year T, 400 in year T-1, and 500 in year T-2, you get these effective weights:
year T: 700 x 5
year T-1: 400 x 4
year T-2: 500 x 3
regression: 600 x 2
That 600x2 is the same for everyone. Colin’s calculation error is that rather than using 5/4/3, he used 1/.8/.6. The net effect is that he is showing a far bigger regression amount than Marcel is actually doing.
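To put numbers on the difference, here is a minimal sketch of the two calculations. The function and variable names are made up for illustration; the weights and PA figures come from the text above.

```python
def forecast_shares(pa_by_year, year_weights, regression_pa):
    """Split a forecast into (player share, regression share).

    pa_by_year and year_weights are ordered T, T-1, T-2.
    regression_pa is the weighted league-average PA added to everyone.
    """
    player_pa = sum(pa * w for pa, w in zip(pa_by_year, year_weights))
    total = player_pa + regression_pa
    return player_pa / total, regression_pa / total


# Marcel as described: 5/4/3 on the player, 600 PA x 2 of regression.
example_player, example_regress = forecast_shares([700, 400, 500], (5, 4, 3), 600 * 2)
marcel_player, marcel_regress = forecast_shares([650, 650, 650], (5, 4, 3), 600 * 2)

# Colin's version: 1/.8/.6 on the player against the same 1200 PA.
colin_player, colin_regress = forecast_shares([650, 650, 650], (1, 0.8, 0.6), 1200)

print(f"700/400/500 player, Marcel: {example_regress:.1%} regression")
print(f"650 x 3 player, Marcel:     {marcel_regress:.1%} regression")
print(f"650 x 3 player, 1/.8/.6:    {colin_regress:.1%} regression")
```

For the player with 650 PA in three straight seasons, the 5/4/3 weights put about 13% of the forecast on regression toward the mean, against the 44% you get from pairing 1/.8/.6 with the same 1200 PA.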
It sure seems early to do a reboot, considering that Spider-Man 2 is one of the best comic-book movies ever, and Spider-Man 1 was pretty good as well. Tobey Maguire was also a great choice as actor.
And by the looks of the trailer, it seems to follow the Batman model: focus on the dead parents (which were barely mentioned in the movies or comics when I was growing up), focus on the science, and get grittier.
But, I’ll get past it, because once I set all that aside, this movie looks promising!
Oh, about 4MM$ a year for the next 3-4 years. Kershaw signed for two years, to forgo his first two arb years. How much would he have gotten, had he had the same performance, but not won the Cy?
I think Matt would be in a better position to answer that. However, if we look at the big four (Felix, Verlander, JJ, Weaver), we see a pretty typical pattern for one year deals, and what we should consider as the upper limit: 3-4MM$ the first year, 7-8MM$ the second year, 13-14MM$ the third year, and 20MM$ if it gets there for a 4th year.
As Dave reminds us, young guys sign away their early years as well. Lincecum got 2/23 for his first two years, but he came off TWO Cy Youngs.
Cole Hamels signed 3/20.5, which is a discount had he gone year-to-year like the Big 4 did (25MM$ or so for their first 3 years).
Kershaw just signed 2/19, and one would think that if he signed for a third year, he’d have gotten another 15MM$ or so to come in at 3/34. Signing multi-year also means you are getting a discount, so, that probably means he’d have expected to get say 38MM$ had he gone year-to-year, compared to the 25MM$ the Big 4 got. That’s a 13MM$ bonus for Kershaw over the Big 4 for those 3 years. Or 15MM$ bonus over Hamels.
Basically, a Cy Young really takes the sting out of arbitration, and accelerates a pitcher’s service clock by one year. Lincecum’s year 1 and year 2 arb deal (of 2/23) makes more sense if you think of it as his year 2 and year 3 arb deal. (The Big 4 for example were paid about 21MM$ for those two years going year-to-year.) Same thing with Kershaw, who signed for 2/19, which is very much in line with thinking that his Cy Young accelerated his service time, since it matches the Big 4’s year 2 and year 3.
According to the incomparable Brian Burke, the Pats should have given up the go-ahead TD at the two-minute warning.
The smartest play of all would’ve been for Belichick to have allowed the touchdown even earlier. The Patriots certainly could have done so on the play prior to Bradshaw’s touchdown run, when he was stopped for a one-yard gain, forcing New England to burn its second timeout. In fact, they probably should have allowed a touchdown as early as the two-minute warning. That’s the point at which the Win Probability of receiving a kickoff down by four or six points (0.23) exceeds the Win Probability of trying to stop the Giants from bleeding the clock dry (0.2). The Patriots would have had almost two minutes, two timeouts, and all four downs available to get a touchdown and steal the win.
Basically, every time out has a certain win value, every second lost has a certain win value, every yard lost has a certain win value. And Brian is saying that the Pats would have maximized their chances of winning by allowing the TD to happen at the two minute warning.
This is exactly what win expectancy charts (and to a lesser extent, run or point expectancy charts) should be used for.
As many of you know, the Unconstitutionality of Prop 8 (banning gay marriage) in California was upheld today by a 3-person panel of the 9th Circuit.
A leading proponent of Prop 8 said this:
“We are not surprised that this Hollywood-orchestrated attack on marriage – tried in San Francisco – turned out this way. But we are confident that the expressed will of the American people in favor of marriage will be upheld at the Supreme Court,” he said.
California voters passed Proposition 8 with 52 percent of the vote in November 2008, five months after the state Supreme Court legalized same-sex marriage by striking down a pair of laws that had limited marriage to a man and a woman.
Putting aside the (important) issue of a majority being able to dictate the rights or lack thereof of a minority, it really rankles me when any group uses the “will or mandate of the people” argument when 50-something percent vote for or against something or someone. I mean, isn’t 52/48 essentially a split? That is almost as far from “a mandate” or “will of the people” as you can get!
And the person quoted above says, “the American people.” Of course this was a California vote, not a national one. However, let’s talk about “Americans” since this guy did in fact say, “the will of the American people.”
According to this Gallup poll:
http://www.gallup.com/poll/147662/first-time-majority-americans-favor-legal-gay-marriage.aspx
53% versus 45% of Americans favor gay marriage! So this guy, in addition to incorrectly (and irrelevantly, since Prop 8 is a state issue) talking about the “will of the American people,” is full of crap as far as his facts are concerned, at least according to the poll I referenced.