THE BOOK--Playing The Percentages In Baseball


Thursday, April 05, 2007

Nate Silver meet Sean Forman, OR, why I really hate any and all predictions then, now and forever

By Tangotiger, 05:08 PM

Nate Silver posted his first impressions on Matsuzaka after the first inning, with the following prediction:

IP H R BB K NP W-L
4.2 4 3 4 3 98 ND

Here’s the game, courtesy of Fangraphs, with Matsuzaka getting 10K, 1 walk, and +.367 in WPA.  (Santana had only two starts better than that last year in terms of WPA.) Sean Forman chimes in with the best initial game of all Redsox pitchers since 1957, and Matsuzaka comes in fifth.

I’m using Nate as my foil, but it could have been any other intelligent, well-respected analyst out there.  If Nate was right, what would have been the reaction from his readers?  I dunno.  But his prediction is the complete opposite of what really happened.  Nate was wrong, and the opposite of what would have happened if he was right should happen now.

The truth of the matter is that we all know sh-t.  You, me, Nate, the manager, the opposing hitters.  We can’t tell who’s on or who’s off.  Maybe only the pitcher himself knows, and that’s a maybe.

This is the same thing with all the stock predictions.  Do you know how many stock predictions are given out on TV and in major media every month?  Thousands.  Do you know how many track these guys, to see how well they do?  I found one site, and every year the Wall Street Journal does an analysis.  And you know what?  Virtually all these experts can predict the future as well as mom&pop.  And with all those expert football picks of the week?  Long-term, they’re almost all around .500.  The BP chats are filled with “what do you think this guy will do”. 

That’s why I hate predictions.  They mean nothing, unless you are held accountable. 


#1    Nate Silver      (see all posts) 2007/04/05 (Thu) @ 18:36

Tom,

I don’t know.  The projection was not without thought, but I was mostly just trying to have some fun there.  If the projection had been dead-on, might I have bragged about it?  Sure, but I wouldn’t expect my readers to take it all that seriously.  It would have been a lucky guess.  And as the projection has turned out to be not so dead-on, I’ve owned up to that too (see link on name)

The more interesting question, I think, is whether someone can pick up something from a visual/aesthetic/scouting impression about how a pitcher is performing that day that would allow him to make a materially better projection about how the rest of his outing will transpire. 

For example, suppose that you have a computer program that predicts the Game Score that a pitcher will finish with on a given day based on his results in the first inning.  This could account for as many variables as you’d like.  Would an educated generalist like you or I who had actually watched the first inning be able to come up with a better prediction?  Would someone who had watched that pitcher dozens of times before?  Would a scout?  A manager?  You see where I’m going with this.  My guess is that the answer is almost certainly ‘yes’ for the scout and the manager, but I’m not certain about the other two.


#2    MGL      (see all posts) 2007/04/05 (Thu) @ 20:37

My initial observation of Dice-K (first time I have seen him) is that he is capable of throwing any pitch at any count and seems to randomize his pitches well.  Even with average stuff (he looks to have above average but not nasty stuff) any pitcher that can intelligently randomize his pitches at almost any count is going to be successful.  What does it take to do that?  One, control of most if not all of your pitches, and two, pitching intelligence.

And yes, trying to figure out whether a pitcher (or hitter) is on or off at any given time in a game, and then to predict his short-term future performance, is pretty much an exercise in futility, especially if you are not a professional scout or coach and even then....


#3    Tangotiger      (see all posts) 2007/04/05 (Thu) @ 20:55

First off, I want to make clear that I wasn’t picking on Nate specifically, just using something that was convenient that happens too often.  Nate of course is a blogger, so kudos to him for not letting his blog disappear on the issue, and following up on the matter.

As for observing the first inning to forecast the rest of the game: I just can’t believe it, so I’m more in mgl’s camp, if not more entrenched.  Even if a pitcher looks terrible that first inning to a scout, a pitcher has so many options at his disposal (mechanics, location, pitch selection, pitch speed, sequencing of pitches) that being bad at one in the first inning just means he relies on the others.  Or simply, he works his way through it, and finds his stuff.

What we need is more Curt Schillings:
http://38pitches.com/

I want to know his intent, and his feelings about it.  Then I can believe it.

I can be wrong on this issue…


#4          (see all posts) 2007/04/05 (Thu) @ 21:02

It would be interesting to know what the TradeSports odds were after that first inning, and what that implicitly said about the market’s expectation of Matsuzaka’s next few innings.

I’m guessing the estimates wouldn’t change much.

TradeSports gives data free to educational institutions; I wonder if we count?


#5    Tangotiger      (see all posts) 2007/04/05 (Thu) @ 21:38

Let me find out…


#6    Tangotiger      (see all posts) 2007/04/05 (Thu) @ 21:45

Ok, I put in a request.  Let’s see…


#7    Nate Silver      (see all posts) 2007/04/05 (Thu) @ 22:30

So you guys are telling me that if a guy who usually throws at 94-96 comes out throwing 83-85 (like Mark Prior in his first spring training start), that has no significance at all in terms of his expectation for the rest of the game (above and beyond the statistical evidence)?  That nobody could see that Pedro was tiring in the Aaron Fricking Boone game?

These are hyperbolic examples, perhaps, but I also find your conclusion a bit hyperbolic.  It’s just a matter of degrees.  If a computer spit out an over/under line for a guy’s game score or FIP based on the statistical evidence, how often could an informed observer beat that line?  I’d guess it would be about 52-55% of the time for a layman, on up to about 60% of the time for an expert.


#8    tangotiger      (see all posts) 2007/04/05 (Thu) @ 22:41

The 94/95 to 83/85 is measurable, and the difference is staggering.  I’m happy to agree with you here.

Pedro was tiring and I called for him to be removed, just before he struck out Soriano in the previous inning.  “Oh well, what did I know?” I thought to myself.  In that case, if the coach said that he’s deliberately altering his pitching mechanics, that his fastball has no more movement on it, then we have a prior: we have an expectation that at some point he’s going to tire, and therefore, we have reason to believe that he was tired.

If we looked at half of Tom Glavine’s career first innings, we’d relieve him right away.  We have no reason to suspect that pitchers’ mechanics are “on/off” to start a game.  It’s more likely that they have a whole arsenal at their disposal, and that they can work through it most of the time.

As a pitcher tires, his margin of error goes way down, and we get Pedro’s game 7.

I would think an informed observer would beat the line 51-52% of the time, an expert 54-55% of the time, and the pitcher himself would be at 60%.


#9          (see all posts) 2007/04/05 (Thu) @ 22:55

I assume by “beat the line,” you mean the opening line, right?  Because we don’t know yet if the after-the-first-inning line already adjusts for the fact that Prior is throwing 85.

Let me ask you guys the same question in a different way.  Suppose you watch the first inning of pitcher X, a good pitcher with a usual ERA of 3.75 and 7 K per 9 innings.  You spot something in the first inning that makes you think he hasn’t got it today.  What would you predict his ERA would be for the rest of the game?

Nate, you predicted 7.36 (3 ER in 3.2 innings) for Matsuzaka.  Would you feel as comfortable predicting 5.00?  Or 2.00?
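For anyone who wants to check where the 7.36 comes from, here is a minimal Python sketch. It assumes the usual box-score convention that "3.2" innings means 3 innings plus 2 outs (11 outs), and that all 3 runs are earned; the function names are made up for illustration.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings ('3.2' = 3 innings plus 2 outs) to outs."""
    whole, _, frac = ip.partition(".")
    return int(whole) * 3 + (int(frac) if frac else 0)


def era(earned_runs: int, ip: str) -> float:
    """Earned runs allowed per 27 outs (nine innings)."""
    return earned_runs * 27 / innings_to_outs(ip)


# 3 earned runs over 11 outs
print(round(era(3, "3.2"), 2))  # 7.36
```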

To be honest, and as much as I respect Nate’s observation skills (I have none), I doubt anyone’s ability to be able to tell that a good pitcher will be 7.36 today.

That is: I am willing to make this bet with anyone, at any time:

You choose any series of plate appearances that hasn’t happened yet—after observing the pitcher all you want.  You pay me $7.36 for every 27 outs they wind up pitching, and I will pay you $1.00 for every earned run they give up (or, preferably, each run of the RC27 of the batting line against).  You can choose one PA here, two PAs there, three PAs tomorrow ... as long as you identify them in advance.

I would make the stipulation that (a) the pitchers have to be decent (no pitchers who are actually expected to be 7.36 over *all* games, but 5.00 is OK), and (b) no cherry-picking Barry Bonds AB (so I have the right to normalize the batting line to the composite batters’ skill).
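The terms of that bet can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: it settles on raw earned runs and ignores stipulation (b), the normalization for batter quality. The function name and the sample numbers are hypothetical.

```python
def settle_bet(outs: int, earned_runs: float, price_per_27_outs: float = 7.36) -> float:
    """Net dollars to the observer (positive means the observer comes out ahead).

    The observer pays price_per_27_outs for every 27 outs recorded over the
    plate appearances picked in advance, and collects $1.00 per earned run.
    """
    paid = price_per_27_outs * outs / 27
    collected = 1.00 * earned_runs
    return collected - paid


# Hypothetical sample: 54 outs flagged in advance, 9 earned runs allowed (a 4.50 ERA)
print(round(settle_bet(54, 9), 2))  # -5.72
```

The observer only breaks even if the flagged stretches really do add up to a 7.36 ERA; anything lower is money out of pocket.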

Nate, Tango, would you accept that bet?  How low would I have to offer on the 7.36 before you would accept it?


#10    tangotiger      (see all posts) 2007/04/05 (Thu) @ 23:17

Phil makes a good point.  After that 1st inning, Matsuzaka had 1 H, 1 BB, 0 R, 0 K, 15 pitches.  That would mean Nate forecast 3.2 IP, 3 H, 3 R, 3 BB, 3 K, 83 pitches (which works out to about 17 batters, or nearly 5 pitches per batter).  That H/K/BB line is more representative of 2 R in 3 2/3, not 3 R.  Let’s give Nate the benefit here, as he’s obviously doing a quick blog. 

As well, if you give him 6 pitches for each BB and K, at most, that leaves him with 47 pitches to the other 11 batters.  Seems reasonable for a struggling pitcher, if a bit excessive.
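The arithmetic above can be reproduced with a short Python sketch. It takes Nate's predicted line and the first-inning line as given in this thread, and assumes batters faced is simply outs + hits + walks (no double plays, errors, hit batsmen, or outs on the bases).

```python
# Nate's full-game prediction (4.2 IP = 14 outs) and the actual first inning
predicted = {"outs": 14, "H": 4, "R": 3, "BB": 4, "K": 3, "NP": 98}
first_inning = {"outs": 3, "H": 1, "R": 0, "BB": 1, "K": 0, "NP": 15}

# What the prediction implies for the rest of the outing
rest = {k: predicted[k] - first_inning[k] for k in predicted}

# Assumption: every batter either made an out, got a hit, or walked
batters = rest["outs"] + rest["H"] + rest["BB"]
pitches_per_batter = rest["NP"] / batters

# Allow up to 6 pitches for each walk and strikeout, see what is left for the others
long_pa = rest["BB"] + rest["K"]
pitches_left = rest["NP"] - 6 * long_pa
other_batters = batters - long_pa

print(rest)
print(batters, round(pitches_per_batter, 2))   # 17 4.88
print(pitches_left, other_batters)             # 47 11
```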

In any case, I can believe that Nate intended to show that Matsuzaka would be pitching at slightly worse than average (not 7.36).

I think it’s a fair thing for Phil to ask: choosing among Santana, Halladay, Oswalt, Matsuzaka, King Felix, how comfortable would Gleeman, baseballchick, Craig Burley, Eric Van, and Dave Cameron be able to say: “Hey!  For the next 9 batters, this guy will put up below league average numbers.”

Give me a total of 4 such games for each pitcher (180 batters among the 5), and I’ll bet you see his seasonal average.

(1 SD in OBP would be .034, so I don’t think we can learn much.  We need more star pitchers.)
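The .034 figure looks like the ordinary binomial standard deviation. A minimal sketch, assuming each plate appearance is an independent trial and assuming a true OBP against of about .300 for this group of pitchers (that .300 is my assumption, not a number from the thread):

```python
import math


def obp_sd(obp: float, batters: int) -> float:
    """Binomial standard deviation of observed OBP over a number of batters."""
    return math.sqrt(obp * (1 - obp) / batters)


batters = 9 * 4 * 5  # 9 batters x 4 games x 5 pitchers
print(batters, round(obp_sd(0.300, batters), 3))  # 180 0.034
```

With one standard deviation that wide, a sample of 180 batters can barely separate "slightly worse than usual" from plain noise.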


#11    Nate Silver      (see all posts) 2007/04/05 (Thu) @ 23:17

Phil,

By “the line”, I mean a projected Game Score that a PECOTA/Marcel like system would generate after the first inning.  Marcel says, “based on that performance in the first inning, the over/under for Matsuzaka’s game score is 53.5”. 
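To make the over/under idea concrete, here is a small Python sketch using Bill James's original Game Score formula, applied to the line predicted in the post (4.2 IP, 4 H, 3 R, 4 BB, 3 K). It assumes all three runs are earned, and the 53.5 line is just the illustrative number above, not real Marcel output.

```python
def game_score(outs, hits, earned_runs, unearned_runs, walks, strikeouts):
    """Bill James Game Score: start at 50, add for outs, late innings and
    strikeouts, subtract for hits, runs and walks."""
    innings_after_fourth = max(0, outs // 3 - 4)
    return (50 + outs + 2 * innings_after_fourth + strikeouts
            - 2 * hits - 4 * earned_runs - 2 * unearned_runs - walks)


# Predicted line: 14 outs, 4 H, 3 R (assumed earned), 4 BB, 3 K
predicted = game_score(outs=14, hits=4, earned_runs=3,
                       unearned_runs=0, walks=4, strikeouts=3)
line = 53.5
print(predicted, "over" if predicted > line else "under")  # 43 under
```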

I would not take your bet.  I’m not a scout, and I don’t think my edge is that large. 

Tango,

Pitchers are rarely 100% healthy, and there is definitely variance in a pitcher’s mechanics from game to game (and inning to inning).  I can’t pick it up unless it’s fairly extreme, but people like Will Carroll can.


#12    Dave Cameron      (see all posts) 2007/04/06 (Fri) @ 00:02

If you guys want to study this a little deeper, I have a suggestion: USSM runs a game thread for every game, all year long.  Especially when Felix starts, there are a ton of subjective comments about how the pitcher looks, how the breaking balls are moving, and what the velocity is like from the first pitch.  It would take a bit of digging, but there’s a pretty large sample of subjective opinions about how a pitcher looks early in the game available on the blog.

For instance, here’s the thread from opening day (click my name).

Included are comments such as:

Great pitch selection by Felix in the first. Totally unpredictable beyond the obvious first pitch fastball to Kendall.

I may be overreacting, but I can’t see how anyone is going to be able to get a hit off of Felix this year.

That was pretty awesome to watch. Sick stuff.

Sick change. This pitch selection is making me giddy.

Stuff wise, Felix has never looked better. Even back in ‘05, he wasn’t THIS good. This is something else.

It took USSM readers about 2 innings to observe that Felix was pitching at a level far above anything he did last year.  He then proceeded to finish off the best start of his career.  Did we really see something, or were we just overexcited fans? I think probably a mix of both.

Since I did the Charting Felix series last year, I got pretty well vested in watching Felix’s approach and tried to pick up on whether Good Felix or Bad Felix had come to the park that day.  I spent hundreds of hours watching him throw, watching a lot of games multiple times and recording type/location/velocity of each pitch in each circumstance. 

And in the end, I’d probably lean towards Tango’s side here.  I think there are things we can glean from subjective observation, but a pitcher can change something so quickly that, outside of noticing a guy being hurt, I’m not sure we can add much towards the predictability of a small handful of at-bats that are about to occur.


#13          (see all posts) 2007/04/06 (Fri) @ 01:02

I really like the idea of watching the pitcher, instead of just the outcome, to see if he’s pitching a good game or not.  As I said, though, I am completely incapable of doing this myself.  The opinions of people like Nate and Dave Cameron and the USS Mariner threads, I think, are very important.

But are we sure they’re predictive?  Suppose I have no doubt that posters are correct, that Felix’s pitch selection was indeed unpredictable and his stuff was indeed sick.  Does that necessarily mean it’s going to continue?

In any sport, players will do some things that make them look like superstars, and they’ll do some things that make them look like minor-leaguers.  Can we really evaluate what we do earlier in the game and assume that it will continue?  If we do, without evidence, aren’t we committing the hot hand fallacy?

I like Dave’s suggestion of looking back at the log, evaluating the early-game comments, and seeing if there’s a correlation to the rest of the game.

Ideally, I’d love to see one of these guys place bets at TradeSports, during the game, based on the comments.  I’m really, really, really curious to see if they could make a profit.


#14    MGL      (see all posts) 2007/04/06 (Fri) @ 03:24

This is a point that I have been making for many years:

While I have little doubt that scouts, managers and coaches (and the players themselves) can see and identify things that the average person cannot, the problem is that they are so superstitious and so irrationally influenced by random events, and so incapable of understanding the nuances of random fluctuation, that their actual observational talent gets almost completely cancelled out by their irrationality (probably a bad word, but I think you know what I mean).

I don’t know that I explained that very well, so let me give an example of what I am talking about.  Let’s say that a pitcher has his normal stuff, mechanics, velocity, health, etc.  And let’s say that he happens to get hammered in his first three innings.  This is a hypothetical situation that by definition must happen all the time (maybe 5% of a pitcher’s games).  How many managers do you think are going to think/say/react, “Well as far as I can see, there is nothing wrong with my pitcher.  I have no concerns.  I will leave him in there just as if he were pitching a shutout.” And vice versa.  There have to be times when a pitcher does not have his best whatever and by sheer luck he is pitching a great game.  How many managers are going to take him out because they see that he does not have his best whatever even though he is pitching a great game?  Managers and coaches are so irrational and dare I say it, stupid, that any talent they might have in terms of these things we are talking about is 80% (number is a WAG) wasted IMO. The knowledge and talent that these scouts, managers, and coaches may have does not do much good unless it is accompanied by some understanding of small samples, random fluctuation, non-results-oriented decision-making, etc.

For example, some of us admit that managers and coaches probably know which batters are better or worse against which types of pitchers.  If that is true (and I don’t know whether it is or isn’t), what good does that do when almost every manager in baseball is going to love a batter who is 12 for 31 versus a given pitcher and hate one that is 0 for 13?
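To put rough numbers on how often lines like those show up by luck alone, here is a minimal Python sketch. It assumes a hitter whose true average against the pitcher is .270 (an assumed figure, not one from this thread) and treats each at-bat as an independent trial.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial chance of k or more hits in n at-bats for a true-p hitter."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def prob_hitless(n: int, p: float) -> float:
    """Chance of going 0 for n."""
    return (1 - p) ** n


TRUE_AVG = 0.270  # assumption for illustration only

print(round(prob_hitless(13, TRUE_AVG), 4))       # chance of an 0 for 13
print(round(prob_at_least(12, 31, TRUE_AVG), 4))  # chance of 12 or more hits in 31 AB
```

Under these assumptions the 0 for 13 comes up a bit under 2% of the time for any one batter/pitcher pair, which across a whole league of matchups means it is happening somewhere all the time.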


#15    Guy      (see all posts) 2007/04/06 (Fri) @ 08:18

"Can we really evaluate what we do earlier in the game and assume that it will continue?  If we do, without evidence, aren’t we committing the hot hand fallacy?”

Phil:  I thought one of the most important findings in The Book was that pitchers’ performance in the first part of a game IS somewhat predictive of what follows.  So the “hot hand” is not a myth in this context.  That said, 1 or even 2 IP likely isn’t enough of a sample to allow for predictions.


#16    tangotiger      (see all posts) 2007/04/06 (Fri) @ 08:20

The last two posts epitomize the point: put your money where your mouth is.  If some observational scouting stud thinks he’s so smart, he should be making a killing on TradeSports.  The airwaves and “internets tubes” are replete with forecasters who have debts like the rest of us.


#17    tangotiger      (see all posts) 2007/04/06 (Fri) @ 08:24

I meant MGL and Phil’s posts.

And yes, to Guy’s point, there is some hothandedness in The Book.  A pitcher who’s gone through the order the third time and mowed them down is given a prior of “likely not as tired as he normally would be”.  But, with random variation what it is, you’d have to place hundreds of bets like that to be assured of any money.
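The "hundreds of bets" point can be illustrated with a quick binomial calculation. This sketch assumes even-money, independent bets and a 55% hit rate (the expert-level figure floated earlier in the thread); it ignores the vig, so real results would be worse.

```python
from math import comb


def prob_profit(n_bets: int, win_rate: float) -> float:
    """Chance of winning more than half of n even-money bets."""
    need = n_bets // 2 + 1
    return sum(comb(n_bets, k) * win_rate**k * (1 - win_rate)**(n_bets - k)
               for k in range(need, n_bets + 1))


# How sure is a 55% bettor of being ahead after n bets?
for n in (10, 100, 1000):
    print(n, round(prob_profit(n, 0.55), 3))
```

After only a handful of bets, even a bettor with a real edge is close to a coin flip to be ahead; it takes a long run for the edge to show.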


#18          (see all posts) 2007/04/06 (Fri) @ 11:28

Right, there is some evidence of hot hand effect based on hitting stats.  But is there a hot hand effect based on *observations of actual pitches*?

On the one hand, you’d think it would be stronger, since there’s some luck in between how good the pitch is and what the batter is able to do with it.  On the other hand, maybe how good a pitch *looks* is less repeatable than how good a pitch *is* in practice.

I’d bet on the first hypothesis, that the hot hand should be stronger.  But would it be strong enough to bet on and make money?


#19    tangotiger      (see all posts) 2007/04/06 (Fri) @ 14:18

I agree with Phil with the “on the one hand, on the other hand...” We just don’t know.

I’m reminded of Will’s article on ESPN about Mark Prior:
http://proxy.espn.go.com/mlb/columns/story?id=1623962

For whatever reason, the best pitchers of all time—guys like Steve Carlton, Tom Seaver, Christy Mathewson—had considerably quieter starts to their careers, posting promising but not dominating numbers, then ratcheting it up to elite status a couple of years later.

But Prior might be a special case, not because of his numbers, but because of his mechanics.

What does a biomechanist see when Prior takes the mound? There are five major principles of proper delivery that can be summarized as balance, posture, anatomical position, rotation, and release. Prior is textbook with all five.

A proper delivery, biomechanically, is focused on driving the ball linearly from cocked position to catcher’s mitt, ideally missing the bat in its travel. Balance seems too simple to be important, but watch any game and you are likely to see a pitcher falling off to either side in his delivery. Prior? Direct, linear and compact. Prior is equally ideal with his posture, keeping his 6-foot-5 frame erect through delivery and using both leg drive and gravity to impart force on the ball as he releases it. His elbows stay level, keeping stress off the rotator cuff.

The deceptively simple combination of fastball, curveball, good command, and good mechanics is enough to make Prior one of the five best pitchers in baseball. He isn’t a demonstrative guy on the mound, and there are times when he’s cruising along so smoothly that he seems to be on autopilot.

But take a look at a list of pitchers that includes names like Gooden, Blue, Ruth and Valenzuela, and it reminds us that one thing is for certain: Mark Prior is human.

Pascual Perez, Oil Can Boyd, and Danny Jackson might have longer, more successful careers than Mark Prior. 

I like that Nate contributed his piece in Will’s article, as a stark reminder of how much random variation plays a role in it all.


#20    MGL      (see all posts) 2007/04/06 (Fri) @ 22:24

I am a firm skeptic when it comes to these sabermetrician cum mechanical experts/scouts like Carroll, Sheehan, Law and others…


#21    MGL      (see all posts) 2007/04/06 (Fri) @ 22:30

There are basically three kinds of things that exist in this world:  One, that which we know is true and we don’t need evidence to convince us (of course, it depends on who “we” is).  For example, do we need any statistical evidence to support the theory that the best Little League ballplayers would not do well in the majors?  Two, that which we don’t need any evidence to convince us it is NOT true.  These are essentially the same things.  Finally, there are those things which might seem obvious to one person or another (or lots of people, sometimes even an overwhelming majority of people), but which “we” do not know is true (we might suspect one way or another) or not true unless we examine and analyze the evidence, and even then, we may not be sure for various reasons.

Obviously the line between 1/2 and 3 is not clear or black and white but we often (probably usually) know whether something is firmly in one camp or the other.


#22    Tangotiger      (see all posts) 2007/04/07 (Sat) @ 12:36

TradeSports sent me their trading activity! 

The first trade, at 12:22 BST (11:22 Greenwich Mean Time, 7:22 Redsox time), was for 60.50, meaning that’s the win probability of the Redsox.  This fluctuated until the last trade, at 18:03 BST (13:03 Redsox time), at 62.50.

Unfortunately, that’s all the data they’ve got.  It seems that there was no trading action after that point.
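For anyone double-checking the time conversions and the price-to-probability reading, here is a minimal Python sketch. It uses fixed offsets (BST = UTC+1, US Eastern Daylight Time = UTC-4, both in effect in April 2007), and the calendar date in the code is only a placeholder since just the clock times matter. It also assumes the price is quoted on a 0 to 100 scale.

```python
from datetime import datetime, timedelta, timezone

BST = timezone(timedelta(hours=1), "BST")    # British Summer Time
EDT = timezone(timedelta(hours=-4), "EDT")   # "Redsox time" in April


def convert(hhmm: str, target: timezone) -> str:
    """Convert a BST clock time to the target zone (date is a placeholder)."""
    t = datetime.strptime(hhmm, "%H:%M").replace(year=2007, month=4, day=5, tzinfo=BST)
    return t.astimezone(target).strftime("%H:%M")


def implied_win_prob(price: float) -> float:
    """Assumes a 0-100 contract price maps directly to a win probability."""
    return price / 100.0


print(convert("12:22", timezone.utc), convert("12:22", EDT))  # 11:22 07:22
print(convert("18:03", EDT))                                  # 13:03
print(implied_win_prob(60.50), implied_win_prob(62.50))       # 0.605 0.625
```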


#23    John Beamer      (see all posts) 2007/04/07 (Sat) @ 12:42

Yeah, I’m looking at quite a bit of their trading data. There is normally only one game a night with sufficient liquidity; last night it was Bal vs NYY ... not sure why ... could be the US online betting laws I guess


#24          (see all posts) 2007/04/08 (Sun) @ 21:04

Too bad there’s not enough liquidity.  I guess people are afraid to put down a bid or an ask, since the odds could change instantly against them before they have a chance to revoke their offer ... especially with a six-second satellite delay.


#25          (see all posts) 2007/04/08 (Sun) @ 21:08

I’m willing to put up some prize money in the name of science ... how about this?

If you’re watching a game and you think pitcher X isn’t throwing well today, e-mail me (or post a prediction here) that he’ll struggle in the next inning (or two, or three, or whatever you choose).  If you e-mail at least 50 innings over the season—that’s two innings a week—and the (combined) subsequent ERAs are at least 2.00 above the pitchers’ season averages, I’ll send you $20.  If you don’t win, you don’t have to pay me anything.

Of course, no cheating by past-posting ... do it early in the inning that the pitcher’s team is at bat, to avoid any doubt.  And you have to keep track and keep me posted (I’ll verify everything later if you tell me you won).
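Here is one way the scoring for that challenge could be sketched in Python. The rules above leave some details open, so this is a guess at a reasonable reading: "combined" ERA is total earned runs over total outs, and the comparison baseline is the season ERAs weighted by outs in each flagged stretch. The function name and tuple layout are hypothetical.

```python
def wins_prize(picks, min_innings=50.0, margin=2.00):
    """picks: list of (outs, earned_runs, season_era), one per flagged stretch.

    Returns True if at least min_innings were flagged and the combined ERA
    beats the outs-weighted season ERA by at least margin runs.
    """
    total_outs = sum(outs for outs, _, _ in picks)
    if total_outs < min_innings * 3:
        return False
    total_runs = sum(runs for _, runs, _ in picks)
    combined_era = 27 * total_runs / total_outs
    # Assumption: weight each pitcher's season ERA by the outs flagged for him
    baseline = sum(outs * season for outs, _, season in picks) / total_outs
    return combined_era - baseline >= margin


# Hypothetical entries: 50 flagged innings against 4.00-ERA pitchers
print(wins_prize([(150, 40, 4.00)]))  # 7.20 ERA in the flagged innings -> True
print(wins_prize([(150, 25, 4.00)]))  # 4.50 ERA in the flagged innings -> False
```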

Anyone seriously interested?


#26    tangotiger      (see all posts) 2007/04/24 (Tue) @ 16:29

More soothsaying:
http://www.dallasnews.com/sharedcontent/dws/spt/hockey/stars/stories/042307dnspotaylor.2cb3964.html

All that’s left is for Dallas to make history. They’ll do it tonight. Mark it down. It’s going to happen.  The Stars will beat Vancouver and become the first team in franchise history to overcome a 3-1 series deficit.

Stars lost. 

Anyone who is this certain about an outcome (that the general public was only 55/45 certain) ought to bet their entire life savings on it.

This is yet another loud voice in a long line of big mouths who would have taken an enormous amount of credit if he won, and is ready with excuses now that he lost.

Please, if someone wants to make a forecast, put your money where your mouth is first.  Or just tell us how right you think you will be, like this other great TV announcer did:
http://www.tv.com/the-simpsons/lisa-the-greek/episode/1334/trivia.html

“Smooth” Jimmy Apollo: Well, when you’re right 52% of the time, you’re wrong 48% of the time.

Homer: (yelling at the television set) Why didn’t you say that before?!

