THE BOOK--Playing The Percentages In Baseball


Wednesday, November 05, 2008

Nate Silver

By Tangotiger, 11:21 AM

Non-sports post.


The big story may have been Obama, but among our readership, Nate Silver takes a close second.  He forecast 348.6 electoral votes for Obama.  The current (and incomplete) official count is 338.  Three states have yet to be called.  As of right now, 26 of the remaining 37 electoral votes would go to Obama, giving him 364.  If NC breaks the other way, however, the final count will be 349, and Nate can take great pride in that.  Nate also called for a 6% margin of victory, exactly what happened.

Did any other forecaster make a call on both electoral votes and margin of victory?

***

On his blog, he’s also alluding to big changes forthcoming (presumably Nate will be working with the media in some capacity?).  BPro also happens to have an ownership stake in 538.com.

Blogging
#1          (see all posts) 2008/11/05 (Wed) @ 11:40

I missed the Marcel electoral vote prediction—how did that turn out?


#2    Tangotiger      (see all posts) 2008/11/05 (Wed) @ 11:54

Craig: ?


#3          (see all posts) 2008/11/05 (Wed) @ 11:57

Rasmussen predicted 52-46 as well.

My only question about 538 is why he does simulations. I feel he should be able to estimate the actual probabilities fairly efficiently. It’s not like the network is very large.


#4          (see all posts) 2008/11/05 (Wed) @ 13:05

Re 2:  Feeble attempt at humor.


#5    Matt Lentzner      (see all posts) 2008/11/05 (Wed) @ 13:28

I thought it was funny. :)


#6    Adam B.      (see all posts) 2008/11/05 (Wed) @ 13:44

CNN, Nate and NBC have called Indiana for Obama. Obama might exceed even Nate’s projection.


#7    cannatar      (see all posts) 2008/11/05 (Wed) @ 14:02

Actually, if you were to design a very simple Marcel, it would have done almost as well. If you just averaged the 10 most recent polls for each state and gave the state 100% to whoever was ahead in those 10 polls, you would’ve projected 353 electoral votes for Obama (all the states he’s won other than Indiana, plus NC).
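
A minimal sketch of that "very simple Marcel", assuming each state's polls are stored as a list of Obama-minus-McCain margins with the most recent last. The function name, the data layout, and the sample numbers in the usage note are hypothetical and are not taken from any of the sites discussed here.

```python
def project_electoral_votes(polls_by_state, electoral_votes, window=10):
    """Average the most recent `window` poll margins in each state and
    award all of that state's electoral votes to whoever leads the average.

    polls_by_state: dict of state -> list of margins (positive = candidate leads),
                    ordered oldest to newest (assumed layout).
    electoral_votes: dict of state -> number of electoral votes.
    Returns the candidate's projected electoral-vote total.
    """
    total = 0
    for state, margins in polls_by_state.items():
        recent = margins[-window:]          # keep only the latest polls
        if not recent:
            continue                        # no polls, no projection for this state
        average = sum(recent) / len(recent)
        if average > 0:                     # winner-take-all, no probabilities
            total += electoral_votes[state]
    return total
```

Usage, with made-up numbers: `project_electoral_votes({"A": [1, 2, 3], "B": [-4, -1]}, {"A": 10, "B": 20})` awards only state A's 10 votes. The rule ignores sample sizes, pollster quality, and poll dates beyond recency, which is the point of the comparison.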


#8    Andy L.      (see all posts) 2008/11/05 (Wed) @ 14:23

Cannatar, the analysis at http://election.princeton.edu would be similar to this very simple Marcel that you describe.


#9    Tangotiger      (see all posts) 2008/11/05 (Wed) @ 14:28

Craig: no worries.  Lame attempts at humor are welcomed. 

***

cann: interesting.  Basically, what would Nate have forecasted if he took the most basic approach possible?  That is, do exactly like you said?  Is that right?  You get 353? 

This is similar to UZR and ZR.  Or Runs Created and Runs Participated In (R+RBI-HR).  Or ERA+ and W/L.  At some point, all the massaging that we do becomes almost unnecessary, because all the breaks even out over a long-enough haul.  Other than, say, Yaz at Fenway, or Todd Helton at Coors, or Whitey Ford with the Yanks, systematic biases simply don’t last long enough to matter.

So, while Nate deserves kudos for all his manipulations, when all is said and done, based on cannatar’s post, maybe it simply doesn’t matter at all.


#10    Tangotiger      (see all posts) 2008/11/05 (Wed) @ 14:35

Andy: great link, thanks!

The Wisdom of Crowd… of Crowds:
http://election.princeton.edu/2008/11/04/meta-meta-analysis-or-the-wisdom-of-punchy-markets/

That pretty much says it all, doesn’t it?


#11    Anthony      (see all posts) 2008/11/05 (Wed) @ 15:07

I grabbed the state results from USA Today, so there are probably some errors in the manual input (plus some states still counting), but 538.com was correct on Obama’s % within 1 point in 21 states, within 2 points in 36 states, and within 3 points in 43 states. The average absolute error was 2.0 points for Obama, 1.8 for McCain and 0.5 for the third-party tally.

His worst performances were in D.C. (Obama beat the projection by 13.5 points) and Hawaii (Obama beat the projection by 9.4 points). He was within 6 points in every other state. It’s worth noting that he had just the last-minute YouGov poll in D.C. & the YouGov & a six-week-old Rasmussen poll in Hawaii.

I wonder how other pollsters did compared to that.


#12    Jeff      (see all posts) 2008/11/05 (Wed) @ 16:41

A good source that averaged polls in their analysis:

http://www.electoral-vote.com/

Click on the state and see what the various polls said.


#13    Tangotiger      (see all posts) 2008/11/05 (Wed) @ 17:01

Good site.  That led to The Onion:

http://www.theonion.com/content/news_briefs/black_man_given_nations

Black Man Given Nation’s Worst Job

WASHINGTON—African-American man Barack Obama, 47, was given the least-desirable job in the entire country Tuesday when he was elected president of the United States of America.

In his new high-stress, low-reward position, Obama will be charged with such tasks as completely overhauling the nation’s broken-down economy, repairing the crumbling infrastructure, and generally having to please more than 300 million Americans and cater to their every whim on a daily basis. As part of his duties, the black man will have to spend four to eight years cleaning up the messes other people left behind.

The job comes with such intense scrutiny and so certain a guarantee of failure that only one other person even bothered applying for it. Said scholar and activist Mark L. Denton, “It just goes to show you that, in this country, a black man still can’t catch a break.”


#14          (see all posts) 2008/11/05 (Wed) @ 19:57

There were a bunch of sites this year that did poll aggregation - the princeton and electoral-vote sites that others already linked, plus realclearpolitics, pollster.com, and a few others. There were a couple things that Nate did that were different than the others.

One was to adjust his state projections based not only on polls of just that state, but based on trends in national polls or of other states with similar demographics. This was a neat idea, and was particularly useful in the primaries, when there weren’t as many polls, voters knew less about the candidates, and the different states voted at different times, so he could make predictions about what voters in one state would do based on what voters in similar states had already done. In the general election, the swing states were polled nearly every day, and this adjustment probably made a negligible difference, at best allowing his overall prediction to reflect trends a day or two earlier than others.

The other difference was that while the other sites were nominally predicting “what would happen if the election were held today,” 538 was nominally predicting probabilities for what would happen on election day, based on the overall volatility of past elections. IMO, this part was sort of questionable. Keeping in mind a general sense of how races tend to evolve is certainly useful, but making firm predictions based on assumptions about volatility can be perilous (cf. various financial crises). And, of course, by the time we get to election day, this adjustment tends to zero as well.

In the end, all the sites predicted between 338 and 364 electoral votes for Obama, with Missouri and North Carolina just as borderline in the polls as they ended up being in the actual results. Indiana was also close, but seemed to be leaning toward McCain.

So while Nate took a slightly different path than the others, the result was not much different, just as with PECOTA vs. more traditional regression forecasting systems. He did do his usual fine job of writing and data presentation.


#15    Centris      (see all posts) 2008/11/05 (Wed) @ 19:57

Re #3:  It would be easy enough to run through the 51^2 possibilities and find the probability for each (I guess there would be a couple more because of Nebraska and Maine) if the model assumed the probability of winning each state were independent.

But I think in the simulations they are not.  So if the probability that Obama wins N Dakota is p1 and the probability he wins S Dakota is p2, then the probability he wins N Dakota and S Dakota is greater than p1*p2. 

Also he accounts for the chance that the polls are systematically over or under predicting Obama. 

Because of this I think it is nontrivial to come up with an exact win probability for the model.
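
A rough illustration of that dependence, not 538's actual model: if every state shares one systematic polling error on top of its own state-level noise, the states move together and the joint win probability exceeds p1*p2. The normal error model, the function name, and all parameter values below are assumptions chosen for the sketch.

```python
import random

def simulate_joint_wins(margins, national_sd, state_sd, trials=20000, seed=0):
    """Monte Carlo sketch of correlated state outcomes.

    margins: polled leads in each state (positive = candidate ahead).
    national_sd: std. dev. of a polling error shared by every state (assumed normal).
    state_sd: std. dev. of each state's independent error (assumed normal).
    Returns (per-state win frequencies, frequency of winning all states).
    """
    rng = random.Random(seed)
    wins = [0] * len(margins)
    swept = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, national_sd)    # same draw applied to all states
        outcome = [m + shared + rng.gauss(0.0, state_sd) > 0 for m in margins]
        for i, won in enumerate(outcome):
            wins[i] += won
        swept += all(outcome)
    return [w / trials for w in wins], swept / trials
```

With two toss-up states, `simulate_joint_wins([0.0, 0.0], 3.0, 1.0)` gives each state a win frequency near one half, but the frequency of winning both comes out well above one quarter, because the shared error pushes both states the same way.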

I could be wrong but that is my best understanding of it.


#16          (see all posts) 2008/11/05 (Wed) @ 22:36

Re #15: Sure, but you just create a Bayesian network modeling whatever relationships you want (including dependence, any sort of bias in the polls, etc.), then you use standard methods to either solve for the exact probability or approximate it (depending on how large the network is it may be infeasible to find an exact solution).

Incidentally, Intrade called 364 electoral votes for Obama, which is probably the exact amount he’ll have. I guess we don’t need models after all?


#17    anon      (see all posts) 2008/11/05 (Wed) @ 23:23

Re #15: Ignoring NE and ME there are 2^51 possible ways the states and DC could break D/R, which is too big to run through.


#18    Centris      (see all posts) 2008/11/06 (Thu) @ 00:28

re 16: yeah, I am not familiar with those methods.  It sounds interesting; I should check that out.

re 17: Thanks, I got that backwards.  I don’t have any intuition for what types of tasks are too big computationally for today’s PCs.  Would 2^51 possible outcomes be too many for a PC to compute in a reasonable amount of time?

If so I think you could get around it by using a generating function.  Let p_i be the probability that Obama wins in the ith state and n_i that state’s number of electoral votes.  Let f(x) be the polynomial [(1-p_1)+p_1*x^n_1]*[(1-p_2)+p_2*x^n_2]*...*[(1-p_51)+p_51*x^n_51].  Multiply this polynomial out and rewrite in the form f(x)=c_0+c_1*x+c_2*x^2+c_3*x^3+...+c_538*x^538.

Then the probability that Obama wins is c_270+c_271+...+c_538 (assuming he wins in the case of an electoral college tie). 

Multiplying out a polynomial of that size is possible using a computer and that gives you the exact probability of a win assuming the states are independent.
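
The multiplication can be sketched in a few lines, assuming independent states as stated above. The function names and the three-state example in the usage note are hypothetical; the coefficient list plays the role of c_0, c_1, ..., and the winning threshold is left as a parameter.

```python
def electoral_vote_distribution(win_probs, votes):
    """Expand the product of [(1 - p_i) + p_i * x^n_i] one state at a time.

    win_probs: p_i, probability of winning state i (states assumed independent).
    votes: n_i, electoral votes of state i.
    Returns coeffs, where coeffs[k] is the probability of exactly k electoral votes.
    """
    total = sum(votes)
    coeffs = [1.0] + [0.0] * total              # the constant polynomial 1
    for p, n in zip(win_probs, votes):
        updated = [0.0] * (total + 1)
        for k, c in enumerate(coeffs):
            if c == 0.0:
                continue
            updated[k] += c * (1.0 - p)         # state lost: exponent unchanged
            updated[k + n] += c * p             # state won: exponent shifts by n
        coeffs = updated
    return coeffs

def win_probability(coeffs, threshold):
    """Sum the coefficients c_threshold, c_threshold+1, ... up to the top."""
    return sum(coeffs[threshold:])
```

For example, three coin-flip states worth 1, 2, and 3 votes give `win_probability(electoral_vote_distribution([0.5, 0.5, 0.5], [1, 2, 3]), 4)` equal to 0.375. Each state costs at most one pass over 539 coefficients for the real map, so the work is on the order of 51 * 539 multiplications rather than 2^51 cases.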


#19          (see all posts) 2008/11/06 (Thu) @ 18:06

As I noted, there are more intelligent techniques than finding the probabilities for all 2^51 combinations.


#20    JD      (see all posts) 2008/11/07 (Fri) @ 03:12

I’m really disappointed that I unknowingly missed a chance to meet Nate tonight. He was at an event on the DePaul University campus, about four blocks from where I was having class. I would’ve absolutely skipped out on part of class to meet him if I had known ahead of time.


#21    Matt Mitchell      (see all posts) 2009/02/20 (Fri) @ 17:59

I didn’t know which thread would be best to put this in, so I picked this one. It looks as if Nate’s methods have spilled into predicting the Oscars…

http://rogerebert.suntimes.com/apps/pbcs.dll/article?AID=/20090219/OSCARS/902209995

