THE BOOK--Playing The Percentages In Baseball

Monday, June 08, 2009

David Gassko versus THT Forecasting system

By , 02:11 AM

In an article written before the season started, David presented 29 hitters and pitchers who he thought would over or under-perform their THT projections.  This is interesting since I think that David plays a big part in these forecasts.  Like most forecasting systems, I think that a computer essentially spits out each player’s projection and I don’t think that a human being, including David, tweaks them before they are released. I could be wrong, and David can correct me if I am.

Anyway, I either told David or I made a mental note that I was going to check his “intuition” at season’s end.  I am always skeptical of these, “I can beat a good forecast system just by looking at the forecasts.” They are similar to the “players who I think will breakout or fizzle” articles, although it depends on who is authoring these articles or proclamations.  David is a very smart guy and a very good sabermetrician - he should be working for a team someday, if he wants to of course.  He has already done some work for some teams.

On the other hand, maybe it is not so hard for a good analyst or even a good scout or insider to take a list of projections and pick out the ones that are not very good.  Being able to do that is very different from coming up with good projections of your own.  In other words, a scout might easily be able to accurately identify 10 or 20 bad projections but he would likely get killed by a good projection system if he had to project every player.  I wish we had done some kind of survey where we had analysts look at a good projection system like THT, Pecota, Chone, Oliver, ZIPS, etc., and try to identify their top 10 or 20 that they think were “wrong.” Because most projection systems are mechanical, that may not be all that hard to do.  Of course, I am saying that after I have already checked on David’s picks.  It would also probably depend on whether the projection system just spits out the numbers from a computer, as I think THT (and most of the others) does, or whether someone already goes through and tweaks them before they are released.  For example, if David had tweaked the THT projections, presumably he would not have any players to choose as being over or under-rated.

Anyway, here is the good news and the bad news for David:

The good news is that out of the 14 hitters (I did not include Alex Gordon as he only has a few PA) he had on his list, he was correct on 10 of them and wrong on 4 of them.  For each player, he said whether they would likely over or under-perform their THT projection.

For pitchers, he went 8-3.

So overall, he was 18-7, which is quite impressive.

Or is it?
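As a rough baseline only, here is a minimal sketch of how likely an 18-7 record would be if each over/under call were a coin flip. The 50% success rate and the independence of the 25 calls are assumptions made for illustration; the sketch says nothing about whether the picks were easy to identify, which is the question taken up below.

```python
from math import comb


def tail_probability(wins: int, total: int, p: float = 0.5) -> float:
    """Return P(X >= wins) for X ~ Binomial(total, p)."""
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(wins, total + 1)
    )


# 18 correct calls out of 25, as tallied in the post.
record_wins, record_total = 18, 25

# Assumption: each call is an independent 50/50 guess.
p_value = tail_probability(record_wins, record_total)
print(f"P(at least {record_wins} of {record_total} by coin flip) = {p_value:.4f}")
```

Under those assumptions the chance comes out at a little over 2%, so the record is unlikely to be pure guessing. That still leaves open whether it reflects insight or easy targets.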

He says in the article that he didn’t look at any other projection system before he chose these players.  I have no reason not to believe him.

On the other hand, you have to wonder whether David is so smart or perhaps the THT system is so bad, or at least obviously bad on some of the players, that it was easy for David to come up with 25 players that they were likely to get wrong. Keep in mind that I think that the THT projections are very good, and they have probably fared well in the various “projection evaluations.” But, as I said, for various reasons, it is probably fairly easy and commonplace for even a good projection system to get some players obviously wrong, if that projection system is “automated.” If the projection system includes tweaks by knowledgeable human beings, then by definition, it is not so easy to get any players obviously wrong (if it is obvious, they would be corrected by those human beings, right?).  Again, I think that THT is an “automated” projection system with little if any “human tweaking.”

Anyway, to see if David was really smart or the THT projections on his 25 players were just “bad” I compared the THT projections to my projections for those 25 players.  I have an independent projection system which is basically a “Marcel” with a lot of normalization (park, age, and opponent adjustments).  There is nothing special about my projection system whatsoever.  As I said, it is just a Marcel which is more finely-tuned than a basic Marcel.  In fact, the strength of my projections is my park adjustments, but for most projections you don’t need any park adjustments, unless a player changes teams, which only happens 10-15% of the time or so (just a guess).  In fact, in David’s list, almost none of the players changed teams from 08 to 09, which makes it even easier to project them.

So here is what I did:

If David thought that a player would outperform or under-perform his THT projection and so did my projection, I called that a “cheat” for David.  If his subjective evaluation and my projection disagreed with one another (for example, he thought a player would do better than THT’s projection and I thought he would do worse), then I called that a “non-cheat.” To be clear, I don’t think that David cheated in any way by looking at anyone else’s projection, and he didn’t have access to my projections anyway, as they are not publicly available (for no particular reason, BTW).
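The rule above can be sketched in a few lines. The function name, the player labels, and every number in the example are hypothetical and only illustrate the bookkeeping; treating a tie between the two systems as a “non-cheat” is also an assumption.

```python
def classify_pick(pick: str, tht: float, independent: float,
                  higher_is_better: bool = True) -> str:
    """Label a pick as 'cheat' or 'non-cheat'.

    pick: 'over' or 'under', the analyst's call relative to the THT projection.
    tht / independent: the two projections for the same stat.
    higher_is_better: True for OPS-like stats, False for ERA-like stats.
    """
    if independent == tht:
        return "non-cheat"  # assumption: a tie counts as no agreement
    if higher_is_better:
        independent_says_better = independent > tht
    else:
        independent_says_better = independent < tht
    independent_direction = "over" if independent_says_better else "under"
    return "cheat" if independent_direction == pick else "non-cheat"


# Hypothetical examples (made-up numbers, not anyone's real projections).
examples = [
    ("Hitter A", "over", 0.780, 0.800, True),    # independent OPS is higher
    ("Hitter B", "over", 0.850, 0.830, True),    # independent OPS is lower
    ("Pitcher C", "under", 3.70, 4.00, False),   # independent ERA is worse
]
for name, pick, tht, independent, higher in examples:
    print(name, classify_pick(pick, tht, independent, higher))
```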

So here is the tally:

Hitters

Cheats: 10
Non-cheats: 4

Of the 4 players on which David and I differed, as compared to the THT projections of course, David was right on 3 of them, Howard, Ichiro, and Cano, and I was right on one of them, Miggie Cabrera.

Pitchers

Cheats: 9
Non-cheats: 2

Of the non-cheats, David was right on one of them, Greinke, and I was right on one of them, Volquez.

So while I still think that David is a really smart guy, I am going to conclude that it is probably not that hard to identify bad projections from any system, and based on the fact that of the 25 players that David identified as “bad” projections from THT, I agreed with him on 19 of them, that those were truly bad projections on THT’s part and not necessarily great insight on David’s part.

On the other hand, to be able just to look at THT’s projections and, without looking at any other projection system, come up with 25 or so of the players that THT got “obviously” wrong may be an impressive feat after all.

If someone has some spare time, maybe they can do the same thing with a few other projection systems like Chone or Marcel.  If they mostly agree with David, as mine did, then we can probably just take any projection system and assume that if the other projection systems all disagree with a player’s projection, then that projection is likely to be wrong.

Tango has a boat load of projections.  He can probably come up with a list of 20 or so projections for each system that the other systems disagree with.  I would guess that we would find that a majority of those players had “bad” projections in that one system.
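One way such a list could be built is to rank each system's projections by how far they sit from the average of the other systems. The sketch below assumes a simple dictionary layout; the system names, player labels, and values are all made up for illustration.

```python
from statistics import mean


def flag_outliers(projections, system, top_n):
    """Return the top_n players where `system` is furthest from the others.

    projections: {player: {system_name: projected_value}}
    Returns a list of (player, gap) where gap = system value - mean of others.
    """
    gaps = []
    for player, by_system in projections.items():
        if system not in by_system:
            continue
        others = [v for name, v in by_system.items() if name != system]
        if not others:
            continue  # nothing to compare against
        gaps.append((player, by_system[system] - mean(others)))
    # Largest absolute disagreement first.
    gaps.sort(key=lambda item: abs(item[1]), reverse=True)
    return gaps[:top_n]


# Hypothetical OPS projections from three made-up systems.
sample = {
    "Player A": {"SysX": 0.900, "SysY": 0.820, "SysZ": 0.830},
    "Player B": {"SysX": 0.750, "SysY": 0.755, "SysZ": 0.745},
    "Player C": {"SysX": 0.700, "SysY": 0.760, "SysZ": 0.750},
}
for player, gap in flag_outliers(sample, "SysX", top_n=2):
    print(f"{player}: {gap:+.3f}")
```

The sign of the gap then gives the implied over or under call for that system, which could be graded against actual results at season's end.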

If I remember I will revisit David’s picks at the end of the season, as obviously we will have larger sample sizes to work with.


#1    David Gassko      (see all posts) 2009/06/08 (Mon) @ 08:37

Mickey,

You’re stealing my fire! I just updated the spreadsheet I’ve been using to keep track, and was patting myself on the back. After all, through Saturday’s games, my results were as follows

Hitters
Out-perform: .779 OPS (Proj.), .826 (Act.), +.047 Diff
Under-perform: .927, .822, -.105

In other words, the hitters I thought would out-perform their projections indeed have, by 47 points of OPS, and those I thought would under-perform have done the same, by a whopping 105 points. Indeed, though the group I thought would under-perform was projected to be 150 points better than the group I thought would over-perform, they’ve been exactly equal more than two months into the season. What about the pitchers?

Pitchers
Out-perform: 4.29 ERA, 3.43, -0.85
Under-perform: 3.69, 4.98, +1.30

The difference here is even greater. The group I thought would be better than their projections has been better to the tune of 85 points of ERA and the group I thought would be worse has under-performed by a whopping 1.3 earned runs per nine. In fact, though the “under-perform” group was initially projected to be 0.6 runs better than the “over-perform group,” they’ve instead been 1.55 runs worse!

In all, this implies a pretty big win for me. But, I agree, it’s possible that I cherry picked the worst THT projections. Even if our system is very good (and since I designed it, I have to believe it is!) it could be that it misses on certain fairly easy to identify players. So what I did yesterday is add a control group, namely the CHONE projections. Since I did not look at the CHONE projections in picking which players I thought THT had missed, they can be used as an independent control; technically, they shouldn’t show any bias. So how does CHONE compare to my picks?

Hitters
Over-perform: .797 (Proj.), .826 (Act.), +.029 (Diff)
Under-perform: .890, .822, -.068

CHONE’s projections for each group were a little more conservative than THT’s (as we would expect, since I chose the THT projections I thought looked the worst - if I had done this experiment with CHONE and used THT as the control, we would see the same thing). Still, CHONE projected more than a 90 point difference between the two groups, when in actuality, two months into the season, there is none. Indeed, the “over-perform” group is in fact out-performing their projection by around 30 points and the “under-perform” group is in fact under-performing their projection by almost 70 points. What about the pitchers?

Pitchers
Over-perform: 3.94, 3.43, -0.51
Under-perform: 3.77, 4.98, +1.21

Again, the CHONE projections are a bit closer together (though I should note that they actually saw the under-perform group about the same as THT did), seeing around a 0.2 run gap between the two groups. Instead, the gap has been 1.55 runs - but in the other direction! CHONE still has the “over-perform” group projected half-a-run too high, and the “under-perform” group 1.2 runs too low!

The CHONE results are a huge win for me (with the caveat that a lot can change over the next four months). They suggest that either I have some magical ability to spot breakouts and collapses, or (more likely) that all computer based systems, even the best ones, can be improved with human input. That is, the best projection is not one spit out by the computer, but one that is then modified by subjective opinion, or at least my subjective opinion. grin That was exactly the hypothesis of my article and it would be a HUGE result to learn that this is the case. Again, we won’t know for sure until I publish my final results in October, but so far it’s looking good.


#2          (see all posts) 2009/06/08 (Mon) @ 10:20

How does this compare to the control of picking the k highest/lowest OPS/ERA players?


#3    MGL      (see all posts) 2009/06/08 (Mon) @ 13:15

While I/we don’t know whether you have a great ability to identify these “bad” projections or any good analyst can do roughly the same, I should have pointed out as you did, David, that the magnitude of your overs and unders is amazingly large.  It would have been interesting if you had taken an “over” or “under” on every player in the THT projection database (or even some other projection systems).  Maybe next year, we can do that for 5 or 6 projection systems - have everyone take an over or under on each projection. That is probably a better “contest” than evaluating projection systems in the first place, since I think that they are all basically the same, as they should be actually.

When you presented your list, had you done any research on the players you chose, or did you pretty much do everything “in your head”?


#4    MGL      (see all posts) 2009/06/08 (Mon) @ 13:47

IOW, how did you determine which players you thought were over or under projected?  Were there many others (you thought were wrong) and these were just the ones you thought were “really” wrong?  Why do you think a system like THT, which as you say, you principally designed, would get some players so wrong?  Do you think there is some bias or flaw in the system that can be corrected or do you think that that is inherent in all “automated” systems (getting some players flat out wrong)?  Inquiring minds want to know!


#5    JD      (see all posts) 2009/06/08 (Mon) @ 13:55

What always made me scratch my head is how with a lot of players, there always seems to be one projection system (and not always the same one, so this isn’t a criticism of any particular system) with a wacky outlier projection. I wonder how this happens, since I have always assumed most of the good projection systems MGL mentioned use mostly the same information in mostly the same way. If this is true, then how the heck is one of them always way off the consensus of other systems?


#6    Tangotiger      (see all posts) 2009/06/08 (Mon) @ 14:15

I disagree that most systems are similar.  The Forecasters Challenge 2009 that I am running is case-in-point. 

Even Marcel, MGL, and ZiPS (which use the identical playing time forecasts, the Community, and therefore only differ on the rate stats) have limited overlap.

I mean, I agree that “overall” they will be similar.  But, you are only selecting 10% or 5% of the available players for your team.  So, those extreme type players (like Gassko is doing) stand out as the exceptions, and therefore define your forecasting system.

Anyway, here is a similar thing I ran six years ago:
http://www.tangotiger.net/forecastAll.html
http://www.tangotiger.net/forecastFinal.html
http://www.tangotiger.net/forecastFinal2.html

I had asked the leading forecasters for their forecasts on “hard to evaluate” players.


#7    David Gassko      (see all posts) 2009/06/08 (Mon) @ 15:01

I ended up with 29 players because that was about the number of projections that looked wrong to me. The vast majority (we projected over 2,600 players) looked fine, but a few (about 1%) looked off to me. I got that from my personal impressions as well as perusing the players’ past stats on THT and Baseball Reference. Nothing too fancy, in other words. I think the key to my success (if I end up successful—things can change after all over the next four months) was that I knew things no projection system could. Or maybe it was that I could act on my hunches whereas a computer is significantly more constrained. For sure, not all of my calls (Delmon Young, Francisco Liriano) look good. But so far, so good.


#8    MGL      (see all posts) 2009/06/08 (Mon) @ 16:12

I haven’t looked closely at the 29 players (a few of them - Bonderman, Duchscherer, etc. - have hardly played this year, which is why I only used 25), but if there are obvious errors as compared to a quick and dirty Marcel, then picking them out is no big feat.  If a quick and dirty Marcel yields around the same result (as the “bad” projections) then something else is going on.  There are only 3 things that a projection system does, in order of importance, more or less:  one, use a weighted average of a player’s past performance, two, regress toward a proper mean, and three, age adjust.  Using that basic methodology, all systems should come up with pretty much the same numbers on most players.
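The three steps can be sketched as follows. This is a minimal illustration in the spirit of a basic Marcel, not the method of any system discussed here; the 5/4/3 season weights, the 1200 PA of regression, the peak age of 29, and the per-year age factors are assumptions chosen for the example, as are the player's numbers.

```python
def project_rate(seasons, league_rate, age,
                 weights=(5, 4, 3), regression_pa=1200, peak_age=29):
    """Project a rate stat (e.g. OBP) for next season.

    seasons: list of (rate, plate_appearances), most recent season first.
    All default constants are illustrative assumptions.
    """
    # Step 1: weighted average of past performance (recent seasons count more).
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))
    weighted_total = sum(w * rate * pa for w, (rate, pa) in zip(weights, seasons))

    # Step 2: regress toward the mean by blending in league-average PA.
    regressed = (weighted_total + regression_pa * league_rate) / (
        weighted_pa + regression_pa
    )

    # Step 3: age adjust (small boost before the peak age, small decline after).
    if age < peak_age:
        age_factor = 1 + 0.006 * (peak_age - age)
    else:
        age_factor = 1 - 0.003 * (age - peak_age)
    return regressed * age_factor


# Hypothetical hitter: three seasons of OBP with plate appearances.
history = [(0.360, 600), (0.340, 550), (0.350, 500)]
projection = project_rate(history, league_rate=0.330, age=29)
print(f"Projected OBP: {projection:.3f}")
```

Because the regression step pulls every player toward the same mean, two systems that differ only slightly in their weights or constants will land close together on most players, which is the point made above.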

Where systems can differ and where one system can be better than the other are:

Regressing individual components properly and then putting them all together.  For example, one system that just did a weighted average of a player’s past OPS and then regressed that would likely under-perform a system that took a weighted average of all of a player’s component stats, then regressed each one individually, and then put them all together to come up with an OPS projection.

Using certain denominators for the component rate stats.

Using different components, like, for example, HR per fly ball for pitchers in order to project their HR rate.  Or using BABIP or line drive rates for hitters to tweak their projections.

Using different weights for a basic weighting of past performance, depending on age, the type of player, etc.

Using different means to regress towards, depending on the type of player, his age, physique, etc.

Using different aging curves for different types of players, or just using different age adjustments, period.

Adjusting for injuries or things like that.

Using minor league stats or not, and if yes, what kind of MLE’s and park factors are being used.

Park adjusting stats for players who have switched teams in the last 3 or 4 years or so.

League adjustments for players who switched leagues in the last few years or so.

There are probably some more things I am missing that can separate one system from another, but as you can see, even though all systems are basically alike in their methodologies, there are a lot of things that can add up to produce very different results from one system to another, as Tango says.


#9    David Gassko      (see all posts) 2009/06/08 (Mon) @ 16:42

Mickey,

I sincerely doubt that THT or CHONE has any meaningful errors in the system. I can add Marcel to my spreadsheet if you’d like, but I suspect it will give roughly the same results. Hey, maybe humans do have something to add to the “wisdom” of projection systems. Or maybe I’ve just been very lucky. We’ll see.


#10    dan      (see all posts) 2009/06/08 (Mon) @ 17:46

If you read Matt Swartz’s articles at StatSpeak on evaluating projection systems, it should be much easier to pick players who will under- and over-perform. I’m not saying this is what David did, nor am I saying he even tried to systematically identify these kinds of players a projection would miss on. I’m just saying that if you wanted to see the kinds of players a projection would miss on, Matt’s article is the right place to start.

http://statspeak.net/2009/04/testing-the-projection-systems-strengths-and-weaknesses.html


#11    David Gassko      (see all posts) 2009/06/08 (Mon) @ 17:48

Dan,

That is not what I did, but more importantly I question the significance of Matt’s findings. Though I was very impressed with the breadth and thoroughness of his work, the sample sizes are such that most of the “biases” he found are likely just randomness, IMHO.


#12          (see all posts) 2009/06/08 (Mon) @ 19:22

I seem to recall (and Rally can correct me if wrong) that CHONE is manually-fiddled after the computer spits out the results of the arithmetic.


#13    MGL      (see all posts) 2009/06/08 (Mon) @ 21:09

Well, based on David’s results so far, I think that manual fiddling is the way to go - at least for David that is!

David, what I am trying to say (about biases and errors in the methodology), and I am probably using the wrong words, is that since you essentially designed the system, and you are able to pick out 29 players whom you think it completely missed the boat with, surely you should be able to figure out how to tweak the system itself so that it doesn’t miss the boat on so many players.  IOW, the first question you need to ask in order to “fix” the system is, “What is it that you knew and the ‘system’ didn’t, for these players?” I mean theoretically, whatever a human being does, you can program a computer to do, especially if you know exactly what it is you are doing that the current system is not.


#14    Rally      (see all posts) 2009/06/08 (Mon) @ 21:54

"I seem to recall (and Rally can correct me if wrong) that CHONE is manually-fiddled after the computer spits out the results of the arithmetic.”

100% wrong.  No fiddling.  Check out the projection for Ben Sheets as an example.

I know what MGL is saying, but I doubt it’s one thing that David was going on to pick those players.  If it were it would be easy to program.  Probably more like 29 things, that may not be wise to systematically program and apply to everyone, combined with a gut feeling.  Sound right, David?


#15    David Gassko      (see all posts) 2009/06/08 (Mon) @ 22:22

Yep, I’m with Rally. How do you include, “I saw David Ortiz hit and no matter what numbers he put up, he looked done to me” in a projection system? Or “Evan Longoria/Justin Upton has superstar written all over him?” (Note that THT projections do give players credit for reaching the majors at a young age, so my thinking was really that these two were even better than a computer projection could statistically account for.) You certainly can’t account for, “I never got what made Fausto Carmona good enough to be a major league pitcher, so I see no way he can put up an ERA in the low-4.00s.” Or, “I’ve always thought Joe Blanton sucked.”

Perhaps next year, I will manually fiddle with the THT projections, but that creates its own set of problems. In the end, if these numbers hold up through the end of the season, I think we will have arrived at a fascinating, important result, but frankly I’m not sure what exactly we’ll be able to do with it.

Maybe some team will offer me a lot of money for my psychic powers. grin


#16    Rally      (see all posts) 2009/06/08 (Mon) @ 23:55

From the forecast challenge of 6 years ago, the forecasting systems on average beat the average fan.  The fans who did very well might have been smarter than a system, or they might just be the lucky ones.  Who knows with only one year of data.

What I wondered was how a subgroup of observers would do - say general managers and scouts.  Maybe we should throw top-notch sabermetricians into that mix.  These groups might well do a lot better than the average fan.

Even with that, we’re cherry picking here.  David is picking players he’s seen quite a bit of (Ortiz is hitting like a leetle beetch - he’s done).  When I’m projecting 1500 major and minor league players, how many have I seen enough of that I could add a subjective opinion on?  Angels, some of their division opponents, a few Orioles, whoever plays in the postseason.  But most players I wouldn’t have much to add.


#17    MGL      (see all posts) 2009/06/09 (Tue) @ 00:14

It is like sports handicapping and linemakers.  The linesmakers and the public combined do a lot better overall than any handicapper.  But, the handicappers can cherry pick games to bet on.  A good handicapper can identify 10% of the lines that are clearly wrong, but if they had to bet on every game they would lose.  If the linesmaker were able to bet against the handicapper he would win as well, because he would be able to cherry pick his games. It is the same thing here.  The forecaster is going to beat out any scout or the public or what have you for all players overall.  But either one can cherry pick players and kill the other one for those players.  There is an old saw in sports betting that is appropriate here: “He who makes the line first loses.” In forecasting, it is “He who projects first loses.”

So I think that David and any other good analyst should definitely manually tweak the projections.  That is assuming that they have the requisite skill and knowledge, which probably varies from analyst to analyst.  Although I watch players as much as anyone else, I’m not sure I would be so comfortable doing any manual tweaking.  But, as I said, if we remember next year, we should have a “contest” for people to pick up to X percent of all the projections as “wrong” like David did.  My guess would be that everyone as a whole would do very well, and that the good analysts like David would kill the forecasts in those small number of players, just as he is doing now.  And he (David) is in such a lead now, that it is unlikely that much will change at the end of the year.

As I said in my initial post, the only thing that troubles me a little is that my regular projections agreed with David in most of the cases, suggesting that the THT ones were indeed bad on those players even as compared to a basic advanced Marcel (which mine is).


#18    David Gassko      (see all posts) 2009/06/09 (Tue) @ 08:18

Mickey,

Why don’t you e-mail me your projections and I’ll post how they compare to my predictions?


#19    MGL      (see all posts) 2009/06/09 (Tue) @ 11:54

I’m not sure what you mean “how they compare to my predictions.” You mean for those 29 players?  By “predictions” do you mean “up” or “down” for those 29 players in your article?  If that is what you mean, I already did that, didn’t I.  I am not sure what you are going to post.


#20    David Gassko      (see all posts) 2009/06/09 (Tue) @ 17:54

I mean that I’ll make the same comparison as I did with THT and CHONE, and see what exactly you projected for each group of hitters and pitchers.


#21    MGL      (see all posts) 2009/06/09 (Tue) @ 19:20

David, on its way…


#22          (see all posts) 2009/06/09 (Tue) @ 19:24

Get a group of people to pick players who under/over-perform, and then do a ‘Wisdom of the Crowds’ approach on Marcel, or whatnot, to see if it outperforms Marcel the next year.


#23    Tangotiger      (see all posts) 2009/06/09 (Tue) @ 19:43

Sal, that was practically what I ran 6 years ago.


#24    David Gassko      (see all posts) 2009/06/09 (Tue) @ 22:59

Okay, so I added MGL’s projections to the study. The results are as follows, with THT and CHONE results also included to make things simpler. Before I begin, I’d just like to note that THT projections will necessarily be further off since I picked the THT projections that looked worst to me, not CHONE or MGL. As you’ll see, it’s likely that if I had done this exercise with a different system, that system would have looked worse than the rest. Sampling bias and all that. On to the results…

Hitters
Over-perform
THT: .779(Proj.), .826 (Act.), +.047 (Diff)
CHONE: .797 (Proj.), .826 (Act.), +.029 (Diff)
MGL: .788 (Proj.), .826 (Act.), +.038 (Diff)

The hitters I thought would beat their projection have done so to the tune of 47 points of OPS. They beat CHONE by 29 and MGL by 38. So no change here—if anything, MGL gets beat a little worse than CHONE.

Under-perform
THT: .927 (Proj.), .822 (Act.), -.105 (Diff)
CHONE: .890 (Proj.), .822 (Act.), -.068 (Diff)
MGL: .913 (Proj.), .822 (Act.), -.091 (Diff)

Same deal here. The hitters I thought were projected too high have been 105 points worse than their THT projection, 68 points worse than their CHONE projection and 91 points worse than their MGL projection. Again, MGL is worse than CHONE, but here by a more substantial margin.

In all, all three projection systems saw a gap of around 120 points in OPS between the two groups, but instead they’ve been exactly equal. Wow!

On to the pitchers…

Pitchers
Over-perform
THT: 4.29 (Proj.), 3.43 (Act.), -0.85 (Diff.)
CHONE: 3.94 (Proj.), 3.43 (Act.), -0.51 (Diff.)
MGL: 4.08 (Proj.), 3.43 (Act.), -0.65 (Diff.)

The pitchers that I thought would out-perform their projections have. They’ve beaten their THT projection by 85 points of ERA, their CHONE projection by 51 points, and their MGL projection by 65 points. Again, all three systems are wrong and again CHONE is better than MGL.

Under-perform
THT: 3.69 (Proj.), 4.98 (Act.), +1.30 (Diff.)
CHONE: 3.77 (Proj.), 4.98 (Act.), +1.21 (Diff.)
MGL: 3.98 (Proj.), 4.98 (Act.), +1.00 (Diff.)

I thought these pitchers would under-perform their projections and boy have they ever! They’ve done 1.3 runs worse than the THT projections thought they would, 1.21 runs worse than CHONE, and 1 run worse than MGL. Finally, MGL beats CHONE but he’s still a full run off for these pitchers.

In all, the projection systems thought that the first group of pitchers (the one I thought would do better than their projections) was around .25 runs worse than the second, but instead they’ve been 1.55 runs BETTER! That’s a 1.8 run swing. I don’t think any interjection is needed here to explain how incredible a result this is.
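The arithmetic behind that swing can be checked with a short sketch that uses only the ERA figures quoted in this comment. Averaging the three systems with equal weight is an assumption about how the “around .25” figure was reached.

```python
from statistics import mean

# Projected ERA as quoted above: (over-perform group, under-perform group).
projected = {
    "THT": (4.29, 3.69),
    "CHONE": (3.94, 3.77),
    "MGL": (4.08, 3.98),
}
actual_over, actual_under = 3.43, 4.98

# Positive gap = the "over-perform" group was projected to have the worse ERA.
projected_gaps = {name: over - under for name, (over, under) in projected.items()}
average_projected_gap = mean(projected_gaps.values())  # assumption: equal weights

actual_gap = actual_over - actual_under
swing = average_projected_gap - actual_gap

for name, gap in projected_gaps.items():
    print(f"{name}: projected gap {gap:+.2f}")
print(f"Average projected gap: {average_projected_gap:+.2f}")
print(f"Actual gap: {actual_gap:+.2f}")
print(f"Swing: {swing:.2f} runs")
```

With equal weights the average projected gap is 0.29 runs and the swing is 1.84 runs, in line with the rounded figures in the comment.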

So clearly I’ve beaten all the projections thus far, even though I made these predictions only in regards to one. In other words, if these results hold up, it’s not the THT system that’s missing something but ALL computer based systems—or at least the two best besides THT that are in existence today and available to the public (more or less). I’ll let that be interpreted as you wish.

One thing I will note with interest is that while I am definitely winning in all four categories, I’ve nailed the under-performers better than I have the over-performers. Perhaps there’s something to that…


#25          (see all posts) 2009/06/09 (Tue) @ 23:36

Tango, how did it work? Do you have a link?

One way to do it would be to make a google spreadsheet or the like and give access to a list of 20 people (or whatnot) who could each contribute what they thought was the best. You then run a simple script to see who touched which players, and then use some sort of weighting to see how much the combination of all the reviews moves the final projections.

That way you could get more people giving input on more players, and perhaps their contributions alone aren’t that impressive, but on the whole they may move even good projections closer to where they end up.

If that’s what you already did, I’d love to see the results.


#26    MGL      (see all posts) 2009/06/10 (Wed) @ 00:49

David, as I said, I agree that your results so far are quite impressive.  I assume that you will revisit this at the end of the season.  Not to take anything away from your feat thus far, but to be honest, I am not exactly sure what to make of a situation where someone identifies X players that they think will overperform or underperform (by no particular amount) and it turns out that those players in the aggregate greatly over or underperform.  In other words, I am not exactly sure how great a feat that is.  For example, let’s say that you actually put a number to all those projections - let’s say that you project your under-performers to be 20 points less in OPS than the THT projections.  I realize that you weren’t doing that - you were merely picking out those players that you thought were most “off” (which actually suggests that your own projections, if you made one, would be quite a bit different from the THT ones).  Anyway, let’s say that the THT aggregate projections were .800 and yours were .780.  Now if the players actually produced .780, wouldn’t that be a lot more accurate on your part than if they produced .720?  I’m just kind of thinking out loud here and wondering whether we should be measuring your “feat” based on how much off the THT projections were. I’m not really sure.  Anyway, let’s see what happens at the end of the year.
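The question can be made concrete with the hypothetical numbers from this comment (.800 for THT, .780 for the analyst, and an actual result of either .780 or .720). The sketch only computes absolute errors; which of them should define the “feat” is the open question.

```python
def abs_error(projection: float, actual: float) -> float:
    """Absolute miss between a projected and an actual aggregate OPS."""
    return abs(projection - actual)


# Hypothetical aggregates taken from the comment above.
tht_projection = 0.800
analyst_projection = 0.780

for actual in (0.780, 0.720):
    tht_miss = abs_error(tht_projection, actual)
    analyst_miss = abs_error(analyst_projection, actual)
    print(
        f"actual {actual:.3f}: THT off by {tht_miss:.3f}, "
        f"analyst off by {analyst_miss:.3f}"
    )
```

In both scenarios the analyst is 20 points closer than THT, but the analyst's own miss is zero in the first case and 60 points in the second. Grading by how far THT missed rewards the second case more, while grading by the analyst's own error rewards the first.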


#27    Tangotiger      (see all posts) 2009/06/10 (Wed) @ 06:55

Sal: post 6


#28          (see all posts) 2009/06/10 (Wed) @ 19:23

Slightly different (I think).

I would rather use Marcels, straight-up, and then have the forecasters all modify those Marcels where they thought it was off.

You take the results of all forecasters, weight them evenly (the first year), and use the aggregate changes to edit the original Marcels.

Then compare it to the Marcels as they are, and see if they do better.
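A minimal sketch of that procedure follows. The data layout, the forecaster and player labels, and all numbers are hypothetical. Counting a forecaster who left a player untouched as a zero adjustment is one possible reading of “weight them evenly”, and it is an assumption.

```python
from statistics import mean


def apply_crowd_adjustments(baseline, adjustments):
    """Shift each baseline projection by the evenly weighted average adjustment.

    baseline: {player: projected_value}
    adjustments: {forecaster: {player: delta}}
    A forecaster who did not touch a player contributes a delta of 0.
    """
    n_forecasters = len(adjustments)
    adjusted = {}
    for player, value in baseline.items():
        total = sum(deltas.get(player, 0.0) for deltas in adjustments.values())
        shift = total / n_forecasters if n_forecasters else 0.0
        adjusted[player] = value + shift
    return adjusted


def mean_abs_error(projection, actual):
    """Average absolute miss over the players that have actual results."""
    return mean(abs(projection[p] - actual[p]) for p in actual)


# Hypothetical baseline (stand-in for Marcel), adjustments, and results.
baseline = {"Player A": 0.800, "Player B": 0.750}
adjustments = {
    "Forecaster 1": {"Player A": -0.030},
    "Forecaster 2": {"Player A": -0.010, "Player B": 0.020},
}
actual = {"Player A": 0.770, "Player B": 0.765}

crowd = apply_crowd_adjustments(baseline, adjustments)
print("Baseline MAE:", round(mean_abs_error(baseline, actual), 4))
print("Crowd-adjusted MAE:", round(mean_abs_error(crowd, actual), 4))
```

An alternative would be to average only over the forecasters who actually adjusted a given player, which moves the projection further when just a few people have an opinion.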

You could do it with the data if you have each forecaster’s individual picks from 6 years ago lying around.


#29    Mark      (see all posts) 2009/06/14 (Sun) @ 07:25

"It is like sports handicapping and linemakers.  The linesmakers and the public combined do a lot better overall than any handicapper.  But, the handicappers can cherry pick games to bet on.  A good handicapper can identify 10% of the lines that are clearly wrong, but if they had to bet on every game they would lose.  If the linesmaker were able to bet against the handicapper he would win as well, because he would be able to cherry pick his games. It is the same thing here. “

Obviously, you have never bet on sports for a living… If you are a serious handicapper in a given sport, you will be able to create fair odds that are more accurate than the market price on average.  On average means that your odds will be closer to the actual results than the sportsbooks’ odds are—after grading the predictions of every single game.  Yes, you will only choose to bet on those games with a significant enough theoretical edge to overcome the ~2% to 5% spread you have to cross to place a wager.  But no, the average sportsbook could not beat a successful handicapper by cherry picking games.  What you wrote is just a terrible misinterpretation of how sports handicapping works.  Good players do not pick every game, mainly because of the vig/commission being charged—it is not because they are capable of correctly pricing only a certain subset of games.

In particular, this sentence: “The linesmakers and the public combined do a lot better overall than any handicapper” truly makes no sense.  Linemakers make money mainly due to the vig they charge; some superior ones make more than their theoretical hold bc they shade the markets according to their own calculated fair value, which may be better than what the public’s fair value is.  Most linesmakers do not take that risk.  The “public” is who is paying the vig.  I.e., what the public loses, the linemaker gains.  Net sum = zero.  So for the linesmakers and the public “combined” to make more money than “any handicapper” is not even logically possible.  A sharp handicapper makes money off of mispricings.  Mispricings can arise from public opinion and/or from linemakers’ opinions.  Both are beatable if you are skilled enough.


#30    MGL      (see all posts) 2009/06/14 (Sun) @ 12:39

You have no idea how ironic your post is, but I may be using the word “ironic” incorrectly.


#31    weskelton      (see all posts) 2009/10/07 (Wed) @ 22:59

David Gassko has done an evaluation of his preseason forecasting vs THT’s projections…

http://www.hardballtimes.com/main/article/man-vs.-computer/

It looks like he’s claiming victory on 21 out of 28.  Not too bad.


#32    MGL      (see all posts) 2009/10/08 (Thu) @ 01:44

Wes, have you been sleeping?  There has been a thread on that on this blog for about 3 days! wink


#33    weskelton      (see all posts) 2009/10/08 (Thu) @ 09:20

Ah yes, I see it now.  Thanks MGL.  I guess it has been a few days since I’ve been over here.  I’ll try not to let it happen again.

