THE BOOK--Playing The Percentages In Baseball

Friday, September 22, 2006

How Reliable Are Fans?

By Tangotiger, 11:59 AM

Very.  Here’s the evidence, using the Fans’ Scouting Report.


How many fans do we really need participating for each team?

Here’s a mini-study I ran, and I’ll take you step-by-step.  The first thing I did was look for all ballots where all 7 traits were filled in for each player, and that there were at least 8 such players on each ballot.  These can be considered a “fully-completed” ballot.

For each player, I figured out his unweighted score.  So, a guy with all 5s was given a 5.0, a guy with some 5s, and some 4s would have been a 4-point-something, etc.
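The two steps above (keeping only fully-completed ballots, then scoring each player) might be sketched roughly as follows. This is only an illustration: the ballot layout (a dict of player name to trait ratings) and the function names are assumptions, as is the reading that incompletely rated players are simply ignored when counting toward the minimum of 8. The trait names are the categories named later in this thread.

```python
# Trait names as they appear later in this thread; the data layout is hypothetical.
TRAITS = ("instincts", "first_step", "speed", "hands",
          "release", "arm_strength", "accuracy")

def complete_players(ballot):
    """Return only the players for whom all 7 traits were rated."""
    return {
        player: ratings
        for player, ratings in ballot.items()
        if all(ratings.get(trait) is not None for trait in TRAITS)
    }

def is_fully_completed(ballot, min_players=8):
    """A ballot counts if at least `min_players` players were completely rated."""
    return len(complete_players(ballot)) >= min_players

def unweighted_score(ratings):
    """Plain average of the 7 trait ratings: all 5s gives 5.0."""
    return sum(ratings[trait] for trait in TRAITS) / len(TRAITS)

all_fives = {trait: 5 for trait in TRAITS}
mostly_fives = {**all_fives, "speed": 4, "hands": 4}  # five 5s and two 4s
print(unweighted_score(all_fives))
print(round(unweighted_score(mostly_fives), 2))
```

A player with five 5s and two 4s scores 33/7, which is the "4-point-something" case described above.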

I then took a random team, the Twins.  They had 31 fully-completed ballots.  I divided up the Twins fans into two random sets, and computed, for each player, their average score.  I ended up with these two sets:

Set1 - Set2
3.8 - 4 Bartlett, Jason
2.3 - 2.3 Batista, Tony
3.6 - 4 Castillo, Luis
2.8 - 3 Castro, Juan
3.8 - 3.8 Cuddyer, Michael
3.6 - 3.5 Ford, Lew
4.2 - 4.4 Hunter, Torii
3.1 - 2.9 Kubel, Jason
4.4 - 4.5 Mauer, Joe
3.1 - 3.1 Morneau, Justin
3.9 - 4.2 Punto, Nick
3.1 - 3.6 Redmond, Mike
3.5 - 3.4 Rodriguez, Luis
2.3 - 2.5 Stewart, Shannon
3.8 - 3.8 Tyner, Jason
2.5 - 2.7 White, Rondell

That’s a total of 16 fielders, each of whom received 8 to 16 votes (for an average of 13 votes in each set).
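For anyone who wants to repeat the random split, here is a minimal sketch. It assumes each ballot has already been reduced to a dict of player name to unweighted score; that layout, the function name, and the fixed seed are all hypothetical choices, not how the original numbers were produced.

```python
import random
from collections import defaultdict

def split_half_averages(ballots, seed=0):
    """Randomly split ballots in two and average each player's score per half.

    `ballots` is a hypothetical list of dicts mapping player -> unweighted score.
    Returns {player: (set1_avg, set2_avg)} for players rated in both halves.
    """
    shuffled = ballots[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    halves = (shuffled[:mid], shuffled[mid:])

    averages = []
    for half in halves:
        scores = defaultdict(list)
        for ballot in half:
            for player, score in ballot.items():
                scores[player].append(score)
        averages.append({p: sum(v) / len(v) for p, v in scores.items()})

    # Keep only players who appear in both halves.
    common = averages[0].keys() & averages[1].keys()
    return {p: (averages[0][p], averages[1][p]) for p in sorted(common)}

# Made-up ballots, only to show the shape of the output.
demo = [{"Player A": 4.0, "Player B": 2.0}] * 4
print(split_half_averages(demo))
```

With an odd number of ballots (31 for the Twins), one half simply ends up one ballot larger than the other.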

Then, a simple correlation was run between the two.  The correlation (r) was 0.96.  For all intents and purposes, you take a group of 13 hardcore Twins fans, and compare their evaluation to 13 other hardcore Twins fans, and they will agree almost perfectly.  Anyone else surprised that just getting 13 ballots is enough?
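The correlation itself is easy to reproduce from the Twins table. The sketch below uses a hand-written Pearson function (the name `pearson_r` is just a label for this example) on the rounded averages printed above, so it can only approximate whatever the unrounded data gave; on these figures it comes out close to the 0.96 quoted.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Twins averages, in table order (Bartlett through White).
set1 = [3.8, 2.3, 3.6, 2.8, 3.8, 3.6, 4.2, 3.1,
        4.4, 3.1, 3.9, 3.1, 3.5, 2.3, 3.8, 2.5]
set2 = [4.0, 2.3, 4.0, 3.0, 3.8, 3.5, 4.4, 2.9,
        4.5, 3.1, 4.2, 3.6, 3.4, 2.5, 3.8, 2.7]

r = pearson_r(set1, set2)
print(round(r, 2))
```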

Let’s drop it down some, and try it with another team.  This time, we’ll try the Reds.  I have 15 Reds players, who each received 7 to 12 votes (average of 9.5).  Here is how one group of fans evaluated them against the other:
3 - 2.8 Aurilia, Rich
4 - 3.9 Castro, Juan
2.8 - 2.7 Clayton, Royce
3.7 - 3.9 Denorfia, Chris
2.2 - 2.3 Dunn, Adam
3.4 - 3.5 Encarnacion, Edwin
3.9 - 3.9 Freel, Ryan
3 - 3.1 Griffey Jr., Ken
2.6 - 2.6 Hatteberg, Scott
3.7 - 3.5 Kearns, Austin
3.6 - 3.3 LaRue, Jason
2.6 - 2.5 Lopez, Felipe
4.1 - 4.1 Phillips, Brandon
2.6 - 2.6 Ross, David
2.2 - 2.2 Valentin, Javier

Remember now, we have fewer fans here compared to the Twins (9.5 against 13).  Our correlation between the two groups of Reds fans?  0.98!

Let’s look at one last one, the Pirates.  I have 13 players evaluated for each group of fans, with each player receiving 4 to 7 evaluations (for an average of only six evaluations in each set).  Pretty tiny, right?  Here are their evaluations:
3.6 - 3.3 Bautista, Jose
3.3 - 3.4 Bay, Jason
2.6 - 2.5 Burnitz, Jeromy
2.8 - 2.7 Casey, Sean
3.2 - 2.9 Castillo, Jose
4.4 - 4 Duffy, Chris
3.6 - 3.5 McLouth, Nate
3.6 - 3.3 Nady, Xavier
2.8 - 2.9 Paulino, Ronny
3.3 - 3 Randa, Joe
3.7 - 4.2 Sanchez, Freddy
2.7 - 2.6 Wilson, Craig
4 - 3.9 Wilson, Jack

And the correlation?  0.90.  Think about that.

I need 2,000 PA from a hitter before his performance is as reliable a measure of his true talent level as SIX fans are of what other hardcore fans see.  Now remember, fans in general can be completely wrong, but they will be wrong together.  All we’re looking for here is how reliable the small sample is, not how accurate its evaluation is.

Here’s a quick regression estimate:
r = fans / (fans + 0.6)

If you have 6 fans, the reliability is .90.  If you have 13 fans, the reliability is .96.

Establish what reliability level you want, and I can tell you how many fans you need.
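Turning that around is simple algebra: if r = fans / (fans + 0.6), then fans = 0.6 * r / (1 - r). A small sketch, with function names that are only labels for this example:

```python
K = 0.6  # constant from the quick estimate above

def reliability(fans, k=K):
    """Estimated reliability for a given number of fans: n / (n + k)."""
    return fans / (fans + k)

def fans_needed(target, k=K):
    """Invert r = n / (n + k) to get n = k * r / (1 - r)."""
    if not 0 <= target < 1:
        raise ValueError("target reliability must be in [0, 1)")
    return k * target / (1 - target)

for n in (6, 13):
    print(n, "fans ->", round(reliability(n), 3))

for target in (0.90, 0.95, 0.99):
    print(target, "->", round(fans_needed(target), 1), "fans")
```

By this formula a reliability of .90 needs between 5 and 6 fans, and .99 needs about 60. The same functional form falls out of the Spearman-Brown formula, with 0.6 playing the role of (1 - r1) / r1 for a single-fan reliability r1 of about 0.625; that connection is an observation, not something the post states.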

Finally, the biggest representation, by far, was for the Mariners.  I have an average of 68.5 evaluations for each of these players:
4.3 - 4.3 Beltre, Adrian
4.4 - 4.4 Betancourt, Yuniesky
2.8 - 2.7 Bloomquist, Willie
2.8 - 2.7 Ibanez, Raul
3.3 - 3.2 Johjima, Kenji
3.4 - 3.6 Jones, Adam
3.5 - 3.5 Lopez, Jose
3.5 - 3.5 Reed, Jeremy
2.6 - 2.6 Rivera, Rene
2.8 - 2.7 Sexson, Richie
4.8 - 4.8 Suzuki, Ichiro

The correlation is 0.994.  What would our equation say?  68.55 / (68.55+.6) = 0.991.

#1    David Smyth      (see all posts) 2006/09/22 (Fri) @ 16:56

I think I’m on record as not being a fan of the Fan Scouting Report.

Given that admission, I’m wondering what is the significance that there is a high agreement among the responders. There are obvious reasons why that should be somewhat expected, having little or nothing to do with the accuracy of ratings. And, even if the ratings *are* reasonably accurate, it’s not necessarily (or even likely, IMO) because fans are good enough scouts to produce accurate ratings “en masse”.


#2    tangotiger      (see all posts) 2006/09/22 (Fri) @ 17:13

Fans, by themselves (alone) are not good enough scouts, but, “en masse”, they sure look like it. 

I’ve seen detailed scouting reports from MLB scouts, and, they are very hit-and-miss.  I see little reason, if we look at it strictly from an observational standpoint, why we would choose a pro scout over the hardcore readers the Fans’ Scouting Report attracts.

So, the question is how much value does the observer actually add over the performance numbers.  And, as I’ve said in the past, it does improve the correlation of UZR year-to-year, from .50 to .60.  (Or similarly, UZR improves the Fans’ from .35 to .60.)


#3    James M.      (see all posts) 2006/09/22 (Fri) @ 17:16

Once again I tried TWICE to post a response without success.  I’m getting really tired of this!

Anyway, I agree with David.  And as evidence, look at the 2 sets of ratings for Juan Castro.  Twins fans rate him consistently (2.8-3.0).  And so do Reds fans (3.9-4.0).  But it sure looks like they are rating 2 different players!


#4    tangotiger      (see all posts) 2006/09/22 (Fri) @ 18:09

You probably missed the comment in the other thread, but Castro is an example of an exception proving the rule.  Here is what I said:

Looking at players from multiple teams (meaning two different sets of fans would have evaluated these players), there are about 20 of them, of which three had fans with wildly differing views.  The rest were pretty close.

They are:


Juan Castro: Twins fans think he’s a below average fielder, and Reds fans think he’s way above average.

Bobby Abreu: Phillies fans think he’s average, and Yanks fans think he’s very above average.  Yanks fans probably have no good baseline, in this case.

Xavier Nady: Mets fans think he’s way below average, and Pirates fans think he’s a bit above average.


#5    MGL      (see all posts) 2006/09/22 (Fri) @ 20:23

Well, I don’t think it is surprising that there is so much agreement among fans, as they are all watching the same thing and presumably seeing most of the teams’ games.

While reliability does not mean accuracy, it suggests that there might be considerable accuracy.  And without reliability, it would be doubtful that there would be much accuracy, so I think we are on the right track.  The reliability that Tango reports is important, since, as he states, it suggests that we don’t need many reports at all if we want to use fan scouting (assuming we find it accurate/helpful).

There is no doubt that fans are going to do a lot better than a professional scout, for 4 reasons.  One, they are seeing more games (again, presumably almost all of the teams’ games); two, there is a lot of collective wisdom when you have lots of fans, even 20 or so; three, fans do not necessarily have the age-old biases that the scouts have; and four, Tango is giving them a good methodology to follow and then he is compiling the data (if Tango gave the scouts some direction, they would probably do better).

BTW, this is not PC (not that I care), but I think the scouting profession is a joke, at least as compared to what it could/should be.  Why should it NOT be a joke, when the powers that be that run teams are a joke and they are the ones that hire, fire, and evaluate the scouts?

An example of how scouting is a joke, and not nearly what it is cracked up to be, even by analysts, is the Dodgers and Aaron Sele.  He cannot pitch at all.  That should be obvious to anyone who watches him throw an inning or two, let alone a scout.  He throws an 82-85 mph fastball, and a rolling curve that fools no one.  So why has he been pitching some important innings in a pennant race?  Because he pitched “well” (his numbers) for a few innings when the Dodgers took a flyer on him (and I suppose because he is a “proven veteran”).  Where are the scouts with Aaron Sele (or Jose Lima the last couple of years)?  Someone tell me that.  Scouts, schmouts.  Professional ones that is.


#6    tangotiger      (see all posts) 2006/09/22 (Fri) @ 20:34

I agree that scout data needs to be better compiled.  Of the teams I’ve talked to, the scouting data is “reactive” not “proactive”.  That is, what I’d want, is for all the data to be compiled in a useable form, and then I’d extract that data and generate useful reports.  This isn’t happening. 

They have this, what they call, database, of data and reports from the scouts.  But, just because you have something in an electronic form, and it’s in the same folder or application, doesn’t make it a database.  It’s more like a dataheap.

That said, if the database is well-constructed, we’d have stuff like Scott Rolen injuring his arm a few years ago, and then we’d be able to correlate that with his low UZR, and adjust for it.  Or, if you have the speed and arm scores of players, then you’d have a better mean to regress to.  (Brad Hawpe is the latest find from the Fans’ as a guy with a great throwing arm.) See, it doesn’t matter, at all practically, what UZR says about Rondell White’s throwing arm, or say Richard Hidalgo, or even this new character Hawpe.  White throws like a girl, and you’d regress his UZR 90% towards that fact, if not 100%.

Finally, in 10 years, the Fans’ data will have historical value.  None of the young ones would believe us if we talked about how great Gary Pettis was, and I don’t believe others who tell me about Paul Blair.  Now, we’ll have the record.


#7    studes      (see all posts) 2006/09/23 (Sat) @ 06:36

Regarding consistency, Tango, isn’t the validity of your analysis undermined by the availability of 2005 rankings?  I would hypothesize that a number of voters are going to those rankings before entering their own.


#8    Will      (see all posts) 2006/09/23 (Sat) @ 08:22

Think about who is voting in this. Most of these people probably got linked to the survey from one of the blogs that cover their team. They are regular readers of these blogs. I know that, at least, in the Nationals “blogosphere,” a very strong consensus usually develops with regard to any issue (player X’s performance, evaluation of trade Y). Not to say that there is absolutely no disagreement between blog authors, but there is probably less than there would be if each individual did their evaluations completely independently. In other words, a person’s opinion is shaped only partially by his own independent observation and analysis, but also by the consensus opinion. In voting the Scouting Report, they could be greatly influenced by this consensus opinion, which would give exactly the consistency you find. I wonder what would happen if you polled a group of hard-core fans who don’t follow their team on the internet? Would you get the same consensus, a different consensus, or no consensus?

I think perhaps that the Bobby Abreu ratings illustrate this. I haven’t seen Abreu play enough to conclude that he’s above average, but I would tend to trust the opinions of the Yankees fans more than the Phillies fans. For whatever reason, Phillies fans had a real strong dislike of Abreu. They formed some kind of consensus opinion, that Abreu was not a hard-worker or something, and they evaluate him with this consensus in mind. I can’t help but think that the low rating is due to his unpopularity. Hell, they would probably rate him as an average hitter if they had the chance.


#9    Will      (see all posts) 2006/09/23 (Sat) @ 09:11

For example, this blog post (http://dcbb.blogspot.com/2006/09/hooray-for-defense.html) which links to the survey, tells people exactly how to vote for a couple players. Now it’s hard to imagine how anyone could rate Jose Guillen’s arm or Jose Vidro any differently, but it seems that looking at someone else’s evaluation of a player is no different from looking at a player’s fielding stats, which is strictly forbidden.  And according to tango’s comment, that post provided a lot of ballots.


#10    studes      (see all posts) 2006/09/23 (Sat) @ 09:34

Yes, that’s a much better point than mine.  To me, the extreme consistency you’re seeing is a sign of homogeneity among the voters you’ve attracted so far.


#11    MGL      (see all posts) 2006/09/23 (Sat) @ 11:08

Good points (about fans being influenced by a common source).  The bottom line is how much do fan ratings help predict UZR (or Dewan’s ratings, or even ZR or Dial’s ratings) over and above any other metric, especially UZR itself.  We are assuming that UZR in the long run captures defensive value, which I think it does, excepting a few small things (pop-ups, relay throws, scoops at first, etc.), and accepting some minor biases and methodological problems with UZR and other advanced PBP metrics.

And as we always say for any performance, the less data we have, the more scouting becomes important, especially for things that can readily be “seen,” like arm strength, speed, etc.


#12    tangotiger      (see all posts) 2006/09/23 (Sat) @ 11:34

I find blogs rife with the opposite of what Will and studes are saying, that people have a certain viewpoint, and will fight with others, and not listen to others.

Even if they do, everyone will gravitate to the center.  That is, the mean I’d get would be the same, and only the variability has changed.  The point remains that the fans I get are representative of the hardcore fans I sought.  Whether they are true “Wisdom of Crowds” or not is up for debate.

The real test, the accuracy, will be when I run it against UZR or Dewan or Pinto.  If the fans’ evaluations are independent of say UZR, then their r-squared will add up perfectly.  We’ll see.


#13    David Smyth      (see all posts) 2006/09/23 (Sat) @ 13:28

It’s one thing to see Tango’s result of UZR improvement by adding the Scouting. It’s another thing to look at one’s own balloting experience, and realize that 20% was probably valid observation, according to the requested criteria, and 80% was not up to that standard, for several reasons. And that is for me and the Cubs, probably an above-avg example because of the number of their games I see, and my familiarity with sabermetrics.

So, if a flawed system (IMO) can improve UZR from 50% to 60% yty correlation, then what if the system can be improved?

OTTOMH, if I chose (or was assigned), 1 or 2 players on the Cubs for that season in advance, I could do a much better job focusing on them (and their positional counterparts for a baseline), instead of seeing Tango’s request for ballots near the end of the season and trying to reflect back on all the games I watched in the usual casual manner. Doing this, the number of ballots per player would be smaller--but the non-serious submitters might be weeded out better, and the submissions of those that remain would be of much better quality.

Just an idea.


#14          (see all posts) 2006/09/23 (Sat) @ 13:42

Well, that is an excellent idea!

After all, we’ve just determined that the difference between getting 20 fans (reliability of .97) and 60 fans (reliability of .99) isn’t that great. 

Therefore, what I can do then is ask for “superfans” to fill out even more detailed ballots, and perhaps hand them in on a monthly basis.  If I can get 2 or 3 for each team, we have, in essence the “professional scout” representation, to go along with the hardcore fan representation.

We can try it out for one year, and see how well the hardcore fans compare to the superfans.

Those interested can email me by clicking on my name.


#15    awsytn      (see all posts) 2006/09/23 (Sat) @ 14:09

As for the discrepancy between the Castro-Twin and Castro-Red ratings, I can say that the Twins half of the equation was probably colored by hardcore fans’ unanimous displeasure at Castro starting the year at short and Bartlett beginning the year in AAA. And subsequently, when Bartlett was recalled, he showed vastly superior range and glovework to what Castro had shown, who must have suffered by comparison in the ratings of those who submitted. I can’t account for the Reds’ half of it, though.


#16    studes      (see all posts) 2006/09/23 (Sat) @ 15:04

Tango, I have to disagree with you on this.  I don’t think that you can say, at this point, what the minimum sample size for consistency is.  There are too many unknowns in the info you’ve collected.  You’re asking a lot of your data.

If the fan’s poll matches UZR, that might mean that most of the respondents know UZR, or a similar system.  If it differs, there might be a systematic reason for that.  There are obviously biases in the data you’re collecting due to self-selection of those who fill in the survey; you just don’t know what they are yet.


#17    tangotiger      (see all posts) 2006/09/23 (Sat) @ 18:07

I’ll disagree back about the consistency.  The data is very clear on the consistency.  This may mean it’s very biased, but one thing it is, is consistent.

As for the correlation to UZR, that’s not true.  When I did this the first year, I did UZR year-to-year, and I got an r-squared of .25.  I did Fans-to-UZR, and got an r-squared of .12.  The test then was UZR+Fans to UZR (of year x+1).  If the UZR and Fans were truly independent, I’d expect an r-squared of .37.  I got .36.  What does this mean?  That UZR and the Fans each explained a separate part of the uncertainty, with virtually no overlap.  Therefore, the Fans results were not tainted by UZR knowledge.
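The additivity argument can be checked with the standard two-predictor formula, which gives the combined r-squared from the three pairwise correlations. In the sketch below the correlation between this year's UZR and the Fans' ratings (`r_12`) is not a number from the thread; the values tried are hypothetical, and the positive square roots assume both predictors correlate positively with next year's UZR.

```python
from math import sqrt

def combined_r2(r_y1, r_y2, r_12):
    """R-squared from regressing y on two predictors, given pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the target.
    r_12: correlation between the two predictors (their overlap).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

r_uzr = sqrt(0.25)   # UZR (year x) to UZR (year x+1), from r-squared of .25
r_fans = sqrt(0.12)  # Fans (year x) to UZR (year x+1), from r-squared of .12

# Hypothetical overlap values between UZR and the Fans' ratings.
for overlap in (0.0, 0.03, 0.2, 0.5):
    print(overlap, round(combined_r2(r_uzr, r_fans, overlap), 3))
```

With zero overlap the result is exactly .25 + .12 = .37. A small positive overlap pulls it slightly below that, which is the direction of the .36 reported, while a large overlap would pull it well below.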


#18    David Gassko      (see all posts) 2006/09/23 (Sat) @ 19:43

Regarding the Fans survey raising the correlation from .5 to .6, I suspect a large part of that might be player speed. That is, speed has a high correlation with fielding ability and so fluke years might be easily identified by accounting for a player’s speed. PECOTA, for example, uses speed scores in their fielding projections. So do fielding ratings without speed ratings add anything? I doubt it.


#19    MGL      (see all posts) 2006/09/23 (Sat) @ 21:42

Interesting about the speed thing, although I think that is much less true for the IF than for the OF.

Let’s not also forget that these r’s and r squared’s have lots of uncertainty.  When they are quoted, they should include a confidence interval.  I mean if the .25 is plus or minus even .1 (at 2 sigma) and the .36 is plus or minus .1, the “almost complete independence” has to be taken with a large grain of salt.
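One common way to put an interval around a correlation is the Fisher z-transformation. The sketch below is an illustration only: it assumes roughly bivariate-normal data, uses 1.96 as the "2 sigma" multiplier, and applies the method to the split-half figures from the original post (16 Twins players, 13 Pirates players), since the sample sizes behind the UZR r-squared values are not given in this thread.

```python
from math import atanh, sqrt, tanh

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r from n paired points."""
    if n <= 3:
        raise ValueError("need more than 3 points")
    z = atanh(r)                    # Fisher z-transform of r
    half_width = z_crit / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)

lo_twins, hi_twins = fisher_ci(0.96, 16)      # Twins: r = 0.96 over 16 players
lo_pirates, hi_pirates = fisher_ci(0.90, 13)  # Pirates: r = 0.90 over 13 players

print("Twins:  ", round(lo_twins, 2), "to", round(hi_twins, 2))
print("Pirates:", round(lo_pirates, 2), "to", round(hi_pirates, 2))
```

Even with a point estimate as high as 0.90, an interval built on only 13 players is wide, which is the caution being raised here.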


#20    Guy      (see all posts) 2006/09/24 (Sun) @ 04:34

"I’ll disagree back about the consistency.  The data is very clear on the consistency.  This may mean it’s very biased, but one thing it is, is consistent.”

But I think Studes’ point is that consistency and bias can be related.  That is, the responses of your participants may be consistent in part because they share a bias, whether created by prior ratings, blogs, conventional wisdom, or whatever.  So you can’t know yet how consistent the ratings will be when you have an unbiased sample of participants. 

* * *

The fan ratings may simply be adding a rough kind of “career UZR” to the model.  I doubt fans can totally separate their impressions gained from prior years from current ratings (assuming you want them to).  It would be interesting to see if fan rating adds anything to a prediction model that uses the last 2-3 years of UZR for a player.


#21    Joe Arthur      (see all posts) 2006/09/24 (Sun) @ 05:27

1) as with James M a couple days ago here, people from time to time have complained about losing their comments; I’ve had that problem too, but noticed that it seemed to happen with comments which I previewed multiple times before trying to submit. I don’t think I’ve had a problem if I’ve previewed only once.
2) While I believe that there is some “wisdom in crowds”, I too think the great agreement among fans is likelier to reflect something other than their truly independent observations. Years ago I read a book called “Eyewitness Testimony” by Elizabeth Loftus (still available from the big online sellers). She described experiments such as something like this:
subjects watch video of cars colliding. some are asked “how fast was 2nd car going when it bumped 1st car”, some asked “… when it hit 1st car”, some asked “… when it crashed into 1st car.” From the same video, with the differently phrased questions, the answers revolved around 15 mph, 25 mph, 35 mph.
Obviously that’s an example of survey bias, and I certainly don’t mean to imply that Tango’s survey is biased in that way. But the point is that pre-conceptions from the opinions of others can have a very large impact on our evaluation of what we have seen (and maybe also on what we are about to see).
I think it would be interesting if Tango could try to dig out rookies only and see how consistent their ratings were. Presumably there would be less pre-existing consensus available to influence fans’ interpretation. Even so, to the extent that fans are judging mostly from TV [like me], rather than live, the comments of the regular broadcast crew could be shaping opinion too.
3) I’m not an expert on survey data or techniques, but my wife knows a lot about it, so I’ve picked up a little knowledge (misinformation?) by accident. In surveys which are attempting to be sophisticated, the same information will be requested multiple times with different questions or in different ways. For instance, a respondent’s snap judgment about player X might be colored by his/her responses about immediately preceding player Y. Multiple versions of the survey form, in which X follows a random mix of different players, might produce different results for X.


#22    tangotiger      (see all posts) 2006/09/24 (Sun) @ 09:17

I like the idea about rookies.  I can tell you that professional scouts were not all high on Betancourt and Francoeur two years ago (they changed their reports a few months at a time), but fans last year were pretty unanimous.  I’ll look again this year.

As well, one important consideration is that I have 7 categories.  So, even if an Angel fan had heard about Vlad’s great arm, it’s not like they knew about his range, instincts, release, and hands.  And it’s not like people talk about those individual things often enough to shape a person’s opinion.

Fans do have clear bias with a long-standing member.  Just this year was the first time Reds fans realized that Junior was average at best.  They are in denial with their heroes, which was clear to me when I ran a correlation against UZR and age.


#23    tangotiger      (see all posts) 2006/09/24 (Sun) @ 09:18

As well, any other ideas/issues/complaints, please bring them up.  I’ll try to do something comprehensive, maybe getting the Pinto, Dewan and MGL data as well, to make one super report, on all aspects of this.  So, things like age-bias or experience-bias would be one of many.  Changing teams, etc.  Just keep them coming.


#24          (see all posts) 2006/09/24 (Sun) @ 11:40

Tango,
Shouldn’t your measure of “reliability” be the variance explained in all of your datapoints? You’re not calculating the variance of all the datapoints, you’re just calculating the variance between the average of two halves of your data. Couldn’t you stick every datapoint into SPSS, and run a repeated measures ANOVA, and report the variance explained by the model?
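As a rough idea of what "variance explained" over every datapoint could look like, here is a simplified sketch. It is a plain one-way decomposition (share of rating variance that lies between players, often called eta squared), not the repeated measures ANOVA mentioned above, which would also model each voter. The ratings are made up for illustration.

```python
def variance_explained(ratings_by_player):
    """Share of total rating variance that lies between players (eta squared)."""
    all_ratings = [r for rs in ratings_by_player.values() for r in rs]
    grand_mean = sum(all_ratings) / len(all_ratings)
    ss_total = sum((r - grand_mean) ** 2 for r in all_ratings)
    ss_between = sum(
        len(rs) * (sum(rs) / len(rs) - grand_mean) ** 2
        for rs in ratings_by_player.values()
    )
    return ss_between / ss_total

# Hypothetical individual ballots, not data from the survey.
example = {
    "Player A": [5, 5, 4],
    "Player B": [2, 3, 2],
    "Player C": [3, 4, 3],
}
print(round(variance_explained(example), 2))
```

A value near 1 would mean voters mostly agree about each player; a value near 0 would mean the spread is mostly disagreement among voters about the same player.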

There are tons of psychological factors that are well known that could bias these results. Joe mentioned one. There are sooo many more.  It’s essentially like not collecting hitting statistics, and instead trying to figure out BA by taking the average guess across a lot of fans.

But one way you might measure the accuracy of your data is by seeing how independent the different measures are.  There really shouldn’t be any correlation between instincts and arm strength.  But my guess is that they will be correlated: people think that player X is good, so they make up component scores to reflect that, thinking “He must have good instincts. He’s great!”

Going back to my analogy: If you had hardcore fans who never saw a hitting statistic, they would probably inflate their BA predictions for players with high slugging averages, cause “damn, can that guy hit!”


#25    tangotiger      (see all posts) 2006/09/24 (Sun) @ 12:39

I think there’d have to be some correlation.  I would bet that batting average, walks per PA, and isolated slugging (SLG-BA) would all be correlated.

I’m sure instincts and arm strength would also have some correlation, though certainly not as much as “first step” and “speed”.

As for your first point, I have no idea what you are talking about!  I’ll be happy to release the data collected, minus the name of the voter, to any researcher who wants it, and be able to report on it.  I did in fact do that for the 2005 data, but that researcher hasn’t released any results yet.


#26    tangotiger      (see all posts) 2006/09/25 (Mon) @ 07:31

Taking the 2225 player-team-seasons with at least 400 AB+BB since 1994, here is the correlation between these three:
walkRate: BB/(AB+BB)
powerRate: (2B+3B+HR)/AB
battingAvg: H/AB

walk to power: r=.41
walk to BA: r=.14
power to BA: r=.42

XBH/H to H/AB has an r=.04

In any case, if you are good in one thing, you are more likely to be good in another thing.  A player’s toolset would not be randomly allocated.  Indeed, that we have such clearly great fielders and clearly poor fielders shows that there indeed must be some correlation among each of the specific tools. (Unless one tool is so important as to dwarf all the others, and thereby give you the reason as to why you have standouts).


#27    tangotiger      (see all posts) 2006/09/25 (Mon) @ 08:31

Other biases:
- are some teams’ fans more biased than others?  does that team’s record impact the fan bias?  For example, the average score is around 3.3, when it should be 3.0 if there was no bias.  So, we definitely have a home-team skew. Obviously, for a team like the A’s it should be above average.

- can fans take the position-bias into account?  Can a fan treat the speed of Minky at 1B in relation to all fielders, and not just the fielders he sees at 1B?  Even if the bias exists, this would just invalidate, to some extent, the cross-positional comparisons, but not the intra-positional comps.

- do fans love/hate certain players?  Clearly, Kaz Matsui is one case.

- do experience, age, or longevity on the same team introduce a bias?

- do fans remember the highlight play, and discount the majority of plays?


#28    tangotiger      (see all posts) 2006/09/25 (Mon) @ 11:36

The other point of the Fans’ Scouting is to do away with relying on some self-imposed expert.  All I heard about Kaz Matsui was how great a fielder he was, and in training camp, all the NY reporters were saying what a great DP combination Kaz and Reyes were, and how good Kaz was, etc.  Even now, when you read about a trade, you get the “analysis” of the player’s fielding.  You get all these articles that talk about the fielding talents of players.

All of that is b.s.

There is no reason to think that any single person can be more trusted than a group of people.  Unless that self-imposed expert is a true professional and put in the required time on said player, then why should I listen to him?  The Fans’, at worst, would be just as good as these self-imposed experts.  At best, they provide the other half of what UZR, Dewan, and Pinto wished they had.


#29    studes      (see all posts) 2006/09/25 (Mon) @ 13:21

Tango, I think Guy understood my point well.  I also think this is a worthy cause and I do think it’s worth pursuing.  Thanks for putting so much time into it.  But your first post basically said that we only need four responders to get “reliable” results. 

There is obviously self-selection among those filling in your survey, and “reliable” is different from “consistent.” You’re getting consistency.  I don’t believe you can yet prove that you’re getting reliability.


#30    tangotiger      (see all posts) 2006/09/25 (Mon) @ 14:17

Definitely semantical.  I’m treating the people who responded to my survey as the population, or universe of people.  Of those people, how many do I need to get a reliable sample, so that this sample is representative of the population?  So, of the people I am targeting, I only need to get six of them to be representative of the population.  As I’m defining the terms, reliable is consistent.

As I’ve said, whether the population of these people are accurate is another story.


#31    James M.      (see all posts) 2006/09/25 (Mon) @ 18:05

do fans remember the highlight play, and discount the majority of plays?

Of course they do, and so they should.  Most plays are routine for the vast majority of major leaguers.  So rating based on all plays results in a very tight distribution around 3.0.  But if you rate only on the most difficult plays you can reasonably expect scores ranging from 1 to 5.

So here’s how I’d rate fielders.  Record BBTN’s Web Gems every day.  At the end of the season total them up for each player and divide the result into the players total chances at each position.  Of course a lot of players will never get a Web Gem so they won’t get a rating.  But that’s fine; we know they’re below average.  Of course it would be better if we had all Web Gems, not just the ones ESPN selects.  You could have a group of hardcore fans from each team choose and rate the top 3 plays in each game for BOTH teams and have 2 independent samples to compare.


#32    MGL      (see all posts) 2006/09/25 (Mon) @ 18:23

While “webgems” are a nice sample of plays that 80-99% of players do not make, there are a million (not literally) other plays in between “routine” and the “webgems.” You certainly want to notice those as well.  If you only use webgem plays, you are missing out on a ton of relevant samples of performance.  Probably for every webgem, there are 10 plays that 50% of fielders make, but that certain fielder does not make, or does make.  Even though a webgem gives you an opportunity to credit a fielder with almost a full play (maybe .75 runs), there are so few of them that they really don’t have that much importance.  Most of the important data (and observations) are going to be the plays that the average fielder makes 25-75% of the time, simply for the number of times they occur.


#33    Sky      (see all posts) 2006/09/25 (Mon) @ 18:45

Plus, some fielders just tend to make the same play look flashier than others.  Dives make up for bad first steps.  Jump throws get more attention than planting and throwing.  And good pre-pitch positioning turns a potentially difficult play into a routine one.


#34    tangotiger      (see all posts) 2006/09/25 (Mon) @ 19:08

If someone wants to record the webgems, that’s fine.  I’m not interested in it.


#35          (see all posts) 2006/09/26 (Tue) @ 09:12

Hi Tango,

If you send me the data, I’d be happy to plug it into SPSS and run the analysis I mentioned.  My email is in the signature at the bottom.

Also, your powerRate = (2b+3b+hr)/AB, and your battingAvg = h/AB = (1b+2b+3b+hr)/AB = powerRate + 1b/AB.  Of course those are highly correlated.  Would power and BA still be correlated if they weren’t based on the same data? Like:

battingAvg = h/AB
powerRate = (2b+3b+hr)/h, or ISO

Regardless, I’d predict that the Fans Fielding Report will inflate the correlation between components, regardless of the existing correlation (not sure why you think instincts and arm strength are correlated, but ok...), because they are biased by an overall, gist sense of how good a fielder a player is.
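
To illustrate the shared-data point about powerRate and battingAvg, here is a minimal Python sketch. The four hitters and their counts are made up for illustration (not real players), chosen so that singles and extra-base hits are unrelated to each other. Even so, powerRate correlates with battingAvg simply because it is one of the pieces that battingAvg is built from.

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL hitters: (AB, singles, extra-base hits). Made-up numbers.
hitters = [(100, 15, 5), (100, 15, 10), (100, 20, 5), (100, 20, 10)]

singles_rate = [s / ab for ab, s, x in hitters]        # 1b/AB
power_rate = [x / ab for ab, s, x in hitters]          # (2b+3b+hr)/AB
batting_avg = [(s + x) / ab for ab, s, x in hitters]   # h/AB = powerRate + 1b/AB

r_parts = pearson_r(singles_rate, power_rate)   # the two parts are unrelated here
r_shared = pearson_r(power_rate, batting_avg)   # but power is a piece of BA

print(round(r_parts, 2), round(r_shared, 2))
```

In this toy setup the two parts have no correlation with each other, yet powerRate and battingAvg come out strongly correlated, which is why a measure like XBH/H is the fairer comparison.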


#36    tangotiger      (see all posts) 2006/09/26 (Tue) @ 09:30

In my earlier post, I did say the following:

XBH/H to H/AB has an r=.04

***

I’ll send you the data tonight.  What I’m going to do is send you “completed” players, which is those fans who evaluated all 7 traits for a fielder, and at least 8 such fielders on a ballot, with a minimum of 15 such ballots per team.  Otherwise, you will get holes in the data.


#37    tangotiger      (see all posts) 2006/09/26 (Tue) @ 10:27

I’m running a series of regressions.  The first one I did was the one that the fans would have the easiest chance to distinguish: speed, and arm strength.  I think it’s rather easy to pick out these traits, since they’re on display all the time, and they are fairly context-neutral.  The correlation (r) was 0.29. 

How about strength to all the others?
Instincts: .51
First step: .43
Speed: .29
Hands: .44

Release: .64
Accuracy: .70

What to make of it?  Certainly, fans seem to be biased to their overall assessment of a fielder, and bump up their scores.  If we consider arm strength and speed as the pair where the least amount of correlation was expected (and, that’s what we got), it serves as a useful floor in assessing how much correlation there is among the traits, as well as serving as a bias estimator in fans’ responses.

And, as expected, the two highest traits that correlated to arm strength were the two other throwing categories.  Again, as expected.

How about repeating the same exercise, but from the Speed view?
Instincts: .44
First step: .87
Hands: .32

Release: .35
Strength: .29
Accuracy: .27

A similar pattern is repeated here.  There seems to be a base of around a correlation of .30, when we’d expect something closer to the .00 - .10 level.

How about the last one that may be the easiest to pick out, and that’s Hands.
Instincts: .88
First step: .63
Speed: .32

Release: .85
Strength: .44
Accuracy: .73

The one we expected the highest on, the release, is pretty much what we got. 

The question of course is how much of a correlation should we have expected? 

At a group level, we get results that make sense.  For example, the highest speed scores belong to CF followed by SS, and the lowest are C, preceded by 1B.  In arm strength, RF, SS, and 3B lead the group, with 1B and LF at the bottom.  Pretty much all the positions come out where you’d expect.  (Except maybe C, and that may be an issue unto itself.)

So, the overall pattern seems to be about right, but there may be too much of a fan relying on some overall assessment of a player, that biases his results. 

In the end, the final results probably wouldn’t really change.  All it means is that the fans think “too much” of a good fielder and “too bad” of a bad fielder.  Their magnitude may be too wide, but that is corrected within the scope of the results anyway (forcing the mean at 50, and SD at 20).  It doesn’t look like a particular type of player (fast, strong, surehanded) is more biased than the other.
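
The point about forcing the mean to 50 and the SD to 20 can be sketched in a few lines of Python. This is only an illustration: the raw scores are made up, and using the population SD (rather than the sample SD) is an assumption, since the thread does not say which one is used. The sketch shows that if fans stretch every player’s score away from the average by the same factor, the rescaled numbers do not change.

```python
from statistics import mean, pstdev

def rescale(scores, target_mean=50.0, target_sd=20.0):
    """Linearly rescale scores to a chosen mean and (population) SD."""
    m, sd = mean(scores), pstdev(scores)
    return [target_mean + target_sd * (s - m) / sd for s in scores]

# HYPOTHETICAL average ratings on the 1-5 scale.
raw = [2.3, 3.1, 3.8, 4.4]

# Same players, but every score pushed 50% further from the group mean,
# mimicking fans who think "too much" of good fielders and "too bad" of bad ones.
m = mean(raw)
stretched = [m + 1.5 * (s - m) for s in raw]

scaled_raw = rescale(raw)
scaled_stretched = rescale(stretched)
print([round(x, 1) for x in scaled_raw])
print([round(x, 1) for x in scaled_stretched])
```

Both lists print the same values, so a uniform exaggeration in magnitude washes out. A bias that hits one type of player more than another would not wash out this way.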


#38    tangotiger      (see all posts) 2006/09/26 (Tue) @ 12:57

This is what Tom Verducci said about Bill Mueller, Dodgers:

Good point about Mueller. The guy’s also a very good fielder.

This year (2006), the Fans’ (only 8 of them) gave Mueller a 37 (league average fielder is a 50).  Last year, 119 Sox fans gave him a 61.  In 2004, it was a 60.

There are four Dodgers with playing time at 3B this year, and the Dodger fans ranked them as: Izturis, Betemit, Aybar, Mueller.

If you look at their ZR, Mueller has the worst of the bunch (.750 compared to .78x to .79x).

I wouldn’t be surprised that this is another case of the fans being right on top of the situation, that Mueller was once a very good fielder, but something happened this year (maybe a Dodger fan can chime in).  And the Dodger fans had the chance to evaluate four guys at that position and they seem to have picked the right ones.


#39    tangotiger      (see all posts) 2006/09/26 (Tue) @ 13:37

I see Mueller’s been hurt (knee injury).


#40    tangotiger      (see all posts) 2006/09/27 (Wed) @ 08:16

I have the team totals for UZR, and I compared it to the team results from the Fans’ (note: I removed COL, and I have the catchers in my report).

Anyway, at the team level, the correlation (r) is .25.  According to the Fans’, the three best fielding teams are A’s, Mariners, and Twins.  UZR has them 9th, 11th, and 16th.

The worst-fielding teams according to the Fans’ are: Cubs, Reds, Indians, Nats. UZR agrees strongly with the Reds/Nats, has the Indians as average, and the Cubs as the 3rd best fielding team in the league.

In terms of the Fans’ being the most deluded, relative to UZR: Rangers, Red Sox, A’s, Marlins.

In terms of the Fans’ being overly critical, relative to UZR: Cubs, BY FAR, D’Backs, Royals, Indians.  All these teams are either at the bottom of their division, or have a W/L record below expectations.

I hope David S. is around, since he’s a Cubs fan.  If I convert the “Fan” rating into UZR, the average difference between the Fan evaluation and UZR is 14 runs, PER POSITION.  This is simply enormous.

Looking at each Cub, this is what the Fans are saying:
C: Blanco is above average, Barrett is below (overall around average to below)

1B: Lee is the best fielding 1B in the league, and the other guys (Mabry, Walker, Nevin) are all below average for a 1B (overall around average to below)

2B: Neifi a bit over average, the rest of the crew (Hairston, Womack, Walker) all below average for a 2B (overall below)

SS: Cedeno below, Neifi avg, Izturis above average (overall around average to below)

3B: Ramirez a bit below average

LF: Murton below average

CF: Pierre average to below average

RF: Jones below average

So, a clean sweep, for every position, at best average.

Hardball Times is showing the team DER (a combination of pitching, fielding, park) at above average.  They are also showing “Plus/Minus” (which may be Dewan’s stat??), with the Cubs way above average.  (Studes, when did you roll this out?  This is great!).

So, are the Cubs fans simply being overcritical at the disappointment of their team, or is it a fair assessment of their players?


#41    studes      (see all posts) 2006/09/27 (Wed) @ 12:37

Hey Tango,

The fielding stats we print for teams (I rolled them out in the beginning of the year) are based on how often each team fields each batted ball type.  When I ran overall DER against Dewan’s team ratings last year, I got an R squared of .5.  When I correlated the batted ball/DER stats, the R squared rose to .7.

Here’s the URL to the article when I first announced them:

http://www.hardballtimes.com/main/article/ten-things-i-didnt-know-last-week22/


#42    tangotiger      (see all posts) 2006/09/27 (Wed) @ 13:09

An r-squared of .7 means an r of .84, which is extremely high.  I’m surprised that it could be that high, so that’s very interesting.
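
For anyone checking the conversion, a quick Python sketch of the arithmetic. Note that r-squared only gives the size of r, not its sign; treating the sign as positive here is an assumption that fits two fielding measures that should agree.

```python
import math

def r_from_r_squared(r_squared):
    """Magnitude of the correlation implied by an r-squared value."""
    return math.sqrt(r_squared)

for r2 in (0.5, 0.7):
    print(r2, "->", round(r_from_r_squared(r2), 2))
```

So the jump from an R squared of .5 to .7 is a jump in r from about .71 to about .84.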


#43    studes      (see all posts) 2006/09/27 (Wed) @ 13:25

I double-checked, and that is correct.  It’s only based on 2005 stats.


#44          (see all posts) 2006/09/27 (Wed) @ 13:35

Tango provided me with the data this morning, to look at the reliability of the scores across raters.  The data was comprised of 7211 ratings of 7 components (for 50,000+ datapoints). 618 different raters evaluated 246 MLB players in this dataset.

I ran a multivariate GLM, including 2 fixed factors (without interaction terms)--the player who was doing the rating, and the MLB player who was rated--to predict each of the 7 components. Adding the former factor accounts for between-rater variance, but is a factor with 618 levels. Since that can easily overfit the data, I’ll report analyses with the factor, and without. The true variance explained falls somewhere in the middle.

Variance explained (w/ the model including a rater factor in parentheses):

component, R2 (R2), STD err
INSTINCT: .500 (.605), .79
FIRSTEP : .516 (.624), .81
SPEED : .647 (.728), .70
HANDS : .479 (.580), .73
RELEASE : .463 (.585), .74
STRENGTH: .560 (.647), .70
ACCURACY: .460 (.568), .73
OVERALL : .636 (.758), .48

The STD err is the standard deviation of the error. The GLM estimates the significance of each level of the two factors, and creates estimates of each datapoint, taking into account only those two factors. The error term essentially captures the variance between fans in rating each player.

So if we were to randomly pick a player, and then randomly pick a fan, and try to guess what rating that fan would give to that player, we would be within 1 SD roughly 68% of the time. Thus, 68% of the time, we could guess the fan’s rating of a random player to +- .48 points on the 5 point scale.

Turning that around, if we were to guess what a fan’s rating of a particular player was, we would be off by a full point or more on tango’s 5 point scale roughly 3% of the time for the overall rating, and roughly 21% of the time for the “First Step” component.
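
Those percentages can be reproduced with a short Python sketch, assuming the errors are roughly normal with mean zero (an assumption; the thread does not show the actual error distribution). The function name is just illustrative.

```python
import math

def prob_off_by_at_least(miss, sd):
    """P(|error| >= miss) for a zero-mean normal error with the given SD."""
    z = miss / sd
    return math.erfc(z / math.sqrt(2))

within_one_sd = 1 - prob_off_by_at_least(1.0, 1.0)   # the "68%" rule
p_overall = prob_off_by_at_least(1.0, 0.48)          # OVERALL, STD err .48
p_first_step = prob_off_by_at_least(1.0, 0.81)       # FIRSTEP, STD err .81

print(round(within_one_sd, 3), round(p_overall, 3), round(p_first_step, 3))
```

Under that normality assumption the results land in the neighborhood of the figures quoted above: a few percent for the overall rating and a bit over one in five for First Step.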

I’m not sure why there is so much less variation for the overall rating than the individual components.  It could be an artifact of the design: the components are rated on a 5 point scale, but the overall rating, as a sum of the 7 5 point scales, has 35 possible values.

Of course, that doesn’t address the issue of validity, or bias, but I think that:
1) the overall rating is relatively reliable
2) some of the individual components (e.g., first step) are significantly less reliable.


#45    tangotiger      (see all posts) 2006/09/27 (Wed) @ 14:17

cdm, thanks for running that.  Can you explain what you are talking about in paragraph 2?

***

Those SDs are what I’ve gotten every year, around .70 to .80 in each category.  The maximum SD that I should have gotten is sqrt(2), or 1.41, if a fan was simply throwing a dart on the board (uniform distribution) for each trait.  So, we know they are not doing that.
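
The sqrt(2) figure is just the SD of a rating that is equally likely to be 1, 2, 3, 4, or 5. A minimal Python check:

```python
import math
from statistics import pstdev

# A "dart-throwing" fan: every rating from 1 to 5 is equally likely.
ratings = [1, 2, 3, 4, 5]
sd_uniform = pstdev(ratings)

print(round(sd_uniform, 2), round(math.sqrt(2), 2))
```

(Strictly speaking, ratings split evenly between 1s and 5s would spread wider still, with an SD of 2; the dart-throwing case is the benchmark being used here.)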

***

As for the lack of variation in the overall figure, can I assume that our expectation should have been: sqrt(0.79^2 + 0.81^2 + 0.7^2 + 0.73^2 + 0.74^2 + 0.70^2 + 0.73^2) = 1.97, if all the traits were completely independent, and .70 if they were positively dependent?  That the number is .48, doesn’t it show that these categories are negatively correlated?

(I’m assuming you did “sum” and not “avg” for the OVERALL?)

***

Interesting that your r-squared for SPEED, and for OVERALL are very close.  Does that suggest that there is a random distribution of the other six-summed categories around SPEED?


#46    tangotiger      (see all posts) 2006/09/27 (Wed) @ 14:30

cdm, I’m using a different dataset, and if I use a “sum” for the overall, the SD of the players is around 4.  But the “average” of the 7 is a bit over .50.  So, I think your .48 figure must be the average.

If that’s the case, then the 1.97 figure I quoted should be divided by 7, for a figure of .28.  That’s if everything was random.  And .74 if they were completely dependent.

Since you are reporting a figure of .48, that must show the level of interdependence among the 7 traits, right?
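
Here is the arithmetic from the last two posts as a Python sketch. The first three numbers come straight from the reported SDs. The last step is an added assumption for illustration only: it supposes a single common correlation (called rho here) between every pair of traits and solves for the value that would produce an SD of .48 for the average.

```python
import math

# Residual SDs of the 7 traits as reported earlier in the thread.
sds = [0.79, 0.81, 0.70, 0.73, 0.74, 0.70, 0.73]
k = len(sds)
sum_sq = sum(s * s for s in sds)

sd_sum_indep = math.sqrt(sum_sq)   # SD of the SUM if traits are independent
sd_avg_indep = sd_sum_indep / k    # SD of the AVERAGE if independent
sd_avg_dep = sum(sds) / k          # SD of the AVERAGE if perfectly correlated

# ASSUMPTION: one common correlation rho between every pair of traits, so
# Var(avg) = [sum(sd_i^2) + rho * sum over i != j of sd_i * sd_j] / k^2
observed_sd_avg = 0.48
cross = sum(sds) ** 2 - sum_sq     # sum over i != j of sd_i * sd_j
rho = (observed_sd_avg ** 2 * k ** 2 - sum_sq) / cross

print(round(sd_sum_indep, 2), round(sd_avg_indep, 2),
      round(sd_avg_dep, 2), round(rho, 2))
```

Under that equal-correlation simplification, the .48 sits between the independent case (.28) and the fully dependent case (.74), and corresponds to a common correlation of roughly .3.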


#47    David Smyth      (see all posts) 2006/09/27 (Wed) @ 17:34

As far as the Cubs and their excellent plus/minus Dewan rating--I agree with the Fans almost completely, that they were mediocre. To rank very high, a team should have several good to very good regular defenders. The Cubs have two--D Lee (obvious to me that he is excellent, despite the odd mediocre rating from Dewan), and C Izturis. The problem is that Lee only played 40 (or whatever) games in the field, and Izturis only a handful for the Cubs. So their contribution was pretty minimal, even if they were performing at their usual outstanding levels, which may not be the case because of injury problems.

The rest of the fielders were a mix of avg, slightly above avg, and slightly below avg. They are not going to propel the team rating significantly one way or the other.

Dewan’s system is supposed to be almost the same as UZR. Despite the underlying logic of these metrics, the intelligence of their creators, the numbing detail of their data, and their reasonable yty correlations, it’s results like this (the Cubs) which make me still skeptical.

Of course, after a statement like that, the question is whether I am wrong to be trusting my eyes over the UZR data. In a post on this forum yesterday MGL discussed why such is likely to be folly. But I am not watching webgems, or errors-- I am watching 80% of the plays over the Cubs season.

Of course, just my opinions.


#48          (see all posts) 2006/09/27 (Wed) @ 17:52

(try 2; apologies if this posts twice)

Sorry, I misspoke. Overall was the average of the other 6, to keep it on a 1-5 scale.  But the STDs are the standard deviations of the residuals.

***

I’ll try to be clearer about the methodology…
Essentially, the multivariate general linear model is very similar to a univariate regression, repeated multiple times. I gave it a 7x7211 matrix of ratings (Y), and had the GLM estimate parameters for each player, and each fan, such that the linear combination of the two creates a predicted dataset, of size 7x7211, that is as close to the original data as possible.

So for each row of your data, a single player was being rated by a single fan. I gave the GLM both of those pieces of information, and had it estimate parameters for each fan and each player. 

The “fan” factor, however, is a little sketchy. It soaks up differences between fans. So if one fan ranks everyone high, and another ranks everyone low, that variance gets explained by this variable (this is equivalent to running a repeated-measures analysis).  However, since there were 618 fans doing the rating, this factor had 618 levels. This is essentially like including 618 variables, and opens the door to overfitting the data. 

So I ran two models. One with only a “players” factor, and a second with both a “players” factor and a “fan” factor. The analysis with only a “players” factor doesn’t try to account for differences between fans, and is thus more conservative. The less conservative R^2 were listed in parentheses.
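
A toy Python sketch of the two models, on made-up ballots (hypothetical fans and players, not the real data). It is not the SPSS analysis itself. For the players-only model the least-squares fit is just each player’s mean rating. For the players-plus-fans model, the shortcut used below (player mean + fan mean - grand mean) is only the least-squares answer because this toy data is balanced, with every fan rating every player; the real ballots are not balanced and would need a proper regression fit.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL ballots: (fan, player, rating on the 1-5 scale). Made up.
ballots = [
    ("fanA", "p1", 5), ("fanB", "p1", 4), ("fanC", "p1", 5),
    ("fanA", "p2", 3), ("fanB", "p2", 2), ("fanC", "p2", 3),
    ("fanA", "p3", 2), ("fanB", "p3", 1), ("fanC", "p3", 1),
]

def group_means(rows, key_index):
    """Mean rating per fan (key_index=0) or per player (key_index=1)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {k: mean(v) for k, v in groups.items()}

grand = mean(r for _, _, r in ballots)
fan_mean = group_means(ballots, 0)
player_mean = group_means(ballots, 1)

sst = sum((r - grand) ** 2 for _, _, r in ballots)

# Model 1: "players" factor only. Prediction = that player's mean rating.
sse_player = sum((r - player_mean[p]) ** 2 for _, p, r in ballots)

# Model 2: "players" + "fan" factors, no interaction (balanced-data shortcut).
sse_both = sum((r - (player_mean[p] + fan_mean[f] - grand)) ** 2
               for f, p, r in ballots)

r2_player = 1 - sse_player / sst
r2_both = 1 - sse_both / sst
# Residual SD with no degrees-of-freedom correction, for simplicity.
resid_sd_player = (sse_player / len(ballots)) ** 0.5

print(round(r2_player, 3), round(r2_both, 3), round(resid_sd_player, 2))
```

As in the real results, adding the fan factor raises R^2, because it soaks up the fact that some fans rate everyone high and others rate everyone low.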

*****

I can’t say what my expectation was for the STD of the error… I’d have to think about the math for a minute to fully understand your equation.

Regarding the figure of .48 suggesting the interdependence of the 7 traits, I think you’re right.  I imported the data into MATLAB and ran a principal components analysis. 

The first component loads on everything moderately. It explains 55% of the variance. 

The second component seems to be the dimension between guys with lots of speed, and a good first step vs. guys with a good arm. That explains an additional 17% of the variance.

The third, which explains 10% of the variance, is a bit less intuitive, but seems to be a dimension between guys with instincts and hands vs. guys with tools (speed + strength)

Instincts: -.43 .06 -.51
Firstep : -.44 -.44 -.05
Speed : -.35 -.66 .31
Hands : -.36 .17 -.41
Release : -.37 .30 -.03
Strength : -.32 .32 .64
Accuracy : -.33 .38 .23
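
To see why a first component that “loads on everything moderately” is what you would expect when all traits are positively correlated, here is a toy Python sketch. It does not use the real data: it builds a hypothetical 7x7 correlation matrix in which every pair of traits correlates at .5 (an assumed value), and finds the first principal component with plain power iteration instead of a library PCA routine.

```python
import math

def top_component(matrix, iters=100):
    """Power iteration: leading eigenvalue and unit eigenvector
    of a symmetric matrix given as a list of lists."""
    n = len(matrix)
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient
    return eigenvalue, v

# HYPOTHETICAL correlation matrix: 7 traits, every pair correlated at 0.5.
K, RHO = 7, 0.5
corr = [[1.0 if i == j else RHO for j in range(K)] for i in range(K)]

eigenvalue, loadings = top_component(corr)
share = eigenvalue / K  # trace of a correlation matrix = number of traits

print(round(share, 2), [round(x, 2) for x in loadings])
```

In this toy case the first component explains a bit over half the variance and puts an equal, moderate weight on every trait. The sign of a component is arbitrary, so all-negative loadings like those in the table above describe the same “good at everything vs. bad at everything” dimension.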

I don’t have player names; just player numbers. I guess I will look for some playerIDs that fall very high along these dimensions (like a player that loads very highly on the “speed” factor), and send them to you, along with predictions as to who I think they may be. It’d be cool to see if I can make that prediction based on only these 3 dimensions. That would be a cool illustration of the reduced dimensionality of the data.


#49    tangotiger      (see all posts) 2006/09/27 (Wed) @ 19:06

The player IDs are the STATS IDs.  So, you can go to CNNSI, or I think Yahoo, and just play with the URL, once you bring up a player’s page.


#50    tangotiger      (see all posts) 2006/09/27 (Wed) @ 19:34

David, it’s results like the ones the Fans’ are giving me, and objective/trusted fans like you, that make me believe there are gaping holes in PBP metrics.  The entire basis of PBP metrics is that they attempt to quantify the “easiness” of a play, so that they can assign a value between 0 and 1 if you make the play (and between 0 and -1 if you don’t).  Misjudging something, consistently, will cause problems, as we see here with the Cubs.  Or, it could be sampling.  I believe that since over half the plays are routine, PBP metrics are having a tough time trying to set these plays as “0” or “1”, so that they can, in essence, be discarded.
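
A rough Python sketch of how a consistent misjudgment on routine plays could add up. The crediting rule below (1 - p for a play made, -p for a play missed, where p is the share of fielders who make that play) is the usual plus/minus convention and is an assumption here, not a description of any specific system’s internals. The fielder and all the numbers are hypothetical.

```python
def play_credit(made, p_made):
    """Credit in plays: between 0 and 1 if made, between 0 and -1 if missed.
    p_made is the estimated share of fielders who make this play."""
    return (1.0 - p_made) if made else -p_made

def season_total(plays_made, plays_missed, p_made):
    """Total credit over a batch of plays that all share one difficulty."""
    return (plays_made * play_credit(True, p_made)
            + plays_missed * play_credit(False, p_made))

# HYPOTHETICAL fielder: converts 490 of 500 routine chances (98%),
# exactly what the average fielder would do.
total_if_judged_right = season_total(490, 10, p_made=0.98)

# Same fielder, but the metric rates those plays as 95% plays instead.
total_if_misjudged = season_total(490, 10, p_made=0.95)

print(round(total_if_judged_right, 1), round(total_if_misjudged, 1))
```

With the difficulty judged correctly the routine plays net out to zero, as they should. Rating them just three points too hard turns an exactly average fielder into one who looks 15 plays above average, purely from the volume of routine chances.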


#51    tangotiger      (see all posts) 2006/09/28 (Thu) @ 11:38

A comment on Studes’ selections for top fielders, which you can read here:
http://www.hardballtimes.com/main/article/studes-fielding-awards

The Blue Jays fans (mostly from Batter’s Box) were extremely loud on this one: Aaron Hill is a slightly above average 2B, but John McDonald is one of the best fielding SS in baseball.

Braves fans were also clear, that while Andruw Jones remains an excellent CF, he is no longer “the best”.  He now is among a cluster of CF, just below the best.  He will not win a Fans’ Gold Glove.  Beltran will.

And Mariner fans continue to have Betancourt as one of the best SS in baseball, and clearly the best in the AL.


#52          (see all posts) 2006/10/02 (Mon) @ 06:58

Tango,
In a follow-up to your question about the interdependence of the components, I ran a principal components analysis.

If the variables are highly collinear, PCA will extract factors that explain the most variance across the components.  So if Speed and First Step were highly correlated, it may lump the two variables together, reducing the dimensionality of the data (7 dimensions, in this case), to a smaller number that captures as much variance as possible.

I found that there are essentially 3 factors that make up 80% of the variance in the 7 components. These are slightly different than the ones reported in my last post because I used an oblique rotation to help isolate the variables. Here’s a list of factors, along with the components that load on each factor:

Factor 1 (Intangibles), 55% of variance: .924 INSTINCTS, .884 HANDS, .542 RELEASE

Factor 2 (Arm), 17% of variance: .997 STRENGTH, .725 ACCURACY, .450 RELEASE

Factor 3 (Legs), 10% of variance: .998 SPEED, .737 FIRST STEP

So fans are rating players along three dimensions: their legs, their arms, and their “intangibles.”

Who are the 5 best fielders along the legs dimension? (numbers after the names are z-scores; mean=0, std=1):
1. Ryan Freel, 1.62
2. Shane Victorino, 1.55
3. Ichiro Suzuki, 1.52
4. Jose Reyes, 1.47
5. Carlos Beltran, 1.40

Best 5 Arm:
1. Ivan Rodriguez, 1.62
2. Scott Rolen, 1.61
3. Yadier Molina, 1.57
4. Ichiro Suzuki, 1.47
5. Joe Mauer, 1.43
...
(7. Jose Reyes, 1.28)

Best 5 Intangibles:
1. Omar Vizquel, 1.70
2. Eric Chavez, 1.67
3. Scott Rolen, 1.60
4. Mark Kotsay, 1.39
5. Ichiro Suzuki, 1.36

And the worst in baseball:
Legs:
1. Tony Batista, -1.93
2. Bengie Molina, -1.84
3. Jason Giambi, -1.50
4. Paul Konerko, -1.49
5. Ken Griffey Jr., -1.49

Arms:
1. Jason Giambi, -2.11
2. Bernie Williams, -1.94
3. Victor Martinez, -1.85
4. Luis Gonzalez, -1.81
5. Coco Crisp, -1.79
...
(7. Johnny Damon, -1.78; had to do it because Crisp made the list...)

Intangibles:
1. Adam Dunn, -1.7
2. Chris Duncan, -1.6
3. Felipe Lopez, -1.58
4. Wily Mo Pena, -1.51 (soo deserving)
5. Scott Podsednik, -1.44
...
(14. Jason Giambi, -1.24)

Based on this, I’d argue that fans evaluate players along 3 dimensions of performance. And a lot of these lists make some intuitive sense. Anyone who’s watched the Sox this year knows that Wily Mo has some speed, and a decent arm, but he makes bad jumps, and just goofs up too much. That shows in the data.

Hope you find this interesting :)


#53    tangotiger      (see all posts) 2006/10/02 (Mon) @ 07:35

I do, thanks!

This of course makes sense, since this is how we see baseball.  As well, the ballot itself was broken up based on the time dimension of the ball in play:
1. instincts
2. first step
3. velocity
4. hands
5. footwork/release
6. strength
7. accuracy

Compared to your analysis, the 2/3 is lumped in as “legs”, the 6/7 is “arms”, while the “1/4/5” is “rest”.  I think I’d prefer the “4/5” to be “coordination”, and leave the “1” as “instincts”.  Of course, your analysis is showing that instincts and “coordination” should be lumped together.  So, I like to think of it in these 4 dimensions.

In this interesting article by Keith Isley, he broke down the characteristics into these categories: tools (#3/6 above), and skills (1/2/4/5/7).  The first is essentially “you’re born with it”, while the second is “you can improve it”.  So, Keith sees it from the perspective of two dimensions.

Can we conclude that fans think mostly along those three dimensions, even though the 4- and 2-dimensional models really represent more of reality?


#54          (see all posts) 2006/10/02 (Mon) @ 09:28

Yes and No.  We can conclude that 82% of the variance in the 7 components can be explained by 3 factors, and if you try to separate the data into 3 components, it comes out 2/3, 1/4/5, 6/7. 

If we tried to extract 4 components (86% of the variance explained), we get different factors lumping together. So no, we can’t say that the “true” dimensionality is 3.

But we do know that Hands and Release correlate with each other (.616) about the same as they correlate with Instincts (.647, .593). So I see no evidence for Instincts being its own natural category.

If you wanted to get the “true” dimensionality of the fans’ ratings, I would ask fans to submit single statements about a fielder’s ability (e.g., “Wily Mo doesn’t read the ball off the bat very well.” “Pujols digs a lot of balls out of the dirt.”).  Then edit the list to remove redundancies, and try to get something like 50-100 propositions that could describe player performance.

Then have people fill out forms about individual player performance, like you did for the Fan Fielding report, but instead of using the 7 components, randomly pick 20 propositions, and ask fans to rate the extent to which the proposition (“_____ reads the ball off the bat well”) describes the performance of the player.  Also, use the inverse (“_____ reads the ball off the bat poorly”).  That way, every player who has 10+ fans rating him will likely have answers to most of the 50-100 propositions. 

A PCA of that data will give you a much better sense of the dimensionality of the fans’ representation of fielder performance. The PCA will lump together propositions that fans treat as more or less equivalent.

This is a bit overly ambitious, but what would be really cool is if you could get scouts to do the same thing, and then show that scouts rate players in a higher dimensional space than do fans.


#55    Tangotiger      (see all posts) 2006/10/02 (Mon) @ 09:41

Ooooh, I like that.  What I was going to do, concurrently, was get some “superfans” to participate.  They would, in essence, be the professional scout.  We can then compare their results in the kind of questionnaire you are discussing, to the standard format the fans currently fill out.

(It is unrealistic for me to ask the regular voter to do more, because, it’s their time.  If they have to spend too much time on it, they won’t spend any time on it.)

