THE BOOK--Playing The Percentages In Baseball

Statistical_Theory

Friday, August 05, 2011

Quirks in lotteries

By Tangotiger, 01:07 PM

In Canada, and in most lotteries I suspect, when there is no winner for the top prize, all that extra money is rolled over into the next week’s draw all going to the top prize.  This is to ensure a guaranteed payout, overall.  In Canada, it is (or was anyway when I used to live there), 55% of money is paid out (tax-free!) as winnings.

In Boston, however, they cap the top winnings.  Since they also guarantee payouts, the extra money flows down to the lower winners.  So, if no one wins the top prize (guess correctly all 6 numbers), then all that extra money that exceeds the cap has to be paid out.  And if you have many weeks without a winner, that’s a lot of leftover money that has to flow down.

But Lu, like the Selbees and a few others, focused on a feature of the game that is extremely rare in the United States, according to gambling authorities contacted by the Globe. The jackpot grows gradually over time from a low of $500,000 to a limit of $2 million to $2.5 million; when the limit is reached and no one claims the big prize, the top prize money is poured into the smaller prizes - or “rolled down’’ - raising the odds of a significant payout.

During normal weeks, picking five out of six numbers correctly will generate a $4,000 prize, but the prize rises to $20,000 to $40,000 during rolldowns, depending on how many winning tickets are cashed. Fewer winning tickets translates to larger payouts: During Cash WinFall’s first year, the prize for picking five numbers correctly once exceeded $100,000.

Likewise, the prize for picking four of six numbers swells from $150 to $800 or even $1,000, while the prize for picking three numbers jumps from $5 to $26 or more.

As a result, sophisticated players do not actually want the jackpot to be paid out - unless it is going to them. The odds of winning the lower prizes are so good, that they can gradually win a fortune just by betting hundreds of thousands of dollars every rolldown week.
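
To see why rolldown weeks attract big bettors, here is a rough expected-value sketch. The prize amounts come from the excerpt above. The 6-of-46 game format and the $2 ticket price are assumptions that the excerpt does not state, the rolldown prizes use the low end of each quoted range, and the jackpot itself is ignored.

```python
from math import comb

# Assumptions (not stated in the excerpt): pick 6 of 46 numbers, $2 per ticket.
NUMBERS, PICKS, TICKET_PRICE = 46, 6, 2.00

def p_match(k):
    """Probability that a single ticket matches exactly k of the 6 drawn numbers."""
    return comb(PICKS, k) * comb(NUMBERS - PICKS, PICKS - k) / comb(NUMBERS, PICKS)

def lower_tier_ev(prizes):
    """Expected winnings per ticket from the lower prize tiers only."""
    return sum(p_match(k) * amount for k, amount in prizes.items())

# Prize amounts quoted in the excerpt (rolldown figures use the low end of each range).
normal_week = {5: 4_000, 4: 150, 3: 5}
rolldown_week = {5: 20_000, 4: 800, 3: 26}

ev_normal = lower_tier_ev(normal_week)
ev_rolldown = lower_tier_ev(rolldown_week)

print(f"normal week:   ${ev_normal:.2f} back per ${TICKET_PRICE:.2f} ticket")
print(f"rolldown week: ${ev_rolldown:.2f} back per ${TICKET_PRICE:.2f} ticket")
```

Under these assumptions a normal-week ticket gets back well under half its price from the lower tiers, while a rolldown-week ticket gets back slightly more than it costs, before even counting the jackpot.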

(15) Comments • 2011/08/06 • Sabermetrics, Statistical_Theory

Indirect v direct standardization

By Tangotiger, 07:57 AM

Great piece by Max.

So, let me reiterate the issue. When using indirect standardization (i.e., when using whatever existing fielding metric), you are entitled to say that both Player A (+20 plays) and Player B (+22) performed better than the average shortstop, but there is no way you can infer Player B performed two plays better than Player A.
...

I believe fielding metrics should shift to the direct standardization method when data become more objective, detailed and unbiased. Until then the indirect standardization is an improvement over no standardization at all when players face different set of opportunities (but that’s when improper ranking might come out).

Indeed, back in the original UZR, MGL used direct and then switched to indirect after comments at the old Baseball Boards.
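
A toy sketch of the difference, with entirely hypothetical numbers: two shortstops who are equally good in every difficulty bucket but who faced different mixes of chances. The bucket names, league rates, opportunity counts and the "standard" mix are all made up for illustration and are not from Max's piece or from UZR.

```python
# League-average conversion rate by difficulty bucket (hypothetical numbers).
league_rate = {"easy": 0.90, "hard": 0.40}

# Two shortstops with IDENTICAL skill in each bucket but different opportunity mixes.
players = {
    "A": {"opps": {"easy": 300, "hard": 100}, "rate": {"easy": 0.95, "hard": 0.40}},
    "B": {"opps": {"easy": 100, "hard": 300}, "rate": {"easy": 0.95, "hard": 0.40}},
}

# Reference mix of opportunities used for direct standardization (also hypothetical).
standard_opps = {"easy": 200, "hard": 200}

def indirect(player):
    """Plays made minus what the league would make on the player's OWN opportunities."""
    made = sum(player["opps"][b] * player["rate"][b] for b in league_rate)
    expected = sum(player["opps"][b] * league_rate[b] for b in league_rate)
    return made - expected

def direct(player):
    """Player's rates minus league rates, both applied to the SAME standard opportunities."""
    return sum(standard_opps[b] * (player["rate"][b] - league_rate[b]) for b in league_rate)

for name, p in players.items():
    print(name, f"indirect {indirect(p):+.1f}", f"direct {direct(p):+.1f}")
```

Both players come out above average either way, but the indirect numbers differ (+15 v +5) even though the two are equally good in every bucket. The direct numbers match (+10 each) because both are measured against the same set of opportunities.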

(4) Comments • 2011/08/08 • Sabermetrics, Fielding, Statistical_Theory

Sunday, July 24, 2011

Players who seemingly “lost their power” for no apparent reason.

By , 11:15 PM

Nick Markakis averaged 17.7 HR per 600 PA in his first 4 years in baseball.  Since then (2010 and 2011), he has averaged 11.  There has been much talk about him “losing his power.” Now, obviously that is not a huge drop in power, but let’s look at this from a statistical perspective and we’ll use a little Bayesian inference as well.

Markakis had 2660 PA in those first 4 years, with a HR rate of .029 per PA.  One standard deviation in HR rate in 2660 PA, by chance alone, is around .0026 (assuming a true rate of .0183).

Let’s say that his true HR rate over his career was actually .022, or 13 HR per 600 PA.  His performance in the first 4 years would be around 2.7 SD above his true talent HR rate, by sheer luck alone.  That is going to happen around 1 out of every 385 times.

That may be a small number (1/385), but the fact that we are cherry picking Markakis (or some other player) means that there are lots and lots of players who did NOT seemingly lose their power over a similar time period.  In other words, with all the players we can look at, a few of them will, by chance alone, significantly under-perform their true talent HR rate (or any other stat) for no articulable reason whatsoever.
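
For anyone who wants to check the arithmetic, here is a minimal sketch using the binomial standard deviation and a normal approximation. The pool of 300 players is a hypothetical number chosen only to illustrate the cherry-picking point, and the exact odds you get depend on how z is rounded and on whether you count one tail or two, so they need not match the 1-in-385 figure exactly.

```python
from math import sqrt
from statistics import NormalDist

pa = 2660                    # PA in the first 4 years
observed_rate = 0.029        # HR per PA over those years, as quoted above
rate_for_sd = 11 / 600       # the ~.0183 rate the SD is based on
assumed_true_rate = 0.022    # the "let's say" career rate, about 13 HR per 600 PA

# Binomial standard deviation of an observed rate over `pa` trials.
sd = sqrt(rate_for_sd * (1 - rate_for_sd) / pa)

# How many SDs above the assumed true rate the first 4 years were.
z = (observed_rate - assumed_true_rate) / sd

# One-sided normal tail: chance of landing at least this far above the true rate.
tail = 1 - NormalDist().cdf(z)

# Cherry-picking: chance that at least one of many players has a 1-in-385 run.
players_examined = 300       # hypothetical pool size, not from the post
p_at_least_one = 1 - (1 - 1 / 385) ** players_examined

print(f"SD = {sd:.4f}, z = {z:.2f}, one-sided tail = {tail:.4f}")
print(f"P(at least one of {players_examined}) = {p_at_least_one:.2f}")
```

The last number is the one that matters for the argument: a rare event for any single player becomes better than a coin flip once you get to pick your player after the fact from a few hundred.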

I’m not saying that we know that Markakis’ true talent HR rate has not decreased.  We don’t.  I’m just saying that it should come as no surprise whatsoever that a few players would seriously under-perform after 4 full years of baseball, simply because during those first 4 years, they just got lucky.  And that is not even taking into consideration those who got unlucky for the next year or two, or some combination of the two (lucky at first, unlucky later).

Just another example of how we often try to invent a story or figure out a reason for things that are merely the vagaries of chance.

How does Bayes come into play?  When I looked at Markakis the other day, he just didn’t look like a HR hitter to me. Although he is listed as 6-1 and 200 pounds, he doesn’t look all that powerful - i.e. a prolific HR hitter.  If that is true, it is much more likely that he got lucky those first 4 years, and that 11 HR per 600 is closer to his true talent rate than 17.7.  If it were someone like Prince Fielder who all of a sudden had a power outage, that would be a completely different story, statistically and Bayesianly speaking…

(17) Comments • 2011/07/28 • Sabermetrics, Statistical_Theory

My issue with regression equations

By Tangotiger, 10:35 PM

Patriot captures it right here:

Building your metric around a run estimator does not necessarily restrict you to simply plugging in the numbers in the appropriate place. Suppose you wanted to construct a metric based on batted ball types, strikeouts, and walks. One way to go about it would be to simply go through and estimate singles, doubles, triples, homers, and outs in play based on the percentage of each batted ball type that wind up as each. So, you would end up with equations that might look something like this:

Singles = .057FB + .217GB + .516LD + .017PU

However, if you believe that you have gleaned some other insights into the relationship between events that could improve your metric (such as strikeout pitchers having lower HR/FB rates) , you could still build that in to your formula for estimated home runs, and plug those into the run estimator.
It’s more difficult than running a regression, and a more delicate balancing act (at least in terms of developing the formula), but it allows you to stay grounded in a model that estimates runs by taking a first step of, well, estimating runs.

He’s saying this (or if he’s not saying it, then that’s how I am reading it, and, in any case, it’s how I think it):

1. You start with a working model of how runs are created.  This is the beauty of something like BaseRuns, because it works so darn well… GIVEN its inputs.  If you know the number of hits, HR, walks, outs, then we have a fantastically great estimate as to how many runs are expected to be scored.

2. If you don’t know the inputs, estimate the inputs… but don’t change the actual run scoring model.  So, again, if you happen to not have the number of doubles, but can estimate the number of doubles that this pitcher either gave up, deserved to give up, or was expected to give up, and it’s based on his batted ball distribution profile, and/or the number of HR he gave up, and/or his SO/BB ratio, then estimate the doubles in that manner.... but do NOT touch the run scoring model.

Once you have the estimates of all your inputs, then you can plug them into an established working model.
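
A minimal sketch of the two steps. The singles coefficients are the ones in Patriot's example above. The BaseRuns function is one commonly cited simple form, used here as an assumption (other versions exist, and it may not be the one either of us would actually use), and every input count is hypothetical.

```python
def estimate_singles(fb, gb, ld, pu):
    """Singles estimated from batted-ball counts, using the coefficients quoted above."""
    return 0.057 * fb + 0.217 * gb + 0.516 * ld + 0.017 * pu

def base_runs(singles, doubles, triples, hr, bb, outs):
    """One commonly cited simple BaseRuns form (an assumption; other versions exist)."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    a = hits + bb - hr                                               # baserunners
    b = (1.4 * total_bases - 0.6 * hits - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = outs                                                         # outs (AB - H)
    d = hr                                                           # automatic runs
    return a * b / (b + c) + d

# Step 1: estimate the input we do not have (hypothetical batted-ball counts).
est_singles = estimate_singles(fb=100, gb=200, ld=80, pu=20)

# Step 2: plug it into the unchanged run model (the other inputs are hypothetical too).
runs = base_runs(est_singles, doubles=30, triples=3, hr=15, bb=50, outs=420)
print(f"estimated singles = {est_singles:.1f}, estimated runs = {runs:.1f}")
```

All the modeling effort goes into `estimate_singles` and its siblings for the other events, while `base_runs` stays untouched.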

Even something like FIP is basically a regression equation, because it doesn’t adhere to an actual run scoring model.  Of course, there is a tradeoff with complexity: a linear equation is used at the expense of a real baseball run scoring model because it’s easier to compute or understand.  But, if you’ve got a complex linear equation, or even a complex multiplicative equation, or some other form of equation, then you’ve got the worst of both worlds.

This is why I like FIP or wOBA: they are such simple metrics that their strengths and limitations are readily apparent.

So, ANY pitcher metric that is not grounded in BaseRuns is immediately set up for a limitation.  The bigger your limitation, the simpler your metric must be.

SIERA, for example, is a metric that is too complex for its own good.  The insights and benefits of SIERA are hidden inside its complexity.  But, if Matt were to follow Patriot’s lead here, and compute estimates for events (1b, 2b, 3b, hr, bb, so) based on his findings about how things interact, then we would have a very helpful metric.

So, that’s my recommendation as to how you can really advance the cause: keep the logic of baseball intact if you insist on complexity.

(23) Comments • 2011/07/26 • Sabermetrics, Statistical_Theory

Wednesday, July 20, 2011

Distributions in sports

By Tangotiger, 10:33 PM

Good job by Kincaid.

Also note the Tango Distribution (last two links on home page).

(2) Comments • 2011/07/21 • Sabermetrics, Statistical_Theory, Other Sports, Hockey, Soccer

Monday, July 18, 2011

The “Balance” theory

By , 10:22 AM

Here is how it works:

Let’s say it is the 9th inning and your team is winning by a run.  Your pitcher walks the lead-off batter.  The announcer on TV says something like, “Wow, you can’t walk the lead-off batter with a one-run lead.  You have to challenge him.” Or, “There is nothing more frustrating for a manager than walking the first batter with a one-run lead in the 9th.”

Now, obviously a walk is not a good thing in that situation, as opposed to an out or even a generic PA.  But, the question is whether a walk in that situation is particularly bad.  The answer to that question is not necessarily obvious, especially if you are not sabermetrically inclined (like the announcer).  But there is an easy way to answer it using the “balance theory.”

Let’s say that you had more than a 1-run lead.  What about the walk then?  It is now obvious that the lead-off walk is horrendous, since it is nearly equivalent to a home run (other than the double play possibility).  Since the 1-run lead and the “more than 1-run” lead are the only two possibilities, if the walk is particularly bad with a “more than 1-run” lead, it HAS to be not so bad (again, comparatively speaking) with a 1-run lead.

That is the “balance theory,” and it can be used to answer many questions like that…

(28) Comments • 2011/07/25 • Sabermetrics, Statistical_Theory

Tuesday, July 12, 2011

“Everyone has their own WAR”

By Tangotiger, 11:04 AM

Exactly.

Fangraphs has its WAR and Baseball-Reference has one as well. But in truth, everyone has their own WAR.

My dad and I were talking about this the other day. He was talking about why he thinks no one in baseball is better now, and what he was doing was processing all the factors he values…he puts a higher value on speed (and triples) than you or I might…and he thinks there is a “fan popularity” impact for every player.

In his mind, he’s smushing all those factors together, just as the Fangraphs version and BB-Ref versions do. His version is personal. He and I don’t have to agree. But it makes for the most fun kind of baseball discussion.

We all come up with our “single number”, even though we kick and scream that we shouldn’t come up with a single number.  If one guy argues that Felix is better than Lincecum, and the other argues the opposite, then guess what: they’ve each “smushed” a bunch of parameters, considerations and gut feelings to get to their final opinion.

I remember an old boss of mine deriding the idea of a spreadsheet that would take a bunch of factors into consideration to come up with everyone’s rating at the office, and, in turn, everyone’s salary.  He said that he has to do everything on a case-by-case basis.

But, lost to him is that, in the end, everyone DOES get a final number: a salary.  So, you can have a consistent process, that considers everything objective and subjective.  Or, you can consider those same objective and subjective things, and smush them together in your mind on a case-by-case basis.  You are STILL considering the exact same things.

The difference is that by going case-by-case you may be applying different weights to different parameters for different people as the mood strikes you.  If you have a process, that doesn’t happen.

No one is telling you not to overweight or underweight strikeouts or HR.  But a system requires you to spell out the rules for weighting, and apply that consistently to everyone.

The one good thing about the case-by-case basis is that it forces you to think about parameters.  You’d like to ding Manny Ramirez a little, you’d like to up Jeter a little.  So, you have to create a “heart” parameter.  And that’s perfectly fine!  Just spell it out that that’s what you are doing.  And tell us how much you are giving to each player for heart.  I have no problem with giving out wins for heart, over-and-above whatever his actual performance tells us.  Just spell it out and be consistent.

(19) Comments • 2011/07/14 • Sabermetrics, Statistical_Theory

Wednesday, June 29, 2011

If you are under 60 years old, you have a 50/50 chance of seeing your team win in your lifetime

By Tangotiger, 12:21 AM

Patriot.

(9) Comments • 2011/06/30 • Sabermetrics, Statistical_Theory

Tuesday, June 28, 2011

Clustering and pitch/fx

By Tangotiger, 07:36 PM

Jimmy:

Let’s say you want to identify clusters in two-dimensional data. You can do this using a clustering algorithm such as k-means or soft k-means. In a nutshell, what this does is take an initial set of means (chosen however), evaluate the distance of each data point to each of the means using some distance metric, and then assign each data point to a mean (i.e. the closest mean). Then it re-evaluates the means given the current assignment and steps through the process again, until it converges and you have the data grouped into “k” clusters.

So this helps with grouping the data points, but let’s say you wanted to go a little bit further. What you can do is run the initial algorithm to find the means and cluster assignments, and then impose the assumption that each cluster is distributed around its mean (which you just found) according to a bivariate normal distribution. Then you use maximum likelihood (ML) to find the variance parameters of the bivariate normal for each cluster, which may vary for each cluster. You can assume different variance in each direction to account for clusters that aren’t spherical. Then once you have those parameters, you have the variance of each cluster.

To relate this to baseball, assume the two-dimensional data we have is horizontal and vertical pitch movement, and assume that the pitcher in question has three pitches: 4-seam FB, slider, and a curve. Presumably these three pitches will form three distinct clusters when graphed. We run the k-means algorithm to identify which pitch is which (i.e. assign clusters), and then we fit each cluster to a bivariate normal distribution by ML. Then we have the variance of each cluster. Then we can compare the variance (i.e. the consistency) of each pitch’s movement relative to the other pitches, or compare it amongst pitchers with the same type of pitch. And we can track it from game to game, season to season, etcetera, so that we can say that “oh, Erik Bedard’s control of his CB has really improved this season relative to last” with some quantitative oomph rather than with simple visual evidence.

And there are a lot of other advantages to this too besides just getting the point estimate of the variance. We can also get the variance of the point estimate itself to quantify how accurate we think our estimate of that variance is. We can use the bivariate fit in real time, with Bayesian updating to improve the accuracy of the pitch/fx system itself (in identifying pitch type). There are a lot of places to go from here.

I also hear you on the problem with noisy data. That is a universal issue, but there exist a lot of ways to deal with it. I’ve heard of people transforming the data with principal components analysis first (which is a sort of clustering algorithm in itself… kinda) and then running the k-means on the transformed data to get better clustering fits. And lots of other improvements upon the plain vanilla k-means algorithm to deal with tough data. I’m sure there is literature on this stuff somewhere… but I should really shut up because I don’t understand the pitch/fx system too well.

If you’re feeling adventurous, I recommend chapters 20 and 22 of this book as an intro to the stuff I’m talking about: http://www.inference.phy.cam.ac.uk/mackay/itprnn/ps/
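
Here is a rough sketch of the procedure Jimmy describes: plain k-means on two-dimensional data, followed by a maximum likelihood (divide-by-n) fit of a bivariate normal to each cluster. It is not a pitch/fx implementation. The pitch movement numbers are synthetic and invented, the initial means are hand-picked, and the function names are hypothetical.

```python
import random

def kmeans_2d(points, means, max_iter=100):
    """Plain k-means on 2-D points, starting from the given initial means."""
    means = [tuple(m) for m in means]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest mean (squared distance).
        labels = [
            min(range(len(means)),
                key=lambda k: (x - means[k][0]) ** 2 + (y - means[k][1]) ** 2)
            for x, y in points
        ]
        # Update step: recompute each mean from the points assigned to it.
        new_means = []
        for k in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_means.append((sum(p[0] for p in members) / len(members),
                                  sum(p[1] for p in members) / len(members)))
            else:
                new_means.append(means[k])  # leave an empty cluster's mean alone
        if new_means == means:              # nothing moved, so we have converged
            break
        means = new_means
    return means, labels

def ml_covariance(members, mean):
    """ML estimates (var_x, var_y, cov_xy) of a bivariate normal around `mean`."""
    n = len(members)
    var_x = sum((x - mean[0]) ** 2 for x, _ in members) / n
    var_y = sum((y - mean[1]) ** 2 for _, y in members) / n
    cov_xy = sum((x - mean[0]) * (y - mean[1]) for x, y in members) / n
    return var_x, var_y, cov_xy

# Synthetic movement data: (mean horizontal, mean vertical, SD), all invented.
rng = random.Random(0)
true_pitches = {"FB": (-5.0, 10.0, 1.0), "SL": (3.0, 1.0, 1.5), "CB": (6.0, -7.0, 2.0)}
points = [(rng.gauss(mx, sd), rng.gauss(my, sd))
          for mx, my, sd in true_pitches.values() for _ in range(200)]

initial_means = [(-4.0, 8.0), (2.0, 2.0), (5.0, -5.0)]  # "chosen however"
means, labels = kmeans_2d(points, initial_means)

fits = []
for k, mean in enumerate(means):
    members = [p for p, lab in zip(points, labels) if lab == k]
    fits.append(ml_covariance(members, mean))

for name, mean, (var_x, var_y, cov_xy) in zip(true_pitches, means, fits):
    print(f"{name}: mean=({mean[0]:.1f}, {mean[1]:.1f}) "
          f"var_x={var_x:.2f} var_y={var_y:.2f} cov={cov_xy:.2f}")
```

Allowing var_x and var_y to differ (plus the covariance term) is what handles clusters that aren’t spherical. Comparing those variances across pitches, or for the same pitch across seasons, is the “quantitative oomph” part.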

(5) Comments • 2011/06/29 • Sabermetrics, Ball_Tracking, Statistical_Theory

“Paradox”: Expectation of being favored to win - vs - Expectation of winning

By Tangotiger, 11:13 AM

I’m not sure that my title description is clear enough.  And if someone wants to propose a better title to be clearer, please do so.

Someone sent me something like what I’m about to post, and he called it a “paradox”, but it is not at all a paradox.  It’s a question of whether you average out binary numbers or average out the rates.

Suppose that Roy Halladay’s true talent level is such that the Phillies win .601 of their games with him on the mound against an average team at a neutral site.  At home, the odds go up by +.050 (and on the road, it goes down .050).  Against good teams, the odds go down by .050 (and up by .050 against bad teams).  Against great teams, the odds go down by .100 (and up by .100 against terrible teams).  So, Phillies with Halladay starting at home against a terrible team gives us odds of .751 that the Phillies will win.  And on the road against a great team gives us .451 that the Phillies will win.

Count as “1” any time the Phillies have a greater than 50% chance of winning with Roy Halladay on the mound.

What percentage of the games are the Phillies favored to win?  Is it exactly 60%?  Or more than 60%?  It’s not a trick question.
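
If you want to play with it, here is a small sketch that enumerates the situations described above. The big assumption, which is not in the setup, is that all ten home/road by opponent-quality combinations are equally common. Change the mix and the answer changes.

```python
base = 0.601                       # neutral site, average opponent
site = {"home": +0.050, "road": -0.050}
opponent = {"great": -0.100, "good": -0.050, "average": 0.0,
            "bad": +0.050, "terrible": +0.100}

# Assumption: all 10 site/opponent combinations are equally common.
win_probs = [base + s + o for s in site.values() for o in opponent.values()]

# Averaging the rates gives the expected winning percentage.
expected_win_pct = sum(win_probs) / len(win_probs)

# Averaging the binary "favored" flags (1 if above 50%, else 0) gives something else.
favored_pct = sum(p > 0.5 for p in win_probs) / len(win_probs)

print(f"expected win% = {expected_win_pct:.3f}, favored in {favored_pct:.0%} of games")
```

Under that equal-mix assumption the Phillies are expected to win about 60% of these games but are favored in 90% of them, because only the road game against a great team drops below .500.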

(32) Comments • 2011/07/14 • Sabermetrics, Statistical_Theory

Friday, June 24, 2011

Platoon advantage by time of day

By Tangotiger, 11:18 AM

Interesting

Results indicate that players who were “morning types” had a higher batting average (.267) than players who were “evening types” (.259) in early games that started before 2 p.m. However, evening types had a higher batting average (.261) than morning types (.252) in mid-day games that started between 2 p.m. and 7:59 p.m. This advantage for evening types persisted and was strongest in late games that began at 8 p.m. or later, when evening types had a .306 batting average and morning types maintained a .252 average.

“Our data, though not statistically significant due to low subject numbers, clearly shows a trend toward morning-type batters hitting progressively worse as the day becomes later, and the evening-types showing the opposite trend,” said principal investigator and lead author Dr. W. Christopher Winter, medical director of the Martha Jefferson Hospital Sleep Medicine Center in Charlottesville, Va.

...but obviously the sample size is so tiny, that the “though not statistically significant” can’t just be walked by.

This is their sample size:

Nine participants were found to be evening types, and seven were morning types. Both groups had a mean age of 29 years. The study used the players’ statistics from the 2009 and 2010 seasons, which allowed for the analysis of 2,149 innings from early games, 4,550 innings from mid-day games and 750 innings from late games.

Reporting the innings played inflates the apparent size of the sample, given that they are reporting batting averages (which means the opportunities are at bats, not innings).  So, 2149 innings is like 1000 at bats, and 750 innings is like 350 at bats.  Laughably small numbers of course. And next time, please, don’t use batting average.  Linear Weights or wOBA would have been the far better choice.
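
To put a number on “laughably small”, here is a quick sketch of the binomial standard error for the late-game gap (.306 v .252). It assumes the roughly 350 late-game at bats split evenly between the two groups, which the study summary does not say, and it treats every at bat as an independent trial.

```python
from math import sqrt

def se_of_difference(avg_a, ab_a, avg_b, ab_b):
    """Binomial standard error of the difference between two batting averages."""
    return sqrt(avg_a * (1 - avg_a) / ab_a + avg_b * (1 - avg_b) / ab_b)

# Late games: evening types hit .306, morning types .252, on roughly 350 at bats total.
# Assumption: the 350 at bats split evenly between the two groups.
evening_avg, morning_avg = 0.306, 0.252
ab_each = 350 // 2

se = se_of_difference(evening_avg, ab_each, morning_avg, ab_each)
z = (evening_avg - morning_avg) / se

print(f"gap = {evening_avg - morning_avg:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

A 54-point gap sounds huge, but with this few at bats it is only a bit more than one standard error from zero.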

However, I very much like the idea, and the effort.  So, as a starting point, it’s great.

Glove-slap: Sky.

(14) Comments • 2011/07/19 • Sabermetrics, Statistical_Theory

Monday, June 13, 2011

When is the observed data half real and half noise?

By Tangotiger, 08:12 PM

Derek does exactly (one of the ways of) what I do.  I don’t know that I actually get the same results, but the process is bang-on.

Stabilizes      Years   Stat                 Denominator
100             0.2     K                    PA-IBB-HBP
168             0.3     UIBB                 PA-IBB-HBP
253             0.4     IBB                  PA
501             0.8     HBP                  PA-IBB
959             2.1     1B                   PA-HBP-K-BB-HR-ROE
833             1.8     2B+3B                PA-HBP-K-BB-HR-ROE
48              1.5     2B                   2B+3B
48              1.5     3B                   2B+3B
1126            2.4     1B+2B+3B (BABIP)     PA-HBP-K-BB-HR-ROE
143             0.3     HR                   PA-K-BB-HBP
62              0.5     HR (HR/FB)           OF FB [MLBAM]
65              0.5     HR (HR/FB)           OF FB [RS]
109             0.2     GB [MLBAM]           GB+OF+IF+LD
116             0.2     GB [RS]              GB+OF+IF+LD
182             0.4     OF FB [MLBAM]        GB+OF+IF+LD
189             0.4     OF FB [RS]           GB+OF+IF+LD
194             0.4     IF FB [MLBAM]        GB+OF+IF+LD
233             0.5     IF FB [RS]           GB+OF+IF+LD
795             1.7     LD [MLBAM]           GB+OF+IF+LD
979             2.1     LD [RS]              GB+OF+IF+LD
Inconclusive*           SB%                  SB+CS
39              0.3     SBA%                 1B+UIBB+HBP+ROE+FC

UPDATE: For pitchers:

Stabilizes      Years   Stat                 Denominator
126             0.2     K                    PA-IBB-HBP
303             0.5     UIBB                 PA-IBB-HBP
943             1.5     IBB                  PA
1346            2.1     HBP                  PA-IBB
3893            8.4     1B                   PA-HBP-K-BB-HR-ROE
2305            5       2B                   PA-HBP-K-BB-HR-ROE
4977            10.7    3B                   PA-HBP-K-BB-HR-ROE
1882            4       2B+3B                PA-HBP-K-BB-HR-ROE
351             11      2B                   2B+3B
351             11      3B                   2B+3B
3729            8       1B+2B+3B (BABIP)     PA-HBP-K-BB-HR-ROE
1271            2.7     HR                   PA-K-BB-HBP
1239            9.4     HR (HR/FB)           OF FB [MLBAM]
105             0.2     GB [MLBAM]           GB+OF+IF+LD
205             0.4     OF FB [MLBAM]        GB+OF+IF+LD
288             0.6     IF FB [MLBAM]        GB+OF+IF+LD
2026            4.3     LD [MLBAM]           GB+OF+IF+LD
36              2.3     SB                   SB+CS
161             1.2     SBA                  1B+UIBB+HBP+ROE+FC
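
One common way to use these numbers, sketched below, is to treat the “Stabilizes” figure as the sample size at which the observed rate is half real and half noise, and to regress an observed rate toward the league mean with weight n / (n + stabilizes). That weighting rule is an assumption about how to apply the tables and is not spelled out in them. The observed rates and league means are hypothetical, and n is measured in the stat’s own denominator.

```python
def reliability(n, stabilizes_at):
    """Share of the observed deviation treated as real after n opportunities."""
    return n / (n + stabilizes_at)

def regress(observed, league_mean, n, stabilizes_at):
    """Estimate of the true rate: regress the observed rate toward the league mean."""
    r = reliability(n, stabilizes_at)
    return league_mean + r * (observed - league_mean)

HITTER_K = 100          # "Stabilizes" value for hitter K, from the first table
PITCHER_BABIP = 3729    # "Stabilizes" value for pitcher BABIP, from the second table

# A hitter striking out 30% of the time over 100 opportunities (league mean 20%, hypothetical).
hitter_k_est = regress(observed=0.30, league_mean=0.20, n=100, stabilizes_at=HITTER_K)

# A pitcher with a .270 BABIP over 600 balls in play (league mean .300, hypothetical).
pitcher_babip_est = regress(observed=0.270, league_mean=0.300, n=600,
                            stabilizes_at=PITCHER_BABIP)

print(f"hitter K estimate: {hitter_k_est:.3f}")
print(f"pitcher BABIP estimate: {pitcher_babip_est:.3f}")
```

The hitter’s strikeout rate gets pulled halfway back to the mean, while the pitcher’s BABIP gets pulled almost all the way back. That is exactly what the gap between 100 and 3729 is telling you.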

Friday, June 10, 2011

How specific can we get in determining the true mean of a particular matchup?

By Tangotiger, 09:58 AM

Every matchup has a specific and true mean.  God herself would establish that specific and true mean at that specific point in time-space with zero level of uncertainty.  Pujols at Busch on July 3, 2011 against Doc and God knows that he can’t handle an outside cutter well, and the next pitch is going to be telegraphed by Doc as an outside cutter?  God says that Pujols will contact that pitch 23% of the time (if allowed to replay in that time-space an infinite number of times) with 0 level of uncertainty.

But what about humans?  If Pujols v Doc has an expected contact rate of 70% any time Pujols swings (with a certain level of uncertainty, say 10%), then how much a better mean estimate can we get in more specific situations (we find more data about Pujols and or Doc and or Busch and or the weather), and how much more can we reduce the uncertainty level?

(21) Comments • 2011/06/10 • Sabermetrics, Batter_v_Pitcher, Statistical_Theory

Wednesday, June 08, 2011

Testing the binomial distribution theory in baseball

By Tangotiger, 04:17 PM

Ichiro has had 802 games where he came to bat exactly 5 times.  His OBP was .413.

The expectation of him getting on base 0 or once, using the binomial distribution, is 252 times.  In reality, it was 262 times.

Ichiro had 671 games where he came to bat exactly 4 times.  His OBP was .326.

The expectation of him getting on base 0 or once, using the binomial distribution, is 406 times.  In reality, it was 399 times.

If you add the two above:
- the expected number of times he would get on base 0 or once, based on the binomial, is 659 games
- the actual number of times he did get on base 0 or once is 661 games
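
For anyone taking up the challenge, here is a minimal sketch of the expected-count calculation, using the Ichiro game counts and OBP figures quoted above. The function name is just for illustration.

```python
from math import comb

def p_at_most_one(n_pa, obp):
    """Binomial probability of reaching base 0 or 1 times in n_pa plate appearances."""
    p_zero = (1 - obp) ** n_pa
    p_one = comb(n_pa, 1) * obp * (1 - obp) ** (n_pa - 1)
    return p_zero + p_one

games_5pa, obp_5pa = 802, 0.413   # games with exactly 5 PA
games_4pa, obp_4pa = 671, 0.326   # games with exactly 4 PA

expected_5pa = games_5pa * p_at_most_one(5, obp_5pa)
expected_4pa = games_4pa * p_at_most_one(4, obp_4pa)
expected_total = expected_5pa + expected_4pa

print(f"5 PA: {expected_5pa:.1f}, 4 PA: {expected_4pa:.1f}, total: {expected_total:.1f}")
```

Swap in another hitter’s game counts and OBP by PA bucket, then compare against his actual number of 0-or-1 games.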

Ichiro was the first guy I looked at.  That it ended up this close was fantastically fortunate for me.  But, it’s not a surprise.

So, there’s my challenge to anyone else: select 10 hitters.  I dunno… Rickey, Boggs, Gwynn, Raines… whoever.  Whoever you are interested in (though preferably not guys with lots of IBB).

Report the results.  You’ll find something close to what I found.

***

For those wondering why the OBP are so different for 4 and 5 PA: the PA was selected after the fact.  If he came to bat 5 times, chances are, his team (and him) were hitting pretty well.  In order to not have this issue, I would instead only look for the FIRST FOUR PA of each game.  Then you wouldn’t have this problem.

(142) Comments • 2011/06/10 • Sabermetrics, Statistical_Theory

Friday, June 03, 2011

Reader Mail of the Day: What is luck?

By Tangotiger, 06:21 PM

My answer:

Michael,

I agree that, 100%, luck is the random occurrence centered around a true mean.  In effect, EVERYTHING in the world is luck, since something either did (1) or did not (0) happen.  There’s no such thing as something “partially” happening.  If someone has a .420 OBP, he won’t get on base 42% of the time in his next PA.  He either is, or is not, on base.  So, whether he got on, or not, is luck.  The FREQUENCY over a long period of time, is not luck.  But, any single event is luck.

That’s the tough part to get across: any single occurrence is random, but it’s random based on the true mean.  A goalie saves 90% of his shots, so he’ll get a save (1) far more often than a goal (0). But any single event (save or goal) is luck.

To make it worse, the mean is not even constant!  If there’s a breakaway, his chance at a save is 65%, but if he’s got all 5 of his teammates, and only 1 shooter, his chance at a save would be 95%.

Tough concept.

Tom

***

To further add: the “true mean” is based on everything we know, and don’t know, about the environment.  That is, god herself told you the odds of something happening based on the properties of each entity.  When something happens, or not, that’s luck.  It’s the random occurrence (1 or 0), but predicated on the true mean (whatever it is, but it has to be greater than 0 and less than 1).

If one thing has a 100% causal effect to another thing, that has nothing to do with luck, and is instead, fate.  We’re not talking about fate.

And, I’m not talking about things “outside my control”.  That’s not luck.  That’s simply a gap in knowledge.

I’m talking about you know exactly the true odds of something happening.

(69) Comments • 2011/06/08 • Sabermetrics, Statistical_Theory

Wednesday, June 01, 2011

North American Society for Sport Management Conference

By Tangotiger, 11:35 AM

Friend of The Book’s Blog Millsy will be in Canada, as well as fellow debater Rodney Fort.

Wednesday, May 18, 2011

BE Press’s latest

By Tangotiger, 02:03 PM

There are several articles of interest this month, including one from Andrew Thomas.  I haven’t read any of these yet, but will do so momentarily.  Please feel free to highlight any of these you find interesting or want to discuss.

Monday, May 09, 2011

Stop the (prediction) insanity!

By Tangotiger, 12:59 PM

Excellent:

By its nature, punditry craves attention, which is easier to attract with certainties than with equivocation. But that certitude reflects bravado more often than true knowledge.

Maybe Kobe Bryant should take heed, predicting a series comeback after being down 3-0 (i.e., needing FOUR consecutive wins), only to lose the next game by 36 points.

(4) Comments • 2011/05/09 • Sabermetrics, Statistical_Theory

Friday, May 06, 2011

Married MLB players earn more than single MLB players of the same quality?

By Tangotiger, 04:25 PM

I haven’t read the paper, but apparently it’s true!

(18) Comments • 2011/05/09 • Sabermetrics, Statistical_Theory

Tuesday, May 03, 2011

Gassko’s thesis on Free Agency

By Tangotiger, 11:51 AM

dsg_freeagent_Thesis.pdf

(40) Comments • 2011/05/04 • Sabermetrics, Statistical_Theory