THE BOOK--Playing The Percentages In Baseball

Friday, August 29, 2008

Preview to Run Creation

By Tangotiger, 11:47 AM

Colin looks like he’s going to do what I’ve been meaning to do for years, and what others have been asking for a long time: look at run creation at the inning level, so that we can get far more extreme environments than simply looking at seasonal data or even game-level data.  The article I am linking to is a preamble to his work next week.  I’m looking forward to seeing the results.


#1    Peter Jensen      (see all posts) 2008/08/29 (Fri) @ 13:53

Linear weights provide a theoretical team construction as well, although it’s slightly different from the construct provided by a dynamic run estimator. A linear run estimator will generally play a player on a team that is average on the whole. That means that a very good hitter will be viewed in the context of being on a team of eight slightly below-average players, and a very poor hitter will be viewed in the context of being on a team of eight slightly above-average players.

I don’t think this is true for how I construct linear weights and I had assumed that it wasn’t true for other implementations.


#2    Tangotiger      (see all posts) 2008/08/29 (Fri) @ 14:04

I think what Colin is trying to say is this:
http://tangotiger.net/customlwts.html

That Linear Weights assumes that you are playing on an average team of average profile. 

However, if you are highlighting the part about “slightly xxx-average”, then I agree with you.  The player is being viewed against exactly average.  Unless of course Colin is saying to remove say Albert Pujols from the league and therefore, instead of the league runs above average being 0.0, it is now -0.2 runs per 162G, because we removed Pujols from the league.  If this is what Colin means, then technically he is correct.  It may be a question of nuanced degree here.


#3    Peter Jensen      (see all posts) 2008/08/29 (Fri) @ 14:27

I guess I’ll have to wait and see where he is going with this.


#4    Colin Wyers      (see all posts) 2008/08/29 (Fri) @ 14:30

Yeah - linear weights assumes that you’re playing in a lineup that averages out to nine league average players. (At least, when comparing it to theoretical team run estimators that’s the apples to apples comparison.)

So when you compare empirically derived linear weights to “Plus 1” weights from a TT run estimator, they’re not measuring the same thing - because a team of eight average players and Pujols is not an average league run environment, and that’s the run environment used in a TT run estimator.


#5    Peter Jensen      (see all posts) 2008/08/29 (Fri) @ 15:55

linear weights assumes that you’re playing in a lineup that averages out to nine league average players. (At least, when comparing it to theoretical team run estimators that’s the apples to apples comparison.)

That’s what I thought you meant and that’s just not correct.  Why do you think this, Colin?


#6    Tangotiger      (see all posts) 2008/08/29 (Fri) @ 16:06

If Peter is suggesting that Linear Weights is based on 9 guys who are overall average as a team, but are individually not equal, then he is correct.  Linear Weights is based on the fact that you have a great hitter, a good hitter on the team, some above average hitters, average, below average, and sucky.  That is, a normal team, who are overall average, but are not 9 guys identical.

That said, a Markov chain made up of 9 identical players will give you extremely similar results.  In The Book, in the Batting Order chapter, I used both.  There really wasn’t much if any difference, in the overall weights (just in the lineup-specific weights).


#7    Colin Wyers      (see all posts) 2008/08/29 (Fri) @ 18:02

Lemme go ahead and step through this with an example, and see if we’re all in agreement then. In the 2007 NL, the average team hit .266/.334/.423 in 6300 PAs. In a theoretical team model (Runs Created or BaseRuns, doesn’t matter) you’d use a team of eight players hitting .266/.334/.423 as a group and the player in question, say Albert Pujols. Because of the way you’re modelling the run scoring process, it really doesn’t matter if they’re homogeneous or not, although it’s convenient to refer to it that way.

Such a team is an above-average team, however. In a linear weights model, you’re instead modeling a total of nine players, including Albert Pujols, that hit .266/.334/.423 as a whole. So instead of an above-average team, you’re looking at an average team.

Now, Pujols hit .327/.429/.568 in 679 PAs. That leaves 5927 PAs for the other eight hitters in the lineup. In order to make it reconcile out to .266/.334/.423 as a team, the remaining hitters have to average out to .266/.330/.422.

It’s a very subtle difference, and mostly a technicality.


#8    david smyth      (see all posts) 2008/08/29 (Fri) @ 18:25

It’s clear from the article and his posts here, that Colin has an excellent understanding of the subtleties of the run estimator business.

I look forward to his followup pieces.


#9    tangotiger      (see all posts) 2008/08/29 (Fri) @ 19:23

Colin, You are right in what you are describing.

However, what you want to do is have Pujols interact with 8 average guys, the exact same 8 guys that Adam Everett will interact with.  To do otherwise is to give each player a different context (over and above the unique context he himself brings to the table).

Take an extreme case, of someone who has a 1000 OBP%.  What you will end up doing, is having 8 guys with a 250 OBP.  That is not what we want.

For those who haven’t seen it, I recommend this:
http://tangotiger.net/reconcile.html
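
A minimal sketch of the arithmetic behind the extreme case above. It assumes every lineup spot gets the same number of plate appearances and a league OBP of 1/3; both are simplifications for illustration, not figures from the thread. The function name `teammates_obp` is hypothetical.

```python
def teammates_obp(league_obp, player_obp, lineup_size=9):
    """OBP the other hitters must average so the whole lineup stays at league_obp.

    Assumes (for illustration only) equal plate appearances per lineup spot.
    """
    return (lineup_size * league_obp - player_obp) / (lineup_size - 1)


# A hitter who always reaches base forces his eight teammates down to about .250.
print(round(teammates_obp(1 / 3, 1.000), 3))

# An exactly average hitter leaves his teammates at the league rate.
print(round(teammates_obp(1 / 3, 1 / 3), 3))
```

Under these assumptions the better the hitter, the worse the implied teammates, which is the context shift the comment objects to.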


#10    david smyth      (see all posts) 2008/08/29 (Fri) @ 19:58

Tango, I don’t see the problem with what Colin is saying. He is simply saying that you shouldn’t apply standard Lwts to a batter, you should apply custom lwts, customized to his presence in the lineup. Of course, applying custom lwts is also a problem, as I recently had a discussion here with Patriot about.


#11    Peter Jensen      (see all posts) 2008/08/29 (Fri) @ 20:10

Colin - If you are calculating the linear weight value of each batter event on the entire NL 2007 record then you would subtract Pujols’ batting line from the entire 2007 NL batting line as Tango described in Post #2. If you are calculating linear weights just for StL then you could subtract Pujols’ line from the actual StL team batting line.  But it is not correct to propose a hypothetical average team and subtract Pujols’ batting line from just that team.  Linear weights is derived from the run expectancies of whatever grouping you want to choose and will zero out for that group and only that group.  It is dependent on not only the distribution of batting events (1Bs, 2Bs, 3Bs, HRs, BBs, etc.) but also the distribution of situations (BaseOut States) in which they occur.  Most of the time you want to use as large a group of PAs as possible without changing the run environment to minimize the sample errors that occur for the rarer BaseOut situations.


#12    Colin Wyers      (see all posts) 2008/08/29 (Fri) @ 21:00

Here’s the thread david is referring to:

http://www.insidethebook.com/ee/index.php/site/comments/linear_weights_v_runs_created/

But it is not correct to propose a hypothetical average team and subtract Pujols’ batting line from just that team.  Linear weights is derived from the run expectancies of whatever grouping you want to choose and will zero out for that group and only that group.

But that’s exactly what linear weights calculated against the league average do. They measure a player’s run contribution in a league-average environment. So - since runs are created at the team level - linear weights “zeroed out” to league average will reflect a player’s contributions to a team that is league average with his contributions included.

Since you’re holding the run environment stable, when one player changes, they all have to change, since any event alters the run environment.

And thanks for the kind words, David. I’ll try not to disappoint!


#13    terpsfan101      (see all posts) 2008/08/29 (Fri) @ 22:39

Until now, I was under the impression that to apply the Theoretical Team, all you had to do was take 8/9 of the player’s Linear Baseruns and add them to 1/9 of his straight Baserun total. How accurate is this method?


#14    terpsfan101      (see all posts) 2008/08/30 (Sat) @ 00:06

Referring to Tangotiger’s Reconcile Article in post #9:

My take on this article is that a player implicitly creates runs for the other 8 batters in the lineup by not making outs. That means he should get partial credit for the extra runs that his teammates create as a result of the extra plate appearances he generated for them. I like the example Tangotiger chose for the Reconcile article. From my research so far on Run Estimators, I have learned that it is important to test models in extreme run environments.


#15    Colin Wyers      (see all posts) 2008/08/31 (Sun) @ 18:52

Negative B values are utter murder on the accuracy of BaseRuns at the inning level. What ends up happening is that negative B values between -2 and -4 end up giving you a fractional value for B+C, and dividing by a fraction is multiplication. So what you end up with are absurd run estimates of +/- 120 runs for a handful of events, which can really wreck your day. (Run values of -50 aren’t fun either.)

So I’m left wondering if I should just state that B values have a lower bound of zero, or if I should try and construct a BaseRuns equation where the B value won’t turn out to be negative.
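
A rough sketch of why this blows up, using the generic BaseRuns form A*B/(B+C)+D with three outs per inning. The inputs (two baserunners, the particular B values) are made up for illustration, and `base_runs_bounded` is only one hypothetical way to apply the lower bound of zero mentioned above.

```python
def base_runs(a, b, c, d=0.0):
    """Generic BaseRuns form: A * B / (B + C) + D."""
    return a * b / (b + c) + d


def base_runs_bounded(a, b, c, d=0.0):
    """Same estimate, but with B floored at zero (one of the options raised above)."""
    return base_runs(a, max(b, 0.0), c, d)


# Two baserunners, three outs, no home runs; only B changes.
for b in (1.2, -2.95, -3.05):
    raw = base_runs(2, b, 3)
    bounded = base_runs_bounded(2, b, 3)
    print(f"B={b:+.2f}  raw={raw:+.1f}  bounded={bounded:+.1f}")
```

With C fixed at 3, any B close to -3 makes the denominator B+C nearly zero, so the estimate swings to roughly plus or minus 120 runs depending on which side of -3 the value falls.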


#16    terpsfan101      (see all posts) 2008/08/31 (Sun) @ 22:03

If you include CS in the “C factor” and then don’t include it in the “A factor”, this will help with the negative B values. If you include CS in the “A factor”, the B value of the walk almost always comes out negative. If you include it in the “C factor” but not in the “A factor”, the B value of the walk is positive. So you want to use Initial Baserunners in the “A factor” and All Outs in the “C factor”. In my opinion, this is the most accurate form of Baseruns. It gives the most reasonable +1 values. What Baseruns equation are you using?


#17    terpsfan101      (see all posts) 2008/08/31 (Sun) @ 22:09

I should have said “If you include CS in the C factor and not in the A factor” it helps with the negative B coefficients, not B values. Sorry about that. Again this would work for walks. Ideally, are all of the coefficients in the B factor supposed to be positive? I guess you can get away with it at the Team/League Level, and even the Individual Level in most cases. But I can see why it causes problems at the Inning Level.


#18    Colin Wyers      (see all posts) 2008/08/31 (Sun) @ 22:19

I tested it on Patriot’s latest BsR formula, F1-W.

If you get rid of negative B coefficients, you get rid of negative B values. It’s probably not a big concern for most BsR applications, granted, but in this case you get lower correlations than James’ latest Runs Created formula because of the issues with negative B values.


#19    terpsfan101      (see all posts) 2008/08/31 (Sun) @ 22:47

Considering you are doing this at the inning level and you are using Play By Play Data, you would get more accurate results with Tangotiger’s full BaseRuns equation. This includes all the small stuff like Reaching on Error, Wild Pitches, Passed Balls, etc. Use the values in the “lwts_RC” column. You have to include “Implied Outs” for it to reconcile to Runs scored. That would mean you have to include partial innings in your study. You would also need to be careful not to double count certain events, because Tango’s categories are all mutually exclusive. For instance “Other Advance” is the number of “Fielders Choice zero Outs Recorded” minus the number of “Sacrifice Hit Fielder’s Choices”. Finally, remember that his equation used data from 1974-1990. That comes out to 4.27 Runs/Game. But you could easily calibrate the equation for a higher or lower run environment if you so desired.


#20    terpsfan101      (see all posts) 2008/08/31 (Sun) @ 23:08

Geez, I made another mistake. I meant to say that “Other Safe”, not “Other Advance”, is the number of “Fielder’s Choice Zero Outs Recorded” minus the number of “Sac Hit Fielder’s Choices”. The great thing about Tom’s equation is that he classifies all his categories by Retrosheet event numbers. So it would not be very hard to apply at all if you have the PBP data. And finally, his equation is the only one I know of so far that states the proper “RC value” for plays involving outs. Instead of applying the average R/O, like Patriot did, Tom applies the actual R/O depending on whether it was the 1st, 2nd or 3rd out. Think about it: the R/O for the SH is always higher than the average R/O, because it occurs when there are 0 or 1 outs, and the R/O of making the 1st and 2nd outs is higher than the R/O of making the 3rd out. I looked at all this about a week ago using the R/E matrix Tom devised for the equation. The average R/O was .1597. The R/O of the 1st, 2nd, and 3rd outs were .224, .161, and .094. Theoretically speaking, Tango’s 1974-1990 equation is perfect. Now you can’t apply it to the official statistics without distorting things since he categorizes things by events. However you are using Play by Play data, so you can categorize things like he does.
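
A quick check of the R/O figures quoted above, as a sketch. It assumes the 1st, 2nd and 3rd outs occur equally often (true for completed innings, only approximately true once partial innings are included), so the overall average is a simple mean. The dictionary name `runs_per_out` is hypothetical.

```python
# Run cost of each out, as quoted in the comment above.
runs_per_out = {"1st": 0.224, "2nd": 0.161, "3rd": 0.094}

# Simple mean, assuming each out number occurs equally often.
average_r_o = sum(runs_per_out.values()) / len(runs_per_out)

print(round(average_r_o, 4))
```

The simple mean agrees with the quoted .1597 average, and it also shows why a sacrifice hit, which can only be a 1st or 2nd out, costs more than the average out.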


#21    terpsfan101      (see all posts) 2008/08/31 (Sun) @ 23:31

Another mistake. I should have said that his equation is “empirically perfect.” Nothing is fudged; all batting events are accounted for.


#22    Patriot      (see all posts) 2008/09/01 (Mon) @ 01:12

Interesting stuff from all parties.

Colin mentioned using the “F1-W” version that I posted in my linked post.  That one is based on initial baserunners for A and batting outs for C.  I also presented the other 3 possible combinations that arise from using initial/final baserunners for A and batting/all outs for C.

They all have negative B weights, but the “F4” version is negative only in strikeouts (if you were to make an “F4-W” version that fixed the walk problem).  F4 is final baserunners/all outs, and perhaps a construct like that is necessary on the inning level.

I don’t think there’s really anything wrong with bounding B at zero.  It only happens at an extreme level at which there’s not going to be any runs scored (*).

We’ve discussed some ideas on solving the B problem in the past.  One suggestion that has probably been floated before but that I can’t recall right now is tying the strikeout/DP/etc. penalties that result in negative B values to the positive B events.  In other words, instead of having a(S) + b(D) + c(T) + d(HR) + ...., you would have something like [a(S) + b(D) + c(T) + d(HR) + ....]*(K/DP adjustment).  This is better theoretically because in absence of baserunners, we don’t care how the outs are made.

Unfortunately, it would also be a pain in the butt, at least for the empirical LW matching that Tango and I have used in the past.

Terps/13:  The actual percentage for straight BsR to approximate TT BsR is tightly centered around 1/9.  I never actually formally or systematically studied it, but I don’t recall ever seeing it outside of 9-13%.  Mark McGwire 1998, a fairly extreme season, is 10.6% (of course, that’s dependent on the specific assumptions about the reference team that you place him on).

(*) assuming complete knowledge of all events.  We could construct some crazy theoretical innings in which there are 100 batting outs due to 97 errors, and 97 runs end up scoring.  That’s not really a formula error but a data deficiency, as if you knew there were 97 RBOE, you could have an equation (like Tango’s full version) that factored it in.


#23    Patriot      (see all posts) 2008/09/01 (Mon) @ 01:16

Sorry, an addendum on the TT team.  The 1/9*straight + 8/9*linear property is true for TT RC, but not for BsR.  Some people (including myself at one point) confuse the two for obvious reasons.

That property is true for RC because of the math; RC is straight multiplication, while BsR is addition and multiplication, and so it doesn’t work out nice and neat.
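
A sketch of the RC half of that claim, under stated assumptions: basic Runs Created in the form A*B/C (A = times on base, B = total bases, C = opportunities), a theoretical team of eight reference teammates who each get as many opportunities as the player, and linear RC taken as the tangent-plane approximation at the reference rates. The sample batting line and reference rates are invented, and all function names are hypothetical.

```python
def straight_rc(a, b, c):
    """Basic Runs Created applied directly to the player: A * B / C."""
    return a * b / c


def linear_rc(a, b, c, ref_a, ref_b):
    """Linearized RC at the reference team's per-opportunity rates ref_a, ref_b."""
    return ref_b * a + ref_a * b - ref_a * ref_b * c


def theoretical_team_rc(a, b, c, ref_a, ref_b, mates=8):
    """Team RC with the player added, minus what the teammates create alone."""
    team = (a + mates * ref_a * c) * (b + mates * ref_b * c) / ((mates + 1) * c)
    teammates_alone = mates * ref_a * ref_b * c
    return team - teammates_alone


# Hypothetical player line and reference rates, for illustration only.
A, B, C = 300.0, 380.0, 680.0
REF_A, REF_B = 0.33, 0.40

tt = theoretical_team_rc(A, B, C, REF_A, REF_B)
blend = straight_rc(A, B, C) / 9 + 8 / 9 * linear_rc(A, B, C, REF_A, REF_B)
print(round(tt, 3), round(blend, 3))
```

Under these assumptions the two numbers agree exactly, because expanding the product leaves one ninth of A*B/C plus terms that are linear in A, B and C. The sketch does not test the BsR side of the statement.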


#24    terpsfan101      (see all posts) 2008/09/01 (Mon) @ 02:19

Patriot,

I keep putting off finishing your series on Run Estimation. I read and enjoyed the first 4 articles. But I need to read the 5th and 6th articles on the Theoretical Team. On your old website you also have some material on the Theoretical Team that I glanced over a couple of times. Thanks for the Baseruns spreadsheets by the way. From the spreadsheet I see that you can apply the Theoretical Team 7 different ways. Honestly I’m in a bad mood right now and can’t sleep. This hurricane is eating away at my conscience. I’m glad to see that 95% of New Orleans’ population has evacuated. That is the number I heard from the Weather Channel. Hopefully 100% of the civilians will be out of there before it hits. Of course, New Orleans is not the only area that the storm looks like it’s going to hit.


#25    tangotiger      (see all posts) 2008/09/01 (Mon) @ 10:09

The negative B values might be “fixed” by putting all those in the C term.  For example, maybe C = outs + 0.05*strikeouts


#26    terpsfan101      (see all posts) 2008/09/01 (Mon) @ 17:09

Putting all the negative B values in the C term would work for outs, but what about the negative B value of the IBB? You probably need to treat the IBBs as Unintentional Walks.


#27    David Smyth      (see all posts) 2008/09/01 (Mon) @ 19:28

I don’t think anyone should bend over backwards because of IBBs. They are a strategic input, and as such it is not the job of BsR to handle them correctly. Take them out of the equation.

I believe that these ‘problems’ with BsR ( negative B factors, etc.) are 95% due to limited data rather than any structural problem with the formula. And furthermore, I do not think it is always, or even often, correct to try to ‘force’ the correct Lwts values for the outcomes. In fact, I think it is only correct to do so when you have ‘all’ of the possible outcomes in the equation.

So I say, construct the formula properly (so negative runs are not possible), and accept whatever inaccuracy you get due to limited data.

Base Runs is a very simple formula and concept. I think that some of the people who are working with it are obviously very bright, but may not be spending enough time simply looking at the formula and ‘getting’ exactly what it’s doing.

IOW, if you are finding that you just have to try to work IBB and SF ( situational outcomes) into the elements, or that you have to set up a possibly negative B rate to force in the correct Lwts +1 values, then you are simply not understanding what the BsR formula is really doing.

It is not the ‘task’ of the BsR formula to betray its basic concept and structure, just to be a bit more accurate for some sample. It is the task of BsR to provide the best available model of the scoring process, given the limitations of the data. That means that BsR will ace some accuracy tests, and in some others will be in the middle of the pack. So be it.


#28    tangotiger      (see all posts) 2008/09/01 (Mon) @ 21:11

I’m not sure that you need to treat that one differently.

We can consider the “B” term as anything that makes the “A” values move up, and the “C” term as anything that makes the “A” values move down.

The “A” values (getting on, excluding HR) have a scoreRate of around 30% (that is, B divided by B+C is 30% these days).  So, the run value of the walk and single and double are set at .30.

Since the IBB is worth less than .30 runs, it needs to come down.  And you do that by increasing the C weight.  Just something that came to me, so maybe I’ll work it out a little later, unless Patriot and others have gone down this road before.


#29    terpsfan101      (see all posts) 2008/09/01 (Mon) @ 23:52

You can get rid of the IBB problem by subtracting its value (.176) from the BB (.303). I am referring to Tango’s 1974-90 BsR Equation here. That rounds to (.291), and the new B value is positive (.0001). You would then just apply a LW value of .291 to BB+IBB.


#30    terpsfan101      (see all posts) 2008/09/02 (Tue) @ 00:04

I’ve got a bad habit of not editing my posts. I should have said you subtract the value in runs of the IBB.

IBB = .176 * 22260 = 3927 runs
BB = .303 * 202435 = 61399 runs

New LW of BB+IBB = (65326 runs / 224695 BB+IBB) = .291.

The b value here is barely positive at .0001 using Tango’s 1974-1990 BsR equation. Hope that’s clearer.
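
The pooling step above can be sketched as a count-weighted average. The run totals and event counts are the ones quoted in the comment; the function name `pooled_weight` is hypothetical.

```python
def pooled_weight(events):
    """Combine (total_runs, count) pairs into one per-event linear weight."""
    total_runs = sum(runs for runs, _ in events)
    total_count = sum(count for _, count in events)
    return total_runs / total_count


# (runs, count) for IBB and BB, as quoted in the comment.
ibb = (3927, 22260)
bb = (61399, 202435)

print(round(pooled_weight([ibb, bb]), 3))
```

Because unintentional walks outnumber intentional ones by about nine to one, the pooled weight sits much closer to the BB value than to the IBB value.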


#31    Colin Wyers      (see all posts) 2008/09/02 (Tue) @ 00:23

I took the second version of BsR from the wiki (the one with SB) and removed IBB, CS and DP from the B term to make sure that all B values were positive. That did the trick. (I also counted reached on error as a single in all terms and an IBB as a walk in the A value only.) I held outs constant at 3.

I also similarly modified the latest RC version, again as per the wiki. Sacrifices were removed from the B term. (I will probably tweak RC a bit further before I publish this on StatSpeak, mostly removing K from the B value.)

Correlation -
RC: .893
BsR: .907

Avg. Error -
RC: .322
BsR: .281

Nothing to get excited over, right? I think that the scatterplot tells a much more interesting story:

RC: http://farm4.static.flickr.com/3213/2820603806_c86a7a010b_o.png
BsR: http://farm4.static.flickr.com/3273/2819761635_b03c87b31c_o.png

Pay attention to the y values - BsR tops off at 15 or so, while RC goes all the way up to 20.


#32    Colin Wyers      (see all posts) 2008/09/02 (Tue) @ 00:42

And because it seems like every other run estimation study uses it, RMSE:

RC: .565
BsR: .451
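
For readers who want to reproduce this kind of comparison, here is a minimal sketch of the three accuracy measures on inning-level data. The sample innings are invented, and treating “Avg. Error” as mean absolute error is an assumption about what was computed above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mean_abs_error(estimated, actual):
    """Average absolute miss per inning (assumed meaning of 'Avg. Error')."""
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(actual)


def rmse(estimated, actual):
    """Root mean squared error; penalizes large misses more than small ones."""
    return math.sqrt(sum((e - a) ** 2 for e, a in zip(estimated, actual)) / len(actual))


# Hypothetical innings: actual runs are integers, estimates are fractional.
actual_runs = [0, 0, 1, 0, 3, 0, 2, 0]
estimated_runs = [0.1, 0.0, 0.8, 0.3, 2.4, 0.2, 1.5, 0.0]

print(round(pearson_r(estimated_runs, actual_runs), 3))
print(round(mean_abs_error(estimated_runs, actual_runs), 3))
print(round(rmse(estimated_runs, actual_runs), 3))
```

RMSE is never smaller than the mean absolute error, so an estimator that occasionally produces a wild value for an inning is punished more by RMSE than by average error.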


#33    tangotiger      (see all posts) 2008/09/02 (Tue) @ 07:29

Colin: In order to make a fair comparison, you should insist that both equations use the same terms.  I’m not sure if you did…

Terps: certainly we can combine the BB and IBB, but we’d also like to see them separately sometimes.


#34    terpsfan101      (see all posts) 2008/09/02 (Tue) @ 08:37

What do the solid red lines on the scatter plot mean? I haven’t seen a scatter plot expressed in this manner before.


#35    Colin Wyers      (see all posts) 2008/09/02 (Tue) @ 09:47

They weren’t quite using the same data - Ks and DPs were the only plays that were different between the two. Removing K from the B term (and fixing an earlier mistake and adding ROE to the A term) raised RC’s RMSE to .583 and average error to .334.

As far as DPs - I’m comfortable with allowing RC and BsR to use different approaches to estimating the number of baserunners because it’s a philosophical, not technical, difference. (All outs are accounted for in the C factor, so the data is being used somewhere.)

The reason the scatterplots look that way is because you can only score an integer number of runs in baseball, whereas run estimators can (and routinely do) produce fractional numbers. When you’re plotting nearly two million points like that, you tend to lose differentiation REAL fast.


#36    Tangotiger      (see all posts) 2008/09/02 (Tue) @ 09:47

They are not solid red lines, but simply show all the data points.  Because there are so many bunched up, it looks like it’s a line.

Suffice to say that the “ranges” are wider for RC than BsR.


#37    terpsfan101      (see all posts) 2008/09/02 (Tue) @ 19:28

The only recommendation I can make for Colin’s Baserun equation is to remove Stolen Bases since you are not including CS. Or you just need to make sure you assign the SB a lower coefficient and LW value if you aren’t going to include CS. It would probably be easiest to not include the SB.


#38    Colin Wyers      (see all posts) 2008/09/03 (Wed) @ 02:28

I actually hadn’t added ROEs to the A factor of BsR - okay, well I had, but they weren’t showing up in the results. That’s fixed, and I added CS and DP to the A factor as well. RMSE goes down to .425.



#40    Peter Jensen      (see all posts) 2008/09/05 (Fri) @ 05:20

So BaseRuns is better at estimating runs scored in an inning than Runs Created or Extrapolated Runs.  What’s the point of knowing that?  It certainly has no bearing on which run estimator is better at evaluating the run contribution of individual players.

BaseRuns is very good at estimating linear weight values for environments where we don’t have play by play data and you and others seem to be working hard to make it even better.  Whether it is as good at that task as Markov simulated linear weights values or as values that could be produced by a more complex simulation has not been tested to my knowledge, but testing that would be a good idea.  But for trying to predict what a player’s future value will be the basis should be an empirically based linear weights derived from the RE charts from a run environment in which a player is playing or is expected to be playing.  If you want to evaluate what a player has done in the past, then the value added approach that Ruane describes in the article you linked is better.

Testing run estimators that are to be used on individual players by their accuracy in estimating team runs produced, whether for a team season or an individual inning, is irrelevant.


#41    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 09:15

It certainly has no bearing on which run estimator is better at evaluating the run contribution of individual players.

There are teams, there are pitchers and there are hitters.  BaseRuns is ideal for teams and individual pitchers, but not ideal for individual hitters.

Whether it is as good at that task as Markov simulated linear weights values or as values that could be produced by a more complex simulation has not been tested to my knowledge, but testing that would be a good idea. 

Actually, I had tested it back in the Fanhome days.  I used my simulator and I tested it against BaseRuns, and I was shocked at how well it was holding up in various extreme environments.  It was a pretty good test actually, as I tested some 25 or so different scenarios ranging from lots of HR, lots of walks to little HR, little walks, and everything in-between.  I made a few other tests as well, moving the singles and doubles up and down.

One can also try their own tests using my Markov:
http://tangotiger.net/markov.html
With the understanding that the Markov overweights walks a lot and singles a little because of the lack of baserunning outs and the presumption that the frequency of each event is the same at each base out state.  That said, you can try some crazy combinations if you like, and see at what points BaseRuns breaks down.

But for trying to predict what a player’s future value will be the basis should be an empirically based linear weights derived from the RE charts from a run environment in which a player is playing or is expected to be playing.

If by empirically-based, you mean using the frequency rates of each event by base/out state and using the transition rates for each event from state-to-state, then I agree wholeheartedly.

If you want to evaluate what a player has done in the past, then the value added approach that Ruane describes in the article you linked is better.

However, I will disagree here.  The RE approach that uses only the actual empirical results is subject to sample size issues.  Indeed, you wouldn’t use the RE data for just 10 games would you?  But you would for 10 million games.  But, what about 1000 games?  In fact, you could (and IIRC you do) have a situation where the run value of the triple exceeds that of the HR if you use the Ruane league/year data.

What you do need is to have 1 million samples at each base/out state.  Barring that, you need to use the empirical event frequency and the event state-to-state transitions to construct a Markov chain to derive Linear Weights for that particular run environment.  Even in that case, if your sample size is too small, you might get problems.

Testing run estimators that are to be used on individual players by their accuracy in estimating team runs produced, whether for a team season or an individual inning, is irrelevant.

Only for individual hitters, not for individual pitchers.  You are probably implying it, but it should be noted explicitly for the benefit of others.
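
A compact sketch of the Markov idea discussed above, not the model behind the linked page. It shares the two simplifications mentioned in the comment (no baserunning outs, the same event frequencies in every base/out state) and adds its own: on a hit every runner advances exactly as many bases as the batter, and a walk only moves forced runners. The event probabilities in `EVENTS` are invented, and all names are hypothetical.

```python
from itertools import product

# Hypothetical per-PA event probabilities (they sum to 1).
EVENTS = {"OUT": 0.68, "BB": 0.09, "1B": 0.15, "2B": 0.045, "3B": 0.005, "HR": 0.03}
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}


def advance(bases, event):
    """Return (new_bases, runs) for a non-out event; bases = (on1, on2, on3)."""
    on1, on2, on3 = bases
    if event == "BB":  # only forced runners move
        if on1 and on2 and on3:
            return (1, 1, 1), 1
        if on1 and on2:
            return (1, 1, 1), 0
        if on1:
            return (1, 1, on3), 0
        return (1, on2, on3), 0
    n = HIT_BASES[event]
    new, runs = [0, 0, 0], 0
    for start in [i + 1 for i, on in enumerate(bases) if on] + [0]:  # 0 = batter
        if start + n >= 4:
            runs += 1
        else:
            new[start + n - 1] = 1
    return tuple(new), runs


def step(bases, outs, event):
    """Next (bases, outs) and runs scored on the play."""
    if event == "OUT":
        return bases, outs + 1, 0
    new_bases, runs = advance(bases, event)
    return new_bases, outs, runs


def run_expectancy(events, iters=500):
    """Expected runs to the end of the inning from each of the 24 base/out states."""
    states = [(b, o) for b in product((0, 1), repeat=3) for o in range(3)]
    re = {s: 0.0 for s in states}
    for _ in range(iters):  # fixed-point iteration; converges because P(out) > 0
        re = {
            (b, o): sum(
                p * (runs + (re[(nb, no)] if no < 3 else 0.0))
                for ev, p in events.items()
                for nb, no, runs in [step(b, o, ev)]
            )
            for b, o in states
        }
    return re


def linear_weights(events, iters=500):
    """Average change in run expectancy (plus runs scored) for each event."""
    re = run_expectancy(events, iters)
    dist = {((0, 0, 0), 0): 1.0}  # chance of reaching each state during an inning
    value, visits = {ev: 0.0 for ev in events}, 0.0
    for _ in range(iters):
        nxt = {}
        for (b, o), w in dist.items():
            visits += w
            for ev, p in events.items():
                nb, no, runs = step(b, o, ev)
                after = re[(nb, no)] if no < 3 else 0.0
                value[ev] += w * (runs + after - re[(b, o)])
                if no < 3:
                    nxt[(nb, no)] = nxt.get((nb, no), 0.0) + w * p
        dist = nxt
    return {ev: value[ev] / visits for ev in events}


for name, weight in linear_weights(EVENTS).items():
    print(name, round(weight, 3))
```

Because the weights come from the chain rather than from observed outcomes in each state, rare states cannot produce oddities such as a triple outranking a home run in this simplified model; the price is whatever bias the advancement assumptions introduce.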


#42    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 09:40

One of the breaking points to BaseRuns occurs when you go to my Markov page, and simply change “AB” to 13 and SO to 1.  After you hit CALCULATE, go to the bottom, and you will see that the run value of the triples exceeds that of the HR according to BaseRuns.  Markov of course is never wrong on this front.

It is a small blip that exists on a very extreme environment, one we’ll never come across.  A year or two ago, I noted an alternative to BaseRuns that would prevent the triples-issue from materializing.  It was based on Markov-testing, and worked pretty well.  Let me dig through the archives…


#43    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 09:46

Also, change the AB to 10.5 and the SO to 0.  You will see that both Markov and BaseRuns have the positive Linear Weights converge toward 1.0.  Then, look at Runs Created.  Runs Created makes the run value of the walk nearly 0 and that of the HR nearly 4.0, both of which are ridiculous.


#44    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 09:50

Here is my fix to the Triples-problem:

http://www.insidethebook.com/ee/index.php/site/comments/updating_baseruns/

It’s actually pretty cool.  I forgot I did that.


#45    terpsfan101      (see all posts) 2008/09/05 (Fri) @ 12:35

Tangotiger,

On your Markov page, where you enter the batting line, the SO column appears to be cosmetic. It doesn’t matter what you type in there, 0 or 1000000.  I also noticed the Markov always gives the same value for the SO as it does for the Batting Out. Does the “Triples fix” work only at the lower end of the run spectrum? I’ve seen +1 LW from Softball and Amateur leagues where Baseruns predicts a higher LW value for the 3B than it does the HR.


#46    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 13:23

The SO is not cosmetic.  You should try changing it from 7 to 10 and see the difference.  I just tried it and there is a difference.

Note that I make no validation check, so if you enter 100 SO and you only have 30 AB, I wouldn’t necessarily trust the results.


#47    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 13:24

“I’ve seen +1 LW from Softball and Amateur leagues where Baseruns predicts a higher LW value for the 3B than it does the HR.”

I believe it.  However, try it with my Triples fix, and you will see that problem go away.


#48    Colin Wyers      (see all posts) 2008/09/05 (Fri) @ 13:29

Peter, that’s like asking why anyone would ever want a screwdriver when you have a perfectly nice hammer.

Arguing over the correct run estimation methods for individual hitters from 1950-2008 is, to me, spectacularly boring, even if I do seem to devote much of my energies to it. It’s not an entirely solved problem, but for the most part we’re 95% there, and all that’s left is to fight over that last 5% and proselytize. (Because even though we have very robust TT BsR models and empiric linear weights values, half the world is still using Runs Created, and they’re the advanced half!)

So, if your only interest is in studying the hitting contributions of individual hitters to an average team, then you already pretty much have everything you would ever need.

But if we want to move over into a different problem space - like evaluating pitching or modeling the impact of a hitter on a specific team or figuring out the correct lineup or using individual player projections to model team wins - we need another set of tools altogether.

And again, if that’s all you’re into, then Patriot and Tango have both published very good versions of BsR, and Smyth’s BaseRuns Primer versions are very robust as well. So those tools are out there.

The purpose of this series of articles is multifold. One, to showcase the issues I have with the way Runs Created is constructed. Two, to actually get my hands dirty with run estimators - if someone finds these articles half as educational in how BsR works as writing them was for me I’ll be thrilled. A third purpose has arisen as I’ve started writing - to see if I can’t make some refinements to BaseRuns to address some of the B factor issues at extreme run scoring environments.

That said, I’d like to think all of this is useful to some line of baseball analysis in the future, but I won’t be upset if it isn’t. I’ve found writing these articles to be interesting and worthwhile for their own sake, and I’ve got some feedback that indicates that there are at least some readers who enjoy reading the analysis as well. I have one paid article to my credit, and teams aren’t beating down my door to hire me as an analyst, so if I didn’t think this sort of thing was fun in its own right I wouldn’t be doing it.

If you don’t enjoy it, that’s fine - different strokes to different folks. But that’s the point of it.


#49    Peter Jensen      (see all posts) 2008/09/05 (Fri) @ 13:44

I believe that the limitation of a Markov having the identical batting line for each BaseOut state introduces more error in an RE table than the small sample sizes of the rarer BaseOut states in an empirical based RE table. But the value added method is not dependent on an empirical RE.  Gary Skoog and I had many discussions about this in the winter of 1987.  He opted for using his Markov model for generating the RE table for his linear weight values in the 1987 Abstract article; I chose to use 1986 Project Scoresheet PBP data to generate the RE table I used to calculate linear weights in my 1987 SABR presentation.  There were only minor differences if I recall correctly.  Obviously, my RE table would have been even more accurate if I had multiple years of data available to me at the time, but 1987 was the first complete year available from project scoresheet.


#50    Peter Jensen      (see all posts) 2008/09/05 (Fri) @ 13:53

Colin - I have no problem with your efforts to improve BaseRuns as I have stated elsewhere.  But that is not how you presented these articles.  You presented them as a comparison of run estimators.  My post #40 is saying that the comparison you presented is virtually meaningless for any purpose except estimating how many runs will be scored in an inning, which is trivial.


#51    terpsfan101      (see all posts) 2008/09/05 (Fri) @ 13:54

Peter,

You said: “But the value added method is not dependent on an empirical RE.” Tom Ruane’s RE charts are empirically derived, right?


#52    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 14:28

I believe that the limitation of a Markov having the identical batting line for each BaseOut state…

The Markov on my website does that.  However, the Markov in The Book *does not* do this.  Indeed, I take the exact frequency of each event by base/out state and the exact state-to-state for each frequency at each base/out state.

In fact, The Book shows it in 3 ways:
- the pure empirical approach (Table 4), which is runs to end of inning minus the starting RE for that base/out state
- the empirical approach (Tables 6, 7), which used the RE of each base/out state, and then uses the frequency of each event at those base/out states
- the Markov-empirical approach (Table 11, but denominated in wins), which uses the frequency and state-to-state by event/base/out, and makes it identical by inning/score
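
For readers who want to see the "runs to end of inning" idea behind the first approach in concrete terms, here is a minimal Python sketch. It only computes the empirical RE part (average runs scored from a base/out state to the end of the inning). The input layout, the state labels, and the name `empirical_re` are made up for this example; this is not the code behind the tables in The Book.

```python
from collections import defaultdict

def empirical_re(innings):
    """Average runs scored from each base/out state to the end of the inning.

    innings: list of innings; each inning is a list of
    (state_before_play, runs_scored_on_play) tuples in batting order.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for inning in innings:
        remaining = sum(runs for _, runs in inning)  # runs still to come
        for state, runs in inning:
            totals[state] += remaining
            counts[state] += 1
            remaining -= runs                        # this play's runs are now banked
    return {state: totals[state] / counts[state] for state in counts}

# Hypothetical two-inning sample: states are (outs, bases) with "---" = bases empty.
sample_innings = [
    [((0, "---"), 0), ((1, "---"), 1), ((1, "---"), 0), ((2, "---"), 0)],  # solo HR with 1 out
    [((0, "---"), 0), ((1, "---"), 0), ((2, "---"), 0)],                   # three up, three down
]
re_table = empirical_re(sample_innings)
print(re_table)
```

With a sample this small the table is dominated by the one home run, which is the kind of sample size problem the purely empirical approaches have to live with.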

In any case, the only reason Peter’s approach (approach #2 in my above list) works is because of sample size.  If you look just at, say, 1958 NL, you’ll get problems.  Maybe not with that specific league/season, but at some point.  You need sample size for approach #2 to work.

Take the extreme case: one game, where the score is 1-0, with 1 HR and 10 hits and 5 walks.  What do you think the RE is for the 23 base/out states that exclude bases empty 0 outs?  Exactly .000.  The bases empty 0 outs RE will be 1/PA, say .022.  Does that make any sense?  No of course not.  How about 10 games?  100?  1000?  1 million?  How many games do you need?

If you look at Ruane’s data in the 1960s, you’ll see some impossible results.  So, you need to be careful.


#53    terpsfan101      (see all posts) 2008/09/05 (Fri) @ 14:44

Tom,

Using your Custom Linear Weights Spreadsheet, the LW value of the 3B doesn’t pass the value of the HR until approximately 36 to 37 runs per game. Obviously we wouldn’t need a “triples fix” in higher run environments.


#54    Peter Jensen      (see all posts) 2008/09/05 (Fri) @ 14:54

Tango - To calculate RE’s empirically you should always use the largest multiyear sample that has approximately the same runs per inning to minimize the small sample size issues.  So should your Markov for determining the event frequency of each event for each BaseOut State. So the effect of sample size should be identical no matter how many iterations you run the Markov.  If you use a larger sample size to determine the event frequency for the Markov then you run the risk that the event frequency is not representative of the run environment you are trying to model.

I would be willing to bet that if you asked Ruane today he would advocate using multiple years for the RE tables rather than the single league season he used in his earlier article.

I don’t know much about BaseRuns but doesn’t it also suffer from the limitation of using the league average batting line for all the players on its theoretical team?


#55    Tangotiger      (see all posts) 2008/09/05 (Fri) @ 15:26

So should your Markov for determining the event frequency of each event for each BaseOut State. So the effect of sample size should be identical no matter how many iterations you run the Markov.

While you are correct that I am also beholden to sample size issue in determining both the frequency of events and the state-to-state rates for each event, the sample size is less of an issue for me, and not identical.

We already see with the Ruane data that just one thousand games gives you kooky results.  If I had 1000 games, the state-to-state rates for the single with man on 1b, 0 outs and man on 1b, 2 outs would not be kooky in the least.  I’m not going to get a 35% advancement for the former and a 30% advancement for the latter after 1000 games.  I’m not sure how many games I would need, but it would be less than whatever the RE approach requires.

***

http://www.retrosheet.org/Research/RuaneT/va_efr_dat.htm

The very first table: RE, 1960NL, with 2 outs:
man on 2B: .323
man on 3B: .320

That is obviously impossible.

Over in the other league, the same year, it’s .377, .401.

The next year in the NL: .332, .426.

I can guarantee you that if I were to create an RE chart using Markov (that is, use the frequency of each event at the 24 base/out states, and the state-to-state transition rates for each event at those base/out states), I would not get the crazy 60NL RE chart.

Indeed, the ONLY way for Markov to give me the empirical results of 60NL (2 out scenarios) would be for the frequency of positive events with man on 3B 2 outs to be much worse than the frequency of positive events with man on 2B 2 outs.

I would bet that the 2 out issue in 60NL, man on 2b and man on 3B is the result of not only much worse hitting with man on 3B (by luck), but that the guys who managed to hit the rest of that inning (as few as they are) would also have managed to have worse hitting.

That is, if the OBP/SLG of the guys at bat with man on 2B, 2 outs was say .320/.390, maybe the man on 3B, 2 outs was say .300/.370.  Not only that, but then the successive batters in that inning was also worse with the second scenario than the first.  The Markov approach would make the successive batters exactly equal to whatever the sample performance was at the new base state.
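
To make the Markov side of this concrete, here is a deliberately simplified Python sketch of building an RE chart from event frequencies and transitions. Everything in it is an assumption for illustration: the batting line in `PROBS` is invented, the same line is used in all 24 states (the state-specific frequencies described above are not modeled), runners advance station-to-station, and outs never move runners. It is not the Markov from the website or from The Book.

```python
# Hypothetical batting line, identical in every base/out state (assumption).
PROBS = {"OUT": 0.68, "BB": 0.09, "1B": 0.15, "2B": 0.045, "3B": 0.005, "HR": 0.03}
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

def transition(event, bases):
    """Return (new_bases, runs). Bases are a bitmask: 1 = 1B, 2 = 2B, 4 = 3B."""
    if event == "BB":
        for base in (1, 2, 4):            # a walk fills the lowest open base;
            if not bases & base:          # runners behind it were forced up
                return bases | base, 0
        return bases, 1                   # bases loaded: a run is forced in
    n = HIT_BASES[event]
    shifted = (bases << n) | (1 << (n - 1))   # batter and runners all move n bases
    runs = bin(shifted >> 3).count("1")       # anyone pushed past 3B has scored
    return shifted & 0b111, runs

def run_expectancy(probs, sweeps=200):
    """Solve the chain by repeated substitution, working back from 2 outs to 0 outs."""
    re = {(3, bases): 0.0 for bases in range(8)}   # inning over
    for outs in (2, 1, 0):
        for bases in range(8):
            re[(outs, bases)] = 0.0
        for _ in range(sweeps):
            for bases in range(8):
                value = probs["OUT"] * re[(outs + 1, bases)]  # outs leave runners in place
                for event, p in probs.items():
                    if event != "OUT":
                        new_bases, runs = transition(event, bases)
                        value += p * (runs + re[(outs, new_bases)])
                re[(outs, bases)] = value
    return re

RE = run_expectancy(PROBS)
print("2 outs, man on 2B:", round(RE[(2, 0b010)], 3))
print("2 outs, man on 3B:", round(RE[(2, 0b100)], 3))
```

Because the chart is generated from one consistent set of rates, the man on 3B state comes out ahead of the man on 2B state with 2 outs in this toy model; an inversion like the 1960 NL one could only appear if the inputs themselves were worse with the man on 3B.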


#56    Peter Jensen      (see all posts) 2008/09/05 (Fri) @ 19:21

Tango - My guess would be that there were more home runs hit by the two batters following the 202 BaseOut State than the 32 BaseOut State. Not surprising since a lot of triples are hit by #1 hitters and lots of doubles by #3 hitters.  I don’t have 1960 in my Retrosheet database or I would check out what was odd about the event distribution in the NL.


#57    tangotiger      (see all posts) 2008/09/05 (Fri) @ 20:11

Sean Forman to the rescue:
http://www.baseball-reference.com/pi/bsplit.cgi?lg=NL&team=TOT&year=1960#situa-bases

Man on 3B, 2 out, the batting average was a paltry .223, compared to a league .255.

In 1961 NL, it was .259 (compared to a league overall of .262), and in 1960 AL it was .261 compared to .255 overall.

Clearly, the .223 was a sample size issue.


#58    Colin Wyers      (see all posts) 2008/09/06 (Sat) @ 02:12

Peter: I’m testing run estimators at their ability to estimate run scoring, as the term “run estimator” implies their purpose is. So far I’ve left the reason for wanting to use a run estimator as an exercise to the reader, and in fact have made several disclaimers that the best approach to run estimation is in fact not the best approach to estimating the value of individual hitters.

By doing this test on the inning level, rather than the team-season level that is traditionally used, I’m increasing my sample size of test cases and hopefully mitigating a potential source of bias (for instance, a team of otherwise poor hitters that happens to hit a disproportionate number of triples would artificially depress the value of the triple).

At this point we really seem to be talking past each other - as far as I can tell, you are using the term “run estimator” to refer to the practice of estimating an individual hitter’s contribution to scoring, which is certainly one use of a run estimator, in fact probably the most common use of run estimation.

But I’m simply looking at run estimation, broadly defined as estimating run totals from component stats. Individual hitters are like team run scoring totals - as I lay out in part one, the spread in run scoring talent between teams (or between full-time starters) is so small that almost any reasonably-constructed run estimator will perform well enough. Dan Fox did a very readable study:

http://www.hardballtimes.com/main/article/a-closer-look-at-run-estimation/

And the difference between the best and worst run estimator in R was .0042. Hardly a difference at all, right?

But you put RC or XR up against, say, Pedro Martinez, and you end up getting much worse results. All of this really grew out of my studies of run estimators on pitchers, not hitters:

http://mvn.com/mlb-stats/2008/08/15/creating-a-dynamic-fip-with-baseruns/


#59    Peter Jensen      (see all posts) 2008/09/12 (Fri) @ 11:00

Colin - Read your part 3.  Your run values in your RE table are low for the current MLB run environment.  What are you using for your time period and why did you select it in particular?


#60    Tangotiger      (see all posts) 2008/09/12 (Fri) @ 11:02

http://mvn.com/mlb-stats/2008/09/12/what-run-estimator-would-batman-use-part-iii/#more-626

It looks like Colin is using the full Retrosheet data, as it looks close to what I did here:
http://tangotiger.net/retrosheet/reports/re.htm


#61    Colin Wyers      (see all posts) 2008/09/12 (Fri) @ 11:50

Precisely; the linear weights values are derived from the same dataset that I’m using as a test case. As always, this will produce the best results for your dataset; the Retroera is a common enough research subject that I think these weights might have practical applications beyond what I’m doing with them, but as always with LWTS you can get better results by selecting LWTS designed for the specific environment you’re interested in.


#62    Peter Jensen      (see all posts) 2008/09/12 (Fri) @ 12:16

To figure absolute runs, we need to start the inning off from 0. So we take that .486 and divide by 3 to get .162. Then we add that value to our events where an out is made.

Tango - Who decided that this was the best way to change runs above average to absolute runs?  I am asking you because this is the way that you suggest on your web site and I assume Colin is just following your lead.  It seems to make no theoretical sense to reward making an out by increasing its value relative to other events, particularly if you are going to use the linear weights values to calculate the absolute run contributions of individual batters or pitchers.  It seems much more logical to divide up the average runs among PAs that DON’T result in an out as those PAs help to extend the inning.

Precisely; the linear weights values are derived from the same dataset that I’m using as a test case. As always, this will produce the best results for your dataset;

If your aim is to eventually compare BaseRuns generated values to linear weights values for the same time period then it really makes no difference what time period you choose.  But the RE values for such a large time period that includes diverse run environments are useless for creating linear weights to compare individuals.  The utility of both linear weights and BaseRuns is that each can be customized for the particular run environment that you are interested in. Providing, of course, for linear weights that you have PBP data for to create the RE table for that time period.


#63    Tangotiger      (see all posts) 2008/09/12 (Fri) @ 12:40

Peter, that article I wrote was from a long time ago.  At some point after I did that, this has become my position:
http://tangotiger.net/reconcile.html


#64    Peter Jensen      (see all posts) 2008/09/12 (Fri) @ 13:20

Tango - I read the reconcile and it did seem that you were moving away from your previous position.  Unfortunately the extreme hypothetical example you chose doesn’t give much practical help on how you would apply its results in the real world of linear weights and so people like Colin are still using your old method.  Which is also the method given for adjusting to absolute runs on Wikipedia under linear weights.

I see in other threads that you have suggested
RC = (.12 * PA) + Lwts.  Do you think that adjusting all PAs gives a better estimate of run value than adjusting just the non-out PAs?


#65    Colin Wyers      (see all posts) 2008/09/12 (Fri) @ 13:42

Actually, the method I used wasn’t directly based on anything Tango wrote - I probably did read Tango’s article, but at the time I was thinking about an article Dan Fox wrote for THT. (I see now in looking at the article again that Fox references the article you’re referring to.)

But the RE values for such a large time period that includes diverse run environments are useless for creating linear weights to compare individuals.

Which… isn’t what I’m doing. So, yeah, I’ve created linear weights that aren’t optimal for a purpose other than my own. They’re optimal (or at least, appropriate) for what I want to do with them.

Tomorrow hopefully I’ll have a follow-up that will shine some light on the issue; I couldn’t get it finished in time to publish this morning, but I’m working on a RE table that starts from zero runs. Since I don’t have the RE table sitting in front of me yet (I have the query written and will hit “Go” on it when I leave for work), I don’t know if it will shed any light on the issue or not, but that’s the direction I’m headed in.


#66    Tangotiger      (see all posts) 2008/09/12 (Fri) @ 14:04

Peter/64: I think if I were to adjust based only on the non-outs, then that reconcile.html post won’t… uhm… reconcile.  Granted, I haven’t given it much thought beyond when I wrote that.  I’ll have to think about it some more to make sure.

Also note (not necessarily for Peter’s behalf, but everyone in general) that I only use extreme cases to highlight the issue, not necessarily as something that needs to satisfy every nuance of a metric.


#67    Colin Wyers      (see all posts) 2008/09/12 (Fri) @ 14:24

Alrighty then - we will make amends ere long; Else the Puck a liar call.

I don’t promise these are perfectly appropriate, either. In fact, they share some of the flaws we’ve already discussed with the Ruane weights. But, here are my Retrosheet linear weights, broken down by year and league:

http://www.editgrid.com/user/cwyers/Linear_Weights_and_Run_Expectancy,_by_League_and_Season

Full RE tables are available as well. These are runs above average, just like Batting Runs and unlike ERP/XR. Double plays are not broken out into a separate category. Any of the bolded values in the LWTS_Pivot table are not actual values but an approximation. I have done no validation of any of these. Partial innings excluded.


#68    Colin Wyers      (see all posts) 2008/09/19 (Fri) @ 03:09

The (hopefully) penultimate section is now available:

http://mvn.com/mlb-stats/2008/09/19/what-run-estimator-would-batman-use-part-iv/

As with last week, I ran out of steam at the end, but unlike last week, I at least feel like I’ve contributed something meaningful that hasn’t been done before. Next week, the test.


#69    Peter Jensen      (see all posts) 2008/09/19 (Fri) @ 09:02

I read the new installment.  It would be helpful to me to know the exact methodology that you used in creating your new RE table and the new linear weights table.  It appears that you are still adjusting the out value to get “real” run values instead of runs above average.  I am not sure what useful purpose you have in mind (referring to your post #65) where that would be the optimal methodology.  I expect it would give a very high correlation in estimating runs scored in an inning, but that would amount to nothing more than an accounting parlor trick and would not be indicative of any usefulness for any other purpose. 

For better or worse, I will not be able to participate in any further discussions until I return from a trip on Monday.


#70    Tangotiger      (see all posts) 2008/09/19 (Fri) @ 09:58

Colin: I think the LWTS chart would look more readable if it were sorted by LWTS values.


#71    Colin Wyers      (see all posts) 2008/09/19 (Fri) @ 11:46

Peter, in this case, it’s not a trick of the light at all, but a result of how I calculated the RE values. I used the “absolute RE” table I generated, which is simply the average chance of a runner on that base scoring given that number of outs.

From that point on, it’s the same basic concept as generating any other set of empiric linear weights - simply looking at the change in RE from before and after the event. I did not adjust the values of the out afterward; by simply looking at the runners on base and not the “inning-killing” value of the out, you get “absolute” LWTS values, which is what you need if you’re using LWTS as a way to generate parts of a dynamic run estimator.

As far as resorting the LWTS table - does anyone else agree? If that’s the general consensus, I can do that. Alternately, I can try and see if I can get a sortable table in there. I’ve done it before, but I don’t know if the JavaScript will work with MVN’s hosting platform.


#72    David Smyth      (see all posts) 2008/09/19 (Fri) @ 18:06

Colin, I scanned the pt 3 and 4 articles. I don’t get what the -.022 out value is supposed to represent. We know that the lwts value is -.30, and the absolute value -.10. Does the -.022 represent the difference from the ‘total’ value of .145 per ‘event’?

A bit more clarification than you gave in the brief articles is called for, IMO.


#73    Colin Wyers      (see all posts) 2008/09/19 (Fri) @ 18:21

David - Do you mean the -.024 value from the chart in Part IV? That is the value of any event with Retrosheet event code 2 ("Generic Out"), with double plays excluded. Generally speaking, that’s an out on a ball in play that doesn’t fall into any other category. (Outside of the double/triple plays, everything corresponds to a Retrosheet event code.)


#74    david smyth      (see all posts) 2008/09/19 (Fri) @ 18:31

So, how would you represent the ‘generic out’ in regular terms? Is it AB-H-SO-ROE+SF+SH -FC, or something like that??


#75    david smyth      (see all posts) 2008/09/19 (Fri) @ 18:37

Just to expand, when I tried the formula on a dataset I have memorized, using O as simply AB-H+SF, I got a RC estimate less than half of what it should have been.

I can’t believe that you could be making such a huge mistake, so obviously I am not understanding the formula.


#76    terpsfan101      (see all posts) 2008/09/19 (Fri) @ 18:40

The Generic Out event code in Retrosheet includes all outs on Balls in Play except Fielder’s Choice. There are also a few “Reached on Error” included in there as well. So David is correct in saying that AB-H-SO-ROE+SF+SH-FC would be the best way to represent Generic Out in regular terms. Colin is defining generic out as AB-H-SO-ROE+SF+SH-FC-DP-TP.


#77    Colin Wyers      (see all posts) 2008/09/19 (Fri) @ 18:56

David - No, I could be making that big of a mistake. (The only defense I have is that I did say right there that I hadn’t tested the formula at all.) Check the comments for an updated formula that actually has been tested and works well with real live data.


#78    Peter Jensen      (see all posts) 2008/09/23 (Tue) @ 07:54

I’m back. Did you miss me?

...in this case, it’s not a trick of the light at all, but a result of how I calculated the RE values.

Well, Colin, that’s exactly my point, isn’t it?  You chose to create your RE table in a particular way and didn’t adequately explain the methodology. But it is clear that the entire “run value over average” is created by choosing to assign no negative value to an out that happens when no one is on base.  This is obviously theoretically incorrect and distorts everything that comes afterward.

The batter does have a chance of scoring eventually.  In a traditional RE table that is the positive value for the 0,1, and 2 states.  That value has to be accounted for in some way.  It was your choice to account for it by ignoring the loss of value when the batter makes an out.  But you could just as easily have decided that that value is actuated only when that batter gets on base and therefore distributed the positive value to each event that gets the batter on base.  That, for me, is the theoretically more correct choice, although Tango has divided up that value between all the events by giving a positive value to all PAs. 

It would still be helpful to me if you would provide what I requested on Friday, a detailed step by step description of your methodology for creating the RE table and the linear weights table.


#79    Colin Wyers      (see all posts) 2008/09/23 (Tue) @ 12:40

Peter:

The batter does have a starting run expectancy - something like .15 runs in the 0 out state, and decreasing from there - that isn’t accounted for. I have revised totals that I’ll be publishing on Friday to account for that. (This changes the values for the generic out, strikeout, and double/triple play.)

RE values are calculated simply as:

Number of times that runner scores / number of runners

Or, precisely:

SUM(IF(RUN1_FATE_ID>3,1,0))/SUM(IF(RUN1_ORIGIN_EVENT_ID > 0,1,0)) AS RUN1_RE

Etc.
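
The "times that runner scores / number of runners" ratio above can be sketched in a few lines of Python. This is only an illustration of the ratio itself: the record layout `(base, outs, scored)`, the sample data, and the name `runner_scoring_rates` are invented here, and the boolean `scored` flag merely stands in for whatever the `RUN1_FATE_ID > 3` test encodes in the actual query.

```python
from collections import defaultdict

def runner_scoring_rates(runners):
    """Share of runners who eventually score, keyed by (base, outs).

    runners: iterable of (base, outs, scored) records, one per runner
    observed on that base with that many outs.
    """
    scored = defaultdict(int)
    seen = defaultdict(int)
    for base, outs, did_score in runners:
        seen[(base, outs)] += 1
        scored[(base, outs)] += 1 if did_score else 0
    return {key: scored[key] / seen[key] for key in seen}

# Hypothetical records: four runners on 1B with 0 outs, two on 3B with 2 outs.
sample = [
    (1, 0, True), (1, 0, False), (1, 0, False), (1, 0, False),
    (3, 2, True), (3, 2, False),
]
rates = runner_scoring_rates(sample)
print(rates)
```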

Beyond that, I “walked” the RE tables whenever there was a state change. If a batter reached base safely, the value of being at that base was tracked. If a runner advanced or scored, that change in RE was tracked. If an out was made on base, that change in RE was tracked. Then, finally, the change in RE for baserunners based on the change in outs was tracked. (Prior to that, advancement was calculated based upon the STARTING outs, not the final outs.) In the revised table, the loss of a batter’s starting RE is tracked as well. Then all those values are summed to create the linear weights.

Correct me if I’m wrong, but what you’re suggesting is that I simply scale the value of the positive events to be higher, and keep the out values from the runs above average LWTS the same, correct?


#80    Colin Wyers      (see all posts) 2008/09/23 (Tue) @ 18:18

Let’s give this a shot. This was done real quick with pen-and-paper, using only 2008 seasonal data to derive the PA/Outs/etc. values and should be taken as examples or illustrations.

Batting Runs:

(.47*H)+(.38*D)+(.55*T)+(.93*HR)+(.33*(BB+HBP))+(.22*SB)+(-.38*CS)+(-.26*(AB-H))

It’s not the best LWTS formula in the world, but it’s good enough for the purposes of an example.

Now, here’s BR recast as a runs above zero formula, rather than runs above average, adding the difference (.16) to outs:

(.47*H)+(.38*D)+(.55*T)+(.93*HR)+(.33*(BB+HBP))+(.22*SB)+(-.38*CS)+(-.09*(AB-H))

Now, let’s try that again, except using Tango’s approach of adding the difference (.12) to all PAs:

(.55*H)+(.50*D)+(.67*T)+(1.05*HR)+(.45*(BB+HBP))+(.22*SB)+(-.38*CS)+(-.14*(AB-H))

[Note: since D, T and HR are included in H, I didn’t add .12 to H, simply .12 times the number of singles divided by the total number of hits. This isn’t precise, because I’m only looking at 2008 data - again, illustration only.]

Now, one last time, adding the difference (.35) only to positive events, which is what I’m understanding Peter’s position to be:

(.60*H)+(.73*D)+(.90*T)+(1.28*HR)+(.68*(BB+HBP))+(.22*SB)+(-.38*CS)+(-.26*(AB-H))

Am I misrepresenting either of you, Tom and Peter? Or does that pretty much cover the different approaches?


#81    tangotiger      (see all posts) 2008/09/23 (Tue) @ 19:55

Colin: you just need to add the .12 to H, and not to D, T, or HR.  After you think about it, you should say “doh!”
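
A quick way to see the point: in a formula where D, T and HR are counted on top of H, the full value of a double is the H coefficient plus the D coefficient. The Python sketch below uses the Batting Runs coefficients from post #80; the helper names are made up for illustration.

```python
# Hit coefficients from the Batting Runs formula in post #80.
BATTING_RUNS = {"H": 0.47, "D": 0.38, "T": 0.55, "HR": 0.93}

def value_by_hit_type(w):
    """Full run value of each hit type when extra-base hits also count as H."""
    return {
        "1B": w["H"],
        "2B": w["H"] + w["D"],
        "3B": w["H"] + w["T"],
        "HR": w["H"] + w["HR"],
    }

def add_to(weights, keys, amount):
    """Return a copy of the weights with `amount` added to the listed coefficients."""
    return {k: v + (amount if k in keys else 0.0) for k, v in weights.items()}

base = value_by_hit_type(BATTING_RUNS)
h_only = value_by_hit_type(add_to(BATTING_RUNS, {"H"}, 0.12))
everything = value_by_hit_type(add_to(BATTING_RUNS, {"H", "D", "T", "HR"}, 0.12))

gain_h_only = {k: round(h_only[k] - base[k], 2) for k in base}
gain_everything = {k: round(everything[k] - base[k], 2) for k in base}
print("add .12 to H only:      ", gain_h_only)
print("add .12 to H, D, T, HR: ", gain_everything)
```

Adding .12 to H alone raises every hit by .12; adding it to D, T and HR as well raises the extra-base hits by .24.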


#82    Colin Wyers      (see all posts) 2008/09/23 (Tue) @ 20:44

Tom: It’ll all make sense in the end, I hope. Just want to make sure I’m correctly understanding Peter’s position before I proceed.


#83    Colin Wyers      (see all posts) 2008/09/23 (Tue) @ 21:07

And, actually, you’re right. I thought I couldn’t because of what I was doing, but I could. I say doh.


#84    Peter Jensen      (see all posts) 2008/09/24 (Wed) @ 00:21

Colin - Thanks for taking the time to describe your process to me. Yes, the position I stated in the previous post was to add the runs just to the positive events.  But upon consideration I am not sure that doing that doesn’t distort things too much in the other direction.  It may be more important to keep the relative values of positive and negative events similar to they are in runs above average linear weights. 

I do believe that your RE table for the run scoring potential from each base may be a helpful construct.  In addition to adding the run potential of the batter as you plan to do, it may be helpful to add a separate bin for the run potential of all the batters following the batter as well.  I need to think about all the possibilities some more.


#85    Tangotiger      (see all posts) 2008/09/24 (Wed) @ 09:35

Right, I would suggest going to that file I keep pointing to:
http://www.tangotiger.net/reconcile.html

Try to construct some relatively extreme conditions, and I think you will probably end up with the same conclusion as I do.

I would basically find it odd to see one guy have a higher runs above average than another guy, and, in the same number of PA, then have the second guy have more “runs created” than the first.  What is a reader supposed to do with that information?


#86    Peter Jensen      (see all posts) 2008/09/24 (Wed) @ 10:46

Tango - Having done just what you suggested and finding it easy to construct such a scenario if one adds the runs to the outs is why I am convinced that that method is incorrect.  And looking at your reconcile file example was what initially suggested to me that distributing between all positive events was the theoretically correct method just because it seemed intuitively correct to treat all non-outs as being equally responsible for extending the inning.  But that may only be true for distributing the runs that are due to extending the inning, and not all runs above average.  I have found the issues too complex and the differences too subtle to be able to create a hypothetical that clearly demonstrates an unambiguous theoretically correct method of distributing all the runs created above average.


#87    Tangotiger      (see all posts) 2008/09/24 (Wed) @ 11:17

The third suggestion, one that says to distribute the runs only to the positive events will give us this chart:

Player      Runs Above Average      Runs Created
---------------------------------------------------
superstar         +6.75              12.15
player1           +0.00               2.70
player2           +0.00               2.70
...
player8           +0.00               2.70
---------------------------------------------------
Total             +6.75              33.75

(Note to readers: read the reconcile.html file first, or this won’t make any sense to you.)

Remember, if we had 9 average players, they’d score 27 runs, and have created 3.00 runs each.  By distributing the excess runs only on the positive events, we are making these players as having created 2.70 runs each.

To me, it seems fairly correct that the second chart in my file, the one that shows that the 8 average players created 3.00 runs each, regardless of what the ninth guy did is the one that makes the most sense.  Anything outside of what these 8 guys did is attributed to the ninth guy.  And, it maintains the relative relationship among these players, whether you look at runs above average, or runs created.

However, I can understand your, and many people’s, reluctance to accept this as correct.  It is only correct under the assumptions laid out.


#88    Peter Jensen      (see all posts) 2008/09/24 (Wed) @ 14:32

My reasoning would be a team of average guys create 27 runs in 54 ABs. After 54 ABs the team with the superstar has created 30 runs, 6 directly by the superstar and 3 by each of the 8 average players.  But because they have 3 outs left the team gets to bat 6.75 more times and score 3.75 more times.  It is this 3.75 runs that gets distributed to the 9 players in proportion to the number of PAs (out of the 54) that they reached base safely.  Superstar hit safely 6 times and each of the other batters hit safely 3 times, so superstar gets an extra .75 runs and each average player gets an extra .375.  Hence example 1 is correct. 
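
The arithmetic in that paragraph can be retraced with a short Python sketch. The one thing it assumes beyond the numbers quoted above is that the team keeps making outs and scoring at the same per-PA rates over the extra plate appearances; that assumption reproduces the 6.75 and 3.75 figures, but it is an inference, not something stated in the reconcile file.

```python
TEAM_OUTS = 27          # outs available to the team
PA_SO_FAR = 54          # six PA for each of nine batters
RUNS_SO_FAR = 30        # 6 by the superstar, 3 by each of the 8 average players
N_AVERAGE = 8
safe = {"superstar": 6, "average": 3}   # times reached safely in those six PA

total_safe = safe["superstar"] + N_AVERAGE * safe["average"]   # 30 times on base
outs_used = PA_SO_FAR - total_safe                             # 24 outs made
outs_left = TEAM_OUTS - outs_used                              # 3 outs still to play with

# Assumption: the same out rate and run rate apply to the extra PAs.
extra_pa = outs_left / (outs_used / PA_SO_FAR)
extra_runs = extra_pa * RUNS_SO_FAR / PA_SO_FAR

# Split the extra runs in proportion to times reached base safely.
bonus = {name: extra_runs * times / total_safe for name, times in safe.items()}

print("extra PA:", round(extra_pa, 3), "extra runs:", round(extra_runs, 3))
print("bonus per player:", {k: round(v, 3) for k, v in bonus.items()})
```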

I can also understand your logic for example number 2.  Each average player is EXPECTED to get on base half the time, so all the extra run value should go to the superstar who is the only player who exceeds the average on base percentage.

In practice, however, the contributors to a team’s extra runs above average are not so clear cut.  Some of the team players may have linear weights above average that are greater than 0 (above average) but have OBPs that are less than average.  And in practice your “add .12 runs per PA” bears no resemblance to your theoretical solution in the reconcile file of giving all the added runs above average to the superstar.

In practice, adding an additional total amount of runs scored to the run pool ensures that the total runs created by all the players is equal to the total runs scored.  Divvying up these runs per each non-out PA has these advantages:

1. A player with an average number of PAs, an average on base percentage, and average linear weights (zero runs above average) also has an average number of runs created.

2. A player with an average number of PAs and an average on base percentage, but lower than average linear weights, will have runs created lower than average by exactly the amount of his linear weights. Ditto for players with above average linear weights.

3. Being above average in total PAs only adds extra runs created for a player if he is either more productive than average in those PAs (higher than average linear weights) or gets on base more than average (adds opportunities for his teammates).
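
A minimal sketch of one way to read this scheme, not necessarily the exact implementation: each player keeps his linear weights runs above average, and a pool sized so that the totals match team runs is split by non-out PAs. The function name runs_created, the dictionary keys, and the sample numbers are all assumptions for illustration.

```python
def runs_created(players, team_runs):
    """players: list of dicts with 'lwts' (runs above average)
    and 'on_base' (count of non-out PAs). Hypothetical layout."""
    # Size the pool so that player totals sum to actual team runs.
    pool = team_runs - sum(p["lwts"] for p in players)
    total_on_base = sum(p["on_base"] for p in players)
    # Each player gets his linear weights plus a share of the pool
    # proportional to the number of times he did not make an out.
    return [p["lwts"] + pool * p["on_base"] / total_on_base for p in players]

# Three made-up players with identical on-base counts.
sample = [
    {"lwts": 10, "on_base": 200},   # above average linear weights
    {"lwts": 0, "on_base": 200},    # exactly average
    {"lwts": -10, "on_base": 200},  # below average linear weights
]
print(runs_created(sample, team_runs=240))
# prints: [90.0, 80.0, 70.0]
```

In this made-up case the average player lands on the average of 80 runs created, and the other two sit above or below it by exactly their linear weights, which matches points 1 and 2.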


#89    Colin Wyers      (see all posts) 2008/09/24 (Wed) @ 17:15

If we look at this a different way, it’ll become… well, let’s just look at it a different way for a second.

Essentially, all linear run estimators can be boiled down to:

X*Safe - Y*Outs = Runs Above Average

To convert to absolute runs:

X*Safe - Y*Outs + Z*All = Absolute Runs

The three positions are:

* Include the Z factor in outs.
* Include the Z factor in safe.
* Include the Z factor in both.

The only reason we care about which to do is in situations where outs are not evenly distributed among playing time. This is true for individual batters, but not for run scoring as a whole, where playing time is assigned by outs.

For a run scoring model designed to work on a team run scoring process - whether on the inning, game or seasonal level - we don’t need to worry about overly rewarding/penalizing the creation of an out, because by definition everyone will have the same outs (absent some technicalities like home half-innings in the ninth and afterwards).
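
To make the three positions concrete, here is a rough sketch under stated assumptions. The coefficients, the average team line (18 safe, 27 outs, 4.5 runs), the batter line, and the idea of calibrating a separate Z for each placement so that the average team comes out the same are all illustrative choices, not taken from any actual estimator.

```python
from fractions import Fraction

def absolute_runs(safe, outs, x, y, z, placement):
    """X*Safe - Y*Outs plus a baseline Z applied to outs, safe, or both."""
    base = {"outs": outs, "safe": safe, "both": safe + outs}[placement]
    return x * safe - y * outs + z * base

# Hypothetical coefficients chosen so the average team is 0 runs above average.
X, Y = Fraction(3, 10), Fraction(1, 5)
TEAM_SAFE, TEAM_OUTS, TEAM_RUNS = 18, 27, Fraction(9, 2)

# Calibrate Z per placement so the average team gets TEAM_RUNS in every case.
Z = {
    "outs": TEAM_RUNS / TEAM_OUTS,
    "safe": TEAM_RUNS / TEAM_SAFE,
    "both": TEAM_RUNS / (TEAM_SAFE + TEAM_OUTS),
}

for placement, z in Z.items():
    team = absolute_runs(TEAM_SAFE, TEAM_OUTS, X, Y, z, placement)
    # A made-up batter: 600 PAs, 270 times safe, 330 outs.
    batter = absolute_runs(270, 330, X, Y, z, placement)
    print(placement, float(team), float(batter))
# prints:
# outs 4.5 70.0
# safe 4.5 82.5
# both 4.5 75.0
```

With outs fixed at the team level the three placements agree on the average team, but they split apart for a batter whose outs are not proportional to his playing time.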


#90    Peter Jensen      (see all posts) 2008/09/24 (Wed) @ 18:15

Colin - The purpose of constructing a model is to discover how runs are created. Causal factors, not just correlations.  It’s pretty hard to argue that making an out is a cause of increased runs, or that a PA that doesn’t result in a man on base is one either.

It may not be your intention to use the model that you create to measure the offensive production of individual batters, but whatever model you create should be robust enough that, if it is used for individual batters, it doesn’t result in logical contradictions.

