THE BOOK--Playing The Percentages In Baseball


Thursday, January 25, 2007

Baseball Analysis, 100 years ago

By Tangotiger, 05:21 PM

Ubiquitous of Baseball-Fever posted a fascinating article by F.C. Lane, which I’ve merged and reposted on my site.  All thanks should go to him for unearthing it.


#1    Chris Miller      (see all posts) 2007/01/25 (Thu) @ 17:37

Wow!  Those linear weight values are incredible considering when they were made and the technology that was probably available.  Great article!


#2    Tangotiger      (see all posts) 2007/01/25 (Thu) @ 17:42

If I plug the numbers from the article into my Markov calculator:
http://www.tangotiger.net/markov.html

That is, 1000 hits, 122 doubles, 60 triples, 29 HR, and then just throwing in 400 walks, 400 K, and 4100 at bats.  This gives me 3.49 runs per game (around the norm in the late 1910s).

The chance of scoring from 1B, 2B, 3B, according to Markov, is .24, .39, .60.  The Lane sample says: .23, .40, .52.  Hats off to Mr Lane.  I think his triples value is either too low, or it’s possible there weren’t as many SF back in the day.
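A quick way to see how close the two sets of figures are is to put them side by side. The following is only a sketch using the six numbers quoted above; the variable names are illustrative and it does not reproduce the Markov calculator itself.

```python
# Chance of a runner scoring from each base, as quoted in the comment above.
markov = {"1B": 0.24, "2B": 0.39, "3B": 0.60}  # Markov calculator output
lane = {"1B": 0.23, "2B": 0.40, "3B": 0.52}    # Lane's observed sample

# Difference (Lane minus Markov), rounded to two decimals.
diff = {base: round(lane[base] - markov[base], 2) for base in markov}

for base, d in diff.items():
    print(f"{base}: Markov {markov[base]:.2f}, Lane {lane[base]:.2f}, diff {d:+.2f}")

# The base where the two sources disagree the most.
largest_gap = max(diff, key=lambda b: abs(diff[b]))
print("Largest gap at:", largest_gap)
```

The figures for first and second base differ by about a point each, while third base is where Lane's sample falls well short of the Markov estimate.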

Fascinating stuff.


#3    Tangotiger      (see all posts) 2007/01/25 (Thu) @ 17:51

Here’s another F.C. Lane article:

http://www.geocities.com/cyrilmorong@sbcglobal.net/LaneBaseonBalls.htm

And his wiki entry on Baseball-Reference, including links to other articles:
http://www.baseball-reference.com/bullpen/F.C._Lane


#4    Tangotiger      (see all posts) 2007/01/25 (Thu) @ 17:58

By the way, the more accurate numbers that Lane should have used were the first set, not the second.  As I’ve discussed in plenty of places, including here:
http://www.tangotiger.net/rc2.html
It’s the getting on value plus the moving over value. 

Interestingly, if we focus on the triple and the HR, they should have the exact same “moving over value”.  However, in Lane’s sample, the 60 triples gained 96 bases (1.6 bases per triple) while the 29 HR gained 30 bases (1.0 bases per HR).
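The per-hit figures above are simple ratios, and a small sketch makes the check explicit. The counts come straight from the comment; the variable names are just for illustration.

```python
# Counts from Lane's sample, as quoted in the comment above.
triples, triple_bases = 60, 96   # bases gained by runners on triples
home_runs, hr_bases = 29, 30     # bases gained by runners on home runs

bases_per_triple = triple_bases / triples
bases_per_hr = hr_bases / home_runs

print(f"Bases gained per triple: {bases_per_triple:.1f}")
print(f"Bases gained per HR:     {bases_per_hr:.1f}")
```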

As you can see, what Lane’s data shows is that the HR occurred with the bases empty, or at least with no runner on 1B, while the triple had more chances to move guys around.

Super small sample size, but still fascinating.


#5    Tangotiger      (see all posts) 2007/01/25 (Thu) @ 18:20

Hmmm… actually, there’s more data to go through.  The 29 HR scored 16 runners (0.55 runners per HR), while the 60 triples scored 38 runners (0.63 runners per triple).

For the 16 runners on base for the HR, they advanced 30 bases, or 1.9 bases per runner.

For the 38 runners on base for the triple, they advanced 96 bases, or 2.5 bases per runner.

(For comparison, in today’s game, a triple or HR adds about 2.3 bases per runner)

So, we see here it was an issue of small sample size.  While there are more runners scoring on triples than HR (because of non-randomness), the .55 and .63 are a little too far apart.  And the number of bases gained per runner should be close to the same for the HR and triple, but it was again too high for the triple and too low for the HR.

Put both of those together, and you can see why the advancement of the triple and HR weren’t so close.
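The decomposition in this comment can be sketched as follows: bases gained per hit equals runners per hit times bases gained per runner. The helper name `split_advancement` is hypothetical, and the sketch assumes the runners who scored are the same runners counted as being on base, as the comment does.

```python
def split_advancement(hits, runners, bases_gained):
    """Split bases gained per hit into two factors:
    runners per hit and bases gained per runner."""
    runners_per_hit = runners / hits
    bases_per_runner = bases_gained / runners
    return runners_per_hit, bases_per_runner

# Counts from Lane's sample, as quoted in the comments above.
hr = split_advancement(hits=29, runners=16, bases_gained=30)
triple = split_advancement(hits=60, runners=38, bases_gained=96)

for name, (per_hit, per_runner) in (("HR", hr), ("Triple", triple)):
    print(f"{name}: {per_hit:.2f} runners per hit, "
          f"{per_runner:.1f} bases per runner, "
          f"{per_hit * per_runner:.2f} bases per hit")
```

Multiplying the two factors back together recovers the bases gained per hit, which is why a modest gap in each factor compounds into the larger gap between the triple and the HR.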


#6    Peter Jensen      (see all posts) 2007/01/25 (Thu) @ 19:30

According to league stats for 1916 (the year Lane says he collected his data), Lane’s sample is light on doubles and heavy on triples and HRs.  And your estimate of 400 walks is about 50 too high, and your estimate of 400 Ks is about 75 too low.  Better estimates for the year are 150 doubles, 57 triples, 19 HRs.  Also, I’m not sure that the low number of base runners scoring on a HR is entirely due to sample size error.  The batter may have opted for a power swing only when the bases were empty and tried primarily to meet the ball when there were men on base.  The conventions of the time probably meant that when there were men on base with at least one out, there were more steal attempts and hit and run plays, making a single almost as likely to score the runner as a HR.
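To put rough numbers on "light on doubles and heavy on triples and HRs", here is a sketch comparing Lane's sample with the league-based estimates given above. It assumes both sets of counts are on the same basis (per 1000 hits, as in the earlier Markov input), which the comment implies but does not state.

```python
# Extra-base hits in Lane's sample (per 1000 hits, from the earlier comment).
lane_sample = {"2B": 122, "3B": 60, "HR": 29}
# League-based estimates for 1916 from this comment.
# Assumption: these are on the same per-1000-hits basis.
league_1916 = {"2B": 150, "3B": 57, "HR": 19}

# Percentage by which Lane's sample differs from the league estimate.
pct_diff = {
    kind: 100 * (lane_sample[kind] - league_1916[kind]) / league_1916[kind]
    for kind in lane_sample
}

for kind, pct in pct_diff.items():
    print(f"{kind}: Lane {lane_sample[kind]}, league {league_1916[kind]}, {pct:+.0f}%")
```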


#7    John Walsh      (see all posts) 2007/01/26 (Fri) @ 08:31

Tango,

There are quite a few Lane articles available online here:

http://www.aafla.org:8080/verity_templates/jsp/newsearch/search.jsp

Click on “Periodicals”, then select the “Baseball Magazine” radio button, then put Lane’s name in the search form: that gives 147 articles on all kinds of things. It’s great reading.


#8    McCoy      (see all posts) 2007/01/27 (Sat) @ 00:58

If you look at the Cyril link that Tango provided, you will find that in this 62-game sample there were 283 walks and 20 FC at first base.


#9    Rod      (see all posts) 2007/01/27 (Sat) @ 11:20

That article, and over 1700 others from Baseball Magazine, have been online at the AAFLA.org website for quite some time.  With more to come, as well.

BTW, 90 years is not 100 years.


#10    Joe Arthur      (see all posts) 2007/01/27 (Sat) @ 17:19

Tango may appreciate this Lane article found on the aafla site: Ice Hockey - The Baseball of Winter


#11    Tangotiger      (see all posts) 2007/01/27 (Sat) @ 23:54

“BTW, 90 years is not 100 years.”

Headline-literary licence.

