THE BOOK--Playing The Percentages In Baseball

Friday, November 30, 2007

Google Earth meets Retrosheet

By Tangotiger, 03:25 PM

Philip (Flip) Kromer writes:


Hello,

I needed a file that had geolocations for each park, and separately
wanted to match the BDB team info against the gamelogs database.  I've
taken the retrosheet park info from
http://retrosheet.org/boxesetc/MISC/PKDIR.htm , the old parkcode.txt
http://www.retrosheet.org/parkcode.txt info, these Google Earth files:
  http://bbs.keyhole.com/ubb/download.php?Number=721289 NL
  http://bbs.keyhole.com/ubb/download.php?Number=721294 AL
the MLB team info http://mlb.mlb.com/team/index.jsp and David
Vincent's Alternate Site Games at
http://www.retrosheet.org/neutral.htm and
http://www.retrosheet.org/neutral19.htm , smashed it together and made
a unified file.

The result contains all teams, names and alternate site info from
Retrosheet, geolocations, and address and URL info for active teams.

Please enjoy
  http://vizsage.com/apps/baseball/results/parkinfo/parkinfo-all.xml
-- This is the best format; it lists all the info hierarchically.
Using python's element tree (http://effbot.org/zone/element-index.htm)
or perl's XML::Simple
(http://search.cpan.org/~grantm/XML-Simple-2.18/lib/XML/Simple.pm)
should give you a clean, simple data structure.

  http://vizsage.com/apps/baseball/results/parkinfo/parkinfo-flatall.csv
  http://vizsage.com/apps/baseball/results/parkinfo/parkinfo-flatall.xml
-- This is the same file, in .csv and in .xml formats, listing the
parks in a flattened (but still parsable) format.  It's not a drop-in
replacement for parkcode.txt but it has the same flavor.  If people
are interested in a drop-in parkcode.txt replacement I can spin that
off pretty easily.

  Parks
    parkID -- Retrosheet parkID
      name
        -- The current name, or the last name this stadium was known by.
      beg, end, active, games
        -- Dates YYYY-MM-DD for the first and last recorded (according
to retrosheet gamelogs) games at that stadium (blank for active or
future sites); whether the site is currently the home stadium for an
active MLB team; and the total number of gamelogs games at that stadium.
      lat, lng 
        -- Geolocation
      streetaddr, extaddr, city, state, country, zip, tel 
        -- Address
      url, spanishurl
        -- The main URL for active teams, and the URL for its
Spanish-language site.
      logofile
        -- The MLB logo file: prefix
http://mlb.mlb.com/mlb/images/team_logos/ to retrieve.  I suppose this
should be a full URL; if someone would like to ship me team logos for
past teams I'll fix that.

  
  Teams
    teamID -- Retrosheet teamID, with 'ANA' used for the Los Angeles
Angels of Anaheim in Orange County, California, USA, Sol 3, Milky Way,
Local Cluster from 1997 to now.
      parkID -- Retrosheet parkID
      beg, end
        -- Dates YYYY-MM-DD of the first and last games recorded by that
team at that stadium (according to retrosheet gamelogs), blank for
active or future sites.
      games
        -- total number of gamelogs games at that stadium by that team.
      altsite
        -- Given as "1" if the site is listed in David Vincent's
Alternate Site Games

  OtherNames
    parkID -- Retrosheet parkID
      name
        -- A name this park was known by.  I used the Retrosheet park
info from http://retrosheet.org/boxesetc/MISC/PKDIR.htm to list off
the official names of each park (flagged with "auth").  If a park was
labelled in some way in one of my other sources, it was also thrown on
the heap.
      beg, end
        -- For "auth" names, the seasons that the park was marketed
under that name
      auth
        -- Was this an official name for the park ('Bank One Ballpark'
yes, 'The BOB' no)?
      curr
        -- Is this the current or last-used-while-MLB-active name for
the park?

  Comments
    parkID -- Retrosheet parkID
    comment
      -- comments for each site.  There may be some parkcode.txt
cruft left in there.
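
As a rough illustration of how the hierarchical layout above might be read with Python's ElementTree, here is a minimal sketch. The tag and attribute names (`parks`, `park`, `team`, `othername`, `comment`), the use of "1"/"0" for the yes/no flags, and every value other than the two Bank One Ballpark names are assumptions made for the example; the real parkinfo-all.xml may name or nest things differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag/attribute names are assumed and values are placeholders.
SAMPLE = """\
<parks>
  <park parkID="XXX01" name="Example Park" lat="40.0" lng="-75.0" active="1">
    <team teamID="XXX" beg="1998-03-31" end="" games="100" altsite=""/>
    <othername name="Bank One Ballpark" auth="1" curr="0"/>
    <othername name="The BOB" auth="0" curr="0"/>
    <comment>placeholder comment</comment>
  </park>
</parks>
"""

def load_parks(xml_text):
    """Return {parkID: dict} with nested lists for teams, names and comments."""
    root = ET.fromstring(xml_text)
    parks = {}
    for park in root.iter("park"):
        info = dict(park.attrib)
        # Geolocation arrives as text; convert it to numbers.
        info["lat"] = float(info["lat"])
        info["lng"] = float(info["lng"])
        info["teams"] = [dict(t.attrib) for t in park.findall("team")]
        info["names"] = [dict(n.attrib) for n in park.findall("othername")]
        info["comments"] = [c.text or "" for c in park.findall("comment")]
        parks[info["parkID"]] = info
    return parks

parks = load_parks(SAMPLE)
# Keep only the names flagged as official ("auth").
official = [n["name"] for n in parks["XXX01"]["names"] if n["auth"] == "1"]
print(official)
```

Keying the result on parkID keeps the park, its teams, and its alternate names together, which is the main convenience of the hierarchical file over the flat ones.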

The flat files give all of the above in the format
 
"parkID","name","beg","end","active","games","lat","lng","allteams","allnames","streetaddr","extaddr","city","state","country","zip","tel","url","spanishurl","logofile","allcomments"
where
  allteams lists each 'team (beg-end) [alt]', with end=>'now' for an
active park, separated by a delimiter, and with '[alt]' appended to
alternate-site games
  allnames lists each 'name (beg-end)', auth names first, with
end=>'now' for an active park, separated by a delimiter, and no dates
for a non-auth name.
  allcomments lists the comments, separated by a delimiter, in some
arbitrary order.
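
For the flat .csv file, something along these lines may be enough. It is only a sketch: the column names come from the header line above and the logo prefix from the logofile note, but the sample row is made up, and the `read_parks` helper and its `logourl` key are hypothetical names introduced here. The allteams, allnames and allcomments columns are left as raw strings rather than split.

```python
import csv
import io

# Column order copied from the flat-file header line.
FIELDS = ["parkID", "name", "beg", "end", "active", "games", "lat", "lng",
          "allteams", "allnames", "streetaddr", "extaddr", "city", "state",
          "country", "zip", "tel", "url", "spanishurl", "logofile",
          "allcomments"]

LOGO_PREFIX = "http://mlb.mlb.com/mlb/images/team_logos/"

def read_parks(fileobj):
    """Read flat-file rows into dicts, converting the numeric columns."""
    parks = []
    for rec in csv.DictReader(fileobj):
        rec["games"] = int(rec["games"]) if rec["games"] else 0
        rec["lat"] = float(rec["lat"]) if rec["lat"] else None
        rec["lng"] = float(rec["lng"]) if rec["lng"] else None
        # logofile is a bare file name; prepend the prefix to get a full URL.
        rec["logourl"] = LOGO_PREFIX + rec["logofile"] if rec["logofile"] else ""
        parks.append(rec)
    return parks

# Build a one-row placeholder file in memory (all values are invented).
row = dict.fromkeys(FIELDS, "")
row.update(parkID="XXX01", name="Example Park", beg="1998-03-31", active="1",
           games="100", lat="40.0", lng="-75.0", allteams="XXX (1998-now)",
           logofile="xxx.gif")
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS, quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerow(row)
buf.seek(0)

flat_parks = read_parks(buf)
print(flat_parks[0]["parkID"], flat_parks[0]["lat"], flat_parks[0]["logourl"])
```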

I've left five 'proposed' stadiums in the set, including the future
NYA stadium and a few others whose status I'm unsure of.  These are
easy enough to remove if the idea offends.

I'm going to try to spin off a Google Earth .kml file from this data,
hopefully later tonight, which will placemark all the geolocated sites
with all the above info in the descriptions.
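
One way such a .kml spin-off could be produced is sketched below. This is not the actual conversion script: the `parks_to_kml` helper is a hypothetical name, the input is assumed to be a list of dicts with parkID, name, lat and lng keys, the sample parks are placeholders, and the KML 2.2 namespace is an assumption about which KML version to target.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # assumed target KML version

def parks_to_kml(parks):
    """Build a KML document with one Placemark per geolocated park."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element("{%s}kml" % KML_NS)
    doc = ET.SubElement(kml, "{%s}Document" % KML_NS)
    for p in parks:
        # Skip parks that have no geolocation.
        if p.get("lat") is None or p.get("lng") is None:
            continue
        pm = ET.SubElement(doc, "{%s}Placemark" % KML_NS)
        ET.SubElement(pm, "{%s}name" % KML_NS).text = p["name"]
        ET.SubElement(pm, "{%s}description" % KML_NS).text = "parkID: %s" % p["parkID"]
        point = ET.SubElement(pm, "{%s}Point" % KML_NS)
        # KML writes coordinates as longitude,latitude (not lat,lng).
        coords = "%s,%s" % (p["lng"], p["lat"])
        ET.SubElement(point, "{%s}coordinates" % KML_NS).text = coords
    return ET.tostring(kml, encoding="unicode")

# Placeholder input: one geolocated park and one without coordinates.
sample_parks = [
    {"parkID": "XXX01", "name": "Example Park", "lat": 40.0, "lng": -75.0},
    {"parkID": "XXX02", "name": "Unlocated Park", "lat": None, "lng": None},
]
kml_text = parks_to_kml(sample_parks)
print(kml_text)
```

The main thing to watch is the coordinate order: the data files store lat then lng, while a KML Point expects longitude first.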

Cheers,
Philip (Flip) Kromer
http://vizsage.com

PS If you enjoyed playing with the Google Earth thing, I'd also like
to point to these other Google Earth files:
  All Minor Lg http://bbs.keyhole.com/ubb/placemarks/997756-minorleague.kmz
  AAA Stadiums http://bbs.keyhole.com/ubb/download.php?Number=47579
  Football http://bbs.keyhole.com/ubb/download.php?Number=12353
which give geolocations (and neat-o team logos) for minor league and
football stadiums.


