THE BOOK--Playing The Percentages In Baseball

Wednesday, June 01, 2011

Attendance base for each team

By Tangotiger, 12:26 PM

There have been several great articles that track attendance by city over time, and control for such factors as expansion, new ballparks, and past won-loss record.  The article I am writing below is not going to be one of them.  That’s because I don’t control for any of those three key parameters.  There was an excellent article written a few years back in the Phil-edited By The Numbers.  If you want something good, read that one.

What I’m going to do here will establish the framework, and then some aspiring saberist can improve upon it.

Let me tell you what I did, and why I did it.  The why will always be the same answer: because it was easy.  I choose the honest mess because I don’t want to spend more than a few minutes doing what I’m doing, and by leaving it honest, you can clean up the mess if you want to.  Or not.  Indeed, I will spend more time writing this article than doing the actual research.

There are three eras for baseball, in terms of attendance.  There is 1946-1976, where the per-game attendance was around 14,000 fans, give or take a few thousand each year.  There is 1987-present, where the per-game attendance was around 28,000 fans, give or take a few thousand each year.  And there was the transition period of 1977-1986, as baseball’s fandom rose from one plateau to the next.  Why this happened, I will leave to the historians.  I will note that I became a baseball fan right around 1977, and a regression equation will therefore conclude that I am completely responsible.  Which is why I hate regression equations. 

Free agency and money were probably the real reason.  People like to see valuable things, and if you pay a lot for someone, then people will want to see that player.  It’s like McNall said when he and Gretzky bought the Honus Wagner card: you buy the most important card, and you pay top dollar, and that drives its value.  This is true even for small things, like books.  When we had to price our book, we thought about pricing it at $9.99, but there was a theory that if you priced your book too low, it couldn’t be valuable!  Indeed, the lower we wanted to price the book, the fewer units it would apparently sell.  A comedian had a joke about his father buying a VCR (old joke), and it was priced at $99, and the father told the salesman he wanted to pay $49 for it.  The salesman was so flustered with the back-and-forth, he told the father to take it for $19.  And the father said “$19?  What’s wrong with it?”

Anyway, historians and comedians and economists have a better handle on this than I do.  Listen to them, and ignore me on why circa 1977 was pivotal.

Ok, what I did:


1. I flagged each team’s winning percentage as “420” if they had a winning percentage of .462 or lower.  I flagged each team’s winning percentage as “580” if they had a winning percentage of .538 or higher.  And, “500” for a winning percentage between those two.  With about 1500 team-seasons, there were about 500 seasons in each class.  Basically, you either had a bad, average, or good season.

2. I averaged out each team’s attendance by their winning percentage class.  This is the Dodgers in LA through 1976:
22,215 attend420
24,526 attend500
28,461 attend580

So, they averaged 28,461 fans when they were a good team, and 22,215 when they were a bad team.
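
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function names `win_class` and `average_by_class` and the sample team-seasons are made up here, and only the .462/.538 cutoffs come from the article.

```python
from collections import defaultdict
from statistics import mean


def win_class(win_pct: float) -> str:
    """Bucket a team-season using the cutoffs from step 1."""
    if win_pct <= 0.462:
        return "420"  # bad season
    if win_pct >= 0.538:
        return "580"  # good season
    return "500"      # average season


def average_by_class(seasons):
    """Average per-game attendance within each winning-percentage class.

    seasons: iterable of (win_pct, per_game_attendance) for one team in one era.
    """
    buckets = defaultdict(list)
    for win_pct, attendance in seasons:
        buckets[win_class(win_pct)].append(attendance)
    return {cls: mean(values) for cls, values in buckets.items()}


# Made-up team-seasons, for illustration only (not real attendance data).
example = [
    (0.440, 21_000),
    (0.455, 23_000),
    (0.500, 24_500),
    (0.560, 28_000),
    (0.600, 29_000),
]
print(average_by_class(example))
```

Running the same grouping on a team's real seasons within one era would give the three "attend" figures shown for the Dodgers.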

3. I normalized their attendance based on them having a .500 record.  Since teams with an average record have historically drawn 25% more fans than teams with a bad record, and since teams with a good record have drawn 25% more fans than teams with an average record, I use those numbers as the adjustment.  The 22,215 fans with a bad record are increased by 25%, while the 28,461 fans with a good record are scaled back down by that same 25% step (that is, divided by 1.25).

We see by the way that this 25% shouldn’t really apply to the Dodgers.  Dodger fans will support a team much better than fans in the average city.  Their adjustment numbers historically are less than 10%.  But, honest mess and all.

Anyway, after the adjustment, we see that the average Dodger game through to 1976, if they were a .500 team, was 25,021 fans.
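
As a check on step 3, here is a minimal sketch of the normalization using the Dodgers' three class averages. Two things are assumptions on my part, since the article does not spell them out: the good-season figure is divided by 1.25 (rather than having 25% subtracted), and the three adjusted figures are combined with a simple unweighted mean. That reading is the one that lands on the 25,021 quoted above.

```python
# Class averages for the Dodgers in LA through 1976, from step 2.
dodgers = {"420": 22_215, "500": 24_526, "580": 28_461}

STEP = 1.25  # league-wide ratio between adjacent classes, from step 3


def normalize_to_500(class_avgs, step=STEP):
    """Rescale each class average to a .500 equivalent, then take a simple mean.

    Assumption: bad seasons are multiplied by `step`, good seasons are
    divided by `step`, and the classes are weighted equally.
    """
    factors = {"420": step, "500": 1.0, "580": 1.0 / step}
    adjusted = [class_avgs[cls] * factors[cls] for cls in class_avgs]
    return sum(adjusted) / len(adjusted)


print(round(normalize_to_500(dodgers)))  # 25021
```

Swapping in a smaller `step` (say 1.10) is one way to see how much the Dodgers' base would move if their own, flatter adjustment were used instead of the league-wide one.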

4. I repeat this for all teams and all eras, and I get this:

team    1987_2009  1977_1986  1946_1976
LAN        39,413     38,869     25,021
COL        38,123          -          -
BAL        37,095     24,143     12,191
SLN        35,084     22,257     15,195
TOR        33,960     22,442          -
CHN        33,684     20,930     13,648
ARI        33,658          -          -
WAS        32,798          -          -
BOS        31,755     21,512     15,028
NYN        30,896     20,494     23,520
TEX        30,779     15,858     12,571
ANA        30,395     28,132     12,828
NYA        30,113     23,585     16,834
HOU        29,940     17,637     16,469
SFN        29,507     16,013     11,998
SEA        29,211     14,000          -
PHI        28,722     25,504     14,053
ATL        27,086     17,186     12,428
CIN        26,758     20,665     12,438
SDN        26,524     20,662     13,131
CLE        26,170     12,602     12,664
MIL        25,484     20,403     13,095
DET        24,729     21,460     16,497
CHA        24,040     20,689     12,561
KCA        23,241     23,635     12,607
MIN        23,037     14,669     14,691
FLO        22,341          -          -
OAK        22,325     15,691      9,452
PIT        21,622     13,274     13,154
TBA        20,960          -          -
MON        18,873     21,161     15,514
MON2       11,870          -          -
NY1             -          -     13,608
BRO             -          -     13,393
ML1             -          -     13,257
KC1             -          -     11,828
SE1             -          -     10,335
WS2             -          -     10,115
WS1             -          -      9,950
BSN             -          -      9,397
PHA             -          -      8,853
SLA             -          -      5,731

As noted earlier, attendance basically increased by 40-45% from one era to the next era.  The Dodgers did explode from the through-1976 era to the 1977-1986 era like the rest of MLB, but that’s where they plateaued.

The Chicago Cubs are one team that has gained huge favor over the decades.  They went from 11th in 1946-1976 (again, after adjusting for their poor record) to 6th in the current era.

You’ll note that there are two lines for the Expos.  From 1998 to the time they folded, their attendance woes were something unrelated to the fanbase.  We see that through 1976, they were a bit above average, with 15,514 adjusted fans.  From 1977-1986, they were pretty much average with 21,161 fans.  But, then they plateaued, and indeed, dropped, with 18,873 fans.

While four cities plateaued (Expos, Royals, Dodgers, Angels), most cities exploded in attendance starting in 1987.  Basically, it can be presumed that there is not much more that those four cities can do to find new fans.

5. What I also did was a timeline adjustment, by doubling the fans from 1946-1976 and adding 40% to the fans of 1977-1986.  The idea is that since that’s what happened to the league, let’s apply that for all the teams.  (Again, I don’t think it’s necessarily a good thing to do, since I just finished saying it looked like the Dodgers plateaued already.  But, let’s have some fun while we are here.)

teamID    BASE
LAN     45,843
COL     38,123
NYN     35,723
ARI     33,658
WAS     32,798
TOR     32,333
NYA     32,248
SLN     32,119
ANA     31,105
BAL     30,200
HOU     30,130
BOS     29,886
DET     29,746
PHI     29,711
CHN     29,494
TEX     27,518
NY1     27,217
SEA     27,050
BRO     26,787
MON     26,662
SDN     26,633
CIN     26,575
ML1     26,514
MIL     26,437
KCA     26,366
ATL     26,073
SFN     25,877
CHA     25,604
CLE     24,888
KC1     23,656
PIT     23,637
MIN     23,356
FLO     22,341
TBA     20,960
OAK     20,827
SE1     20,669
WS2     20,230
WS1     19,901
BSN     18,793
PHA     17,706
MON2    11,870
SLA     11,462
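
The timeline adjustment in step 5 is easy to sketch for franchises that appear in only one era. The article does not say how the adjusted eras are combined for teams with two or three eras, so this sketch deliberately leaves that part out. The names `ERA_MULTIPLIER` and `to_current_era` are made up for illustration.

```python
# Multipliers from step 5: double 1946-1976, add 40% to 1977-1986.
ERA_MULTIPLIER = {"1946_1976": 2.0, "1977_1986": 1.4, "1987_2009": 1.0}


def to_current_era(fans, era):
    """Express an era's .500-equivalent attendance in current-era terms."""
    return fans * ERA_MULTIPLIER[era]


# Franchises that only existed in the 1946-1976 era (values from step 4).
single_era = {"NY1": 13_608, "KC1": 11_828, "PHA": 8_853, "SLA": 5_731}

for team, fans in single_era.items():
    print(team, round(to_current_era(fans, "1946_1976")))
```

These come out within a fan or so of the BASE column, with the small differences presumably due to rounding in the published figures.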

A few interesting points:
- current incarnation of Washington is doing much better than the past ones… again, since I didn’t control for expansion and my timeline adjustment may not be the best, who knows…

- different incarnations of Milwaukee and KC are similar

- two cities that have a good fan base don’t have a team: Brooklyn and Montreal… given a decent situation, each city can support a team

- Oakland and Tampa are probably the two teams that may benefit the most from a new ballpark

6. Anyway, best way to use this chart is to look at the base numbers (in either step 5 above, or the 1987_2009 in step 4), and treat that as a city’s “.500 team” fan support.

Then, in order to figure out how much fan support you can expect, add 2% for each win above .500 (81 wins over 162 games).  So, if you are a .580 team (94 wins, or +13 wins above average), you add 26% to the base number.  Similarly, drop 2% for each win below average.

Again, like I said, this is just a general rule.  The Dodgers, for example, would likely only add 1% for each win, since their fans are not fickle.
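
The rule of thumb in step 6 amounts to a one-line formula. A minimal sketch, where the function name `expected_attendance` and its parameter names are mine, and the percentages are applied additively (13 wins at 2% is +26%, as in the example above) rather than compounded:

```python
def expected_attendance(base, wins, pct_per_win=0.02, games=162):
    """Scale a city's .500-team base by a flat percentage per win above or below .500."""
    wins_above_average = wins - games / 2
    return base * (1 + pct_per_win * wins_above_average)


# A 94-win team in a city with a 26,073 base (the ATL line in step 5).
print(round(expected_attendance(26_073, 94)))

# The Dodgers with the gentler 1% per win suggested above.
print(round(expected_attendance(45_843, 94, pct_per_win=0.01)))
```

The same function handles losing teams, since `wins_above_average` simply goes negative.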

***

If anyone wants to clean up this mess, feel free.

#1    Tangotiger      2011/06/01 (Wed) @ 13:06

That 2% figure was mentioned by Palmer in The Hidden Game.

Indeed, that 2% number is my go-to number for things like converting wins to dollars.

Since MLB takes in close to $7 billion, that means the median is probably close to $200 million for each team.

If we can presume that attendance goes up by 2% per win, let’s presume that everything goes up by 2% (TV revenue, sponsors, etc).  And 2% of $200 million is $4 million.  So, that’s why we can get by with saying that a win will displace 4 million dollars.

Of course, things like revenue sharing between teams means that the marginal effect won’t all go to that team.
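
The wins-to-dollars arithmetic in this comment, as a small sketch. The `retained_share` parameter is hypothetical, added only to show where revenue sharing would enter; no actual sharing rate is given in the thread.

```python
MEDIAN_TEAM_REVENUE = 200_000_000  # dollars, the rough median quoted above
PCT_PER_WIN = 0.02                 # assumed to apply to all revenue, not just the gate


def dollars_per_win(revenue=MEDIAN_TEAM_REVENUE, pct_per_win=PCT_PER_WIN,
                    retained_share=1.0):
    """Marginal revenue from one extra win.

    retained_share is a hypothetical knob: the fraction of marginal revenue
    the team keeps after revenue sharing (1.0 means it keeps everything).
    """
    return revenue * pct_per_win * retained_share


print(f"${dollars_per_win():,.0f} per win")
```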


#2    NaOH      2011/06/01 (Wed) @ 13:37

Sky had a good piece in March 2010 on the effects of various team-based factors on attendance. Broadly, he found that each win adds about 300 fans:

The average .500, non-playoff team that does not have a new park or any other advantages draws about 24,500 fans. Every extra game won adds about 300 fans per game. Of course, the relationship is not linear, but that’s an approximate estimate. All else being equal, a .400 team will draw about 20,100 fans, while a .600 team will draw 29,900 - difference of about 10,000 fans per game. Obviously, winning teams draw more fans and the effect is quite large.

http://baseballanalysts.com/archives/2010/03/what_puts_fans.php


#3    Tangotiger      2011/06/01 (Wed) @ 13:43

That’s about 1.2% per win.  And, yes, depending on the timeline, that is a reasonable number.

For 1987-present, I had it as only 0.6% if you were below .500 and 1.6% if you were above .500.

For 1946-1976, it was a much wider gap (1.5% for below .500 and 2.5% for above .500).

I kind of settled on the 2% figure, but the better number historically was 1.2% if below .500 and 2.0% if above .500.
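
The split rates in this comment suggest a piecewise version of the per-win rule. A sketch under that reading, using the historical 1.2% and 2.0% figures as defaults (the function name `attendance_multiplier` is mine), along with the conversion of the 300-fans-per-win estimate from comment #2 into a percentage:

```python
def attendance_multiplier(wins, below=0.012, above=0.020, games=162):
    """Multiplier on the .500 base, with a different per-win rate on each side of .500."""
    delta = wins - games / 2
    rate = above if delta >= 0 else below
    return 1 + rate * delta


# 300 extra fans per win on a 24,500 base, as a percentage per win.
print(f"{300 / 24_500:.1%}")  # 1.2%

print(attendance_multiplier(91))  # 10 wins above .500
print(attendance_multiplier(71))  # 10 wins below .500
```

Passing `below=0.006, above=0.016` gives the 1987-present rates, and `below=0.015, above=0.025` the 1946-1976 ones.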


#4    mettle      2011/06/01 (Wed) @ 14:00

I guess you “hate” regressions, but what you did is a really convoluted way of doing a simultaneous multiple regression. If you do a linear mixed effect model, then you can even avoid being all hand-wavey about the difference in coefficients between teams…


#5    Tangotiger      2011/06/01 (Wed) @ 14:16

mettle: I’m not disagreeing with you.  Feel free to do just that.

I will say that the way I did it is an easier sell than regression, because it takes away that black-box. You don’t “see” things with the regression that way.  For example, had I only done a regression, I wouldn’t have noticed how the Dodgers, Royals, Angels, and Expos did not keep pace with the rest of the league in terms of present-era attendance explosion.

Ultimately, given what I have done here, I would now be able to construct better parameters, and I would then put that into a regression.  Too often, a regression is used as the final answer, logic be damned, rather than it be used as a tool.


#6    Tangotiger      2011/06/01 (Wed) @ 14:22

I thought the Mets numbers above were also very interesting.  They were the #2 draw in MLB through 1976, and Yankees were nothing really.  But, they didn’t keep pace (Tom Seaver considered a fire sale?)

Really, the Yankees being so low was also a surprise.


#7    Michael K      2011/06/01 (Wed) @ 16:27

I’m curious if there’s any correlation between MLB historical attendance figures and NFL / NBA / NHL attendance beyond population growth effects.

If there is some correlation, that could mean that larger cultural forces are part of the explanation.

Not sure where one can find old full-season attendance numbers for other sports, but one quick anecdotal check: The 7 longest NFL sellout streaks all began between 1968-1975 (Redskins, Steelers, Broncos, 49ers, Giants, Packers, Jets).  Only one other NFL team has a sellout streak that goes back farther than 1987 (Bears ca. 1983).

http://wiki.answers.com/Q/What_NFL_team_has_the_longest_sell_out_streak


#8    Tangotiger      2011/06/01 (Wed) @ 16:34

Good question.

NHL has experienced attendance growth since at least 1990, but it’s got a lot more parameters to worry about than MLB.  The arenas are close to sellouts, so there’s not a lot of “play”.  All new arenas are always bigger, so, you can be at 100% capacity, and then increase your attendance by simply being in a larger arena.  Then, you had an explosion in number of teams.

NFL kind of has the capacity problem as well I guess.


#9    NaOH      2011/06/01 (Wed) @ 16:54

Rodney Fort has nearly complete attendance data for the four major sports.

http://www.rodneyfort.com/PHSportsEcon/Common/OtherData/DataDirectory.html


#10    PaulScarfo      2011/06/02 (Thu) @ 08:37

"They were the #2 draw in MLB through 1976… Really, the Yankees being so low was also a surprise.”

Spend any time in the Bronx during the 1970s?

Anyway, are those (expansion, new ballpark, past won-loss record) necessarily the only key parameters?


#11    Tangotiger      2011/06/02 (Thu) @ 10:51

Necessarily the only?

I don’t know that I said that, and if I implied it, then that’s a bad job writing on my part.

There are, about, one thousand parameters to consider in any study.  I highlighted what I thought would likely be 3 of the most important ones.


#12    J-Doug      2011/06/02 (Thu) @ 15:47

IMO, most of the best (and worst) work on attendance is in the peer-reviewed economics journals. A search on Google Scholar will bring up a bunch of studies, although most will be inaccessible to those who aren’t at a university.

I am at a university, so if anyone needs me to download one for them, let me know.


#13    Tangotiger      2011/06/02 (Thu) @ 16:13

Another important parameter is cost per ticket, as well as events.

The Expos would have plenty of dollar hotdog games, and those were extremely popular.  If they had more or fewer events year-to-year, then this would affect attendance year-to-year.


#14    Tangotiger      2011/07/31 (Sun) @ 11:45

Bumping…

