Monday, November 30, 2009
Retrosheet: Invalid Pitch Codes - as per Clem Comly validation rules
I posted this to the Retro group, but presumably others here may find this interesting.
***
There are 553 pitch records that don’t conform to the validation rules Clem noted last year. I did my best to program those rules, but it’s possible that I either made a mistake or misinterpreted them.
If you click on this link:
http://tangotiger.net/retrosheet/reports/invalid_pitch_codes.html
You will see the entire set, including the reason that each pitch code field is invalid. For the 2009 season, there were only 19 invalid records, all of which failed for the same reason: the event was C/E2, so the last character in the pitch code sequence must be an N, and it wasn’t.
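To make that one rule concrete, here is a minimal Python sketch of the check as I read it. The function name and the sample records are made up for illustration; this is not my actual validation code, and it covers only the C/E2 rule, none of Clem’s others. It assumes the event field starts with C/E2 (possibly followed by runner advances).

```python
def passes_ce2_rule(event: str, pitches: str) -> bool:
    """Check the single rule described above: when the event is C/E2,
    the pitch sequence must end with an N.

    Hypothetical helper for illustration only.
    """
    if not event.startswith("C/E2"):
        return True  # the rule only applies to C/E2 events
    return pitches.endswith("N")


# Made-up sample records: (event, pitch sequence)
samples = [
    ("C/E2", "BCBN"),  # ends in N: passes
    ("C/E2", "BCB"),   # missing the trailing N: fails
    ("S7", "BCX"),     # not a C/E2 event: rule does not apply
]
results = [passes_ce2_rule(event, pitches) for event, pitches in samples]
```

All 19 of the 2009 failures would look like the second sample: a C/E2 event whose sequence ends in something other than N.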
If I have made any mistake at all in my report, please let me know, so we can go over the rules and my code to make sure we got it right. I think it’s very good that there are so few mistakes in the pitch code sequence to begin with. It is, by far, the most labor-intensive of the fields to try to process, and with some 4+ million records that have that field filled, that’s a tiny error rate.
Tom
***
No one reported anything back, so I will presume that it at least passed the sniff test. If somebody is processing the pitch sequences for the 2009 season from Gameday or other sources, I’d be interested to know if we can just tack on the “N” at the end of each sequence for those 19 records.
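If tacking on the “N” does turn out to be the right fix, the patch itself would be trivial. A rough Python sketch, with a made-up function name and under the unconfirmed assumption that the only thing wrong with those sequences is the missing final character:

```python
def append_missing_n(event: str, pitches: str) -> str:
    """Append an N to the pitch sequence of a C/E2 event that lacks one.

    Hypothetical helper: whether a blind append is actually correct for
    the 19 records from 2009 is exactly the open question above.
    """
    if event.startswith("C/E2") and not pitches.endswith("N"):
        return pitches + "N"
    return pitches  # everything else is left untouched
```

Sequences that already end in N, and events other than C/E2, pass through unchanged, so running it twice is harmless.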

