I finally got Retrosheet software running on my Mac today.

Here’s how:

1. Install WineBottler (which includes Wine).
http://winebottler.kronenberg.org

2. Run the Command Prompt

3. Navigate to your Retrosheet directory, just as you would on a Windows machine, and run the software.

While re-watching Moneyball this morning, I wanted to look up some data about the 2002 Oakland Athletics. After downloading the 2002 data from Retrosheet, I remembered that I had no way of analyzing it on my MacBook.

After trying, unsuccessfully, to get Retrosheet’s DiamondWare software running on DOSBox, I turned to Chadwick. The installation instructions that come packaged with Chadwick cover nearly everything that you need to install the software properly. The main piece of missing information is that you need to run the “make install” command as the “root” user.

Here are instructions for installing Chadwick on a Mac.

1. Enable the “root” user in Mac OS X. The directions are posted here: http://support.apple.com/kb/ht1528

2. Download Chadwick source files: http://sourceforge.net/projects/chadwick/files/. The current version is 0.6.2.

3. Unzip the downloaded file.

4. Open Terminal

5. Navigate to the unzipped directory where the Chadwick files are stored.

cd Downloads/chadwick-0.6.2

6. Run the ./configure command.

./configure

7. Run the make command.

make

8. Switch to root

su root

9. Type in the password that you created in Step 1 above.

10. Run the make install command.

make install

11. You should now have Chadwick installed on your Mac.  Type “cwbox” to verify.

12. Exit the root shell, then navigate to the directory where you downloaded and extracted the Retrosheet files, and run commands from that location.

exit
cd ~/Downloads/2002eve

cwbox -y 2002 -i OAK200209040 2002OAK.EVA
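
The value passed to -i above is a Retrosheet game ID. Here is a minimal Python sketch that splits such an ID into its parts. It assumes the layout is a three-letter home team code, an eight-digit date (YYYYMMDD), and a one-digit game number, which is what the example ID suggests; the names GameId and parse_game_id are my own and are not part of Chadwick or Retrosheet.

```python
from datetime import date
from typing import NamedTuple


class GameId(NamedTuple):
    home_team: str    # three-letter Retrosheet team code
    played_on: date   # calendar date of the game
    game_number: int  # last digit of the ID


def parse_game_id(game_id: str) -> GameId:
    """Split a 12-character game ID such as 'OAK200209040' into its parts."""
    if len(game_id) != 12:
        raise ValueError(f"expected 12 characters, got {len(game_id)}")
    team = game_id[:3]
    year = int(game_id[3:7])
    month = int(game_id[7:9])
    day = int(game_id[9:11])
    return GameId(team, date(year, month, day), int(game_id[11]))


print(parse_game_id("OAK200209040"))
```

A helper like this makes it easier to double-check that the ID you pass to cwbox matches the game you have in mind.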

 

I’ve been meaning for a while to write a post about the Lahman Database. If you’re not already familiar with this database, I encourage you to take a look because it’s a great baseball statistics resource. The current version of the database (5.6) contains MLB pitching, hitting, and fielding statistics from 1871 through 2008. An annual update is usually released not long after the conclusion of the World Series.

To give a brief history of the database, Sean Lahman started this project in 1992 in an effort to make baseball statistics freely available to the general public. Now, a team of researchers works tirelessly to maintain the database and release the annual updates.

Sean Forman extended the Lahman Database for easy use on the web as an online encyclopedia at “baseball-reference.com.” Since 2001, Sean Lahman and Sean Forman have led a group of researchers who volunteered to maintain and update the database, known as the Baseball Databank.

I give this background information for two reasons. First, I’d like to give Sean Lahman, Sean Forman, and their team of researchers proper credit for their extraordinary work. Second, it helps to understand the various websites where you will find references to the Lahman Database.

http://www.baseball1.com/ – This is Sean Lahman’s website. You can download the most recent version of the database from this site.

http://www.baseball-databank.org/ – This is Sean Forman’s website. You can also download the most recent version of the database from this site.

http://www.baseball-reference.com/ – This is Baseball Reference, perhaps the most complete online baseball encyclopedia available. This site runs on the Lahman Database.

For most standard baseball research, the information presented on Baseball Reference will be more than adequate. However, if you’re interested in running more specific queries on this historic set of data, you will need to download a copy of the database, and I will guide you through that process.

1. First, I navigated to: http://www.baseball1.com/content/view/57/82/. I clicked “Download Version 5.6 (1871-2008)”, and then I clicked “Download SQL Version”.

It’s worth noting that Microsoft Access and CSV versions of the database are also available. If these files are sufficient for your purposes, you’ll likely find them easier to use. Just download the files, and open them in the appropriate programs (Microsoft Access for the mdb files, and your favorite spreadsheet program for the csv files).

2. I created a blank database on my MySQL server called bball_stats to house the Lahman Database. Procedures on how to create a new database will vary depending on your database setup and access privileges.

3. The next thing that you will need to do is import the SQL file. The file is quite large (36 MB uncompressed, 7.9 MB zipped). I found it easiest to upload the SQL file using a MySQL GUI program called HeidiSQL. The script uploaded in a matter of minutes, and the tables and data were ready for research!

4. To verify that all data had uploaded properly, I checked the row count of each table. Here is the expected number of rows for each table in Version 5.6 of the database.

TABLE => ROWS
Master => 17264
Teams => 2595
TeamsFranchises => 120
TeamsHalf => 52
Batting => 91457
Pitching => 39016
Fielding => 154843
FieldingOF => 12028
Salaries => 19819
Managers => 3167
ManagersHalf => 93
Allstar => 4321
AllstarFull => 4522
AwardsPlayers => 2558
AwardsSharePlayers => 6182
AwardsManagers => 53
AwardsShareManagers => 318
HallOfFame => 3477
HOFold => 286
BattingPost => 10422
FieldingPost => 8981
PitchingPost => 4006
SeriesPost => 250
Schools => 732
SchoolsPlayers => 5904
xref_stats => 16631
Appearances => 40139
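
Checking row counts by hand gets tedious, so here is a minimal Python sketch of how the comparison could be automated. It works with any DB-API cursor; the demo uses an empty in-memory SQLite database purely as a stand-in for the MySQL server, and the function name find_mismatches is my own. Only three of the expected counts are copied in to keep the example short.

```python
import sqlite3

# A few expected counts copied from the list above (Version 5.6).
EXPECTED_ROWS = {"Master": 17264, "Teams": 2595, "Batting": 91457}


def find_mismatches(cursor, expected):
    """Return {table: (expected, actual)} for every table whose count differs."""
    mismatches = {}
    for table, want in expected.items():
        # Table names come from our own dict, not from user input, so
        # interpolating them into the statement is acceptable here.
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        got = cursor.fetchone()[0]
        if got != want:
            mismatches[table] = (want, got)
    return mismatches


# Demo against empty tables in an in-memory SQLite database (stand-in for MySQL).
demo = sqlite3.connect(":memory:")
for name in EXPECTED_ROWS:
    demo.execute(f"CREATE TABLE {name} (id INTEGER)")
result = find_mismatches(demo.cursor(), EXPECTED_ROWS)
print(result)
```

Because the demo tables are empty, every table is reported as a mismatch. Against a fully imported database, the function should return an empty dictionary.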

Finally, here is a summary of the data that each table contains.

The database consists of the following main tables:

Master – Player names, DOB, and biographical info
Batting – batting statistics
Pitching – pitching statistics
Fielding – fielding statistics

It is supplemented by these tables:

Allstar – All-Star appearances
HallOfFame – Hall of Fame voting data
Managers – managerial statistics
Teams – yearly stats and standings
BattingPost – post-season batting statistics
PitchingPost – post-season pitching statistics
TeamsFranchises – franchise information
FieldingOF – outfield position data
FieldingPost – post-season fielding data
ManagersHalf – split season data for managers
TeamsHalf – split season data for teams
Salaries – player salary data
SeriesPost – post-season series information
AwardsManagers – awards won by managers
AwardsPlayers – awards won by players
AwardsShareManagers – award voting for manager awards
AwardsSharePlayers – award voting for player awards
AllstarFull – Expanded All-Star info
Appearances – Detailed games played info
Schools – college info
SchoolsPlayers – players’ college info
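
To show how the main tables fit together, here is a small Python sketch of a query that joins Batting to Master on playerID and totals home runs per player. It runs against an in-memory SQLite database as a stand-in for MySQL, creates only the handful of columns the query needs, and uses a made-up placeholder player rather than real statistics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Master  (playerID TEXT, nameFirst TEXT, nameLast TEXT);
    CREATE TABLE Batting (playerID TEXT, yearID INTEGER, HR INTEGER);
    -- Placeholder rows, not real statistics.
    INSERT INTO Master  VALUES ('examp01', 'Example', 'Player');
    INSERT INTO Batting VALUES ('examp01', 2001, 10);
    INSERT INTO Batting VALUES ('examp01', 2002, 12);
""")

# Career home runs per player: Batting holds one row per player-season,
# Master holds one row per player.
QUERY = """
    SELECT m.nameFirst, m.nameLast, SUM(b.HR) AS career_hr
    FROM Batting AS b
    JOIN Master AS m ON m.playerID = b.playerID
    GROUP BY m.playerID, m.nameFirst, m.nameLast
    ORDER BY career_hr DESC
"""
rows = conn.execute(QUERY).fetchall()
print(rows)
```

The same SELECT statement should work unchanged in MySQL once the real tables are loaded.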

Later, I’ll write more about how I’ve used the Lahman Database for baseball research. In the meantime, I encourage you to try the database for yourself. Look for the 2009 update to arrive by the end of the year!

To analyze output from the bevent program, it helps to input the data into a database.  I prefer to use MySQL.  Here’s how.

1. Create a database table.

CREATE TABLE `bevent` (
`gameid` varchar(12) NOT NULL,
`vteam` varchar(3) NOT NULL,
`inning` int(3) NOT NULL,
`battingteam` int(1) NOT NULL,
`outs` int(1) NOT NULL,
`balls` int(1) NOT NULL,
`strikes` int(1) NOT NULL,
`pitchsequence` varchar(25) NOT NULL,
`vscore` int(2) NOT NULL,
`hscore` int(2) NOT NULL,
`batter` varchar(10) NOT NULL,
`batterhand` varchar(2) NOT NULL,
`resbatter` varchar(10) NOT NULL,
`resbatterhand` varchar(2) NOT NULL,
`pitcher` varchar(10) NOT NULL,
`pitcherhand` varchar(2) NOT NULL,
`respitcher` varchar(10) NOT NULL,
`respitcherhand` varchar(2) NOT NULL,
`catcher` varchar(10) NOT NULL,
`firstbase` varchar(10) NOT NULL,
`secondbase` varchar(10) NOT NULL,
`thirdbase` varchar(10) NOT NULL,
`shortstop` varchar(10) NOT NULL,
`leftfield` varchar(10) NOT NULL,
`centerfield` varchar(10) NOT NULL,
`rightfield` varchar(10) NOT NULL,
`firstrunner` varchar(10) NOT NULL,
`secondrunner` varchar(10) NOT NULL,
`thirdrunner` varchar(10) NOT NULL,
`eventtext` varchar(30) NOT NULL,
`leadoff` varchar(1) NOT NULL,
`pinchhit` varchar(1) NOT NULL,
`defensiveposition` int(2) NOT NULL,
`lineupposition` int(2) NOT NULL,
`eventtype` int(4) NOT NULL,
`battereventflag` varchar(1) NOT NULL,
`abflag` varchar(1) NOT NULL,
`hitvalue` int(1) NOT NULL,
`shflag` varchar(1) NOT NULL,
`sfflag` varchar(1) NOT NULL,
`outsonplay` int(1) NOT NULL,
`doubleplayflag` varchar(1) NOT NULL,
`tripleplayflag` varchar(1) NOT NULL,
`rbionplay` int(1) NOT NULL,
`wildpitchflag` varchar(1) NOT NULL,
`passedballflag` varchar(1) NOT NULL,
`fieldedby` int(2) NOT NULL,
`battedballtype` varchar(2) NOT NULL,
`buntflag` varchar(1) NOT NULL,
`foulflag` varchar(1) NOT NULL,
`hitlocation` varchar(5) NOT NULL,
`numerrors` int(1) NOT NULL,
`firsterror` int(1) NOT NULL,
`firsterrortype` varchar(2) NOT NULL,
`seconderror` int(1) NOT NULL,
`seconderrortype` varchar(2) NOT NULL,
`thirderror` int(1) NOT NULL,
`thirderrortype` varchar(2) NOT NULL,
`batterdest` int(2) NOT NULL,
`firstdest` int(2) NOT NULL,
`seconddest` int(2) NOT NULL,
`thirddest` int(2) NOT NULL,
`playonbatter` varchar(8) NOT NULL,
`playonfirst` varchar(8) NOT NULL,
`playonsecond` varchar(8) NOT NULL,
`playonthird` varchar(8) NOT NULL,
`sbfirst` varchar(1) NOT NULL,
`sbsecond` varchar(1) NOT NULL,
`sbthird` varchar(1) NOT NULL,
`csfirst` varchar(1) NOT NULL,
`cssecond` varchar(1) NOT NULL,
`csthird` varchar(1) NOT NULL,
`pofirst` varchar(1) NOT NULL,
`posecond` varchar(1) NOT NULL,
`pothird` varchar(1) NOT NULL,
`respfirst` varchar(10) NOT NULL,
`respsecond` varchar(10) NOT NULL,
`respthird` varchar(10) NOT NULL,
`newgame` varchar(1) NOT NULL,
`endgame` varchar(1) NOT NULL,
`pinchfirst` varchar(1) NOT NULL,
`pinchsecond` varchar(1) NOT NULL,
`pinchthird` varchar(1) NOT NULL,
`removefirst` varchar(10) NOT NULL,
`removesecond` varchar(10) NOT NULL,
`removethird` varchar(10) NOT NULL,
`removebatter` varchar(10) NOT NULL,
`removebatterpos` int(2) NOT NULL,
`fielder1` int(2) NOT NULL,
`fielder2` int(2) NOT NULL,
`fielder3` int(2) NOT NULL,
`assist1` int(2) NOT NULL,
`assist2` int(2) NOT NULL,
`assist3` int(2) NOT NULL,
`assist4` int(2) NOT NULL,
`assist5` int(2) NOT NULL,
`eventnum` int(4) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

2. Download the Retrosheet zip files, and run the bevent program.  If you set up your database table with fields 0-96, make sure that you output all fields with bevent.

bevent -y 2008 -f 0-96 2008ANA.EVA 2008BAL.EVA 2008CHA.EVA 2008DET.EVA 2008OAK.EVA 2008SEA.EVA 2008TBA.EVA 2008TOR.EVA 2008ARI.EVN 2008ATL.EVN 2008CHN.EVN 2008CIN.EVN 2008COL.EVN 2008FLO.EVN 2008HOU.EVN 2008LAN.EVN 2008MIL.EVN 2008NYN.EVN 2008PHI.EVN 2008SDN.EVN 2008SFN.EVN 2008SLN.EVN 2008WAS.EVN 2008BOS.EVA 2008CLE.EVA 2008KCA.EVA 2008MIN.EVA 2008PIT.EVN > 2008COMBINED.csv
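
Typing every file name by hand is error-prone. Here is a minimal Python sketch that builds the same kind of argument list from whatever event files are in a directory. It assumes the files follow the naming shown above (the year followed by the team code, ending in .EVA or .EVN); the function name build_bevent_command is my own, and the demo uses empty placeholder files.

```python
from pathlib import Path
import tempfile


def build_bevent_command(directory, year):
    """Return the argument list for running bevent on every event file of a season."""
    folder = Path(directory)
    # Keep only files named like 2008OAK.EVA or 2008ARI.EVN for the given year.
    files = sorted(
        p.name for p in folder.iterdir()
        if p.name.startswith(str(year)) and p.suffix.upper() in {".EVA", ".EVN"}
    )
    return ["bevent", "-y", str(year), "-f", "0-96", *files]


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("2008OAK.EVA", "2008ARI.EVN", "TEAM2008"):
        (Path(tmp) / name).touch()
    command = build_bevent_command(tmp, 2008)
    print(" ".join(command))
```

The resulting list could be handed to subprocess.run with stdout pointed at the output CSV file, which plays the role of the > redirect in the command above.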

3. Import 2008COMBINED.csv into your new MySQL database table.
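
Before importing, it can help to confirm that every row really has all 97 fields (0 through 96), since a short row would not line up with the table columns. Here is a minimal Python sketch of that check, plus a parameterized INSERT statement with one placeholder per column. The %s placeholder style is the one commonly used by MySQL drivers for Python; the function name read_bevent_rows is my own, and the demo row is synthetic rather than real bevent output.

```python
import csv
import io

FIELD_COUNT = 97  # bevent fields 0 through 96


def read_bevent_rows(handle):
    """Yield each CSV row as a list, rejecting rows with the wrong field count."""
    for line_number, row in enumerate(csv.reader(handle), start=1):
        if len(row) != FIELD_COUNT:
            raise ValueError(
                f"line {line_number}: {len(row)} fields, expected {FIELD_COUNT}"
            )
        yield row


# One %s placeholder per column of the bevent table.
INSERT_SQL = "INSERT INTO bevent VALUES (" + ", ".join(["%s"] * FIELD_COUNT) + ")"

# Demo with a synthetic row of 97 numbered cells, not real bevent output.
sample = io.StringIO(",".join(str(i) for i in range(FIELD_COUNT)) + "\n")
rows = list(read_bevent_rows(sample))
print(len(rows), len(rows[0]), INSERT_SQL.count("%s"))
```

With a real connection, the validated rows could be passed to the cursor's executemany method along with INSERT_SQL.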

To analyze output from the bgame program, it helps to input the data into a database.  I prefer to use MySQL.  Here’s how.

1. Create a database table.

CREATE TABLE `bgame` (
`gameid` varchar(12) NOT NULL,
`gamedate` varchar(8) NOT NULL,
`gamenumber` int(1) NOT NULL,
`dayofweek` varchar(10) NOT NULL,
`starttime` int(8) NOT NULL,
`dhused` varchar(1) NOT NULL,
`daynight` varchar(1) NOT NULL,
`vteam` varchar(3) NOT NULL,
`hteam` varchar(3) NOT NULL,
`gamesite` varchar(3) NOT NULL,
`vstartingpitcher` varchar(10) NOT NULL,
`hstartingpitcher` varchar(10) NOT NULL,
`humpire` varchar(10) NOT NULL,
`fumpire` varchar(10) NOT NULL,
`sumpire` varchar(10) NOT NULL,
`tumpire` varchar(10) NOT NULL,
`lfumpire` varchar(10) NOT NULL,
`rfumpire` varchar(10) NOT NULL,
`attendance` int(8) NOT NULL,
`psscorer` varchar(25) NOT NULL,
`translator` varchar(25) NOT NULL,
`inputter` varchar(25) NOT NULL,
`inputtime` varchar(15) NOT NULL,
`edittime` varchar(15) NOT NULL,
`howscored` int(8) NOT NULL,
`pitchesentered` int(8) NOT NULL,
`temperature` int(3) NOT NULL,
`winddirection` int(8) NOT NULL,
`windspeed` int(8) NOT NULL,
`fieldcondition` int(8) NOT NULL,
`precipitation` int(8) NOT NULL,
`sky` int(8) NOT NULL,
`timeofgame` int(8) NOT NULL,
`numberofinnings` int(8) NOT NULL,
`vfinalscore` int(8) NOT NULL,
`hfinalscore` int(8) NOT NULL,
`vhits` int(8) NOT NULL,
`hhits` int(8) NOT NULL,
`verrors` int(8) NOT NULL,
`herrors` int(8) NOT NULL,
`vlob` int(8) NOT NULL,
`hlob` int(8) NOT NULL,
`wpitcher` varchar(10) NOT NULL,
`lpitcher` varchar(10) NOT NULL,
`spitcher` varchar(10) NOT NULL,
`gwrbi` varchar(10) NOT NULL,
`vbat1` varchar(10) NOT NULL,
`vpos1` int(3) NOT NULL,
`vbat2` varchar(10) NOT NULL,
`vpos2` int(3) NOT NULL,
`vbat3` varchar(10) NOT NULL,
`vpos3` int(3) NOT NULL,
`vbat4` varchar(10) NOT NULL,
`vpos4` int(3) NOT NULL,
`vbat5` varchar(10) NOT NULL,
`vpos5` int(3) NOT NULL,
`vbat6` varchar(10) NOT NULL,
`vpos6` int(3) NOT NULL,
`vbat7` varchar(10) NOT NULL,
`vpos7` int(3) NOT NULL,
`vbat8` varchar(10) NOT NULL,
`vpos8` int(3) NOT NULL,
`vbat9` varchar(10) NOT NULL,
`vpos9` int(3) NOT NULL,
`hbat1` varchar(10) NOT NULL,
`hpos1` int(3) NOT NULL,
`hbat2` varchar(10) NOT NULL,
`hpos2` int(3) NOT NULL,
`hbat3` varchar(10) NOT NULL,
`hpos3` int(3) NOT NULL,
`hbat4` varchar(10) NOT NULL,
`hpos4` int(3) NOT NULL,
`hbat5` varchar(10) NOT NULL,
`hpos5` int(3) NOT NULL,
`hbat6` varchar(10) NOT NULL,
`hpos6` int(3) NOT NULL,
`hbat7` varchar(10) NOT NULL,
`hpos7` int(3) NOT NULL,
`hbat8` varchar(10) NOT NULL,
`hpos8` int(3) NOT NULL,
`hbat9` varchar(10) NOT NULL,
`hpos9` int(3) NOT NULL,
`vfinisher` varchar(10) NOT NULL,
`hfinisher` varchar(10) NOT NULL,
PRIMARY KEY  (`gameid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

2. Download the Retrosheet zip files, and run the bgame program.

bgame -y 2008 2008ANA.EVA 2008BAL.EVA 2008CHA.EVA 2008DET.EVA 2008OAK.EVA 2008SEA.EVA 2008TBA.EVA 2008TOR.EVA 2008ARI.EVN 2008ATL.EVN 2008CHN.EVN 2008CIN.EVN 2008COL.EVN 2008FLO.EVN 2008HOU.EVN 2008LAN.EVN 2008MIL.EVN 2008NYN.EVN 2008PHI.EVN 2008SDN.EVN 2008SFN.EVN 2008SLN.EVN 2008WAS.EVN 2008BOS.EVA 2008CLE.EVA 2008KCA.EVA 2008MIN.EVA 2008PIT.EVN > 2008COMBINED.csv

3. Import 2008COMBINED.csv into your new MySQL database table.

Retrosheet’s bevent program takes a Retrosheet event file and creates a complete description of every at-bat that occurred during the game.  To complete the introduction of Retrosheet’s three pieces of software, I’ll once again start with the event file for the Colorado Rockies @ Los Angeles Dodgers game on 4/9/2007 as an example. Here is the event file.

Here is a step-by-step guide for using bevent.exe.

  1. Download bevent.exe from Retrosheet or here, and unzip the executable file.
  2. Put the event file (or a text file that includes many event files) in the same directory as the bevent.exe program.
  3. You also need to have a team file included in the same directory.  A team file is a text file that lists every team, their league, and their three-letter abbreviation.  The team file must have a filename of TEAMYYYY.  For example, here is TEAM2007.
  4. Open the Windows Command Prompt. (Start -> Run -> CMD)
  5. Navigate to the directory where you stored bevent.exe, the event file, and the team file.
  6. Type the command: bevent -y 2007 -f 0-96 dodgers_rockies040907.evn (or the name of your event file).  “-y 2007” specifies the year of the game.  If you are generating game info for a game from 1960, make sure you use -y 1960. “-f 0-96” specifies which fields I want bevent to return.  The default is 0-6, 8-9, 12-13, 16-17, 26-40, 43-45, 51, 58-61.  0-96 means all fields.
  7. If you want to output the game info to a text file (I’d recommend saving as a .csv file so it opens nicely in spreadsheet software), use the command bevent -y 2007 -f 0-96 dodgers_rockies040907.evn > dodgers_rockies_040907_gameinfo.csv.
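
The -f field specification in the steps above is a comma-separated list of single field numbers and inclusive ranges. Here is a minimal Python sketch that expands such a specification into the individual field numbers, which is handy for seeing exactly which columns a given -f value will produce. The function name expand_field_spec is my own and is not part of bevent.

```python
def expand_field_spec(spec):
    """Expand a spec such as '0-6,8-9,51' into a sorted list of field numbers."""
    fields = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as 26-40 includes both end points.
            first, last = (int(x) for x in part.split("-"))
            fields.update(range(first, last + 1))
        else:
            fields.add(int(part))
    return sorted(fields)


DEFAULT_SPEC = "0-6,8-9,12-13,16-17,26-40,43-45,51,58-61"
default_fields = expand_field_spec(DEFAULT_SPEC)
all_fields = expand_field_spec("0-96")
print(len(default_fields), len(all_fields))
```

Comparing the two lengths shows how many columns the default leaves out relative to the full 0-96 output.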

Here is the output from bevent.exe.

The output is a comma-delimited file that contains the following fields:

0    game id
1    visiting team
2    inning
3    batting team
4    outs
5    balls
6    strikes
7    pitch sequence
8    vis score
9    home score
10   batter
11   batter hand
12   res batter
13   res batter hand
14   pitcher
15   pitcher hand
16   res pitcher
17   res pitcher hand
18   catcher
19   first base
20   second base
21   third base
22   shortstop
23   left field
24   center field
25   right field
26   first runner
27   second runner
28   third runner
29   event text
30   leadoff flag
31   pinchhit flag
32   defensive position
33   lineup position
34   event type
35   batter event flag
36   ab flag
37   hit value
38   SH flag
39   SF flag
40   outs on play
41   double play flag
42   triple play flag
43   RBI on play
44   wild pitch flag
45   passed ball flag
46   fielded by
47   batted ball type
48   bunt flag
49   foul flag
50   hit location
51   num errors
52   1st error player
53   1st error type
54   2nd error player
55   2nd error type
56   3rd error player
57   3rd error type
58   batter dest (5 if scores and unearned, 6 if team unearned)
59   runner on 1st dest (5 if scores and unearned, 6 if team unearned)
60   runner on 2nd dest (5 if scores and unearned, 6 if team unearned)
61   runner on 3rd dest (5 if scores and unearned, 6 if team unearned)
62   play on batter
63   play on runner on 1st
64   play on runner on 2nd
65   play on runner on 3rd
66   SB for runner on 1st flag
67   SB for runner on 2nd flag
68   SB for runner on 3rd flag
69   CS for runner on 1st flag
70   CS for runner on 2nd flag
71   CS for runner on 3rd flag
72   PO for runner on 1st flag
73   PO for runner on 2nd flag
74   PO for runner on 3rd flag
75   Responsible pitcher for runner on 1st
76   Responsible pitcher for runner on 2nd
77   Responsible pitcher for runner on 3rd
78   New Game Flag
79   End Game Flag
80   Pinch-runner on 1st
81   Pinch-runner on 2nd
82   Pinch-runner on 3rd
83   Runner removed for pinch-runner on 1st
84   Runner removed for pinch-runner on 2nd
85   Runner removed for pinch-runner on 3rd
86   Batter removed for pinch-hitter
87   Position of batter removed for pinch-hitter
88   Fielder with First Putout (0 if none)
89   Fielder with Second Putout (0 if none)
90   Fielder with Third Putout (0 if none)
91   Fielder with First Assist (0 if none)
92   Fielder with Second Assist (0 if none)
93   Fielder with Third Assist (0 if none)
94   Fielder with Fourth Assist (0 if none)
95   Fielder with Fifth Assist (0 if none)
96   event num

Retrosheet’s bgame program takes a Retrosheet event file and creates a game summary. Again, I’ll use the Colorado Rockies @ Los Angeles Dodgers game on 4/9/2007 as an example. Here is the event file.  Here is a step-by-step guide for using bgame.exe.

  1. Download bgame.exe from Retrosheet or here, and unzip the executable file.
  2. Put the event file (or a text file that includes many event files) in the same directory as the bgame.exe program.
  3. You also need to have a team file included in the same directory.  A team file is a text file that lists every team, their league, and their three-letter abbreviation.  The team file must have a filename of TEAMYYYY.  For example, here is TEAM2007.
  4. Open the Windows Command Prompt.
  5. Navigate to the directory where you stored bgame.exe, the event file, and the team file.
  6. Type the command: bgame -y 2007 dodgers_rockies040907.evn (or the name of your event file).  -y 2007 specifies the year of the game.  If you are generating game info for a game from 1960, make sure you use -y 1960.
  7. If you want to output the game info to a text file, use the command bgame -y 2007 dodgers_rockies040907.evn > dodgers_rockies_040907_gameinfo.txt.

Here is the output from bgame.exe.

The output is a comma-delimited file that contains the following fields:

0       game id - This is formatted (Home Team Abbreviation + YYYY + MM + DD + Game Number (see below))
1       date - This is formatted YYMMDD
2       game number - This field shows 0 if only one game was played between the two teams on that day.  If a double-header was scheduled, this field will show either 1 or 2.
3       day of week - Monday, Tuesday, etc
4       start time - This field contains digits only, with no colon.  For instance, 3:30 would be 330.  All times are assumed to be PM.
5       DH used flag - This field displays a T (true) if the designated hitter rule was used.  Otherwise, this field is F (false).
6       day/night flag - This field is D or N.
7       visiting team - This is the three-letter abbreviation of the visiting team.
8       home team
9       game site - Every ballpark has a special Retrosheet code.  The code is displayed in this field.  Click here for a listing of ballpark codes.
10      visiting starting pitcher - This is the unique Retrosheet ID of the visiting team's starting pitcher.
11      home starting pitcher
12      home plate umpire - This is the unique Retrosheet ID of the home plate umpire.
13      first base umpire
14      second base umpire
15      third base umpire
16      left field umpire - Big games have two additional umpires in the outfield.  This is the unique Retrosheet ID of the left field umpire.  If no left field umpire was used, this field is blank.
17      right field umpire
18      attendance - The game's attendance
19      PS scorer - The name of the scorer.
20      translator - The name of the translator.
21      inputter - The name of the inputter.
22      input time - The time that the game was input.
23      edit time - The time that the game was edited.
24      how scored - How the game was scored: live, online, tv, radio, etc.
25      pitches entered? - Were pitches entered, the final count, or just the results of the at-bat.  This field shows: pitches, count, or none.
26      temperature - The temperature of the game, in Fahrenheit.
27      wind direction - The wind direction (fromcf, fromlf, fromrf, rtol, ltor, tolf, torf, tocf, unknown)
28      wind speed - The wind speed, in MPH.
29      field condition - The field condition (dry, wet, soaked, unknown)
30      precipitation - drizzle, none, rain, showers, snow, unknown
31      sky - cloudy, dome, night, overcast, sunny, unknown
32      time of game - The duration of the game, in minutes.
33      number of innings - The number of innings in the game.
34      visitor final score - The number of runs scored by the visiting team.
35      home final score
36      visitor hits - The number of hits by the visiting team.
37      home hits
38      visitor errors - The number of errors committed by the visiting team.
39      home errors
40      visitor left on base - The number of runners left on base by the visiting team.
41      home left on base
42      winning pitcher - The unique Retrosheet ID for the winning pitcher.
43      losing pitcher - The unique Retrosheet ID for the losing pitcher.
44      save for - The unique Retrosheet ID for the player who earned the save.
45      GW RBI - The unique Retrosheet ID for the player who hit the game-winning RBI.  This used to be an official MLB statistic.
46      visitor batter 1 - The unique Retrosheet ID of the first hitter for the visiting team.
47      visitor position 1 - The numeric position of the first hitter for the visiting team (1=P, 2=C, 3=1B, 4=2B, 5=3B, 6=SS, 7=LF, 8=CF, 9=RF).
48      visitor batter 2
49      visitor position 2
50      visitor batter 3
51      visitor position 3
52      visitor batter 4
53      visitor position 4
54      visitor batter 5
55      visitor position 5
56      visitor batter 6
57      visitor position 6
58      visitor batter 7
59      visitor position 7
60      visitor batter 8
61      visitor position 8
62      visitor batter 9
63      visitor position 9
64      home batter 1
65      home position 1
66      home batter 2
67      home position 2
68      home batter 3
69      home position 3
70      home batter 4
71      home position 4
72      home batter 5
73      home position 5
74      home batter 6
75      home position 6
76      home batter 7
77      home position 7
78      home batter 8
79      home position 8
80      home batter 9
81      home position 9
82      visiting finisher (NULL if complete game) - The final pitcher for the visiting team.
83      home finisher (NULL if complete game)
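
Since the starting lineups are spread across fields 46-81 as alternating batter and position columns, here is a minimal Python sketch that collects them into nine (player, position) pairs per team. It uses the position codes listed for field 47; the function name lineup is my own, and the demo row is synthetic, with hypothetical player IDs rather than real bgame output.

```python
POSITIONS = {1: "P", 2: "C", 3: "1B", 4: "2B", 5: "3B",
             6: "SS", 7: "LF", 8: "CF", 9: "RF"}


def lineup(row, first_field):
    """Return nine (player_id, position) pairs starting at the given field number."""
    pairs = []
    for slot in range(9):
        player_id = row[first_field + 2 * slot]
        code = int(row[first_field + 2 * slot + 1])
        # Codes outside 1-9 are not covered by the list above, so keep them as-is.
        pairs.append((player_id, POSITIONS.get(code, str(code))))
    return pairs


# Synthetic 84-field row: hypothetical IDs, positions 1 through 9 in batting order.
row = [""] * 84
for slot in range(9):
    row[46 + 2 * slot] = f"visit{slot + 1:03d}"
    row[47 + 2 * slot] = str(slot + 1)
    row[64 + 2 * slot] = f"home{slot + 1:04d}"
    row[65 + 2 * slot] = str(slot + 1)

visitors = lineup(row, 46)  # visiting lineup starts at field 46
home = lineup(row, 64)      # home lineup starts at field 64
print(visitors[0], home[8])
```

With real output, each row read from the bgame CSV file could be passed to lineup in the same way.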

Retrosheet has developed three (Windows-only) programs that work with their play-by-play event files.  An event file is essentially a text-based representation of an entire baseball game.  Retrosheet offers event files for nearly every MLB game played since 1953.  For example, here is the event file from the Colorado Rockies @ Los Angeles Dodgers game on 4/9/2007.

The first program that I’ll outline is box.exe.  Box.exe creates a box score from the Retrosheet event file.  Here’s how you can use box.exe.

  1. Download box.exe from Retrosheet or here, and unzip the executable file.
  2. Put the event file (or a text file that includes many event files) in the same directory as the box.exe program.
  3. You also need to have a team file included in the same directory.  A team file is a text file that lists every team, their league, and their three-letter abbreviation.  The team file must have a filename of TEAMYYYY.  For example, TEAM2007.
  4. Open the Windows Command Prompt.
  5. Navigate to the directory where you stored box.exe, the event file, and the team file.
  6. Type the command: box -y 2007 dodgers_rockies040907.evn (or the name of your event file).  -y 2007 specifies the year of the game.  If you are generating a box score for a game from 1960, make sure you use -y 1960.
  7. If you want to output the box score to a text file, use the command box -y 2007 dodgers_rockies040907.evn > dodgers_rockies_040907_boxscore.txt.
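
The steps above depend on the event file and the matching team file sitting in the same directory as the program. Here is a minimal Python sketch that checks for both before you run anything. The function name missing_inputs is my own, and the demo creates empty placeholder files in a temporary directory.

```python
from pathlib import Path
import tempfile


def missing_inputs(directory, year, event_file):
    """Return the names of required input files that are absent from the directory."""
    folder = Path(directory)
    # The team file must be named TEAM followed by the four-digit year.
    required = [event_file, f"TEAM{year}"]
    return [name for name in required if not (folder / name).exists()]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "dodgers_rockies040907.evn").touch()  # empty placeholder
    before = missing_inputs(tmp, 2007, "dodgers_rockies040907.evn")
    (Path(tmp) / "TEAM2007").touch()                   # empty placeholder
    after = missing_inputs(tmp, 2007, "dodgers_rockies040907.evn")
print(before, after)
```

An empty list means both inputs are in place for the -y value you plan to use.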

Here is the output from box.exe.