Lahman (version 4.0-1)

Appearances: Appearances table

Description

Data on player appearances

Usage

data(Appearances)

Format

A data frame with 99466 observations on the following 21 variables.
yearID       Year
teamID       Team; a factor
lgID         League; a factor with levels AA AL FL NL PL UA
playerID     Player ID code
G_all        Total games played
GS           Games started
G_batting    Games in which player batted
G_defense    Games in which player appeared on defense
G_p          Games as pitcher
G_c          Games as catcher
G_1b         Games as first baseman
G_2b         Games as second baseman
G_3b         Games as third baseman
G_ss         Games as shortstop
G_lf         Games as left fielder
G_cf         Games as center fielder
G_rf         Games as right fielder
G_of         Games as outfielder
G_dh         Games as designated hitter
G_ph         Games as pinch hitter
G_pr         Games as pinch runner

Source

Lahman, S. (2015) Lahman's Baseball Database, 1871-2014, 2015 version, http://baseball1.com/statistics/

Details

The Appearances table in the original version of the database has some incorrect variable names. In particular, its 5th column is named career_year.
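Since variable names have differed between versions of the table, it may be worth checking an installed copy against the names documented in the Format section. The following is a minimal sketch, not part of the package: `expected_cols` is a hypothetical helper vector copied from the Format section, and the check runs only if the Lahman package is available.

```r
# Column names as documented in the Format section above.
# 'expected_cols' is a hypothetical name used only for this illustration.
expected_cols <- c("yearID", "teamID", "lgID", "playerID",
                   "G_all", "GS", "G_batting", "G_defense",
                   "G_p", "G_c", "G_1b", "G_2b", "G_3b", "G_ss",
                   "G_lf", "G_cf", "G_rf", "G_of",
                   "G_dh", "G_ph", "G_pr")

# Only run the comparison when the Lahman package is installed.
if (requireNamespace("Lahman", quietly = TRUE)) {
  data("Appearances", package = "Lahman")

  # Documented columns that are missing from the installed table
  print(setdiff(expected_cols, names(Appearances)))

  # Columns in the installed table that are not documented here
  print(setdiff(names(Appearances), expected_cols))
}
```

Both `setdiff()` calls print `character(0)` when the installed table matches this page exactly; other versions of the package may differ.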

Examples

data(Appearances)

# some test cases
# Henry Aaron spent the last two years of his career as DH in Milwaukee
subset(Appearances, playerID == 'aaronha01')
# Herb Washington, strictly a pinch runner for Oakland in 1974-75
subset(Appearances, playerID == 'washihe01')
subset(Appearances, playerID == 'thomeji01')
subset(Appearances, playerID == 'hairsje02')

# Appearances for the 1984 Cleveland Indians
subset(Appearances, teamID == "CLE" & yearID == 1984)


if (require(reshape2) && require(plyr)) {
# Appearances for Pete Rose during his career:
prose <- subset(Appearances, playerID == "rosepe01")


# What was Pete Rose's primary position each year 
# of his career?

prose_melt <- melt(prose, id.vars = c("yearID", "teamID"),
                          measure.vars = 9:17)  # columns G_p through G_rf
# Split out the position from variable
prose_melt <- cbind(prose_melt, colsplit(prose_melt$variable, 
                                         "_", names = c("G", "pos")))

# Two grouping variables because of an in-season trade in 1984
primary_pos <- ddply(prose_melt, .(yearID, teamID), summarise,
                         top_pos = pos[which.max(value)],
                         games = max(value))
primary_pos

# Most pitcher appearances each year since 1950
ddply(subset(Appearances, yearID >= 1950), .(yearID), summarise,
                              maxPitcher = playerID[which.max(G_p)],
                              maxAppear = max(G_p))

# Individuals who have played all 162 games since 1961
all162 <- ddply(subset(Appearances, yearID > 1960), .(yearID), summarise,
                      allGamers = playerID[G_all == 162])
# Number of all-gamers by year
table(all162$yearID)
}
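The primary-position calculation above can also be sketched in base R, without reshape2 or plyr. This is one possible alternative, not the package's own method. The function `primary_position()` and the data frame `toy` are hypothetical names introduced here, and the toy values are invented for illustration rather than taken from the Lahman data. The sketch assumes the position columns contain no missing values.

```r
# Toy rows with the same column names as Appearances. The values are
# made up for illustration and are NOT taken from the Lahman data.
toy <- data.frame(
  yearID   = c(2001, 2002, 2002),
  teamID   = c("AAA", "AAA", "BBB"),
  playerID = "examppl01",
  G_p  = c(0, 0, 0),   G_c  = c(0, 0, 0),
  G_1b = c(10, 90, 5), G_2b = c(0, 0, 0), G_3b = c(80, 12, 0),
  G_ss = c(0, 0, 0),   G_lf = c(3, 0, 20),
  G_cf = c(0, 0, 0),   G_rf = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# The nine single-position columns, G_p through G_rf
pos_cols <- c("G_p", "G_c", "G_1b", "G_2b", "G_3b",
              "G_ss", "G_lf", "G_cf", "G_rf")

# Hypothetical helper: for each row (one player-team-season), find the
# position column with the most games. Ties go to the first column.
primary_position <- function(df, cols = pos_cols) {
  games <- as.matrix(df[, cols])
  top <- max.col(games, ties.method = "first")
  data.frame(
    yearID  = df$yearID,
    teamID  = df$teamID,
    top_pos = sub("^G_", "", cols[top]),
    games   = games[cbind(seq_len(nrow(games)), top)],
    stringsAsFactors = FALSE
  )
}

result <- primary_position(toy)
print(result)
```

With the Lahman data loaded, the same helper could be applied to `subset(Appearances, playerID == "rosepe01")`, provided the position columns in those rows have no `NA` values.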
