This dataset pertains to children and their families in the United States and is intended to illustrate missing data issues. Note that although the original data are longitudinal, this extract is not.
data(nlsyV)
A data frame with 400 randomly subsampled observations on the following 7 variables.
ppvtr.36: a numeric vector with data on the Peabody Picture Vocabulary Test (Revised) administered at 36 months
first: indicator for whether child was first-born
b.marr: indicator for whether mother was married when child was born
income: a numeric vector with data on family income in year after the child was born
momage: a numeric vector with data on the age of the mother when the child was born
momed: educational status of mother when child was born (1 = less than high school, 2 = high school graduate, 3 = some college, 4 = college graduate)
momrace: race of mother (1 = black, 2 = Hispanic, 3 = white)
Note that momed would typically be an ordered factor, while momrace
would typically be an unordered factor, but both are numeric in this
data.frame in order to illustrate the mechanism to change the
type of a missing_variable.
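
The distinction between the two codings can be sketched in base R. This is only an illustration of what an ordered versus unordered factor looks like, not the package's own mechanism for changing the type of a missing_variable. The data frame `toy` and its values are made up for the example; only the column names and level codes come from the description above.

```r
# Toy stand-in for two columns of nlsyV (hypothetical values, not real data)
toy <- data.frame(
  momed   = c(1, 3, 2, 4, NA),
  momrace = c(3, 1, 2, NA, 3)
)

# momed has a natural ordering, so code it as an ordered factor
toy$momed <- factor(
  toy$momed,
  levels = 1:4,
  labels = c("less than high school", "high school graduate",
             "some college", "college graduate"),
  ordered = TRUE
)

# momrace has no natural ordering, so code it as an unordered factor
toy$momrace <- factor(
  toy$momrace,
  levels = 1:3,
  labels = c("black", "Hispanic", "white")
)

# Inspect the resulting column types; missing values stay NA
str(toy)
```

Comparisons such as `toy$momed >= "some college"` are meaningful only for the ordered factor, which is one reason the type of each variable matters when modeling it.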