EdSurvey (version 1.0.6)

getData: Gets data from an edsurvey.data.frame.

Description

Reads in selected columns.

Usage

getData(data, varnames = NULL, drop = FALSE, schoolMergeVarStudent = NULL,
  schoolMergeVarSchool = NULL, dropUnusedLevels = TRUE,
  omittedLevels = TRUE, defaultConditions = TRUE, formula = NULL,
  recode = NULL, includeNaLabel = FALSE, addAttributes = FALSE,
  returnJKreplicates = TRUE)

Arguments

data

an edsurvey.data.frame or light.edsurvey.data.frame.

varnames

a character vector of variable names that will be returned. When both varnames and a formula are specified, variables associated with both are returned. Set to NULL by default.

drop

a logical value. When set to the default value of FALSE, a single returned column is still represented as a data.frame and is not converted to a vector.

schoolMergeVarStudent

a character variable name from the student file used to merge student and school data files. Set to NULL by default.

schoolMergeVarSchool

a character variable name from the school file used to merge student and school data files. Set to NULL by default.

dropUnusedLevels

a logical value. When set to the default value of TRUE, drops unused levels of all factor variables.

omittedLevels

a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified as omitted in the edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels.

defaultConditions

a logical value. When set to the default value of TRUE, uses the default conditions stored in the edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.

formula

a formula. When included, getData returns data associated with all variables of the formula. When both varnames and a formula are specified, the variables associated with both are returned. Set to NULL by default.

recode

a list of lists to recode variables. Defaults to NULL. Can be set as recode = list(var1 = list(from = c("a","b","c"), to = "d")). See examples.

includeNaLabel

a logical value indicating whether NA (missing) values are returned as literal NAs or as factor levels coded as “NA”. Defaults to FALSE.

addAttributes

a logical value. Set to TRUE to get a data.frame that can be used in calls to other functions that usually would take an edsurvey.data.frame.

returnJKreplicates

a logical value indicating whether JK replicate weights should be returned. Defaults to TRUE.
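The drop and dropUnusedLevels arguments described above resemble familiar base R behavior. The following is a minimal sketch in base R only, run on a made-up data.frame (toy and its values are hypothetical, not NAEP data). It illustrates the idea and is not how getData is implemented.

```r
# Toy data (hypothetical; not NAEP data). "Omitted" is a level with no rows.
toy <- data.frame(
  dsex  = factor(c("Male", "Female", "Male"),
                 levels = c("Male", "Female", "Omitted")),
  score = c(250, 262, 241)
)

# Analogous to drop = FALSE: a single column stays a data.frame
one_col_df <- toy[, "dsex", drop = FALSE]

# Analogous to drop = TRUE: a single column collapses to a vector (a factor here)
one_col_vec <- toy[, "dsex", drop = TRUE]

# Analogous to dropUnusedLevels = TRUE: levels with no observations are removed
trimmed <- droplevels(toy)

class(one_col_df)
class(one_col_vec)
levels(toy$dsex)
levels(trimmed$dsex)
```

Keeping drop = FALSE is usually the safer choice in scripts, because downstream code can rely on always receiving a data.frame.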

Value

When addAttributes is FALSE, returns a data.frame containing data associated with requested variables. When addAttributes is TRUE, returns a light.edsurvey.data.frame.

Details

By default, an edsurvey.data.frame does not have data read into memory until getData is called, which keeps its memory footprint minimal. To keep the footprint small, limit varnames to only the variables you need. All data are labeled according to NAEP documentation. Note that if both formula and varnames are populated, the variables from both are included.

For details on using this function, see the vignette available by calling vignette("getData", package = "EdSurvey") in R.
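To make the combination of formula and varnames concrete, the sketch below uses base R only to show which variable names a formula refers to and how they combine with a varnames vector. This is an assumption about the general idea, not EdSurvey's actual code. In particular, names such as "composite" may be expanded by the package into several underlying columns.

```r
# A formula and a varnames vector, as they might be passed to getData
f <- composite ~ dsex + b017451
varnames <- c("dsex", "t088301")

# all.vars() extracts the variable names that appear in a formula
formula_vars <- all.vars(f)

# The requested set is the union of both sources, without duplicates
requested <- union(varnames, formula_vars)
requested
```

Passing only the names in requested, rather than every column in the file, is what keeps the memory footprint small.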

See Also

subset.edsurvey.data.frame for how to remove rows from the output.

Examples

# NOT RUN {
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

# get two variables, without weights
df <- getData(data=sdf, varnames=c("dsex", "b017451"))
table(df)

# example of using recode
df2 <- getData(data=sdf, varnames=c("dsex", "t088301"),
               recode=list(t088301=list(from=c("Yes, available","Yes, I have access"),
                                        to=c("Yes")),
                           t088301=list(from=c("No, have no access"),
                                        to=c("No"))))
table(df2)

# When readNAEP is called on a data file it appends a default 
# condition to the edsurvey.data.frame. You can see these conditions
# by printing the sdf
sdf

# As per the default condition specified, getData restricts the data to only
# Reporting Sample. This behavior can be changed as follows:
df2 <- getData(data=sdf, varnames=c("dsex", "b017451"), defaultConditions = FALSE)
table(df2)

# Similarly, the default behavior of omitting certain levels specified
# in the edsurvey.data.frame can be changed
df2 <- getData(data=sdf, varnames=c("dsex", "b017451"), omittedLevels = FALSE)
table(df2)

# Merge a school data file by passing a common variable through the arguments 
# `schoolMergeVarStudent` and `schoolMergeVarSchool`. In this example, 
# the variable "c052601" is from the school data file, merging on "scrpsu"
# (student file) and "sscrpsu" (school file):
gddat <- getData(data=sdf, varnames=c("composite", "dsex", "b017451","c052601"),
  schoolMergeVarStudent='scrpsu', schoolMergeVarSchool="sscrpsu", addAttributes = TRUE)
# look at the first few lines
head(gddat)
# }
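The recode example above maps several original labels onto one new label. The following is a rough base R sketch of that from/to idea on a made-up factor (access and recode_rule are hypothetical names, and this is not EdSurvey's implementation):

```r
# Toy responses (hypothetical; labels borrowed from the recode example)
access <- factor(c("Yes, available", "Yes, I have access",
                   "No, have no access", "Yes, available"))

# One recode rule in the same from/to shape that getData's recode argument uses
recode_rule <- list(from = c("Yes, available", "Yes, I have access"),
                    to   = "Yes")

# Apply the rule: every value listed in `from` becomes `to`
recoded <- as.character(access)
recoded[recoded %in% recode_rule$from] <- recode_rule$to
recoded <- factor(recoded)

table(recoded)
```

Values not listed in from are left unchanged, so a second rule would be needed to shorten "No, have no access" as well.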