
Subsets an edsurvey.data.frame, an edsurvey.data.frame.list, or a light.edsurvey.data.frame.
# S3 method for edsurvey.data.frame
subset(x, subset, ..., inside = FALSE)
Returns an object of the same class as x.
x: an edsurvey.data.frame, an edsurvey.data.frame.list, or a light.edsurvey.data.frame
subset: a logical expression indicating elements or rows to keep
...: not used; included only for compatibility
inside: set to TRUE to prevent substitute from being called on the subset condition (see Details)
Paul Bailey and Trang Nguyen
Any variables used in the subset condition that are not column names on the edsurvey.data.frame are evaluated in the environment from which subset.edsurvey.data.frame was called.
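The lookup order described above can be sketched with base R alone. This is not EdSurvey's implementation; the toy data frame `df`, its values, and the variable `male_var` are assumptions made for illustration. The sketch only shows the general mechanism of evaluating a captured condition against the data first and the calling environment second.

```r
# Toy stand-in for survey data (hypothetical values, not NAEP data)
df <- data.frame(dsex = c("Male", "Female", "Male"),
                 composite = c(280, 275, 290))

# A variable that is NOT a column of df; it lives in the calling environment
male_var <- "Male"

# An unevaluated condition that mixes a column name and an outside variable
cond <- quote(dsex == male_var)

# eval() looks for names in df first, then falls back to the enclosing
# environment, which is where male_var is found
keep <- eval(cond, envir = df, enclos = environment())

males <- df[keep, ]
nrow(males)
```

Here `dsex` resolves to the column and `male_var` resolves to the variable defined outside the data.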
Like base subset, this function calls substitute on the condition, so with inside set to its default value of FALSE it is difficult to use inside another function call. See Examples for how to call this function from within another function.
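The difficulty can be reproduced with a small base R sketch. The helper `subset_sketch` and both wrappers below are hypothetical and only mimic the substitute/inside pattern on a plain data.frame; they are not EdSurvey code and make no claim about its internals.

```r
# Toy data (hypothetical values)
df <- data.frame(dsex = c("Male", "Female", "Male"),
                 composite = c(280, 275, 290))

# Hypothetical helper: captures the condition unless inside = TRUE
subset_sketch <- function(x, subset, inside = FALSE) {
  enclos <- parent.frame()
  if (inside) {
    cond <- subset              # caller already supplies an unevaluated call
  } else {
    cond <- substitute(subset)  # capture the expression as it was typed
  }
  keep <- eval(cond, x, enclos)
  x[!is.na(keep) & keep, , drop = FALSE]
}

# Works at top level: the typed expression is captured directly
direct <- subset_sketch(df, dsex == "Male")

# Fails inside a wrapper: substitute() captures the symbol `condition`,
# and forcing it evaluates dsex outside the data, where it does not exist
wrapper_bad <- function(data, condition) subset_sketch(data, condition)
bad <- try(wrapper_bad(df, dsex == "Male"), silent = TRUE)

# Works inside a wrapper: build the call, then skip substitute()
wrapper_good <- function(data, var, level) {
  cond <- parse(text = paste0(var, " == \"", level, "\""))[[1]]
  subset_sketch(data, cond, inside = TRUE)
}
males <- wrapper_good(df, "dsex", "Male")
```

In this sketch `bad` holds a try-error, while `direct` and `males` contain the same rows.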
Wickham, H. (2014). Advanced R. Boca Raton, FL: Chapman & Hall/CRC.
if (FALSE) {
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
# table to compare to subsequent tables with subsets
edsurveyTable(composite ~ dsex, data=sdf, returnMeans=FALSE, returnSepct=FALSE)
# subset to just males
newsdf <- subset(x=sdf, subset= dsex == "Male")
# table of dsex after subset
edsurveyTable(composite ~ dsex, data=newsdf, returnMeans=FALSE, returnSepct=FALSE)
# Variable names that are not in the sdf get resolved in the parent frame.
# practically, that means that the following two subset
# calls sdfM1 and sdfM2 do the same thing
male_var <- "Male"
sdfM1 <- subset(x=sdf, subset= dsex == male_var)
sdfM2 <- subset(x=sdf, subset= dsex == "Male")
table(getData(data=sdfM1, varnames="dsex"))
table(getData(data=sdfM2, varnames="dsex"))
# variables can also be resolved as elements of a vector
genders <- c("Male", "Female","not a sex level")
sdfn <- subset(x=sdf, subset= dsex == genders[2])
table(getData(data=sdfn, varnames="dsex"))
# variables can also be subset using %in%
sdfM3 <- subset(x=sdf, subset= dsex %in% c("Male", "not a sex level"))
table(getData(data=sdfM3, varnames="dsex"))
# if you need to refer to a variable on the sdf dynamically, you can use as.name
dsex_standin <- as.name("dsex")
sdfM4 <- subset(x=sdf, subset= dsex_standin == "Male")
table(getData(data=sdfM4, varnames="dsex"))
# Here is an example of how one might want to call
# subset from within a function or loop.
# First, define a few variables to use dynamically
rhs_vars <- c("dsex", "b017451")
lvls <- c("Male", "Female")
# create a parsed condition
cond <- parse(text=paste0(rhs_vars[1], " == \"",lvls[1],"\""))[[1]]
# when inside=TRUE a parsed condition can be passed to subset
dsdf <- subset(x=sdf, subset=cond, inside=TRUE)
# check the result
table(getData(data=dsdf, varnames="dsex"))
# both of these return data, but use substantial memory
head(sdf[ , c("origwt","m145101")])
head(sdf[[c("origwt","m145101")]])
# subset an edsurvey.data.frame.list; first, create four subsets of sdf
sdfA <- subset(sdf, scrpsu %in% c(5,45,56))
sdfB <- subset(sdf, scrpsu %in% c(75,76,78))
sdfC <- subset(sdf, scrpsu %in% 100:200)
sdfD <- subset(sdf, scrpsu %in% 201:300)
# construct an edsurvey.data.frame.list from these four datasets
sdfl <- edsurvey.data.frame.list(list(sdfA, sdfB, sdfC, sdfD),
                                 labels=c("A locations",
                                          "B locations",
                                          "C locations",
                                          "D locations"))
sdfl2 <- subset(sdfl, dsex=="Male")
# the number of rows in each element of the sdfl
nrow(sdfl)
# the number of rows after subsetting each element to just the Males
nrow(sdfl2)
}