summary2: Summarize edsurvey.data.frame Variables

Description

Summarizes edsurvey.data.frame variables.

Usage

summary2(
  data,
  variable,
  weightVar = attr(getAttributes(data, "weights"), "default"),
  omittedLevels = FALSE
)

Arguments

data

an edsurvey.data.frame, an edsurvey.data.frame.list, or a light.edsurvey.data.frame

variable

character vector of variable names

weightVar

character weight variable name. Default is the default weight of data if it exists. If the given survey data do not have a default weight, the function will produce unweighted statistics instead. Can be set to NULL to return unweighted statistics.

omittedLevels

a logical value. When set to TRUE, drops the omitted levels of the specified variable. Use print on an edsurvey.data.frame to see the omitted levels. Defaults to FALSE.

Value

summary of weighted or unweighted statistics of a given variable in an edsurvey.data.frame

For categorical variables, the summary results are a crosstab of all variables and include the following:

[variable name]

level of the variable named in the column header that the row describes. There is one column per element of variable.

N

number of cases for each category. Weighted N also is produced if users choose to produce weighted statistics.

Percent

percentage of each category. Weighted percent also is produced if users choose to produce weighted statistics.

SE

standard error of the percentage statistics
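To illustrate how the unweighted and weighted columns for a categorical variable differ, here is a minimal base R sketch on toy data. The vectors level and weight are hypothetical, and the code is not the EdSurvey implementation; it only shows the idea that a weighted N is a sum of weights and a weighted percent is that sum divided by the total weight. It does not compute the standard error.

```r
# Toy data: hypothetical category levels and hypothetical survey weights
level  <- c("A", "A", "B", "B", "B")
weight <- c(1, 1, 2, 2, 4)

n          <- table(level)                # unweighted count per level
weighted_n <- tapply(weight, level, sum)  # sum of weights per level

pct          <- 100 * n / sum(n)                    # unweighted percent
weighted_pct <- 100 * weighted_n / sum(weighted_n)  # weighted percent

# Collect the illustrative columns in one table
result <- data.frame(
  level            = names(weighted_n),
  N                = as.vector(n),
  weighted_N       = as.vector(weighted_n),
  percent          = as.vector(pct),
  weighted_percent = as.vector(weighted_pct)
)
print(result)
```

In this toy case, level B holds 60 percent of the cases but 80 percent of the total weight, which is why the weighted and unweighted percentages can diverge.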

For continuous variables, the summary results are by variable and include the following:

Variable

name of the variable that the row describes

N

total number of cases (both valid and invalid cases)

Min.

smallest value of the variable

1st Qu.

first quartile of the variable

Median

median value of the variable

Mean

mean of the variable

3rd Qu.

third quartile of the variable

Max.

largest value of the variable

SD

standard deviation or weighted standard deviation

NA's

number of NA values in the variable and in the weight variable

Zero-weights

number of zero-weight cases if users choose to produce weighted statistics

If the weight option is chosen, the function produces weighted percentiles and a weighted standard deviation. Refer to the vignette titled Statistical Methods Used in EdSurvey and the vignette titled Methods Used for Estimating Percentiles in EdSurvey for how the function calculates these statistics (with and without plausible values).
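As a rough illustration of what a weighted mean and weighted standard deviation are, the base R sketch below works on toy data. The vectors x and w are hypothetical, and the standard deviation formula used here (weighted squared deviations divided by the total weight) is an assumption chosen for simplicity; the vignettes named above describe what EdSurvey actually computes, including the handling of plausible values.

```r
# Toy data: hypothetical values and hypothetical survey weights
x <- c(10, 20, 30, NA, 50)
w <- c(1, 2, 1, 1, 0)

# Counts analogous to the NA's and Zero-weights columns
n_na          <- sum(is.na(x) | is.na(w))
n_zero_weight <- sum(w == 0, na.rm = TRUE)

# Keep only cases with both a value and a weight
keep <- !is.na(x) & !is.na(w)
xk <- x[keep]
wk <- w[keep]

# Weighted mean (stats::weighted.mean) and an assumed weighted SD
w_mean <- weighted.mean(xk, wk)
w_sd   <- sqrt(sum(wk * (xk - w_mean)^2) / sum(wk))

print(c(mean = w_mean, sd = w_sd, NAs = n_na, zero_weights = n_zero_weight))
```

Note that the zero-weight case contributes nothing to either statistic, which is why reporting the number of such cases separately is useful.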

Examples

# NOT RUN {
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

# print out summary of weighted statistics of a continuous variable
summary2(sdf, "composite")
# print out summary of weighted statistics of a variable, including omitted levels
summary2(sdf, "b017451", omittedLevels = FALSE)
# make a crosstab
summary2(sdf, c("b017451", "dsex"), omittedLevels = FALSE)

# print out summary of unweighted statistics of a variable
summary2(sdf, "composite", weightVar = NULL)
# }
