achievementLevels: Return achievement levels for an edsurvey.data.frame.

Description

Returns achievement levels using weights and variance estimates appropriate for the edsurvey.data.frame.

Usage

achievementLevels(achievementVars = NULL, aggregateBy = NULL, data,
  cutpoints = NULL, returnDiscrete = TRUE, returnCumulative = FALSE,
  weightVar = NULL, jrrIMax = 1, schoolMergeVarStudent = NULL,
  schoolMergeVarSchool = NULL, omittedLevels = TRUE,
  defaultConditions = TRUE, recode = NULL)

Arguments

achievementVars

character vector indicating variables to be included in the achievement levels table, potentially with a subject scale or subscale. When the subject scale or subscale is omitted, then the default subject scale or subscale is used. You can find the default composite scale and all subscales using the function showPlausibleValues.

aggregateBy

character vector specifying variables to aggregate achievement levels by. The percent column sums up to 100 for all levels of all variables specified here. When set to the default of NULL, the percent column sums up to 100 for all levels of all variables specified in achievementVars.

data

an edsurvey.data.frame.

cutpoints

numeric vector indicating cut points. Set to standard NAEP cut points for Basic, Proficient, Advanced by default.

returnDiscrete

logical indicating if discrete achievement levels should be returned. Defaults to TRUE.

returnCumulative

logical indicating if cumulative achievement levels should be returned. Defaults to FALSE.

weightVar

character indicating the weight variable to use; see Details.

jrrIMax

numeric value. When using the jackknife variance estimation method, the \(V_{jrr}\) term (see Details) can be estimated with any positive number of plausible values. It is estimated on the first k plausible values, where k is the smaller of jrrIMax and the number of available plausible values. When jrrIMax is set to Inf, all of the plausible values are used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.

schoolMergeVarStudent

a character variable name from the student file used to merge student and school data files. Set to NULL by default.

schoolMergeVarSchool

a character variable name from the school file used to merge student and school data files. Set to NULL by default.

omittedLevels

a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified in achievementVars and aggregateBy. Use print on an edsurvey.data.frame to see the omitted levels.

defaultConditions

a logical value. When set to the default value of TRUE, uses the default conditions stored in edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.

recode

a list of lists to recode variables. Defaults to NULL. Can be set as recode = list(var1 = list(from = c("a","b","c"), to = "d")). See Examples.
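As a rough illustration of the from/to semantics described for recode, the following base R sketch collapses several factor levels into one on a toy factor. The helper applyRecode and the data are hypothetical and are not part of EdSurvey; the package's internal handling may differ.

```r
# Toy stand-in for a survey factor; the values are made up for illustration.
x <- factor(c("a", "b", "c", "e", "a"))

# Hypothetical helper mimicking the documented from/to rule;
# this is NOT EdSurvey's internal code.
applyRecode <- function(f, rule) {
  lv <- levels(f)
  lv[lv %in% rule$from] <- rule$to
  levels(f) <- lv  # assigning duplicate labels merges the levels
  f
}

recoded <- applyRecode(x, list(from = c("a", "b", "c"), to = "d"))
table(recoded)
```

Levels not listed in from (here "e") are left unchanged.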

Value

A list containing up to two data.frames, one for each of the discrete and cumulative achievement levels, as determined by returnDiscrete and returnCumulative. Each data.frame contains the following columns:

Level

One row for each level of the specified achievement cut points.

Variables in achievementVars

One column for each variable in achievementVars, and one row for each level of each variable in achievementVars.

Percent

Percentage of students at each achievement level (discrete) or at or above each achievement level (cumulative), aggregated as specified by aggregateBy.

StandardError

The standard error of the percentage, accounting for the survey sampling methodology. See the statistics vignette.

n0

The number of observations in the incoming data (the number of rows when omittedLevels and defaultConditions are set to FALSE).

nUsed

The number of observations in the data after applying all filters (see omittedLevels and defaultConditions).

Details

The achievementLevels function applies the appropriate weights and variance estimation method for each edsurvey.data.frame, with several arguments for customizing the aggregation and output of the analysis results. Using these optional arguments, users can (1) generate the percentage of students performing at each achievement level (discrete) or at or above each achievement level (cumulative); (2) calculate the percentage distribution of students by achievement level (discrete or cumulative) and selected characteristics (specified in aggregateBy); and (3) compute the percentage distribution of students by selected characteristics within a specific achievement level.

Calculation of percentages

The details of the methods are shown in the statistics vignette, which you can read by running vignette("statistics", package="EdSurvey") at the R prompt. The methods described in “Estimation of weighted percentages when plausible values are present” are used to calculate all cumulative and discrete probabilities.

When the requested achievement levels are discrete (returnDiscrete = TRUE), the percentage \(\mathcal{A}\) is the percentage of students (within the categories specified in aggregateBy) whose scores lie in the range [\(cutPoints_i\), \(cutPoints_{i+1}\)), i = 0,1,...,n. Here cutPoints are the score thresholds provided by the user, with \(cutPoints_0\) taken to be 0. By default, cutPoints are set to the NAEP standard cut points for achievement levels. To aggregate by a specific variable, for example, dsex, specify dsex in aggregateBy and all other variables in achievementVars. To aggregate by achievement levels, specify the name of the plausible value in aggregateBy and all other variables in achievementVars.
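The half-open ranges above can be illustrated with a small base R sketch on a single vector of scores. The scores, weights, cut points, and level labels below are made up for illustration (they are not NAEP values), the top range is assumed to be unbounded above, and the real function additionally handles plausible values and survey variance estimation.

```r
# Made-up scores, weights, and cut points; NOT NAEP values.
scores    <- c(180, 220, 250, 262, 300, 340)
weights   <- c(1, 2, 1, 1, 2, 1)
cutPoints <- c(262, 299, 333)
labels    <- c("Below Basic", "Basic", "Proficient", "Advanced")  # assumed labels

# findInterval() returns i such that cutPoints[i] <= score < cutPoints[i + 1],
# and 0 for scores below the first cut point, matching the half-open ranges.
level <- factor(findInterval(scores, cutPoints), levels = 0:3, labels = labels)

# Weighted percentage of students in each level
discrete <- 100 * tapply(weights, level, sum) / sum(weights)
discrete
```

Note that a score exactly equal to a cut point (262 here) falls in the higher level, because each range is closed on the left.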

When the requested achievement levels are cumulative (returnCumulative = TRUE), the percentage \(\mathcal{A}\) is the percentage of students (within the categories specified in aggregateBy) whose scores lie in the range [\(cutPoints_i\), \(\infty\)), i = 1,2,...,n-1. The first and last categories are the same as defined for discrete levels.
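A comparable sketch for the cumulative case computes, for each cut point, the weighted share of students scoring at or above it. As before, the data are made up and this only illustrates the definition of the ranges, not the package's implementation.

```r
# Made-up scores, weights, and cut points; NOT NAEP values.
scores    <- c(180, 220, 250, 262, 300, 340)
weights   <- c(1, 2, 1, 1, 2, 1)
cutPoints <- c(262, 299, 333)

# Weighted percentage of students with score >= each cut point
atOrAbove <- sapply(cutPoints, function(cp) {
  100 * sum(weights[scores >= cp]) / sum(weights)
})
names(atOrAbove) <- paste("At or above", cutPoints)
atOrAbove
```

The value for the highest cut point coincides with the discrete percentage for the top level, consistent with the last category being defined the same way in both tables.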

Calculation of standard error of percentages

The method used to calculate the standard error of the percentages is described in the statistics vignette in the section “Estimation of the standard error of weighted percentages when plausible values are present, using the jackknife method.” The value of jrrIMax sets the value of \(m^*\).
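To make the role of \(m^*\) concrete, here is a minimal sketch of a Rubin-style combination of a jackknife sampling term and an imputation term. All inputs are made up, the variable names are hypothetical, and the formulas are a generic illustration in the spirit of Rubin (1987); the statistics vignette remains the authoritative description of what EdSurvey computes.

```r
# Made-up inputs: percentage estimates from M = 3 plausible values, and
# jackknife replicate estimates (rows = replicates, columns = plausible values).
est  <- c(40, 42, 44)
reps <- cbind(c(39, 41), c(41, 43), c(43, 45))
jrrIMax <- 1

M     <- length(est)
mStar <- min(jrrIMax, M)  # plausible values used for the sampling term

# Sampling term: squared deviations of the replicate estimates from the
# full-sample estimate, averaged over the first mStar plausible values.
vJrr <- mean(sapply(seq_len(mStar), function(p) sum((reps[, p] - est[p])^2)))

# Imputation term: spread of the estimates across plausible values.
vImp <- (1 + 1 / M) * var(est)

se <- sqrt(vJrr + vImp)
se
```

Setting jrrIMax to Inf in this sketch makes mStar equal to M, so every plausible value contributes to the sampling term at the cost of more computation.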

References

Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York, NY: Wiley.

Examples

# NOT RUN {
# read in the example data (generated, not real student data)
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))

# Discrete achievement Levels
achievementLevels(achievementVars=c("composite"), aggregateBy=NULL, data=sdf) 

# Cumulative achievement Levels
achievementLevels(achievementVars=c("composite"), aggregateBy=NULL, data=sdf, 
                  returnCumulative=TRUE) 

# Achievement levels as independent variables, by sex aggregated by composite
achievementLevels(achievementVars=c("composite", "dsex"), aggregateBy="composite",
                  data=sdf, returnCumulative = TRUE) 

# Achievement levels as independent variables, by sex aggregated by sex
achievementLevels(achievementVars=c("composite", "dsex"), aggregateBy="dsex", 
                  data=sdf, returnCumulative=TRUE) 

# Achievement levels as independent variables, by race aggregated by race
achievementLevels(achievementVars=c("composite", "sdracem"),
                  aggregateBy="sdracem", data=sdf, returnCumulative=TRUE) 

# Use recode to change values for specified variables:
achievementLevels(achievementVars=c("composite","dsex", "b017451"),
                           aggregateBy = "dsex", data=sdf,
                           recode=list(
                             b017451=list(
                               from=c("Never or hardly ever",
                                      "Once every few weeks","About once a week"),
                               to=c("Infrequently")),
                             b017451=list(
                               from=c("2 or 3 times a week","Every day"),
                               to=c("Frequently"))))

# }