- panel.data
REQUIRED. Object of class list, data.frame, or matrix containing longitudinal student data in wide format. If supplied as part of a list, data should be contained in panel.data$Panel_Data. Data must be formatted so that student ID is the first variable/column, student grade/time variables for each time period, from earliest to most recent, are the next variables/columns, and student scale score variables for each year, from earliest to latest, are the remaining variables/columns. See sgpData for an exemplar data set. NOTE: The column position of the variables IS IMPORTANT, NOT the names of the variables.
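The column layout described for panel.data can be sketched with a small data.frame. This is only an illustration: the column names and values below are hypothetical, and only the column positions follow the description above.

```r
# Hypothetical wide-format panel: names are illustrative only, since
# position (not name) determines how each column is interpreted.
panel <- data.frame(
  ID = c("S001", "S002", "S003"),   # column 1: student ID
  GRADE_2022 = c(3, 3, 4),          # grade columns, earliest first
  GRADE_2023 = c(4, 4, 5),
  SS_2022 = c(410, 385, 452),       # scale score columns, earliest first
  SS_2023 = c(436, 401, 470),
  stringsAsFactors = FALSE
)

# With one ID column, the remaining columns split evenly into grades and scores.
num.panels <- (ncol(panel) - 1) / 2
grade.columns <- 2:(1 + num.panels)
score.columns <- (2 + num.panels):ncol(panel)
```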
- sgp.labels
REQUIRED. A list, sgp.labels, of the form list(my.year= , my.subject= ) or list(my.year= , my.subject= , my.extra.label). The user-specified values are used to save the student growth percentiles, coefficient matrices, knots/boundaries, and goodness of fit results in an orderly fashion using an appropriate combination of year, subject, and grade. Except in special circumstances, supplying my.year and my.subject is sufficient to uniquely label derivative output.
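A minimal sketch of an sgp.labels list follows. The year and subject values are made up, and joining them with a period is an assumption based on the my.subject.my.year storage pattern mentioned elsewhere in these argument descriptions, not a statement of how the package builds its labels internally.

```r
# Hypothetical label values for one analysis.
sgp.labels <- list(my.year = 2023, my.subject = "Reading")

# Assumed naming pattern: subject and year joined by a period.
output.label <- paste(sgp.labels$my.subject, sgp.labels$my.year, sep = ".")
```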
- panel.data.vnames
Vector of variables to use in student growth percentile calculations. If not specified, function attempts to use all available variables.
- additional.vnames.to.return
A list of the form list(VARIABLE_NAME_SUPPLIED=VARIABLE_NAME_TO_BE_RETURNED) indicating data to be returned with results from studentGrowthPercentiles analyses.
- grade.progression
Preferred argument to specify a student grade/time progression in the data. For example, 3:4 would indicate to subset the data where the two most recent grades for which data are available are 3 and 4, respectively. The argument allows for non-sequential grade progressions to be analyzed with automatic removal of columns where "holes" occur in the supplied grade.progression. For example, for the grade.progression c(7,8,10), the penultimate GRADE and SCALE_SCORE column in the supplied panel.data would be removed. Alternatively, c(7,8,10) can be combined with an appropriate panel.data.vnames argument that removes a year of data, so that students progressing from grade 7 to 8 to 10 are analyzed.
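The removal of "hole" columns for a non-sequential progression can be sketched in base R. This is an illustration of the idea under the wide layout described for panel.data, not the package's internal code; the data and column names are hypothetical.

```r
grade.progression <- c(7, 8, 10)

# Grades that a fully sequential progression would contain.
full.sequence <- seq(min(grade.progression), max(grade.progression))
hole.positions <- which(!full.sequence %in% grade.progression)

# Hypothetical wide data: ID, four grade columns, four score columns.
panel <- data.frame(
  ID = 1:3,
  GRADE_1 = 7, GRADE_2 = 8, GRADE_3 = 9, GRADE_4 = 10,
  SS_1 = c(500, 510, 520), SS_2 = c(515, 522, 531),
  SS_3 = c(528, 530, 540), SS_4 = c(541, 548, 555)
)

# Drop the grade and score column at each hole (offset by the ID column).
num.panels <- length(full.sequence)
drop.columns <- c(1 + hole.positions, 1 + num.panels + hole.positions)
panel.trimmed <- panel[, -drop.columns]
```

Here the missing grade 9 sits in the third of four panels, so the penultimate grade and score columns are the ones removed.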
- content_area.progression
Character vector of content area names of the same length as grade.progression, to be provided if the content areas are not all identical to 'my.subject' in the sgp.labels list. Vector will be used to populate the @Content_Areas slot of the splineMatrix class coefficient matrices. If missing, 'sgp.labels$my.subject' is repeated in a vector of the same length as grade.progression.
- year.progression
Character vector of years associated with grade and content area progressions. If missing then the year.progression is assumed to end in 'my.year' provided in
sgp.labels and be of the same length as grade.progression. Vector will be used to populate the @Years slot of the splineMatrix class coefficient matrices.
- year_lags.progression
A numeric vector indicating the time lags/span between observations in the columns supplied to studentGrowthPercentiles. The default, NULL, allows the function to calculate the lags/differences based upon the supplied years.
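As a rough illustration of what lags derived from years look like, the differences between consecutive years can be computed directly. The year values are hypothetical and assumed to be plain four-digit years; the package may handle other year formats differently.

```r
# Hypothetical years attached to three consecutive observations.
year.progression <- c("2019", "2021", "2022")

# Lag between each observation and the one before it.
year.lags <- diff(as.numeric(year.progression))
```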
- num.prior
Number of prior scores one wishes to use in the analysis. Defaults to num.panels-1. If num.prior=1, then only 1st order growth percentiles are computed, if num.prior=2, then 1st and 2nd order are computed, if num.prior=3, 1st, 2nd, and 3rd ... NOTE: specifying num.prior is necessary in some situations (in early grades for example) where the number of prior data points is small compared to the number of panels of data.
- max.order.for.percentile
A positive integer indicating the maximum order for percentiles desired. Similar limiting of number of priors used can be accomplished using the grade.progression argument.
- return.additional.max.order.sgp
A positive integer (defaults to NULL) indicating the order of an additional SGP to be returned: SGP_MAX_ORDER_N.
- subset.grade
Student grade level for sub-setting. If the data fed into the function contains multiple grades, setting subset.grade=5 selects out those students in grade five in the most recent year of the data. If no sub-setting is desired, do not include the subset.grade argument. If grade.progression is supplied, then a subset grade is implicitly specified.
- percentile.cuts
Additional percentile cuts (supplied as a vector) between 1 and 99 associated with each student's conditional distribution.
Default is to provide NO growth percentile cuts (scale scores associated with those growth percentiles) for each student.
- growth.levels
A two letter state acronym or a list of the form list(my.cuts= , my.levels= ) specifying a vector of cuts between 1 and 99 (e.g., 35, 65) and the qualitative levels associated with the cuts (e.g., low, typical, and high). Note that the length of my.levels should be one more than the length of my.cuts. To add your growth levels to the SGPstateData data set, please contact the package administrator.
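The relationship between my.cuts and my.levels (one more level than cuts) can be sketched with findInterval. This is a hedged illustration: the SGP values are made up, and placing a percentile equal to a cut into the upper level is an assumption about boundary handling, not documented package behavior.

```r
growth.levels <- list(my.cuts = c(35, 65), my.levels = c("Low", "Typical", "High"))

# Hypothetical student growth percentiles.
sgp <- c(10, 35, 64, 65, 99)

# findInterval counts how many cuts each value meets or exceeds;
# adding 1 turns that count into an index into my.levels.
level.index <- findInterval(sgp, growth.levels$my.cuts) + 1
sgp.level <- growth.levels$my.levels[level.index]
```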
- use.my.knots.boundaries
A list of the form list(my.year= , my.subject= ) specifying a set of pre-calculated knots and boundaries for B-spline calculations. Most often used to utilize knots and boundaries calculated from a previous analysis. Knots and boundaries are stored (and must be made available) with panel.data supplied as a list in panel.data$Knots_Boundaries$my.subject.my.year. As of SGP_0.0-6 the user can also supply a two letter state acronym to utilize knots and boundaries within the SGPstateData data set supplied with the SGP package. To add your knots and boundaries to the SGPstateData data set, please contact the package administrator. If missing, the function automatically calculates knots, boundaries, and loss.hoss values and stores them in panel.data$Knots_Boundaries$my.subject.my.year, where my.subject and my.year are provided by sgp.labels.
- use.my.coefficient.matrices
A list of the form list(my.year= , my.subject= ) specifying a set of pre-calculated coefficient matrices to use for student growth percentile calculations. Can be used to calculate baseline referenced student growth percentiles or to calculate student growth percentiles for small groups of excluded students without recalculating an entire set of data. If missing, coefficient matrices are calculated based upon the provided data and stored in panel.data$Coefficient_Matrices$my.subject.my.year, where my.subject and my.year are provided by sgp.labels.
- calculate.confidence.intervals
A character vector providing either a state acronym or a variable name from the supplied panel data. If a state acronym, CSEM tables from the embedded SGPstateData will be used (note: CSEM data must be embedded in the SGPstateData set; to have your state CSEMs embedded in the SGPstateData set, please contact the package administrator). If a variable name, the supplied panel data must contain a variable providing student level CSEMs (e.g., with adaptive testing). NOTE: If a variable name is supplied, the user must also use the argument panel.data.vnames indicating what variables in the supplied panel.data will be used for the studentGrowthPercentiles analysis. For greater control, the user can also supply a list of the form list(state= , confidence.quantiles= , simulation.iterations= , distribution= , round= ) or list(variable= , confidence.quantiles= , simulation.iterations= , distribution= , round= ) specifying the state or variable to use, confidence.quantiles to report from the simulated SGPs calculated for each student, simulation.iterations indicating the number of simulated SGPs to calculate, distribution indicating whether to use the Normal or Skew-Normal distribution to calculate SGPs, and round (defaults to 1, which is an integer; see round_any from the plyr package for details) giving the level to round to. If requested, simulations are calculated and simulated SGPs are stored in panel.data$Simulated_SGPs.
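The list form of calculate.confidence.intervals might be assembled as follows. The element names come from the description above; every value shown (the "XX" placeholder acronym, the quantiles, the iteration count) is illustrative only and not a recommended or default setting.

```r
# Illustrative values only; "XX" is a placeholder state acronym.
calculate.confidence.intervals <- list(
  state = "XX",
  confidence.quantiles = c(0.16, 0.84),  # quantiles of the simulated SGPs to report
  simulation.iterations = 100,           # number of simulated SGPs per student
  distribution = "Normal",               # or "Skew-Normal"
  round = 1                              # level to round to
)
```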
- print.other.gp
Boolean argument (defaults to FALSE) indicating whether growth percentiles of all orders should be returned. The default returns only the highest order growth percentile for each student.
- print.sgp.order
Boolean argument (defaults to FALSE) indicating whether the order of the growth percentile should be provided in addition to the SGP itself.
- calculate.sgps
Boolean argument (defaults to TRUE) indicating whether student growth percentiles should be calculated following coefficient matrix calculation.
- rq.method
Argument defining the estimation method used in the quantile regression calculations. The default is the "br" method, referring to the Barrodale and Roberts L1 estimation detailed in Koenker (2005) and in the help for the quantile regression (quantreg) package.
- rq.method.for.large.n
Argument defining the estimation method used in the quantile regression calculations when norm group cohort size exceeds 300,000 students. The default is the "fn" method, referring to the Frisch-Newton estimation detailed in Koenker (2005) and in the help for the quantile regression (quantreg) package.
- max.n.for.coefficient.matrices
Argument that defines a size threshold above which a subset of data is taken with a number of cases equal to the sgp.subset.size.threshold argument. Default is NULL: no subset is taken.
- knot.cut.percentiles
Argument that specifies the quantiles to be used for calculation of B-spline knots. Default is to place knots at the 0.2, 0.4, 0.6, and 0.8 quantiles.
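A minimal sketch of placing knots at the default quantiles and building a B-spline basis from them, using simulated scores. Using the observed score range as the boundary knots is an assumption made for this example; the package computes and stores its own boundaries.

```r
library(splines)

set.seed(314159)
scores <- rnorm(500, mean = 500, sd = 50)  # simulated prior scale scores

# Interior knots at the default quantiles.
knot.cut.percentiles <- c(0.2, 0.4, 0.6, 0.8)
knots <- quantile(scores, probs = knot.cut.percentiles)

# Assumed boundaries for this sketch: the observed score range.
boundaries <- range(scores)

# Cubic B-spline basis with the chosen knots and boundaries.
basis <- bs(scores, knots = knots, Boundary.knots = boundaries)
```

With four interior knots and the default cubic degree, the basis has seven columns.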
- knots.boundaries.by.panel
Boolean argument (defaults to FALSE) indicating whether knots and boundaries should be calculated by panel in supplied panel data instead of aggregating across panels. If panels are on different scales, then different knots and boundaries may be required to accommodate quantile regression analyses.
- exact.grade.progression.sequence
Boolean argument indicating whether the grade.progression supplied is used exactly (TRUE) as supplied or whether lower order analyses are run as part of the whole analysis (FALSE--the default).
- drop.nonsequential.grade.progression.variables
Boolean argument indicating whether to drop variables that do not occur with a non-sequential grade progression. For example, if the grade progression 7, 8, 10 is provided, the penultimate variable in panel.data is dropped. Default is TRUE.
- convert.0and100
Boolean argument (defaults to TRUE) indicating whether conversion of growth percentiles of 0 and 100 to growth percentiles of 1 and 99, respectively, occurs. The default produces growth percentiles ranging from 1 to 99.
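The effect of the 0 and 100 conversion can be illustrated in base R; this is a sketch of the outcome described above, not the package's own code, and the percentile values are made up.

```r
# Hypothetical raw growth percentiles, including the extremes.
raw.percentiles <- c(0, 1, 50, 99, 100)

# Map 0 to 1 and 100 to 99, leaving everything else unchanged.
converted <- pmin(pmax(raw.percentiles, 1), 99)
```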
- sgp.quantiles
Argument to specify quantiles for quantile regression estimation. Default is Percentiles. User can additionally submit a vector of quantiles (between 0 and 1). Goodness of fit output only available currently for PERCENTILES.
- sgp.quantiles.labels
Argument to specify integer labels associated with provided 'sgp.quantiles'. Integer labels must be a vector of length 1 longer than the length of 'sgp.quantiles'.
- sgp.loss.hoss.adjustment
Argument to control whether SGP is calculated using which.max for values associated with the hoss embedded in SGPstateData. Providing a two letter state acronym utilizes this adjustment, whereas supplying NULL (the default) uses no adjustment.
- sgp.cohort.size
Argument to control the minimum cohort size used to calculate SGPs and associated coefficient matrices. NULL (the default) uses no restriction. If not NULL, argument should be an integer value.
- sgp.less.than.sgp.cohort.size.return
If non-NULL, indicates whether a data set should be returned with the indicated character string in place of the SGP that would be calculated. If set to TRUE, the character string used is: < sgp.cohort.size students in cohort. No SGP Calculated.
- sgp.test.cohort.size
Integer indicating the maximum number of students sampled from the full cohort to use in the calculation of student growth percentiles. Intended to be used
as a test of the desired analyses to be run. The default, NULL, uses no restrictions (no tests are performed, and analyses use the entire cohort of students).
- percuts.digits
Argument specifying the number of digits (defaults to 2) with which to print percentile cuts, if requested.
- isotonize
Boolean argument (defaults to TRUE) indicating whether quantile regression results are isotonized to prevent quantile crossing following the methods derived by Chernozhukov, Fernandez-Val and Galichon (2010).
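Quantile crossing and its repair by rearrangement can be illustrated with a toy matrix of predicted quantiles. Sorting each student's predictions is the core idea of the rearrangement approach cited above; the numbers here are made up, and this is a sketch rather than the package's implementation.

```r
# Rows are students, columns are predicted scores at increasing quantiles.
# Each row contains one crossing (a higher quantile with a lower score).
predicted <- rbind(
  c(480, 495, 492, 510),
  c(430, 428, 445, 450)
)

# Sort within each student so predictions are non-decreasing in the quantile.
isotonized <- t(apply(predicted, 1, sort))
```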
- convert.using.loss.hoss
Boolean argument (defaults to TRUE) indicating whether requested percentile cuts are adjusted using the lowest obtainable scale
score (LOSS) and highest obtainable scale score (HOSS). Those percentile cuts above the HOSS are replaced with the HOSS and those percentile cuts below the
LOSS are replaced with the LOSS. The LOSS and HOSS are obtained from the loss and hoss calculated with the knots and boundaries used for spline calculations.
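The LOSS/HOSS adjustment amounts to clamping each percentile cut to the obtainable score range. A small sketch with made-up LOSS, HOSS, and cut values:

```r
loss <- 200  # hypothetical lowest obtainable scale score
hoss <- 800  # hypothetical highest obtainable scale score

# Hypothetical percentile cuts, two of which fall outside the score range.
percentile.cuts <- c(185.4, 412.7, 655.1, 812.9)

# Cuts below the LOSS become the LOSS; cuts above the HOSS become the HOSS.
adjusted.cuts <- pmin(pmax(percentile.cuts, loss), hoss)
```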
- goodness.of.fit
Boolean argument (defaults to TRUE) indicating whether to produce goodness of fit results associated with produced student growth percentiles.
Goodness of fit results are grid.grobs stored in panel.data$Goodness_of_Fit$my.subject.my.year, where my.subject and my.year are provided by sgp.labels.
- goodness.of.fit.minimum.n
Integer argument (defaults to 250) indicating the minimum number of observations necessary before goodness of fit plots are constructed.
- goodness.of.fit.output.format
Character argument (defaults to graphical object 'GROB') indicating output format for goodness of fit plots. Options include:
'GROB', 'PDF', 'PNG', 'SVG'.
- return.prior.scale.score
Boolean argument (defaults to TRUE) indicating whether to include the prior scale score in the SGP data output. Useful for examining relationship between prior
achievement and student growth.
- return.prior.scale.score.standardized
Boolean argument (defaults to TRUE) indicating whether to include the standardized prior scale score in the SGP data output.
Useful for examining relationship between prior achievement and student growth.
- return.norm.group.identifier
Boolean argument (defaults to TRUE) indicating whether to include the content areas and years that form students' specific norm group in the SGP data output.
- return.norm.group.scale.scores
Boolean argument (defaults to NULL) indicating whether to return a semi-colon separated character vector of the scores associated with the SGP_NORM_GROUP to
which the student belongs.
- return.norm.group.dates
Boolean argument or character string (defaults to NULL) indicating whether to return a semi-colon separated character vector of the dates associated
with time dependent SGPt calculations. If TRUE is supplied, 'DATE' is the assumed name for the date variable.
- return.norm.group.preference
A single numeric value (defaults to NULL). When multiple SGPs will be produced for some students, a system is required to identify the preferred SGP that will be matched with the student in the combineSGP function. This argument provides a ranking that specifies how preferable SGPs produced from the analysis in question are relative to other possible analyses. LOWER NUMBERS CORRESPOND WITH HIGHER PREFERENCE.
- return.panel.data
Boolean argument indicating whether to return the original data provided in panel.data$Panel_Data in the SGP list of results.
Defaults to 'identical(parent.frame(), .GlobalEnv)': if the environment from which the function is called is .GlobalEnv, this expression evaluates to TRUE, otherwise FALSE.
- print.time.taken
Boolean argument (defaults to TRUE) indicating whether to print a message with information on the studentGrowthPercentiles analysis and time taken.
- parallel.config
Parallel configuration argument allowing for parallel analysis by 'tau'. Defaults to NULL.
- calculate.simex
A character state acronym or list including state/csem variable, csem.data.vnames, csem.loss.hoss, simulation.iterations, simulation.sample.size, lambda and extrapolation method.
Returns both SIMEX adjusted SGP (SGP_SIMEX) as well as the percentile ranked SIMEX SGP (RANK_SIMEX) values as suggested by Castellano and McCaffrey (2017). Defaults to NULL, no simex calculations performed.
- sgp.percentiles.set.seed
An integer (or NULL) argument indicating whether to set.seed to make analyses fully reproducible. To turn off, set argument to NULL. Default is 314159.
- sgp.percentiles.equated
An object containing information (linkages, year, ...) on equating done for calculating student growth percentiles.
- SGPt
An argument supplied to implement time-dependent SGP analyses (SGPt). Default is NULL, giving standard, non-time-dependent analyses. If set to TRUE, the function assumes the variables 'TIME' and 'TIME_LAG' are supplied as part of the panel.data. To specify other names, supply a list of the form: list(TIME='my_time_name', TIME_LAG='my_time_lag_name'), substituting your variable names.
- SGPt.max.time
Boolean argument (defaults to NULL/FALSE) indicating whether cuts/trajectories should be calculated based upon the maximum Time value in the matrices. Such cuts
are sometimes used to provide within window trajectories.
- verbose.output
A Boolean argument indicating whether the function should output verbose diagnostic messages.