prepareSGP, analyzeSGP and combineSGP.
summarizeSGP(sgp_object, state, years, content_areas, sgp.summaries=NULL, summary.groups=NULL, confidence.interval.groups=NULL, produce.all.summary.tables=FALSE, summarizeSGP.baseline=NULL, projection.years.for.target=3, save.old.summaries=FALSE,
highest.level.summary.grouping="STATE", parallel.config=NULL)
@Data slot. If summaries of student growth percentiles are requested, those quantities must first be produced (possibly by first using analyzeSGP) and subsequently combined with the @Data data (possibly with combineSGP).
SGPstateData.
summary.groups argument. The default summaries (produced internally by summarizeSGP) include:
MEDIAN_SGP |
The group level median student growth percentile. |
MEDIAN_SGP_COUNT |
The number of students used to compute the median. |
PERCENT_AT_ABOVE_PROFICIENT |
The percentage of students at or above proficient. |
PERCENT_AT_ABOVE_PROFICIENT_COUNT |
The number of students used to compute the percentage at/above proficient. |
PERCENT_AT_ABOVE_PROFICIENT_PRIOR |
The percentage of students at or above proficient in the prior year. |
PERCENT_AT_ABOVE_PROFICIENT_PRIOR_COUNT |
The number of students used to compute the percentage at/above proficient in the prior year. |
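As a rough illustration of what MEDIAN_SGP and MEDIAN_SGP_COUNT represent, the sketch below computes them per school with base R. The data frame `toy`, its column values, and the grouping by SCHOOL_NUMBER are invented for illustration; this is not how summarizeSGP builds its tables internally.

```r
# Hypothetical toy data: one row per student (values invented for illustration).
toy <- data.frame(
  SCHOOL_NUMBER = c("A", "A", "A", "B", "B"),
  SGP           = c(20, 50, 80, 35, NA)
)

# Median SGP per school, ignoring students without an SGP.
median_sgp <- tapply(toy$SGP, toy$SCHOOL_NUMBER, median, na.rm = TRUE)

# Number of students with a non-missing SGP per school.
median_sgp_count <- tapply(!is.na(toy$SGP), toy$SCHOOL_NUMBER, sum)

summary_table <- data.frame(
  SCHOOL_NUMBER    = names(median_sgp),
  MEDIAN_SGP       = as.numeric(median_sgp),
  MEDIAN_SGP_COUNT = as.integer(median_sgp_count)
)
print(summary_table)
```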
percent_in_category() summary function requires a variable that MUST be a factor with proficiency categories as levels. The function uses SGPstateData with the provided state name to identify achievement levels and whether or not they are considered proficient.
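The factor requirement can be sketched as follows. The achievement level names and the choice of which levels count as proficient are assumptions made up for this example (in practice they would come from SGPstateData), and the calculation is a plain base R stand-in rather than a call to the package's own summary function.

```r
# Hypothetical achievement levels, ordered from lowest to highest.
levels_all        <- c("Unsatisfactory", "Partially Proficient", "Proficient", "Advanced")
# Assumed set of levels that count as proficient.
levels_proficient <- c("Proficient", "Advanced")

# The achievement variable must be a factor with the categories as levels.
achievement <- factor(
  c("Proficient", "Advanced", "Unsatisfactory", "Partially Proficient", NA),
  levels = levels_all
)
stopifnot(is.factor(achievement))

# Percentage at/above proficient among students with a non-missing level.
valid   <- !is.na(achievement)
count   <- sum(valid)
percent <- round(100 * sum(achievement[valid] %in% levels_proficient) / count, 1)

c(PERCENT_AT_ABOVE_PROFICIENT = percent, PERCENT_AT_ABOVE_PROFICIENT_COUNT = count)
```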
@Names slot of the provided SGP object. See prepareSGP for more information on supplied meta-data.
institution : |
State, District and/or School. |
content area : |
Variable indicating content area (default is CONTENT_AREA) if content area summaries are of interest. |
time : |
Variable indicating time (default is YEAR) if time summaries are of interest. NOTE: Cross year (i.e., multi-year) summaries default to 3 years. |
institution_type : |
Variable(s) indicating the type of institution (default EMH_LEVEL) if summaries by institution type are of interest. |
institution_level : |
Variable(s) indicating levels within the institution (default GRADE) if summaries by institution level are of interest. |
demographic : |
Demographics variables if summaries by demographic subgroup are of interest. |
institution_inclusion : |
Variables indicating inclusion for institutional calculations. |
growth_only_summary : |
Variables indicating whether to calculate summaries only for those students with growth in addition to other analyses. |
NULL can be provided if a grouping subset is not desired. All possible combinations of the group variables are produced.
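To give a feel for how quickly "all possible combinations" grows, the sketch below enumerates combinations of grouping variables with `expand.grid`. The list element names follow the slots described above, but the variable names, the use of NA to mean "do not group by this variable", and the `__` label separator are assumptions for illustration only; this page does not show how summarizeSGP enumerates or names its tables internally.

```r
# Hypothetical grouping specification (variable names are illustrative).
groups <- list(
  institution       = c("STATE", "SCHOOL_NUMBER"),
  content           = "CONTENT_AREA",
  time              = "YEAR",
  institution_level = c(NA, "GRADE")  # NA here means "no grouping by grade"
)

# One row per combination of grouping variables, i.e. one summary table each.
combos <- expand.grid(groups, stringsAsFactors = FALSE)

# Build an illustrative label for each table from the non-missing variables.
labels <- apply(combos, 1, function(x) paste(x[!is.na(x)], collapse = "__"))
print(unname(labels))
```

Adding one more variable with two options to any slot doubles the number of tables, which is why selective grouping matters for run time and memory.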
TYPE : |
Either Bootstrap (the default), for bootstrap confidence interval calculation, or CSEM, for confidence interval calculation based on the conditional standard error of measurement (experimental). |
VARIABLES : |
The variables on which to calculate confidence intervals (default is SGP). |
QUANTILES |
The desired confidence quantiles. |
GROUP |
The group summaries for which confidence intervals should be constructed. |
content |
The content area variable if confidence intervals by content area are desired. |
time |
The time variable (default is YEAR) if confidence intervals by time period are desired. |
institution_type |
The institution type variables (default is EMH_LEVEL) if confidence intervals by institution type are desired. |
institution_level |
The institution level variables (e.g., GRADE, default is NULL) if confidence intervals by institution level are desired. |
demographic |
The demographic variables if confidence intervals by demographic subgroups are desired. |
institution_inclusion |
The institution inclusion variables if confidence intervals by institution inclusion subgroups are desired. |
growth_only_summary |
The growth only summary variables if confidence intervals by growth only summary group are desired. |
For CSEM analysis, this argument requires that simulated SGPs have been produced (see analyzeSGP for more information). List slots set to NULL will not produce confidence intervals. NOTE: This is currently experimental functionality and is very memory intensive. Groups to be included should be identified selectively! The default 95% confidence intervals are provided in the selected summary tables as two additional columns named LOWER_MEDIAN_SGP_95_CONF_BOUND and UPPER_MEDIAN_SGP_95_CONF_BOUND.
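The idea behind a bootstrap confidence interval for a group's median SGP can be sketched in a few lines of base R. The SGP values, the number of resamples (1000), and the percentile method used here are assumptions chosen for illustration; the exact procedure summarizeSGP follows is not described on this page.

```r
set.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical SGPs for the students in one group (values invented).
sgp <- c(12, 25, 33, 41, 47, 52, 58, 64, 71, 88)

# Resample students with replacement and recompute the median each time.
boot_medians <- replicate(1000, median(sample(sgp, replace = TRUE)))

# The 2.5th and 97.5th percentiles give a 95% percentile bootstrap interval.
bounds <- quantile(boot_medians, probs = c(0.025, 0.975), names = FALSE)

c(LOWER_MEDIAN_SGP_95_CONF_BOUND = bounds[1],
  MEDIAN_SGP                     = median(sgp),
  UPPER_MEDIAN_SGP_95_CONF_BOUND = bounds[2])
```

Because every group needs its own set of resamples, the cost scales with the number of groups, which is consistent with the memory warning above.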
SGP_TARGET variables to summarize based upon years projected forward. The default is 3 years, which is what most states generally use.
@Summary slot (if not NULL) prior to calculating new summaries. Because the default is FALSE, the function overwrites previous summaries (e.g., last year's summaries).
FOREACH or PARALLEL. Please consult the manuals and vignettes for information on these packages! The analyzeSGP help page contains a more thorough explanation and examples of the parallel.config setup.
TYPE is a third element of the parallel.config list that provides necessary information when using the FOREACH or PARALLEL packages as the backend. With BACKEND="FOREACH", the TYPE element specifies the flavor of 'foreach' backend. As of version 1.0-1.0, only "doParallel" is supported. TYPE=NA (default) produces summaries sequentially.
If BACKEND = "PARALLEL", the parallel package will be used. This package combines the deprecated parallel packages snow and multicore. Using the "snow" implementation of parallel, the function will create a cluster object based on the TYPE element specified and the number of workers requested (see WORKERS list description below). The TYPE element indicates the user's preferred cluster type (either "PSOCK" for a socket cluster or "MPI" for an OpenMPI cluster). If Windows is the operating system, this "snow" implementation must be used and the TYPE element must be "PSOCK". If TYPE is missing, a default is assigned based on the operating system. Unix/Mac OS defaults to "multicore", which avoids worker node pre-scheduling and appears to be more efficient on these operating systems.
The WORKERS element is a list with SUMMARY specifying the number of processors (nodes) desired or available. For example, SUMMARY=2 may be used on a dual core machine to use both cores available. (NOTE: choice of the number of cores is a balance between the number of processors available and the amount of RAM a system has; each system will be different and may require some adjustment).
Default is FOREACH as the back end, TYPE=NA and WORKERS=1, which produces summary tables sequentially: 'list(BACKEND="FOREACH", TYPE=NA, WORKERS=list(SUMMARY=1))'
Example parallel use cases are provided below.
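The operating system rules above can be captured in a small helper. `make_parallel_config` is a hypothetical function written for this illustration and is not part of the SGP package; it only assembles a list using the BACKEND, TYPE and WORKERS elements described in this section.

```r
# Hypothetical helper (not part of SGP): build a parallel.config list that
# follows the operating system rules described above.
make_parallel_config <- function(workers = 2) {
  if (.Platform$OS.type == "windows") {
    # Windows must use the "snow" implementation with a PSOCK cluster.
    list(BACKEND = "PARALLEL", TYPE = "PSOCK",
         WORKERS = list(SUMMARY = workers))
  } else {
    # On Unix/Mac OS, TYPE is omitted so the OS-based default applies.
    list(BACKEND = "PARALLEL",
         WORKERS = list(SUMMARY = workers))
  }
}

cfg <- make_parallel_config(workers = 2)
str(cfg)
```

The resulting list could then be supplied as `parallel.config=cfg` in a call to summarizeSGP. The number of workers should be balanced against available RAM, as noted above.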
@Summary slot of the SGP data object. Each institution has a slot in the @Summary list.
foreach package to parallel process summary tables of student data. The proper choice of parallel backend depends on the user's operating system, software and system memory capacity. Please see the foreach documentation for details. By default, the function will process the summary tables sequentially.
prepareSGP, analyzeSGP, combineSGP
## Not run:
# ## summarizeSGP is Step 4 of 5 of abcSGP
# Demonstration_SGP <- sgpData_LONG
# Demonstration_SGP <- prepareSGP(Demonstration_SGP)
# Demonstration_SGP <- analyzeSGP(Demonstration_SGP)
# Demonstration_SGP <- combineSGP(Demonstration_SGP)
# Demonstration_SGP <- summarizeSGP(Demonstration_SGP)
#
# ### Example uses of the parallel.config argument
#
# ## Windows users must use the parallel package and R version >= 2.13:
# # Note the number of workers is 2, and PSOCK type cluster is used.
# # This example would be good for a single workstation with at least 2 cores.
# . . .
# parallel.config=list(
# BACKEND="PARALLEL", TYPE="PSOCK",
# WORKERS=list(SUMMARY=2))
# . . .
#
# # doParallel package - only available with R 2.13 or newer
# . . .
# parallel.config=list(
# BACKEND="FOREACH", TYPE="doParallel",
# WORKERS=list(SUMMARY=6))
# . . .
#
# ## parallel package - only available with R 2.13 or newer
# # Note the number of workers is 50, and MPI is used,
# # suggesting this example is for a HPC cluster usage.
# . . .
# parallel.config=list(
# BACKEND="PARALLEL", TYPE="MPI",
# WORKERS=list(SUMMARY=50))
# . . .
#
# # NOTE: This list of parallel.config specifications is NOT exhaustive.
# # See examples in analyzeSGP documentation for some others.
# ## End(Not run)