summarizeSGP: Summarize student scale scores, proficiency levels, and student growth percentiles according to user-specified summary group variables

Description

Utility function used to produce summary tables using long formatted data that contain student growth percentiles. An exemplar is provided from the successive execution of prepareSGP, analyzeSGP and combineSGP.
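To make the idea of a group-level summary table concrete, the following is a minimal base R sketch of the kind of aggregation being described. It does not call summarizeSGP or reproduce its internals; the data frame toy_long, its column names, and its values are hypothetical and exist only for illustration.

```r
# Hypothetical toy data in long format: one row per student, year, and content area.
toy_long <- data.frame(
  ID           = 1:8,
  YEAR         = rep("2023", 8),
  CONTENT_AREA = rep(c("MATHEMATICS", "READING"), each = 4),
  SCHOOL       = rep(c("A", "B"), times = 4),
  SGP          = c(35, 60, 45, 80, 20, 50, 70, 90)
)

# Median SGP for each school-by-content-area-by-year group (illustrative only).
toy_summary <- aggregate(SGP ~ SCHOOL + CONTENT_AREA + YEAR,
                         data = toy_long, FUN = median)

# Rename the aggregated column to mirror the MEDIAN_SGP summary name.
names(toy_summary)[names(toy_summary) == "SGP"] <- "MEDIAN_SGP"
print(toy_summary)
```

Each row of toy_summary corresponds to one combination of grouping variables, which is the general shape of a summary table.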

Usage

summarizeSGP(sgp_object,
           state,
           years,
           content_areas,
           sgp.summaries=NULL,
           summary.groups=NULL,
           confidence.interval.groups=NULL,
           produce.all.summary.tables=FALSE,
           summarizeSGP.baseline=NULL,
           parallel.config=NULL)

Arguments

sgp_object

A list containing long formatted data in the Student slot. If summaries of student growth percentiles are requested, those quantities must first be produced (possibly by first using analyzeSGP

state

Acronym indicating state associated with the summaries for access to assessment program information embedded in SGPstateData.

years

A vector indicating year(s) in which to produce summary tables associated with student growth percentile and percentile growth trajectory/projection analyses. If missing the function will use the data to calculate years and produce summaries for the most

content_areas

A vector indicating content area(s) in which to produce student growth percentiles and/or student growth projections/trajectories. If missing, the function will use the data to infer the content area(s) available for analyses.

sgp.summaries

A list giving the summaries requested for each group analyzed based upon the summary.groups argument. The default (produced internal to summarizeSGP) summaries include: MEDIAN_SGP: The group level median stu

summary.groups

A list consisting of 8 elements indicating the types of groups across which all summaries are taken (Inclusion means that summaries will be calculated for levels of the associated variable). For state data, if the list is not explicitly provided, the fun

confidence.interval.groups

A list consisting of information used to calculate group confidence intervals: TYPE: Either Bootstrap (default) or CSEM indicating Bootstrap confidence interval calculation (the default) or c

produce.all.summary.tables

A boolean variable, defaults to FALSE, indicating whether the function should produce ALL possible summary tables. By default, a set of approximately 70 tables is produced that are used in other parts of the package (e.g., bubblePlots).

summarizeSGP.baseline

A boolean variable, defaults to FALSE, indicating whether the function should use baseline SGPs for summary table production. By default, a set of approximately 100 tables is produced that are used in other parts of the package (e.g., bubblePlots).

parallel.config

A named list with, at a minimum, two elements indicating 1) the BACKEND package to be used for parallel computation and 2) the WORKERS list to specify the number of processors to be used in each major analysis. The BACKEND element can be set = to F

Value

Function returns lists containing the summary tables as data.table objects in the Summary slot of the SGP data object. Each institution has a slot in the Summary list.

Details

Function makes use of the foreach package to parallel process summary tables of student data. The proper choice of parallel backend is dependent upon the user's operating system, software and system memory capacity. Please see the foreach documentation for details. By default, the function will process the summary tables sequentially.
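As a rough illustration of the sequential default, the sketch below uses the foreach package's %do% operator on hypothetical data. It is not the code summarizeSGP runs internally; the list sgp_by_group and its names are assumptions made for this example, and the foreach package is assumed to be installed.

```r
library(foreach)

# Hypothetical group-level SGP vectors; names and values are illustrative only.
sgp_by_group <- list(
  SCHOOL_A = c(35, 45, 20, 70),
  SCHOOL_B = c(60, 80, 50, 90)
)

# %do% evaluates each iteration sequentially in the current R session.
# Replacing %do% with %dopar% after registering a parallel backend
# (for example, one provided by doParallel) would distribute the same
# iterations across workers.
group_medians <- foreach(g = names(sgp_by_group), .combine = c) %do% {
  setNames(median(sgp_by_group[[g]]), g)
}
print(group_medians)
```

Because the loop body is identical under %do% and %dopar%, the choice of backend can be deferred until the operating system and available memory are known.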

Examples


## summarizeSGP is Step 4 of 5 of abcSGP
Demonstration_SGP <- sgpData_LONG
Demonstration_SGP <- prepareSGP(Demonstration_SGP)
Demonstration_SGP <- analyzeSGP(Demonstration_SGP)
Demonstration_SGP <- combineSGP(Demonstration_SGP)
Demonstration_SGP <- summarizeSGP(Demonstration_SGP)

###  Example uses of the parallel.config argument

##  Windows users must use snow: 
Demonstration_SGP <- summarizeSGP(Demonstration_SGP,
	parallel.config=list(
		BACKEND="SNOW", TYPE="SOCK",
		WORKERS=list(SUMMARY=2)))

##  Windows users with R version >= 2.13 may prefer the parallel package:
#  Note a SOCK type cluster is used, here with 2 workers.
#  Increase SUMMARY to match the available cores (e.g., 8 on a single workstation with 8 cores).
	. . .
	parallel.config=list(
		BACKEND="PARALLEL", TYPE="SOCK",
		WORKERS=list(SUMMARY=2))
	. . .

## FOREACH uses:
# doMC - only available on Linux or Mac OSX
	. . .
	parallel.config=list(
		BACKEND="FOREACH", TYPE="doMC",
		WORKERS=2)
	. . .

#  New doParallel package - only available with R 2.13 or newer
#  Note the SOCK cluster is the only option made available at this time
#  for use with SNOW. 
	. . .
	parallel.config=list(
		BACKEND="FOREACH", TYPE="doParallel", 
		WORKERS=list(SUMMARY=6))
	. . .

##  New parallel package - only available with R 2.13 or newer
#  Note the number of workers is 50, and MPI is used, 
#  suggesting this example is for use on an HPC cluster.
	. . .
	parallel.config=list(
		BACKEND="PARALLEL", TYPE="MPI",
		WORKERS=list(SUMMARY=50))
	. . .

##  Linux/Mac may use the multicore package:
#    TYPE is not used with MULTICORE.
	. . .
	parallel.config=list(
		BACKEND="MULTICORE",
		WORKERS=4)
	. . .

#  NOTE:  This list of parallel.config specifications is NOT exhaustive.  
#  See examples in analyzeSGP documentation for some others.
