preprocRccSet,RccSet-method: Preprocess an RccSet

Description

This function is a wrapper to perform any combination of positive control normalization, background correction, and content normalization on the input RccSet. For each completed preprocessing step, a matrix is added to the assayData of the resulting RccSet object:

posCtrlData: expression data after positive control normalization
bgEstimates: background estimates
bgCorrData: expression data after positive control normalization and background correction
normData: expression data after positive control normalization, background correction, and content normalization

(NOTE: normData is on a log2 scale while all the other matrices are on a linear scale.)

If any step is omitted, the corresponding matrix will not be present in the output's assayData. The parameters for all steps are recorded in the output's experimentData@preprocessing list (accessible through preproc(rccSet) where rccSet is an RccSet output by this function). In addition:

If positive control normalization is performed, a column named 'PosCtrl' is added to the output's phenoData to record the positive control scaling factors.
If the presence/absence call is performed, a matrix named 'paData' is added to the output's assayData to indicate the presence/absence of each feature in each sample. See the 'paStringency' argument and presAbsCall() for details.
If housekeeping normalization is performed, a column labeled 'Housekeeping' is added to the featureData to indicate which features were used for it.
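
As a rough illustration of where these results end up, the following sketch runs the default preprocessing on the bundled example data and inspects the output. It assumes that RccSet supports the standard Biobase accessors (assayDataElementNames(), assayDataElement(), pData()); the names res and normLinear are only illustrative.

```r
library(NanoStringQCPro)

data(example_rccSet)
res <- preprocRccSet(example_rccSet)

# One matrix per completed step, alongside the original expression data
Biobase::assayDataElementNames(res)

# Parameters recorded for each preprocessing step
str(preproc(res))

# Positive control scaling factors, one value per sample
head(Biobase::pData(res)$PosCtrl)

# normData is on a log2 scale; back-transform it before comparing it
# with the linear-scale matrices such as bgCorrData
normLinear <- 2^Biobase::assayDataElement(res, "normData")
```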

Usage

## S4 method for signature 'RccSet'
preprocRccSet(rccSet, doPosCtrlNorm = TRUE, doBackground = TRUE,
  doPresAbs = TRUE, doContentNorm = TRUE, pcnSummaryFunction = "sum",
  bgReference = c("both", "blanks", "negatives"),
  bgSummaryFunction = "median", bgStringency = 1,
  nSolverBackground.w1 = 2.18, nSolverBackground.shrink = TRUE,
  paStringency = 2, normMethod = c("global", "housekeeping"),
  normSummaryFunction = "median", hkgenes = NULL, hkfeatures = NULL,
  quietly = FALSE)

Arguments

rccSet

An RccSet.

doPosCtrlNorm

Boolean specifying whether or not to perform positive control normalization. (When this step is performed, the 'posCtrlData' matrix is added to assayData.)

doBackground

Boolean specifying whether or not to perform background correction.

doPresAbs

Boolean specifying whether or not the presence/absence call should be performed. For details, see presAbsCall().

doContentNorm

Boolean specifying whether or not content normalization should be performed.

pcnSummaryFunction

Function to be used for the positive control normalization (e.g. "mean", "median", or "sum"). User-defined functions similar to these can be specified here as well.

bgReference

Measurements to use for background estimates: either "blanks" (for blank samples), "negatives" (for negative control probes), or "both". For details on exactly how the background estimates are computed in each case, see getBackground().

bgSummaryFunction

Summary function for background measurements (e.g. "mean" or "median"). User-defined functions similar to these can be specified here as well.

bgStringency

Factor by which deviation (SD or MAD) of the summarization output will be multiplied to obtain final background estimates.

nSolverBackground.w1

Value to use for the 'w1' argument to nSolverBackground(). (Only takes effect if bgReference == "both"; see getBackground().)

nSolverBackground.shrink

Value to use for the 'shrink' argument to nSolverBackground(). (Only takes effect if bgReference == "both"; see getBackground().)

paStringency

Multiplier to use in establishing the presence/absence call. For details, see presAbsCall().

normMethod

Specifies the features to be used for content normalization. "global" indicates that all features should be used and "housekeeping" indicates that only housekeeping features should be used. If "housekeeping" is specified and either the 'hkgenes' or the 'hkfeatures' argument (below) is also specified, then the features indicated by that argument will be used. If "housekeeping" is specified and both are left NULL, then the default housekeeping features (i.e. those with CodeClass == "Housekeeping") will be used.

normSummaryFunction

Character specifying the summary function to apply to the selected features (e.g. "mean" or "median") during the content normalization step. User-defined functions similar to these can be specified here as well.

hkgenes

Character vector with gene symbols to be used for content normalization if housekeeping is specified as the normalization method. If specified, all features that match any of the specified symbols will be used. (To select individual features, use the 'hkfeatures' argument instead; see below.)

hkfeatures

Character vector with full feature names ("<CodeClass>_<GeneName>_<Accession>", e.g. "Endogenous_ACTG1_NM_001614.1") to be used for content normalization if housekeeping is specified as the normalization method. (Note: if this argument is specified at the same time as 'hkgenes', an error will be thrown.)

quietly

Boolean specifying whether or not messages and warnings should be suppressed.
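
The arguments above can be combined to change individual steps. The call below is a sketch rather than a recommended configuration: it estimates background from the negative control probes only and normalizes on the default housekeeping features. It assumes that the bundled example data contains features with CodeClass == "Housekeeping" and that fData() from Biobase works on an RccSet; the name hk is only illustrative.

```r
library(NanoStringQCPro)

data(example_rccSet)

hk <- preprocRccSet(
  example_rccSet,
  bgReference       = "negatives",     # negative control probes only
  bgSummaryFunction = "mean",
  normMethod        = "housekeeping",  # hkgenes/hkfeatures left NULL: default housekeeping features
  quietly           = TRUE
)

# Features flagged as having been used for housekeeping normalization
table(Biobase::fData(hk)$Housekeeping)
```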

Value

A copy of the input RccSet with additional matrices in the assayData for each successive preprocessing step along with parameters for each step recorded in the experimentData@preprocessing list.

Details

For more information on the rationale behind the recommended preprocessing and normalization steps, please see the vignette.
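
To make the order of the three steps concrete, here is a toy sketch in base R on made-up counts. It is not the package's implementation: the scaling rule, the background estimate (a plain median of the negatives, with no stringency term), the floor at 1, and the median centering are all simplifying assumptions chosen for illustration. It only mirrors the sequence positive control normalization, background correction, then content normalization on a log2 scale.

```r
# Toy counts: rows are probes, columns are samples (made-up numbers)
counts <- matrix(
  c(1000, 2000,   # positive control
     500, 1000,   # positive control
      10,   20,   # negative control
     200,  400,   # endogenous gene A
     800, 1600),  # endogenous gene B
  ncol = 2, byrow = TRUE,
  dimnames = list(c("POS_A", "POS_B", "NEG_A", "GeneA", "GeneB"),
                  c("S1", "S2"))
)
isPos <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
isNeg <- c(FALSE, FALSE, TRUE, FALSE, FALSE)

# 1. Positive control normalization: scale each sample so that its summed
#    positive controls match the average sum across samples
posSum      <- colSums(counts[isPos, , drop = FALSE])
posFactor   <- mean(posSum) / posSum
posCtrlData <- sweep(counts, 2, posFactor, `*`)

# 2. Background correction: subtract a per-sample estimate taken from the
#    negative controls, flooring at 1 so that log2 stays defined
bgEstimates <- apply(posCtrlData[isNeg, , drop = FALSE], 2, median)
bgCorrData  <- pmax(sweep(posCtrlData, 2, bgEstimates, `-`), 1)

# 3. Content normalization on a log2 scale: align the per-sample median
#    of the endogenous features
isEndo   <- !(isPos | isNeg)
logData  <- log2(bgCorrData)
center   <- apply(logData[isEndo, , drop = FALSE], 2, median)
normData <- sweep(logData, 2, center - mean(center), `-`)

normData
```

In this toy input, sample S2 is simply S1 with every count doubled, so the two columns of normData come out identical once the steps have been applied.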

References

NanoString nCounter(R) Expression Data Analysis Guide (2012)

Examples


data(example_rccSet)
hknorm_example_rccSet <- preprocRccSet(example_rccSet)
