assayData and batchStatistics will be ff objects. See details.
constructInf(sampleSheet = NULL, arrayNames = NULL, path = ".", arrayInfoColNames = list(barcode="SentrixBarcode_A",position="SentrixPosition_A"), highDensity = FALSE, sep = "_", fileExt = list(green = "Grn.idat", red = "Red.idat"), XY, cdfName, verbose = FALSE, batch=NULL, saveDate = TRUE)
sampleSheet: data.frame containing Illumina sample sheet information (for required columns, refer to the BeadStudio Genotyping guide, Appendix A).
arrayNames: If NULL, all arrays that can be found in the specified working directory will be read in.
arrayInfoColNames: (when sampleSheet is specified) list containing the elements 'barcode', which names the column in sampleSheet that contains the arrayNumber/barcode number, and 'position', which names the column that contains the strip number. In older style sample sheets, this information is combined (usually in a column named 'SentrixPosition'), and this should be specified as list(barcode=NULL, position="SentrixPosition").
highDensity: (when sampleSheet is specified) If TRUE, array extensions '_A', '_B' in sampleSheet are replaced with 'R01C01', 'R01C02' etc.
XY: NChannelSet containing X and Y intensities.
cdfName: see also validCdfNames.
Returns a CNSet object.
This function initializes a container for storing the normalized intensities for the A and B alleles at polymorphic loci and the normalized intensities for the 'A' allele at nonpolymorphic loci. CRLMM genotype calls and confidence scores are also stored in the assayData. This function does not do any preprocessing or genotyping; it only creates an object of the appropriate size. The initialized values will all be 'NA'.
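To make the roles of arrayInfoColNames, sep and fileExt more concrete, the following base-R sketch assembles IDAT file names from a toy sample sheet. It only illustrates the naming convention implied by the argument descriptions above and is not crlmm's internal code. The sample sheet values and the helper idat_names are hypothetical.

```r
# Toy sample sheet; the barcode and position values are made up.
sample_sheet <- data.frame(
  Sample_ID         = c("s1", "s2"),
  SentrixBarcode_A  = c("4019585367", "4019585367"),
  SentrixPosition_A = c("R01C01", "R01C02"),
  stringsAsFactors  = FALSE
)

# Hypothetical helper: build "<barcode><sep><position><sep><extension>".
# The defaults mirror the defaults shown in the usage line.
idat_names <- function(sheet,
                       cols = list(barcode = "SentrixBarcode_A",
                                   position = "SentrixPosition_A"),
                       sep = "_",
                       fileExt = list(green = "Grn.idat", red = "Red.idat")) {
  arrays <- sheet[[cols$position]]
  # In older style sheets barcode is NULL and the position column
  # already holds the combined array identifier.
  if (!is.null(cols$barcode)) {
    arrays <- paste(sheet[[cols$barcode]], arrays, sep = sep)
  }
  list(green = paste(arrays, fileExt$green, sep = sep),
       red   = paste(arrays, fileExt$red, sep = sep))
}

files <- idat_names(sample_sheet)
files$green
```

For an older style sheet, the same helper would be called with cols = list(barcode = NULL, position = "SentrixPosition").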
The ff package provides infrastructure for accessing and writing data to disk instead of keeping data in memory. Each element of the assayData and batchStatistics slots is an ff object. ff objects in the R workspace contain pointers to several files with the '.ff' extension on disk. The location where the data are stored on disk can be specified with the ldPath function. Users should not move or rename this directory. If only output files are stored in ldPath, one can either remove the entire directory or remove all of the '.ff' files before rerunning the analysis. Otherwise, one would accumulate a large number of '.ff' files on disk that are no longer in use.
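The housekeeping described above might look like the following sketch. It assumes a dedicated output directory (the name crlmm_ff_output is hypothetical) that holds nothing but ff output. The ldPath call is guarded so that the sketch also runs where the oligoClasses package is not installed.

```r
# Hypothetical, dedicated directory for ff output.
outdir <- file.path(tempdir(), "crlmm_ff_output")
dir.create(outdir, showWarnings = FALSE, recursive = TRUE)

# Register the directory in which '.ff' files should be written.
if (requireNamespace("oligoClasses", quietly = TRUE)) {
  oligoClasses::ldPath(outdir)
}

# Before rerunning an analysis, remove '.ff' files left over from a
# previous run so that unused files do not accumulate on disk.
stale <- list.files(outdir, pattern = "\\.ff$", full.names = TRUE)
removed <- file.remove(stale)
```

Removing only the '.ff' files, rather than the whole directory, is the safer choice if the directory might contain anything else.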
We have adopted the ff package in order to reduce crlmm's memory footprint. The memory usage can be fine-tuned by the utilities ocSamples and ocProbesets provided in the oligoClasses package. In most instances, the user-level interface will be no different than accessing data from ordinary matrices in R. However, the differences in the underlying representation can become more noticeable for very large datasets in which the I/O for accessing data from the disk can be substantial.
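As a rough sketch of the tuning mentioned above: ocProbesets and ocSamples are understood here to set how many markers and samples are handled at a time, so smaller values lower peak memory at the cost of more disk I/O. The numeric values below are illustrative assumptions, not recommendations, and the base-R chunking at the end only mimics the idea of processing samples in blocks.

```r
# Set chunk sizes if oligoClasses is available (values are illustrative).
if (requireNamespace("oligoClasses", quietly = TRUE)) {
  oligoClasses::ocProbesets(50000)  # markers handled per chunk
  oligoClasses::ocSamples(100)      # samples handled per chunk
}

# Base-R illustration of chunked processing: 250 hypothetical samples
# split into blocks of at most 100.
chunk_size <- 100
sample_index <- seq_len(250)
chunks <- split(sample_index, ceiling(sample_index / chunk_size))
lengths(chunks)
```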
ldPath, ocSamples, ocProbesets, CNSet-class, preprocessInf, genotypeInf
## See the Illumina vignettes in inst/scripts of the
## source package for an example