SNPhood (version 1.2.3)

collectFiles: Helper function to generate a data frame that can be used as input for the function analyzeSNPhood

Description

collectFiles creates a data frame that can be used as input for the function analyzeSNPhood. The resulting data frame contains information about the files that will be processed (column signal) and, optionally, the corresponding input files for normalization (column input) and labels for combining datasets into meta-datasets (column individual).

Usage

collectFiles(patternFiles, recursive = FALSE, ignoreCase = TRUE, inputFiles = NA, individualID = NA, genotypeMapping = NA, verbose = TRUE)

Arguments

patternFiles
Character. If a vector of length 1, the absolute path to one or multiple BAM files that should be processed. Wildcards ("*") are allowed (for example, *CTCF* or *.bam). If a vector of length > 1, each element must specify the absolute path to a single BAM file, and wildcards are not allowed. See also the note in the Details section concerning the integration of BamFile or BamFileList objects. For more details, see the examples and the vignette.
recursive
Logical(1). Default FALSE. Should the search for BAM files within the directory be performed recursively? If set to TRUE, all files matching the pattern within the specified directory and all of its subdirectories will be added. If set to FALSE, only files within the specified directory but not any subdirectories will be used.
ignoreCase
Logical(1). Default TRUE. Should case be ignored when matching the specified pattern? If set to TRUE, the pattern matching is case-insensitive.
inputFiles
Character. Default NA. Input files that should be used as a control for normalization. Supported values are NA (no input normalization), a single character string specifying one or multiple input files (comma-separated, see examples) that should be used for all processed files, or a character vector of the same length as the number of files that will be processed. Set to NULL if you want to add the files manually to the data frame later (see vignette).
individualID
Character. Default NA. IDs of the individuals. Only relevant if datasets should be pooled.
genotypeMapping
Character. Default NA. Path to the corresponding genotype file in VCF format, followed by a colon and the name of the column in the VCF file. Genotypes can also be integrated later using the function associateGenotypes.
verbose
Logical(1). Default TRUE. Should the verbose mode (i.e., diagnostic messages during execution of the script) be enabled?
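
The interplay of a wildcard pattern with recursive and ignoreCase can be illustrated with base R alone. The following is a minimal sketch, not SNPhood's internal implementation: it assumes that the behavior described above is comparable to glob2rx combined with list.files, and the directory and file names (demoDir, CTCF_rep1.bam, and so on) are hypothetical placeholders created in a temporary location.

```r
# Illustration only: base R file matching, NOT the code used inside collectFiles.
# Create a fresh temporary directory with empty placeholder files (hypothetical names).
demoDir <- tempfile("collectFiles_demo")
dir.create(file.path(demoDir, "sub"), recursive = TRUE)
invisible(file.create(file.path(demoDir, c("CTCF_rep1.bam", "ctcf_rep2.BAM", "notes.txt"))))
invisible(file.create(file.path(demoDir, "sub", "CTCF_rep3.bam")))

# glob2rx() translates a wildcard pattern into a regular expression
pattern <- glob2rx("*.bam")

# Non-recursive, case-insensitive: only the top-level directory is searched
top <- list.files(demoDir, pattern = pattern, ignore.case = TRUE,
                  recursive = FALSE, full.names = TRUE)

# Recursive, case-insensitive: subdirectories are searched as well
nested <- list.files(demoDir, pattern = pattern, ignore.case = TRUE,
                     recursive = TRUE, full.names = TRUE)

# Recursive, case-sensitive: "ctcf_rep2.BAM" no longer matches "*.bam"
strict <- list.files(demoDir, pattern = pattern, ignore.case = FALSE,
                     recursive = TRUE, full.names = TRUE)

print(basename(top))
print(basename(nested))
print(basename(strict))
```

In this sketch, turning on the recursive search adds the file in the subdirectory, and turning off case-insensitive matching drops the file with the upper-case extension.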

Value

A data frame with the three columns signal, input, and individual that can be used as input for the function analyzeSNPhood.
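
To make the layout concrete, a data frame with the same three column names can be built by hand, for example when the files are to be added or adjusted manually. This is only a sketch: the file paths and individual IDs are hypothetical, and storing several input files as one comma-separated string per row is an assumption based on the description of inputFiles; the vignette should be consulted for the exact format that analyzeSNPhood expects.

```r
# Hypothetical file names and IDs, for illustration only
filesDemo.df <- data.frame(
  signal     = c("/data/ind1_CTCF.bam", "/data/ind2_CTCF.bam"),
  # Assumption: multiple input files per signal file are comma-separated
  input      = c("/data/input1.bam,/data/input2.bam",
                 "/data/input1.bam,/data/input2.bam"),
  individual = c("ind1", "ind2"),
  stringsAsFactors = FALSE
)

# Split the comma-separated input files back into one character vector per row
inputList <- strsplit(filesDemo.df$input, ",", fixed = TRUE)

str(filesDemo.df)
```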

Details

Note that if you already have an object of class BamFile or BamFileList, it can easily be integrated into the SNPhood framework by using the path function to specify the value of the parameter patternFiles (see the examples).

See Also

analyzeSNPhood

Examples

## Note that in practice, absolute paths should be provided.
## Here, system.file() returns the absolute path to the example data.

## Load the SNPhood and SNPhoodData libraries
library(SNPhood)
library(SNPhoodData)
## Collect all BAM files shipped with SNPhoodData using a wildcard pattern
files.df = collectFiles(patternFiles = paste0(system.file("extdata", package = "SNPhoodData"),"/*.bam"))

## If you already have BAM files in objects of class BamFile or BamFileList
## (classes from the Rsamtools package), you may use the following code snippet:
library(Rsamtools)
## The pattern argument of list.files is a regular expression, not a wildcard
files = list.files(system.file("extdata", package = "SNPhoodData"), pattern = "\\.bam$", full.names = TRUE)
BamFile.o = BamFile(files[1])
BamFiles.o = BamFileList(files)
files.df = collectFiles(patternFiles = path(BamFile.o))
files.df = collectFiles(patternFiles = path(BamFiles.o))
