files2SpectraObject: Merge Files in a Directory into a Spectra Object

Description

This function will read all files of a given type in a directory, and use the file names to construct group membership and assign colors and symbols. All the data is placed into an object of S3 class "Spectra". This is the only way to create a "Spectra" object automatically.

Usage

files2SpectraObject(gr.crit = NULL, gr.cols = c("auto"),
  freq.unit = "no frequency unit provided",
  int.unit = "no intensity unit provided",
  descrip = "no description provided",
  format = "csv", alignTMS = FALSE,
  out.file = "mydata", debug = FALSE, ...)

Arguments

gr.crit

Group Criteria. A vector of character strings which will be searched for among the file names in order to assign an individual spectrum/sample to group membership. Warnings are issued if there are file names that don't match entries in gr.crit

gr.cols

Group Colors. Either the word "auto", in which case colors will be automatically assigned, or a vector of acceptable color names with the same length as gr.crit. In the latter case, colors will be assigned one for one, so the first element o

freq.unit

A character string giving the units of the x-axis (frequency or wavelength).

int.unit

A character string giving the units of the y-axis (some sort of intensity).

descrip

A character string describing the data set that will be stored. This string is used in some plots so it is recommended that its length be less than about 40 characters.

format

A character string giving the format of the files to be processed. Default is csv for US-style csv files. Alternatively, you can specify csv2 for EU-style csv files, dx for JCAMP-DX files, or Btxt for

alignTMS

Logical indicating if we should try to align the TMS (or TSP) peak in proton NMR spectra (applies to format = "Btxt" only). See Details.

out.file

A file name acceptable to the save function. The completed object of S3 class "Spectra" will be written to this file.

debug

Logical; set to TRUE for troubleshooting when using format = "Btxt" or "dx".

...

Other arguments to be passed downstream. At times you might want to pass alternate values of span, sn, and thres to findTMS and related functions.

Value

An object of S3 class Spectra. An unnamed copy of the same object is also written to out.file. To read it back into the workspace, use new.name <- loadObject(out.file); loadObject is found in package R.utils.

Warning

Files whose names are not matched using gr.crit are still incorporated into the "Spectra" object, but they are not assigned a group or color and therefore don't plot, though they do take up space in a plot!

Details

The linking of groups with colors is handled by groupNcolor.

The matching of gr.crit against the sample file names is done one criterion at a time, in order. This means that the entries in gr.crit must be mutually exclusive. For example, if you have files with names like "Control_1" and "Sample_1" and use gr.crit = c("Control", "Sample"), groups will be assigned as you would expect. But if you have file names like "Control_1_Shade" and "Sample_1_Sun", you can't use gr.crit = c("Control", "Sample", "Sun", "Shade"): each criterion is grepped in order, so the "Sun" and "Shade" phrases, being last, will form the basis for your groups. Because this is a grep process, you can get around this by using regular expressions in your gr.crit argument to specify the desired groups in a mutually exclusive manner. In this second example, you could use gr.crit = c("Control(.*)Sun", "Control(.*)Shade", "Sample(.*)Sun", "Sample(.*)Shade") to have your groups assigned based upon both phrases in the file names.
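The order-dependent matching described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code (the real work is done by groupNcolor). The helper assign_groups and the file names are hypothetical.

```r
# Hypothetical file names (extensions omitted for brevity)
files <- c("Control_1_Shade", "Control_2_Sun", "Sample_1_Sun", "Sample_2_Shade")

# Hypothetical helper: apply each criterion in order, so a later match
# overwrites an earlier one, mimicking the behavior described above.
assign_groups <- function(files, gr.crit) {
  groups <- rep(NA_character_, length(files))
  for (crit in gr.crit) {
    groups[grepl(crit, files)] <- crit
  }
  groups
}

# Overlapping criteria: "Sun" and "Shade" come last, so they win
overlapping <- assign_groups(files, c("Control", "Sample", "Sun", "Shade"))

# Mutually exclusive regular expressions: each file matches exactly one
exclusive <- assign_groups(
  files,
  c("Control(.*)Sun", "Control(.*)Shade", "Sample(.*)Sun", "Sample(.*)Shade")
)
```

With the overlapping criteria, every file ends up in a "Sun" or "Shade" group. With the regular expressions, each file falls into one of four groups that reflect both phrases.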

files2SpectraObject acts on the files in the current working directory. If format = "csv" these should be .csv files with the first column containing the frequency values and the second column containing the intensity values. The columns should be unlabeled. The frequency column is assumed to be the same in all .csv files. If format = "dx" or format = "Btxt", then the corresponding file type will be processed (consider setting debug = TRUE for these formats). See readJDX and readBrukerTxt for limitations (there are many options with these formats, especially JCAMP, and most are untested).
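As a rough sketch of the csv layout described above, the following base R code writes a few files with two unlabeled columns (frequency, then intensity) that share one frequency axis. The directory name, file names, and all numeric values are made up for illustration.

```r
# Work in a scratch directory so no unrelated .csv files are present
data_dir <- file.path(tempdir(), "spectra_demo")
dir.create(data_dir, showWarnings = FALSE)

freq <- seq(400, 4000, length.out = 50)  # shared frequency axis (made-up values)
set.seed(1)
for (nm in c("Control_1", "Control_2", "Sample_1", "Sample_2")) {
  intensity <- runif(length(freq))       # made-up intensities
  # Two unlabeled columns: frequency first, intensity second
  write.table(data.frame(freq, intensity),
              file = file.path(data_dir, paste0(nm, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Read one file back to confirm the layout
check <- read.csv(file.path(data_dir, "Control_1.csv"), header = FALSE)
```

With the working directory set to data_dir, a call along the lines of files2SpectraObject(gr.crit = c("Control", "Sample"), format = "csv"), following the Usage section, would then be expected to pick up these four files.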

If format = "Btxt" and alignTMS = TRUE, the function will try to find the TMS peak and align the spectra on it. Spectra covering different chemical shift ranges are also allowed for this format. In this case, the spectra will first be aligned on TMS and then the set of spectra will be trimmed so that there are no NAs in Spectra$data. Warnings are given as this is done. This is experimental, so please check your results carefully! Please feel free to submit data sets that give trouble so I can see whether the processing can be improved.
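The trimming step can be pictured with a small base R sketch. This is a conceptual illustration only, not the package's actual alignment code. It assumes the spectra have already been placed on a common chemical shift grid, that samples are stored in rows, and that NA marks points a spectrum does not cover; all values are made up.

```r
# Common chemical shift grid (ppm) and two made-up spectra; each has NAs
# where its range does not cover the grid.
ppm <- seq(-0.5, 10, by = 0.5)
s1 <- ifelse(ppm >= 0 & ppm <= 10, 1, NA)    # covers 0 to 10 ppm
s2 <- ifelse(ppm >= -0.5 & ppm <= 9, 2, NA)  # covers -0.5 to 9 ppm
dat <- rbind(s1, s2)                         # assumption: samples in rows

# Keep only the grid points where every spectrum has a value
keep <- colSums(is.na(dat)) == 0
trimmed <- dat[, keep, drop = FALSE]
ppm_trimmed <- ppm[keep]
```

After trimming, only the region common to both spectra (0 to 9 ppm here) remains, which is why the resulting range can be narrower than that of any single input file.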

There should be no other files of the given format (extension) in the directory except those containing the data to be processed by files2SpectraObject, as all files with that format in the directory will be processed.
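One way to check a directory before processing is to list the files that carry the relevant extension. The sketch below uses base R's list.files; the directory and file names are made up, and the exact pattern the function uses internally to select files is an assumption.

```r
# Scratch directory with two data files and two stray files (all made up)
chk_dir <- file.path(tempdir(), "format_check_demo")
dir.create(chk_dir, showWarnings = FALSE)
file.create(file.path(chk_dir,
  c("Control_1.csv", "Sample_1.csv", "notes.csv", "readme.txt")))

# Files that a csv run would presumably pick up
candidates <- list.files(chk_dir, pattern = "\\.csv$")
```

Here notes.csv would be swept up along with the real data, so it should be moved or renamed first; readme.txt has a different extension and is not among the candidates.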

References

https://github.com/bryanhanson/ChemoSpec