This function imports data into a Spectra2D object. It primarily uses read.table to read files, so it is very flexible with regard to file formatting. Be sure to see the ... argument below for important details you need to provide.
files2Spectra2DObject(
gr.crit = NULL,
gr.cols = "auto",
fmt = NULL,
nF2 = NULL,
x.unit = "no frequency unit provided",
y.unit = "no frequency unit provided",
z.unit = "no intensity unit provided",
descrip = "no description provided",
fileExt = "\\.(csv|CSV)$",
out.file = "mydata",
debug = 0,
chk = TRUE,
allowSloppy = FALSE,
...
)
gr.crit: Group Criteria. A vector of character strings which will be searched for among the file/sample names in order to assign an individual spectrum to group membership. This is done using grep, so characters like "." (period/dot) do not have their literal meaning (see below). Warnings are issued if there are file/sample names that don't match entries in gr.crit or there are entries in gr.crit that don't match any file names.
gr.cols: Group Colors. See colorSymbol for some options. One of the following:
- Legacy behavior and the default: The word "auto", in which case up to 8 colors will be automatically assigned from package RColorBrewer Set1.
- "Col7". A unique set of up to 7 colorblind-friendly colors is used.
- "Col8". A unique set of up to 8 colors is used.
- "Col12". A mostly paired set of up to 12 colors is used.
- A vector of acceptable color designations with the same length as gr.crit.
Colors will be assigned one for one, so the first element of gr.crit is assigned the first element of gr.cols and so forth. For Col12 you should pay careful attention to the order of gr.crit in order to match up colors. See colorSymbol for further details.
fmt: A character string giving the format of the data. Consult import2Dspectra for options. If fileExt is one of dx, DX, jdx or JDX, fmt will automatically be set to "dx" and package readJDX will be used for the import. In this case check the values of F2 and F1 carefully. The values are taken from the file; for some files/vendors the values might be in Hz rather than ppm.
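If an imported axis does turn out to be in Hz, the usual conversion to ppm divides by the spectrometer frequency in MHz for that dimension's nucleus. The following is a minimal sketch with made-up numbers; `spectrometer_MHz` and `F2_Hz` are hypothetical names and values, not something the package provides.

```r
# Hypothetical values: a 500 MHz spectrometer and an F2 axis reported in Hz.
spectrometer_MHz <- 500
F2_Hz <- c(0, 500, 1000, 2500)

# 1 ppm corresponds to spectrometer_MHz Hz, so divide to convert.
F2_ppm <- F2_Hz / spectrometer_MHz
print(F2_ppm)
```

For heteronuclear experiments the F1 dimension would need the frequency of its own nucleus, so check the acquisition parameters before converting.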
nF2: Integer giving the number of data points in the F2 (x) dimension. Note: If any dimension is zero-filled you may need to study the acquisition details to get the correct value for this argument. This may be vendor-dependent.
x.unit: A character string giving the units for the F2 dimension (frequency or wavelength corresponding to the x dimension).
y.unit: A character string giving the units for the F1 dimension (frequency or wavelength corresponding to the y dimension).
z.unit: A character string giving the units of the z-axis (some sort of intensity).
descrip: A character string describing the data set.
fileExt: A character string giving the extension of the files to be processed. Regular expression (regex) strings can be used. For instance, the default finds files with either ".csv" or ".CSV" as the extension. Matching is done via a grep process, which is greedy. If fileExt is one of dx, DX, jdx or JDX, fmt will automatically be set to "dx" and package readJDX will be used for the import.
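To see what the default pattern accepts, it can be tried against a few names with base R's grep. This is only an illustration; the file names are invented.

```r
# Invented file names, used only to exercise the pattern.
file_names <- c("sample1.csv", "sample2.CSV", "notes.csv.bak", "csv_readme.txt")

# The default value of fileExt from the usage above.
fileExt <- "\\.(csv|CSV)$"

# Only names that END in .csv or .CSV are kept, because of the "$" anchor.
matched <- grep(fileExt, file_names, value = TRUE)
print(matched)
```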
out.file: A file name. The completed object of S3 class Spectra2D will be written to this file.
debug: Integer. Set to 1 or TRUE for basic reporting when there are problems. If importing JCAMP-DX files, values greater than 1 give additional and potentially huge output. Once you know which file is the problem, you may wish to troubleshoot directly using package readJDX.
chk: Logical. Should the Spectra2D object be checked for integrity? If you are having trouble importing your data, set this to FALSE and run str() on your object to investigate.
allowSloppy: Logical. Experimental feature. If TRUE, disable checking of the data set, and return all pieces of the raw import from import2Dspectra in the spectra$data object. The resulting object currently cannot be used by any other functions in this package! The intent is to allow importing of spectra that differ slightly in the number of points in each dimension. With this option one can use str on the resulting object to inspect the differences. Future functions will allow one to clean up the data.
...: Arguments to be passed to read.table, list.files or readJDX; see the "Advanced Tricks" section. For read.table, you MUST supply values for sep, dec and header consistent with your file structure, unless they are the same as the defaults for read.table.
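Before importing a whole directory, it can help to confirm that a given combination of sep, dec and header reads one file correctly with read.table on its own. The sketch below writes a small invented file (semicolon-separated, decimal comma, one header row) and reads it back; the column names and layout are assumptions made for illustration only.

```r
# Write a small, hypothetical semicolon-separated file that uses a decimal comma.
tmp <- tempfile(fileext = ".csv")
writeLines(c("F2;F1;intensity", "1,5;10,0;0,25", "2,5;20,0;0,75"), tmp)

# These are the values one would pass via ... for a file laid out like this.
dat <- read.table(tmp, sep = ";", dec = ",", header = TRUE)
str(dat)

# Remove the temporary file.
unlink(tmp)
```

If the columns come back as character rather than numeric, sep or dec is probably wrong for the file.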
One of these objects:
- If allowSloppy = FALSE, the default, an object of class Spectra2D.
- If allowSloppy = TRUE, an object of undocumented class SloppySpectra2D. These objects are experimental and are not checked by chkSpectra. For these objects spectra$F1 and spectra$F2 are NA, and each spectra$data entry is a list with elements F1, F2 and M, which is the matrix of imported data (basically, the object returned by import2Dspectra).
In each case, an unnamed object of S3 class Spectra2D or SloppySpectra2D is also written to out.file. To read it back into the workspace, use new.name <- loadObject(out.file) (loadObject is from package R.utils).
The matching of gr.crit against the sample file names is done one at a time, in order, using grep. While powerful, this has the potential to lead to some "gotchas" in certain cases, noted below.
Your file system may allow file/sample names which R will not like, and will cause confusing behavior. File/sample names become variables in ChemoSpec, and R does not like things like "-" (minus sign or hyphen) in file/sample names. A hyphen is converted to a period (".") if found, which is fine for a variable name. However, a period in gr.crit is interpreted from the grep point of view, namely a period matches any single character. At this point, things may behave very differently than one might hope. See make.names for allowed characters in R variables and make sure your file/sample names comply.
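The hyphen-to-period conversion and the wildcard meaning of "." can both be seen with base R. This is a small illustration using invented sample names; it does not call any function from this package.

```r
# Invented sample names; only the first contains a hyphen.
sample_names <- c("Control-1", "Control_1", "ControlX1")

# make.names replaces the hyphen with a period.
clean_names <- make.names(sample_names)
print(clean_names)

# Unescaped, "." matches any single character, so all three names match.
loose <- grepl("Control.1", clean_names)

# Escaping the period restricts the match to a literal ".".
strict <- grepl("Control\\.1", clean_names)
print(rbind(loose, strict))
```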
The entries in gr.crit must be mutually exclusive. For example, if you have files with names like "Control_1" and "Sample_1" and use gr.crit = c("Control", "Sample") groups will be assigned as you would expect. But, if you have file names like "Control_1_Shade" and "Sample_1_Sun" you can't use gr.crit = c("Control", "Sample", "Sun", "Shade") because each criterion is grepped in order, and the "Sun/Shade" phrases, being last, will form the basis for your groups. Because this is a grep process, you can get around this by using regular expressions in your gr.crit argument to specify the desired groups in a mutually exclusive manner. In this second example, you could use gr.crit = c("Control(.*)Sun", "Control(.*)Shade", "Sample(.*)Sun", "Sample(.*)Shade") to have your groups assigned based upon both phrases in the file names.
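The effect of overlapping criteria can be reproduced with a few lines of base R. The helper `assign_groups` below is hypothetical and is not the package's code; it simply assumes, as described above, that each criterion is grepped in order and that later matches replace earlier ones.

```r
# Invented sample names that carry two pieces of information each.
sample_names <- c("Control_1_Shade", "Control_2_Sun", "Sample_1_Sun", "Sample_2_Shade")

# Hypothetical helper, not the package's code: each pattern is applied in
# order, and later matches overwrite earlier ones.
assign_groups <- function(gr.crit, sample_names) {
  groups <- rep(NA_character_, length(sample_names))
  for (crit in gr.crit) {
    groups[grep(crit, sample_names)] <- crit
  }
  groups
}

# Overlapping criteria: the last matching phrase wins.
overlapping <- assign_groups(c("Control", "Sample", "Sun", "Shade"), sample_names)
print(overlapping)

# Mutually exclusive regular expressions: one group per combination.
exclusive <- assign_groups(
  c("Control(.*)Sun", "Control(.*)Shade", "Sample(.*)Sun", "Sample(.*)Shade"),
  sample_names
)
print(exclusive)
```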
To summarize, gr.crit is used as a grep pattern, and the file/sample names are the target. Make sure your file/sample names comply with make.names.
Finally, samples whose names are not matched using gr.crit are still incorporated into the Spectra2D object, but they are not assigned a group. Therefore they don't plot, but they do take up space in a plot! A warning is issued in these cases, since one wouldn't normally want a spectrum to be orphaned this way.
All these problems can generally be identified by running sumSpectra once the data is imported.
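A quick pre-import check for orphaned samples and unused criteria can also be done on the file names alone. The following base R sketch uses invented names and criteria, and only approximates the conditions that trigger the warnings described above.

```r
# Invented names and criteria: "Blank_1" matches nothing, "Standard" is unused.
sample_names <- c("Control_1", "Sample_1", "Blank_1")
gr.crit <- c("Control", "Sample", "Standard")

# Logical matrix: rows are sample names, columns are criteria.
hits <- sapply(gr.crit, function(crit) grepl(crit, sample_names))

# Samples that no criterion matches, and criteria that match no sample.
unmatched_samples <- sample_names[rowSums(hits) == 0]
unused_criteria <- gr.crit[colSums(hits) == 0]
print(unmatched_samples)
print(unused_criteria)
```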
The ... argument can be used to pass any argument to read.table or list.files. This includes the possibility of passing arguments that will cause trouble later, for instance na.strings in read.table. While one might successfully read in data with NA, it will eventually cause problems. The intent of this feature is to allow one to recurse a directory tree containing the data, and/or to specify a starting point other than the current working directory. So for instance if the current working directory is not the directory containing the data files, you can use path = "my_path" to point to the desired top-level directory, and recursive = TRUE to work your way through a set of subdirectories. In addition, if you are reading in JCAMP-DX files, you can pass arguments to readJDX via ..., e.g. SOFC = FALSE.
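To preview which files path and recursive would pick up, list.files can be called directly with the same values. The sketch below builds a throwaway directory tree under a temporary location; the directory and file names are invented for illustration.

```r
# Build a small, hypothetical directory tree in a temporary location.
top <- tempfile("my_path_demo")
dir.create(file.path(top, "batch1"), recursive = TRUE)
dir.create(file.path(top, "batch2"))
file.create(file.path(top, "batch1", "a.csv"),
            file.path(top, "batch2", "b.csv"),
            file.path(top, "readme.txt"))

# path sets the starting directory; recursive = TRUE descends into subdirectories.
found <- list.files(path = top, pattern = "\\.(csv|CSV)$", recursive = TRUE)
print(found)

# Clean up the temporary tree.
unlink(top, recursive = TRUE)
```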
Finally, while argument fileExt appears to be a file extension (from its name and the description elsewhere), it's actually just a grep pattern that you can apply to any part of the file name if you know how to construct the proper pattern.
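As an illustration of using the pattern on more than the extension, the sketch below selects only files whose names start with a particular prefix. The file names and the "HSQC" prefix are invented, and the pattern is tested with base R's grep rather than through this function.

```r
# Invented file names from two hypothetical experiment types.
file_names <- c("HSQC_control_1.csv", "HSQC_sample_1.csv", "COSY_control_1.csv")

# Anchor the prefix at the start and the extension at the end.
pattern <- "^HSQC.*\\.csv$"
selected <- grep(pattern, file_names, value = TRUE)
print(selected)
```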
files2Spectra2DObject acts on all files in the current working directory with the specified fileExt so there should be no extraneous files with that extension in the directory.