This function uses one of three methods (ARSER, JTK_CYCLE, or Lomb-Scargle) to detect rhythmic signals in time-series datasets that contain individual information.
meta3d(datafile, designfile, outdir = "metaout", filestyle,
design_libColm, design_subjectColm, minper = 20, maxper = 28,
cycMethodOne = "JTK", timeUnit = "hour", design_hrColm,
design_dayColm = NULL, design_minColm = NULL,
design_secColm = NULL, design_groupColm = NULL,
design_libIDrename = NULL, adjustPhase = "predictedPer",
combinePvalue = "fisher", weightedMethod = TRUE,
outIntegration = "both", ARSmle = "auto", ARSdefaultPer = 24,
dayZeroBased = FALSE, outSymbol = "", parallelize = FALSE,
nCores = 1)
datafile: a character string. The name of the data file containing time-series experimental values of all individuals.
designfile: a character string. The name of the experimental design file, which must contain at least the library ID (the column names of datafile), the subject ID (the individual corresponding to each library ID), and the sampling time of each library ID.
outdir: a character string. The name of the directory used to store output files.
filestyle: a character vector (length 1 or 3). The data format of the input files; must be "txt", "csv", or a character vector containing the field separator character (sep), the quoting character (quote), and the character used for decimal points (dec); for details see read.table.
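As a rough illustration of the three-element form of filestyle, the sketch below writes a small semicolon-separated file with comma decimals and reads it back with read.table. It assumes the three elements correspond to the sep, quote, and dec arguments of read.table, as the description suggests; the demo file and the variable `fs` are made up for this example.

```r
# Hypothetical custom format: ';' separator, '"' quoting, ',' decimal point.
fs <- c(";", "\"", ",")

# Write a tiny demo table in that format to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c('"gene";"lib1";"lib2"',
             '"geneA";1,5;2,25',
             '"geneB";NA;3,0'), tmp)

# Assumption: the filestyle elements map onto read.table's sep, quote and dec.
demo <- read.table(tmp, header = TRUE, sep = fs[1], quote = fs[2],
                   dec = fs[3], stringsAsFactors = FALSE)
print(demo)
```

If the file reads correctly here, passing the same vector as filestyle would be the natural next step.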
design_libColm: a numeric value. The index (counting from left to right) of the column storing the library ID in designfile.
design_subjectColm: a numeric value. The index (counting from left to right) of the column storing the subject ID in designfile.
minper: a numeric value. The minimum period length of the rhythms of interest. The default is 20, for circadian rhythms.
maxper: a numeric value. The maximum period length of the rhythms of interest. The default is 28, for circadian rhythms.
cycMethodOne: a character string. The method used to analyze the time-series data of each individual; must be one of "ARS" (ARSER), "JTK" (JTK_CYCLE), or "LS" (Lomb-Scargle).
timeUnit: a character string. The basic time unit; must be one of "day", "hour" (default, for circadian studies), "minute", or "second", depending on the experimental design.
design_hrColm: a numeric value. The index (counting from left to right) of the column storing the time point value (sampling hour) in designfile. If there is no such column in designfile, set it to NULL.
design_dayColm: a numeric value. The index (counting from left to right) of the column storing the time point value (sampling day) in designfile. If there is no such column in designfile, set it to NULL (default).
design_minColm: a numeric value. The index (counting from left to right) of the column storing the time point value (sampling minute) in designfile. If there is no such column in designfile, set it to NULL (default).
design_secColm: a numeric value. The index (counting from left to right) of the column storing the time point value (sampling second) in designfile. If there is no such column in designfile, set it to NULL (default).
design_groupColm: a numeric value. The index (counting from left to right) of the column storing the experimental group of each individual in designfile. If there is no such column in designfile, set it to NULL (default), in which case all individuals are treated as one group.
design_libIDrename: a character vector (length 2) containing a string to match in each library ID of designfile and a replacement string. If no characters in the library IDs of designfile need to be replaced, set it to NULL (default).
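The effect of a rename pair can be pictured with a small base-R sketch. This is only an assumed reading of the argument: it treats the first element as a literal string and uses gsub, while the package's own matching rules (for example, whether regular expressions apply) are not stated here. All IDs below are made up.

```r
# Library IDs as they might appear in the two files (made-up values).
data_ids   <- c("S1.T0", "S1.T4", "S2.T0")   # column names of the data file
design_ids <- c("S1-T0", "S1-T4", "S2-T0")   # library IDs in the design file

# Hypothetical rename pair: (string to match, replacement string).
rename_pair <- c("-", ".")

# Assumption: the replacement acts as a literal find-and-replace on each ID.
renamed <- gsub(rename_pair[1], rename_pair[2], design_ids, fixed = TRUE)

print(renamed)
print(all(renamed %in% data_ids))
```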
adjustPhase: a character string. The method used to adjust each calculated phase before computing the integrated phase; must be one of "predictedPer" (adjust the phase with the predicted period length) or "notAdjusted" (do not adjust the phase).
combinePvalue: a character string. The method used to integrate the p-values of multiple individuals; currently only "fisher" (Fisher's method) can be selected.
weightedMethod: logical. If TRUE (default), a weighted score based on the p-value of each individual is used to integrate the period, phase, and amplitude values of multiple individuals.
outIntegration: a character string. Controls which analysis results are output; must be one of "both", "onlyIntegration", or "noIntegration". See meta2d for more information.
ARSmle: a character string. The strategy for using the MLE method in "ARS"; must be one of "auto", "mle", or "nomle". See meta2d for more information.
ARSdefaultPer: a numeric value. The expected period length of the rhythm of interest, which is a required parameter for ARS. See meta2d for more information.
dayZeroBased: logical. If TRUE, the first sampling day is recorded as day zero in the designfile.
outSymbol: a character string. A common prefix for the names of the output files.
parallelize: logical. If TRUE, computation is done in parallel. This does not work on Windows machines.
nCores: an integer greater than or equal to one. The number of cores to use.
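One cautious way to choose values for parallelize and nCores is sketched below. It relies only on base R and the bundled parallel package; leaving one core free is a convention assumed for this example, not a requirement of the function.

```r
# Parallel computation is documented as unavailable on Windows.
on_windows <- .Platform$OS.type == "windows"

# detectCores() can return NA, so fall back to a single core in that case.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Only parallelize when it is supported and more than one core exists.
use_parallel <- !on_windows && available > 1
n_cores <- if (use_parallel) max(1L, available - 1L) else 1L

print(list(parallelize = use_parallel, nCores = n_cores))
```

The resulting values could then be passed as parallelize = use_parallel and nCores = n_cores.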
meta3d writes analysis results to outdir instead of returning them as objects. Output files with "meta3dSubjectID" in the file name contain the analysis results for each individual. Files with "meta3dGroupID" in the file name store the integrated p-values, period, phase, baseline, amplitude, and relative amplitude values from the multiple individuals of each group, together with FDR values calculated from the integrated p-values.
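The FDR procedure is not named here. Assuming it is the Benjamini-Hochberg adjustment, the step from integrated p-values to FDR values would look roughly like the following sketch; the p-values are made up.

```r
# Made-up integrated p-values for four molecules in one group.
integrated_p <- c(geneA = 0.001, geneB = 0.02, geneC = 0.04, geneD = 0.5)

# Assumption: FDR values come from the Benjamini-Hochberg procedure.
fdr <- p.adjust(integrated_p, method = "BH")

print(round(fdr, 4))
```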
This function was originally designed to analyze large-scale periodic data with individual information. Please pay attention to the data format of datafile and designfile (see the Examples part).
Time-series experimental values (with missing values as NA) from all individuals should be stored in datafile, with the first row containing all library IDs (a unique identifier for each sample) and the first column containing the names of all detected molecules (e.g. transcript or gene names). The designfile should have at least three columns: library ID, subject ID, and sampling time. The experimental group of each subject ID may be given in another column. In addition, sampling time may be stored in multiple columns instead of one. For example, a sampling time of "36 hours" may be recorded as "day 2" (sampling day column, design_dayColm) plus "12 hours" (sampling hour column, design_hrColm). The library IDs in datafile and designfile should be the same. If the library IDs differ in some characters between the two files, try design_libIDrename to make them match.
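The "day 2 plus 12 hours equals 36 hours" example can be checked with a short sketch. The helper `to_hours` and the design table are hypothetical and only illustrate the arithmetic, including how a zero-based day column (dayZeroBased = TRUE) would shift it; they are not the package's internal code.

```r
# Convert separate day and hour columns into one sampling time in hours.
# Illustrative helper, not part of the package.
to_hours <- function(day, hour, day_zero_based = FALSE) {
  # With one-based days, day 1 starts at hour 0, so subtract one day.
  offset <- if (day_zero_based) 0 else 1
  (day - offset) * 24 + hour
}

# Made-up design table: library ID, subject ID, sampling day and hour.
design <- data.frame(
  libID   = c("lib1", "lib2", "lib3"),
  subject = c("S1", "S1", "S1"),
  day     = c(1, 2, 2),
  hour    = c(12, 0, 12)
)

# One-based days: "day 2" plus "12 hours" gives 36 hours.
design$time_h <- to_hours(design$day, design$hour)
print(design)
```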
ARS, JTK, or LS can be used to analyze time-series profiles individual by individual. meta3d requires that all individuals be analyzed by the same method before the calculated p-value, period, phase, baseline, amplitude, and relative amplitude values are integrated group by group. However, the sampling pattern may differ among individuals, and each method has its own requirements for the sampling pattern (see meta2d for more information about these methods and their limitations). Please select a method carefully for the specific dataset. meta3d also helps users select a suitable method through warning notes.
P-values from different individuals are integrated with Fisher's method ("fisher") (Fisher, 1925; implementation code from MADAM). For short time-series profiles (e.g. 10 time points or fewer), p-values given by Lomb-Scargle may be overly conservative, which will also lead to conservative integrated p-values. The integrated period, baseline, amplitude, and relative amplitude values are each the arithmetic mean over multiple individuals. The integrated phase is the mean of circular quantities (adjustPhase = "predictedPer") or an arithmetic mean (adjustPhase = "notAdjusted") of the individual phases. To completely remove the potential problem of averaging phases with quite different period lengths (also mentioned in meta2d), setting minper, maxper, and ARSdefaultPer to the same value may be the only known way. If weightedMethod = TRUE is selected, weighted scores (-log10(p-values)) are taken into account when integrating the period, phase, baseline, amplitude, and relative amplitude.
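The integration steps described above can be sketched in a few lines of base R. This is an illustration under stated assumptions rather than the package's implementation: the input values are made up, the weights are normalised to sum to one, and the phase adjustment maps each phase to an angle using that individual's own period before averaging on the circle.

```r
# Made-up results for one molecule measured in three individuals.
p      <- c(0.01, 0.04, 0.20)   # per-individual p-values
period <- c(24, 23, 25)         # per-individual period estimates (hours)
phase  <- c(22, 22.5, 1)        # per-individual phases (hours)

# Fisher's method: -2 * sum(log(p)) is chi-square distributed with 2k
# degrees of freedom under the null, where k is the number of p-values.
fisher_stat <- -2 * sum(log(p))
fisher_p <- pchisq(fisher_stat, df = 2 * length(p), lower.tail = FALSE)

# Weighted scores from -log10(p), normalised to sum to one (assumption).
w <- -log10(p)
w <- w / sum(w)

# Weighted arithmetic mean of the periods.
int_period <- sum(w * period)

# Weighted circular mean of the phases: convert each phase to an angle
# with its own period, average on the circle, then convert back using
# the integrated period (an assumed reading of "predictedPer").
angle <- 2 * pi * phase / period
mean_angle <- atan2(sum(w * sin(angle)), sum(w * cos(angle)))
int_phase <- (mean_angle %% (2 * pi)) / (2 * pi) * int_period

print(c(fisher_p = fisher_p, period = int_period, phase = int_phase))
```

Note that the phases here straddle the end of the cycle, so a plain arithmetic mean would land far from all three individuals, whereas the circular mean stays close to them.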
Glynn E. F., Chen J., and Mushegian A. R. (2006). Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics, 22(3), 310-316.
Fisher R. A. (1925). Statistical methods for research workers. Oliver and Boyd (Edinburgh).
Kugler K. G., Mueller L. A., and Graber A. (2010). MADAM - an open source toolbox for meta-analysis. Source Code for Biology and Medicine, 5, 3.
# NOT RUN {
# load the package that provides meta3d and the example datasets
library(MetaCycle)
# write 'cycHumanBloodData' and 'cycHumanBloodDesign' into two 'csv' files
write.csv(cycHumanBloodData, file="cycHumanBloodData.csv",
          row.names=FALSE)
write.csv(cycHumanBloodDesign, file="cycHumanBloodDesign.csv",
          row.names=FALSE)
# detect circadian transcripts with JTK in studied individuals;
# results are written to the 'example' directory
meta3d(datafile="cycHumanBloodData.csv", cycMethodOne="JTK",
       designfile="cycHumanBloodDesign.csv", outdir="example",
       filestyle="csv", design_libColm=1, design_subjectColm=2,
       design_hrColm=4, design_groupColm=3)
# }