Usage
ARTIVAnet(targetData, parentData, targetNames = NULL, parentNames =
NULL, dataDescription=NULL, saveEstimations=TRUE, saveIterations=FALSE,
savePictures = TRUE, outputPath=NULL, dyn=1, segMinLength=2, maxCP=NULL,
maxPred=NULL, nbCPinit=NULL, CPinit=NULL, niter=50000, burn_in=NULL,
PSRFactor=FALSE,PSRF_thres=1.05, segmentAnalysis=TRUE,
edgesThreshold=0.5, layout = "fruchterman.reingold", cCP= 0.5,
cEdges=0.5, alphaCP=1, betaCP=0.5, alphaEdges=1, betaEdges=0.5, v0=1,
gamma0=0.1, alphad2=2, betad2=0.2, silent=FALSE)
Arguments
targetData
A vector with the temporal gene expression measurements for the target gene (i.e. the gene whose regulation factors are looked for).
parentData
A matrix (or a vector if only 1 parent gene) with the temporal gene
expression measurements for the proposed parent genes (i.e. potential
regulation factors). Parent genes are shown in row and expression values
in column. For computational reasons, we advise not to test
simultaneously more than 10 parent genes.
targetNames
A vector with the names for target gene(s) (optional, default:
targetNames=NULL
parentNames
A vector with the names for parent gene(s) (optional, default:
parentNames=NULL
).
dataDescription
(Required only when the gene expression measurements contain repeated
values for the same time points). A vector indicating the ordering of
the time measurements in the data. For example
dataDescription=rep(1:n, each=2), if there are two measurements for
each time point AND the repetitions for each time point are next to
each other. Note that temporal gene expression measurements have to be
organized identically in arguments targetData
and
parentData
(optional, default: dataDescription=NULL
).
saveEstimations
Boolean, if TRUE
all estimated posterior distributions are saved
as text files either in a new sub folder named "ARTIVA_Results" created
by default in the current folder or in a folder specified with argument
outputPath
(see below) (optional, default:
saveEstimations=TRUE
).
saveIterations
Boolean, if TRUE
the configuration for all iterations is saved as
text files either in a new sub folder named "ARTIVA_Results" created
by default in the current folder or in a folder specified with argument
outputPath
(see below) (optional, default:
saveIterations=FALSE
).
savePictures
Boolean, if TRUE
all estimated posterior distributions and
networks are plotted in a pdf file either in a new sub folder named
"ARTIVA_Results" created by default in the current folder or in a
folder specified with argument outputPath
(see below)
(optional, default: savePictures=TRUE
).
outputPath
File path to a folder in which the output results have to be saved,
either a complete path or the name of a folder to be created in the
current directory (optional, default: outputPath=NULL
).
dyn
Time delay to be considered in the auto-regressive process, between the
temporal expression measurements of the analyzed target gene and the
ones of the parent genes (optional, default: dyn=1
).
segMinLength
Minimal length (number of time points) to define a temporal
segment. Must be - strictly - greater than 1 if there is no
repeated measurements for each time point in arguments
targetData
and parentData
(optional, default: segMinLength=2
).
maxCP
Maximal number of CPs to be considered by the ARTIVA inference
procedure. Note that for a temporal course with n
time points
the maximal number of CPs is (n-1-dyn)
(see before for a
description of argument dyn
). For long temporal courses (more
than 20 time points), we advise - for computational reasons - to limit
the maximal number of CP to 15 (optional, default: maxCP=NULL
,
if maxCP=NULL
then the maximal number of CPs is set to
maxCP=min(round((n-1-dyn)/segMinLength)-1, 15)
where n
is the
number of time points).
maxPred
Maximal number of simultaneous incoming edges for each segment of the
regulatory network estimated for the target gene (default:
maxPred=NULL
, if maxPred=NULL
then the maximal number of
incoming edges is set to
maxPred=min(dim(parentData)[1],15)
)
nbCPinit
Number of CPs to be considered at the algorithm initialization
(optional, default: nbCPinit=NULL
, if nbCPinit=NULL
then the
initial number of CPs is randomly set to a value between 0 and maxCP/2
.).
CPinit
A vector with the initial positions for potential CPs.
(optional, default: CPinit = NULL
, if CPinit = NULL
then the initial vector is chosen randomly according to priors, with
nbCPinit
changepoints (see previous argument nbCPinit
) ).
niter
Number of iterations to be performed in the RJ-MCMC sampling
(optional, default: niter = 50000
).
burn_in
Number of initial iterations that are discarded for the estimation of
the model distribution (posterior distribution). The
ARTIVAnet
function is a RJ-MCMC algorithm which, at
each iteration, randomly samples a new configuration of the
time-varying regulatory network from probability distributions based
on constructing a Markov chain that has the network model distribution
as its equilibrium distribution (The equilibrium distribution is
obtained when the Markov Chain converges, which requires a large
number of iterations).
Typically, initial iterations are notconfident because the Markov Chain
has not stabilized. The burn-in samples allow to not consider these
initial iterations in the final analysis (optional, default:
burn_in=NULL
, if burn_in=NULL
then the first 25% of the
iterations is left for burn_in
). PSRFactor
Boolean, if TRUE
the RJ-MCMC procedure is stopped when the Potential
Scale Ratio Factor or PSRF (which is a usual convergence criterion) is
below a specified threshold (see documentation for argument
PSRF_thres
below) or when the maximal number of iterations is
reached (see documentation for argument niter
previously) (optional, default: PSRFactor=FALSE
).
For more details about the PSRF criterion, see Gelman and Rubin (1992).
PSRF_thres
(Only when PSRFactor=TRUE
) RJ-MCMC stopping threshold: the
RJ-MCMC procedure is stopped when the Potential Scale Ratio Factor
(PSRF) is below this threshold (see documentation for argument
PSRFactor
previously). Values of the PSRF below 1.1 are usually
taken as indication of sufficient convergence (optional, default:
PSRF_threshold=1.05
).
segmentAnalysis
Boolean, if TRUE
the posterior distribution for the edges in each
temporal segment delimited by the estimated changepoints (CPs) is
computed. (The estimated CPs are the k
CPs with maximal posterior
probability, where k
is the number of CPs having the maximal
porsterior probability (such that each temporal segment is larger than
segMinLength
) (see documentation below for output values
CPpostDist
, nbSegs
, CPposition
. If segmentAnalysis=FALSE
only the posterior distributions of the CPs number and CPs positions are computed (optional, default: segmentAnalysis=TRUE
).
edgesThreshold
Probability threshold for the selection of the edges of the time-varying
regulatory network when segmentAnalysis=TRUE
(optional, default: edgesThreshold=0.5
).
layout
Name of the function determining the placement of the vertices for
drawing a graph, possible values among others: "random",
"circle",
"sphere",
"fruchterman.reingold",
"kamada.kawai",
"spring",
"reingold.tilford",
"fruchterman.reingold.grid"
,
see package igraph0
for more details (default:
layout="fruchterman.reingold"
).
cCP
Maximal probability to propose the birth (resp. death) of a changepoint
(CP) during the RJ-MCMC iterations (optional, default: cCP=0.5
).
cEdges
Maximal probability - when a move update of the network topology is
chosen - to propose ecah edge move (birth or death of an edge) within a
temporal segment (optional, default: cEdges=0.5
).
alphaCP
Hyperparameter for sampling the number k
of CPs. k
follows
a Gamma distribution: Gamma(alphaCP, betaCP (see below)) (optional,
default: alphaCP=1
). Note that function
choosePriors
can be used to choose alphaCP
(or betaCP
) according to the desired dimension penalisation.
betaCP
Hyperparameter for sampling the number k
of CPs. k
follows
a Gamma distribution: Gamma(alphaCP (see before), betaCP
)
(optional, default: betaCP=0.5
). Note that function
choosePriors
can be used to choose betaCP
(or
alphaCP
) according to the desired dimension penalisation.
alphaEdges
Hyperparameter for sampling the number l
of regulatory
interactions between the target gene and the parent genes. l
follows a Gamma distribution: rgamma(1,
shape=alphaEdges, rate=betaEdges)
(see betaEdges
below), (default:
alphaEdges=1
). Function choosePriors
can be used to
choose alphaEdges
(or betaEdges
) according to
the desired dimension penalisation.
betaEdges
Hyperparameter for sampling the number l
of regulatory
interactions between the target gene and the parent genes. l
follows a Gamma
distribution: rgamma(1,
shape=alphaEdges, rate=betaEdges)
(see alphaEdges
upper) (optional, default:
betaEdges=0.5
). Function choosePriors
can be used
to choose betaEdges
(or alphaEdges
) according to the desired
dimension penalisation.
v0
Hyperparameter for sampling the noise variance (denoted by Sig2
)
in the auto-regressive model defining the regulatory time vrarying network. The prior
distribution of the noise variance is an Inverse Gamma distribution with
shape parameter v0/2
and scale parameter gamma0/2
: rinvgamma(1,shape=v0/2,
scale = gamma0)
, (optional, default: v0=1
).
gamma0
Hyperparameter for sampling the noise variance (denoted by Sig2
) in the ARTIVA package) in the auto-regressive
model defining the regulatory time vrarying network. The prior
distribution of the noise variance is an Inverse Gamma distribution with
shape parameter v0/2
and scale parameter gamma0/2
: rinvgamma(1,
shape=v0/2, scale = gamma0)
, (optional, default: gamma0=0.1
).
alphad2
Hyperparameter for sampling a parameter that represents the expected
signal-to-noise ratio (denoted by delta2
in the ARTIVA
package). It is sampled according to an Inverse Gamma distribution:
rinvgamma(1, shape=alphad2,
scale=betad2)
, (optional, default: alphad2=2
).
betad2
Hyperparameter for sampling a parameter that represents the expected
signal-to-noise ratio (denoted by delta2
in the ARTIVA
package). It is sampled according to an Inverse Gamma distribution:
rinvgamma(1, shape=alphad2,
scale=betad2)
, (optional, default: betad2=0.2
).
silent
Boolean, if TRUE
messages are printed along the ARTIVA procedure (optional, default: silent=FALSE
).