The function rp.flm.test
tests the composite null hypothesis of a Functional Linear Model with scalar response (FLM),
$$H_0:\,Y=\big<X,\beta\big>+\epsilon,$$
versus a general alternative. If \(\beta=\beta_0\) is provided, then the simple hypothesis \(H_0:\,Y=\big<X,\beta_0\big>+\epsilon\) is tested.
The null hypothesis is tested with a Projected Cramer-von Mises test (see Details).
rp.flm.test(X.fdata, Y, beta0.fdata = NULL, est.method = "pc",
  p = NULL, type.basis = "bspline", B = 5000, n.proj = 50,
  verbose = TRUE, F.code = TRUE, sigma = "vexponential",
  par.list = list(theta = diff(range(X.fdata$argvals))/20),
  same.rwild = FALSE, ...)
Functional covariate for the FLM. The object must be of class fdata.
Scalar response for the FLM. Must be a vector with as many elements as there are functions in X.fdata.
Functional parameter for the simple null hypothesis, in the fdata class. Recall that the argvals and rangeval arguments of beta0.fdata must be the same as those of X.fdata. One way to ensure this, for example for \(\beta_0=0\) (the simple null hypothesis of no interaction), is
beta0.fdata=fdata(mdata=rep(0,length(X.fdata$argvals)),
argvals=X.fdata$argvals,rangeval=X.fdata$rangeval).
If beta0.fdata=NULL (default), the function tests the composite null hypothesis.
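As a minimal sketch of the simple hypothesis of no effect, the snippet below builds a zero beta0.fdata on the grid of the covariate, following the construction described above. It assumes the fda.usc package is installed; the simulated data, the seed, and the small B are illustrative choices only, and the argument names are those listed in the usage.

```r
library(fda.usc)

# Illustrative data: 50 curves observed on 101 points of [0, 1]
set.seed(1)
t.grid <- seq(0, 1, l = 101)
X <- rproc2fdata(n = 50, t = t.grid, sigma = "OU")
Y <- rnorm(50)  # response generated independently of X, so beta = 0 holds

# Zero functional parameter sharing argvals and rangeval with X
beta0 <- fdata(mdata = rep(0, length(X$argvals)),
               argvals = X$argvals, rangeval = X$rangeval)

# Simple hypothesis H0: Y = <X, 0> + e; B is kept small for speed only
res <- rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0, B = 100)
res
```

For inferential use, B should be raised to the recommended number of replicates.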
Estimation method for the unknown parameter \(\beta\), only used in the composite case. There are two main options: specify the number of basis elements for the estimated \(\beta\) through p, or select p optimally by a data-driven criterion (see the Details section for discussion). It must be one of the following methods:
"pc"
If p, the number of basis elements, is given, then \(\beta\) is estimated by fregre.pc. Otherwise, an optimal p is chosen using fregre.pc.cv and the "SICc" criterion.
"pls"
If p is given, \(\beta\) is estimated by fregre.pls. Otherwise, an optimal p is chosen using fregre.pls.cv and the "SICc" criterion.
It has been checked empirically that this method provides a good balance between the performance of the test and the estimation of \(\beta\). Note, however, that the usage above sets est.method="pc" as the default.
"basis"
If p is given, \(\beta\) is estimated by fregre.basis. Otherwise, an optimal p is chosen using fregre.basis.cv and the "GCV.S" criterion. In these functions, the same basis is used for the arguments basis.x and basis.b.
The type of basis used is the one given by the argument type.basis and must be one of the types supported by create.basis. Further arguments for create.basis (except rangeval, which is taken from X.fdata) can be passed through the … argument.
Number of elements of the basis considered. If it is not given, an optimal p is chosen using a specific criterion (see the est.method and type.basis arguments).
Type of basis used to represent the functional process. Its interpretation depends on the hypothesis:
Simple hypothesis. One of these options:
"bspline"
If p is given, the functional process is expressed in a basis of p B-splines. If not, an optimal p is chosen by min.basis, using the "GCV.S" criterion.
"fourier"
If p is given, the functional process is expressed in a basis of p Fourier functions. If not, an optimal p is chosen by min.basis, using the "GCV.S" criterion.
"pc"
p must be given. Expresses the functional process in a basis of p principal components (PC).
"pls"
p must be given. Expresses the functional process in a basis of p partial least squares (PLS) components.
Although other bases supported by create.basis are also possible, "bspline" and "fourier" are recommended. Other bases may cause incompatibilities.
Composite hypothesis. This argument is only used when est.method="basis" and, in this case, specifies the type of basis used in the basis estimation method of the functional parameter. Again, the bases "bspline" and "fourier" are recommended, as other bases may cause incompatibilities.
Number of bootstrap replicates used to calibrate the distribution of the test statistic. B=5000 replicates are recommended for carrying out the test, although for exploratory (non-inferential) analysis a less time-consuming option such as B=500 is acceptable.
Number of projections (it can be a vector).
Whether to show information about computing progress.
Logical; check whether the Fortran DLL is loaded.
Argument passed to rproc2fdata
Argument passed to rproc2fdata
Logical; if TRUE, the function generates the same wild bootstrap residuals for all projections.
Further arguments passed to create.basis.
An object of class "htest" whose underlying structure is a list containing the following components:
The value of the test statistic.
A vector of length B with the values of the bootstrap test statistics.
The p-value of the test.
The method used.
The number of bootstrap replicates used.
The number of projections used.
The type of basis used.
The estimated functional parameter \(\beta\) in the composite hypothesis. For the simple hypothesis, the given beta0.fdata.
The number of basis elements passed or automatically chosen.
The character string "Y=<X,b>+e".
The Functional Linear Model with scalar response (FLM) is defined as \(Y=\big<X,\beta\big>+\epsilon\), for a functional process \(X\) such that \(E[X(t)]=0\) and \(E[X(t)\epsilon]=0\) for all \(t\), and for a scalar variable \(Y\) such that \(E[Y]=0\). The test therefore assumes that Y and X.fdata are centred and centres them automatically. So bear in mind that when you apply the test to Y and X.fdata, you are actually applying it to Y-mean(Y) and fdata.cen(X.fdata)$Xcen.
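The centring step can be illustrated in base R on a discretized sample. This is only a sketch of the idea behind Y-mean(Y) and fdata.cen(X.fdata)$Xcen, assuming the curves are stored as rows of a plain matrix; the data and the names X.mat, Y.cen and X.cen are hypothetical, and this is not the package's own code.

```r
# Hypothetical discretized sample: 20 curves (rows) on 11 grid points (columns)
set.seed(1)
X.mat <- matrix(rnorm(20 * 11, mean = 3), nrow = 20)
Y <- rnorm(20, mean = 5)

# Centre the response and subtract the pointwise mean curve from every row
Y.cen <- Y - mean(Y)
X.cen <- sweep(X.mat, 2, colMeans(X.mat))

# Both quantities should now be zero up to rounding error
mean(Y.cen)
max(abs(colMeans(X.cen)))
```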
The test statistic corresponds to the Cramer-von Mises norm of the Residual Marked empirical Process based on Projections \(R_n(u,\gamma)\) defined in Garcia-Portugues et al. (2014). The expression of this process in a \(p\)-truncated basis of the space \(L^2[0,T]\) leads to the \(p\)-multivariate process \(R_{n,p}\big(u,\gamma^{(p)}\big)\), whose Cramer-von Mises norm is easily computed.
When \(p\) is not provided, an appropriate \(p\) to represent the functional process \(X\) is chosen via the estimation of \(\beta\) for the composite hypothesis. For the simple hypothesis, as no estimation of \(\beta\) is done, the choice of \(p\) depends only on the functional process \(X\). As the result of the test may change for different values of \(p\), we recommend using an automatic criterion to select \(p\) instead of providing a fixed one. The distribution of the test statistic is approximated by a wild bootstrap on the residuals, using the golden section bootstrap.
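A wild bootstrap resample multiplies each residual by an independent random weight with mean zero and unit variance. The base R sketch below assumes that "golden section" refers to the usual two-point distribution built from the golden ratio; the residuals e and all variable names are hypothetical, and the package's rwild function should be preferred in practice.

```r
# Golden section two-point distribution (assumed form, see the text above)
phi <- (1 + sqrt(5)) / 2
values <- c(1 - phi, phi)   # (1 - sqrt(5))/2 and (1 + sqrt(5))/2
probs <- c((5 + sqrt(5)) / 10, (5 - sqrt(5)) / 10)

# Exact mean and variance of the weights: 0 and 1 up to rounding error
V.mean <- sum(values * probs)
V.var <- sum(values^2 * probs) - V.mean^2

# One wild bootstrap resample of hypothetical residuals e
set.seed(1)
e <- rnorm(100, sd = 0.1)
V <- sample(values, size = length(e), replace = TRUE, prob = probs)
e.star <- e * V
```

Repeating the last two lines B times, and recomputing the statistic on each e.star, gives the bootstrap distribution used to calibrate the test.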
Finally, the graph shown if plot.it=TRUE represents the observed trajectory, and the bootstrap trajectories under the null, of the RMPP process integrated over the projections:
$$R_n(u)\approx\frac{1}{G}\sum_{g=1}^G R_n(u,\gamma_g),$$
where the \(\gamma_g\) are simulated as Gaussian processes. This gives a graphical idea of how distant the observed trajectory is from the null hypothesis.
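The averaging over projections can be sketched in base R. The snippet assumes the form \(R_n(u,\gamma)=n^{-1/2}\sum_{i=1}^n e_i 1\{\big<X_i,\gamma\big>\le u\}\) for the process, with \(e_i\) the residuals, as an interpretation of the cited definition. The data, the Brownian-motion-like directions, and every variable name are hypothetical choices, not the package's implementation.

```r
set.seed(1)
n <- 50; m <- 21; G <- 10
t.grid <- seq(0, 1, length.out = m)
X.mat <- matrix(rnorm(n * m), nrow = n)  # hypothetical curves, one per row
e <- rnorm(n)                            # hypothetical residuals
e <- e - mean(e)

# Gaussian directions: cumulated normals give Brownian-motion-like paths
gammas <- t(apply(matrix(rnorm(G * m), nrow = G), 1, cumsum)) / sqrt(m)

# Projections <X_i, gamma_g>, approximated by a Riemann sum on the grid
h <- t.grid[2] - t.grid[1]
proj <- (X.mat %*% t(gammas)) * h        # n x G matrix

# R_n(u, gamma_g) evaluated on a common grid of u values
u.grid <- seq(min(proj), max(proj), length.out = 100)
Rn.one <- function(p.g) sapply(u.grid, function(u) sum(e[p.g <= u])) / sqrt(n)
Rn.all <- apply(proj, 2, Rn.one)         # 100 x G matrix, one column per gamma_g

# Average over the G projections, as in the displayed formula
Rn.avg <- rowMeans(Rn.all)
```

Plotting Rn.avg against u.grid gives one trajectory; the bootstrap trajectories would be obtained by replacing e with wild bootstrap residuals.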
flm.test, rwild, fregre.pc, fregre.pls, fregre.basis, fregre.pc.cv, fregre.pls.cv, fregre.basis.cv, min.basis, create.basis
# NOT RUN {
# Simulated example #
X=rproc2fdata(n=100,t=seq(0,1,l=101),sigma="OU")
beta0=fdata(mdata=cos(2*pi*seq(0,1,l=101))-(seq(0,1,l=101)-0.5)^2+
rnorm(101,sd=0.05),argvals=seq(0,1,l=101),rangeval=c(0,1))
Y=inprod.fdata(X,beta0)+rnorm(100,sd=0.1)
dev.new(width=21,height=7)
par(mfrow=c(1,3))
plot(X,main="X")
plot(beta0,main="beta0")
plot(density(Y),main="Density of Y",xlab="Y",ylab="Density")
rug(Y)
# }
# NOT RUN {
# Composite hypothesis: do not reject FLM
res.rp=rp.flm.test(X,Y,B=50,n.proj=100)
res.pcvm=flm.test(X,Y,B=50,G=100)
res.rp
res.pcvm
# }