fda (version 2.4.8)

fRegress: Functional Regression Analysis

Description

This function carries out a functional regression analysis, where either the dependent variable or one or more independent variables are functional. Non-functional variables may be used on either side of the equation. In a simple problem where there is a single scalar independent covariate with values \(z_i, i=1,\ldots,N\) and a single functional covariate with values \(x_i(t)\), the two versions of the model fit by fRegress are the scalar dependent variable model

$$y_i = \beta_1 z_i + \int x_i(t) \beta_2(t) \, dt + e_i$$

and the concurrent functional dependent variable model

$$y_i(t) = \beta_1(t) z_i + \beta_2(t) x_i(t) + e_i(t).$$

In these models, the final term \(e_i\) or \(e_i(t)\) is a residual, lack of fit or error term.
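
As a rough numerical illustration of the scalar dependent variable model above, the sketch below simulates data on a grid and approximates the integral by a Riemann sum. It uses base R only and does not call fRegress; the grid, the simulated curves, the coefficient values, and all object names are assumptions made for this illustration.

```r
set.seed(1)
N     <- 50                           # number of observations (assumed)
tgrid <- seq(0, 1, length.out = 101)  # evaluation grid on [0, 1] (assumed)
dt    <- tgrid[2] - tgrid[1]

z <- rnorm(N)                         # scalar covariate z_i
# functional covariate x_i(t): random combinations of two sine curves
X <- outer(rnorm(N), sin(2 * pi * tgrid)) +
     outer(rnorm(N), sin(4 * pi * tgrid))    # N x 101 matrix, one row per i

beta1 <- 2                            # scalar coefficient (assumed)
beta2 <- 3 * sin(2 * pi * tgrid)      # coefficient function beta_2(t) (assumed)

# Riemann-sum approximation of the integral of x_i(t) * beta_2(t) for each i
intXbeta <- as.vector(X %*% beta2) * dt

# y_i = beta_1 z_i + integral term + e_i
y <- beta1 * z + intXbeta + rnorm(N, sd = 0.1)

# With beta_2 treated as known, regressing y on z and the integral term
# should give coefficients close to beta1 and 1
fit <- lm(y ~ 0 + z + intXbeta)
coef(fit)
```

In practice \(\beta_2(t)\) is unknown; fRegress estimates it as a basis expansion rather than treating it as given.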

In the concurrent functional linear model for a functional dependent variable, all functional variables are evaluated at a common time or argument value \(t\). That is, the fit is defined in terms of the behavior of all variables at a fixed time, or in terms of "now" behavior.
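
The "now" behavior can be made concrete with a pointwise sketch: at each grid value, the response is regressed only on covariate values at that same grid value. This is a base R illustration under assumed simulated data and names, not the algorithm used by fRegress, which represents each \(\beta_j(t)\) by a basis expansion instead of fitting each time point separately.

```r
set.seed(2)
N     <- 40
tgrid <- seq(0, 1, length.out = 21)        # common argument values (assumed)

z <- rnorm(N)                              # scalar covariate z_i
X <- matrix(rnorm(N * length(tgrid)), N)   # x_i(t_k), one row per i

beta1 <- cos(2 * pi * tgrid)               # beta_1(t) (assumed)
beta2 <- 1 + tgrid                         # beta_2(t) (assumed)

# y_i(t_k) = beta_1(t_k) z_i + beta_2(t_k) x_i(t_k) + e_i(t_k)
Y <- outer(z, beta1) + sweep(X, 2, beta2, "*") +
     matrix(rnorm(N * length(tgrid), sd = 0.05), N)

# Separate least-squares fit at each t_k, using only values at that t_k
betahat <- sapply(seq_along(tgrid), function(k) {
  coef(lm(Y[, k] ~ 0 + z + X[, k]))
})
rownames(betahat) <- c("beta1", "beta2")

# Largest pointwise estimation errors over the grid
max(abs(betahat["beta1", ] - beta1))
max(abs(betahat["beta2", ] - beta2))
```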

All regression coefficient functions \(\beta_j(t)\) are considered to be functional. In the case of a scalar dependent variable, the regression coefficient for a scalar covariate is converted to a functional variable with a constant basis. All regression coefficient functions can be forced to be smooth through the use of roughness penalties, and consequently are specified in the argument list as functional parameter objects.
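
One common way to write a roughness-penalized fitting criterion for the scalar dependent variable model is sketched below. The smoothing parameter \(\lambda\) and the linear differential operator \(L\) are meant to correspond to the lambda and Lfdobj arguments of fdPar; the exact criterion minimized by fRegress is not stated on this page, so this is an illustrative form rather than a description of the implementation.

$$\mathrm{PENSSE}_\lambda(\beta_1, \beta_2) = \sum_{i=1}^{N} \left[ y_i - \beta_1 z_i - \int x_i(t) \beta_2(t) \, dt \right]^2 + \lambda \int \left[ L\beta_2(t) \right]^2 dt$$

Larger values of \(\lambda\) push \(\beta_2(t)\) toward functions for which \(L\beta_2 = 0\), while \(\lambda = 0\) gives an unpenalized least squares fit.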

Usage

fRegress(y, ...)
# S3 method for formula
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
                 y2cMap=NULL, SigmaE=NULL,
                 method=c('fRegress', 'model'), sep='.', ...)
# S3 method for character
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
                 y2cMap=NULL, SigmaE=NULL,
                 method=c('fRegress', 'model'), sep='.', ...)
# S3 method for fd
fRegress(y, xfdlist, betalist, wt=NULL,
                     y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)
# S3 method for fdPar
fRegress(y, xfdlist, betalist, wt=NULL,
                     y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)
# S3 method for numeric
fRegress(y, xfdlist, betalist, wt=NULL,
                     y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)

Arguments

y

the dependent variable object. It may be an object of five possible classes:

  • character or formula a formula object or a character object that can be coerced into a formula providing a symbolic description of the model to be fitted satisfying the following rules:

    The left hand side, formula y, must be either a numeric vector or a univariate object of class fd or fdPar. If it is of class fd, it is replaced by fdPar(y, ...).

    All objects named on the right hand side must be either numeric or fd (functional data) or fdPar. The number of replications of fd or fdPar object(s) must match each other and the number of observations of numeric objects named, as well as the number of replications of the dependent variable object. The right hand side of this formula is translated into xfdlist, then passed to another method for fitting (unless method = 'model'). Multivariate independent variables are allowed in a formula and are split into univariate independent variables in the resulting xfdlist. Similarly, categorical independent variables with k levels are translated into k-1 contrasts in xfdlist. Any smoothing information is passed to the corresponding component of betalist.

  • scalar a vector if the dependent variable is scalar.

  • fd a functional data object if the dependent variable is functional. A y of this class is replaced by fdPar(y, ...) and passed to fRegress.fdPar.

  • fdPar a functional parameter object if the dependent variable is functional, and if it is desired to smooth the prediction of the dependent variable.
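
The rule that a categorical variable with k levels yields k-1 contrasts can be seen in base R with model.matrix, which the manual set-up in the Examples also uses. Whether the formula method builds its contrasts in exactly this way internally is an assumption; the factor and its level names below are illustrative.

```r
# A factor with k = 4 levels (level names are illustrative)
region <- factor(c("Arctic", "Atlantic", "Continental", "Pacific",
                   "Atlantic", "Pacific"))

# Default treatment contrasts: an intercept plus k - 1 = 3 indicator columns
mm <- model.matrix(~ region)
colnames(mm)

# Number of contrast columns, excluding the intercept
ncol(mm) - 1
```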

data

an optional list or data.frame containing names of objects identified in the formula or character y.

xfdlist

a list of length equal to the number of independent variables (including any intercept). Members of this list are the independent variables. They can be objects of either of these two classes:

  • scalar a numeric vector if the independent variable is scalar.

  • fd a (univariate) functional data object.

In either case, the object must have the same number of replications as the dependent variable object. That is, if it is a scalar, it must be of the same length as the dependent variable, and if it is functional, it must have the same number of replications as the dependent variable. (Only univariate independent variables are currently allowed in xfdlist.)

betalist

For the fd, fdPar, and numeric methods, betalist must be a list of length equal to length(xfdlist). Members of this list are functional parameter objects (class fdPar) defining the regression functions to be estimated. Even if a corresponding independent variable is scalar, its regression coefficient must be functional if the dependent variable is functional. (If the dependent variable is a scalar, the coefficients of scalar independent variables, including the intercept, must be constants, but the coefficients of functional independent variables must be functional.) Each of these functional parameter objects defines a single functional data object, that is, with only one replication.

For the formula and character methods, betalist can be either a list, as for the other methods, or NULL, in which case a list is created. If betalist is created, it will use the bases from the corresponding component of xfdlist if it is functional or from the response variable. Smoothing information (arguments Lfdobj, lambda, estimate, and penmat of function fdPar) will come from the corresponding component of xfdlist if it is of class fdPar (or for scalar independent variables from the response variable if it is of class fdPar) or from optional arguments if the reference variable is not of class fdPar.

wt

weights for weighted least squares

y2cMap

the matrix mapping from the vector of observed values to the coefficients for the dependent variable. This is output by function smooth.basis. If this is supplied, confidence limits are computed, otherwise not.

SigmaE

Estimate of the covariances among the residuals. This can only be estimated after a preliminary analysis with fRegress.

method

a character string matching either fRegress for functional regression estimation or model to create the argument lists for functional regression estimation without running it.

sep

separator for creating names for multiple variables for fRegress.fdPar or fRegress.numeric created from single variables on the right hand side of the formula y. This happens with multidimensional fd objects as well as with categorical variables.

returnMatrix

logical: If TRUE, a two-dimensional matrix is returned using a special class from the Matrix package.

...

optional arguments

Value

These functions return either a standard fRegress fit object or a model specification:

fRegress fit

a list of class fRegress with the following components:

  • y the first argument in the call to fRegress (coerced to class fdPar)

  • xfdlist the second argument in the call to fRegress.

  • betalist the third argument in the call to fRegress.

  • betaestlist a list of length equal to the number of independent variables and with members having the same functional parameter structure as the corresponding members of betalist. These are the estimated regression coefficient functions.

  • yhatfdobj a functional parameter object (class fdPar) if the dependent variable is functional or a vector if the dependent variable is scalar. This is the set of values predicted by the functional regression model for the dependent variable.

  • Cmatinv a matrix containing the inverse of the coefficient matrix for the linear equations that define the solution to the regression problem. This matrix is required for function fRegress.stderr that estimates confidence regions for the regression coefficient function estimates.

  • wt the vector of weights input or inferred

If class(y) is numeric, the fRegress object also includes:

  • df equivalent degrees of freedom for the fit.

  • OCV the leave-one-out cross validation score for the model.

  • gcv the generalized cross validation score.
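
For orientation, the sketch below computes quantities of the same kind for an ordinary linear model in base R, using the diagonal of the hat matrix. The normalization shown (sums, not means) is an assumption for illustration; this page does not state the exact formulas that fRegress uses for df, OCV, and gcv.

```r
set.seed(3)
n <- 30
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fit <- lm(y ~ x)
h   <- hatvalues(fit)                # diagonal of the hat matrix
res <- residuals(fit)

df  <- sum(h)                        # trace of the hat matrix
OCV <- sum((res / (1 - h))^2)        # sum of squared leave-one-out errors
GCV <- sum((res / (1 - df / n))^2)   # same form with average leverage df / n

c(df = df, OCV = OCV, GCV = GCV)
```

The OCV line relies on the standard identity that the leave-one-out residual of a linear smoother equals the ordinary residual divided by one minus the leverage, so no refitting is needed.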

If class(y) is either fd or fdPar, the fRegress object returned also includes 5 other components:

  • y2cMap an input y2cMap

  • SigmaE an input SigmaE

  • betastderrlist an fd object estimating the standard errors of betaestlist

  • bvar a covariance matrix

  • c2bMap a map

model specification

The fRegress.formula and fRegress.character functions translate the formula into the argument list required by fRegress.fdPar or fRegress.numeric. With the default value 'fRegress' for the argument method, this list is then used to call the appropriate other fRegress function.

Alternatively, to see how the formula is translated, use the alternative 'model' value for the argument method. In that case, the function returns a list with the arguments otherwise passed to these other functions plus the following additional components:

  • xfdlist0 a list of the objects named on the right hand side of formula. This will differ from xfdlist for any categorical or multivariate right hand side object.

  • type the type component of any fd object on the right hand side of formula.

  • nbasis a vector containing the nbasis components of variables named in formula having such components

  • xVars an integer vector, named by the variables on the right hand side of formula, giving the number of variables each contributes to xfdlist. This can exceed 1 for any multivariate object on the right hand side of class either numeric or fd as well as any categorical variable.

Details

Alternative forms of functional regression can be categorized with traditional least squares using the following 2 x 2 table:

                     explanatory variable
response   |  scalar           |  function
-----------+-------------------+----------------------------
scalar     |  lm               |  fRegress.numeric
function   |  fRegress.fd or   |  fRegress.fd or
           |  fRegress.fdPar   |  fRegress.fdPar or linmod

For fRegress.numeric, the numeric response is assumed to be the sum of integrals of xfd * beta for all functional xfd terms.

fRegress.fd or .fdPar produces a concurrent regression with each beta being also a (univariate) function.

linmod predicts a functional response from a convolution integral, estimating a bivariate regression function.

In the computation of regression function estimates in fRegress, all independent variables are treated as if they are functional. If argument xfdlist contains one or more vectors, these are converted to functional data objects having the constant basis with coefficients equal to the elements of the vector.
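
The conversion of a vector to a constant-basis functional data object can be reproduced by hand, in the same way the Examples build the region contrasts. The sketch below assumes the fda package is installed; the vector z, the range c(0, 365), and the evaluation points are illustrative choices.

```r
library(fda)

z <- c(1.5, -0.3, 2.0)                        # scalar covariate, 3 observations
constbasis <- create.constant.basis(c(0, 365))

# One basis function, so the coefficient matrix is 1 x length(z)
zfd <- fd(matrix(z, nrow = 1), constbasis)

# Each replication is constant in t and equal to the corresponding z value;
# rows of vals are evaluation points, columns are replications
vals <- eval.fd(c(0, 100, 365), zfd)
vals
```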

Needless to say, if all the variables in the model are scalar, do NOT use this function. Instead, use either lm or lsfit.

These functions provide a partial implementation of Ramsay and Silverman (2005, chapters 12-20).

References

Ramsay, James O., Hooker, Giles, and Graves, Spencer (2009), Functional Data Analysis in R and Matlab, Springer, New York.

Ramsay, James O., and Silverman, Bernard W. (2005), Functional Data Analysis, 2nd ed., Springer, New York.

See Also

fRegress.formula, fRegress.stderr, fRegress.CV, linmod

Examples

    # NOT RUN {
    ###
    ###
    ### scalar response and explanatory variable
    ###   ... to compare fRegress and lm
    ###
    ###
    # example from help('lm')
         ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
         trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
         group <- gl(2,10,20, labels=c("Ctl","Trt"))
         weight <- c(ctl, trt)
    lm.D9 <- lm(weight ~ group)
    
    fRegress.D9 <- fRegress(weight ~ group)
    
    (lm.D9.coef <- coef(lm.D9))
    
    (fRegress.D9.coef <- sapply(fRegress.D9$betaestlist, coef))
    
    # }
    # NOT RUN {
    all.equal(as.numeric(lm.D9.coef), as.numeric(fRegress.D9.coef))
    # }
    # NOT RUN {
    ###
    ###
    ### vector response with functional explanatory variable
    ###
    ###
    
    ##
    ## set up
    ##
    annualprec <- log10(apply(CanadianWeather$dailyAv[,,"Precipitation.mm"],
                              2,sum))
    # The simplest 'fRegress' call is singular with more bases
    # than observations, so we use a small basis for this example
    smallbasis  <- create.fourier.basis(c(0, 365), 25)
    # There are other ways to handle this,
    # but we will not discuss them here
    tempfd <- smooth.basis(day.5,
              CanadianWeather$dailyAv[,,"Temperature.C"], smallbasis)$fd
    
    ##
    ## formula interface
    ##
    
    precip.Temp.f <- fRegress(annualprec ~ tempfd)
    
    ##
    ## Get the default setup and modify it
    ##
    
    precip.Temp.mdl <- fRegress(annualprec ~ tempfd, method='m')
    # First confirm we get the same answer as above:
    precip.Temp.m <- do.call('fRegress', precip.Temp.mdl)
    # }
    # NOT RUN {
    all.equal(precip.Temp.m, precip.Temp.f)
    # }
    # NOT RUN {
    #  set up a smaller basis than for temperature
    nbetabasis  <- 21
    betabasis2.  <- create.fourier.basis(c(0, 365), nbetabasis)
    betafd2.     <- fd(rep(0, nbetabasis), betabasis2.)
    # add smoothing
    betafdPar2.  <- fdPar(betafd2., lambda=10)
    
    precip.Temp.mdl2 <- precip.Temp.mdl
    precip.Temp.mdl2[['betalist']][['tempfd']] <- betafdPar2.
    
    # Now do it.
    precip.Temp.m2 <- do.call('fRegress', precip.Temp.mdl2)
    
    # Compare the two fits
    precip.Temp.f[['df']] # 26
    precip.Temp.m2[['df']]# 22 = saved 4 degrees of freedom
    
    (var.e.f <- mean(with(precip.Temp.f, (yhatfdobj-yfdPar)^2)))
    (var.e.m2 <- mean(with(precip.Temp.m2, (yhatfdobj-yfdPar)^2)))
    # with a modest increase in lack of fit.
    
    ##
    ## Manual construction of xfdlist and betalist
    ##
    
    xfdlist <- list(const=rep(1, 35), tempfd=tempfd)
    
    # The intercept must be constant for a scalar response
    betabasis1 <- create.constant.basis(c(0, 365))
    betafd1    <- fd(0, betabasis1)
    betafdPar1 <- fdPar(betafd1)
    
    betafd2     <- with(tempfd, fd(basisobj=basis, fdnames=fdnames))
    # convert to an fdPar object
    betafdPar2  <- fdPar(betafd2)
    
    betalist <- list(const=betafdPar1, tempfd=betafdPar2)
    
    precip.Temp <- fRegress(annualprec, xfdlist, betalist)
    # }
    # NOT RUN {
    all.equal(precip.Temp, precip.Temp.f)
    # }
    # NOT RUN {
    ###
    ###
    ### functional response with vector explanatory variables
    ###
    ###
    
    ##
    ## simplest:  formula interface
    ##
    daybasis65 <- create.fourier.basis(rangeval=c(0, 365), nbasis=65,
                      axes=list('axesIntervals'))
    Temp.fd <- with(CanadianWeather, smooth.basisPar(day.5,
                    dailyAv[,,'Temperature.C'], daybasis65)$fd)
    TempRgn.f <- fRegress(Temp.fd ~ region, CanadianWeather)
    
    ##
    ## Get the default setup and possibly modify it
    ##
    TempRgn.mdl <- fRegress(Temp.fd ~ region, CanadianWeather, method='m')
    # names(TempRgn.mdl)
    # make desired modifications here
    # then run
    TempRgn.m <- do.call('fRegress', TempRgn.mdl)
    
    # no change, so match the first run
    # }
    # NOT RUN {
    all.equal(TempRgn.m, TempRgn.f)
    # }
    # NOT RUN {
    ##
    ## More detailed set up
    ##
    # str(TempRgn.mdl$xfdlist)
    region.contrasts <- model.matrix(~factor(CanadianWeather$region))
    rgnContr3 <- region.contrasts
    dim(rgnContr3) <- c(1, 35, 4)
    dimnames(rgnContr3) <- list('', CanadianWeather$place, c('const',
       paste('region', c('Atlantic', 'Continental', 'Pacific'), sep='.')) )
    
    const365 <- create.constant.basis(c(0, 365))
    region.fd.Atlantic <- fd(matrix(rgnContr3[,,2], 1), const365)
    # str(region.fd.Atlantic)
    region.fd.Continental <- fd(matrix(rgnContr3[,,3], 1), const365)
    region.fd.Pacific <- fd(matrix(rgnContr3[,,4], 1), const365)
    region.fdlist <- list(const=rep(1, 35),
         region.Atlantic=region.fd.Atlantic,
         region.Continental=region.fd.Continental,
         region.Pacific=region.fd.Pacific)
    # str(TempRgn.mdl$betalist)
    beta1 <- with(Temp.fd, fd(basisobj=basis, fdnames=fdnames))
    beta0 <- fdPar(beta1)
    betalist <- list(const=beta0, region.Atlantic=beta0,
                 region.Continental=beta0, region.Pacific=beta0)
    
    TempRgn <- fRegress(Temp.fd, region.fdlist, betalist)
    
    # }
    # NOT RUN {
    all.equal(TempRgn, TempRgn.f)
    # }
    # NOT RUN {
    ###
    ###
    ### functional response with
    ###            (concurrent) functional explanatory variable
    ###
    ###
    
    ##
    ##  predict knee angle from hip angle;  from demo('gait', package='fda')
    ##
    ## formula interface
    ##
    (gaittime <- as.numeric(dimnames(gait)[[1]])*20)
    gaitrange <- c(0,20)
    gaitbasis <- create.fourier.basis(gaitrange, nbasis=21)
    harmaccelLfd <- vec2Lfd(c(0, (2*pi/20)^2, 0), rangeval=gaitrange)
    gaitfd <- smooth.basisPar(gaittime, gait,
           gaitbasis, Lfdobj=harmaccelLfd, lambda=1e-2)$fd
    hipfd  <- gaitfd[,1]
    kneefd <- gaitfd[,2]
    
    knee.hip.f <- fRegress(kneefd ~ hipfd)
    # knee.hip.mdl <- fRegress(kneefd ~ hipfd, method='m')
    ##
    ## manual set-up
    ##
    #  set up the list of covariate objects
    # conbasis <- create.constant.basis(c(0,20))
    const  <- rep(1, dim(kneefd$coef)[2])
    xfdlist  <- list(const=const, hipfd=hipfd)
    
    beta0 <- with(kneefd, fd(basisobj=basis, fdnames=fdnames))
    beta1 <- with(hipfd, fd(basisobj=basis, fdnames=fdnames))
    
    betalist  <- list(const=fdPar(beta0), hipfd=fdPar(beta1))
    
    fRegressout <- fRegress(kneefd, xfdlist, betalist)
    
    # }
    # NOT RUN {
    all.equal(fRegressout, knee.hip.f)
    # }
    # NOT RUN {
    #See also the following demos:
    
    #demo('canadian-weather', package='fda')
    #demo('gait', package='fda')
    #demo('refinery', package='fda')
    #demo('weatherANOVA', package='fda')
    #demo('weatherlm', package='fda')
    # }
    
