ArfimaMLM (version 1.3)

arfimaPrep: Preparing a dataset for subsequent analysis according to the Arfima-MLM/Arfima-OLS framework

Description

This function prepares a repeated cross-sectional dataset or pooled cross-sectional time-series data for subsequent analyses according to the Arfima-MLM/Arfima-OLS framework. This function is mainly used internally as part of arfimaMLM and arfimaOLS, but can also be used independently if the user prefers to separate the data preparation from the subsequent estimation of the multilevel (or simple linear) model. The function performs the aggregation and fractional differencing of time/level variables as well as the necessary procedures to remove deterministic components from the dependent as well as the major independent variables.
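The aggregation step described above can be pictured with a small base-R sketch. It is only an illustration on made-up toy data, not the package's internal code; the `.mean` suffix mirrors the variable names used in the `ecmformula` example further down, and the object names `toy` and `toy_mean` are hypothetical.

```r
# Toy repeated cross-section: 3 time points, 4 respondents each (made-up values)
toy <- data.frame(
  time = rep(1:3, each = 4),
  y    = c(1, 2, 3, 4,  2, 4, 6, 8,  0, 1, 1, 2)
)

# One row per time point holding the within-timepoint mean of y
toy_mean <- aggregate(y ~ time, data = toy, FUN = mean)

# Rename the aggregated column (suffix chosen for illustration)
names(toy_mean)[names(toy_mean) == "y"] <- "y.mean"
toy_mean
```

The resulting series of time-point means is what would then be fractionally differenced.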

Usage

arfimaPrep(data, timevar
           , varlist.mean, varlist.fd
           , varlist.xdif, varlist.ydif
           , d = "Hurst", arma = NULL
           , ecmformula = NULL, decm = "Hurst"
           , drop = 5, ...)

Arguments

data
Data frame to be transformed by the function.
timevar
Name of the variable indicating different timepoints in data.
varlist.mean
Character vector of variable names in data that are averaged/aggregated over each timepoint specified by timevar. The variable list must include all variables listed in varlist.fd, varlist.xdif, va
varlist.fd
Character vector of variable names in data that are fractionally differenced (after aggregating over each timepoint specified by timevar). The variable list must include all variables listed in varlist.ydif See detai
varlist.xdif
Character vector of variable names in data for which the within-timepoint deviation from the respective mean value is calculated (for each timepoint specified by timevar). See details for further information.
varlist.ydif
Character vector of variable names in data for which the temporal deterministic component is removed by subtracting the difference of the within-timepoint average and its stationary series free of autocorrelation (with each timepoint specifi
d
Call for a specific estimation method for the fractional differencing parameter in the fractal-package ("Hurst") or in the
arma
List of variables for which AR and MA parameters are to be estimated (after fractional differencing) as well as a vector containing the respective orders of the model to fit. order[1] corresponds to the AR part and order[2] to th
ecmformula
Specification of the cointegration regression to receive the residuals for the error correction mechanism (ecm) to be included in the transformed dataset: linear formula object with the response on the left of a ~ operator and the independent variables, s
decm
Estimation method for the fractional differencing parameter of the error correction mechanism (see d for details). Can be either "Hurst", "ML", "GPH", or "Sperio". Default is "Hurst".
drop
Number of time points from the beginning of the series dropped from analysis. Default is 5.
...
Further arguments passed to the estimation procedures used within the function.

Value

The function returns a list of datasets and estimation results with the following items:

  • data.mean: Data frame of the means of the variables declared in varlist.mean, varlist.fd, varlist.xdif, or varlist.ydif for each time point specified by the level variable in timevar.
  • data.fd: Data frame of fractionally differenced level variables for each time point specified in timevar, for the variables declared in varlist.fd or varlist.ydif. If arma was additionally specified for a variable, it contains the residuals of the ARMA model fitted after (fractional) differencing.
  • data.merged: Merged data frame which can subsequently be used to estimate the multilevel model. Consists of the original data, data.mean, data.fd, as well as the transformed variables specified in varlist.xdif and varlist.ydif.
  • d: Matrix of fractional differencing parameters estimated for the level variables (varlist.fd and varlist.ydif) as well as the estimation method used for each variable. Returns the specified value for d if it was supplied in the initial call of the function.
  • arma: List of arima results for each variable specified in the model call. Contains AR/MA estimates as well as the model residuals.
  • ecm: Output of the cointegration regression (returned if ecmformula is specified). The lagged residuals of the cointegration regression are included in data.fd and data.merged.

Details

  • The varlists varlist.fd, varlist.xdif, and varlist.ydif select variables from data for transformations according to the Arfima-MLM framework to prepare the estimation of the actual model.

    Adding variables in varlist.fd allows the user to select variables which are supposed to be transformed to a fractionally differenced level-variable (by aggregating individuals over each time point prior to fractionally differencing the series), or variables which are already included as a level-variable in the original dataset and are just supposed to be fractionally differenced before the multilevel model is estimated.

    For variables in varlist.xdif, the corresponding variables in data are simply filtered through the timepoint averages, where $X_{t}$ is the average of $x_{it}$ at timepoint $t$:

    $$x^{*}_{it} = x_{it} - X_{t}$$

    For variables in varlist.ydif (e.g. $y_{it}$), the function removes the daily deterministic component from the individual-level variable, such that it only consists of within-timepoint variation as well as between-timepoint variation that is not temporally autocorrelated. Here $Y_{t}$ is the timepoint average and $\Delta^{d_f}$ denotes fractional differencing of order $d_f$:

    $$y^{*}_{it} = y_{it} - (Y_{t} - \Delta^{d_f} Y_{t})$$

  • In order to prevent errors in the estimation procedure, none of the original variable names in data should include ".fd", ".xdif", or ".ydif".
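The two transformations described in the Details can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the routine used by arfimaPrep: the helper `frac_diff` is hypothetical and applies a truncated binomial expansion of the fractional difference operator, the differencing parameter is simply fixed at 0.4 rather than estimated, the series is not mean-centered first, and the column names `x.xdif` and `y.ydif` are chosen only for illustration.

```r
# Hypothetical helper: fractional difference (1 - B)^d of a series via the
# truncated binomial expansion (illustrative, not the package's routine)
frac_diff <- function(x, d) {
  n <- length(x)
  w <- numeric(n)
  w[1] <- 1
  # Recursion for the expansion weights: w[k] = w[k-1] * (k - 2 - d) / (k - 1)
  if (n > 1) for (k in 2:n) w[k] <- w[k - 1] * (k - 2 - d) / (k - 1)
  # Weighted sum of the current and all previous observations
  vapply(seq_len(n), function(i) sum(w[1:i] * x[i:1]), numeric(1))
}

# Toy data: 20 time points, 5 individuals each (simulated values)
set.seed(1)
toy <- data.frame(time = rep(1:20, each = 5))
toy$x <- rnorm(nrow(toy), mean = toy$time / 10)
toy$y <- rnorm(nrow(toy), mean = toy$time / 5)

# Within-timepoint means, repeated for every row of the same time point
x_bar <- ave(toy$x, toy$time)
y_bar <- ave(toy$y, toy$time)

# xdif: deviation of x[it] from its time-point mean X[t]
toy$x.xdif <- toy$x - x_bar

# ydif: y[it] - (Y[t] - fractionally differenced Y[t]), with d assumed to be 0.4
Y_t  <- as.numeric(tapply(toy$y, toy$time, mean))  # one mean per time point
Y_fd <- frac_diff(Y_t, d = 0.4)
idx  <- as.integer(factor(toy$time))               # row -> time-point index
toy$y.ydif <- toy$y - (y_bar - Y_fd[idx])

head(toy)
```

By construction, the time-point means of `x.xdif` are zero, and the time-point means of `y.ydif` equal the fractionally differenced series of means, so only within-timepoint variation and the differenced between-timepoint variation remain.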

References

Lebo, M. and Weber, C. 2015. "An Effective Approach to the Repeated Cross Sectional Design." American Journal of Political Science 59(1): 242-258.

See Also

fracdiff, hurstSpec, fd, and ArfimaMLM for a package overview.

Examples

library(ArfimaMLM) # provides arfimaPrep()
library(fractal)
library(fracdiff)  # provides fracdiff.sim()

### set basic parameters for simulation
t <- 100 # number of time points
n <- 500 # number of observations within time point
N <- t*n # total number of observations

### generate fractional ARIMA Time Series for y_t, x1_t, z1_t, z2_t
set.seed(123)
y_t <- fracdiff.sim(t, d=0.4, mu=10)$series
x1_t <- fracdiff.sim(t, d=0.3, mu=5)$series
z1_t <- fracdiff.sim(t, d=0.1, mu=2)$series
z2_t <- fracdiff.sim(t, d=0.25, mu=3)$series

### simulate data
data <- data.frame(time = rep(seq_len(t), each = n))
data$x1 <- rnorm(N,rep(x1_t,each=n),2)
data$x2 <- rnorm(N,0,40)
data$z1 <- rnorm(N,rep(z1_t,each=n),3)
data$z2 <- rep(z2_t,each=n)
b1 <- 0.2+rep(rnorm(t,0,0.1),each=n)
data$y <- (b1*data$x1-0.05*data$x2+0.3*rep(z1_t,each=n)
            +0*data$z2+rnorm(N,rep(y_t,each=n),1))


### prepare datasets for model estimation

# basic example
dat1 <- arfimaPrep(data = data, timevar="time"
                   , varlist.mean = c("y","x1","z1","z2")
                   , varlist.fd = c("y", "z1","z2")
                   , varlist.xdif = "x1", varlist.ydif = "y")
                   
# including error correction mechanism
# change estimation method for differencing parameter for all variables
dat2 <- arfimaPrep(data = data, timevar="time"
                   , varlist.mean = c("y","x1","z1","z2")
                   , varlist.fd = c("y", "z1","z2")
                   , varlist.xdif = "x1", varlist.ydif = "y"
                   , d = "ML", ecmformula = y.mean ~ x1.mean
                   , decm="Sperio")
                   
# vary estimation method for differencing parameter between variables
# specify AR/MA models                   
dat3 <- arfimaPrep(data = data, timevar="time"
                   , varlist.mean = c("y","x1","z1","z2")
                   , varlist.fd = c("y", "z1","z2")
                   , varlist.xdif = "x1", varlist.ydif = "y"
                   , d=list(y="ML", z1="Sperio", z2=0.25)
                   , arma=list(y=c(1,0),z2=c(0,1)))

# specify AR/MA models while holding AR[2] fixed for y
dat4 <- arfimaPrep(data = data, timevar="time"
                   , varlist.mean = c("y","x1","z1","z2")
                   , varlist.fd = c("y", "z1","z2")
                   , varlist.xdif = "x1", varlist.ydif = "y"
                   , arma=list(y=list(c(1,3),0),z2=c(0,1)))                   

ls(dat1)
head(dat1$mean)
head(dat2$merged)
dat3$arma
