Learn R Programming

fArma (version 3010.79)

LrdModelling: Long Range Dependence Modelling

Description

A collection and description of functions to investigate the long range dependence or long memory behavior of an univariate time series process. Included are functions to simulate fractional Gaussian noise and fractional ARMA processes, and functions to estimate the Hurst exponent by several different methods. The Functions and methods are: Functions to simulate long memory time series processes: ll{ fnmSim Simulates fractional Brownian motion, - mvn from the numerical approximation of the stochastic integral, - chol from the Choleski's decomposition of the covariance matrix, - lev using the method of Levinson, - circ using the method of Wood and Chan, - wave using the wavelet synthesis, fgnSim Simulates fractional Gaussian noise, - beran using the method of Beran, - durbin using the method Durbin and Levinson, - paxson using the method of Paxson, farimaSim simulates FARIMA time series processes. } Functions to estimate the Hurst exponent: ll{ aggvarFit Aggregated variance method, diffvarFit Differenced aggregated variance method, absvalFit aggregated absolute value (moment) method, higuchiFit Higuchi's or fractal dimension method, pengFit Peng's or variance of residuals method, rsFit R/S Rescaled Range Statistic method, perFit periodogram method, boxperFit boxed (modified) periodogram method, whittleFit Whittle estimator, hurstSlider Interactive Display of Hurst Estimates. } Function for the wavelet estimator: ll{ waveletFit wavelet estimator. }

Usage

fbmSim(n = 100, H = 0.7, method = c("mvn", "chol", "lev", "circ", "wave"),
    waveJ = 7, doplot = TRUE, fgn = FALSE)
fgnSim(n = 1000, H = 0.7, method = c("beran", "durbin", "paxson"))
farimaSim(n = 1000, model = list(ar = c(0.5, -0.5), d = 0.3, ma = 0.1),
    method = c("freq", "time"), ...) 
   
aggvarFit(x, levels = 50, minnpts = 3, cut.off = 10^c(0.7, 2.5), 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL)    
diffvarFit(x, levels = 50, minnpts = 3, cut.off = 10^c(0.7, 2.5), 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL) 
absvalFit(x, levels = 50, minnpts = 3, cut.off = 10^c(0.7, 2.5), moment = 1, 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL) 
higuchiFit(x, levels = 50, minnpts = 2, cut.off = 10^c(0.7, 2.5), 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL)
pengFit(x, levels = 50, minnpts = 3, cut.off = 10^c(0.7, 2.5), 
    method = c("mean", "median"), 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL)
rsFit(x, levels = 50, minnpts = 3, cut.off = 10^c(0.7, 2.5), 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL)
perFit(x, cut.off = 0.1, method = c("per", "cumper"), 
    doplot = FALSE, title = NULL, description = NULL)
boxperFit(x, nbox = 100, cut.off = 0.10, 
    doplot = FALSE, trace = FALSE, title = NULL, description = NULL)      
whittleFit(x, order = c(1, 1), subseries = 1, method = c("fgn", "farma"), 
    trace = FALSE, spec = FALSE, title = NULL, description = NULL)
hurstSlider(x = fgnSim())
  
waveletFit(x, length = NULL, order = 2, octave = c(2, 8), 
    doplot = FALSE, title = NULL, description = NULL)
        
## S3 method for class 'fHURST':
show(object)

Arguments

cut.off
[*Fit] - a numeric vector with the lower and upper cut-off points for the estimation. They should be chosen to define a linear range. The default values are c(0.7, 2.5), i.e. 10^0.7 and 10^2.5, respectively.
description
[*Fit] - a character string which allows for a brief description.
doplot
[*Fit] - a logical flag, by default FALSE. Should a plot be displayed?
fgn
[fbmSim] - a logical flag, if FALSE, the functions returns a FBM series otherwise a FGN series.
H
[fgnSim] - the Hurst exponent, a numeric value between 0.5 and 1, by default 0.7.
length
[waveletFit] - the length of data to be used, must be power of 2. If set to NULL, the previous power will be used.
levels
[*Fit] - the number of aggregation levels or number of blocks from which the variances or moments are computed.
method
[fbmSim] - the method how to generate the FBM time series sequence, one of the following five character strings: "mvn", "chol", "lev", "circ", or "wave". [fgnSim] -
minnpts
[*Fit] - the minimum number of points or blocksize to be used to estimate the variance or moments at any aggregation level.
model
a list with model parameters ar, ma and d. ar is a numeric vector giving the AR coefficients, d is an integer value giving the degree of differencing, and ma
moment
[absvalHurst] - an integer value, by default 1 which denotes absolute values. For values larger than one this argument determines what absolute moment should be calculated.
n
[fgnSim][farimaSim] - number of data points to be simulated, a numeric value, by default 1000.
nbox
[boxperFit] - is the number of boxes to divide the data into. A numeric value, by default 100.
object
an object of class fHurst.
octave
[waveletFit] - beginning and ending octave for estimation. An integer vector with two elements. By default c(2, 8). If the upper value is too large, it will be replaced by the maximum allowed value.
order
[waveletFit] - the order of the wavelet. An integer value, by default 2.
spec
[whittleFit] - Should the periodogram be returned? A logical flag, by default FALSE.
subseries
[whittleFit] - allows optionally to subdivide the series into subseries. A numeric value, by default 1.
title
a character string which allows for a project title.
trace
a logical value, by defaul FALSE. Should the estimation process be traced?
waveJ
[fbmSim] - an integer parameter for the simulation of FBM using the wavelet method.
x
[*Fit] - the numeric vector of data, an object of class timeSeries, or any other object which can be transofrmed into a numeric vector by the function as.vector.
...
arguments to be passed.

Value

  • fgnSim and farimaSim return a numeric vector of length n, the FGN or FARIMA series. *Fit returns an S4 object of class fHURST with the following slots:
  • @callthe function call.
  • @methoda character string with the selected method string.
  • @hursta list with at least one element, the Hurst exponent named H. Optional values may be the value of the fitted slope beta, or information from the fit.
  • @parametersa list with a varying number of elements describing the input parameters from the argument list.
  • @dataa list holding the input data.
  • @fita list object with all information of the fit.
  • @plota list object which holds information to create a plot of the fit.
  • @titlea character string with the name of the test.
  • @descriptiona character string with a brief description of the test.
  • waveletFit

Details

Functions to Simulate Long Memory Processes: Fractional Gaussian Noise: The function fgnSim simulates a series of fractional Gaussian noise, FGN. FGN provides a parsimonious model for stationary increments of a self-similar process parameterised by the Hurst exponent H and variance. Fractional Gaussian noise with H < 0.5 demonstrates negatively autocorrelated or anti-persistent behaviour, and FGN with H > 0.5 demonstrates 1/f , long memory or persistent behaviour, and the special case. The case H = 0.5 corresponds to the classical Gaussian white noise. One can select from three different methods. The first generator named "beran" uses the fast Fourier transform to generate the series based on SPLUS code written originally by J. Beran [1994]. The second generator named "durbin" produces a FGN series by using the Durbin-Levinson coefficients. The algorithm was reimplemented in pure S based on the C source code written by V. Teverovsky [199x]. The third generator named "paxson" was proposed by V. Paxson [199x], this approaximate method is a very fast and requires low storage. However, the algorithm reveals some weakness in the method which was discussed by D.A. Rolls [2001]. Fractional ARIMA Processes: The function farimaSim is a generator for fractional ARIMA time series processes. A Gaussian FARIMA(0,d,0) series can be created, where d is related to the the Hurst exponent H through d=H-0.5. This is a particular case of the more general Gaussian FARIMA(p,d,q) process which follows the same asymptotic relations for their autocovariance and the spectral density as do the Gaussian FARIMA(0,d,0) processes. Two different generators are implement in S. The first named "freq" works in the frequence domain and generates the series from the fast Fourier transform based on SPLUS code written originally by J. Beran [1994]. The second method creates the series in the time domain, therefore named "time". The algorithm was reimplemented in pure S based on the Fortran source code from the R's fracdiff package originally written by C. Fraley [1991]. Details for the algorithm are given in J Haslett and A.E. Raftery [1989]. Functions to Estimate the Hurst Exponent: These are 9 functions as described by M.S. Taqqu, V. Teverovsky, and W. Willinger [1995] to estimate the self similarity parameter and/or the intensity of long-range dependence in a time series. Aggregated Variance Method: The function aggvarFit computes the Hurst exponent from the variance of an aggregated FGN or FARIMA time series process. The original time series is divided into blocks of size m. Then the sample variance within each block is computed. The slope beta=2*H-2 from the least square fit of the logarithm of the sample variances versus the logarithm of the block sizes provides an estimate for the Hurst exponent H. Differenced Aggregated Variance Method: To distinguish jumps and slowly decaying trends which are two types of non-stationary, from long-range dependence, the function diffvarFit differences the sample variances of successive blocks. The slope beta=2*H-2 from the least square fit of the logarithm of the differenced sample variances versus the logarithm of the block sizes provides an estimate for the Hurst exponent H. Aggregated Absolute Value/Moment Method: The function absvalFit computes the Hurst exponent from the moments moment=M of absolute values of an aggregated FGN or FARIMA time series process. The first moment M=1 coincides with the absolute value method, and the second moment M=2 with the aggregated variance method. Again, the slope beta=M*(H-1) of the regression line of the logarithm of the statistic versus the logarithm of the block sizes provides an estimate for the Hurst exponent H. Higuchi or Fractal Dimension Method: The function higuchiFit implements a technique which is very similar to the absolute value method. Instead of blocks a sliding window is used to compute the aggregated series. The function involves the calculation the calculation of the length of a path and, in principle, finding its fractal Dimension D. The slope D=2-H from the least square fit of the logarithm of the expected path lengths versus the logarithm of the block (window) sizes provides an estimate for the Hurst exponent H. Peng or Variance of Residuals Method: The function pengFit uses the method described by peng. In Peng's variance of residuals method the series is also divided into blocks of size m. Within each block the cumulated sums are computed up to t and a least-squares line a+b*t is fitted to the cumulated sums. Then the sample variance of the residuals is computed which is proportional to m^(2*H). The "mean" or "median" are computed over the blocks. The slope beta=2*H from the least square provides an estimate for the Hurst exponent H. The R/S Method: The function rsFit implements the algorithm named rescaled range analysis which is dicussed for example in detail by B. Mandelbrot and Wallis [199x], B. Mandelbrot [199x] and B. Mandelbrot and M.S. Taqqu [199x]. The Periodogram Method: The function perFit estimates the Hurst exponent from the periodogram. In the finite variance case, the periodogram is an estimator of the spectral density of the time series. A series with long range dependence will show a spectral density with a lower law behavior in the frequency. Thus, we expect that a log-log plot of the periodogram versus frequency will display a straight line, and the slopw can be computed as 1-2H. In practice one uses only the lowest 10% of the frequencies, since the power law behavior holds only for frequencies close to zero. Varying this cut off may provide additional information. Plotting H versus the cut off, one should select that cut off where the curve flattens out to estimate H. This approach can be selected by the argument method="per". Alternatively we can select method="cumper". In this case, instead of using the periodgram itself, the cmulative periodgram will be investigated. The slope of the double logarithmic fit is given by 2-2H. More details can be found in the work of J. Geweke and S. Porter-Hudak [1983] and in Taqqu [?]. The Boxed or Modified Periodogram Method: The function boxperFit is a modification of the periodogram method. The algorithm devides the frequency axis into logarithmically equally spaced boxes and averages the periodogram values corresponding to the frequencies inside the box. The Whittle Estimator: The function whittleFit performs also a periodogram analysis. The algorithm is based on the minimization of a likelihood function defined in the frequency domain. For FGN and FARIMA(0,d,0) processes the parameter H or d is the unknown parameter which minimizes the function. This approach also allows to compute confidence intervals. Unlike the previous eight estimators the Whittle estimator is not a graphical method, it just returns the values of H or d together with their confidence intervals. The function allows also to investigate FARIMA(p,d,q) models, then the parameter set to be optimized is enlarged by the AR and MA coefficients. It is worth to remark, that the empirical series is required to be a Gaussian process and that the underlying form must be specified. The original functions were written by V. Teverovsky and W. Willinger for SPLUS calling internal functions written in C. The software can be found on M. Taqqu's home page: http://math.bu.edu/people/murad/ In addition the Whittle estimator uses SPlus functions written by J. Beran. They can be found in the appendix of his book or on the StatLib server: http://lib.stat.cmu.edu/S/ Note, all nine R functions and internal utility functions are reimplemented entirely in S. Functions to perform a Wavelet Analysis: The function waveletFit computes the Discrete Wavelet Transform, averages the squares of the coefficients of the transform, and then performs a linear regression on the logarithm of the average, versus the log of the scale parameter of the transform. The result should be directly proportional to H providing an estimate for the Hurst exponent.

References

Beran J. (1992); Statistics for Long-Memory Processes, Chapman and Hall, New York, 1994. Haslett J., Raftery A.E. (1989); Space-Time Modelling with Long-Memory Dependence: Assessing Ireland's Wind Power Resource, Applied Statistics 38, pp. 1--50.

Paxson V. (1995); Fast Approximation of Self-Similar Network Traffic, Technical report, LBL-36750/UC-405, Berkeley, and Computer Communcation Review27, p.5--18, 1997. Rolls D.A. (2001); Improved Fast Approximate Synthesis of Fractional Gaussian Noise, Thesis, Department of Mathematics and Statistics, Queen's University at Kingston, Kingston, Ontario, Canada, 5 pages. Taqqu M., et al. Hurst Exponent, Several Preprints.

Examples

Run this code
## fgnSim -
   par(mfrow = c(3, 1), cex = 0.75)  
   
   # Beran's Method:
   plot(fgnSim(n = 200, H = 0.75), type = "l",  
     ylim = c(-3, 3), xlab = "time", ylab = "x(t)", main = "Beran")
   
   # Durbin's Method:
   plot(fgnSim(n = 200, H = 0.75, method = "durbin"), type = "l",
     ylim = c(-3, 3), xlab = "time", ylab = "x(t)", main = "Durbin")
   
   # Paxson's Method:
   plot(fgnSim(n = 200, H = 0.75, method = "paxson"), type = "l",
     ylim = c(-3, 3), xlab = "time", ylab = "x(t)", main = "Paxson")

Run the code above in your browser using DataLab