
Installation

The fractalRegression package cannot currently be installed through CRAN (the Comprehensive R Archive Network; https://cran.r-project.org/package=fractalRegression). Once it is available there, it can be installed with the following:

install.packages("fractalRegression")

The development version is available on Aaron Likens’ GitHub (https://github.com/aaronlikens/fractalRegression) and can be installed using R devtools. This is a source package and requires compilation of C++ code. Windows users can install the RTools software package to get the necessary components: https://cran.r-project.org/bin/windows/Rtools/. Intel Mac users can install Xcode along with the Xcode command-line tools. Users with Apple silicon Macs will need to do a bit more fiddling to get the build chain working, including installing the R-recommended gfortran compiler. Some useful tips can be found here: https://mac.r-project.org/.

devtools::install_github("aaronlikens/fractalRegression")
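If the source build fails, a quick diagnostic is to ask R whether it can see a working compiler toolchain. The pkgbuild package (a devtools dependency) provides such a check; this is an optional step, not part of installing fractalRegression itself:

# Optional: confirm that build tools (Rtools / Xcode command-line tools) are visible to R
pkgbuild::check_build_tools(debug = TRUE)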

Introduction

The aim of the fractalRegression package, and the accompanying vignettes, is to show users how to use the various functions in the package to perform univariate mono- and multifractal analyses as well as bivariate fractal correlation and regression analyses.

Detrended Fluctuation Analysis (DFA)

In this first example, we apply DFA, originally developed by Peng et al. (1994), to simulated white noise, pink noise, and anti-correlated noise using the dfa() function.

White Noise Example

# Generate white noise
set.seed(23422345)
noise <- rnorm(5000)
scales <- logscale(scale_min = 16, scale_max = 1025, scale_ratio = 2)

# Run DFA on white noise
dfa.noise.out <- dfa(x = noise, order = 1, verbose = 1, scales=scales, scale_ratio = 2)

Since we simulated this white noise, we would expect our α to be close to 0.5. Indeed, the estimate from the above analysis is 0.5207799. Next, we use fgn_sim() to simulate fractional Gaussian noise with known properties.
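If you prefer to pull the exponent out programmatically rather than reading it from the plot output, the slope can be recovered from the scale-wise fluctuations with lm(). The element names log_scales and log_rms below are assumptions about the list returned by dfa(); run str(dfa.noise.out) to confirm the names in your installed version:

# Inspect the dfa() output (element names below are assumed, not guaranteed)
str(dfa.noise.out)
# The alpha exponent is the slope of log fluctuation on log scale
alpha.hat <- coef(lm(dfa.noise.out$log_rms ~ dfa.noise.out$log_scales))[2]
alpha.hat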

Pink Noise Example

# Generate pink noise
set.seed(34534534)
pink.noise <- fgn_sim(n = 5000, H = 0.9)

# Run DFA on pink noise
dfa.pink.out <- dfa(x = pink.noise, order = 1, verbose = 1, scales = scales, scale_ratio = 2)

Since we simulated these data with H = 0.9, we would expect our α to be close to this value. Again, the estimate from the above analysis is 0.833253. Next, we use fgn_sim() to generate some anti-correlated noise.

Anti-Correlated Fractional Gaussian Noise Example

# Generate anti-correlated noise 
set.seed(24633453)
anticorr.noise <- fgn_sim(n = 5000, H = 0.25)

# Run DFA on anti-correlated noise 
dfa.anticorr.out <- dfa(x = anticorr.noise, order = 1, verbose = 1, scales = scales, scale_ratio = 2)

As above, since we simulated with H = 0.25, we observe a close estimate of the α exponent: 0.2301879.

Now let’s take a look at the log-log plots for the three types of noise using the dfa.plot() function. Given the estimates above, we see that the slopes for white noise, pink noise, and anti-correlated noise conform to our expectations. In the figures below, the intercepts and slopes (i.e., the α exponents) are shown, along with the R².

par(mfrow=c(3,1))
dfa.plot(dfa.noise.out)
dfa.plot(dfa.pink.out)
dfa.plot(dfa.anticorr.out)

Multifractal Detrended Fluctuation Analysis

Now that we have run a mono-fractal analysis using dfa(), we want to expand this to look for evidence of multifractality using multifractal detrended fluctuation analysis (MF-DFA) developed by Kantelhardt et al. (2002). That is, we aim to determine whether there is evidence of multiple scaling relationships and interactions across scales. We can do this easily using the mfdfa() function.

# Run MF-DFA on simulated pink and white noise
mf.dfa.pink.out <- mfdfa(x = pink.noise, q = c(-5:5), order = 1, scales = scales)

mf.dfa.white.out <- mfdfa(x = noise, q = c(-5:5), order = 1, scales = scales)

Let’s first make sure that our α estimate is the same as before when q = 2. We can check this easily with the code below, and it looks good. For example, for the pink noise at q = 2, Hq = 0.833253, which is equal to our α = 0.833253 from above.

mf.dfa.pink.out$Hq[mf.dfa.pink.out$q==2]
## [1] 0.833253
mf.dfa.white.out$Hq[mf.dfa.white.out$q==2]
## [1] 0.5207799

Next, we are going to work with data included in the package (fractaldata). These data were originally provided by Ihlen (2012) and include a white noise time series, a monofractal time series, and a multifractal time series.

Now let’s run MF-DFA on these time series. In this case, we replicate the choice of parameters in Ihlen (2012): a q range of -5 to 5, order = 1, a minimum scale of 16, and a maximum scale of 1,024.

scales <- logscale(scale_min = 16, scale_max = 1025, scale_ratio = 1.1)
white.mf.dfa.out <- mfdfa(x = fractaldata$whitenoise, q = c(-5:5), order = 1, scales = scales)

mono.mf.dfa.out <- mfdfa(x = fractaldata$monofractal, q = c(-5:5), order = 1, scales = scales)

multi.mf.dfa.out <- mfdfa(x = fractaldata$multifractal, q = c(-5:5), order = 1, scales = scales)

A common way to assess whether there is evidence of multifractality is to examine a plot showing the Hq estimates for different values of q. If all the lines have the same slope, that provides evidence of monofractality. If there are distinct slopes, then there is evidence of multifractality. It’s also important to check that the relationships between log2(scale) and log2(Fq) are linear, implying scale invariance. If they are not, a higher-order polynomial detrending may be appropriate (see Kantelhardt et al., 2001); a quick way to check this is sketched below. We are going to use the mfdfa.plot() function to compare the monofractal and multifractal signals.
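If the log2(Fq) versus log2(scale) relationships look curved rather than linear, one simple check is to re-run mfdfa() with a higher detrending order and compare the resulting exponents. The sketch below is illustrative (it reuses the objects created above) and is not part of the original analysis:

# Quadratic (order = 2) detrending as a robustness check on the linear fit.
# If the q = 2 exponents barely change, linear detrending was adequate.
multi.mf.dfa.out2 <- mfdfa(x = fractaldata$multifractal, q = c(-5:5), order = 2, scales = scales)
cbind(order1 = multi.mf.dfa.out$Hq[multi.mf.dfa.out$q == 2],
      order2 = multi.mf.dfa.out2$Hq[multi.mf.dfa.out2$q == 2])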

# Let's first make a plot of the monofractal case
mfdfa.plot(mono.mf.dfa.out)
## Loading required package: colorRamps

Now let’s compare the above plot for the monofractal signal with the corresponding plot for the multifractal signal. Comparing the top-left panels of both plots, we see that the slopes of the lines for the multifractal signal are divergent, in contrast to the monofractal case.

mfdfa.plot(multi.mf.dfa.out)

It’s also common to examine the relationship between Hq and q, as well as between h and D(h). We can see this comparison in the bottom-right panels of the two figures above, along with the relative difference in the widths of the multifractal spectra.

A common metric for comparing multifractal spectra is the spectrum width (W), calculated as hmax − hmin. Let’s do this to compare the monofractal and multifractal time series. While it is quite clear from the figure above that the width of the multifractal spectrum for the multifractal signal is larger, let’s calculate it here anyway for the sake of completeness, and because spectrum width can be used as a dependent variable in statistical analyses.

mono.W <- max(mono.mf.dfa.out$h) - min(mono.mf.dfa.out$h)
multi.W <- max(multi.mf.dfa.out$h) - min(multi.mf.dfa.out$h)

We observe here that the W for the multifractal signal is 1.3566793 and for the monofractal signal, W is 0.0754753.

Detrended Cross-Correlation Analysis (DCCA)

Moving on from the univariate analyses: if we have two non-stationary signals and we want to examine the scale-wise fluctuations as well as the scale-wise cross-correlation of those fluctuations, we can use DCCA via the dcca() function. DCCA was originally developed by Podobnik and Stanley (2008), with the detrended cross-correlation coefficient introduced by Zebende (2011).
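For reference, the scale-wise coefficient reported as rho corresponds to Zebende’s (2011) detrended cross-correlation coefficient: the detrended covariance of the two series at scale s, normalized by the product of their detrended standard deviations, so that it is bounded between −1 and 1 at every scale:

$\rho_{DCCA}(s) = \frac{F^2_{DCCA}(s)}{F_{DFA,x}(s)\,F_{DFA,y}(s)}$

where $F^2_{DCCA}(s)$ is the detrended covariance at scale $s$ and $F_{DFA,x}(s)$, $F_{DFA,y}(s)$ are the detrended fluctuation functions of the individual series.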

Simulate some data using a Mixed-correlated ARFIMA model (MC-ARFIMA).

First, however, we are going to simulate some data. The simulation functions are adapted from code available at Ladislav Kristoufek’s website (http://staff.utia.cas.cz/kristoufek/Ladislav_Kristoufek/Codes_files/MC-ARFIMA.R) and correspond to the following paper (Kristoufek, 2013). We implement a function that includes all of these options, called mc_ARFIMA().

The MC-ARFIMA models take the form of the two equations shown below:

$x_t = \alpha \sum^{+ \infty}_{n=0}a_n(d_1)\varepsilon_{1,t-n}+\beta \sum^{+ \infty}_{n=0}a_n(d_2)\varepsilon_{2,t-n}$

$y_t = \gamma \sum^{+ \infty}_{n=0}a_n(d_3)\varepsilon_{3,t-n}+\delta \sum^{+ \infty}_{n=0}a_n(d_4)\varepsilon_{4,t-n}$
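Here, the $a_n(d_i)$ terms are the ARFIMA moving-average coefficients associated with fractional differencing parameter $d_i$, and the $\varepsilon_{i,t}$ are innovation processes whose pairwise correlation (the rho argument in mc_ARFIMA() below) controls the cross-correlation between $x_t$ and $y_t$; see Kristoufek (2013) for the full specification.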

Simulate some data with the MC-ARFIMA model

In this case, we use the parameters from Kristoufek (2013) for Model 1 (p. 6487), in which the resulting two time series of length 10,000 exhibit long-range correlations (LRC) as well as long-range cross-correlations (LRCC).

set.seed(987345757)
sim1 <- mc_ARFIMA(process = 'Mixed_ARFIMA_ARFIMA', alpha = 0.2, beta = 1, gamma = 1, delta = 0.2, n = 10000, d1 = 0.4, d2 = 0.3, d3 = 0.3, d4 = 0.4, rho = 0.9)
plot(sim1[,1],type='l', ylab= "Signal Amplitude", xlab='Time', main = "MC-ARFIMA with LRC and LRCC")
lines(sim1[,2], col='blue')

Run DCCA on the MC-ARFIMA with LRC and LRCC

scales <- logscale(scale_min = 10, scale_max = 1000, scale_ratio = 1.1)
dcca.out.arfima <- dcca(sim1[,1], sim1[,2], order = 1, scales = scales)
dcca.out.arfima <- as.data.frame(dcca.out.arfima)
error <- sd(dcca.out.arfima$rho)/sqrt(length(dcca.out.arfima$rho))

# ggplot2 is used for plotting the scale-wise coefficients
library(ggplot2)
dcca.plot <- ggplot(data = dcca.out.arfima, aes(x = scales, y = rho)) +
  geom_point() +
  geom_line() +
  geom_pointrange(aes(ymin = rho - error, ymax = rho + error)) +
  ggtitle("DCCA on MC-ARFIMA processes with LRC and LRCC")
#  geom_smooth(method = lm, formula = y ~ poly(x, 2), se = TRUE)
dcca.plot

In the above figure, we see that the correlation between the MC-ARFIMA processes is consistently high and remains high at increasing time scales. Standard errors are plotted around each point. Now let’s contrast this with MC-ARFIMA processes with LRC and short-range cross-correlation (SRCC).

Simulate MC-ARFIMA model with LRC and SRCC

set.seed(423422)
sim2 <- mc_ARFIMA(process = "Mixed_ARFIMA_AR", alpha = 1, beta = 1, gamma = 1, delta = 1, n = 10000, d1 = 0.4, d2 = 0.4, theta1 = 0.8, theta2 = 0.8, rho = 0.9)
plot(sim2[,1],type='l', ylab= "Signal Amplitude", xlab='Time', main = "MC-ARFIMA with LRC and SRCC")
lines(sim2[,2], col='blue')

Run DCCA on the MC-ARFIMA with LRC and SRCC

scales <- logscale(scale_min = 10, scale_max = 1001, scale_ratio = 1.1)
dcca.out.arfima2 <- dcca(sim2[,1], sim2[,2], order = 1, scales = scales)
dcca.out.arfima2 <- as.data.frame(dcca.out.arfima2)
error <- sd(dcca.out.arfima2$rho)/sqrt(length(dcca.out.arfima2$rho))
dcca.plot2 <- ggplot(data = dcca.out.arfima2, aes(x = scales, y = rho)) +
  geom_point() +
  geom_line() +
  geom_pointrange(aes(ymin = rho - error, ymax = rho + error)) +
  ggtitle("DCCA on MC-ARFIMA processes with LRC and SRCC")
#  geom_smooth(method = lm, formula = y ~ poly(x, 2), se = TRUE)
dcca.plot2

In contrast to the previous DCCA analysis, the above figure shows a cross-correlation that starts high but then deviates and trends substantially lower at increasing scale sizes.

Multiscale Regression Analysis (MRA)

Let’s consider the above simulations and ask: what if we want to move from a scale-wise correlation framework to a scale-wise regression framework? Or, put differently, can we use the scale-wise fluctuations of one variable to predict the scale-wise fluctuations of the other? As with a traditional regression approach, we will use one of our variables as our predictor (xt) and the other as our outcome (yt).
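Conceptually, the scale-wise regression slope mirrors ordinary least squares at each scale $s$: it is the detrended covariance between $x_t$ and $y_t$ divided by the detrended variance of $x_t$. The expression below is a sketch of that idea rather than a statement of the package’s exact internals:

$\hat{\beta}(s) = \frac{F^2_{xy}(s)}{F^2_{x}(s)}$

where $F^2_{xy}(s)$ is the detrended covariance at scale $s$ and $F^2_{x}(s)$ is the detrended variance of the predictor.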

scales <- logscale(scale_min = 10, scale_max = 1001, scale_ratio = 1.1)
mra.out <- as.data.frame(mra(x = sim1[,1], y = sim1[,2], order = 1, scales = scales))
error <- sd(mra.out$betas)/sqrt(length(mra.out$betas))
mra.plot <- ggplot(data = mra.out, aes(x = scales, y = betas)) +
  geom_point() +
  geom_line() +
  geom_pointrange(aes(ymin = betas - error, ymax = betas + error)) +
  ggtitle("Multiscale Regression Analysis for MC-ARFIMA with LRC and LRCC")
#  geom_smooth(method = lm, formula = y ~ poly(x, 2), se = TRUE)
mra.plot

Generally, we observe that the β coefficients are relatively stable at increasing time scales, with a general, perhaps quadratically increasing, trend. Here it is also important to investigate the change in R² as well as the t-values. Below we see that the R² is quite high at most of the time scales, with R²min = 0.67 and R²max = 1.85, and all of the t-values are greater than the conventional cut-off of 1.96. So, between these two component ARFIMA processes, the output of MRA shows that much of the scale-specific variance in yt is explained and predicted by xt.

knitr::kable(mra.out, format = 'html', digits = 2, align = 'c', caption = "Output from MRA")
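If you only need the headline numbers rather than the full table, something like the following works. The column name t_observed appears later in this vignette; the name r2 for the R² column is an assumption, so confirm both with names(mra.out):

# Quick numeric summary of the MRA output (the column name r2 is assumed)
names(mra.out)
range(mra.out$r2)
sum(abs(mra.out$t_observed) > 1.96)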

Multiscale Lagged Regression Analysis (MLRA)

Building on MRA, we can add lagged relationships to the analysis using mlra() and see not only whether there are scale-wise variations in the β coefficients, but also whether those coefficients change at particular time lags.

scales <- logscale(scale_min = 10, scale_max = 1000, scale_ratio = 1.1)
mlra.out <- mlra(x = sim1[,1], y = sim1[,2],order = 1, scales = scales, lags=100, direction='p')
colfunc <- colorRampPalette(c("green", "red"))
graphics::matplot(mlra.out$betas, type='l', col = colfunc(49), ylab="Beta Coefficient", xlab='Lag', main="Multiscale Lagged Regression Analysis")

The figure above shows that there are high β values across scales only for lags near 0. But it’s difficult to see the scale-wise variation in this type of plot. Another option is to use image() or image.plot() (from the fields package) to visualize the results of the mlra() function. This plot more clearly shows a color gradient corresponding to the strength of the β coefficient across scales (y-axis) and at increasing lags (x-axis).

# image.plot() comes from the fields package
library(fields)
x <- 0:100
y <- scales
image.plot(x, y, mlra.out$betas, axes = TRUE, legend.lab = "Beta Coefficient", ylab = "Scale", xlab = "Lag", main = "Multiscale Lagged Regression Analysis")

#contour(x, y, mlra.out,levels=seq(0,1,by=1),add=TRUE,col='black')

An Empirical Example

So far we’ve demonstrated these analyses on simulated data. Next, we are going to analyze a pair of empirical hand-movement time series from a dyad; these are included in the crqa package and discussed in detail in Wallot et al. (2016). The data are hand-movement velocity profiles.

Load data from the crqa package and visualize it

require(crqa)
data(crqa)
head(handmovement)
##      P1_TT_d    P1_TT_n     P2_TT_d     P2_TT_n
## 1 0.22702423 0.26757616 0.006000000 0.000000000
## 2 0.47086091 0.23433310 0.008366600 0.006708204
## 3 0.14652304 0.16225289 0.006708204 0.006000000
## 4 0.40472830 0.02319483 0.009055385 0.010392305
## 5 0.07187489 0.04341659 0.007348469 0.006708204
## 6 0.10543719 0.06870953 0.005196152 0.007348469
plot(handmovement$P1_TT_d, type='l', main = "Dominant hand-movement velocity profiles of participants", xlab='time',ylab='movement velocity')
lines(handmovement$P2_TT_d, col=2)

MF-DFA on Empirical Data

Using the dominant-hand variables for Participant 1 (P1_TT_d) and Participant 2 (P2_TT_d), we are going to look first at the fractal dynamics. We are going to run mfdfa() and look at the q = 2 (monofractal) scaling as well as the multifractal scaling.

# Participant 1
q <- seq(-5,5, by=0.25)

scales <- logscale(scale_min = 16, scale_max = length(handmovement$P1_TT_d)/4, scale_ratio = 2)

mf.dfa.P1hand.out <- mfdfa(x = handmovement$P1_TT_d, q = q, order = 1, scales = scales)

# Participant 2
mf.dfa.P2hand.out <- mfdfa(x = handmovement$P2_TT_d, q = q, order = 1, scales = scales)

# Examine the alpha exponent for q=2, which is equivalent to running DFA
mf.dfa.P1hand.out$Hq[mf.dfa.P1hand.out$q==2]
## [1] 0.8788367
mf.dfa.P2hand.out$Hq[mf.dfa.P2hand.out$q==2]
## [1] 0.9174263

For Participant 1, we observe a value of 0.88 and for Participant 2 a value of 0.92, which suggests that both exhibit long-range correlations and that the signals approximate pink noise. Next, we want to examine the multifractal spectra.

# Create the plot 
plot(mf.dfa.P1hand.out$h,mf.dfa.P1hand.out$Dh, type='b', lwd=1, lty=2, pch=19,ylim=c(-0.4,1),xlim=c(-.8,.8), ylab="D(h)", xlab="h", main= "Multifractal Spectrum for Dominant Hand Movements")
lines(mf.dfa.P2hand.out$h,mf.dfa.P2hand.out$Dh, type='b', pch=19,lwd=3, col="blue")
legend(-.85,1, legend = c("P1", "P2"), col=c("black", "blue"), lwd=3)

P1.W <- max(mf.dfa.P1hand.out$h) - min(mf.dfa.P1hand.out$h)
P2.W <- max(mf.dfa.P2hand.out$h) - min(mf.dfa.P2hand.out$h)

When comparing multifractal spectrum width (W), hmax − hmin, we observe that both signals have a similar W. Specifically, Participant 1 W = NaN and Participant 2 W = NaN. From the figures and W estimates, there does appear to be scale-wise variation in the scaling exponents. However, a surrogate test, such as the one sketched below, could make this inference more robust.
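One simple version of such a test compares the observed spectrum width against widths obtained from surrogate series. The sketch below uses shuffled surrogates (which destroy all temporal structure); the package’s iaafft() function, which produces amplitude-adjusted phase-randomized surrogates, would be a stronger choice in practice. This is an illustrative sketch, not an analysis reported in the original vignette:

# Surrogate test sketch: does P1's spectrum width exceed that of shuffled copies?
set.seed(111)
n.surr <- 19
surr.W <- numeric(n.surr)
for (i in 1:n.surr) {
  shuffled <- sample(handmovement$P1_TT_d)  # shuffling destroys temporal ordering
  surr.out <- mfdfa(x = shuffled, q = q, order = 1, scales = scales)
  # na.rm guards against undefined h estimates at extreme q
  surr.W[i] <- max(surr.out$h, na.rm = TRUE) - min(surr.out$h, na.rm = TRUE)
}
obs.W <- max(mf.dfa.P1hand.out$h, na.rm = TRUE) - min(mf.dfa.P1hand.out$h, na.rm = TRUE)
# Proportion of surrogates with a narrower spectrum than the observed series
mean(obs.W > surr.W)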

DCCA on Empirical Data

Next, we are going to examine the scale-wise cross-correlation between these two hand-movement time series. But first, let’s see whether they are cross-correlated in general.

ccf(handmovement$P1_TT_d, handmovement$P2_TT_d, lag.max = 500, main = "Cross-correlation function of P1 and P2 Hand Movements")

It appears that there might be some lagged relationship between the two signals. Now we can run dcca() and examine the scale-wise detrended cross-correlation coefficients.

# Set scales 
scales <- seq(15, 1000, by = 5)
# Note that a small amount of noise was added to these time series to avoid processor specific issues.
# This is not a necessary step for typical analyses
p1 = handmovement$P1_TT_d + rnorm(1, 0,.001)
p2 = handmovement$P2_TT_d + rnorm(1, 0,.001)
dcca.out = dcca(x = p1, y = p2, order = 1, scales = scales)
dcca.out <- as.data.frame(dcca.out)
# dcca.plot <- ggplot(data=dcca.out, aes(x=scales,y=rho)) + geom_point() +geom_line()
plot(dcca.out$scales, dcca.out$rho, type = 'b', pch = 16, xlab = 'Scale',
     ylab = expression(rho))

# dcca.plot

At smaller scales, ρ approximates zero. However, at increasing scale sizes, the ρ estimates trend negative and exhibit quite large fluctuations.

MRA on Empirical Data

Building on the above DCCA results, we use mra() to determine whether we can use the scale-wise fluctuations in Participant 2’s hand movements to predict those of Participant 1.

scales <- seq(15, 1000, by = 5)
p1 = handmovement$P1_TT_d + rnorm(1, 0, .001)
p2 = handmovement$P2_TT_d + rnorm(1, 0, .001)
mra.out <- as.data.frame(mra(x = p1, y = p2, order = 2, scales = scales))
mra.plot <- ggplot(data=mra.out, aes(x=scales,y=betas)) + geom_point() +geom_line()
mra.plot

In the table below, we keep only those β coefficients whose corresponding t-values exceed ±1.96, to provide an index of how many scales there are at which Participant 2’s hand movements are significant predictors of Participant 1’s hand movements.

mra.out.sig <- subset(mra.out, abs(mra.out$t_observed) > 1.96)
knitr::kable(mra.out.sig, format='html', digits =2,align='c',caption = "Output from MRA on Handmovement Data")

Lastly, let’s take a look at the MLRA plot. Below we can see that most of the highest β coefficients occur at very short lags; however, there is considerable variability across scales.

scales <- logscale(scale_min = 10, scale_max = 1000, scale_ratio = 1.1)
mlra.out.emp <- mlra(x = handmovement$P1_TT_d, y =  handmovement$P2_TT_d,order = 1, scales = scales, lags=100, direction='p')
x <- 0:100
y <- scales
image.plot(x, y, mlra.out.emp$betas, axes=TRUE, legend.lab = "Beta Coefficient", ylab="Scale", xlab="Lag", main="Multiscale Lagged Regression Analysis Hand Movements")

#contour(x, y, mlra.out,levels=seq(0,1,by=1),add=TRUE,col='black')

References

  • Ihlen, E. A. F. (2012). Introduction to Multifractal Detrended Fluctuation Analysis in Matlab. Frontiers in Physiology, 3. https://doi.org/10.3389/fphys.2012.00141
  • Kantelhardt, J. W., Koscielny-Bunde, E., Rego, H. H. A., Havlin, S., & Bunde, A. (2001). Detecting long-range correlations with detrended fluctuation analysis. Physica A: Statistical Mechanics and Its Applications, 295(3), 441–454. https://doi.org/10.1016/S0378-4371(01)00144-3
  • Kantelhardt, J. W., Zschiegner, S. A., Koscielny-Bunde, E., Havlin, S., Bunde, A., & Stanley, H. E. (2002). Multifractal detrended fluctuation analysis of nonstationary time series. Physica A: Statistical Mechanics and Its Applications, 316(1), 87–114. https://doi.org/10.1016/S0378-4371(02)01383-3
  • Kristoufek, L. (2013). Mixed-correlated ARFIMA processes for power-law cross-correlations. Physica A: Statistical Mechanics and Its Applications, 392(24), 6484–6493. https://doi.org/10.1016/j.physa.2013.08.041
  • Peng, C.-K., Buldyrev, S. V., Havlin, S., Simons, M., Stanley, H. E., & Goldberger, A. L. (1994). Mosaic organization of DNA nucleotides. Physical Review E, 49(2), 1685–1689. https://doi.org/10.1103/PhysRevE.49.1685
  • Podobnik, B., & Stanley, H. E. (2008). Detrended Cross-Correlation Analysis: A New Method for Analyzing Two Nonstationary Time Series. Physical Review Letters, 100(8), 084102. https://doi.org/10.1103/PhysRevLett.100.084102
  • Wallot, S., Mitkidis, P., McGraw, J. J., & Roepstorff, A. (2016). Beyond synchrony: Joint action in a complex production task reveals beneficial effects of decreased interpersonal synchrony. PLoS ONE, 11(12), e0168306.
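  • Zebende, G. F. (2011). DCCA cross-correlation coefficient: Quantifying level of cross-correlation. Physica A: Statistical Mechanics and Its Applications, 390(4), 614–618.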

