Learn R Programming

ctbi

Please cite the following companion paper if you’re using the ctbi package: Ritter, F.: Technical note: A procedure to clean, decompose, and aggregate time series, Hydrol. Earth Syst. Sci., 27, 349–361, https://doi.org/10.5194/hess-27-349-2023, 2023.

The goal of ctbi is to clean, decompose, impute and aggregate univariate time series. Ctbi stands for Cyclic/Trend decomposition using Bin Interpolation : the time series is divided into a sequence of non-overlapping bins. The long-term trend is a linear interpolation of the mean values between successive bins and the cyclic component is the mean stack of detrended data within all bins. Outliers present in the residuals are flagged using an enhanced Boxplot rule (called Logbox) that is adapted to non-Gaussian data and keeps the type I error at $\frac{0.1}{\sqrt{n}}$ % (percentage of erroneously flagged outliers). Logbox replaces the original 1.5 constant with $A \times \log(n)+B+C/n$. The variable n is the sample size, $C = 36$ corrects for biases emerging in small samples, and A and B are automatically calculated on a predictor of the maximum tail weight. The strength of the cyclic pattern within each bin is quantified by a new metric, the Stacked Cycles Index (SCI), with SCI ~ 0 associated with no cyclicity and SCI = 1 a perfectly cyclic signal.

Installation

You can install the latest version of ctbi on CRAN.

Example

library(ctbi)
example1 <- data.frame(year = 1700:1988,sunspot = as.numeric(sunspot.year))
example1[sample(1:289,30),'sunspot'] <- NA # contaminate data with missing values
example1[c(5,30,50),'sunspot'] <- c(-50,300,400) # contaminate data with outliers
example1 <- example1[-(70:100),]
bin.period <- 11 # aggregation performed every 11 years (the year is numeric here)
bin.side <- 1989 # to capture the last year, 1988, in a complete bin
bin.FUN <- 'mean'
bin.max.f.NA <- 0.2 # maximum of 20% of missing data per bin
ylim <- c(0,Inf) # negative values are impossible

list.main <- ctbi(example1,bin.period=bin.period,
                       bin.side=bin.side,bin.FUN=bin.FUN,
                       ylim=ylim,bin.max.f.NA=bin.max.f.NA)
data0.example1 <- list.main$data0 # cleaned raw dataset
data1.example1 <- list.main$data1 # aggregated dataset.
mean.cycle.example1 <- list.main$mean.cycle # this data set shows a moderate seasonality
summary.bin.example1 <- list.main$summary.bin # confirmed with SCI = 0.50
summary.outlier.example1 <- list.main$summary.outlier

Copy Link

Version

Install

install.packages('ctbi')

Monthly Downloads

217

Version

2.0.5

License

GPL-3

Issues

Pull Requests

Stars

Forks

Maintainer

Francois Ritter

Last Published

January 20th, 2023

Functions in ctbi (2.0.5)

hidd.sum

hidd.sum
hidd.rel.time

hidd.rel.time
hidd.seq

hidd.seq
ctbi.timeseries

ctbi.timeseries
hidd.mean

hidd.mean
ctbi.plot

ctbi.plot
hidd.median

hidd.median
hidd.replace

hidd.replace
hidd.mad

hidd.mad
hidd.check.bin.period

hidd.check.bin.period
hidd.count.noNA

hidd.count.noNA
ctbi.cycle

ctbi.cycle
ctbi

ctbi
ctbi.long.term

ctbi.long.term
ctbi.outlier

ctbi.outlier
hidd.count.NA

hidd.count.NA
hidd.sd

hidd.sd