imputeTS v2.6

0

Monthly downloads

0th

Percentile

Time Series Missing Value Imputation

Imputation (replacement) of missing values in univariate time series. Offers several imputation functions and missing data plots. Available imputation algorithms include: 'Mean', 'LOCF', 'Interpolation', 'Moving Average', 'Seasonal Decomposition', 'Kalman Smoothing on Structural Time Series models', 'Kalman Smoothing on ARIMA models'.

Readme

Project Status: Active The project has reached a stable, usable state and is being actively developed. Build Status AppVeyor Build Status codecov CRAN Version CRAN Release CRAN Downloads

imputeTS: Time Series Missing Value Imputation

The imputeTS package specializes on (univariate) time series imputation. It offers several different imputation algorithm implementations. Beyond the imputation algorithms the package also provides plotting and printing functions of time series missing data statistics. Additionally three time series datasets for imputation experiments are included.

Installation

The imputeTS package can be found on CRAN. For installation execute in R:

 install.packages("imputeTS")

If you want to install the latest version from GitHub (can be unstable) run:

library(devtools)
install_github("SteffenMoritz/imputeTS")

Usage

  • Imputation

    To impute (fill all missing values) in a time series x, run the following command:

     na.interpolation(x)
    

    Output is the time series x with all NA's replaced by reasonable values.

    This is just one example for an imputation algorithm. In this case interpolation was the algorithm of choice for calculating the NA replacements. There are several other algorithms (see also under caption "Imputation Algorithms"). All imputation functions are named alike starting with na. followed by a algorithm label e.g. na.mean, na.kalman, ...

  • Plotting

    To plot missing data statistics for a time series x, run the following command:

     plotNA.distribution(x)
    

    This is also just one example for a plot. Overall there are four different types of missing data plots. (see also under caption "Missing Data Plots").

  • Printing

    To print statistics about the missing data in a time series x, run the following command:

     statsNA(x)
    
  • Datasets

    To load the 'heating' time series (with missing values) into a variable y and the 'heating' time series (without missing values) into a variable z, run:

     y <- tsHeating
     z <- tsHeatingComplete
    

    There are three datasets provided with the package, the 'tsHeating', the 'tsAirgap' and the 'tsNH4' time series. (see also under caption "Datasets").

Imputation Algorithms

Here is a table with available algorithms to choose from:

Function Description
na.interpolation Missing Value Imputation by Interpolation
na.kalman Missing Value Imputation by Kalman Smoothing
na.locf Missing Value Imputation by Last Observation Carried Forward
na.ma Missing Value Imputation by Weighted Moving Average
na.mean Missing Value Imputation by Mean Value
na.random Missing Value Imputation by Random Sample
na.remove Remove Missing Values
na.replace Replace Missing Values by a Defined Value
na.seadec Seasonally Decomposed Missing Value Imputation
na.seasplit Seasonally Splitted Missing Value Imputation

This is a rather broad overview. The functions itself mostly offer more than just one algorithm. For example na.interpolation can be set to linear or spline interpolation.

More detailed information about the algortihms and their options can be found in the imputeTS reference manual.

Missing Data Plots

Here is a table with available plots to choose from:

Function Description
plotNA.distribution Visualize Distribution of Missing Values
plotNA.distributionBar Visualize Distribution of Missing Values (Barplot)
plotNA.gapsize Visualize Distribution of NA gapsizes
plotNA.imputations Visualize Imputed Values

More detailed information about the plots can be found in the imputeTS reference manual.

Datasets

There are two datasets (each in two versions) available:

Dataset Description
tsAirgap Time series of monthly airline passengers (with NAs)
tsAirgapComplete Time series of monthly airline passengers (complete)
tsHeating Time series of a heating systems supply temperature (with NAs)
tsHeatingComplete Time series of a heating systems supply temperature (complete)
tsNH4 Time series of NH4 concentration in a wastewater system (with NAs)
tsNH4Complete Time series of NH4 concentration in a wastewater system (complete)

The tsAirgap, tsHeating and tsNH4 time series are with NAs. Their complete versions are without NAs. Except the missing values their versions are identical. The NAs for the time series were artifically inserted by simulating the missing data pattern observed in similar non-complete time series from the same domain. Having a complete and incomplete version of the same dataset is useful for conducting experiments of imputation functions.

More detailed information about the datasets can be found in the imputeTS reference manual.

Support

If you found a bug or have suggestions, feel free to get in contact via steffen.moritz10 at gmail.com

All feedback is welcome

Version

2.6

License

GPL-3

Functions in imputeTS

Name Description
na.replace Replace Missing Values by a Defined Value
na.kalman Missing Value Imputation by Kalman Smoothing and State Space Models
imputeTS-package imputeTS-package description
na.seadec Seasonally Decomposed Missing Value Imputation
na.locf Missing Value Imputation by Last Observation Carried Forward
na.ma Missing Value Imputation by Weighted Moving Average
na.random Missing Value Imputation by Random Sample
tsNH4 Time series of NH4 concentration in a wastewater system (with NAs)
na.mean Missing Value Imputation by Mean Value
tsNH4Complete Time series of NH4 concentration in a wastewater system (complete)
na.remove Remove Missing Values
na.interpolation Missing Value Imputation by Interpolation
plotNA.distributionBar Visualize Distribution of Missing Values (Barplot)
tsHeatingComplete Time series of a heating systems supply temperature (complete)
tsAirgap Time series of monthly airline passengers (with NAs)
plotNA.distribution Visualize Distribution of Missing Values
tsAirgapComplete Time series of monthly airline passengers (complete)
plotNA.gapsize Visualize Distribution of NA gap sizes
na.seasplit Seasonally Splitted Missing Value Imputation
tsHeating Time series of a heating systems supply temperature (with NAs)
plotNA.imputations Visualize Imputed Values
statsNA Print Statistics about Missing Values
No Results!

Vignettes of imputeTS

Name
Figures.d/Rlogo.png
Figures.d/distribution.pdf
Figures.d/distributionbar.pdf
Figures.d/gapsize.pdf
Figures.d/imputations.pdf
Figures.d/imputations2.pdf
Figures.d/sponsorlogo.jpg
Figures.d/statsna.png
Figures.d/tsairgap.png
RJournal.sty
imputeTS-Time-Series-Missing-Value-Imputation-in-R.Rnw
No Results!

Last month downloads

Details

Date 2018-03-20
LazyData yes
Type Package
ByteCompile TRUE
BugReports https://github.com/SteffenMoritz/imputeTS/issues
URL https://github.com/SteffenMoritz/imputeTS
Repository CRAN
License GPL-3
VignetteBuilder utils
RoxygenNote 6.0.1
LinkingTo Rcpp
NeedsCompilation yes
Packaged 2018-03-20 12:34:11 UTC; Steve
Date/Publication 2018-03-20 18:08:55 UTC

Include our badge in your README

[![Rdoc](http://www.rdocumentation.org/badges/version/imputeTS)](http://www.rdocumentation.org/packages/imputeTS)