baytrends (version 1.1.0)

fillMissing: Fill Missing Values

Description

Replace missing values in time-series data by interpolation.

Usage

fillMissing(x, span = 10, Dates = NULL, max.fill = 10)

Arguments

x

the sequence of observations. Missing values are permitted and will be replaced.

span

the maximum number of observations on each side of each range of missing values to use in constructing the time-series model. See Details.

Dates

an optional vector of dates/times associated with each value in x. Useful if there are gaps in dates/times.

max.fill

the maximum gap to fill.

Value

The observations in x with missing values replaced by interpolation.

Details

Missing values at the beginning and end of x will not be replaced.

The argument span is used to help set the range of values used to construct the StructTS model. If span is set small, then the variance of epsilon dominates and the estimates are not smooth. If span is large, then the variance of level dominates and the estimates are linear interpolations. The variances of level and epsilon are components of the state-space model used to interpolate values, see StructTS for details. See Note for more information about the method.

If span is set larger than 99, then the entire time series is used to estimate all missing values. This approach may be useful if there are many periods of missing values. If span is set to any number less than 4, then simple linear interpolation will be used to replace missing values.

Added from smwrBase.

References

Beauchamp, J.J., 1989, Comparison of regression and time-series methods for synthesizing missing streamflow records: Water Resources Bulletin, v. 25, no. 5, p. 961--975.

Elshorbagy, A.A., Panu, U.S., and Simonovic, S.P., 2000, Group-based estimation of missing hydrological data, I. Approach and general methodology: Hydrological Sciences Journal, v. 45, no. 6, p. 849--866.

See Also

tsSmooth, StructTS

Examples

Run this code
# NOT RUN {
#library(smwrData)
data(Q05078470)
# Create missing values in flow, the first sequence is a peak and the second is a recession
Q05078470$FlowMiss <- Q05078470$FLOW
Q05078470$FlowMiss[c(109:111, 198:201)] <- NA
# Interpolate the missing values
Q05078470$FlowFill <- fillMissing(Q05078470$FlowMiss)
# How did we do (line is actual, points are filled values)?
par(mfrow=c(2,1), mar=c(5.1, 4.1, 1.1, 1.1))
with(Q05078470[100:120, ], plot(DATES, FLOW, type="l"))
with(Q05078470[109:111, ], points(DATES, FlowFill))
with(Q05078470[190:210, ], plot(DATES, FLOW, type="l"))
with(Q05078470[198:201, ], points(DATES, FlowFill))
# }

Run the code above in your browser using DataLab