kzs (version 1.2.0)

skzs: Spatial Kolmogorov-Zurbenko Spline

Description

SKZS uses splines to construct a smooth estimate of a single outcome variable over two input variables.

Usage

skzs(data, y, x1, x2, delta1, delta2, h1, h2, k = 1, show.edges = FALSE, plot = TRUE)

Arguments

data
a data frame of 3-dimensional points (X1, X2, Y) where Y is a one dimensional response variable and X = (X1, X2) are two-dimensional real values in the space R2. The data frame must contain a minimum of 3 columns (2 input variables and 1 resp
y
an integer specifying the column in the data frame containing the Y values; the values to be used as the response variable.
x1
an integer specifying the column in the data frame containing the X1 values; the first input variable.
x2
an integer specifying the column in the data frame containing the X2 values; the second input variable.
delta1
the physical range of smoothing for the input variable x1, expressed in the units of x1.
delta2
the physical range of smoothing for the input variable x2, expressed in the units of x2.
h1
the interval width of the uniform scale overlaying the x1 input variable. The outcomes of the algorithm are reported at the points of this scale.
h2
the interval width of the uniform scale overlaying the x2 input variable. The outcomes of the algorithm are reported at the points of this scale.
k
the number of iterations SKZS will execute; k may also be interpreted as the order of smoothness (as a polynomial of degree k-1). By default, k is set to perform a single iteration.
show.edges
a logical indicating whether or not to display the resulting data beyond the range of x1 and x2 values of the input data frame. If false, then the extended edges are suppressed. By default, this argument
plot
a logical indicating whether or not to return a 3-dimensional plot of the SKZS outcome. By default, this argument is set to true.
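
Because y, x1 and x2 are given as integer column positions rather than names, it can help to look the positions up by name before calling skzs. The following is a minimal sketch in base R; the data frame `dat` and its column names are hypothetical and serve only as an illustration.

```r
# Hypothetical data frame; the column names are arbitrary.
dat <- data.frame(
  lon  = c(0.0, 0.5, 1.0, 1.5),
  lat  = c(2.0, 2.5, 3.0, 3.5),
  temp = c(10.1, 10.4, 9.8, 10.0)
)

# Look up the integer positions of the response and the two inputs.
cols <- list(
  y  = match("temp", names(dat)),
  x1 = match("lon",  names(dat)),
  x2 = match("lat",  names(dat))
)
str(cols)
```

The values in `cols` could then be passed as the y, x1 and x2 arguments, which avoids hard-coding positions that change when columns are added or dropped.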

Value

A three-column data frame containing:

  • x1: the x1 coordinates of a grid determined by a uniform scale (defined by h1) overlaying the input x1 variable.
  • x2: the x2 coordinates of a grid determined by a uniform scale (defined by h2) overlaying the input x2 variable.
  • zk: the estimated values of the one-dimensional response variable, Y(x1, x2), after k iterations.
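
A long-format result with columns x1, x2 and zk can be folded into a matrix for use with base graphics functions such as persp or image. The sketch below uses a small stand-in data frame rather than real skzs output, and it assumes the result covers a complete rectangular grid (which may not hold exactly for real output).

```r
# Stand-in for an skzs() result: a long-format grid with columns x1, x2, zk.
# This is illustrative data, not output produced by the kzs package.
res <- expand.grid(x1 = seq(0, 1, by = 0.5), x2 = seq(0, 2, by = 1))
res$zk <- res$x1 + 10 * res$x2

# Order the rows so that x1 varies fastest, then fold zk into a matrix.
res <- res[order(res$x2, res$x1), ]
gx1 <- sort(unique(res$x1))
gx2 <- sort(unique(res$x2))
zmat <- matrix(res$zk, nrow = length(gx1), ncol = length(gx2))

# zmat[i, j] now holds the estimate at (gx1[i], gx2[j]), which is the
# layout that persp(gx1, gx2, zmat) and image(gx1, gx2, zmat) expect.
```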

Details

The relation between a response Y and inputs X = (x1, x2), as a function of the current value of X, is often the object of practical research. Usually we search for some simple function Y(x1, x2) given a data set of 3-dimensional points (Y, x1, x2). When plotted, these points are frequently noisy, so Y(x1, x2) should be a smooth outcome that captures the important patterns in the data while leaving out the noise.

SKZS estimates a solution to this problem through the use of splines, a particular nonparametric estimator of a function. Given a data set of 3-dimensional points, splines estimate smooth values of the response Y from the two input variables x1 and x2. SKZS averages all values of Y contained in a rectangle with sides delta1 and delta2 centered at the point (xk1, xk2), a particular point on the uniform scale overlaying the x1 and x2 axes.

The SKZS algorithm is designed to smooth all fast fluctuations in Y within the delta-range of (x1, x2), while leaving scales longer than delta1 and delta2 untouched. The separation of short scales (less than delta1, delta2) from long scales (more than delta1, delta2) becomes more effective with higher k, while the effective range of separation becomes delta(j) * sqrt(k), for j = 1, 2.
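
The rectangle averaging described above can be illustrated with a short base R sketch. This is not the kzs implementation: the helper names `rect_average` and `effective_range`, the choice of grid endpoints, and the use of NA for empty rectangles are all assumptions made for illustration, and only a single pass (k = 1) is shown.

```r
# Hypothetical helper: one pass of rectangle averaging on a uniform grid.
# Each grid point receives the mean of all y values whose (x1, x2) fall in
# a rectangle with sides delta1 and delta2 centered at that grid point.
rect_average <- function(x1, x2, y, delta1, delta2, h1, h2) {
  grid <- expand.grid(
    x1 = seq(min(x1), max(x1), by = h1),
    x2 = seq(min(x2), max(x2), by = h2)
  )
  grid$z <- mapply(function(g1, g2) {
    inside <- abs(x1 - g1) <= delta1 / 2 & abs(x2 - g2) <= delta2 / 2
    # Assumption: grid points with no data in their rectangle get NA.
    if (any(inside)) mean(y[inside]) else NA_real_
  }, grid$x1, grid$x2)
  grid
}

# Effective range of separation after k iterations, as stated in the text.
effective_range <- function(delta, k) delta * sqrt(k)

# Small demonstration on simulated noisy data.
set.seed(1)
x1 <- runif(500, 0, 10)
x2 <- runif(500, 0, 10)
y  <- sin(x1) + cos(x2) + rnorm(500, sd = 0.5)
sm <- rect_average(x1, x2, y, delta1 = 2, delta2 = 2, h1 = 0.5, h2 = 0.5)
head(sm)
```

With delta = 3 and k = 4, for instance, `effective_range(3, 4)` evaluates 3 * sqrt(4), so raising k widens the smoothing range more slowly than raising delta does.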

References

"Spline Smoothing." http://economics.about.com/od/economicsglossary/g/splines.htm

See Also

argskzs, kzs

Examples

# EXAMPLE - Estimating the Sinc function in the interval (-3pi, 3pi)
  # Load the kzs package (for skzs) and the lattice package (for wireframe)
  library(kzs)
  library(lattice)
  ### (1) Create a 3D plot of the signal to be estimated by SKZS
  
  # Create a random sample of size 250 for X = (X1, X2)
  u <- seq(-3*pi, 3*pi, 3*pi/50)
  v <- u			
  x1 <- sample(u, size = 250, replace=TRUE)  
  x2 <- sample(v, size = 250, replace=TRUE)
  
  # Store x1 and x2 into a data frame
  d <- data.frame(cbind(x1,x2))

  # Create the lattice of points from the unique sampled coordinates,
  # so that each (x1, x2) pair appears exactly once
  df <- expand.grid(x1 = sort(unique(x1)), x2 = sort(unique(x2)))
  
  # Apply the Sinc function to x1 and x2 and store the result as z
  df$z <- sin(sqrt(df$x1^2 + df$x2^2)) / sqrt(df$x1^2 + df$x2^2)
  df$z[is.na(df$z)] <- 1
  
  # Any point outside the circle of radius 3pi is set to 0. This 
  # provides a better picture of the outcome solely for the purposes
  # of this example.
  dst <- sqrt((df$x1 - 0)^2 + (df$x2 - 0)^2)
  df$dist <- dst	
  df$z[df$dist > 3*pi] <- 0

  # 3D plot of signal to be estimated
  wireframe(z ~ x1 * x2, df, main = "Signal to be estimated", drape = TRUE, 
  colorkey = TRUE, scales = list(arrows = FALSE))
  
  ### (2) Create a 3D plot of the signal buried in noise
  ez <- rnorm(length(df$z), mean = 0, sd = 1) * 1/2 
  df$z_noisy <- ez + df$z
  wireframe(z_noisy ~ x1 * x2, df, main = "Signal buried in noise", drape = TRUE, 
  colorkey = TRUE, scales = list(arrows = FALSE))
  
  ### (3) Create the data set to be used in SKZS --- n = 4000
  #	  same process as in (1)
  x1 <- sample(u, size = 4000, replace = TRUE)
  x2 <- sample(v, size = 4000, replace = TRUE)
  d <- data.frame(cbind(x1,x2))
  df <- unique(d)
  df$z <- sin(sqrt(df$x1^2 + df$x2^2)) / sqrt(df$x1^2 + df$x2^2)
  df$z[is.na(df$z)] <- 1
  
  dst <- sqrt((df$x1 - 0)^2 + (df$x2 - 0)^2)
  df$dist <- dst
  df$z[df$dist > 3*pi] <- 0
  
  ez <- rnorm(length(df$z),mean=0,sd=1)*1/2
  df$z_noisy <- ez + df$z
  # Keep only x1, x2 and z_noisy, which become columns 1, 2 and 3 of dfn
  dfn <- df[, c("x1", "x2", "z_noisy")]
  
  ### (4) Create a 2D view of the 3D plots above
  par(mfrow = c(2,1))
  plot(df$z ~ df$x1, main = "2D plot of the signal to be estimated\nn = 4000",
  xlab = "x", ylab = "Z(x)")
  plot(df$z_noisy ~ df$x1, main = "2D plot of the signal buried in noise\nn = 4000",
  xlab = "x", ylab = "Z(x)")
  
  ### (5) Execute SKZS on the data...arguments were chosen arbitrarily.  
  #	  Try other argument values to test the outcome
  skzs(data=dfn, y=3, x1=1, x2=2, delta1=3, delta2=3, h1=3*pi/60, h2=3*pi/60, k=1, 
  show.edges=FALSE, plot=TRUE)