SpatialVx (version 0.1-7)

CSIsamples: Forecast Verification with Cluster Analysis: The Variation

Description

A variation on cluster analysis for forecast verification as proposed by Marzban and Sandgathe (2008).

Usage

CSIsamples(x, ...)

## S3 method for class 'default'
CSIsamples(x, ..., xhat, nbr.csi.samples = 100, threshold = 20, k = 100,
    width = 25, stand = TRUE, z.mult = 0, hit.threshold = 0.1,
    max.csi.clust = 100, diss.metric = "euclidean",
    linkage.method = "average", verbose = FALSE)

## S3 method for class 'SpatialVx'
CSIsamples(x, ..., time.point = 1, model = 1, nbr.csi.samples = 100,
    threshold = 20, k = 100, width = 25, stand = TRUE, z.mult = 0,
    hit.threshold = 0.1, max.csi.clust = 100, diss.metric = "euclidean",
    linkage.method = "average", verbose = FALSE)

## S3 method for class 'CSIsamples'
summary(object, ...)

## S3 method for class 'CSIsamples'
plot(x, ...)

## S3 method for class 'summary.CSIsamples'
plot(x, ...)

## S3 method for class 'CSIsamples'
print(x, ...)

Arguments

x,xhat
default method: matrices giving the verification and forecast fields, resp.

SpatialVx method: x is an object of class SpatialVx.

plot, print methods: list object of class CSIsamples (or, for the corresponding plot method, of class summary.CSIsamples).

object
list object of class CSIsamples.
nbr.csi.samples
integer giving the number of samples to take at each level of the cluster analysis (CA).
threshold
numeric giving the value above which a grid point is considered an event.
k
numeric giving the value for centers in the call to kmeans.
width
numeric giving the size of the samples for each cluster sample.
stand
logical, should the data first be standardized before applying CA?
z.mult
numeric giving a value by which to multiply the z-component. If zero, then the CA is performed on locations only. Nonzero values can be used to give more or less weight to the actual values at these locations.
hit.threshold
numeric between zero and one giving the threshold for the proportion of a cluster that comes from the verification field rather than the forecast field. It is used to determine whether the cluster constitutes a hit, as opposed to a false alarm or a miss.
max.csi.clust
integer giving the maximum number of clusters allowed.
diss.metric
character giving which method to use in the call to dist (which dissimilarity metric should be used?).
linkage.method
character giving the name of a linkage method acceptable to the method argument from the hclust function of package fastcluster.
time.point
numeric or character indicating which time point from the SpatialVx verification set to select for analysis.
model
numeric indicating which forecast model to select for the analysis.
verbose
logical, should progress information be printed to the screen?
...
Not used by CSIsamples method functions.

summary method function: the argument silent may be specified, which is a logical stating whether to print the information to the screen (FALSE) or not (TRUE). If not given,
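
As a rough illustration of how threshold, stand and z.mult interact, the following base R sketch builds the kind of (x, y, z) input a cluster analysis could work on. The helper prep_field is hypothetical and not part of SpatialVx, and the order of standardizing and then multiplying the z-column is an assumption rather than the package's documented behaviour.

```r
# Hypothetical helper (not part of SpatialVx): turn a field into CA input.
prep_field <- function(field, threshold = 20, stand = TRUE, z.mult = 0) {
  # Grid locations whose value exceeds the threshold are the "events".
  idx <- which(field > threshold, arr.ind = TRUE)
  pts <- cbind(x = idx[, "row"], y = idx[, "col"], z = field[idx])
  # Assumed order: standardize every column, then weight the z-column.
  if (stand) pts <- scale(pts)
  pts[, "z"] <- pts[, "z"] * z.mult
  pts
}

set.seed(1)
field <- matrix(rexp(400, rate = 1 / 15), 20, 20)  # toy field, made-up values

loc_only <- prep_field(field, threshold = 20, z.mult = 0)    # locations only
weighted <- prep_field(field, threshold = 20, z.mult = 0.3)  # some intensity weight
```

With z.mult = 0 the z-column is identically zero, so it contributes nothing to a Euclidean distance and the clustering depends on location alone.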

Value

A list is returned by CSIsamples with components:

  • data.name: character vector giving the names of the verification and forecast fields analyzed, resp.
  • call: an object of class call giving the function call.
  • results: max.csi.clust by nbr.csi.samples matrix giving the calculated CSI for each sample and iteration of CA.

The summary method function invisibly returns the same list, but with the additional component:

  • csi: vector of length max.csi.clust giving the sample average CSI for each iteration of CA.

The plot method functions do not return anything; they create plots.
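
The relationship between the results matrix and the csi vector can be sketched in base R. The matrix below is made-up stand-in data with the documented shape, and the use of rowMeans with na.rm = TRUE is an assumption about how the sample average might be taken, not a statement of what the summary method does internally.

```r
set.seed(42)
max.csi.clust <- 5
nbr.csi.samples <- 4

# Stand-in for the results component: one row per number of clusters,
# one column per sample (values here are random, for illustration only).
results <- matrix(runif(max.csi.clust * nbr.csi.samples),
                  nrow = max.csi.clust, ncol = nbr.csi.samples)

# Sample-average CSI for each number of clusters.
csi <- rowMeans(results, na.rm = TRUE)
```

The resulting vector has one entry per row of results, which is what gets read as CSI by number of clusters.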

Details

This function carries out the procedure described in Marzban and Sandgathe (2008) for verifying forecasts. Effectively, it combines the verification and forecast fields (keeping track of which values belong to which field) and applies CA to the combined field. Each cluster is classified as a hit, a miss or a false alarm according to whether the proportion of its values belonging to the verification field falls within a certain range (defined by the hit.threshold argument). From this information, the CSI is calculated for each number of clusters (i.e., at each scale). A sampling scheme is used to speed up the process.
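
The core idea can be sketched with base R alone. This is a minimal illustration and not the package's implementation: it skips the sampling scheme and the k-means step, uses hclust from stats, and assumes a particular reading of hit.threshold (a cluster is a hit when its share of verification points lies between hit.threshold and 1 - hit.threshold). The data and the helper csi_at are hypothetical.

```r
set.seed(7)
# Toy event locations: 30 "observed" and 30 "forecast" points (made-up data).
obs  <- cbind(x = rnorm(30, mean = 2), y = rnorm(30, mean = 2))
fcst <- cbind(x = rnorm(30, mean = 3), y = rnorm(30, mean = 2))

# Combine the two fields, keeping track of where each point came from.
combined <- rbind(obs, fcst)
is_obs <- rep(c(TRUE, FALSE), c(nrow(obs), nrow(fcst)))

# Hierarchical clustering of the combined points.
tree <- hclust(dist(combined, method = "euclidean"), method = "average")

# Hypothetical helper: CSI when the tree is cut into n_clust clusters.
csi_at <- function(n_clust, hit.threshold = 0.1) {
  labels <- cutree(tree, k = n_clust)
  prop_obs <- tapply(is_obs, labels, mean)  # share of observed points per cluster
  # Assumed classification rule based on the observed share:
  hits         <- sum(prop_obs >= hit.threshold & prop_obs <= 1 - hit.threshold)
  false_alarms <- sum(prop_obs < hit.threshold)      # (almost) only forecast points
  misses       <- sum(prop_obs > 1 - hit.threshold)  # (almost) only observed points
  hits / (hits + misses + false_alarms)
}

# CSI as a function of the number of clusters (scale).
csi_by_scale <- sapply(1:10, csi_at, hit.threshold = 0.25)
```

With a single cluster, every point falls into one mixed cluster, so the CSI is 1; as the number of clusters grows, clusters become purer and the CSI typically drops.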

The plot and summary functions give the same information in different formats: CSI by number of clusters (scale).

References

Marzban, C. and Sandgathe, S. (2008) Cluster Analysis for Object-Oriented Verification of Fields: A Variation. Mon. Wea. Rev., 136 (3), 1013–1025.

See Also

hclust, kmeans, clusterer

Examples

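library(fields)     # provides Exp.image.cov and sim.rf
library(SpatialVx)

# Default method: simulate two random fields on a 100 x 100 grid.
grid <- list(x = seq(0, 5, , 100), y = seq(0, 5, , 100))
obj <- Exp.image.cov(grid = grid, theta = 0.5, setup = TRUE)
look <- sim.rf(obj)
look2 <- sim.rf(obj)

# nbr.csi.samples comes after ... in the signature, so it must be named.
res <- CSIsamples(x = look, xhat = look2, nbr.csi.samples = 10, threshold = 0,
                  k = 100, width = 2, z.mult = 0, hit.threshold = 0.25,
                  max.csi.clust = 75)
plot(res)
y <- summary(res)
plot(y)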

# SpatialVx method: build a verification set from the UK test-case data.
data(UKfcst6)
data(UKobs6)
data(UKloc)

hold <- make.SpatialVx(UKobs6, UKfcst6, thresholds=0,
    loc=UKloc, map=TRUE, field.type="Rainfall", units="mm/h",
    data.name=c("Nimrod", "obs 6", "fcst 6"))

res <- CSIsamples(hold, threshold=0, k=200, z.mult=0.3, hit.threshold=0.2,
                  max.csi.clust=150, verbose=TRUE)
plot(res)
summary(res)
y <- summary(res)
plot(y)
