wave: Weakly associated vectors sampling

Description

Select a spread sample from inclusion probabilities using the weakly associated vectors sampling method.

Usage

wave(
  X,
  pik,
  bound = 1,
  tore = FALSE,
  shift = FALSE,
  toreBound = -1,
  comment = FALSE,
  fixedSize = TRUE
)

Value

A vector of size $N$ with elements equal to 0 or 1. The value 1 indicates that the unit is selected, while the value 0 indicates that it is not.

Arguments

X: matrix representing the spatial coordinates.
pik: vector of the inclusion probabilities. The length should be equal to N.
bound: a scalar representing the bound to reach. See Details. Default is 1.
tore: an optional logical value indicating whether distances are computed on a torus. See Details. Default is FALSE.
shift: an optional logical value indicating whether a shift perturbation is applied. See Details. Default is FALSE.
toreBound: a numeric value that specifies the size of the grid. Default is -1.
comment: an optional logical value indicating whether information is printed during execution. Default is FALSE.
fixedSize: an optional logical value indicating whether a fixed sample size is imposed. Default is TRUE.

Details

The main idea is derived from the cube method (Deville and Tillé, 2004). At each step, the inclusion probabilities vector pik is randomly modified. This modification is carried out in a direction that best preserves the spreading of the sample.

A stratification matrix $\bf A$ is computed from the spatial weights matrix returned by the function wpik. The vector giving the direction is selected differently depending on whether $\bf A$ is full rank or not.

If matrix $\bf A$ is not full rank, a vector that is contained in the right null space is selected: $$ Null(\bf A) = \{ \bf x \in R^N | \bf A\bf x = \bf 0 \}. $$
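As an illustration of the rank-deficient case, the following minimal sketch extracts a vector from the right null space using base R's svd. The matrix A here is a small hypothetical example, not the stratification matrix built by the package, and the tolerance is an arbitrary choice made for this sketch.

```r
# Hypothetical rank-deficient matrix: row 3 is the sum of rows 1 and 2.
A <- rbind(c(1, 0, 1),
           c(0, 1, 1),
           c(1, 1, 2))

# Full SVD. Right singular vectors whose singular value is numerically
# zero span Null(A).
sv <- svd(A, nu = 0, nv = ncol(A))
tol <- sqrt(.Machine$double.eps) * max(sv$d)  # assumed tolerance
null_basis <- sv$v[, sv$d <= tol, drop = FALSE]

# Take one direction from the null space.
x <- null_basis[, 1]

# A %*% x should be zero up to rounding error.
max(abs(A %*% x))
```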

If matrix $\bf A$ is full rank, we find $\bf v$ and $\bf u$, the singular vectors associated with the smallest singular value $\sigma $ of $\bf A$, such that

$$ \bf A\bf v = \sigma \bf u,~~~~ \bf A^\top \bf u = \sigma \bf v.$$
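The two relations above can be checked numerically with base R's svd, which returns singular values in decreasing order, so the last column of each factor corresponds to the smallest one. The matrix below is a hypothetical full-rank example chosen for this sketch; how the package computes these vectors internally is not shown here.

```r
# Hypothetical full-rank matrix (its determinant is 19).
A <- rbind(c(2, 0, 1),
           c(0, 3, 1),
           c(1, 1, 4))

sv <- svd(A)
k <- length(sv$d)      # index of the smallest singular value
sigma <- sv$d[k]
u <- sv$u[, k]         # left singular vector
v <- sv$v[, k]         # right singular vector

# Both residuals should be zero up to rounding error.
res_right <- max(abs(A %*% v - sigma * u))
res_left  <- max(abs(t(A) %*% u - sigma * v))
```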

Vector $ \bf v $ is then centered to ensure a fixed sample size. At each step, the inclusion probabilities are modified and at least one component is set to 0 or 1. Matrix $\bf A $ is then updated from the new inclusion probabilities. The whole procedure is repeated until only one component remains that is not equal to 0 or 1.
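One way to picture a single modification step is the random move used in the flight phase of the cube method: go as far as possible along the direction, in one of the two orientations, with probabilities chosen so that the expected value of pik is unchanged. The sketch below follows that idea. The function update_step and its argument eps are hypothetical names introduced for illustration, and this is not necessarily how the package implements the step.

```r
# One random update of pik along direction v (hypothetical helper).
# Assumes v has at least one non-zero entry among the units whose
# probability is strictly between 0 and 1.
update_step <- function(pik, v, eps = 1e-9) {
  active <- pik > eps & pik < 1 - eps
  v[!active] <- 0                 # units already decided do not move
  pos <- v > eps
  neg <- v < -eps
  # Largest step along +v that keeps every entry in [0, 1].
  l1 <- min(c((1 - pik[pos]) / v[pos], -pik[neg] / v[neg]))
  # Largest step along -v that keeps every entry in [0, 1].
  l2 <- min(c(pik[pos] / v[pos], -(1 - pik[neg]) / v[neg]))
  # Choose the orientation so that E[new pik] = pik.
  if (runif(1) < l2 / (l1 + l2)) pik + l1 * v else pik - l2 * v
}

set.seed(1)
pik <- rep(0.4, 5)
v <- c(1, -2, 0.5, 1.5, -1)       # sums to zero, so sum(pik) is preserved
new_pik <- update_step(pik, v)
```

Whichever orientation is drawn, at least one entry of new_pik lands on 0 or 1, and because v sums to zero the total of the probabilities is unchanged.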

For more information on the options tore and toreBound, see distUnitk. If tore is set to TRUE and toreBound is not specified, toreBound is set to $$N^{1/p}$$ where $p$ is the number of columns of the matrix X.
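For a regular grid, this default corresponds to the side length of the grid. The short sketch below evaluates the formula for a 6 x 6 grid; the variable names are chosen for illustration and the computation is done by hand rather than by calling the package.

```r
# 6 x 6 grid of coordinates, one row per unit.
X <- as.matrix(expand.grid(1:6, 1:6))

N <- nrow(X)   # number of units, 36
p <- ncol(X)   # number of coordinate columns, 2

# Default bound described above: N^(1/p), the grid side length here.
tore_bound <- N^(1 / p)
```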

For more information on the option shift, see wpik.

If fixedSize is TRUE, the weakest associated vector is centered at each step of the algorithm. This ensures that the size of the selected sample equals the sum of the inclusion probabilities.
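The reason centering fixes the sample size is that a centered vector sums to zero, so moving pik along it leaves the sum of the probabilities untouched. The following sketch checks this numerically with an arbitrary direction and step length, both of which are assumptions made for the illustration.

```r
set.seed(42)
pik <- rep(0.3, 10)
v <- rnorm(10)                 # arbitrary direction
v_centered <- v - mean(v)      # centered direction, sums to zero
lambda <- 0.05                 # arbitrary step length

# Moving along the raw direction generally changes the total.
sum_raw <- sum(pik + lambda * v)

# Moving along the centered direction keeps the total equal to sum(pik).
sum_centered <- sum(pik + lambda * v_centered)
```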

References

Deville, J. C. and Tillé, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91(4), 893-912.

Examples


#------------
# Example 2D
#------------

library(WaveSampling) # provides wave()

N <- 50
n <- 15
pik <- rep(n/N,N)
X <- as.matrix(cbind(runif(N),runif(N)))
s <- wave(X,pik)

#------------
# Example 2D grid 
#------------

N <- 36 # 6 x 6 grid
n <- 12 # number of units selected
x <- seq(1,sqrt(N),1)
X <- as.matrix(cbind(rep(x,times = sqrt(N)),rep(x,each = sqrt(N))))
pik <- rep(n/N,N)
s <- wave(X,pik, tore = TRUE,shift = FALSE)

#------------
# Example 1D
#------------

N <- 100
n <- 10
X <- as.matrix(seq(1,N,1))
pik <- rep(n/N,N)
s <- wave(X,pik,tore = TRUE,shift =FALSE,comment = TRUE)
