# NullDistribution

##### Specification of the Reference Distribution

Specification of the asymptotic, approximative (Monte Carlo) and exact reference distribution.

- Keywords
- htest

##### Usage

```
asymptotic(maxpts = 25000, abseps = 0.001, releps = 0)
approximate(nresample = 10000L, parallel = c("no", "multicore", "snow"),
            ncpus = 1L, cl = NULL, B)
exact(algorithm = c("auto", "shift", "split-up"), fact = NULL)
```

##### Arguments

- maxpts
an integer, the maximum number of function values. Defaults to `25000`.
- abseps
a numeric, the absolute error tolerance. Defaults to `0.001`.
- releps
a numeric, the relative error tolerance. Defaults to `0`.
- nresample
a positive integer, the number of Monte Carlo replicates used for the computation of the approximative reference distribution. Defaults to `10000L`.
- parallel
a character, the type of parallel operation: either `"no"` (default), `"multicore"` or `"snow"`.
- ncpus
an integer, the number of processes to be used in parallel operation. Defaults to `1L`.
- cl
an object inheriting from class `"cluster"`, specifying an optional parallel or snow cluster if `parallel = "snow"`. Defaults to `NULL`.
- B
deprecated, use `nresample` instead.
- algorithm
a character, the algorithm used for the computation of the exact reference distribution: either `"auto"` (default), `"shift"` or `"split-up"`.
- fact
an integer to multiply the response values with. Defaults to `NULL`.

##### Details

`asymptotic`, `approximate` and `exact` can be supplied to the
`distribution` argument of, e.g., `independence_test` to
provide control of the specification of the asymptotic, approximative (Monte
Carlo) and exact reference distribution, respectively.
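
As a minimal sketch of the three constructors in use, the following runs the same test under each reference distribution. The simulated data frame `dat` and the object names are illustrative assumptions, not part of the package.

```r
library(coin)

set.seed(42)
# Hypothetical two-sample data: 10 observations per group
dat <- data.frame(y = rnorm(20), g = gl(2, 10))

# Same test, three ways of specifying the reference distribution
it_asym  <- independence_test(y ~ g, data = dat,
                              distribution = asymptotic())
it_mc    <- independence_test(y ~ g, data = dat,
                              distribution = approximate(nresample = 5000))
it_exact <- independence_test(y ~ g, data = dat,
                              distribution = exact())

# Collect the p-values (as.numeric drops any attached attributes)
p <- c(asymptotic  = as.numeric(pvalue(it_asym)),
       approximate = as.numeric(pvalue(it_mc)),
       exact       = as.numeric(pvalue(it_exact)))
p
```

With only 20 observations the exact distribution is cheap to compute; for larger samples the asymptotic or approximative variants are usually the practical choice.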

The asymptotic reference distribution is computed using a randomised
quasi-Monte Carlo method (Genz and Bretz, 2009) and is applicable to arbitrary
covariance structures with dimensions up to 1000. See `GenzBretz` in
package mvtnorm for details on `maxpts`, `abseps` and `releps`.
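
The tolerance arguments matter mainly for multivariate statistics such as the maximum-type statistic. The sketch below tightens them for a three-group comparison; the specific values of `maxpts` and `abseps` are arbitrary choices for illustration, and `PlantGrowth` is the data set shipped with base R.

```r
library(coin)

# PlantGrowth: weight (numeric) by group (factor with 3 levels)
it <- independence_test(weight ~ group, data = PlantGrowth,
                        teststat = "maximum",
                        distribution = asymptotic(maxpts = 50000,
                                                  abseps = 1e-4))

# p-value from the asymptotic (multivariate normal) reference distribution
p_asym <- as.numeric(pvalue(it))
p_asym
```

Because the method is randomised, repeated calls may differ slightly within the requested tolerance unless the seed is fixed.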

The approximative (Monte Carlo) reference distribution is obtained by a
conditional Monte Carlo procedure, i.e., by computing the test statistic for
`nresample` random samples from all admissible permutations of the
response \(\mathbf{Y}\) within each block (Hothorn *et al.*, 2008). By
default, the distribution is computed using serial operation
(`parallel = "no"`). The use of parallel operation is specified by
setting `parallel` to either `"multicore"` (not available for MS
Windows) or `"snow"`. In the latter case, if `cl = NULL` (default)
a cluster with `ncpus` processes is created on the local machine unless a
default cluster has been registered (see `setDefaultCluster` in package
parallel) in which case that gets used instead. Alternatively, the use
of an optional parallel or snow cluster can be specified by `cl`. See
‘Examples’ and package parallel for details on parallel operation.
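
The idea of permuting the response within each block can be sketched in base R. This is a conceptual illustration only, not the implementation used by coin: the data, the difference-in-means statistic `stat` and the helper `permute_within_blocks` are all hypothetical.

```r
set.seed(2024)
n     <- 40
block <- gl(4, 10)     # 4 blocks of 10 observations each
x     <- gl(2, 1, n)   # alternating group labels, balanced within blocks
y     <- rnorm(n) + 0.5 * (x == "2")

# Illustrative test statistic: difference in group means
stat <- function(y, x) mean(y[x == "2"]) - mean(y[x == "1"])

# Randomly permute the response separately within each block
permute_within_blocks <- function(y, block) {
  ave(y, block, FUN = function(v) v[sample.int(length(v))])
}

nresample <- 2000L
observed  <- stat(y, x)
resampled <- replicate(nresample,
                       stat(permute_within_blocks(y, block), x))

# Two-sided Monte Carlo p-value
p_value <- mean(abs(resampled) >= abs(observed))
p_value
```

Permuting within blocks keeps each block's set of responses fixed, so the resampled statistics are conditional on the observed block structure.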

The exact reference distribution, currently available for univariate
two-sample problems only, is computed using either the shift algorithm
(Streitberg and Röhmel, 1984, 1986, 1987) or the split-up
algorithm (van de Wiel, 2001). The shift algorithm handles blocks pertaining
to, e.g., pre- and post-stratification, but can only be used with positive
integer-valued scores \(h(\mathbf{Y})\). The split-up algorithm can be
used with non-integer scores, but does not handle blocks. By default, an
automatic choice is made (`algorithm = "auto"`) but the shift and
split-up algorithms can be selected by setting `algorithm` to either
`"shift"` or `"split-up"`, respectively.
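
A small sketch of selecting the algorithm explicitly, assuming simulated two-sample data without ties or blocks (the data frame `dat` is hypothetical). Rank scores are positive integers, so the shift algorithm applies; the raw responses are non-integer, so the split-up algorithm is requested for them.

```r
library(coin)

set.seed(7)
# Hypothetical two-sample data: 8 observations per group, no blocks
dat <- data.frame(y = rnorm(16), g = gl(2, 8))

# Integer-valued rank scores: shift algorithm, explicit and automatic choice
wt_shift <- wilcox_test(y ~ g, data = dat,
                        distribution = exact(algorithm = "shift"))
wt_auto  <- wilcox_test(y ~ g, data = dat,
                        distribution = exact())   # algorithm = "auto"

# Non-integer scores (raw responses): split-up algorithm
ot_split <- oneway_test(y ~ g, data = dat,
                        distribution = exact(algorithm = "split-up"))

p_shift <- as.numeric(pvalue(wt_shift))
p_auto  <- as.numeric(pvalue(wt_auto))
p_split <- as.numeric(pvalue(ot_split))
c(shift = p_shift, auto = p_auto, split_up = p_split)
```

The two Wilcoxon calls test the same hypothesis with the same scores, so their exact p-values should agree regardless of which algorithm the automatic choice picks.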

##### Note

Starting with coin version 1.1-0, the default for `algorithm` is
`"auto"`, having identical behaviour to `"shift"` in previous
versions. In earlier versions of the package, `algorithm = "shift"`
silently switched to the split-up algorithm if non-integer scores were
detected, whereas the current version exits with a warning.

In versions of coin prior to 1.3-0, the number of Monte Carlo replicates
in `approximate()` was specified using the now deprecated `B`
argument. **This will be made defunct and removed in a future release.**
It has been replaced by the `nresample` argument (for conformity with the
libcoin, party and partykit packages).

##### References

Genz, A. and Bretz, F. (2009). *Computation of Multivariate Normal and
t Probabilities*. Heidelberg: Springer-Verlag.

Hothorn, T., Hornik, K., van de Wiel, M. A. and Zeileis, A. (2008).
Implementing a class of permutation tests: The coin package. *Journal of
Statistical Software* **28**(8), 1--23. 10.18637/jss.v028.i08

Streitberg, B. and Röhmel, J. (1984). Exact nonparametrics
in APL. *APL Quote Quad* **14**(4), 313--325.
10.1145/384283.801115

Streitberg, B. and Röhmel, J. (1986). Exact distributions
for permutations and rank tests: an introduction to some recently published
algorithms. *Statistical Software Newsletter* **12**(1), 10--17.

Streitberg, B. and Röhmel, J. (1987). Exakte verteilungen
für rang- und randomisierungstests im allgemeinen
c-stichprobenfall. *EDV in Medizin und Biologie* **18**(1), 12--19.

van de Wiel, M. A. (2001). The split-up algorithm: a fast symbolic method
for computing p-values of distribution-free statistics. *Computational
Statistics* **16**(4), 519--538. 10.1007/s180-001-8328-6

##### Examples

```
library(coin)  # provides cmh_test(), approximate() and the alzheimer data

## Approximative (Monte Carlo) Cochran-Mantel-Haenszel test
## Serial operation
set.seed(123)
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000))

## Multicore with 8 processes (not for MS Windows)
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "multicore", ncpus = 8))

## Automatic PSOCK cluster with 4 processes
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow", ncpus = 4))

## Registered FORK cluster with 12 processes (not for MS Windows)
fork12 <- parallel::makeCluster(12, "FORK") # set-up cluster
parallel::setDefaultCluster(fork12)         # register default cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow"))
parallel::stopCluster(fork12)               # clean-up

## User-specified PSOCK cluster with 8 processes
psock8 <- parallel::makeCluster(8, "PSOCK") # set-up cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow", cl = psock8))
parallel::stopCluster(psock8)               # clean-up
```

*Documentation reproduced from package coin, version 1.3-1, License: GPL-2*