rsm.sample: Conditional Sampler for Regression-Scale Models

Description

Generates replicates of the MLEs of the parameters occurring in a regression-scale model, using as reference distribution the conditional distribution of the MLEs given the value of the ancillary.

Usage

rsm.sample(data = stop("no data given"), R = 10000,
           ran.gen = stop("candidate distribution is missing, with no default"),
           trace = TRUE, step = 100, …)

Arguments

data

A special conditional sampling data object. This object must be a list with the following elements:

anc: the vector containing the values of the ancillary, usually the Pearson residuals. Its length must equal the number of observations in the linear regression model.
X: the model matrix. It may be obtained by applying model.matrix to the fitted rsm object of interest. The number of observations must equal the length of the ancillary, and the number of covariates must equal the number of regression coefficients defined in the coef component.
coef: the vector of true values of the regression coefficients, that is, the values used in the simulation study.
disp: the true value of the scale parameter used in the simulation study.
family: a family.rsm object characterizing the error distribution of the linear regression model. The following generator functions are available in the marg package of the R package bundle hoa: student (Student's t), extreme (Gumbel or extreme value), logistic, logWeibull, logExponential, logRayleigh and Huber (Huber's least favourable). The demonstration file margdemo.R that accompanies the marg package shows how to create a new generator function.
fixed: a logical value. If TRUE the scale parameter is known.

The make.sample.data function can be used to create this data object from a fitted rsm model.
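When the data object is assembled by hand rather than with make.sample.data, the dimension constraints listed above are easy to get wrong. The following is a minimal sketch of a sanity check in base R. The helper check_sample_data and the toy values are hypothetical and not part of the package, and the family element is left as a placeholder where a real family.rsm object would go.

```r
# Hypothetical helper (not part of csampling): checks the dimension
# constraints stated above for a hand-built conditional sampling object.
check_sample_data <- function(data) {
  needed <- c("anc", "X", "coef", "disp", "family")
  absent <- setdiff(needed, names(data))
  if (length(absent) > 0)
    stop("missing elements: ", paste(absent, collapse = ", "))
  stopifnot(
    is.matrix(data$X),
    length(data$anc) == nrow(data$X),   # one ancillary value per observation
    length(data$coef) == ncol(data$X),  # one coefficient per covariate
    length(data$disp) == 1
  )
  # 'fixed' is optional; when absent the scale parameter is treated as unknown
  if (is.null(data$fixed)) data$fixed <- FALSE
  data
}

set.seed(1)
n <- 20
X <- cbind(1, rnorm(n))        # toy model matrix: intercept plus one covariate
toy <- list(
  anc    = rnorm(n),           # stand-in values, not real Pearson residuals
  X      = X,
  coef   = c(0, 1),
  disp   = 1,
  family = NULL                # placeholder: supply a family.rsm object here
)
toy <- check_sample_data(toy)
```

The check only covers shapes; it does not verify that family is a valid family.rsm object.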

R

the number of replicates. Default is 10000.

ran.gen

a function which describes how the candidate values used in the Metropolis-Hastings algorithm should be generated. It must be a function of at least two arguments. The first one is the data object data, and the second argument is R, the number of replicates required. Any other information needed may be passed through the … argument. The returned value should be an R times k matrix of simulated values. For the value of k see the Details section below.
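To make the required shape of the return value concrete, here is a minimal sketch of a candidate generator in base R. It is an illustration under stated assumptions, not a generator shipped with the package: the name ran.gen.indep, the tuning argument spread, and the choice of independent normal candidates for the coefficients with a log-normal candidate for the scale are all hypothetical.

```r
# Hypothetical candidate generator (not shipped with csampling).
# Assumptions: independent normal candidates centred at data$coef, a
# log-normal candidate centred at data$disp for the scale, and a tuning
# argument 'spread' that would be passed through '...'.
ran.gen.indep <- function(data, R, spread = 1, ...) {
  p <- length(data$coef)
  # candidate values for the regression coefficients, one column each
  beta <- matrix(rnorm(R * p, mean = rep(data$coef, each = R), sd = spread),
                 nrow = R, ncol = p)
  # density of each candidate row under the candidate distribution
  dens <- rep(1, R)
  for (j in seq_len(p))
    dens <- dens * dnorm(beta[, j], mean = data$coef[j], sd = spread)
  # a missing 'fixed' element is treated as FALSE (scale unknown)
  if (isTRUE(data$fixed))
    return(cbind(beta, dens))                 # R x (p + 1)
  sigma <- rlnorm(R, meanlog = log(data$disp), sdlog = spread)
  dens  <- dens * dlnorm(sigma, meanlog = log(data$disp), sdlog = spread)
  cbind(beta, sigma, dens)                    # R x (p + 2)
}

cand <- ran.gen.indep(list(coef = c(0, 1), disp = 1), R = 5)
dim(cand)   # 5 4
```

A heavier-tailed candidate is often preferable in practice; the normal and log-normal choices here only keep the sketch short.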

trace

a logical value; if TRUE, the iteration number is printed. Defaults to TRUE.

step

a numerical value defining after how many iterations to print the iteration number. Default is 100.

…

absorbs additional arguments to ran.gen. These are passed unchanged each time this function is called.

Value

The returned value is an object of class cs containing the following components:

sim

a matrix with R rows each of which contains a sample from the conditional distribution of the MLEs.

rho

the acceptance probabilities at each Metropolis-Hastings step, that is, the probabilities with which the candidate values drawn from the candidate generation distribution are accepted.

seed

the value of .Random.seed when rsm.sample was called.

data

the data as passed to rsm.sample.

R

the value of R as passed to rsm.sample.

call

the original call to rsm.sample.

Side Effects

The function rsm.sample creates the object .Random.seed if it does not already exist; otherwise its value is updated.
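Because the generator state lives in .Random.seed, a saved copy of it can be used to replay a stream of random numbers. The base R sketch below shows the general idiom with runif; presumably the seed component of the returned object can be restored in the same way to reproduce a run, but that is an assumption, not something this page states.

```r
set.seed(123)                      # makes sure .Random.seed exists
saved <- get(".Random.seed", envir = globalenv())
first <- runif(3)                  # advances the generator state

# restore the saved state and draw again: the same numbers come back
assign(".Random.seed", saved, envir = globalenv())
second <- runif(3)
identical(first, second)
```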

Demonstration

The file csamplingdemo.R contains code that can be used to run a conditional simulation study similar to the one described in Brazzale (2000, Section 7.3) using the data given in Example 3 of DiCiccio, Field and Fraser (1990).

Details

The rsm.sample function uses the Metropolis-Hastings algorithm to generate an ergodic chain whose equilibrium distribution is the conditional distribution of the MLEs given the ancillary. Because of the broad applicability of this algorithm, the candidate generation density is not built in but has to be supplied by the user through the ran.gen argument. The output of this function must be an R times k matrix, where k = p + 1 if the scale parameter is fixed and k = p + 2 if it is unknown, p being the number of regression coefficients. The first p columns contain the MLEs of the regression coefficients, the next column contains the MLE of the scale parameter if it is unknown, and the last column contains the probabilities of the candidate values drawn from the candidate generation distribution. Note that these probabilities need only be calculated up to a normalizing constant.
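Since ran.gen draws all R candidates up front, the scheme resembles an independence Metropolis-Hastings sampler. The toy sketch below illustrates that idea on a one-dimensional target in base R; it is not the code used by rsm.sample, and the target, the Student t candidate and all variable names are assumptions chosen for illustration. It also shows why normalizing constants are not needed: only ratios of target to candidate density enter the acceptance probability.

```r
# Toy independence Metropolis-Hastings sampler; NOT the code used by rsm.sample.
# Target: unnormalised standard normal density. Candidates: Student t, 3 df.
set.seed(42)
R <- 2000
target <- function(x) exp(-x^2 / 2)     # known only up to a constant
cand   <- rt(R, df = 3)                 # all candidates drawn up front
q.dens <- dt(cand, df = 3)              # candidate densities
w      <- target(cand) / q.dens         # target / candidate ratios

chain <- numeric(R)
rho   <- numeric(R)
chain[1] <- cand[1]
rho[1]   <- 1
cur.w    <- w[1]
for (i in 2:R) {
  rho[i] <- min(1, w[i] / cur.w)        # acceptance probability
  if (runif(1) < rho[i]) {
    chain[i] <- cand[i]                 # accept the candidate
    cur.w    <- w[i]
  } else {
    chain[i] <- chain[i - 1]            # reject: repeat the current value
  }
}
mean(rho)                               # average acceptance probability
```

Multiplying target by any positive constant leaves rho unchanged, because the constant cancels in w[i] / cur.w.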

All information is supplied through the data argument, which must follow the structure described above. If a conditional simulation is to be performed for a fitted rsm object, the make.sample.data function can be used to generate this special object. It is advisable to specify the logical switch fixed in the conditional sampling object, although this is not required; if it is omitted, the scale parameter is assumed to be unknown.

The conditional simulation (cs) object generated by rsm.sample contains all information necessary for further investigation, such as the derivation of the conditional distribution of test statistics, the calculation of conditional coverage levels of confidence intervals and many more. As the computation is somewhat tricky, an example is given in the demonstration file csamplingdemo.R.

References

Brazzale, A. R. (2000) Practical Small-Sample Parametric Inference. Ph.D. Thesis N. 2230, Department of Mathematics, Swiss Federal Institute of Technology Lausanne.

DiCiccio, T. J., Field, C. A. and Fraser, D. A. S. (1990) Approximations of marginal tail probabilities and inference for scalar parameters. Biometrika, 77, 77-95.