refine: Refine estimates iteratively.

Description

This is a generic function, currently with methods for SLik, SLik_j and SLikp objects (as produced by MSL). Depending on the value of its newsimuls argument, and on whether the function used to generate empirical distributions can be called by R, it (1) defines new parameter points and/or (2) infers their summary likelihood or tail probabilities for each parameter point independently, adds the inferred values as input for refined inference of the likelihood or P-value response surface, and provides new point estimates and confidence intervals.

Usage

# S3 method for SLik
refine(object, method=NULL, ...)


# S3 method for default
refine(object, surfaceData, Simulate =
            attr(surfaceData,"Simulate"), maxit = 1, n = NULL, 
            useEI = list(max=TRUE,profileCI=TRUE,rawCI=FALSE), 
            newsimuls = NULL, trypoints=NULL, CIs = useCI, useCI = TRUE, level = 0.95, 
            verbose = list(most=interactive(),final=NULL,movie=FALSE,proj=FALSE),
            precision = Infusion.getOption("precision"),
            nb_cores = NULL, packages=attr(object$logLs,"packages"), 
            env=attr(object$logLs,"env"), method,  using = object$using, 
            eval_RMSEs=TRUE, update_projectors = FALSE,
            cluster_args=list(),
            cl_seed=.update_seed(object),
            nbCluster=quote(refine_nbCluster(nr=nrow(data))),
            ...)

Value

refine returns an updated SLik or SLik_j object.

Arguments

object

an SLik or SLik_j object

surfaceData

A data.frame with attributes, usually taken from the object and thus not specified by user, usable as input for infer_surface.

Simulate

Character string: name of the function used to simulate samples. The only meaningful non-default value is NULL, in which case refine may return (if newsimuls is also NULL) a data frame of parameter points on which to run a simulation function.

maxit

Maximum number of iterative refinements (see also precision argument)

n

NULL or numeric, for a number of parameter points (excluding replicates and confidence interval points in the primitive workflow), whose likelihood should be computed (see n argument of sample_volume). This argument is typically not heeded in the first refinement iteration (only one fifth as many points may be produced), but will be closely approached in later ones (so four refinement iterations with n=1000 are expected to produce 3200 new points). If n is left NULL, the number of points of the initial reference table is used as a reference, but with a somewhat different effect: four refinement iterations starting from a reference table of 1000 points are expected to produce 4000 new points (though again, possibly only 200 in the first refinement iteration).

useEI

Cf this argument in rparam

newsimuls

For the SLik_j method, a matrix or data frame, with the same parameters and summary statistics as the data of the original infer_SLik_joint call.

For other methods, a list of simulated distributions of summary statistics, in the same format as for add_simulation. If no such list is provided (i.e., if newsimuls remains NULL), the attr(object$logLs,"Simulate") function is used (it is inherited from the Simulate argument of add_simulation through the initial sequence of calls of functions add_simulation, infer_logLs or infer_tailp, and infer_surface). If no such function is available, then this function returns parameters for which new distributions should be provided by the user.

trypoints

A data frame of parameters on which the simulation function attr(object$logLs,"Simulate") should be called to extend the reference table. Only for programming by expert users, because poorly thought-out trypoints could severely affect the inferences.

CIs

Boolean: whether to infer bounds of (one-dimensional, profile) confidence intervals. Their computation is not quite reliable in parameter spaces of large dimensions, so they should not be trusted per se, yet they may be useful for the definition of new parameter points.

useCI

Whether to include parameter points near the inferred confidence interval points in the set of points whose likelihood should be computed. Effective only if CIs is TRUE.

level

Intended coverage of confidence intervals

verbose

A list as shown by the default, or simply a vector of booleans. verbose$most controls whether to display information about progress and results, except plots; $final controls whether to plot() the final object to show the final likelihood surface. Default is to plot it only in an interactive session and if fewer than three parameters are estimated; $movie controls whether to plot() the updated object in each iteration; verbose$proj controls the verbose argument of project.character. If verbose is an unnamed vector of booleans, they are matched to as many elements from "most","movie","final","proj", in that order.
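The positional matching of an unnamed verbose vector can be pictured with the short sketch below. It only illustrates the rule stated above: match_verbose is a hypothetical helper written for this example, not a function of the package, and the package's internal handling may differ.

```r
# Hypothetical helper (not part of the package): gives names to an unnamed
# logical vector, in the order documented for the 'verbose' argument.
match_verbose <- function(verbose) {
  slots <- c("most", "movie", "final", "proj")
  stopifnot(length(verbose) <= length(slots))
  if (is.null(names(verbose))) {
    # First element -> "most", second -> "movie", and so on.
    names(verbose) <- slots[seq_along(verbose)]
  }
  as.list(verbose)
}

# Two unnamed booleans are matched to 'most' and 'movie'.
str(match_verbose(c(TRUE, FALSE)))
```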

precision

Requested local precision of surface estimation, in terms of prediction standard errors (RMSEs) of both the maximum summary log-likelihood and the likelihood ratio at any CI bound available. Iterations will stop when either maxit is reached, or if the RMSEs have been computed for the object (see eval_RMSEs argument) and this precision is reached for the RMSEs. A given precision on the CI bounds themselves might seem more interesting, but is not well specified by a single precision parameter if the parameters are on widely different scales.
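As a rough illustration of how maxit and precision interact, the sketch below encodes the stopping rule described above. should_stop is a hypothetical helper, not the package's code; it assumes that rmses is NULL when no RMSEs are available for the object, and that every available RMSE must be at or below precision.

```r
# Hypothetical stopping rule mirroring the description of 'maxit' and
# 'precision'. Assumptions: 'rmses' is NULL when RMSEs were not computed,
# and all available RMSEs must reach the requested precision.
should_stop <- function(iter, maxit, rmses = NULL, precision) {
  reached_maxit  <- iter >= maxit
  precise_enough <- !is.null(rmses) && all(rmses <= precision)
  reached_maxit || precise_enough
}

# Precision reached before maxit: iterations would stop.
should_stop(iter = 1, maxit = 5, rmses = c(0.08, 0.05), precision = 0.1)
# No RMSEs available: only maxit can stop the iterations.
should_stop(iter = 1, maxit = 5, rmses = NULL, precision = 0.1)
```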

nb_cores

Shortcut for cluster_args$spec for sample simulation.

cluster_args

A list of arguments for makeCluster, in addition to makeCluster's spec argument which is in most cases best specified by the nb_cores argument. Cluster arguments allow independent control of parallel computations for the different steps of a refine iteration (see Details; as a rough but effective summary, use only nb_cores when the simulations support it, and only cluster_args=list(project=list(num.threads=<.>)) when they do not).

packages

NULL or a list with possible elements add_simulation and logL_method, passed respectively as the packages arguments of add_simulation and infer_logLs, wherein they are the additional packages to be loaded on child processes. The default value keeps pre-refine values over iterations.

env

An environment, passed as the env argument to add_simulation. The default value keeps the pre-refine value over iterations.

using

Passed to infer_SLik_joint: a character string used to control the joint-density estimation method, as documented for that function. Default is to use the same method as in the first iteration, but this argument allows a change of method.

method

(A vector of) suggested method(s) for estimation of smoothing parameters (see method argument of infer_surface), and therefore controlling the primitive workflow (see using instead for controlling the up-to-date workflow). The ith element of the vector is used in the ith iteration, if available; otherwise the last element is used. This argument is not always heeded, in that REML may be used if the suggested method is GCV but it appears to perform poorly. The defaults for SLik and SLikp objects are "REML" and "PQL", respectively.
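The way a method vector is consumed across iterations can be sketched as follows. method_for_iteration is a hypothetical helper used only to illustrate the indexing rule stated above; it does not model the possible fallback from GCV to REML.

```r
# Hypothetical helper: the i-th element is used in the i-th iteration,
# and the last element is reused once the vector is exhausted.
method_for_iteration <- function(method, i) {
  method[min(i, length(method))]
}

# With two suggested methods and four iterations, the second method is
# reused from the second iteration onwards.
vapply(1:4, function(i) method_for_iteration(c("GCV", "REML"), i), character(1))
```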

eval_RMSEs

passed to MSL

update_projectors

Boolean; whether to update the projectors at each iteration.

cl_seed

NULL or integer, passed to add_simulation. The default code uses an internal function, .update_seed, to update it from a previous iteration.

nbCluster

Passed to infer_SLik_joint. The data in the expression for the default value refers to the data argument of the latter function.

...

further arguments passed to or from other methods. refine passes these arguments to the plot method suitable for the object.

Details

New parameter points are sampled as follows: the algorithm aims to sample uniformly the space of parameters contained in the confidence regions defined by the level argument, and to surround it by a region sampled proportionally to likelihood. In each iteration the algorithm aims to add as many points (say n) as computed in the first iteration, so that after k iterations of refine, there are $n * (k+1)$ points in the simulation table. However, when not enough points satisfy certain criteria, only n/5 points may be added in an iteration, this being compensated in further iterations. For example, if $n=600$, the table may include only 720 points after the first refine, but 1800 after the second.
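The table-size arithmetic in the paragraph above can be checked with a few lines of R. This is only a sketch of the stated formula: table_sizes is a hypothetical helper, and it assumes that a shortfall occurs only in the first iteration and is fully compensated in the second, as in the example with 600 points.

```r
# Hypothetical helper: size of the simulation table after each of k
# refine iterations, starting from n0 points.
# Assumption: a shortfall (only n0/5 points added) can occur in the first
# iteration only, and is fully compensated in the next one.
table_sizes <- function(n0, k, short_first = FALSE) {
  sizes <- n0 * (seq_len(k) + 1)            # n * (k + 1) after iteration k
  if (short_first && k >= 1) sizes[1] <- n0 + n0 / 5
  sizes
}

table_sizes(600, 2, short_first = TRUE)  # 720 after the first refine, 1800 after the second
```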

Independent control of parallelisation may be needed in the different steps, e.g. if the simulations are not easily parallelised whereas the projection method natively handles parallelisation. In the up-to-date workflow with the default ranger projection method, parallelisation controls may be passed to add_reftable for sample simulations, to project methods when projections are updated, and to MSL for RMSE computations (alternatively, for the primitive workflow, add_simulation, infer_logLs and MSL are called). nb_cores, if given and not overridden by other options, will control the simulation and projection steps (but not RMSE computation): nb_cores gives the number of parallel processes for sample simulation, with additional makeCluster arguments taken from cluster_args, but RMSE computations are performed serially. Further independent control is possible as follows:
cluster_args=list(project=list(num.threads=<.>)) allows control of the num.threads argument of ranger functions;
cluster_args=list(RMSE=list(spec=<number of 'children'>)) can be used to force parallel computation of RMSEs;
cluster_args=list(spec=<.>, <other makeCluster arguments>) would instead apply the same arguments to both reference table and RMSE computation, overriding the default effect of nb_cores in both of them; finally
cluster_args=list(reftable=list(<makeCluster arguments>),RMSEs=list(<makeCluster arguments>)) allows full independent control of parallelisation for the two computations.
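The options above can be written out as ordinary R lists, as in the sketch below. The element names (project, num.threads, RMSE, spec) are copied from the text above; the numbers of processes and threads are arbitrary, and slik_j in the commented call is a hypothetical fitted object, so that line is not run here.

```r
# Simulations parallelise well: specify only nb_cores.
args_parallel_sims <- list(nb_cores = 4L)

# Simulations do not parallelise: leave nb_cores unset and let the ranger
# projections use several threads instead.
args_threaded_proj <- list(
  cluster_args = list(project = list(num.threads = 4L))
)

# Force parallel computation of RMSEs, which are otherwise computed serially.
args_parallel_rmse <- list(
  cluster_args = list(RMSE = list(spec = 2L))
)

# Illustrative call only ('slik_j' is a hypothetical object from MSL):
# slik_j <- do.call(refine, c(list(slik_j, maxit = 2), args_threaded_proj))
```

Building the arguments as lists first makes it easy to inspect them, or to switch between the three setups, before passing them to refine.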

Examples


  ## see Note for links to examples.
