Windham: Windham Robustification of Point Estimators for Exponential Family Distributions

Description

Performs a generalisation of Windham's robustifying method windham1995roscorematchingad for exponential models with natural parameters that are a linear function of the parameters for estimation. Estimators must solve estimating equations of the form $$\sum_{i = 1}^n U(z_i; \theta) = 0.$$ The estimate is found iteratively through a fixed point method as suggested by windham1995ro;textualscorematchingad.

Usage

Windham(
  Y,
  estimator,
  ldenfun,
  cW,
  ...,
  fpcontrol = list(Method = "Simple", ConvergenceMetricThreshold = 1e-10),
  paramvec_start = NULL,
  alternative_populationinverse = FALSE
)

Value

A list:

paramvec the estimated parameter vector
optim information about the fixed point iterations and optimisation process. Including a slot finalweights for the weights in the final iteration.

Arguments

Y: A matrix of measurements. Each row is a measurement, each component is a dimension of the measurement.
estimator: A function that estimates parameters from weighted observations. It must have arguments Y that is a matrix of measurements and w that are weights associated with each row of Y. If it accepts arguments paramvec or paramvec_start then these will be used to specify fixed elements of the parameter vector and the starting guess of the parameter vector, respectively. The estimated parameter vector, including any fixed elements, must be the returned object, or the first element of a returned list, or as the paramvec slot within the est slot of the returned object.
ldenfun: A function that returns a vector of values proportional to the log-density for a matrix of observations Y and parameter vector theta.
cW: A vector of robustness tuning constants. When computing the weight for an observation the parameter vector is multiplied element-wise with cW. For the PPI model, generate cW easily using ppi_cW() and ppi_cW_auto().
...: Arguments passed to estimator.
fpcontrol: A named list of control arguments to pass to FixedPoint::FixedPoint() for the fixed point iteration.
paramvec_start: Initially used to check the function estimator. If estimator accepts a paramvec_start, then the current estimate of the parameter vector is passed as paramvec_start to estimator in each iteration.
alternative_populationinverse: The default is to use Windham_populationinverse(). If TRUE an alternative implementation in Windham_populationinverse_alternative() is used. So far we have not seen any difference between the results.

Details

For any family of models with density $f(z; \theta)$, Windham's method finds the parameter set $\hat\theta$ such that the estimator applied to observations weighted by $f(z; \hat\theta)^c$ returns an estimate that matches the theoretical effect of weighting the full population of the model. When $f$ is proportional to $\exp(\eta(\theta) \cdot T(z))$ and $\eta(\theta)$ is linear, these weights are equivalent to $f(z; c\hat\theta)$ and the theoretical effect of the weighting on the full population is to scale the parameter vector $\theta$ by $1+c$.

The function Windham() assumes that $f$ is proportional to $\exp(\eta(\theta) \cdot T(z))$ and $\eta(\theta)$ is linear. It allows a generalisation where $c$ is a vector so the weight for an observation $z$ is $$f(z; c \circ \theta),$$ where $\theta$ is the parameter vector, $c$ is a vector of tuning constants, and $\circ$ is the element-wise product (Hadamard product).

The solution is found iteratively windham1995roscorematchingad. Given a parameter set $\theta_n$, Windham() first computes weights $f(z; c \circ \theta_n)$ for each observation $z$. Then, a new parameter set $\tilde{\theta}_{n+1}$ is estimated by estimator with the computed weights. This new parameter set is element-wise-multiplied by the (element-wise) reciprocal of $1+c$ to obtain an adjusted parameter set $\theta_{n+1}$. The estimate returned by Windham() is the parameter set $\hat{\theta}$ such that $\theta_n \approx \theta_{n+1}$.

Examples

Run this code

if (requireNamespace("movMF")){
  Y <- movMF::rmovMF(1000, 100 * c(1, 1) / sqrt(2))
} else {
  Y <- matrix(rnorm(1000 * 2, sd = 0.01), ncol = 2)
  Y <- Y / sqrt(rowSums(Y^2))
}
Windham(Y = Y,
   estimator = vMF,
   ldenfun = function(Y, theta){ #here theta is km
     return(drop(Y %*% theta))
   },
   cW = c(0.01, 0.01),
   method = "Mardia")