pcops: Profile COPS Function (aka COPS Variant 2)

Description

Metaparameter selection for MDS models based on the Profile COPS approach (COPS Variant 2). It uses copstress for hyperparameter selection and is a special case of a STOPS model.

Usage

pcops(
  dis,
  loss = c("stress", "smacofSym", "smacofSphere", "strain", "sammon", "rstress",
    "powermds", "sstress", "elastic", "powersammon", "powerelastic", "powerstress",
    "sammon2", "powerstrain", "apstress", "rpowerstress"),
  weightmat = NULL,
  ndim = 2,
  init = NULL,
  theta = 1,
  stressweight = 1,
  cordweight,
  q = 2,
  minpts = ndim + 1,
  epsilon = 100,
  rang,
  optimmethod = c("ALJ", "pso", "SANN", "DIRECT", "DIRECTL", "stogo", "MADS", "hjk"),
  lower = 0.5,
  upper = 5,
  verbose = 0,
  scale = c("proc", "sd", "none", "std"),
  normed = TRUE,
  s = 4,
  stresstype = "default",
  acc = 1e-07,
  itmaxo = 200,
  itmaxi = 10000,
  ...
)

Value

A list with the components

copstress: the weighted loss value
OC: the OPTICS cordillera for the scaled configuration (as defined by scale)
optim: the object returned from the optimization procedure
stress: the stress (square root of stress.m)
stress.m: default normalized stress
parameters: the parameters used for fitting (kappa, lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
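
As a minimal sketch of how the returned list might be inspected, the snippet below refits the strain model from the Examples section and looks at a few of the components listed above. It assumes the cops and smacof packages are installed; the guard skips the fit otherwise. The vector name shown is only illustrative.

```r
# Components to inspect; names are taken from the Value list above.
shown <- c("copstress", "OC", "stress", "parameters")

# Guarded so the sketch is skipped when cops or smacof is not installed.
if (requireNamespace("cops", quietly = TRUE) &&
    requireNamespace("smacof", quietly = TRUE)) {
  dis <- as.matrix(smacof::kinshipdelta)
  set.seed(210485)
  res <- cops::pcops(dis, loss = "strain", lower = 0.1, upper = 5, minpts = 2)

  # Compact overview of the selected components of the result list.
  str(res[shown], max.level = 1)
}
```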

Arguments

dis

numeric matrix or dist object of proximities

loss

which loss function to be used for fitting, defaults to strain. Currently allows for the following models:

Power transformations of observed proximities only (theta must be scalar): Strain loss or classical scaling (strain; workhorse is cmdscale), Kruskal's stress for symmetric matrices (smacofSym or stress) and for scaling onto a sphere (smacofSphere; workhorse is smacof), Sammon mapping (sammon or sammon2; for the former the workhorse is sammon from MASS, for the latter it is smacof), elastic scaling (elastic; workhorse is smacof), Takane et al.'s s-stress (sstress; workhorse is powerstressMin)
Power transformations of fitted distances only (theta must be scalar): De Leeuw's r-stress rstress (workhorse is powerstressMin)
Power transformations of fitted distances and observed proximities (theta must be scalar or of length 2): Power MDS (powermds), Sammon mapping/elastic scaling with powers (powersammon, powerelastic)
Power transformations of fitted distances, observed proximities and weights (theta must be of length 3 at most): power stress (POST-MDS, powerstress), restricted power stress with equal transformations for distances and proximities (rpowerstress); workhorse is powerstressMin
Approximation to power stress (theta must be of length 2): Approximated power stress (apstress; workhorse is smacof)

weightmat

(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals

ndim

number of dimensions of the target space

init

(optional) initial configuration. If not supplied, the Torgerson scaling result of the dissimilarity matrix dis^theta[2]/enorm(dis^theta[2],weightmat) is used.
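
The default starting configuration described above can be sketched in base R. This is an illustration under assumptions, not the package's internal code: eurodist stands in for dis, theta is taken as a scalar, and enorm_sketch is a hypothetical stand-in for enorm, assumed here to be a weighted Euclidean norm.

```r
# Sketch of a Torgerson (classical scaling) starting configuration.
dis <- as.matrix(eurodist)         # stand-in dissimilarity matrix
theta <- 1                         # power applied to the proximities
ndim <- 2                          # dimension of the target space
weightmat <- 1 - diag(nrow(dis))   # weight 1 on all off-diagonal entries

# Hypothetical stand-in for enorm: weighted Euclidean norm (an assumption).
enorm_sketch <- function(x, w) sqrt(sum(w * x^2))

# Power-transform the proximities and normalize them.
delta <- dis^theta
delta <- delta / enorm_sketch(delta, weightmat)

# Classical scaling of the normalized proximities gives the start configuration.
init <- cmdscale(delta, k = ndim)
dim(init)
```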

theta

the theta vector of powers; see the corresponding cop_XXX function for which theta are allowed. If a scalar is given as argument, it will be recycled. Defaults to 1.

stressweight

weight to be used for the fit measure; defaults to 1

cordweight

weight to be used for the cordillera; if missing gets estimated from the initial configuration so that copstress = 0 for theta=c(1,1)
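
The interplay of stressweight and cordweight can be illustrated with a small arithmetic sketch. It assumes the weighted loss has the form stressweight * stress - cordweight * OC, which is consistent with choosing cordweight so that copstress is 0 at the initial configuration; the function copstress_sketch and all numeric values are hypothetical.

```r
# Assumed form of the weighted loss: stressweight * stress - cordweight * OC.
copstress_sketch <- function(stress, OC, stressweight = 1, cordweight) {
  stressweight * stress - cordweight * OC
}

# Hypothetical values for the initial configuration (theta = c(1, 1)).
stress_init <- 0.20
OC_init <- 0.50
stressweight <- 1

# Calibrate cordweight so that the initial copstress is zero.
cordweight <- stressweight * stress_init / OC_init

# Value at the initial configuration (zero up to rounding).
copstress_sketch(stress_init, OC_init, stressweight, cordweight)

# A candidate with equal stress but a higher cordillera gets a negative value,
# so it would be preferred when the loss is minimized.
copstress_sketch(0.20, 0.60, stressweight, cordweight)
```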

q

the norm of the cordillera; defaults to 2

minpts

the minimum points to make up a cluster in OPTICS; defaults to ndim+1

epsilon

the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 100

rang

range of the minimum reachabilities to be considered. If missing, it is found from the initial configuration by taking 1.5 times the maximal minimum reachability of the model with theta=c(1,1). If NULL, it is normed to each configuration's minimum and maximum distance, which gives an absolute value of goodness-of-clusteredness. Note that the latter is not necessarily desirable when comparing configurations for their relative clusteredness. See also cordillera.

optimmethod

Which general-purpose optimizer to use. Defaults to our adaptive LJ version ("ALJ"). Also allows particle swarm optimization with s particles ("pso"), simulated annealing ("SANN"), "DIRECT" and "DIRECTL", Hooke-Jeeves ("hjk"), StoGo ("stogo"), and "MADS". We recommend not using SANN and pso with the rstress, sstress and power stress models. We have had good experiences with ALJ, stogo, DIRECT, DIRECTL and MADS.

lower

A vector of the lower box constraints of the search region. Its length must match the length of theta.

upper

A vector of the upper box constraints of the search region. Its length must match the length of theta.

verbose

numeric value that controls how much information on the fitting process is printed; >2 is extremely verbose. Note that for models with some parameters fixed, the iteration progress of the optimizer also shows varying values for the fixed parameters because, due to the modular setup, we always optimize over a three-parameter vector. These values are inconsequential, however, as they are fixed internally.

scale

should the configuration be scaled and/or centered for calculating the cordillera? "std" standardizes each column of the configuration to mean=0 and sd=1 (typically not a good idea), "sd" scales the configuration by the maximum standard deviation of any column (default), "proc" adjusts the fitted configuration to the init configuration (or the Torgerson scaling solution if init=NULL). This parameter only has an effect for calculating the cordillera; the fitted and returned configuration is NOT scaled.
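
To make the difference between "sd" and "std" concrete, here is a base R sketch on a toy configuration. It is an illustration under assumptions rather than the package's internal code: the matrix X is hypothetical, and "sd" is taken to be a plain division by the largest column standard deviation without centering.

```r
# Toy 10 x 2 configuration with unequal spread in the two columns (hypothetical).
set.seed(1)
X <- cbind(rnorm(10, sd = 3), rnorm(10, sd = 1))

# "sd": divide the whole configuration by the largest column standard deviation.
# This is a uniform rescaling, so ratios between distances are unchanged.
X_sd <- X / max(apply(X, 2, sd))

# "std": standardize every column to mean 0 and sd 1.
# Columns are rescaled separately, which distorts ratios between distances.
X_std <- scale(X)

apply(X_sd, 2, sd)    # column sds after "sd" scaling
apply(X_std, 2, sd)   # column sds after "std" scaling
```

Because "sd" rescales all columns by one common factor, the shape of the configuration is preserved, which is presumably why "std" is flagged above as typically not a good idea.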

normed

should the cordillera be normed; defaults to TRUE

s

number of particles if pso is used

stresstype

what stress to be used for comparisons between solutions. Currently not implemented and pcops uses explicitly normalized stress for copstress (not stress-1). Stress-1 is reported by the print function though.

acc

termination threshold difference of two successive outer minimization steps.

itmaxo

iterations of the outer step (optimization over the hyperparameters), if the solver allows it. Defaults to 200.

itmaxi

iterations of the inner step (optimization of the MDS). Defaults to 10000 (which is huge).

...

additional arguments to be passed to the optimization procedure

Examples


library(cops)
dis <- as.matrix(smacof::kinshipdelta)
set.seed(210485)
# configuration is scaled with the highest column sd for calculating the cordillera
res1 <- pcops(dis, loss = "strain", lower = 0.1, upper = 5, minpts = 2)
res1
summary(res1)
plot(res1)
