matchFeat (version 1.0)

match.bca.gen: Block Coordinate Ascent Method for General (Balanced or Unbalanced) Data

Description

Solve a feature matching problem by block coordinate ascent

Usage

match.bca.gen(x, unit = NULL, cluster = NULL, w = NULL, 
	method = c("cyclical", "random"), control = list())

Value

A list of class matchFeat with components

cluster

integer vector of cluster assignments (length = nrow(x))

objective

minimum objective value

mu

sample mean for each cluster/class (feature-by-cluster matrix)

V

sample covariance for each cluster/class (feature-by-feature-by-cluster 3D array)

size

integer vector of cluster sizes

call

function call

Arguments

x

data matrix (rows=instances, columns=features)

unit

vector of unit labels (length = number of rows of x)

cluster

either a single integer giving the number of classes/clusters to assign the feature vectors to, or an integer vector specifying the initial cluster assignment

w

feature weights in the loss function. Can be specified as a single positive number, a vector, or a positive definite matrix

method

sweeping method for block coordinate ascent: cyclical or random (simple random sampling without replacement)

control

optional list of tuning parameters

Details

If cluster is an integer vector, it must have the same length as unit and its values must range between 1 and the number of clusters.
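
As a minimal sketch of what a valid initial assignment can look like, the base R snippet below builds one for a toy vector of unit labels. It assumes, in line with the Examples, that observations from the same unit must receive distinct cluster labels. The names unit, m and init are illustrative only and are not part of the package.

```r
# Toy unit labels for an unbalanced design (units have 3, 2 and 3 observations)
unit <- c(1, 1, 1, 2, 2, 3, 3, 3)

# Smallest admissible number of clusters: the largest unit size
m <- max(table(unit))

# Within each unit, draw distinct labels from 1..m (assumption: one
# observation per unit per cluster)
set.seed(1)
init <- integer(length(unit))
for (u in unique(unit)) {
  idx <- which(unit == u)
  init[idx] <- sample.int(m, length(idx))
}

# Checks mirroring the stated requirements on `cluster`
stopifnot(length(init) == length(unit))
stopifnot(all(init >= 1 & init <= m))
stopifnot(!any(duplicated(cbind(unit, init))))
```

A vector built this way could then be passed as the cluster argument in place of the number of clusters.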

The list control can contain a field maxit, an integer that fixes the maximum number of algorithm iterations.
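
The call below sketches how maxit might be supplied together with the random sweeping method. It reuses the optdigits data from the Examples. The value 50 is an arbitrary illustration, not a recommended setting, and the code is guarded so that it only runs when matchFeat is installed.

```r
# Run only if the matchFeat package is available
if (requireNamespace("matchFeat", quietly = TRUE)) {
  data(optdigits, package = "matchFeat")
  set.seed(1)  # the "random" sweeping method involves random sampling
  m <- max(table(optdigits$unit))  # smallest admissible number of clusters
  res <- matchFeat::match.bca.gen(optdigits$x, optdigits$unit, m,
                                  method = "random",
                                  control = list(maxit = 50))  # arbitrary cap
  print(res$size)  # cluster sizes reported by the fit
}
```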

References

Degras (2022). Scalable feature matching across large data collections. doi:10.1080/10618600.2022.2074429
Wright (2015). Coordinate descent algorithms. https://arxiv.org/abs/1502.04759

See Also

match.2x, match.bca, match.bca.gen, match.gaussmix, match.kmeans, match.rec, match.template

Examples

library(matchFeat)
data(optdigits)
nobs <- nrow(optdigits$x) # total number of observations
n <- length(unique(optdigits$unit)) # number of statistical units
rmv <- sample.int(nobs, n-1) # remove (n-1) observations to make data unbalanced
min.m <- max(table(optdigits$unit[-rmv])) # smallest possible number of clusters
# lower values will result in an error message 
m <- min.m
result <- match.bca.gen(optdigits$x[-rmv,], optdigits$unit[-rmv], m)


