Given \(n\) datasets or statistical units, each containing \(m\) feature vectors, the one-to-one matching problem is to find a set of \(n\) label permutations that produce the best match of feature vectors across units. The objective function to minimize is the sum of squared (Euclidean) distances between all pairs of feature vectors that share the same (new) label. Up to a constant factor, this is equivalent to minimizing the sum of the within-label variances.
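The objective can be written out explicitly. The notation below is an assumption introduced for illustration and is not taken from the package: \(x_{i,k}\) denotes the \(k\)-th feature vector of unit \(i\), and \(\sigma_i\) is the permutation applied to unit \(i\), so that \(x_{i,\sigma_i(k)}\) is the vector of unit \(i\) that receives the new label \(k\).

\[
\min_{\sigma_1,\ldots,\sigma_n} \; \sum_{k=1}^{m} \; \sum_{1 \le i < j \le n} \left\| x_{i,\sigma_i(k)} - x_{j,\sigma_j(k)} \right\|^2
\]

The link with within-label variances follows from the identity below, which holds for any vectors \(y_1,\ldots,y_n\) with mean \(\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i\):

\[
\sum_{1 \le i < j \le n} \left\| y_i - y_j \right\|^2 \;=\; n \sum_{i=1}^{n} \left\| y_i - \bar{y} \right\|^2
\]

Applying it with \(y_i = x_{i,\sigma_i(k)}\) for each label \(k\) shows that the pairwise objective equals \(n\) times the sum over labels of the squared deviations from the label means.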
The template-based method consists of successively relabeling each sample unit so that it best matches a template matrix of feature vectors. This method is very fast, but its optimization performance is only as good as the template. For best results, the template should be representative of the collected data.
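The relabeling step can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: the function names sq_dist and match_to_template are hypothetical, and the search over permutations is brute force, which is only viable for small \(m\). A practical implementation would more likely solve each unit's relabeling as a linear assignment problem.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def match_to_template(units, template):
    """Relabel each unit independently to best match the template.

    units    : list of n units, each a list of m feature vectors
    template : list of m feature vectors
    Returns one permutation per unit; perm[k] is the index of the unit's
    vector that receives label k. Brute force over all m! permutations
    (hypothetical helper, for illustration only).
    """
    m = len(template)
    result = []
    for vectors in units:
        best = min(
            permutations(range(m)),
            key=lambda perm: sum(
                sq_dist(vectors[perm[k]], template[k]) for k in range(m)
            ),
        )
        result.append(list(best))
    return result


template = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
units = [
    [[0.1, -0.1], [0.9, 1.2], [5.2, 4.9]],  # already aligned with the template
    [[5.1, 5.0], [0.2, 0.1], [1.1, 0.8]],   # same structure, labels shuffled
]
labels = match_to_template(units, template)
print(labels)
```

Because each unit is matched to the template on its own, the cost grows linearly with the number of units, but a poorly chosen template leads every unit toward a poor labeling.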
If x is a matrix, its rows should be sorted by increasing unit label, and unit should be a nondecreasing sequence of integers, for example \((1,\ldots,1,2,\ldots,2,\ldots,n,\ldots,n)\) with each integer \(1,\ldots,n\) replicated \(m\) times.
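The expected row layout can be illustrated with a short Python sketch. The helper names make_unit, is_valid_unit and sort_rows_by_unit are hypothetical and not part of the package; the validity check assumes the balanced layout of the example, with every label occurring exactly \(m\) times.

```python
def make_unit(n, m):
    """Unit labels 1..n, each repeated m times: (1,...,1,2,...,2,...,n,...,n)."""
    return [i for i in range(1, n + 1) for _ in range(m)]


def is_valid_unit(unit, m):
    """True if unit is nondecreasing and each label occurs exactly m times."""
    nondecreasing = all(a <= b for a, b in zip(unit, unit[1:]))
    counts = {}
    for u in unit:
        counts[u] = counts.get(u, 0) + 1
    return nondecreasing and all(c == m for c in counts.values())


def sort_rows_by_unit(x, unit):
    """Stable-sort the rows of x (and unit itself) by increasing unit label."""
    order = sorted(range(len(unit)), key=lambda i: unit[i])
    return [x[i] for i in order], [unit[i] for i in order]


unit = make_unit(n=3, m=2)
print(unit)                       # [1, 1, 2, 2, 3, 3]

# Rows arriving in arbitrary order are rearranged before matching.
raw_unit = [2, 1, 3, 1, 2, 3]
raw_x = [[20.0], [10.0], [30.0], [11.0], [21.0], [31.0]]
x_sorted, unit_sorted = sort_rows_by_unit(raw_x, raw_unit)
print(is_valid_unit(raw_unit, 2), is_valid_unit(unit_sorted, 2))
```

The sort is stable, so rows belonging to the same unit keep their original relative order.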
The argument w can be specified as a vector of positive numbers (recycled to length \(p\) if needed, where \(p\) is the dimension of the feature vectors) or as a positive definite matrix of size \((p,p)\).
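The sketch below shows one way the two forms of w could enter a squared distance. It rests on assumptions not stated in the text above: that \(p\) is the dimension of the feature vectors, that a weight vector acts as the diagonal of a weight matrix \(W\), and that the weighted distance between vectors \(a\) and \(b\) is \((a-b)^\top W (a-b)\). The helper names as_weight_matrix and weighted_sq_dist are hypothetical, and positive definiteness of a matrix argument is not checked.

```python
def as_weight_matrix(w, p):
    """Expand w into a p x p weight matrix (nested lists).

    A flat list of positive numbers is recycled to length p and placed on
    the diagonal; a p x p nested list is copied as is. Positive
    definiteness of a matrix argument is NOT verified in this sketch.
    """
    if isinstance(w[0], (list, tuple)):  # already a matrix
        if len(w) != p or any(len(row) != p for row in w):
            raise ValueError("weight matrix must have size (p, p)")
        return [list(row) for row in w]
    if any(v <= 0 for v in w):
        raise ValueError("weights must be positive")
    diag = [w[k % len(w)] for k in range(p)]  # recycle to length p
    return [[diag[i] if i == j else 0.0 for j in range(p)] for i in range(p)]


def weighted_sq_dist(a, b, W):
    """Quadratic form (a - b)^T W (a - b)."""
    d = [ai - bi for ai, bi in zip(a, b)]
    p = len(d)
    return sum(d[i] * W[i][j] * d[j] for i in range(p) for j in range(p))


a, b = [1, 2], [0, 0]
d_recycled = weighted_sq_dist(a, b, as_weight_matrix([2], 2))      # weights (2, 2)
d_vector = weighted_sq_dist(a, b, as_weight_matrix([1, 3], 2))     # weights (1, 3)
d_matrix = weighted_sq_dist(a, b, as_weight_matrix([[2, 1], [1, 2]], 2))
print(d_recycled, d_vector, d_matrix)
```

With a vector of weights, each coordinate contributes independently to the distance; a full matrix additionally accounts for cross terms between coordinates.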