Learn R Programming

SBmedian (version 0.1.2)

mpost.euc: Median Posterior for Subset Posterior Samples in Euclidean Space

Description

mpost.euc is a general framework to merge multiple empirical measures \(Q_1,Q_2,\ldots,Q_M \subset R^p\) from independent subset of data by finding a median $$\hat{Q} = \textrm{argmin}_Q \sum_{m=1}^M d(Q,Q_m)$$ where \(Q\) is a weighted combination and \(d(P_1,P_2)\) is distance in RKHS between two empirical measures \(P_1\) and \(P_2\). As in the references, we use RBF kernel with bandwidth parameter \(\sigma\).

Usage

mpost.euc(
  splist,
  sigma = 0.1,
  maxiter = 121,
  abstol = 1e-06,
  show.progress = FALSE
)

Value

a named list containing:

med.atoms

a vector or matrix of all atoms aggregated.

med.weights

a weight vector that sums to 1 corresponding to med.atoms.

weiszfeld.weights

a weight for \(M\) subset posteriors.

weiszfeld.history

updated parameter values. Each row is for iteration, while columns are weights corresponding to weiszfeld.weights.

Arguments

splist

a list of length \(M\) containing vectors or matrices of univariate or multivariate subset posterior samples respectively.

sigma

bandwidth parameter for RBF kernel.

maxiter

maximum number of iterations for Weiszfeld algorithm.

abstol

stopping criterion for Weiszfeld algorithm.

show.progress

a logical; TRUE to show iteration mark, FALSE otherwise.

References

minsker_scalable_2014SBmedian

minsker_robust_2017SBmedian

Examples

Run this code
## Median Posteior from 2-D Gaussian Samples
#  Step 1. let's build a list of atoms whose numbers differ
set.seed(8128)                   # for reproducible results
mydata = list()
mydata[[1]] = cbind(rnorm(96, mean= 1), rnorm(96, mean= 1))
mydata[[2]] = cbind(rnorm(78, mean=-1), rnorm(78, mean= 0))
mydata[[3]] = cbind(rnorm(65, mean=-1), rnorm(65, mean= 1))
mydata[[4]] = cbind(rnorm(77, mean= 2), rnorm(77, mean=-1))

#  Step 2. Let's run the algorithm
myrun = mpost.euc(mydata, show.progress=TRUE)

#  Step 3. Visualize
#  3-1. show subset posterior samples
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3), no.readonly=TRUE)
for (i in 1:4){
  plot(mydata[[i]], cex=0.5, col=(i+1), pch=19, xlab="", ylab="", 
       main=paste("subset",i), xlim=c(-4,4), ylim=c(-3,3))
}

#  3-2. 250 median posterior samples via importance sampling
id250 = base::sample(1:nrow(myrun$med.atoms), 250, prob=myrun$med.weights, replace=TRUE)
sp250 = myrun$med.atoms[id250,]
plot(sp250, cex=0.5, pch=19, xlab="", ylab="", 
     xlim=c(-4,4), ylim=c(-3,3), main="median samples")

#  3-3. convergence over iterations
matplot(myrun$weiszfeld.history, xlab="iteration", ylab="value",
        type="b", main="convergence of weights")
par(opar)
        

Run the code above in your browser using DataLab