sanba (version 0.0.3)

estimate_partition: Estimate the Observational and Distributional Partition

Description

Given the output of a sanba model-fitting function, this method estimates both the observational and distributional partitions. For MCMC objects, it computes a point estimate using salso::salso(); for Variational Inference (VI) objects, the cluster allocation is determined by the label with the highest estimated variational probability.
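The VI allocation rule described above (pick the label with the highest estimated variational probability) can be illustrated with a minimal base-R sketch. The matrix `var_prob` and its values are hypothetical and made up for illustration; they are not the internal structure of a SANvi object, and this is not necessarily how sanba implements the step.

```r
# Hypothetical variational probabilities: 5 observations (rows), 3 labels (columns).
# The numbers are invented for illustration; they are not sanba output.
var_prob <- matrix(
  c(0.70, 0.20, 0.10,
    0.10, 0.80, 0.10,
    0.25, 0.15, 0.60,
    0.90, 0.05, 0.05,
    0.30, 0.45, 0.25),
  ncol = 3, byrow = TRUE
)

# Each row should be a probability vector over the labels.
stopifnot(all(abs(rowSums(var_prob) - 1) < 1e-12))

# Allocate each observation to the label with the highest probability.
allocation <- apply(var_prob, 1, which.max)
allocation
```

Ties, if any, are resolved by `which.max` in favor of the first maximal label; whether sanba uses the same tie-breaking is an assumption.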

Usage

estimate_partition(object, ...)

# S3 method for SANvi
estimate_partition(object, ordered = TRUE, ...)

# S3 method for SANmcmc
estimate_partition(object, ordered = TRUE, add_burnin = 0, ncores = 0, ...)

# S3 method for partition_mcmc
summary(object, ...)

# S3 method for partition_vi
summary(object, ...)

# S3 method for partition_mcmc
print(x, ...)

# S3 method for partition_vi
print(x, ...)

# S3 method for partition_mcmc
plot(
  x,
  DC_num = NULL,
  type = c("ecdf", "boxplot", "scatter"),
  alt_palette = FALSE,
  ...
)

# S3 method for partition_vi
plot(
  x,
  DC_num = NULL,
  type = c("ecdf", "boxplot", "scatter"),
  alt_palette = FALSE,
  ...
)

Value

A list of class partition_vi or partition_mcmc containing

  • obs_level: a data frame containing the data values, their group indexes, and the observational and distributional clustering assignments for each observation.

  • dis_level: a vector with the distributional clustering assignment for each unit.

Arguments

object

Object of class SANmcmc (usually, the result of a call to fit_fiSAN, fit_fSAN, or fit_CAM with est_method = "MCMC") or SANvi (the result of a call to fit_fiSAN, fit_fSAN, or fit_CAM with est_method = "VI").

...

Additional graphical parameters to be passed to the plot function.

ordered

Logical. If TRUE (default), the distributional cluster (DC) labels are sorted by increasing median of the data assigned to each DC. If FALSE, no ordering is applied.

add_burnin

Integer (default = 0). Number of MCMC iterations to discard as additional burn-in (only for SANmcmc objects).

ncores

Integer (default = 0). Number of CPU cores to use for parallel computing, passed to salso::salso() (only for SANmcmc objects). A value of zero uses all cores on the system.

x

The result of a call to estimate_partition.

DC_num

An integer or a vector of integers indicating which distributional clusters to plot.

type

Type of plot to draw. Available types are "ecdf", "boxplot", and "scatter".

alt_palette

Logical. If FALSE (default), R base colors are used; if TRUE, an alternative color palette is used.
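The relabeling controlled by the ordered argument can be sketched in base R as follows. The data `y` and the cluster `labels` are toy values chosen for illustration, and the approach (computing per-cluster medians, then renumbering clusters in increasing-median order) is an assumption about the general idea, not a copy of the package's code.

```r
# Toy data and hypothetical cluster labels (not sanba output).
y      <- c(5.1, 4.9, 0.2, -0.1, 2.0, 2.2, 0.0, 5.0)
labels <- c(1,   1,   2,   2,    3,   3,   2,   1)

# Median of the data within each cluster label.
med <- tapply(y, labels, median)

# Old labels listed from smallest to largest median.
ord <- names(med)[order(med)]

# New label = position of the old label in that ordering.
relabeled <- match(as.character(labels), ord)
relabeled
```

After relabeling, cluster 1 holds the observations with the smallest median and the highest-numbered cluster holds those with the largest, which makes labels comparable across fits.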

See Also

salso::salso()

Examples

library(sanba)

# Example 1: variational inference (VI) fit
set.seed(123)
y <- c(rnorm(40, 0, 0.3), rnorm(20, 5, 0.3))  # 60 observations from two normal components
g <- rep(1:6, each = 10)                      # 6 groups of 10 observations each
out <- fit_fSAN(y = y, group = g, est_method = "VI", vi_param = list(n_runs = 10))
plot(out)
clust <- estimate_partition(out)              # object of class partition_vi
summary(clust)
plot(clust, lwd = 2, alt_palette = TRUE)      # default type is "ecdf"
plot(clust, type = "scatter", alt_palette = FALSE, cex = 2)

# Example 2: MCMC fit on the same simulated data
set.seed(123)
y <- c(rnorm(40, 0, 0.3), rnorm(20, 5, 0.3))
g <- rep(1:6, each = 10)
out <- fit_fSAN(y = y, group = g, est_method = "MCMC",
                mcmc_param = list(nrep = 500, burn = 200))
plot(out)
clust <- estimate_partition(out)              # object of class partition_mcmc
summary(clust)
plot(clust, lwd = 2)
plot(clust, type = "boxplot", alt_palette = TRUE)
plot(clust, type = "scatter", alt_palette = TRUE, cex = 2, pch = 4)
