EBIC: Bayesian Information Criterion (BIC) for a partition.

Description

This function calculates an extended version of BIC, which is computed using a particular weighted average of the total residual sum of squares and the number of clusters.

SCEM uses the following equation for the BIC of each partition:

$$BIC(P) = \log(np)\,\frac{RSS(P)}{np} + |P|\,\frac{B_n^{-1}-1}{nB_n},$$

where $RSS(P) = \sum_{q=1}^{Q} RSS(S_q)$.

The sample size of each individual time series (i.e. the number of observations) is denoted by $n$. In archaeological data, however, the time series in a data set will not all have the same number of observations.

To obtain a representative value for the sample size, we use the arithmetic mean $n = (n_1 + \ldots + n_p)/p$, where $n_i$ is the number of observations in the $i$-th time series and $p$ is the number of time series.
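
As a small illustration of this averaging, the sketch below computes the mean sample size from a list of data frames. The list `toy_paths` and its values are made up for this example and are not part of SCEM; the internal computation in EBIC may be organized differently.

```r
# Hypothetical toy data: three individuals with 4, 6 and 5 observations.
toy_paths <- list(
  data.frame(distance = 1:4, oxygen = c(-5.1, -4.8, -4.2, -4.9)),
  data.frame(distance = 1:6, oxygen = c(-6.0, -5.5, -5.1, -4.7, -5.2, -5.8)),
  data.frame(distance = 1:5, oxygen = c(-4.4, -4.1, -3.9, -4.3, -4.6))
)

p <- length(toy_paths)                      # number of time series
n_i <- vapply(toy_paths, nrow, integer(1))  # observations per time series
n_bar <- sum(n_i) / p                       # arithmetic mean sample size
n_bar
```

Here `n_bar` plays the role of $n$ in the description above.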

The term $(B_n^{-1}-1)/(nB_n)$ is the tuning parameter that places the penalty on the number of clusters (note also the term $nB_n$). Using a different tuning parameter $\lambda_n$ in place of $(B_n^{-1}-1)/(nB_n)$ allows stronger or weaker penalties on the number of clusters.

Usage

EBIC(paths, partition, bandwidth)

Arguments

paths

A list of data frames, where each frame contains the data for one individual. Every data frame should have two columns with names 'distance' and 'oxygen'.
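
One way to build such a list, sketched here with made-up data, is to split a long-format data frame by an individual identifier. The data frame `toy_long` and its `ID` column are assumptions for this illustration only.

```r
# Hypothetical long-format data: one row per measurement.
toy_long <- data.frame(
  ID       = c("a", "a", "a", "b", "b"),
  distance = c(2, 5, 9, 3, 7),
  oxygen   = c(-5.2, -4.6, -5.0, -6.1, -5.4)
)

# One data frame per individual, as the `paths` argument expects.
toy_split <- split(toy_long, f = toy_long$ID)

# Check that every data frame has the two required columns.
has_cols <- vapply(
  toy_split,
  function(d) all(c("distance", "oxygen") %in% names(d)),
  logical(1)
)
has_cols
```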

partition

A list of vectors. Each element in the list is a vector of integers, corresponding to individuals considered in one group.
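
The sketch below shows two partitions of five individuals written in this form. The helper `is_partition` is hypothetical (it is not an SCEM function) and only checks that every individual index appears exactly once.

```r
p <- 5

# Every individual in its own group.
singletons <- as.list(seq_len(p))

# Two groups: individuals 1, 3, 4 together and 2, 5 together.
two_groups <- list(c(1L, 3L, 4L), c(2L, 5L))

# Hypothetical helper: TRUE if the groups cover 1..p with no repeats.
is_partition <- function(partition, p) {
  idx <- sort(unlist(partition))
  identical(as.integer(idx), seq_len(p))
}

is_partition(singletons, p)
is_partition(two_groups, p)
```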

bandwidth

The order of the bandwidth used in the estimation process: bandwidth = k means that the bandwidth is $n^k$.
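
To make the relation concrete, the following sketch evaluates $n^k$ for a few orders. The value `n = 20` is an assumed mean sample size chosen for illustration, and the variable names are not taken from SCEM.

```r
n <- 20                       # assumed mean number of observations per series
k <- c(-0.25, -0.33, -0.5)    # candidate values for the `bandwidth` argument

bw <- n^k                     # resulting bandwidths, one per order
names(bw) <- paste0("k=", k)
round(bw, 3)
```

For $n > 1$, a more negative order gives a smaller bandwidth.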

Value

Value of the extended BIC function for the partition.

Examples


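# NOT RUN {
library(SCEM)

# Split the armenia data into one data frame per individual
armenia_split = split(armenia, f = armenia$ID)
band = -0.33                 # bandwidth order, i.e. bandwidth n^(-0.33)
p = length(armenia_split)    # number of individuals
# 1:p treats each individual as its own group
EBIC(armenia_split, 1:p, band)
# }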
