
rebmix (version 2.8.0)

RCLSMIX-methods: Predicts Class Membership Based Upon a Model Trained by REBMIX

Description

By default, returns the RCLSMIX algorithm output for mixtures of conditionally independent normal, lognormal, Weibull, gamma, binomial, Poisson or Dirac component densities. If model equals "RCLSMVNORM", the output for mixtures of multivariate normal component densities with unrestricted variance-covariance matrices is returned instead.
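
As a rough sketch of what classification with conditionally independent component densities amounts to, the standard Bayes decision rule can be written as below. This is an illustration, not a statement of the package internals: it assumes the class priors $P(\Omega_g)$ are estimated from the train dataset sizes, and the symbols $c_{ig}$, $w_{igl}$, $f_{igl}$ and $\theta_{igl}$ (number of components, component weights, component densities and component parameters for variable $i$ in class $g$) are notation introduced here only. An observation $y = (y_1, \ldots, y_o)$ is assigned to the class with the largest product of prior and class-conditional density:

```latex
\hat{g} = \arg\max_{g = 1, \ldots, s} \; P(\Omega_g) \prod_{i=1}^{o} \sum_{l=1}^{c_{ig}} w_{igl} \, f_{igl}(y_i; \theta_{igl}),
\qquad
P(\Omega_g) \approx \frac{n_{\mathrm{train}g}}{\sum_{h=1}^{s} n_{\mathrm{train}h}}
```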

Usage

## S4 method for signature 'RCLSMIX'
RCLSMIX(model = "RCLSMIX", x = list(), Dataset = data.frame(),
        Zt = factor(), ...)
## ... and for other signatures
## S4 method for signature 'RCLSMIX'
summary(object, ...)
## ... and for other signatures

Arguments

model
see Methods section below.
x
a list of objects of class REBMIX of length $o$ obtained by running REBMIX on $g = 1, \ldots, s$ train datasets $Y_{\mathrm{train}g}$ all of length $n_{\mathrm{train}g}$. For the train da
Dataset
a data frame containing test dataset $Y_{\mathrm{test}}$ of length $n_{\mathrm{test}}$. For the test dataset the corresponding class membership $\bm{\Omega}_{g}$ is not known. The default value is data.frame().
Zt
a factor of true class membership $\bm{\Omega}_{g}$ for the test dataset. The default value is factor().
object
see Methods section below.
...
currently not used; additional arguments affecting the summary produced.

Value

  • Returns an object of class RCLSMIX or RCLSMVNORM.
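
The Examples read the misclassification error from the returned object via its Error slot. As a minimal sketch of what such an error rate conventionally means when the true class membership Zt is supplied, the base R code below compares true and predicted labels. The label vectors are made up for illustration, the name Zp is hypothetical, and the computation is the textbook definition rather than a confirmed account of how the package computes its value.

```r
# Hypothetical true and predicted class labels (not taken from any dataset).
Zt <- factor(c(1, 1, 2, 2, 1, 2))
Zp <- factor(c(1, 2, 2, 2, 1, 1), levels = levels(Zt))

# Confusion matrix: rows are true classes, columns are predicted classes.
CM <- table(True = Zt, Predicted = Zp)

# Misclassification error: share of observations off the diagonal.
Error <- 1 - sum(diag(CM)) / sum(CM)

print(CM)
print(Error)
```

Fixing the levels of the predicted factor to those of Zt keeps the confusion matrix square even when some class is never predicted.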

References

R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley & Sons, New York, 1973.

Examples

devAskNewPage(ask = TRUE)

data("adult")

# Find complete cases.

adult <- adult[complete.cases(adult), ]

# Replace levels with numbers.

adult <- as.data.frame(data.matrix(adult))

# Split adult dataset into two train subsets for two Incomes
# and remove Type and Income columns.

trainle50k <- subset(adult, subset = (Type == 2) & (Income == 1), 
  select = c(-Type, -Income))
traingt50k <- subset(adult, subset = (Type == 2) & (Income == 2), 
  select = c(-Type, -Income))

trainall <- subset(adult, subset = Type == 2, select = c(-Type, -Income))

train <- as.factor(subset(adult, subset = Type == 2, select = c(Income))[, 1])

# Extract test dataset from adult dataset and remove Type 
# and Income columns.

testle50k <- subset(adult, subset = (Type == 1) & (Income == 1), 
  select = c(-Type, -Income))
testgt50k <- subset(adult, subset = (Type == 1) & (Income == 2), 
  select = c(-Type, -Income))

testall <- subset(adult, subset = Type == 1, select = c(-Type, -Income))

test <- as.factor(subset(adult, subset = Type == 1, select = c(Income))[, 1])

# Estimate number of components, component weights and component 
# parameters for Naive Bayes.

cmax <- unlist(lapply(apply(trainall, 2, unique), length))

adultest <- list(0)

for (i in 1:14) {
  adultest[[i]] <- REBMIX(Dataset = list(as.data.frame(trainle50k[, i]), 
    as.data.frame(traingt50k[, i])),
    Preprocessing = "histogram",
    cmax = if (cmax[i] > 120) 12 else cmax[i],
    Criterion = "BIC",
    pdf = if (cmax[i] > 120) "normal" else "Dirac",
    K = if (cmax[i] > 120) 13:43 else 1)
}

# Best-first feature subset selection.

c <- NULL; rvs <- 1:14; Error <- 1.0

for (i in 1:14) {
  k <- NA

  for (j in rvs) {
    adultcla <- RCLSMIX(x = adultest[c(c, j)],
      Dataset = as.data.frame(trainall[, c(c, j)]),
      Zt = train) 

    if (adultcla@Error < Error) {
      Error <- adultcla@Error; k <- j
    }
  }

  if (is.na(k)) {
    break
  }
  else {
    c <- c(c, k); rvs <- rvs[-which(rvs == k)]
  }
}

# Error on train dataset.

Error

# Classify the test dataset using the selected features.

adultcla <- RCLSMIX(x = adultest[c],
  Dataset = as.data.frame(testall[, c]), 
  Zt = test)

adultcla
  
summary(adultcla)

# Plot selected features.

plot(adultcla, nrow = 5, ncol = 2)
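
The best-first feature subset selection loop in the example above can be isolated as a small, generic sketch. The version below assumes nothing about rebmix: forward_select and toy_error are hypothetical helpers written for this illustration, and toy_error stands in for the call to RCLSMIX followed by reading the Error slot. In each round the candidate that lowers the error the most is added, and the search stops as soon as no candidate gives a strict improvement.

```r
# Greedy forward selection over candidate feature indices.
# `error_fn` is a hypothetical function mapping a feature subset to an error.
forward_select <- function(candidates, error_fn) {
  selected <- integer(0)
  best <- 1.0

  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break

    # Error obtained by adding each remaining candidate to the current subset.
    errors <- vapply(remaining,
      function(j) error_fn(c(selected, j)), numeric(1))

    # Stop when no candidate strictly improves on the best error so far.
    if (min(errors) >= best) break

    best <- min(errors)
    selected <- c(selected, remaining[which.min(errors)])
  }

  list(selected = selected, error = best)
}

# Toy error: features 2 and 5 are informative, any other feature adds
# a small penalty. This replaces a real classifier for illustration only.
toy_error <- function(subset) {
  0.5 - 0.2 * (2 %in% subset) - 0.1 * (5 %in% subset) +
    0.01 * sum(!(subset %in% c(2, 5)))
}

result <- forward_select(1:6, toy_error)
print(result$selected)
print(result$error)
```

Like the loop in the example, this search is greedy: it never removes a feature once selected, so it can miss subsets that are only good in combination.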
