naiveBayes: Naive Bayes classifier using histograms and shrinkage

Description

After binning, pseudo-counts are added to each bin count so that each component density has approximately df degrees of freedom. If partition = "quantile", the implied prior is not a continuous uniform over the support, but a discrete uniform over all (unlabeled) observation points.
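The two partition choices can be illustrated with base R alone. This is a minimal sketch, not the package's internal code: the simulated predictor `x` and the names `breaks_quantile` and `breaks_width` are assumptions made for illustration.

```r
# Illustrative only: two ways to build bin breaks for a continuous predictor.
set.seed(1)
x <- rexp(200)   # simulated, skewed predictor (assumption for the sketch)
nbins <- 5

# "quantile"-style partition: breaks at equally spaced sample quantiles,
# so each bin holds roughly the same number of observations.
breaks_quantile <- unique(
  quantile(x, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)
)

# "width"-style partition: equally spaced breaks over the observed range.
breaks_width <- seq(min(x), max(x), length.out = nbins + 1)

# Bin counts under each partition.
counts_quantile <- table(cut(x, breaks_quantile, include.lowest = TRUE))
counts_width    <- table(cut(x, breaks_width, include.lowest = TRUE))

print(counts_quantile)
print(counts_width)
```

With skewed data, equal-width bins tend to leave the upper bins sparsely populated, while quantile bins keep the counts balanced.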

Usage

naiveBayes(formula, data, weights, df = 20, nbins = 30, partition = c("quantile", "width"))
naiveBayes.fit(X, y, weights, df = 20, nbins = 30, partition = c("quantile", "width"))

Arguments

formula

an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. Only main effects (not interactions) are allowed.

data

data.frame of predictors, which can include continuous and categorical/factor variables, along with a response vector (1 = linked, 0 = unlinked) and, optionally, observation weights (e.g., a weight column). The column names of data must include the terms specified in formula.

weights

a vector of observation weights or the column name in data that corresponds to the weights.

df

the degrees of freedom for each component density. If a vector, each predictor can use a different df.

nbins

the number of bins for continuous predictors

partition

for binning; indicates whether breaks are generated from quantiles ("quantile") or equal spacing ("width")

X

data frame of categorical and/or numeric variables

y

binary vector indicating linkage (1 = linked, 0 = unlinked) or logical vector (TRUE = linked, FALSE = unlinked)

Value

BF a Bayes factor object; a list of component Bayes factors

Details

Fits a naive Bayes model to continuous and categorical/factor predictors. Continuous predictors are first binned, then the estimates are shrunk towards zero.
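The idea of binning and then shrinking can be sketched for a single continuous predictor in base R. This is a hedged illustration under stated assumptions, not the package's implementation: the helper `log_bf_hist()` is hypothetical, and its fixed per-bin `pseudo` count stands in for the df-based rule, whose exact form is not given here. In a naive Bayes model, the per-predictor log Bayes factors would then be summed across predictors.

```r
# Hypothetical helper: histogram-based log Bayes factor for one predictor,
# using simple additive smoothing (NOT the package's df-based rule).
log_bf_hist <- function(x, y, breaks, pseudo = 1) {
  bin <- cut(x, breaks, include.lowest = TRUE)
  n1 <- table(bin[y == 1])               # bin counts among linked pairs
  n0 <- table(bin[y == 0])               # bin counts among unlinked pairs
  p1 <- (n1 + pseudo) / sum(n1 + pseudo) # smoothed bin probabilities
  p0 <- (n0 + pseudo) / sum(n0 + pseudo)
  out <- log(as.numeric(p1 / p0))
  names(out) <- levels(bin)
  out
}

# Simulated data (assumption for the sketch): 1 = linked, 0 = unlinked.
set.seed(2)
n <- 500
y <- rbinom(n, 1, 0.3)
x <- rnorm(n, mean = ifelse(y == 1, 1, 0))

nbins <- 10
breaks <- unique(
  quantile(x, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)
)

lbf_light <- log_bf_hist(x, y, breaks, pseudo = 1)    # little shrinkage
lbf_heavy <- log_bf_hist(x, y, breaks, pseudo = 1e6)  # heavy shrinkage

print(round(lbf_light, 2))
print(round(lbf_heavy, 4))
```

As the pseudo-count grows, both smoothed histograms approach the same uniform distribution over bins, so the log Bayes factor in every bin moves towards zero.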

Examples


# See vignette: "Statistical Methods for Crime Series Linkage" for usage.