AssocBin (version 1.1-2)

chiScores: Scoring functions

Description

These functions define scores to evaluate candidate splits along a single margin within a partition.

Usage

chiScores(bounds, nbelow, n)

miScores(bounds, nbelow, n)

randScores(bounds, nbelow, n)

Value

A vector of scores.

Arguments

bounds

numeric vector giving candidate split bounds in increasing order

nbelow

integer vector giving the number of points below each candidate split

n

the total number of points in the bin to be split

Functions

  • chiScores(): A chi-squared statistic score

  • miScores(): A mutual information score

  • randScores(): A random score for random splitting

Author

Chris Salahub

Details

Scorings

Each of these functions accepts three arguments: `bounds`, a numeric vector containing the bin bounds and the candidate splits within the bin, all in increasing order; `nbelow`, which gives the count of points below each candidate split; and `n`, the total number of points in the bin, which is used to determine the number of points above each split.
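To make the relationship between the three arguments concrete, the following is a minimal sketch of a chi-squared style score built from them. It is not AssocBin's implementation: the function name `sketchChiScores` is hypothetical, and it assumes that expected counts come from a uniform spread of points over the bin and that the first and last entries of `bounds` are the bin bounds.

```r
# Hypothetical sketch, NOT the AssocBin implementation: a chi-squared style
# score computed from the same three arguments (bounds, nbelow, n).
sketchChiScores <- function(bounds, nbelow, n) {
  lower <- bounds[1]                        # lower bin bound
  upper <- bounds[length(bounds)]           # upper bin bound
  splits <- bounds[-c(1, length(bounds))]   # interior candidate splits
  nabove <- n - nbelow                      # points above each split
  # Assumption: expected counts if points were uniform over the bin.
  ebelow <- n * (splits - lower) / (upper - lower)
  eabove <- n - ebelow
  # Undefined if a split coincides with a bin bound (zero expected count).
  (nbelow - ebelow)^2 / ebelow + (nabove - eabove)^2 / eabove
}

bounds <- c(2, 5, 12, 16, 19)  # bin bounds 2 and 19, candidate splits between
nbelow <- 1:3                  # points below each candidate split
n <- 3                         # total points in the bin
nabove <- n - nbelow           # counts above each split, derived from n
scores <- sketchChiScores(bounds, nbelow, n)
```

The sketch returns one score per candidate split, so `scores` has the same length as `nbelow`.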

Scores are computed from `bounds` and `nbelow` because AssocBin only considers splits at observed points. It can be proven that, for any convex scoring function, the maximum over the interior of a bin occurs at an observed point. Restricting the search to observed points therefore limits the computation required to identify and split at the optimal coordinate.
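The convexity argument can be checked numerically for one interval between neighbouring observed points, where the count below the split is constant. The sketch below is illustrative only: `sketchScoreAt` is a hypothetical chi-squared style score with uniform expected counts, not a function from AssocBin, and the point locations (5, 12 and 16 in a bin from 2 to 19) are assumed for the example.

```r
# Hypothetical chi-squared style score for a split at position s with k points
# below it, in a bin [lower, upper] holding n points. NOT an AssocBin function.
sketchScoreAt <- function(s, k, n, lower, upper) {
  p <- (s - lower) / (upper - lower)  # fraction of the bin below the split
  (k - n * p)^2 / (n * p * (1 - p))
}

# Assumed setup: bin [2, 19] with points at 5, 12 and 16. For any split
# strictly between 5 and 12, exactly one point lies below it.
grid <- seq(5, 12, length.out = 101)
gridScores <- sketchScoreAt(grid, k = 1, n = 3, lower = 2, upper = 19)

# With the count held fixed the score is convex in s, so the largest value
# on the grid should sit at an end of the interval, i.e. at an observed point.
best <- which.max(gridScores)
atEndpoint <- best %in% c(1, length(grid))
```

Under these assumptions `atEndpoint` is `TRUE`: no split position strictly between the two observed points scores higher than a split at one of them.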

Examples

vals <- c(2, 5, 12, 16, 19)
chiScores(vals, 1:3, 3)
## same for the miScores
miScores(vals, 1:3, 3)
## random scoring produces different output every time
randScores(vals, 1:3, 3)
randScores(vals, 1:3, 3)
