predictionet (version 1.18.0)

data.discretize: Function to discretize data based on user-specified cutoffs

Description

This function discretizes data based on cutoffs specified by the user.

Usage

data.discretize(data, cuts)

Arguments

data
matrix of continuous or categorical values (gene expressions for example); observations in rows, features in columns.
cuts
list with one element per variable (column of data), each giving the cutoffs for that variable.

Value

a matrix of categorical values where the categories are {1, 2, ..., n}, depending on the list of cutoffs specified in cuts; observations in rows, features in columns.

Details

This function discretizes the continuous values in data using the cutoffs specified in cuts, creating categories represented by increasing integers 1, 2, ..., n, where n is the maximum number of categories in the dataset.
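
The mapping from cutoffs to integer categories can be illustrated with base R alone. The sketch below is not the package's implementation: the helper name discretize_sketch is hypothetical, it relies on findInterval, and its treatment of values exactly equal to a cutoff (assigned to the higher category) is an assumption that may differ from data.discretize.

```r
# Hypothetical helper, for illustration only; not part of predictionet.
discretize_sketch <- function(data, cuts) {
  stopifnot(is.matrix(data), length(cuts) == ncol(data))
  res <- matrix(NA_integer_, nrow = nrow(data), ncol = ncol(data),
                dimnames = dimnames(data))
  for (j in seq_len(ncol(data))) {
    # findInterval() counts how many cutoffs are <= each value;
    # adding 1 gives categories 1, 2, ..., length(cuts[[j]]) + 1
    res[, j] <- findInterval(data[, j], sort(cuts[[j]])) + 1L
  }
  res
}

# Toy data (made up): 3 observations in rows, 2 features in columns
x <- matrix(c(0.1, 0.5, 0.9, 10, 20, 30), ncol = 2,
            dimnames = list(NULL, c("geneA", "geneB")))
cuts <- list(geneA = c(0.3, 0.7), geneB = 15)
discretize_sketch(x, cuts)
```

With these toy inputs, geneA has two cutoffs and is mapped to categories 1, 2, 3, while geneB has a single cutoff and is mapped to 1, 2, 2, so the number of categories can differ between variables.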

See Also

discretize

Examples

## load the package, then the colon cancer gene expression data, the list of genes related to the RAS signaling pathway, and the corresponding priors
library(predictionet)
data(expO.colon.ras)
## discretize the data into 3 categories
categories <- rep(3, ncol(data.ras))
## estimate the cutoffs (tertiles) for each gene
cuts.discr <- lapply(
  apply(rbind("nbcat"=categories, data.ras), 2, function(x) {
    y <- x[1]; x <- x[-1]
    return(list(quantile(x=x, probs=seq(0, 1, length.out=y+1), na.rm=TRUE)[-c(1, y+1)]))
  }),
  function(x) { return(x[[1]]) })
data.ras.bin <- data.discretize(data=data.ras, cuts=cuts.discr)
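
For readers without the package data at hand, the cutoff estimation step can be sketched on simulated data with base R only. The matrix sim and its column names are made up for illustration, and lapply over column indices is used here in place of the apply/rbind construction above; it should yield the same kind of list, with one vector of interior quantiles per column.

```r
set.seed(1)
# Simulated stand-in for an expression matrix: 30 observations, 4 features
sim <- matrix(rnorm(120), nrow = 30,
              dimnames = list(NULL, paste0("gene", 1:4)))
nbcat <- 3  # number of categories wanted for every feature
# Interior quantile levels (tertile boundaries): drop the 0% and 100% points
probs <- seq(0, 1, length.out = nbcat + 1)[-c(1, nbcat + 1)]
cuts.sim <- lapply(seq_len(ncol(sim)), function(j) {
  quantile(sim[, j], probs = probs, na.rm = TRUE)
})
names(cuts.sim) <- colnames(sim)
str(cuts.sim)
```

Each element of cuts.sim holds nbcat - 1 cutoffs, which is the shape of the cuts argument used in the example above.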