snpRF (version 0.4)

predict.snpRF: predict method for snpRF objects

Description

Prediction of test data using the modified random forest algorithm implemented in snpRF.

Usage

## S3 method for class 'snpRF'
predict(object, newdata.autosome=NULL, newdata.xchrom=NULL, xchrom.names=NULL, newdata.covar=NULL, type="response", norm.votes=TRUE, predict.all=FALSE, proximity=FALSE, nodes=FALSE, cutoff, ...)

Arguments

object
An object of class snpRF, as created by the function snpRF.
newdata.autosome
A matrix of autosomal markers, with each column corresponding to a SNP coded as the count of a particular allele (i.e., 0, 1, or 2) and each row corresponding to a sample/individual. (Note: If none of the newdata.* arguments is given, the out-of-bag prediction stored in object is returned.)
newdata.xchrom
A matrix of X chromosome markers, with each marker coded as two adjacent columns. Each column is coded as 0 or 1 to indicate whether that chromosome carries a particular allele. Although males have only one X chromosome, their markers are also coded as two columns, the second being a duplicate of the first. Each row of this matrix corresponds to a sample/individual. The data must be phased and in chromosomal order. (Note: If none of the newdata.* arguments is given, the out-of-bag prediction stored in object is returned.)
xchrom.names
A vector of marker names (one name per marker) for the newdata.xchrom matrix, in the same order as the markers in newdata.xchrom.
newdata.covar
A matrix of covariates, with each column being a different covariate and each row a sample/individual. (Note: If none of the newdata.* arguments is given, the out-of-bag prediction stored in object is returned.)
type
One of "response", "prob", or "votes", indicating the type of output: predicted values, matrix of class probabilities, or matrix of vote counts. "class" is also allowed, but is automatically converted to "response" for backward compatibility.
norm.votes
Should the vote counts be normalized (i.e., expressed as fractions)?
predict.all
Should the predictions of all trees be kept?
proximity
Should proximity measures be computed?
nodes
Should the terminal node indicators (an n by ntree matrix) be returned? If so, they are in the "nodes" attribute of the returned object.
cutoff
(Classification only) A vector of length equal to the number of classes. The "winning" class for an observation is the one with the maximum ratio of proportion of votes to cutoff. The default is taken from the forest$cutoff component of object (i.e., the setting used when running snpRF).
...
not used currently.
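
The two-column layout described for newdata.xchrom can be easier to see with a toy matrix. The following is a minimal base-R sketch, not taken from the package: the marker names, the ".a"/".b" column suffixes, and the allele values are all made up for illustration, and snpRF itself is not called.

```r
# Hypothetical marker names: one name per marker, as expected by xchrom.names.
marker.names <- c("rsX1", "rsX2", "rsX3")

# A female carries two X chromosomes, so the two columns of a marker may differ.
# Columns are ordered marker by marker: (hap1, hap2), (hap1, hap2), ...
female <- c(0, 1,  1, 1,  0, 0)

# A male carries one X chromosome: one allele per marker,
# duplicated into the second column of each pair.
male.hap <- c(1, 0, 1)
male <- rep(male.hap, each = 2)

# One row per individual, two adjacent columns per marker.
xchrom <- rbind(female = female, male = male)
colnames(xchrom) <- paste(rep(marker.names, each = 2), c("a", "b"), sep = ".")

print(xchrom)
```

In this sketch the matrix has twice as many columns as there are marker names, which is the relationship between newdata.xchrom and xchrom.names that the argument descriptions imply.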

Value

The object returned depends on the argument type:
response
predicted classes (the classes with majority vote).
prob
matrix of class probabilities (one column for each class and one row for each input).
votes
matrix of vote counts (one column for each class and one row for each new input); either in raw counts or in fractions (if norm.votes=TRUE).
If predict.all=TRUE, then the individual component of the returned object is a character matrix where each column contains the class predicted by one tree in the forest.
If proximity=TRUE, the returned object is a list with two components: pred is the prediction (as described above) and proximity is the proximity matrix.
If nodes=TRUE, the returned object has a "nodes" attribute, which is an n by ntree matrix, each column containing the number of the node that each case falls in for that tree.
NOTE: Any ties are broken at random. If this is undesirable, avoid it by using an odd number for ntree in snpRF().
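
The cutoff rule from the Arguments section (the winning class has the largest ratio of vote proportion to cutoff) can be illustrated on a small matrix of vote fractions. This is a base-R sketch of the rule as described, not the package's internal code: the class labels, vote fractions, and cutoff values are invented, and ties are resolved deterministically here rather than at random.

```r
# Hypothetical vote fractions: one row per observation, one column per class.
votes <- matrix(c(0.60, 0.40,
                  0.45, 0.55,
                  0.80, 0.20),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("control", "case")))

# Assumed cutoff vector, one entry per class, in the same order as the columns.
cutoff <- c(control = 0.7, case = 0.3)

# Divide each column of vote fractions by the cutoff for that class.
ratio <- sweep(votes, 2, cutoff, "/")

# The winning class is the column with the largest ratio in each row.
# ties.method = "first" keeps this sketch deterministic.
winner <- colnames(votes)[max.col(ratio, ties.method = "first")]

# For comparison: a plain majority vote (equal cutoffs).
majority <- colnames(votes)[max.col(votes, ties.method = "first")]

print(data.frame(votes, winner, majority))
```

With these made-up numbers, the first observation has a majority of votes for "control" but is assigned to "case", because the lower cutoff for "case" raises its ratio.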

References

Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.

See Also

snpRF

Examples

data(snpRFexample)
set.seed(111)
ind <- sample(2, nrow(autosome.snps), replace = TRUE, prob=c(0.8, 0.2))
eg.rf <- snpRF(x.autosome=autosome.snps[ind==1,],x.xchrom=xchrom.snps[ind==1,],
               xchrom.names=xchrom.snps.names,x.covar=covariates[ind==1,], 
               y=phenotype[ind==1])
eg.pred <- predict(eg.rf, newdata.autosome=autosome.snps[ind==2,], 
                   newdata.xchrom=xchrom.snps[ind==2,], 
                   xchrom.names=xchrom.snps.names, 
                   newdata.covar=covariates[ind==2,])
table(observed = phenotype[ind==2], predicted = eg.pred)
## Get prediction for all trees.
predict(eg.rf,newdata.autosome=autosome.snps[ind==2,], 
        newdata.xchrom=xchrom.snps[ind==2,], 
        xchrom.names=xchrom.snps.names, 
        newdata.covar=covariates[ind==2,], predict.all=TRUE)
## Proximities.
predict(eg.rf,newdata.autosome=autosome.snps[ind==2,], 
        newdata.xchrom=xchrom.snps[ind==2,], 
        xchrom.names=xchrom.snps.names, 
        newdata.covar=covariates[ind==2,], proximity=TRUE)
## Nodes matrix.
str(attr(predict(eg.rf,newdata.autosome=autosome.snps[ind==2,], 
                 newdata.xchrom=xchrom.snps[ind==2,], 
                 xchrom.names=xchrom.snps.names, 
                 newdata.covar=covariates[ind==2,], nodes=TRUE), "nodes"))
