naiveBayesKernel(expression, classes, ...)
naiveBayesKernel(expression, test, densityFunction = density,
    densityParameters = list(bw = "nrd0", n = 1024,
        from = expression(min(featureValues)), to = expression(max(featureValues))),
    weighted = c("both", "unweighted", "weighted"),
    weight = c("all", "height difference", "crossover distance", "sum differences"),
    minDifference = 0, returnType = c("label", "score", "both"), verbose = 3)

expression: Either a matrix or ExpressionSet containing the training data. For a matrix, the rows are features, and the columns are samples.
...: Arguments from the matrix method that are passed to the ExpressionSet method.
test: A matrix or ExpressionSet containing the test data.
densityParameters: A list of options for densityFunction.
weight: For "height difference", the weight of each prediction is equal to the vertical distance between two densities, for a particular value of x. For "crossover distance", the x positions where two densities cross are calculated first. The predicted class is the class with the highest density at the particular value of x and the weight is the distance of x from the nearest density crossover point. For "sum differences", the weight is the sum of the weights calculated by both types of distances.
returnType: Either "label", "score", or "both". Sets the return value from the prediction to either a vector of class labels, a score for a sample belonging to the second class, as determined by the factor levels, or both labels and scores in a data.frame.

If weighted is "weighted", then a sample's predicted class is the class with the largest sum of weights, scaled for the number of samples in the training data of each class. Otherwise, when weighted is "unweighted", each feature has an equal vote, and votes for the class with the largest weight, scaled for class sizes in the training set.
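
The two voting schemes can be sketched in base R with made-up per-feature results for a single test sample. The data frame featureVotes and its values are hypothetical, not output of the function, and the scaling for class sizes is left out to keep the sketch short.

```r
# Hypothetical per-feature predictions and weights for one test sample.
featureVotes <- data.frame(
  class  = factor(c("Good", "Poor", "Good", "Poor", "Poor"), levels = c("Poor", "Good")),
  weight = c(0.40, 0.05, 0.30, 0.10, 0.05)
)

# Weighted voting: sum the weights of the features that predict each class.
weightSums <- tapply(featureVotes$weight, featureVotes$class, sum)
weightedPrediction <- names(weightSums)[which.max(weightSums)]

# Unweighted voting: every feature contributes exactly one vote.
voteCounts <- table(featureVotes$class)
unweightedPrediction <- names(voteCounts)[which.max(voteCounts)]
```

With these numbers the two schemes disagree: three low-weight features favour one class, while two high-weight features favour the other.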

The variable name of each feature's measurements in the iteration over all features is featureValues. This is important to know if each feature's measurements need to be referred to in the specification of densityParameters, such as for specifying the range of x values of the density function to be computed.
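
As an illustration of why the name featureValues matters, the following sketch evaluates the expression entries of a parameter list against one feature's measurements before calling density. How the function evaluates these entries internally is an assumption here; the simulated featureValues and the helper list evaluated are hypothetical.

```r
set.seed(42)
featureValues <- rnorm(20, mean = 8, sd = 2)  # hypothetical measurements of one feature

densityParameters <- list(bw = "nrd0", n = 1024,
                          from = expression(min(featureValues)),
                          to = expression(max(featureValues)))

# Evaluate expression entries with featureValues in scope; keep other entries unchanged.
evaluated <- lapply(densityParameters, function(parameter) {
  if (is.expression(parameter)) {
    eval(parameter, list(featureValues = featureValues))
  } else {
    parameter
  }
})

# Equivalent to density(featureValues, bw = "nrd0", n = 1024, from = ..., to = ...).
featureDensity <- do.call(density, c(list(featureValues), evaluated))
range(featureDensity$x)  # spans the observed range of featureValues
```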

If weight is "crossover distance", the crossover points are computed from the difference between the y values of the two densities at every x value. Each x value at which the sign of this difference changes, compared to the difference at the closest lower value of x, is used as a crossover point.
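
The sign-change rule can be sketched in a few lines of base R. The helper findCrossovers is hypothetical and not part of the package, and two analytic normal curves stand in for kernel density estimates so that the true crossing (x = 1) is known.

```r
# Hypothetical helper: x values at which the difference between two curves changes sign.
findCrossovers <- function(x, firstHeights, secondHeights) {
  differenceSigns <- sign(firstHeights - secondHeights)
  # Position i is a crossover if its sign differs from that of position i - 1.
  x[which(diff(differenceSigns) != 0) + 1]
}

x <- seq(-4, 6, length.out = 1000)
crossovers <- findCrossovers(x, dnorm(x, mean = 0), dnorm(x, mean = 2))
crossovers  # a single grid point just above the analytic crossing at x = 1
```

The reported position is the first grid point after the crossing, so its accuracy is limited by the grid spacing; a finer grid (a larger n in the density call) gives a closer approximation.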

Setting weight to "sum differences" is intended to find a mix of features which are strongly differentially expressed and differentially variable.
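
To make the three weight types concrete, here is a rough sketch for a single feature and a single test value. The simulated data, the variable names and the use of approx to read density heights at the test value are all assumptions for illustration; scaling for class sizes is not included.

```r
set.seed(1)
poorValues <- rnorm(50, mean = 13, sd = 2)  # hypothetical training values, class "Poor"
goodValues <- rnorm(50, mean = 8, sd = 2)   # hypothetical training values, class "Good"

# Estimate both densities on a shared grid of x values.
xRange <- range(c(poorValues, goodValues))
poorDensity <- density(poorValues, n = 1024, from = xRange[1], to = xRange[2])
goodDensity <- density(goodValues, n = 1024, from = xRange[1], to = xRange[2])

# Crossover points: x values where the sign of the height difference changes.
heightDifference <- poorDensity$y - goodDensity$y
crossovers <- poorDensity$x[which(diff(sign(heightDifference)) != 0) + 1]

# Density heights at one hypothetical test measurement.
testValue <- 7
poorHeight <- approx(poorDensity$x, poorDensity$y, xout = testValue)$y
goodHeight <- approx(goodDensity$x, goodDensity$y, xout = testValue)$y

predictedClass  <- if (poorHeight > goodHeight) "Poor" else "Good"
heightWeight    <- abs(poorHeight - goodHeight)       # "height difference"
crossoverWeight <- min(abs(testValue - crossovers))   # "crossover distance"
sumWeight       <- heightWeight + crossoverWeight     # "sum differences"
```

Note that the two components are on different scales (density height versus units of x), so in this sketch the crossover distance will usually dominate the sum.
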
trainMatrix <- matrix(rnorm(1000, 8, 2), ncol = 10)
trainMatrix[1:30, 1:5] <- trainMatrix[1:30, 1:5] + 5 # Make first 30 genes D.E.
testMatrix <- matrix(rnorm(1000, 8, 2), ncol = 10)
testMatrix[1:30, 6:10] <- testMatrix[1:30, 6:10] + 5 # Make first 30 genes D.E.
classes <- factor(rep(c("Poor", "Good"), each = 5))
# Expected: Good Good Good Good Good Poor Poor Poor Poor Poor
naiveBayesKernel(trainMatrix, classes, testMatrix)