yaImpute (version 1.0-32)

notablyDistant: Find notably distant targets

Description

Notably distant targets are those with relatively large distances from the closest reference observation. A suitable threshold is used to detect large distances.

Usage

notablyDistant(object,kth=1,threshold=NULL,p=0.01,method="distribution")

Value

A list with four elements: 1) a data frame of the references that are notably distant from other references, 2) a data frame of the targets that are notably distant from the references, 3) the threshold used, and 4) the method used.

Arguments

object

an object of class yai.

kth

the distance to the kth nearest neighbor is used.

threshold

the threshold distance that identifies notably large distances between observations.

p

(1-p)*100 is the percentile point in the distribution of distances used to compute the threshold (only used when threshold is NULL).

method

the method used to compute the threshold, see details.

Author

Nicholas L. Crookston ncrookston.fs@gmail.com

Details

When threshold is NULL, the function computes one using one of two methods. When method is "distribution", the distances are assumed to follow a lognormal distribution, unless the method used to find neighbors is randomForest, in which case they are assumed to follow a beta distribution. The threshold is then the point in that distribution above which a fraction p of the distances fall.

When method is "quantile", the function uses the quantile function with probs=1-p.
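The two threshold rules can be sketched in base R as follows. This is only an illustration of the idea, not the package's internal code: the simulated vector d stands in for the neighbor distances held in a yai object, the moment-based lognormal fit is an assumption, and the names d, thr_quantile, thr_lnorm and flagged are hypothetical.

```r
# Illustrative sketch only: simulated distances stand in for the
# kth-neighbor distances that a yai object would hold.
set.seed(1)
d <- rlnorm(200, meanlog = 0, sdlog = 0.5)  # hypothetical distances
p <- 0.01

# "quantile"-style threshold: the empirical (1-p) quantile of the distances.
thr_quantile <- quantile(d, probs = 1 - p, names = FALSE)

# "distribution"-style threshold (assumed fitting approach): estimate the
# lognormal parameters from log(d), then take the (1-p) quantile of that
# fitted lognormal distribution.
thr_lnorm <- qlnorm(1 - p, meanlog = mean(log(d)), sdlog = sd(log(d)))

# Distances above a threshold would be flagged as notably distant.
flagged <- d[d > thr_quantile]
```

With a small p, both rules place the threshold far into the upper tail, so only a few of the largest distances are flagged. The two thresholds can differ noticeably when the distances are not well described by the assumed distribution.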

See Also

notablyDifferent, yai

Examples

library(yaImpute)
data(iris)

set.seed(12345)

# form some test data
refs <- sample(rownames(iris),50)
x <- iris[,1:3]      # Sepal.Length Sepal.Width Petal.Length
y <- iris[refs,4:5]  # Petal.Width Species

# build an msn run, first build dummy variables for species.

sp1 <- as.integer(iris$Species=="setosa")
sp2 <- as.integer(iris$Species=="versicolor")
y2 <- data.frame(cbind(iris[,4],sp1,sp2),row.names=rownames(iris))
y2 <- y2[refs,]

names(y2) <- c("Petal.Width","Sp1","Sp2")

msn <- yai(x=x,y=y2,method="msn")

notablyDistant(msn)
