neighborhood: Given Bayesian features, returns those samples from a dataset that
exhibit a similarity (i.e., the neighborhood).
Description
The neighborhood \(N_i\) is defined as the set of samples that
have a similarity greater than zero to the given sample \(s_i\). Segmentation
is done using equality (==) for discrete features and less than or equal
(<=) for continuous features. Feature values of NA and NaN are also
supported and are matched using is.na() and is.nan().
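The comparison rule described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper segment_by_feature is hypothetical, and treating every numeric column as continuous is an assumption made for the example.

```r
# Hypothetical helper, NOT part of mmb: keep the rows of df that match a
# single feature value. Numeric columns are treated as continuous (<=),
# all other columns as discrete (==). NA and NaN are matched explicitly.
segment_by_feature <- function(df, name, value) {
  column <- df[[name]]
  if (is.numeric(value) && is.nan(value)) {
    # NaN can only occur in numeric columns
    keep <- if (is.numeric(column)) is.nan(column) else rep(FALSE, nrow(df))
  } else if (is.na(value)) {
    keep <- is.na(column)
  } else if (is.numeric(column)) {
    keep <- !is.na(column) & column <= value
  } else {
    keep <- !is.na(column) & column == value
  }
  # Logical subsetting keeps the original row names
  df[keep, , drop = FALSE]
}

# Continuous feature: rows with Sepal.Width at or below the mean
by_width <- segment_by_feature(iris, "Sepal.Width", mean(iris$Sepal.Width))

# Discrete feature: rows whose Species equals "setosa"
by_species <- segment_by_feature(iris, "Species", "setosa")
print(nrow(by_species))
```

Because the rows are selected with a logical index, the row names of the result still refer to the rows of the original data.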
Arguments
features
data.frame of Bayes-features, used to segment/select the
rows that should make up the neighborhood.
selectedFeatureNames
vector of names of features to use to demarcate
the neighborhood. If empty, uses all features' names.
retainMinValues
Default 0. The number of samples to retain during
segmentation. For separating a neighborhood, this value should typically
be 0, so that no samples from outside the neighborhood are included.
However, for very sparse data or a large number of variables, it might
still make sense to retain some samples.
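As a rough illustration of how selectedFeatureNames narrows the selection, the sketch below applies one comparison per selected feature and keeps only rows that pass all of them. It is written in base R under stated assumptions: the function select_neighborhood and the named list used in place of Bayes-features are hypothetical, combining features by successive filtering is an assumption, and retainMinValues is not modelled.

```r
# Hypothetical stand-in for Bayes-features: a named list of feature values.
features <- list(Species = "versicolor", Petal.Length = 4.5)

# Hypothetical function, NOT part of mmb: filter df by each selected feature
# in turn. An empty selectedFeatureNames means "use all features' names".
select_neighborhood <- function(df, features, selectedFeatureNames = c()) {
  if (length(selectedFeatureNames) == 0) {
    selectedFeatureNames <- names(features)
  }
  for (name in selectedFeatureNames) {
    column <- df[[name]]
    value <- features[[name]]
    # <= for numeric (continuous) columns, == for everything else (discrete)
    keep <- if (is.numeric(column)) column <= value else column == value
    df <- df[!is.na(keep) & keep, , drop = FALSE]
  }
  df
}

# All features: versicolor rows with Petal.Length <= 4.5
all_features <- select_neighborhood(iris, features)

# Only the Species feature demarcates the neighborhood
only_species <- select_neighborhood(iris, features, "Species")
print(nrow(only_species))
```

Restricting the selection to fewer feature names can only widen the result, since each additional feature removes rows.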
Value
data.frame with the rows that were selected as the neighborhood. The
row names of the original data are guaranteed to be preserved.
Examples
# NOT RUN {
nbh <- mmb::neighborhood(df = iris, features = mmb::createFeatureForBayes(
  name = "Sepal.Width", value = mean(iris$Sepal.Width)))
print(nrow(nbh))
# }