subsample

<code>subsample()</code> finds the data points in a dataset that are nearest to a given set of query points, as described in Joseph and Vakayil (2021). It uses an efficient kd-tree based algorithm that supports lazy deletion of a data point from the kd-tree, so the tree does not have to be rebuilt after each query. Please see Blanco and Rai (2014) for details.
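To make the idea concrete, here is a minimal brute-force sketch in base R. It is not the package's implementation: <code>nn_subsample_sketch()</code> is a hypothetical helper written for this illustration, it scans every row instead of using a kd-tree, and it assumes that each matched data point is removed from consideration before the next query is answered, which is how the lazy-deletion remark above is read here. The logical vector <code>available</code> plays the role of the deletion flag: a row is marked as used rather than physically dropped from the data.

```r
# Brute-force sketch of nearest neighbor subsampling without replacement.
# nn_subsample_sketch() is a hypothetical helper, NOT part of 'SPlit'.
nn_subsample_sketch <- function(data, points) {
  data <- as.matrix(data)
  points <- as.matrix(points)
  stopifnot(is.numeric(data), is.numeric(points),
            ncol(data) == ncol(points),
            nrow(points) <= nrow(data))

  available <- rep(TRUE, nrow(data))  # FALSE marks a row as "deleted"
  picked <- integer(nrow(points))

  for (i in seq_len(nrow(points))) {
    # Squared Euclidean distance from query point i to every row of data
    d2 <- rowSums(sweep(data, 2, points[i, ], "-")^2)
    d2[!available] <- Inf             # rows already selected are skipped
    picked[i] <- which.min(d2)
    available[picked[i]] <- FALSE     # flag the row; nothing is rebuilt
  }
  picked                              # row indices into data, one per query
}

set.seed(1)
data <- cbind(x = rnorm(200), y = rnorm(200))
points <- cbind(x = c(-1, 0, 1, 0), y = c(0, 0, 0, 0))  # rows 2 and 4 coincide
idx <- nn_subsample_sketch(data, points)
data[idx, ]
```

Because selected rows are flagged, the two identical query points in the example receive two different data rows. The sketch costs one full scan per query; the kd-tree approach described above exists to avoid exactly that cost on large datasets.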

Procedure to optimally split a dataset for training and testing.
'SPlit' is based on the method of support points, which is independent of modeling methods.
Please see Joseph and Vakayil (2021) <doi:10.1080/00401706.2021.1921037> for details.
This work is supported by U.S. National Science Foundation grant DMREF-1921873.

Akhil Vakayil

SPlit

Split a Dataset for Training and Testing

Roshan Joseph

Simon Mak



Arguments

Nearest neighbor subsampling — subsample

<dl>

	<dt>data</dt>
<dd>The dataset; should be numeric.</dd>

	<dt>points</dt>
<dd>The set of query points of the same dimension as the dataset.</dd>

</dl>
