Finds an optimal 1D projection of multivariate data that best separates classes using Linear Discriminant Analysis (LDA) or Penalized Discriminant Analysis (PDA), then determines a cutpoint for classification based on entropy splitting.
findproj_Ext(
  origclass,
  origdata,
  PPmethod = "LDA",
  q = 1,
  weight = TRUE,
  lambda = 0.1
)
A list with the following components:
Numeric value representing the optimization criterion achieved by the best projection. Higher values indicate better class separation.
Numeric vector of length ncol(origdata) containing the optimal
projection direction coefficients. This vector defines the linear combination
of original variables that maximizes class separation.
Numeric scalar representing the optimal cutpoint (threshold) on the projected data. This value is determined using entropy-based splitting and divides observations into two groups for classification.
Logical vector of length nrow(origdata) indicating which
observations have projected values less than or equal to the cutpoint C
(projdata <= C). These observations are assigned to the left node/class.
Logical vector of length nrow(origdata) indicating which
observations have projected values greater than the cutpoint C (projdata > C).
These observations are assigned to the right node/class.
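The relationship between the projection direction, the cutpoint C, and the two logical vectors can be illustrated with a minimal sketch. The data, the direction `alpha`, and the cutpoint value below are made up for illustration and are not output of findproj_Ext.

```r
# Hypothetical inputs: four observations on two variables.
origdata <- matrix(c(1, 2,
                     2, 1,
                     4, 5,
                     5, 4), ncol = 2, byrow = TRUE)
alpha <- c(1, 1) / sqrt(2)  # assumed unit-length projection direction
C <- 4.5                    # assumed cutpoint on the projected scale

# Project each observation onto the direction.
projdata <- as.vector(origdata %*% alpha)

# Logical indicators for the two sides of the cutpoint.
left  <- projdata <= C  # observations sent to the left node
right <- projdata > C   # observations sent to the right node
```

Every observation falls on exactly one side, so the two vectors are complements of each other.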
origclass: Factor or numeric vector containing the class labels for each observation.
origdata: Numeric matrix or data frame containing the predictor variables. Each row represents an observation and each column represents a variable.
PPmethod: Character string specifying the projection pursuit method. Either "LDA" (Linear Discriminant Analysis, default) or "PDA" (Penalized Discriminant Analysis).
q: Integer specifying the dimension of the projected data. Default is 1 for 1D projection.
weight: Logical indicating whether to use weighted LDA index calculation. Default is TRUE.
lambda: Numeric penalty parameter for the PDA method. Default is 0.1. Only used when PPmethod = "PDA".
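To give a sense of what a 1D LDA projection direction looks like, here is a minimal base-R sketch of the textbook construction: the leading eigenvector of W^{-1} B, where W and B are the within-class and between-class scatter matrices. The helper name `lda_direction` is hypothetical, and the class-size weighting of B is an assumption of this sketch; it is not claimed to reproduce the internal computation of findproj_Ext or the exact meaning of weight = TRUE.

```r
# Hypothetical helper: textbook 1D LDA direction from scatter matrices.
lda_direction <- function(origclass, origdata) {
  X <- as.matrix(origdata)
  cls <- as.factor(origclass)
  grand_mean <- colMeans(X)
  p <- ncol(X)
  W <- matrix(0, p, p)  # within-class scatter
  B <- matrix(0, p, p)  # between-class scatter (weighted by class size)
  for (g in levels(cls)) {
    Xg <- X[cls == g, , drop = FALSE]
    mg <- colMeans(Xg)
    W <- W + crossprod(sweep(Xg, 2, mg))
    d <- mg - grand_mean
    B <- B + nrow(Xg) * outer(d, d)
  }
  # Leading eigenvector of W^{-1} B; Re() drops zero imaginary parts.
  a <- Re(eigen(solve(W) %*% B)$vectors[, 1])
  a / sqrt(sum(a^2))  # normalise to unit length
}

# Usage on a built-in dataset.
alpha <- lda_direction(iris$Species, iris[, 1:4])
projdata <- as.vector(as.matrix(iris[, 1:4]) %*% alpha)
```

The sign of the direction is arbitrary, so the projected classes may appear in either order along the axis.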
This function performs projection pursuit to find a one-dimensional projection that optimally separates classes in multivariate data. The process involves:
1. Finding the optimal projection direction using either LDA or PDA
2. Projecting all observations onto this direction
3. Determining an optimal cutpoint using entropy-based splitting
4. Creating binary classification indicators based on the cutpoint
The cutpoint is chosen to minimize the weighted entropy of the resulting split. If the selected cutpoint equals the maximum projected value, which would leave no observations with projdata > C, the function uses the second-largest projected value instead so that the split is valid.
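The entropy-based cutpoint search described above can be sketched as follows. This is an illustrative version under stated assumptions: the helper names `entropy` and `entropy_cutpoint` are hypothetical, candidate cutpoints are taken to be the observed projected values, and the natural logarithm is used. The actual function may choose candidates or break ties differently.

```r
# Shannon entropy of a vector of class labels (natural log).
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]  # drop empty classes to avoid log(0)
  -sum(p * log(p))
}

# Hypothetical helper: cutpoint minimising the weighted entropy of the split.
entropy_cutpoint <- function(projdata, origclass) {
  vals <- sort(unique(projdata))
  n <- length(projdata)
  score <- sapply(vals, function(v) {
    l <- projdata <= v
    nl <- sum(l)
    nr <- n - nl
    wl <- nl / n * entropy(origclass[l])
    wr <- if (nr > 0) nr / n * entropy(origclass[!l]) else 0
    wl + wr
  })
  C <- vals[which.min(score)]
  # Edge case: a cutpoint at the maximum leaves the right side empty,
  # so fall back to the second-largest projected value.
  if (C == max(vals) && length(vals) > 1) C <- vals[length(vals) - 1]
  C
}

# Usage on two well-separated groups.
projdata <- c(1, 2, 3, 10, 11, 12)
origclass <- c("a", "a", "a", "b", "b", "b")
C <- entropy_cutpoint(projdata, origclass)
```

In this example the split at the largest value of the first group separates the two classes perfectly, so both sides have zero entropy.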
Lee, Y. D., Cook, D., Park, J. W., and Lee, E. K. (2013). PPtree: Projection Pursuit Classification Tree. Electronic Journal of Statistics, 7, 1369-1386.