Fit a kernel model

Usage
kroclearn(
X,
y,
lambda,
kernel = "radial",
param.kernel = NULL,
loss = "hinge",
approx = NULL,
intercept = TRUE,
target.perf = list(),
param.convergence = list()
)

Value

An object of class "kroclearn", a list containing:
theta.hat — estimated dual coefficient vector.
intercept — fitted intercept (if applicable).
lambda, kernel, param.kernel, loss.
approx, B (number of sampled pairs if approximation used).
time — training time (seconds).
nobs, p — number of observations and predictors.
converged, n.iter — convergence information.
kfunc — kernel function object.
nystrom — low-rank kernel approximation details (if used).
X — training data (post-preprocessing).
preprocessing — details on categorical variables,
removed columns, and column names.
call — the function call.
Arguments

X — Predictor matrix or data.frame (categorical variables are automatically one-hot encoded).
y — Response vector with class labels in {-1, 1}. Labels given as {0, 1} or as a two-level factor/character vector are automatically converted to this format.
lambda — Positive scalar regularization parameter.
kernel — Kernel type: "radial" (default), "polynomial",
"linear", or "laplace".
param.kernel — Kernel-specific parameter:
\(\sigma\) for the "radial" and "laplace" kernels
(default \(1/p\), where \(p\) is the number of predictors after preprocessing,
i.e., after categorical variables are one-hot encoded);
the degree for the "polynomial" kernel (default 2).
Ignored for the "linear" kernel.
loss — Surrogate loss function. One of
"hinge" (default), "hinge2" (squared hinge),
"logistic", or "exponential".
approx — Logical or NULL; enables a scalable approximation to accelerate training.
If NULL (the default), it is set to TRUE when nrow(X) >= 1000 and FALSE otherwise.
See the Details section for how the approximation is applied.
intercept — Logical; include an intercept in the model (default TRUE).
target.perf — List with the target sensitivity and specificity used when estimating the intercept (defaults to 0.9 each).
param.convergence — List of convergence controls (e.g., maxiter,
eps). Default is list(maxiter = 5e4, eps = 1e-4).
Details

For large-scale data, the exact model is computationally prohibitive because its
loss is a U-statistic involving a double summation over observation pairs. To
reduce this burden, the package adopts an efficient algorithm based on an
incomplete U-statistic, which approximates the loss with a single summation over
randomly sampled pairs. In kernel models, a Nyström low-rank approximation is
further applied to compute the kernel matrix efficiently. These approximations
substantially reduce computational cost and accelerate training while maintaining
accuracy, making the model feasible for large-scale datasets. They are applied
when approx = TRUE.
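The incomplete U-statistic idea can be sketched in base R. This is an illustration of the principle only, not the package's internals: the full pairwise loss averages over every positive-negative pair (a double summation), while the incomplete version averages over B randomly sampled pairs (a single summation).

```r
# Sketch: full vs. incomplete U-statistic for a pairwise hinge ranking loss
set.seed(1)
score <- rnorm(500)                        # hypothetical model scores f(x_i)
y     <- sample(c(-1, 1), 500, replace = TRUE)
pos <- which(y == 1)
neg <- which(y == -1)
hinge <- function(t) pmax(0, 1 - t)

# Full U-statistic: double summation over all |pos| * |neg| pairs
full_loss <- mean(outer(score[pos], score[neg],
                        function(sp, sn) hinge(sp - sn)))

# Incomplete U-statistic: single summation over B sampled pairs
B <- 2000
i <- sample(pos, B, replace = TRUE)
j <- sample(neg, B, replace = TRUE)
approx_loss <- mean(hinge(score[i] - score[j]))

c(full_loss, approx_loss)                  # the two estimates are close
```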
Examples

# Simulate two-dimensional data with a circular decision boundary
set.seed(123)
n <- 100
r <- sqrt(runif(n, 0.05, 1))
theta <- runif(n, 0, 2 * pi)
X <- cbind(r * cos(theta), r * sin(theta))
y <- ifelse(r < 0.5, 1, -1)
fit <- kroclearn(X, y, lambda = 0.1, kernel = "radial", approx = TRUE)
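The Nyström approximation mentioned in the Details can likewise be sketched in base R. This is an illustration under assumed settings (landmark count m, ridge term), not the package's implementation: a low-rank factorization built from m landmark rows approximates the full radial kernel matrix.

```r
# Sketch: Nystrom low-rank approximation of a radial (RBF) kernel matrix
set.seed(1)
n <- 200; p <- 2; m <- 20                 # m landmark points, m << n
X <- matrix(rnorm(n * p), n, p)
sigma <- 1 / p                            # default sigma from this page
rbf <- function(A, B) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-sigma * d2)
}
idx  <- sample(n, m)                      # landmark subset
K_nm <- rbf(X, X[idx, , drop = FALSE])    # n x m cross-kernel
K_mm <- rbf(X[idx, , drop = FALSE], X[idx, , drop = FALSE])
K_approx <- K_nm %*% solve(K_mm + 1e-8 * diag(m)) %*% t(K_nm)
K_full   <- rbf(X, X)
mean(abs(K_full - K_approx))              # mean absolute approximation error
```

Only K_nm and K_mm are ever formed, so the cost scales with n*m rather than n^2.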