ml_logistic_regression
From sparklyr v0.5
by Javier Luraschi
Spark ML -- Logistic Regression
Perform logistic regression on a Spark DataFrame.
Usage
ml_logistic_regression(x, response, features, intercept = TRUE, alpha = 0, lambda = 0, iter.max = 100L, ml.options = ml_options(), ...)
Arguments
- x
- An object coercible to a Spark DataFrame (typically, a tbl_spark).
- response
- The name of the response vector (as a length-one character vector), or a formula giving a symbolic description of the model to be fitted. When response is a formula, it is used in preference to other parameters to set the response, features, and intercept parameters (if available). Currently, only simple linear combinations of existing parameters are supported; e.g. response ~ feature1 + feature2 + ... . The intercept term can be omitted by using - 1 in the model fit.
- features
- The names of the features (terms) to use for the model fit.
- intercept
- Boolean; should the model be fit with an intercept term?
- alpha, lambda
- Parameters controlling loss function penalization (e.g. for lasso, elastic net, and ridge regression). See Details for more information.
- iter.max
- The maximum number of iterations to use.
- ml.options
- Optional arguments, used to affect the model generated. See ml_options for more details.
- ...
- Optional arguments. The data argument can be used to specify the data to be used when x is a formula; this allows calls of the form ml_linear_regression(y ~ x, data = tbl), and is especially useful in conjunction with do.
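The arguments above can be combined as in the following minimal sketch. It assumes a local Spark installation reachable through spark_connect(master = "local") and uses the built-in mtcars data set purely for illustration; the table name "mtcars" and the chosen columns (am, wt, hp) and lambda value are assumptions, not part of the original documentation.

```r
library(sparklyr)

# Assumption: Spark is installed locally (e.g. via spark_install()).
sc <- spark_connect(master = "local")

# Copy an example data set to Spark; "mtcars" is an arbitrary table name.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Response and features given as character vectors.
fit <- ml_logistic_regression(
  mtcars_tbl,
  response = "am",
  features = c("wt", "hp")
)

# The same model specified with a formula in the response position.
fit_formula <- ml_logistic_regression(mtcars_tbl, am ~ wt + hp)

# A penalized fit: alpha = 1 requests a lasso-type penalty,
# lambda sets its strength (0.1 is an arbitrary illustrative value).
fit_lasso <- ml_logistic_regression(
  mtcars_tbl, am ~ wt + hp,
  alpha = 1, lambda = 0.1
)

print(fit)

spark_disconnect(sc)
```

The character-vector and formula forms are meant to describe the same model here; which one to use is mostly a matter of preference.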
Details
Spark implements both $L_1$ and $L_2$ regularization in linear regression models. See the preamble in the Spark Classification and Regression documentation for more details on how the loss function is parameterized.
In particular, with alpha set to 1, the parameterization is equivalent to a lasso model; if alpha is set to 0, the parameterization is equivalent to a ridge regression model.
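As a sketch of that parameterization, following the elastic net form given in the Spark documentation, the penalty added to the loss can be written as below. Here $w$ denotes the coefficient vector; the symbols $R$ and $w$ are notation assumed for this illustration, while $\alpha$ and $\lambda$ correspond to the alpha and lambda arguments.

$$
R(w) = \lambda \left( \alpha \lVert w \rVert_1 + \frac{1 - \alpha}{2} \lVert w \rVert_2^2 \right), \qquad \alpha \in [0, 1], \quad \lambda \ge 0
$$

Setting $\alpha = 1$ leaves only the term $\lambda \lVert w \rVert_1$ (lasso), and setting $\alpha = 0$ leaves only $\frac{\lambda}{2} \lVert w \rVert_2^2$ (ridge); intermediate values give an elastic net mixture of the two.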
See Also
Other Spark ML routines: ml_als_factorization, ml_decision_tree, ml_generalized_linear_regression, ml_gradient_boosted_trees, ml_kmeans, ml_lda, ml_linear_regression, ml_multilayer_perceptron, ml_naive_bayes, ml_one_vs_rest, ml_pca, ml_random_forest, ml_survival_regression