ml_generalized_linear_regression
Spark ML -- Generalized Linear Regression
Perform generalized linear regression on a Spark DataFrame.
Usage
ml_generalized_linear_regression(x, response, features, intercept = TRUE,
family = gaussian(link = "identity"), weights.column = NULL,
iter.max = 100L, ml.options = ml_options(), ...)
Arguments
- x
An object coercible to a Spark DataFrame (typically, a tbl_spark).
- response
The name of the response vector (as a length-one character vector), or a formula giving a symbolic description of the model to be fitted. When response is a formula, it is used in preference to other parameters to set the response, features, and intercept parameters (if available). Currently, only simple linear combinations of existing parameters are supported; e.g. response ~ feature1 + feature2 + ... . The intercept term can be omitted by using - 1 in the model formula.
- features
The names of the features (terms) to use for the model fit.
- intercept
Boolean; should the model be fit with an intercept term?
- family
The family / link function to use; analogous to those normally passed in to calls to R's own glm.
- weights.column
The name of the column to use as weights for the model fit.
- iter.max
The maximum number of iterations to use.
- ml.options
Optional arguments, used to affect the model generated. See ml_options for more details.
- ...
Optional arguments. The data argument can be used to specify the data to be used when x is a formula; this allows calls of the form ml_linear_regression(y ~ x, data = tbl), and is especially useful in conjunction with do.
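The arguments above can be sketched in a short example. It is written against the signature shown under Usage and has not been checked against other sparklyr versions, whose arguments may differ. The helper fit_am_models is hypothetical (not part of sparklyr), and the choice of the built-in mtcars data with am as the response is an illustrative assumption. The family object is an ordinary base R value, so it can be built and inspected without a Spark connection.

```r
# Family objects are ordinary base R values; no Spark connection is needed
# to build or inspect them.
fam <- binomial(link = "logit")
fam$family  # "binomial"
fam$link    # "logit"

# Hypothetical helper (not part of sparklyr): fits the same model twice,
# once through the formula interface and once with explicit column names.
# Assumes the sparklyr and dplyr packages are installed and that `sc` is
# an open connection created with sparklyr::spark_connect().
fit_am_models <- function(sc) {
  # Copy a small built-in dataset to Spark; the result is a tbl_spark.
  mtcars_tbl <- dplyr::copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

  # Formula interface: the formula sets both the response and the features.
  by_formula <- sparklyr::ml_generalized_linear_regression(
    mtcars_tbl,
    response = am ~ wt + hp,
    family = fam
  )

  # Explicit interface: response and features given as column names.
  by_name <- sparklyr::ml_generalized_linear_regression(
    mtcars_tbl,
    response = "am",
    features = c("wt", "hp"),
    family = fam
  )

  list(by_formula = by_formula, by_name = by_name)
}
```

With a live connection, for example one opened with sparklyr::spark_connect(master = "local"), calling fit_am_models(sc) should return both fits for comparison. Adding - 1 to the formula would request a fit without an intercept, as described for the response argument.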
Details
In contrast to ml_linear_regression() and ml_logistic_regression(), this routine does not allow you to tweak the loss function (e.g. for elastic net regression); however, the model fits it returns are generally richer in the information they provide for assessing the quality of fit.
See Also
Other Spark ML routines: ml_als_factorization, ml_decision_tree, ml_gradient_boosted_trees, ml_kmeans, ml_lda, ml_linear_regression, ml_logistic_regression, ml_multilayer_perceptron, ml_naive_bayes, ml_one_vs_rest, ml_pca, ml_random_forest, ml_survival_regression