Spark ML -- Generalized Linear Regression
Perform generalized linear regression on a Spark DataFrame.
ml_generalized_linear_regression(x, response, features, intercept = TRUE, family = gaussian(link = "identity"), weights.column = NULL, iter.max = 100L, ml.options = ml_options(), ...)
x: An object coercible to a Spark DataFrame (typically, a
response: The name of the response vector (as a length-one character vector), or a formula giving a symbolic description of the model to be fitted. When response is a formula, it is used in preference to other parameters to set the response, features, and intercept parameters (if available). Currently, only simple linear combinations of existing parameters are supported; e.g. response ~ feature1 + feature2 + .... The intercept term can be omitted by using - 1 in the model fit.
features: The names of the features (terms) to use for the model fit.
intercept: Boolean; should the model be fit with an intercept term?
family: The family / link function to use; analogous to those normally passed in to calls to R's own
weights.column: The name of the column to use as weights for the model fit.
iter.max: The maximum number of iterations to use.
ml.options: Optional arguments, used to affect the model generated. See ml_options for more details.
...: Optional arguments. The data argument can be used to specify the data to be used when x is a formula; this allows calls of the form ml_linear_regression(y ~ x, data = tbl), and is especially useful in conjunction with
In contrast to ml_logistic_regression(), this routine does not allow you to
tweak the loss function (e.g. for elastic net regression); however, the model
fits it returns are generally richer in the information they provide for
assessing the quality of fit.
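
The two ways of specifying a model described above (column names versus a formula) can be sketched as follows. This is a minimal illustration, not an official example: it assumes sparklyr is installed, that a local Spark installation recent enough to provide generalized linear regression (Spark 2.0 or later, as far as we know) is available, and it uses R's built-in mtcars data purely for demonstration. The table name "mtcars" and the object names fit_cols, fit_formula and fit_binom are hypothetical.

```r
# Minimal sketch; assumes sparklyr plus a local Spark installation are available.
library(sparklyr)

sc <- spark_connect(master = "local")

# Copy a small built-in R data set into Spark (the table name is illustrative).
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Column-name interface: response and features are given as character vectors.
fit_cols <- ml_generalized_linear_regression(
  mtcars_tbl,
  response = "mpg",
  features = c("wt", "cyl"),
  family   = gaussian(link = "identity")
)

# Formula interface: the formula supplies both the response and the features.
fit_formula <- ml_generalized_linear_regression(
  mtcars_tbl,
  mpg ~ wt + cyl,
  family = gaussian(link = "identity")
)

# A non-Gaussian family, passed the same way as to R's own model-fitting functions.
fit_binom <- ml_generalized_linear_regression(
  mtcars_tbl,
  am ~ wt + hp,
  family = binomial(link = "logit")
)

# Inspect one of the fits, then close the connection.
print(fit_formula)
spark_disconnect(sc)
```

The first two calls should describe the same model; the formula form is simply a more compact way of naming the response and feature columns.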
Other Spark ML routines: