rsparse (version 0.4.0)

FTRL: Logistic regression model with FTRL proximal SGD solver.

Description

Creates a 'Follow the Regularized Leader' (FTRL) model. Only logistic regression is implemented at the moment.

Methods

Public methods

Method new()

creates a model

Usage

FTRL$new(
  learning_rate = 0.1,
  learning_rate_decay = 0.5,
  lambda = 0,
  l1_ratio = 1,
  dropout = 0,
  family = c("binomial")
)

Arguments

learning_rate

base learning rate for the SGD updates

learning_rate_decay

parameter that controls learning rate decay. Please refer to the FTRL proximal paper for details. Convergence usually does not depend heavily on this parameter, so the default value of 0.5 is safe.

lambda

regularization strength. The default of 0 applies no penalty.

l1_ratio

controls L1 vs L2 penalty mixing. 1 = Lasso regression, 0 = Ridge regression. Values in between give an elastic net.

dropout

percentage of random features to exclude from each sample. Acts as regularization.

family

a description of the error distribution and link function to be used in the model. Only binomial (logistic regression) is implemented at the moment.
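As a minimal sketch of how `lambda` and `l1_ratio` combine, the calls below set up a lasso-style, an elastic-net and a ridge-style model. The parameter values are arbitrary illustrations, not recommendations.

```r
library(rsparse)

# l1_ratio = 1: the whole penalty is L1 (lasso-style)
lasso = FTRL$new(learning_rate = 0.1, learning_rate_decay = 0.5,
                 lambda = 1, l1_ratio = 1)

# l1_ratio = 0.5: the penalty is split between L1 and L2 (elastic net)
enet = FTRL$new(lambda = 1, l1_ratio = 0.5)

# l1_ratio = 0: the whole penalty is L2 (ridge-style)
ridge = FTRL$new(lambda = 1, l1_ratio = 0)
```

No data is touched at this point; the models are only configured and still need `partial_fit()` or `fit()`.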

Method partial_fit()

fits model to the data

Usage

FTRL$partial_fit(x, y, weights = rep(1, length(y)), ...)

Arguments

x

input sparse matrix. Native format is Matrix::RsparseMatrix. If x is in a different format, the model will try to convert it to RsparseMatrix with as(x, "RsparseMatrix"). Dimensions should be (n_samples, n_features).

y

vector of targets

weights

numeric vector of length `n_samples`. Defines how to amplify SGD updates for each sample. May be useful for highly unbalanced problems.

...

not used at the moment
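Because `partial_fit()` updates an existing model, it can be called repeatedly on chunks of data. The sketch below feeds a synthetic data set in mini-batches and up-weights the rarer positive class. The data, the batch size and the weight of 3 are hypothetical choices made for illustration only.

```r
library(rsparse)
library(Matrix)

set.seed(42)
n = 400   # number of samples
p = 30    # number of features
m = rsparsematrix(n, p, density = 0.2)   # random sparse matrix (Matrix package)
y = rep(c(0, 1, 0, 0), length.out = n)   # synthetic, unbalanced 0/1 targets
w = ifelse(y == 1, 3, 1)                 # hypothetical weights: amplify updates for y == 1

model = FTRL$new(learning_rate = 0.1, lambda = 0.01)

batch_size = 100
for (start in seq(1, n, by = batch_size)) {
  idx = start:min(start + batch_size - 1, n)
  # convert each batch to the native row-oriented format before fitting
  xb = as(m[idx, , drop = FALSE], "RsparseMatrix")
  model$partial_fit(xb, y[idx], weights = w[idx])
}
```

Every batch must have the same number of columns, since the number of features is fixed by the first call.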

Method fit()

shorthand for applying `partial_fit` `n_iter` times

Usage

FTRL$fit(x, y, weights = rep(1, length(y)), n_iter = 1L, ...)

Arguments

x

input sparse matrix. Native format is Matrix::RsparseMatrix. If x is in a different format, the model will try to convert it to RsparseMatrix with as(x, "RsparseMatrix"). Dimensions should be (n_samples, n_features).

y

vector of targets

weights

numeric vector of length `n_samples`. Defines how to amplify SGD updates for each sample. May be useful for highly unbalanced problems.

n_iter

number of SGD epochs

...

not used at the moment
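A short sketch of the two equivalent ways of running several epochs, following the description of `fit()` above. The synthetic data and the choice of 3 epochs are assumptions made for illustration.

```r
library(rsparse)
library(Matrix)

set.seed(1)
m = rsparsematrix(500, 20, density = 0.2)  # random sparse matrix (Matrix package)
y = as.numeric(m[, 1] > 0)                 # synthetic target driven by the first feature
x = as(m, "RsparseMatrix")

# three epochs in a single call
model_a = FTRL$new(learning_rate = 0.1)
model_a$fit(x, y, n_iter = 3L)

# the same number of passes written out with partial_fit()
model_b = FTRL$new(learning_rate = 0.1)
for (epoch in 1:3) {
  model_b$partial_fit(x, y)
}
```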

Method predict()

makes predictions based on fitted model

Usage

FTRL$predict(x, ...)

Arguments

x

input matrix. Should have the same number of columns (features) as the data used for fitting.

...

not used at the moment
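The sketch below scores held-out rows and turns the predictions into class labels. It assumes that `predict()` returns the probability of class 1, as is usual for logistic regression. The train/test split and the 0.5 threshold are illustrative choices.

```r
library(rsparse)
library(Matrix)

set.seed(7)
m = rsparsematrix(300, 25, density = 0.2)  # random sparse matrix (Matrix package)
y = as.numeric(m[, 1] > 0)                 # synthetic target driven by the first feature

train = 1:200
test = 201:300

model = FTRL$new(learning_rate = 0.1)
model$fit(as(m[train, , drop = FALSE], "RsparseMatrix"), y[train], n_iter = 5L)

# predictions for unseen rows; the matrix has the same 25 columns as the training data
p_test = model$predict(as(m[test, , drop = FALSE], "RsparseMatrix"))

# threshold at 0.5 to get hard labels, then compute the share of correct labels
label_test = as.integer(p_test > 0.5)
mean(label_test == y[test])
```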

Method coef()

returns coefficients of the regression model

Usage

FTRL$coef()
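One common use of `coef()` is to check how sparse the fitted model is. The sketch below compares the number of non-zero coefficients with and without an L1 penalty. The data and the value `lambda = 5` are hypothetical, and the resulting counts depend on the data.

```r
library(rsparse)
library(Matrix)

set.seed(3)
m = rsparsematrix(500, 40, density = 0.2)  # random sparse matrix (Matrix package)
y = as.numeric(m[, 1] > 0)                 # synthetic target driven by the first feature
x = as(m, "RsparseMatrix")

# no regularization
model_none = FTRL$new(lambda = 0)
model_none$fit(x, y, n_iter = 3L)

# pure L1 penalty, which tends to push coefficients to exactly zero
model_l1 = FTRL$new(lambda = 5, l1_ratio = 1)
model_l1$fit(x, y, n_iter = 3L)

w_none = model_none$coef()
w_l1 = model_l1$coef()

# number of non-zero coefficients in each model
c(no_penalty = sum(w_none != 0), l1_penalty = sum(w_l1 != 0))
```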

Method clone()

The objects of this class are cloneable with this method.

Usage

FTRL$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# NOT RUN {
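library(rsparse)
library(Matrix)
set.seed(1)  # make the random example reproducible
# random (row, column) positions for 100,000 entries of a 1000 x 1000 matrix
i = sample(1000, 1000 * 100, TRUE)
j = sample(1000, 1000 * 100, TRUE)
# binary targets, one per row
y = sample(c(0, 1), 1000, TRUE)
# entry values drawn from {-1, 1}
x = sample(c(-1, 1), 1000 * 100, TRUE)
# make the odd columns 1..99 informative: set them to 1 for positive samples
odd = seq(1, 99, 2)
x[i %in% which(y == 1) & j %in% odd] = 1
m = sparseMatrix(i = i, j = j, x = x, dims = c(1000, 1000), giveCsparse = FALSE)
x = as(m, "RsparseMatrix")

ftrl = FTRL$new(learning_rate = 0.01, learning_rate_decay = 0.1,
                lambda = 10, l1_ratio = 1, dropout = 0)
ftrl$partial_fit(x, y)

# inspect the coefficients and count the non-zero ones
w = ftrl$coef()
head(w)
sum(w != 0)
# predictions for the training data
p = ftrl$predict(m)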
# }
