Method new()
Create a new WeightedERMDP.CMS object.
Usage
WeightedERMDP.CMS$new(
mapXy,
loss,
regularizer,
eps,
gamma,
perturbation.method = "objective",
c = NULL,
mapXy.gr = NULL,
loss.gr = NULL,
regularizer.gr = NULL
)
Arguments
mapXy
Map function of the form mapXy(X, coeff) mapping input data matrix X and
coefficient vector or matrix coeff to output labels y. Should return a
column matrix of predicted labels for each row of X. See mapXy.sigmoid for
an example.
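The expected shape of a map function can be sketched in base R as follows. The names map_linear and map_sigmoid are hypothetical and are not part of the package; the sketch only assumes the mapXy(X, coeff) form described above.

```r
# Hypothetical linear map: one predicted value per row of X.
map_linear <- function(X, coeff) {
  X %*% as.matrix(coeff)  # (n x p) %*% (p x 1) gives an n x 1 column matrix
}

# Hypothetical sigmoid map: squashes the linear predictor into (0, 1).
map_sigmoid <- function(X, coeff) {
  1 / (1 + exp(-(X %*% as.matrix(coeff))))
}

X <- matrix(c(1, 2,
              3, 4,
              5, 6), ncol = 2, byrow = TRUE)
coeff <- c(0.5, -0.25)

y_lin <- map_linear(X, coeff)   # column matrix with entries 0, 0.5, 1
y_sig <- map_sigmoid(X, coeff)  # same shape, values strictly between 0 and 1
```

Both functions accept coeff as either a numeric vector or a one-column matrix, which matches the "vector or matrix" wording of the argument description.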
loss
Loss function of the form loss(y.hat, y, w), where y.hat and y are matrices
and w is a matrix or vector of weights of the same length as y. Should
return a matrix of weighted loss values, one for each pair of corresponding
elements of y.hat and y. If w is not given, the function should operate as
if uniform weights were given. See generate.loss.huber for an example. The
loss must be convex and differentiable, and the absolute value of its first
derivative must be at most 1. Additionally, if the objective perturbation
method is chosen, the loss must be twice differentiable and the absolute
value of its second derivative must be bounded above by a constant c for
all possible values of y.hat and y.
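As an illustration of these requirements, here is a minimal sketch of a weighted loss in base R. It uses the log-cosh loss, chosen for this example only (it is not the package's Huber loss): its first derivative is tanh, bounded by 1 in absolute value, and its second derivative is 1 - tanh^2, bounded by 1. The name loss_logcosh is hypothetical.

```r
# Hypothetical weighted log-cosh loss with the loss(y.hat, y, w) form.
loss_logcosh <- function(y.hat, y, w = NULL) {
  y.hat <- as.matrix(y.hat)
  y <- as.matrix(y)
  if (is.null(w)) w <- rep(1, length(y))  # uniform weights when w is omitted
  r <- abs(y.hat - y)
  # log(cosh(r)) rewritten to avoid overflow for large residuals
  w * (r + log1p(exp(-2 * r)) - log(2))
}

y.hat <- matrix(c(0.2, 1.5, -3))
y <- matrix(c(0, 1, 1))
w <- c(1, 0.5, 2)

unweighted <- loss_logcosh(y.hat, y)     # behaves as if all weights were 1
weighted <- loss_logcosh(y.hat, y, w)    # each loss value scaled by its weight

# Numerical check of the derivative bounds on a grid of residuals.
r_grid <- seq(-10, 10, by = 0.01)
max_first <- max(abs(tanh(r_grid)))      # first derivative of log(cosh(r))
max_second <- max(1 - tanh(r_grid)^2)    # second derivative of log(cosh(r))
```

Under these assumptions, c = 1 would be a valid upper bound on the second derivative for this particular loss.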
regularizer
String or regularization function. If a string, must be 'l2', indicating
that l2 regularization is to be used. If a function, must have the form
regularizer(coeff), where coeff is a vector or matrix, and return the value
of the regularizer at coeff. See regularizer.l2 for an example.
Additionally, in order to ensure differential privacy, the function must be
1-strongly convex and differentiable. If the objective perturbation method
is chosen, it must also be twice differentiable.
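The 1-strong convexity requirement can be checked numerically for a candidate regularizer. The sketch below assumes the common half-squared-norm form of l2 regularization (an assumption, not confirmed by this page) and uses the hypothetical names reg_l2 and reg_l2_gr. A function f is 1-strongly convex when f(b) >= f(a) + grad(a) . (b - a) + ||b - a||^2 / 2 for all a and b.

```r
# Hypothetical l2 regularizer and its gradient.
reg_l2 <- function(coeff) sum(coeff^2) / 2
reg_l2_gr <- function(coeff) as.numeric(coeff)

set.seed(42)
gaps <- replicate(100, {
  a <- rnorm(3)
  b <- rnorm(3)
  # Strong-convexity gap: nonnegative (up to rounding) for a valid regularizer.
  reg_l2(b) - reg_l2(a) - sum(reg_l2_gr(a) * (b - a)) - sum((b - a)^2) / 2
})
min_gap <- min(gaps)
```

For the half-squared norm the gap is zero up to floating-point rounding, so this regularizer sits exactly at the boundary of the condition.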
eps
Positive real number defining the epsilon privacy budget. If set
to Inf, runs algorithm without differential privacy.
gamma
Nonnegative real number representing the regularization
constant.
perturbation.method
String indicating whether to use the 'output' or the 'objective'
perturbation method (Chaudhuri et al., 2011). Defaults to 'objective'.
Currently, only the output perturbation method is supported.
c
Positive real number denoting the upper bound on the absolute
value of the second derivative of the loss function, as required to
ensure differential privacy for the objective perturbation method. This
input is unnecessary if perturbation.method is 'output', but is required
if perturbation.method is 'objective'. Defaults to NULL.
mapXy.gr
Optional function representing the gradient of the map function with
respect to the values in coeff. If given, must be of the form
mapXy.gr(X, coeff), where X is a matrix and coeff is a matrix or numeric
vector. Should be defined such that the ith row of the output represents
the gradient with respect to the ith coefficient. See mapXy.gr.sigmoid for
an example. If not given, non-gradient based optimization methods are used
to compute the coefficient values in fitting the model.
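The row convention for the map gradient can be sketched as follows for a sigmoid map. The function names are hypothetical; the output is a p x n matrix whose ith row holds the derivative of each prediction with respect to the ith coefficient, and a central finite difference is used to sanity-check it.

```r
# Hypothetical sigmoid map and its gradient with respect to coeff.
map_sigmoid <- function(X, coeff) 1 / (1 + exp(-(X %*% as.matrix(coeff))))

map_sigmoid_gr <- function(X, coeff) {
  s <- as.numeric(map_sigmoid(X, coeff))  # length-n vector of predictions
  t(X * (s * (1 - s)))                    # p x n: row i is d y.hat / d coeff[i]
}

X <- matrix(c(0.1, -0.4, 0.7,
              0.3, 0.9, -0.2), ncol = 2)  # 3 observations, 2 features
coeff <- c(0.5, -1)

# Central finite differences, one coefficient at a time.
h <- 1e-6
fd <- t(sapply(seq_along(coeff), function(i) {
  e <- replace(numeric(length(coeff)), i, h)
  as.numeric(map_sigmoid(X, coeff + e) - map_sigmoid(X, coeff - e)) / (2 * h)
}))

gr <- map_sigmoid_gr(X, coeff)
max_err <- max(abs(fd - gr))  # should be tiny if the gradient is correct
```

For a purely linear map X %*% coeff, the same convention gives simply t(X).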
loss.gr
Optional function representing the gradient of the loss function with
respect to y.hat and of the form loss.gr(y.hat, y, w), where y.hat and y
are matrices and w is a matrix or vector of weights. Should be defined such
that the ith row of the output represents the gradient of the (possibly
weighted) loss function at the ith set of input values. See
generate.loss.gr.huber for an example. If not given, non-gradient based
optimization methods are used to compute the coefficient values in fitting
the model.
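A loss gradient in this form might look like the sketch below, which pairs a log-cosh loss (an illustrative choice, not the package's Huber loss) with its derivative with respect to y.hat. Both function names are hypothetical, and the gradient is compared against a central finite difference.

```r
# Hypothetical weighted log-cosh loss and its gradient with respect to y.hat.
loss_logcosh <- function(y.hat, y, w = NULL) {
  y.hat <- as.matrix(y.hat)
  y <- as.matrix(y)
  if (is.null(w)) w <- rep(1, length(y))
  r <- abs(y.hat - y)
  w * (r + log1p(exp(-2 * r)) - log(2))
}

loss_logcosh_gr <- function(y.hat, y, w = NULL) {
  y.hat <- as.matrix(y.hat)
  y <- as.matrix(y)
  if (is.null(w)) w <- rep(1, length(y))
  w * tanh(y.hat - y)  # row i: weighted derivative at the ith observation
}

y.hat <- matrix(c(0.2, 1.5, -3))
y <- matrix(c(0, 1, 1))
w <- c(1, 0.5, 2)

h <- 1e-6
fd <- (loss_logcosh(y.hat + h, y, w) - loss_logcosh(y.hat - h, y, w)) / (2 * h)
gr <- loss_logcosh_gr(y.hat, y, w)
max_err <- max(abs(fd - gr))
```

Without weights, every entry of the gradient lies in [-1, 1], in line with the bound on the first derivative required of the loss.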
regularizer.gr
Optional function representing the gradient of the regularization function
with respect to coeff and of the form regularizer.gr(coeff). Should return
a vector. See regularizer.gr.l2 for an example. If regularizer is given as
a string, this value is ignored. If not given and regularizer is a
function, non-gradient based optimization methods are used to compute the
coefficient values in fitting the model.
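For a function-valued regularizer, the gradient can be supplied alongside it. The sketch below uses a made-up regularizer (half squared norm plus a smooth log-cosh term, which keeps it 1-strongly convex and twice differentiable) purely to show the regularizer(coeff) and regularizer.gr(coeff) forms; neither function comes from the package.

```r
# Hypothetical custom regularizer: l2 term plus a smooth convex penalty.
reg_custom <- function(coeff) sum(coeff^2) / 2 + sum(log(cosh(coeff)))

# Its gradient, returned as a plain numeric vector.
reg_custom_gr <- function(coeff) as.numeric(coeff + tanh(coeff))

coeff <- c(0.3, -1.2, 0.8)

# Central finite-difference check of the gradient.
h <- 1e-6
fd <- sapply(seq_along(coeff), function(i) {
  e <- replace(numeric(length(coeff)), i, h)
  (reg_custom(coeff + e) - reg_custom(coeff - e)) / (2 * h)
})

gr <- reg_custom_gr(coeff)
max_err <- max(abs(fd - gr))
```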
Returns
A new WeightedERMDP.CMS object.
Method fit()
Fit the differentially private weighted empirical risk minimization model.
This method runs either the output perturbation or the objective
perturbation algorithm (Chaudhuri et al., 2011), depending on the value of
perturbation.method used to construct the object, to generate an objective
function; only output perturbation is currently implemented. A numerical
optimization method is then run to find optimal coefficients for fitting
the model given the training data, weights, and hyperparameters. The
built-in optim function with the "BFGS" optimization method is used. If
mapXy.gr, loss.gr, and regularizer.gr were all given when the object was
constructed, optim also uses the gradient of the objective function.
Otherwise, non-gradient based optimization methods are used. The resulting
privacy-preserving coefficients are stored in coeff.
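The optimization step can be sketched in base R as follows. This is a non-private illustration only: no noise is added, so it does not provide differential privacy, and the exact form of the objective (mean weighted loss plus gamma times the regularizer) is an assumption rather than the package's internal definition. The component functions reuse the argument names from the constructor, with illustrative bodies.

```r
set.seed(1)
n <- 100
X <- cbind(runif(n, -1, 1), runif(n, -1, 1))
y <- X %*% c(0.8, -0.5) + rnorm(n, sd = 0.05)
w <- rep(1, n)   # uniform observation weights
gamma <- 0.01    # regularization constant

# Illustrative components (assumed forms, not taken from the package).
mapXy <- function(X, coeff) X %*% as.matrix(coeff)
mapXy.gr <- function(X, coeff) t(X)
loss <- function(y.hat, y, w) {
  r <- abs(y.hat - y)
  w * (r + log1p(exp(-2 * r)) - log(2))  # weighted log-cosh loss
}
loss.gr <- function(y.hat, y, w) w * tanh(y.hat - y)
regularizer <- function(coeff) sum(coeff^2) / 2
regularizer.gr <- function(coeff) coeff

# Assumed objective: mean weighted loss plus gamma times the regularizer.
objective <- function(coeff) {
  mean(loss(mapXy(X, coeff), y, w)) + gamma * regularizer(coeff)
}

# Chain rule: (p x n map gradient) %*% (n x 1 loss gradient), averaged over n.
objective.gr <- function(coeff) {
  as.numeric(mapXy.gr(X, coeff) %*% loss.gr(mapXy(X, coeff), y, w)) / n +
    gamma * regularizer.gr(coeff)
}

start <- numeric(ncol(X))
fit_gr <- optim(start, objective, gr = objective.gr, method = "BFGS")
fit_fd <- optim(start, objective, method = "BFGS")  # no gradient supplied
```

When gr is left out, base R's optim approximates the gradient by finite differences, so the two fits should agree closely on a smooth problem like this one; supplying the analytic gradient mainly saves function evaluations.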
Usage
WeightedERMDP.CMS$fit(
X,
y,
upper.bounds,
lower.bounds,
add.bias = FALSE,
weights = NULL,
weights.upper.bound = NULL
)
Arguments
X
Dataframe of data to be fit.
y
Vector or matrix of true labels for each row of X.
upper.bounds
Numeric vector of length ncol(X) giving upper bounds on the values in each
column of X. The ncol(X) values are assumed to be in the same order as the
corresponding columns of X. Any value in the columns of X larger than the
corresponding upper bound is clipped at the bound.
lower.bounds
Numeric vector of length ncol(X) giving lower bounds on the values in each
column of X. The ncol(X) values are assumed to be in the same order as the
corresponding columns of X. Any value in the columns of X smaller than the
corresponding lower bound is clipped at the bound.
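The clipping described for upper.bounds and lower.bounds can be illustrated with a short base R sketch. The helper clip_columns is hypothetical and only mirrors the documented behavior; it is not the package's internal routine.

```r
# Hypothetical helper: clip each column of a data frame to [lower, upper].
clip_columns <- function(X, upper.bounds, lower.bounds) {
  stopifnot(length(upper.bounds) == ncol(X),
            length(lower.bounds) == ncol(X))
  for (j in seq_len(ncol(X))) {
    # Raise values below the lower bound, then cap values above the upper bound.
    X[[j]] <- pmin(pmax(X[[j]], lower.bounds[j]), upper.bounds[j])
  }
  X
}

X <- data.frame(x1 = c(-2, 0.5, 3), x2 = c(0.1, -5, 0.9))
clipped <- clip_columns(X, upper.bounds = c(1, 1), lower.bounds = c(-1, -1))
# x1 becomes -1, 0.5, 1 and x2 becomes 0.1, -1, 0.9
```

Values already inside the bounds are left unchanged, and the bounds are matched to columns by position, as the argument descriptions state.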
add.bias
Boolean indicating whether to add a bias term to X. Defaults to FALSE.
weights
Numeric vector of observation weights of the same length as y.
weights.upper.bound
Numeric value representing the global or public
upper bound on the weights.
Method predict()
Predict label(s) for given X using the fitted coefficients.
Usage
WeightedERMDP.CMS$predict(X, add.bias = FALSE)
Arguments
X
Dataframe of data on which to make predictions. Must be of the same form as
the X used to fit the coefficients.
add.bias
Boolean indicating whether to add a bias term to X. Defaults to FALSE. If
add.bias was set to TRUE when fitting the coefficients, add.bias should
also be set to TRUE for predictions.
Returns
Matrix of predicted labels corresponding to each row of X.
Method clone()
The objects of this class are cloneable with this method.
Usage
WeightedERMDP.CMS$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.