
SSDL (version 1.1)

CDL: Compressive Dictionary Learning

Description

Implementation of the Sketched Stochastic Dictionary Learning (SSDL) method, which learns a dictionary from a large-scale matrix-like dataset by operating on a compressed version of the data (the data sketch).

Usage

CDL(
  Data,
  K,
  SK_Data = NULL,
  Frequencies = NULL,
  D = NULL,
  pos.dic = TRUE,
  learn_rate = 0.1,
  alpha = 0.9,
  gamma = 0,
  maxEpoch = 5,
  batch_size,
  lambda = 0,
  ncores = 1,
  typeOptim = "Nesterov",
  DIR_tmp = tempdir(),
  grad_t_1 = NULL,
  verbose = 0,
  m = nrow(Frequencies),
  ...
)

Arguments

Data

is a Filebacked Big Matrix (FBM) of size \(s \times N\) with data vectors stored in the matrix columns.

K

is the dictionary size, i.e. the number of dictionary elements (atoms).

SK_Data

is a precomputed data sketch: a \(2m\)-dimensional vector whose first \(m\) coordinates are the real parts and whose last \(m\) coordinates are the imaginary parts of the complex sketch. If it is NULL, the sketch is computed using the Sketch function of the chickn package.

Frequencies

is a frequency matrix \(m \times s\) with frequency vectors in the matrix rows. If NULL, the frequencies are generated using the GenerateFrequencies function of the chickn package.

D

is an initial dictionary. If it is NULL, the dictionary is initialized by randomly selecting K signals from Data, and it is saved in the DIR_tmp directory.

pos.dic

indicates whether the dictionary is positive (default) or not.

learn_rate

is the learning rate. The default value is 0.1.

alpha

is the momentum weight.

gamma

is the learning rate decay parameter. The default value is 0, which corresponds to a constant learning rate.

maxEpoch

is the number of epochs.

batch_size

is the batch size.

lambda

is the regularization parameter, which controls the sparsity of the code matrix.

ncores

is the number of cores used for parallel computation. The default value is 1.

typeOptim

is the type of optimization scheme used in the dictionary update step. Possible values are c('Nesterov', 'Momentum'); the 'Nesterov' scheme is recommended.

DIR_tmp

is a directory where the initial dictionary and intermediate results are saved.

grad_t_1

is an initial momentum matrix. By default it is NULL and is initialized as a zero matrix.

verbose

controls how much output is shown and saved during the optimization process. Possible values:

  • 0: no output (default value)

  • 1: show the iteration number and the objective function value

  • 2: as 1, plus save the dictionary and the momentum matrix at the end of each epoch.

m

is the number of frequency vectors.

...

are additional parameters passed to the GenerateFrequencies function.

Value

a list with the following components:

  • D is the learned dictionary,

  • objFunProcess contains the objective function values computed at the end of each iteration,

  • learning_rate contains the learning rate values used at each iteration.

Details

CDL builds a dictionary by alternating two steps: computing the code matrix, which contains the weights of the dictionary elements, and updating the dictionary. For computational efficiency, the code matrix is computed only for a randomly selected subset of data vectors \(x_1, \dots, x_n\) (a batch). The code matrix is obtained as the solution of the following optimization problem: \(\min\limits_{A \in R^{+}_{K\times n}} \sum_{i=1}^{n} \|x_i - D\cdot \alpha_i\|^2 + \lambda \cdot\|\alpha_i\|_1\), where \(n\) denotes the batch size, \(A = \{\alpha_1,\dots, \alpha_{n}\}\) is the code matrix and \(\lambda\) is a regularization parameter that defines the sparsity level of the code.
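To make the code step concrete, here is an illustrative base-R sketch (not the package's internal solver) that solves the nonnegative \(\ell_1\)-regularized least-squares problem above for a single data vector via projected proximal gradient (ISTA); the function name `nonneg_lasso_code` is hypothetical:

```r
# Solve min_{a >= 0} ||x - D a||^2 + lambda * ||a||_1 for one vector x (ISTA sketch)
nonneg_lasso_code <- function(x, D, lambda = 0, n_iter = 500) {
  K <- ncol(D)
  a <- rep(0, K)
  # Step size 1/L, where L = 2 * lambda_max(D'D) is the gradient's Lipschitz constant
  step <- 1 / (2 * max(eigen(crossprod(D), only.values = TRUE)$values))
  for (t in seq_len(n_iter)) {
    grad <- 2 * crossprod(D, D %*% a - x)     # gradient of the quadratic term
    a <- pmax(a - step * (grad + lambda), 0)  # prox of lambda*||.||_1 + projection on a >= 0
  }
  a
}

set.seed(1)
D <- matrix(abs(rnorm(10 * 5)), nrow = 10, ncol = 5)  # toy positive dictionary, atoms in columns
x <- D %*% c(1, 0, 0.5, 0, 0)                          # signal built from two atoms
a_hat <- nonneg_lasso_code(x, D, lambda = 0)
```

With \(\lambda = 0\) and an exactly representable signal, the recovered code reconstructs \(x\) closely; increasing \(\lambda\) drives more coefficients to zero.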

The dictionary is updated by taking one step along the gradient of the objective function \(F(D, A) = \|SK(Data) - SK(A\cdot D)\|^2\). Two gradient descent update rules are available: Nesterov accelerated and momentum.
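The two update rules differ only in where the gradient is evaluated. The following toy illustration (not the package code) applies both to a simple quadratic objective in place of the sketch objective:

```r
# Momentum vs. Nesterov updates on a toy quadratic f(D) = ||D - Target||^2
Target <- matrix(1, 2, 2)
grad_f <- function(D) 2 * (D - Target)

learn_rate <- 0.1; alpha <- 0.9; n_iter <- 400

# Momentum: gradient at the current point, plus inertia from the previous step
D_m <- matrix(0, 2, 2); v <- matrix(0, 2, 2)
for (t in seq_len(n_iter)) {
  v   <- alpha * v - learn_rate * grad_f(D_m)
  D_m <- D_m + v
}

# Nesterov: gradient at the look-ahead point D + alpha * v
D_n <- matrix(0, 2, 2); v <- matrix(0, 2, 2)
for (t in seq_len(n_iter)) {
  v   <- alpha * v - learn_rate * grad_f(D_n + alpha * v)
  D_n <- D_n + v
}
```

Both schemes converge to the minimizer here; the Nesterov look-ahead typically damps oscillations, which is why it is the suggested choice for typeOptim.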

\(SK(\cdot)\) is a sketch operator, which compresses a matrix into a fixed-size complex vector referred to as a data sketch. It was introduced in Keriven et al. (2016) and is defined as \(SK(Data) = \frac{1}{N}\sum_{i=1}^N \exp{\left(-\mathrm{i} \cdot W\cdot x_i\right)}\), where \(W\) is a frequency matrix, \(x_1, \dots, x_N\) are data vectors and \(\mathrm{i}\) is the imaginary unit. The data compression is performed using routines from the chickn package.
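A direct, in-memory version of this operator can be written in a few lines of base R (for illustration only; the package delegates to chickn routines that handle file-backed matrices, and the helper name `sketch_op` is an assumption):

```r
# Dense sketch operator: SK(Data) = mean over columns of exp(-i * W %*% x)
sketch_op <- function(Data, W) {
  # Data: s x N matrix, data vectors in columns; W: m x s frequency matrix
  Z <- exp(-1i * W %*% Data)  # m x N matrix of complex exponentials
  z <- rowMeans(Z)            # average over the N data vectors
  c(Re(z), Im(z))             # 2m-dimensional sketch: real parts first, then imaginary parts
}

set.seed(2)
Data <- matrix(abs(rnorm(4 * 20)), nrow = 4)  # s = 4, N = 20
W <- matrix(rnorm(3 * 4), nrow = 3)           # m = 3 frequency vectors
sk <- sketch_op(Data, W)
length(sk)  # 2 * m = 6
```

Each complex exponential has modulus 1, so every coordinate of the sketch lies in \([-1, 1]\) regardless of the data scale.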

CDL also supports a decaying learning rate, i.e. \(\mathtt{learn\_rate}^t = \frac{\mathtt{learn\_rate}}{1+ (t-1)\cdot \mathtt{gamma}}\), where \(t\) is the iteration number.
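The schedule is a one-liner in R (the helper name `decayed_rate` is hypothetical, shown only to make the formula concrete):

```r
# learn_rate at iteration t: learn_rate / (1 + (t - 1) * gamma)
decayed_rate <- function(learn_rate, gamma, t) learn_rate / (1 + (t - 1) * gamma)

r_const <- decayed_rate(0.1, 0,   1:3)  # gamma = 0: constant rate 0.1 at every iteration
r_decay <- decayed_rate(0.1, 0.5, 1:3)  # gamma = 0.5: 0.1, 0.1/1.5, 0.1/2
```

Setting gamma = 0 (the default) recovers the constant learning rate described above.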

References

  • Permiakova et al. (2021), SSDL.

  • Permiakova et al. (2021), chickn.

See Also

Gradient_D_cpp_parallel, chickn, chickn::Sketch, chickn::GenerateFrequencies

Examples

# NOT RUN {
X = matrix(abs(rnorm(n = 1000)), ncol = 100, nrow = 10)  # 10 x 100 toy data matrix
X_fbm = bigstatsr::as_FBM(X)$save()                      # convert to a Filebacked Big Matrix
# Generate the frequency matrix and compute the data sketch
W = chickn::GenerateFrequencies(Data = X_fbm, m = 64, N0 = ncol(X_fbm),
                                ncores = 1, niter = 3, nblocks = 2, sigma_start = 0.001)$W
SK = chickn::Sketch(X_fbm, W)
# Learn a positive dictionary with 10 atoms using the Nesterov scheme
D = CDL(Data = X_fbm, K = 10, SK_Data = SK, Frequencies = W,
        D = NULL, pos.dic = TRUE, maxEpoch = 3, batch_size = 50,
        lambda = 0, learn_rate = 0.1, alpha = 0.9,
        gamma = 0, ncores = 2, DIR_tmp = tempdir(),
        verbose = 0, typeOptim = "Nesterov")$D
# }
