lba: Latent Budget Analysis (LBA) for Compositional Data

Description

Latent budget analysis (LBA) is a method for analyzing contingency tables, from which the compositional data are derived. It is used to understand the relationship between the rows and columns of the table, where the rows denote the categories of the explanatory variable and the columns denote the categories of the response variable.
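
The quantities involved can be sketched in base R without calling lba. The table `tab` and the matrices `A` and `B` below are made-up illustrations, not package output. The observed budgets are the rows of the table divided by their row totals, and a latent budget model with K budgets approximates them by a mixture, with mixing parameters in an (I x K) matrix and latent components in a (J x K) matrix.

```r
# Hypothetical 3 x 4 contingency table (rows: explanatory, columns: response)
tab <- matrix(c(20, 10,  5, 15,
                 8, 12, 25,  5,
                10, 10, 10, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(group = c("g1", "g2", "g3"),
                              resp  = c("r1", "r2", "r3", "r4")))

# Observed budgets: each row divided by its row total
P <- prop.table(tab, margin = 1)
rowSums(P)  # every row sums to 1

# Illustrative parameters for K = 2 (made up, not estimated)
A <- matrix(c(0.9, 0.1,
              0.2, 0.8,
              0.5, 0.5), nrow = 3, byrow = TRUE)   # I x K, rows sum to 1
B <- matrix(c(0.4, 0.1,
              0.3, 0.2,
              0.1, 0.6,
              0.2, 0.1), nrow = 4, byrow = TRUE)   # J x K, columns sum to 1

# Expected budgets: each row is a mixture of the K latent budgets
pij <- A %*% t(B)
rowSums(pij)  # each expected budget also sums to 1
```

Because the rows of `A` and the columns of `B` each sum to 1, every expected budget is itself a valid composition.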

Usage

lba(obj, ...)

## S3 method for class 'matrix':
lba(obj,
    A           = NULL,
    B           = NULL,
    K           = 1L,
    cA          = NULL,
    cB          = NULL,
    logitA      = NULL,
    logitB      = NULL,
    omsk        = NULL,
    psitk       = NULL,
    S           = NULL,
    T           = NULL,
    row.weights = NULL,
    col.weights = NULL,
    tolG        = 1e-10,
    tolA        = 1e-05,
    tolB        = 1e-05,
    itmax.unide = 1e3,
    itmax.ide   = 1e3,
    trace.lba   = TRUE,
    toltype     = "all",
    method      = c("ls", "mle"),
    what        = c("inner","outer"), ...)

## S3 method for class 'table':
lba(obj,
    A           = NULL,
    B           = NULL,
    K           = 1L,
    cA          = NULL,
    cB          = NULL,
    logitA      = NULL,
    logitB      = NULL,
    omsk        = NULL,
    psitk       = NULL,
    S           = NULL,
    T           = NULL,
    row.weights = NULL,
    col.weights = NULL,
    tolG        = 1e-10,
    tolA        = 1e-05,
    tolB        = 1e-05,
    itmax.unide = 1e3,
    itmax.ide   = 1e3,
    trace.lba   = TRUE,
    toltype     = "all",
    method      = c("ls", "mle"),
    what        = c("inner","outer"), ...)

## S3 method for class 'formula':
lba(formula, data,
    A           = NULL,
    B           = NULL,
    K           = 1L,
    cA          = NULL,
    cB          = NULL,
    logitA      = NULL,
    logitB      = NULL,
    omsk        = NULL,
    psitk       = NULL,
    S           = NULL,
    T           = NULL,
    row.weights = NULL,
    col.weights = NULL,
    tolG        = 1e-10,
    tolA        = 1e-05,
    tolB        = 1e-05,
    itmax.unide = 1e3,
    itmax.ide   = 1e3,
    trace.lba   = TRUE,
    toltype     = "all",
    method      = c("ls", "mle"),
    what        = c("inner","outer"), ...)

## S3 method for class 'ls':
lba(obj,
    A           ,
    B           ,
    K           ,
    row.weights ,
    col.weights ,
    tolA        ,
    tolB        ,
    itmax.unide ,
    itmax.ide   ,
    trace.lba   ,
    what        , ...)

## S3 method for class 'mle':
lba(obj,
    A           ,
    B           ,
    K           ,
    tolG        ,
    tolA        ,
    tolB        ,
    itmax.unide ,
    itmax.ide   ,
    trace.lba   ,
    toltype     ,
    what        , ...)

## S3 method for class 'ls.fe':
lba(obj,
    A           ,
    B           ,
    K           ,
    cA          ,
    cB          ,
    row.weights ,
    col.weights ,
    itmax.ide   ,
    trace.lba   , ...)

## S3 method for class 'mle.fe':
lba(obj,
    A          ,
    B          ,
    K          ,
    cA         ,
    cB         ,
    tolG       ,
    tolA       ,
    tolB       ,
    itmax.ide  ,
    trace.lba  ,
    toltype    , ...)

## S3 method for class 'ls.logit':
lba(obj,
    A           ,
    B           ,
    K           ,
    cA          ,
    cB          ,
    logitA      ,
    logitB      ,
    omsk        ,
    psitk       ,
    S           ,
    T           ,
    row.weights ,
    col.weights ,
    itmax.ide   ,
    trace.lba   , ...)

## S3 method for class 'mle.logit':
lba(obj,
    A          ,
    B          ,
    K          ,
    cA         ,
    cB         ,
    logitA     ,
    logitB     ,
    omsk       ,
    psitk      ,
    S          ,
    T          ,
    itmax.ide  ,
    trace.lba  , ...)

Arguments

obj, formula

The function is generic and accepts several forms of the principal argument for specifying a two-way frequency table. Currently accepted forms are a matrix, a data frame (coerced to a frequency table), an object of class "xtabs" or "table", or a formula (used together with data).

data

A data frame containing variables in formula.

A

Starting values for the (I x K) matrix of mixing parameters, if given. The default is NULL, producing random starting values.

B

Starting values for the (J x K) matrix of latent components, if given. The default is NULL, producing random starting values.

K

Integer giving the number of latent budgets chosen by the user. The default is 1.

cA

An (I x K) matrix containing the constraints on the mixing parameters. Fixed constraints are the values themselves, which are numbers in the [0,1] interval. The optional equality constraints are indicated by an integer starting from 2, such tha

cB

A (J x K) matrix containing the constraints on the latent components. Fixed constraints are the values themselves, which are numbers in the [0,1] interval. The optional equality constraints are indicated by an integer starting from 2, such tha

logitA

Design (IxS) matrix for row-covariates. The first column contains 1's, indicating a constant covariate. The entries may be continuous or dummy coded values.

logitB

Design (JxT) matrix for column-covariates. The entries may be continuous or dummy coded values.

omsk

A (SxK) matrix giving the starting values for the multinomial logit parameters of the row covariates. The default is NULL, producing random starting values.

psitk

A (TxK) matrix giving the starting values for the multinomial logit parameters of the column covariates. The default is NULL, producing random starting values.
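
How a design matrix and logit parameters could combine into mixing parameters can be sketched in base R, assuming the usual multinomial logit (softmax) form. This is an illustration only, not the package's internal code, and `X`, `omega`, and `softmax_rows` are hypothetical names.

```r
# Hypothetical design matrix for I = 4 rows and S = 2 covariates:
# a constant column of 1's plus one dummy-coded covariate
X <- cbind(const = 1, dummy = c(0, 0, 1, 1))      # I x S

# Hypothetical (S x K) logit parameters for K = 3 latent budgets
omega <- matrix(c(0.0,  0.5, -0.5,
                  0.0, -1.0,  1.0), nrow = 2, byrow = TRUE)

# Row-wise softmax: exponentiate the linear predictor and normalize each row
softmax_rows <- function(eta) {
  e <- exp(eta - apply(eta, 1, max))  # subtract the row maximum for stability
  e / rowSums(e)
}

A <- softmax_rows(X %*% omega)  # I x K mixing parameters
rowSums(A)                      # each row sums to 1
```

Rows that share the same covariate values receive the same mixing parameters. For the latent components the normalization would presumably run over the J categories within each budget, so that each column of the resulting matrix sums to 1.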

S

Number of row-covariates. The default is NULL.

T

Number of column-covariates. The default is NULL.

row.weights

Row weights for weighted least squares method. The default is NULL.

col.weights

Column weights for weighted least squares method. The default is NULL. If both row.weights and col.weights are NULL and "ls" method is chosen, then ordinary least squares is used.

tolG

A tolerance value for judging when convergence has been reached. It is based on the estimated likelihood ratio statistic G2. The default is 1e-10.

tolA

A tolerance value for judging when convergence has been reached. Convergence is reached when the one-iteration change in the estimated matrix A, measured as the maximum absolute element-wise difference, is less than tolA. The default is 1e-05.
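
The stopping rule described for tolA can be sketched in base R as follows. The two matrices are made-up consecutive iterates, and `delta_A` and `converged_A` are hypothetical names, not objects returned by lba.

```r
# Hypothetical estimates of A from two consecutive iterations
A_old <- matrix(c(0.600000, 0.400000,
                  0.300000, 0.700000), nrow = 2, byrow = TRUE)
A_new <- matrix(c(0.600004, 0.399996,
                  0.299998, 0.700002), nrow = 2, byrow = TRUE)

tolA <- 1e-05

# Largest absolute element-wise change between iterations
delta_A <- max(abs(A_new - A_old))

# TRUE here, since the largest change is about 4e-06
converged_A <- delta_A < tolA
```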

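tolB

A tolerance value for judging when convergence has been reached. Convergence is reached when the one-iteration change in the estimated matrix B, measured as the maximum absolute element-wise difference, is less than tolB. The default is 1e-05.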

itmax.unide

Maximum number of iterations performed by the "mle" or "ls" method, if convergence is not achieved earlier, before the parameters are identified. The default is 1e3.

itmax.ide

Maximum number of iterations performed by the "mle" or "ls" method in the identification process. It is also used when fixed, equality, or logit constraints are required. The default is 1e3.

trace.lba

Logical, indicating whether the base function optim and the function constrOptim.nl from package alabama will trace their results. The default is TRUE.

toltype

String indicating which kind of tolerance is used to decide when the EM algorithm stops updating and considers the maximum log-likelihood to have been found. The types are: "all" when the one-iteration change in the estimated likelihood rati

method

String indicating the estimation method: "ls" for least squares, either weighted or ordinary; "mle" for maximum likelihood. The default is "ls".

what

String indicating which identified solution is computed for the mixing parameter and latent budget matrices: the "inner" extreme solution or the "outer" extreme solution. The default is "inner".

...

Potential further arguments (required by generic).

Value

The methods lba.ls and lba.mle return a list of class lba.ls and lba.mle, respectively, with the slots:

P

The compositional data matrix, formed by dividing each row of the raw data matrix by its total; its rows are called observed budgets.

pij

Matrix whose rows are the expected budgets.

residual

Residual matrix P - pij.

A

(I x K) matrix of the unidentified mixing parameters.

B

(J x K) matrix of the unidentified latent components.

Aoi

(I x K) matrix of the identified mixing parameters; they may be either the inner extreme values or the outer extreme values.

Boi

(J x K) matrix of the identified latent components; they may be either the inner extreme values or the outer extreme values.

rescB

(J x K) matrix of the rescaled latent components.

pk

Budget proportions.

val_func

Value of the least squares or likelihood function achieved.

iter_unide

Number of unidentified iterations.

iter_ide

Number of identified iterations.

The methods lba.ls.fe and lba.mle.fe return a list of class lba.ls.fe and lba.mle.fe, respectively, with the slots:

P

The compositional data matrix, formed by dividing each row of the raw data matrix by its total; its rows are called observed budgets.

pij

Matrix whose rows are the expected budgets.

residual

Residual matrix P - pij.

A

(I x K) matrix of the unidentified mixing parameters.

B

(J x K) matrix of the unidentified latent components.

pk

Budget proportions.

val_func

Value of the least squares or likelihood function achieved.

iter_ide

Number of identified iterations.

The methods lba.ls.logit and lba.mle.logit return a list of class lba.ls.logit and lba.mle.logit, respectively, with the slots:

P

The compositional data matrix, formed by dividing each row of the raw data matrix by its total; its rows are called observed budgets.

pij

Matrix whose rows are the expected budgets.

residual

Residual matrix P - pij.

A

(I x K) matrix of the unidentified mixing parameters.

B

(J x K) matrix of the unidentified latent components.

pk

Budget proportions.

val_func

Value of the least squares or likelihood function achieved.

iter_ide

Number of identified iterations.

omsk

An (SxK) matrix giving the estimated values of the multinomial logit parameters of the row covariates.

psitk

A (TxK) matrix giving the estimated values of the multinomial logit parameters of the column covariates.
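
How several of these slots relate to one another can be sketched in base R. The counts and the matrices `A` and `B` are made up and merely stand in for fitted values. The sketch also assumes that the budget proportions are the averages of the mixing parameters weighted by each row's share of the total, as is usual in the LBA literature; `row_prop` is a hypothetical helper name.

```r
# Hypothetical counts, and made-up A and B standing in for fitted values
tab <- matrix(c(30, 10, 10,
                10, 20, 20), nrow = 2, byrow = TRUE)
P <- tab / rowSums(tab)                   # observed budgets
A <- matrix(c(0.8, 0.2,
              0.3, 0.7), nrow = 2, byrow = TRUE)   # I x K
B <- matrix(c(0.7, 0.1,
              0.2, 0.4,
              0.1, 0.5), nrow = 3, byrow = TRUE)   # J x K

pij      <- A %*% t(B)                    # expected budgets
residual <- P - pij                       # observed minus expected
row_prop <- rowSums(tab) / sum(tab)       # share of each row in the total
pk       <- drop(row_prop %*% A)          # budget proportions, sum to 1
```

Since both the observed and the expected budgets sum to 1 within each row, every row of `residual` sums to zero.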

References

Agresti, Alan. 2002. Categorical Data Analysis, second edition. Hoboken: John Wiley & Sons.

de Leeuw, J., and van der Heijden, P.G.M. 1988. "The analysis of time-budgets with a latent time-budget model". In E. Diday (Ed.), Data Analysis and Informatics V. pp. 159-166. Amsterdam: North-Holland.

de Leeuw, J., van der Heijden, P.G.M., and Verboon, P. 1990. "A latent time budget model". Statistica Neerlandica. 44, 1, 1-21.

Dempster, A.P., Laird, N.M., and Rubin, D.B. 1977. "Maximum likelihood from incomplete data via the EM algorithm". Journal of the Royal Statistical Society, Series B. 39, 1-38.

van der Ark, A.L. 1999. Contributions to Latent Budget Analysis, a tool for the analysis of compositional data. Ph.D. Thesis University of Utrecht.

van der Heijden, P.G.M., Mooijaart, A., and de Leeuw, J. 1992. "Constrained latent budget analysis". In P.V. Marsden (Ed.), Sociological Methodology pp. 279-320. Cambridge: Blackwell Publishers.

Examples

data('votB')

# Using LS method (default) without constraint
# K = 2
ex1 <- lba(city ~ parties,
           votB,
           K = 2)
ex1 

# Already tabulated data? Ok!
data('PerfMark') 

ex2 <- lba(as.matrix(PerfMark),
           K = 2,
           what='outer')
ex2

# Using LS method (default) with constraint
# Fixed constraint to mixing parameters
cakiF1 <- matrix(c(0.2, NA, NA,
                   NA , NA,0.2,
                   NA , NA,0.2,
                   0.3, NA, NA,
                   0.2, NA, NA,
                   NA , NA, NA),
                 byrow = TRUE,
                 ncol  = 3)  

# K = 3
exf1 <- lba(city ~ parties,
            votB,
            cA = cakiF1,
            K = 3)
exf1
