prodestACF: Estimate productivity - Ackerberg-Caves-Frazer correction

Description

The prodestACF() function takes at least six inputs (the output, free, state and proxy variables, plus the panel id and time identifiers) and returns an S3 object of class prod with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and the estimated vector of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.

Usage

prodestACF(Y, fX, sX, pX, idvar, timevar, R = 20, cX = NULL,
            opt = 'optim', theta0 = NULL, cluster = NULL)

Arguments

Y

the vector of value added log output.

fX

the vector/matrix/dataframe of log free variables.

sX

the vector/matrix/dataframe of log state variables.

pX

the vector/matrix/dataframe of log proxy variables.

cX

the vector/matrix/dataframe of control variables. By default cX = NULL.

idvar

the vector/matrix/dataframe identifying individual panels.

timevar

the vector/matrix/dataframe identifying time.

R

the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20.

opt

a string with the optimization algorithm to be used during the estimation. By default opt = 'optim'.

theta0

a vector with the second stage optimization starting points. By default theta0 = NULL and the optimization is run starting from the first stage estimated parameters + \(N(0,0.01)\) noise.

cluster

an object of class "SOCKcluster" or "cluster". By default cluster = NULL.
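The block bootstrap mentioned under R resamples whole panels rather than single observations, so each firm's time series stays intact. The following is a minimal base R sketch of that general idea, not prodest's internal resampling code; the toy data and the helper block_resample() are hypothetical.

```r
# Hypothetical toy panel: 4 firms observed over 3 periods (not the package's data).
set.seed(1)
panel <- data.frame(
  idvar   = rep(1:4, each = 3),
  timevar = rep(1:3, times = 4),
  Y       = rnorm(12)
)

# Hypothetical helper: draw whole panels (firms) with replacement,
# keeping every drawn firm's rows together.
block_resample <- function(data, id) {
  ids  <- unique(data[[id]])
  draw <- sample(ids, size = length(ids), replace = TRUE)
  parts <- lapply(seq_along(draw), function(j) {
    block <- data[data[[id]] == draw[j], , drop = FALSE]
    block$boot_id <- j  # relabel so a firm drawn twice counts as two panels
    block
  })
  do.call(rbind, parts)
}

boot_sample <- block_resample(panel, "idvar")
```

In a bootstrap of this kind, the estimator would be re-run on R such samples and the standard deviation of the resulting coefficient vectors taken as the standard error.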

Value

The output of the function prodestACF is a member of the S3 class prod. More precisely, it is a list (of length 3) containing the following elements:

Model, a list with elements:

method: a string describing the method ('ACF').
boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
elapsed.time: time elapsed during the estimation.
theta0: numeric object with the optimization starting points - second stage.
opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
opt.outcome: optimization outcome.
FSbetas: first stage estimated parameters.

Data, a list with elements:

Y: the vector of value added log output.
free: the vector/matrix/dataframe of log free variables.
state: the vector/matrix/dataframe of log state variables.
proxy: the vector/matrix/dataframe of log proxy variables.
control: the vector/matrix/dataframe of log control variables.
idvar: the vector/matrix/dataframe identifying individual panels.
timevar: the vector/matrix/dataframe identifying time.
FSresiduals: numeric object with the residuals of the first stage.

Estimates, a list with elements:

pars: the vector of estimated coefficients.
std.errors: the vector of bootstrapped standard errors.

Members of class prod have an omega method returning a numeric object with the estimated productivity, that is: \(\omega_{it} = y_{it} - (\alpha + w_{it}\beta + k_{it}\gamma)\). The FSres method returns a numeric object with the residuals of the first stage regression, while the summary, show and coef methods are implemented and work as usual.
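To make the productivity formula concrete, here is a small base R sketch that applies \(\omega_{it} = y_{it} - (\alpha + w_{it}\beta + k_{it}\gamma)\) by hand. All inputs are simulated and the coefficient values are made up for illustration; nothing here comes from a fitted prod object.

```r
# Hypothetical inputs: simulated data and made-up coefficients, for illustration only.
set.seed(42)
n <- 5
w <- cbind(w1 = rnorm(n), w2 = rnorm(n))  # log free variables (n x J, here J = 2)
k <- cbind(k1 = rnorm(n))                 # log state variables (n x K, here K = 1)
alpha <- 0.1
beta  <- c(0.4, 0.3)
gamma <- 0.2
omega_true <- rnorm(n, sd = 0.1)

# Output generated without the idiosyncratic error, so omega is recovered exactly.
y <- alpha + w %*% beta + k %*% gamma + omega_true

# omega_it = y_it - (alpha + w_it beta + k_it gamma)
omega_hat <- as.numeric(y - (alpha + w %*% beta + k %*% gamma))
```

With a fitted object such as ACF.fit from the Examples section, the omega method described above would presumably be called as omega(ACF.fit).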

Details

Consider a Cobb-Douglas production technology for firm \(i\) at time \(t\)

\(y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}\)

where \(y_{it}\) is the (log) output, \(w_{it}\) is a \(1 \times J\) vector of (log) free variables, \(k_{it}\) is a \(1 \times K\) vector of (log) state variables and \(\epsilon_{it}\) is a normally distributed idiosyncratic error term. The unobserved technical efficiency parameter \(\omega_{it}\) evolves according to a first-order Markov process:

\(\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}\)

and \(u_{it}\) is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in \(k_{it}\) and the lagged free variables \(w_{it-1}\). ACF propose an alternative to the Olley-Pakes (OP) and Levinsohn-Petrin (LP) procedures, arguing that in those procedures the labour demand and the control function are partially collinear. The ACF algorithm is based on the following set of assumptions:

a) \(p_{it} = p(k_{it} , l_{it} , \omega_{it})\) is the proxy variable policy function;
b) \(p_{it}\) is strictly monotone in \(\omega_{it}\);
c) \(\omega_{it}\) is the only (scalar) unobservable in \(p_{it} = p(.)\);
d) The state variables are decided at time t-1. The less variable labor input, \(l_{it}\), is chosen at t-b, where \(0 < b < 1\). The free variables, \(w_{it}\), are chosen at time t, when the firm's productivity shock is realized.

Under this set of assumptions, the first stage is meant to remove the shock \(\epsilon_{it}\) from the output, \(y_{it}\). As in the OP/LP case, the inverted policy function replaces the productivity term \(\omega_{it}\) in the production function:

\(y_{it} = k_{it}\gamma + w_{it}\beta + l_{it}\mu + h(p_{it} , k_{it} ,w_{it} , l_{it}) + \epsilon_{it}\)

which is estimated non-parametrically (first stage). Exploiting the Markovian nature of the productivity process, one can use assumption d) to set up the relevant moment conditions and estimate the production function parameters (second stage).
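The two stages can be sketched in base R on simulated data. This is a deliberately simplified illustration and not the prodest implementation: it assumes one free variable, one state variable and no separate labor input \(l_{it}\), approximates the first-stage function with a second-order polynomial, takes \(g(.)\) to be linear, and uses the state variable and the lagged free variable as instruments, following the orthogonality conditions stated above. The simulated coefficients and the helper acf_criterion() are hypothetical.

```r
# Simulated panel (hypothetical): true beta = 0.6 (free), gamma = 0.4 (state).
set.seed(123)
n_firms <- 200
n_t     <- 5

omega <- matrix(0, n_firms, n_t)
omega[, 1] <- rnorm(n_firms, sd = 0.3)
for (s in 2:n_t) omega[, s] <- 0.7 * omega[, s - 1] + rnorm(n_firms, sd = 0.1)

k <- matrix(rnorm(n_firms * n_t), n_firms, n_t)   # state variable
w <- 0.5 * k + omega +
  matrix(rnorm(n_firms * n_t, sd = 0.5), n_firms, n_t)  # free variable, reacts to omega
p <- 0.3 * k + 0.2 * w + omega                    # proxy, strictly monotone in omega
y <- 0.6 * w + 0.4 * k + omega +
  matrix(rnorm(n_firms * n_t, sd = 0.1), n_firms, n_t)

# First stage: polynomial regression of y on (w, k, p) strips out epsilon.
fs_data <- data.frame(y = as.vector(y), w = as.vector(w),
                      k = as.vector(k), p = as.vector(p))
first_stage <- lm(y ~ (w + k + p)^2 + I(w^2) + I(k^2) + I(p^2), data = fs_data)
phi <- matrix(fitted(first_stage), n_firms, n_t)

# Second stage: for a candidate theta, back out omega, regress it on its lag
# to get the innovation, and interact the innovation with the instruments.
acf_criterion <- function(theta, phi, w, k) {
  n_t <- ncol(phi)
  om  <- phi - theta[1] * w - theta[2] * k
  om_now <- as.vector(om[, 2:n_t])
  om_lag <- as.vector(om[, 1:(n_t - 1)])
  xi <- residuals(lm(om_now ~ om_lag))            # innovation u_it under linear g(.)
  Z  <- cbind(as.vector(k[, 2:n_t]),              # current state variable
              as.vector(w[, 1:(n_t - 1)]))        # lagged free variable
  m  <- crossprod(Z, xi) / length(xi)             # sample moment conditions
  sum(m^2)                                        # identity-weighted GMM criterion
}

fit <- optim(c(0.5, 0.5), acf_criterion, phi = phi, w = w, k = k)
fit$par  # estimates of (beta, gamma)
```

The criterion is minimized over the production function parameters only; the first-stage fit is held fixed, which is what allows the second stage to be a low-dimensional optimization problem.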

References

Ackerberg, D., Caves, K. and Frazer, G. (2015). "Identification properties of recent production function estimators." Econometrica, 83(6), 2411-2451.

Examples


# NOT RUN {
    require(prodest)

    ## Chilean data on production. The full version is publicly available at
    ## http://www.ine.cl/canales/chile_estadistico/estadisticas_economicas/industria/
    ## series_estadisticas/series_estadisticas_enia.php

    data(chilean)

    # we fit a model with two free variables (skilled and unskilled), one state (capital)
    # and one proxy variable (electricity)

    ACF.fit <- prodestACF(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                          chilean$pX, chilean$idvar, chilean$timevar,
                          theta0 = c(.5,.5,.5), R = 5)
    
# }
# NOT RUN {
      set.seed(154673)
      ACF.fit.solnp <- prodestACF(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                            chilean$pX, chilean$idvar, chilean$timevar,
                            theta0 = c(.5,.5,.5), opt = 'solnp')

      # run the same regression in parallel
      # makeCluster() and stopCluster() come from the parallel package
      require(parallel)
      # nCores <- as.numeric(Sys.getenv("NUMBER_OF_PROCESSORS")) # Windows systems
      nCores <- 3
      cl <- makeCluster(getOption("cl.cores", nCores - 1))
      set.seed(154673)
      ACF.fit.par <- prodestACF(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                                chilean$pX, chilean$idvar, chilean$timevar,
                                theta0 = c(.5,.5,.5), cluster = cl)
      stopCluster(cl)

      # show results
      coef(ACF.fit)
      coef(ACF.fit.solnp)

      # show results in .tex tabular format
      printProd(list(ACF.fit, ACF.fit.solnp))
    
# }
