sf: Stochastic Frontier Models Using Cross-Sectional Data

Description

sf performs maximum likelihood estimation of the parameters and technical or cost efficiencies in cross-sectional stochastic (production or cost) frontier models, with a half-normal or truncated normal distribution assumed for the inefficiency error component.

Usage

sf(formula, uhet = NULL, vhet = NULL,
 tmean = NULL, prod = TRUE, data, subset, 
 distribution = c("h", "t"), start.val = NULL, 
 alpha = 0.05, marg.eff = FALSE, digits = 4, 
 print.level = 2)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model. The details of model specification are given under 'Details'.

uhet

one-sided formula; e.g. uhet ~ z1 + z2. Specifies exogenous variables entering the expression for variance of inefficiency error component. If NULL, inefficiency term is assumed to be homoskedastic, i.e. $\sigma_u^2 = exp(\gamma[

vhet

one-sided formula; e.g. vhet ~ z1 + z2. Specifies exogenous variables entering the expression for variance of random noise error component. If NULL, random noise component is assumed to be homoskedastic, i.e. $\sigma_v^2 = exp(\g

tmean

one-sided formula; e.g. tmean ~ z1 + z2. Specifies whether the mean of pre-truncated normal distribution of inefficiency term is a linear function of exogenous variables. Used only when distribution = "t". If NULL, m

prod

logical. If TRUE, the estimates of parameters of stochastic production frontier model and of technical efficiencies are returned; if FALSE, the estimates of parameters of stochastic cost frontier model and of cost efficiencies ar

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which sf is called.

subset

an optional vector specifying a subset of observations for which technical or cost efficiencies are to be computed.

distribution

either "h" (half-normal) or "t" (truncated normal), specifying the distribution of inefficiency term.

start.val

numeric. Starting values to be supplied to the optimization routine. If NULL, OLS and method of moments estimates are used (see Kumbhakar and Lovell 2000).

alpha

numeric. Defines (1-$\alpha$)100% two-sided prediction interval for technical or cost efficiencies (see Horrace and Schmidt 1996). Default is 0.05.

marg.eff

logical. If TRUE, unit-specific marginal effects of exogenous variables on the mean of distribution of inefficiency term are returned.

digits

numeric. Number of digits to be displayed in estimation results and for efficiency estimates. Default is 4.

print.level

numeric. 1 - print estimation results. 2 - print optimization details. 3 - print summary of point estimates of technical or cost efficiencies. 4 - print unit-specific point and interval estimates of technical or cost efficiencies. Default is 2.
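The interval controlled by alpha can be illustrated for the normal/half-normal production case. The following is a minimal base-R sketch, assuming known variance parameters and the standard result that the inefficiency term u, given the composed residual e = v - u, follows a normal distribution truncated at zero. The function te_interval and its inputs are hypothetical names chosen for illustration; this is not necessarily how sf computes its bounds internally.

```r
# Hypothetical helper: (1 - alpha)100% prediction interval for exp(-u) given e,
# normal/half-normal production frontier with e = v - u.
te_interval <- function(e, sigma_u, sigma_v, alpha = 0.05) {
  sigma2     <- sigma_u^2 + sigma_v^2
  mu_star    <- -e * sigma_u^2 / sigma2            # mean of pre-truncated u | e
  sigma_star <- sigma_u * sigma_v / sqrt(sigma2)   # s.d. of pre-truncated u | e
  p0 <- pnorm(-mu_star / sigma_star)               # probability mass cut off below zero
  # Quantile function of a normal distribution truncated at zero from below
  q_u <- function(p) mu_star + sigma_star * qnorm(p0 + p * (1 - p0))
  # A large u means low efficiency, so the upper quantile of u gives the lower bound
  data.frame(lower = exp(-q_u(1 - alpha / 2)),
             upper = exp(-q_u(alpha / 2)))
}

# Assumed residuals and standard deviations, for illustration only
ci <- te_interval(e = c(-0.5, 0, 0.3), sigma_u = 0.4, sigma_v = 0.2)
ci
```

Because the bounds are quantiles of a truncated distribution, they always lie strictly between 0 and 1, and a smaller alpha widens the interval.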

Value

sf returns a list of class npsf containing the following elements:

coef

numeric. Named vector of ML parameter estimates.

vcov

matrix. Estimated covariance matrix of ML estimator.

loglik

numeric. Value of log-likelihood at ML estimates.

efficiencies

data frame. Contains point estimates of unit-specific technical or cost efficiencies: exp(-E(u|e)) of Jondrow et al. (1982), E(exp(-u)|e) of Battese and Coelli (1988), and exp(-M(u|e)), where M(u|e) is the mode of the conditional distribution of the inefficiency term. In addition, estimated lower and upper bounds of (1-$\alpha$)100% two-sided prediction intervals are returned.

marg.effects

data frame. Contains unit-specific marginal effects of exogenous variables on the expected value of the inefficiency term.

sigmas_u

matrix. Estimated unit-specific variances of the inefficiency term. Returned if uhet is not NULL.

sigmas_v

matrix. Estimated unit-specific variances of the random noise component. Returned if vhet is not NULL.

mu

matrix. Estimated unit-specific means of the pre-truncated normal distribution of the inefficiency term. Returned if tmean is not NULL.

esample

logical. TRUE if the observation in the user-supplied data is in the estimation subsample and FALSE otherwise.
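The three point estimators stored in efficiencies can be compared in a small base-R sketch. It assumes a normal/half-normal production frontier with known standard deviations and composed residuals e = v - u; the helper eff_points and the column names JLMS and Mode are hypothetical labels for illustration (only BC appears in the examples of this page), and the code is not taken from the package.

```r
# Hypothetical helper: three efficiency point estimates for e = v - u
eff_points <- function(e, sigma_u, sigma_v) {
  sigma2     <- sigma_u^2 + sigma_v^2
  mu_star    <- -e * sigma_u^2 / sigma2            # mean of pre-truncated u | e
  sigma_star <- sigma_u * sigma_v / sqrt(sigma2)   # s.d. of pre-truncated u | e
  r <- mu_star / sigma_star
  mean_u <- mu_star + sigma_star * dnorm(r) / pnorm(r)  # E(u | e)
  mode_u <- pmax(mu_star, 0)                             # M(u | e)
  data.frame(
    JLMS = exp(-mean_u),                                 # exp(-E(u | e))
    BC   = exp(-mu_star + 0.5 * sigma_star^2) *
             pnorm(r - sigma_star) / pnorm(r),           # E(exp(-u) | e)
    Mode = exp(-mode_u)                                  # exp(-M(u | e))
  )
}

# Assumed residuals and standard deviations, for illustration only
eff <- eff_points(e = c(-0.5, 0, 0.3), sigma_u = 0.4, sigma_v = 0.2)
eff
```

By Jensen's inequality the BC estimate is never below the JLMS estimate, and because the conditional mode never exceeds the conditional mean, the mode-based estimate is also at least as large as JLMS.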

Details

Models for sf are specified symbolically. A typical model has the form y ~ x1 + ..., where y is the logarithm of output (production frontier) or of total cost (cost frontier), and {x1,...} are the logs of inputs (production frontier) or of outputs and input prices (cost frontier).

Options uhet and vhet can be used if multiplicative heteroskedasticity of either inefficiency or random noise component (or both) is assumed; i.e. if their variances can be expressed as exponential functions of (e.g. size-related) exogenous variables (including intercept) (see Caudill et al. 1995).
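Multiplicative heteroskedasticity can be made concrete with a short base-R sketch. It assumes a single size-related variable z1 and made-up coefficients gamma; both are hypothetical and serve only to show how a one-sided formula with an intercept maps to unit-specific variances of the form exp(z'gamma).

```r
# Hypothetical data: one size-related variable z1 for five units
d <- data.frame(z1 = c(0.5, 1.0, 1.5, 2.0, 2.5))

# A one-sided formula expands to a design matrix that includes an intercept
Z <- model.matrix(~ z1, data = d)

# Hypothetical coefficients (intercept, slope) of the log-variance function
gamma <- c(-2, 0.6)

# Unit-specific variances: exponential function of the linear index
sigma_u2 <- as.vector(exp(Z %*% gamma))
cbind(d, sigma_u2)
```

The exponential form keeps every variance positive without constraining gamma, and with gamma restricted to the intercept alone it collapses to the homoskedastic case.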

If marg.eff = TRUE and distribution = "h", the marginal effect of the kth exogenous variable on the expected value of the inefficiency term of unit i is computed as $\gamma[k]\sigma_u[i]/\sqrt{2\pi}$, where $\sigma_u[i] = \sqrt{\exp(z[i]'\gamma)}$. If distribution = "t", marginal effects are returned if either tmean or uhet is not NULL. If the same exogenous variables are specified under both options, (non-monotonic) marginal effects are computed as explained in Wang (2002).
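The half-normal marginal effect follows from E(u) = sigma_u * sqrt(2/pi) with sigma_u equal to the square root of exp(z'gamma). The base-R sketch below checks the closed form against a finite-difference derivative; the coefficient vector gamma and the matrix Z are hypothetical values, not estimates from any model.

```r
# Hypothetical values: an intercept plus two exogenous variables, three units
gamma <- c(-1.5, 0.8, -0.3)                        # (intercept, z1, z2)
Z <- cbind(1, z1 = c(0.2, 1.0, 1.7), z2 = c(1.1, 0.4, 0.9))

sigma_u <- function(Z) as.vector(sqrt(exp(Z %*% gamma)))
mean_u  <- function(Z) sigma_u(Z) * sqrt(2 / pi)   # E(u) under half-normal

# Closed-form marginal effect of z1 (k = 2): gamma[k] * sigma_u[i] / sqrt(2 * pi)
me_analytic <- gamma[2] * sigma_u(Z) / sqrt(2 * pi)

# Central finite difference in z1 as a numerical check
h  <- 1e-6
Zp <- Z; Zp[, "z1"] <- Zp[, "z1"] + h
Zm <- Z; Zm[, "z1"] <- Zm[, "z1"] - h
me_numeric <- (mean_u(Zp) - mean_u(Zm)) / (2 * h)

cbind(me_analytic, me_numeric)
```

In this case the sign of each marginal effect is the sign of the corresponding gamma coefficient, so the effect is monotonic; the non-monotonic case arises only when a variable enters both the mean and the variance of the truncated normal distribution.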

References

Battese, G., Coelli, T. (1988), Prediction of firm-level technical efficiencies with a generalized frontier production function and panel data. Journal of Econometrics, 38, 387--399.

Caudill, S., Ford, J., Gropper, D. (1995), Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business and Economic Statistics, 13, 105--111.

Horrace, W. and Schmidt, P. (1996), On ranking and selection from independent truncated normal distributions. Journal of Productivity Analysis, 7, 257--282.

Jondrow, J., Lovell, C., Materov, I., Schmidt, P. (1982), On estimation of technical inefficiency in the stochastic frontier production function model. Journal of Econometrics, 19, 233--238.

Kumbhakar, S. and Lovell, C. (2000), Stochastic Frontier Analysis. Cambridge: Cambridge University Press.

Wang, H.-J. (2002), Heteroskedasticity and non-monotonic efficiency effects of a stochastic frontier model. Journal of Productivity Analysis, 18, 241--253.

Examples

library(npsf)
 
# Load Penn World Tables 5.6 dataset
 
data( pwt56 )
head( pwt56 )
 
# Create some missing values
 
pwt56 [4, "K"] <- NA 
 
# Stochastic production frontier model with 
# homoskedastic error components (half-normal)
 
# Use subset of observations - for year 1965
 
m1 <- sf(log(Y) ~ log(L) + log(K), data = pwt56, 
 subset = year == 1965, distribution = "h")
 
# Write efficiencies to the data frame using 'esample':

pwt56$BC[ m1$esample ] <- m1$efficiencies$BC
View(pwt56)
  
# Computation using matrices
 
Y1 <- as.matrix(log(pwt56[pwt56$year == 1965, 
c("Y"), drop = FALSE]))
X1 <- as.matrix(log(pwt56[pwt56$year == 1965,
c("K", "L"), drop = FALSE]))

X1 [51, 2] <- NA # create missing
X1 [49, 1] <- NA # create missing
 
m2 <- sf(Y1 ~ X1, distribution = "h")
  
# Load U.S. commercial banks dataset
 
data(banks05)
head(banks05)
 
# Doubly heteroskedastic stochastic cost frontier 
# model (truncated normal)
 
# Print summaries of cost efficiencies' estimates
 
m3 <- sf(lnC ~ lnw1 + lnw2 + lny1 + lny2, uhet = ~ ER, 
 vhet = ~ LA, data = banks05, distribution = "t", 
 prod = FALSE, print.level = 3)

# Non-monotonic marginal effects of equity ratio on 
# the mean of distribution of inefficiency term
 
m4 <- sf(lnC ~ lnw1 + lnw2 + lny1 + lny2, uhet = ~ ER,
 tmean = ~ ER, data = banks05, distribution = "t", 
 prod = FALSE, marg.eff = TRUE)
 
summary(m4$marg.effects)