teradialbc: Statistical Inference Regarding the Radial Measure of Technical Efficiency

Description

Routine teradialbc performs bias correction of the radial Debreu-Farrell input- or output-based measure of technical efficiency, computes bias, and constructs confidence intervals via bootstrapping techniques.

Usage

teradialbc(formula, data, subset,
 ref = NULL, data.ref = NULL, subset.ref = NULL,
 rts = c("C", "NI", "V"), base = c("output", "input"),
 homogeneous = TRUE, smoothed = TRUE, kappa = NULL,
 reps = 999, level = 95,
 core.count = 1, cl.type = c("SOCK", "MPI"),
 print.level = 1, dots = TRUE)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model. The details of model specification are given under "Details".

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which teradialbc is called.

subset

an optional vector specifying a subset of observations for which technical efficiency is to be computed.

rts

character or numeric. string: first letter of the word "c" for constant, "n" for non-increasing, or "v" for variable returns to scale assumption. numeric: 3 for constant, 2 for non-increasing, or 1 for variable returns to scale assumption.

base

character or numeric. string: first letter of the word "o" for computing output-based or "i" for computing input-based technical efficiency measure. numeric: 2 for computing output-based or 1 for computing input-based technical efficiency measure.

ref

an object of class "formula" (or one that can be coerced to that class): a symbolic description of inputs and outputs that are used to define the technology reference set. The details of technology reference set specification are given under "Details".

data.ref

an optional data frame containing the variables in the technology reference set. If not found in data.ref, the variables are taken from environment(ref), typically the environment from which teradialbc is called.

subset.ref

an optional vector specifying a subset of observations to define the technology reference set.

smoothed

logical. If TRUE, the reference set is bootstrapped with smoothing; if FALSE, the reference set is bootstrapped with subsampling.

homogeneous

logical. Relevant if smoothed=TRUE. If TRUE, the reference set is bootstrapped with homogeneous smoothing; if FALSE, the reference set is bootstrapped with heterogeneous smoothing.

kappa

numeric. Relevant if smoothed=FALSE, that is, when the reference set is bootstrapped with subsampling. 'kappa' sets the size of the subsample as K^kappa, where K is the number of data points in the original reference set. The default value is 0.7. 'kappa' may be between 0.5 and 1.

reps

specifies the number of bootstrap replications to be performed. The default is 999. The minimum is 100. Adequate estimates of confidence intervals using bias-corrected methods typically require 1,000 or more replications.

level

sets confidence level for confidence intervals; default is level=95.

core.count

positive integer. Number of cluster nodes. If core.count=1, the process runs sequentially. See performParallel for more details.

cl.type

character string that specifies cluster type (see makeClusterFT). Possible values are 'MPI' and 'SOCK' ('PVM' is currently not available).

dots

logical. Relevant if print.level>=1. If TRUE, one dot character is displayed for each successful replication; if FALSE, display of the replication dots is suppressed.

print.level

numeric. 0 - nothing is printed; 1 - print summary of the model and data; 2 - print summary of technical efficiency measures; 3 - print estimation results observation by observation. Default is 1.
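
To make the kappa argument above concrete, the following base R sketch computes a subsample size of K^kappa and draws that many reference observations. The values of K and kappa are illustrative, and both the use of floor() and sampling with replacement are assumptions made for this sketch; teradialbc may round and resample differently.

```r
# Illustrative values only; not taken from any npsf data set.
K     <- 100   # number of data points in the original reference set
kappa <- 0.7   # documented default value of kappa

# Subsample size K^kappa. Truncating with floor() is an assumption;
# teradialbc may convert K^kappa to an integer differently.
m <- floor(K^kappa)

# Draw m indices of reference observations. Sampling with replacement
# is also an assumption made for this sketch.
set.seed(1)
idx <- sample.int(K, size = m, replace = TRUE)

m            # size of the subsample
length(idx)  # number of drawn observations
```

A larger kappa moves the subsample size closer to K; at kappa = 1 the subsample is as large as the original reference set.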

Value

teradialbc returns a list of class npsf containing the following elements:

K          numeric: number of data points.
M          numeric: number of inputs.
N          numeric: number of outputs.
rts        string: RTS assumption.
base       string: base for efficiency measurement.
reps       numeric: number of bootstrap replications.
level      numeric: confidence level for confidence intervals.
te         numeric: radial measure (Russell) of technical efficiency.
tebc       numeric: bias-corrected radial measures of technical efficiency.
biasboot   numeric: bootstrap bias estimate for the original radial measures of technical efficiency.
varboot    numeric: bootstrap variance estimate for the radial measures of technical efficiency.
biassqvar  numeric: one-third of the ratio of bias squared to variance for radial measures of technical efficiency.
realreps   numeric: actual number of replications used for statistical inference.
telow      numeric: lower bound estimate for radial measures of technical efficiency.
teupp      numeric: upper bound estimate for radial measures of technical efficiency.
teboot     numeric: reps x K matrix containing bootstrapped measures of technical efficiency from each of reps bootstrap replications.
esample    logical: returns TRUE if the observation in user supplied data is in the estimation subsample and FALSE otherwise.
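
The relation between several of the returned elements can be illustrated with a small base R sketch. It assumes the textbook bootstrap definitions (bias as the mean of the bootstrap values minus the original estimate, and the bias-corrected value as the original estimate minus that bias); the input numbers are hypothetical, and the internal computations of teradialbc may differ in detail.

```r
# Hypothetical inputs: 3 bootstrap replications (rows) for 2 data
# points (columns), laid out like the reps x K matrix 'teboot'.
te     <- c(0.75, 0.45)
teboot <- matrix(c(0.8, 0.9, 1.0,
                   0.5, 0.6, 0.7), nrow = 3)

# Bootstrap bias estimate: mean bootstrap value minus original estimate.
biasboot <- colMeans(teboot) - te

# Bootstrap variance estimate, one value per data point.
varboot <- apply(teboot, 2, var)

# Bias-corrected measure (assumed definition: estimate minus bias).
tebc <- te - biasboot

# One-third of the ratio of bias squared to variance.
biassqvar <- biasboot^2 / varboot / 3

cbind(te, biasboot, varboot, tebc, biassqvar)
```

Under these definitions, a value of biassqvar above 1 means the squared bias is more than three times the variance, which is the situation in which bias correction is usually considered worthwhile.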

Details

Models for teradialbc are specified symbolically. A typical model has the form outputs ~ inputs, where outputs (inputs) is a series of (numeric) terms which specifies outputs (inputs). The reference set is specified in the same way. Refer to the examples.
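
As an illustration of the outputs ~ inputs form, the base R sketch below splits a formula into its left-hand and right-hand side variables. It only shows how such a formula reads; how teradialbc parses the formula internally is not shown here, and the variable names are those of the ccr81 example data.

```r
# A model with three outputs and five inputs.
f <- y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5

# f[[2]] is the left-hand side, f[[3]] the right-hand side.
outputs <- all.vars(f[[2]])
inputs  <- all.vars(f[[3]])

outputs
inputs
```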

If core.count>1, teradialbc will perform the bootstrap on multiple cores. Parallel computing requires package snowFT. By default, the cluster type is set by option cl.type="SOCK". Specifying cl.type="MPI" requires package Rmpi.

On some systems, specifying option cl.type="SOCK" results in much quicker execution than specifying option cl.type="MPI". Option cl.type="SOCK" might be problematic on Mac systems.

Parallel computing makes a difference for large data sets. Specifying option dots=TRUE will indicate at what speed the bootstrap actually proceeds. Specify reps=100 and compare two runs with option core.count=1 and core.count>1 to see if parallel computing speeds up the bootstrap. For small samples, parallel computing may actually slow teradialbc down.
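
The comparison suggested above can be sketched as follows. The helper time_run is a hypothetical name introduced for this illustration, the choice of 2 cores is arbitrary, and the sketch assumes that packages npsf and snowFT are installed; it does nothing otherwise.

```r
# Hypothetical helper: elapsed time of one teradialbc run with the
# given number of cores, using the documented minimum of 100 replications.
time_run <- function(cores, data) {
  system.time(
    teradialbc(y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5,
               data = data, reps = 100,
               core.count = cores, cl.type = "SOCK",
               print.level = 0, dots = FALSE)
  )[["elapsed"]]
}

# Run only if the required packages are available.
if (requireNamespace("npsf", quietly = TRUE) &&
    requireNamespace("snowFT", quietly = TRUE)) {
  library(npsf)
  data(ccr81)
  elapsed <- c(sequential = time_run(1, ccr81),
               parallel   = time_run(2, ccr81))
  print(elapsed)
}
```

On a data set as small as ccr81, the parallel run may well be the slower of the two, since starting the cluster has a fixed cost.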

References

Färe, R. and Lovell, C. A. K. (1978), Measuring the technical efficiency of production, Journal of Economic Theory, 19, 150--162

Färe, R., Grosskopf, S. and Lovell, C. A. K. (1994), Production Frontiers, Cambridge U.K.: Cambridge University Press

Kneip, A., Simar L., and P.W. Wilson (2008), Asymptotics and Consistent Bootstraps for DEA Estimators in Nonparametric Frontier Models, Econometric Theory, 24, 1663--1697

Simar, L. and P.W. Wilson (1998), Sensitivity Analysis of Efficiency Scores: How to Bootstrap in Nonparametric Frontier Models, Management Science, 44, 49--61

Simar, L. and P.W. Wilson (2000), A General Methodology for Bootstrapping in Nonparametric Frontier Models, Journal of Applied Statistics, 27, 779--802

Examples

require( npsf )

# Prepare data and matrices

data( pwt56 )
head( pwt56 )

# Create some missing values

pwt56 [49, "K"] <- NA # just to create missing

Y1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("Y"), drop = FALSE] )
X1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("K", "L"), drop = FALSE] )

X1 [51, 2] <- NA # just to create missing
X1 [49, 1] <- NA # just to create missing

data( ccr81 )
head( ccr81 )

# Create some missing values

ccr81 [64, "x4"] <- NA # just to create missing
ccr81 [68, "y2"] <- NA # just to create missing

Y2 <- as.matrix( ccr81[ , c("y1", "y2", "y3"), drop = FALSE] )
X2 <- as.matrix( ccr81[ , c("x1", "x2", "x3", "x4", "x5"), drop = FALSE] )

# Compute output-based measures of technical efficiency under 
# the assumption of CRS (the default) and perform bias-correction
# using smoothed homogeneous bootstrap (the default) with 999
# replications (the default).

t1 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81)

# or just

t2 <- teradialbc ( Y2 ~ X2)

# Combined formula and matrix

t3 <- teradialbc ( Y ~ K + L, data = pwt56, subset = Nu < 10, 
	ref = Y1[-2,] ~ X1[-1,] )

# Compute input-based measures of technical efficiency under 
# the assumption of VRS and perform bias-correction using
# subsampling bootstrap with 1999 replications. Report 99%
# confidence intervals. The reference set is formed by data
# points where x5 is not equal to 10. Suppress printing dots.

t4 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, reps = 1999, 
	smoothed = FALSE, kappa = 0.7, dots = FALSE, 
	base = "i", rts = "v", level = 99)

# Compute input-based measures of technical efficiency under
# the assumption of NIRS and perform bias-correction using 
# smoothed heterogeneous bootstrap with 999 replications for 
# all data points. The reference set is formed by data points 
# where x5 is not equal to 10.

t5 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n")


# ===========================
# ===  Parallel computing ===
# ===========================

# Perform previous bias-correction but use 4 cores and 
# cluster type MPI

t51 <-  teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n", 
	core.count = 4, cl.type = "MPI")


# Really large data-set

data(usmanuf)
head(usmanuf)

nrow(usmanuf)
table(usmanuf$year)

# This will take some time depending on computer power

t6 <- teradialbc ( Y ~ K + L + M, data = usmanuf, 
	subset = year >= 1999 & year <= 2000, homogeneous = FALSE, 
	base = "o", reps = 100, 
	core.count = 4, cl.type = "MPI")