sfa (version 1.0.4)

data_gen_cs: Generate Cross-Sectional Data for Stochastic Frontier Analysis

Description

data_gen_cs generates simulated cross-sectional data based on the stochastic frontier model, allowing for different distributional assumptions for the one-sided technical inefficiency error term (\(u\)) and the two-sided idiosyncratic error term (\(v\)). The model has the general form: \(Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v - u\), where \(u \geq 0\) represents inefficiency. All variants are returned in a single data frame, so the user can select the columns they need.

Usage

data_gen_cs(N, rand, sig_u, sig_v, cons, beta1, beta2, a, mu)

Value

A data frame containing \(N\) observations with the following columns:

name

Individual identifier (simply \(1\) to \(N\)).

cons

The constant term value.

x1

Simulated explanatory variable \(x_1\).

x2

Simulated explanatory variable \(x_2\).

u, uz, u_t, u_c, u_e, u_u, u_tn

The simulated one-sided error terms under different distributions.

v, v_t, v_c

The simulated two-sided error terms under different distributions.

y_pcs, y_pcs_t, y_pcs_e, y_pcs_c, y_pcs_u, y_pcs_z, y_pcs_w, y_pcs_tn

The dependent variable \(Y\) under the corresponding SFA model distributions.

z

The auxiliary variable used for heteroskedasticity in y_pcs_z.

con

A constant column set to 1, potentially for use in estimation.

Arguments

N

A single integer specifying the number of observations (cross-sectional units).

rand

A single integer to set the seed for the random number generator, ensuring reproducibility.

sig_u

The standard deviation parameter (\(\sigma_u\)) for the base distribution of the one-sided error term \(u\).

sig_v

The standard deviation parameter (\(\sigma_v\)) for the base distribution of the two-sided error term \(v\).

cons

The value of the constant term (intercept) in the model.

beta1

The coefficient for the \(x_1\) variable.

beta2

The coefficient for the \(x_2\) variable.

a

The degrees of freedom parameter for the t and half-t distributions (v_t and u_t, respectively). Requires the rt function.

mu

The mean parameter (\(\mu\)) of the truncated normal distribution used in the Normal-Truncated Normal model (u_tn). Requires the rtruncnorm function.

Author

David Bernstein

Details

The function simulates two explanatory variables, \(x_1\) and \(x_2\), as transformations of uniform random variables.

The function generates several different frontier models by combining various distributions for \(u\) and \(v\):

  • **\(u\) Distributions (Inefficiency):** Half-Normal (HN), Truncated Normal (TN), Half-T (HT), Half-Cauchy (HC), Exponential (E), Half-Uniform (HU).

  • **\(v\) Distributions (Idiosyncratic):** Normal (N), t, Cauchy (C).
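The distributions listed above can all be drawn with base R random generators. The following is a minimal sketch, not the package's actual source: the variable names (u_hn, u_ht, and so on) are illustrative and do not necessarily match the columns returned by data_gen_cs, and the parameter values are arbitrary.

```r
set.seed(123)   # plays the role of the `rand` argument
N     <- 1000
sig_u <- 0.5
sig_v <- 0.2
a     <- 5      # degrees of freedom for the t-based draws

# One-sided (inefficiency) draws: each is non-negative by construction
u_hn <- abs(rnorm(N, mean = 0, sd = sig_u))           # half-normal
u_ht <- abs(rt(N, df = a)) * sig_u                    # half-t, scaled by sig_u
u_hc <- abs(rcauchy(N, location = 0, scale = sig_u))  # half-Cauchy
u_e  <- rexp(N, rate = 1 / sig_u)                     # exponential with mean sig_u
u_hu <- runif(N, min = 0, max = sig_u)                # half-uniform on (0, sig_u)

# Two-sided (idiosyncratic) draws
v_n <- rnorm(N, mean = 0, sd = sig_v)                 # normal
v_t <- rt(N, df = a) * sig_v                          # t, scaled by sig_v
v_c <- rcauchy(N, location = 0, scale = sig_v)        # Cauchy

# Collect the one-sided draws and confirm none is negative
draws_u <- data.frame(u_hn, u_ht, u_hc, u_e, u_hu)
sapply(draws_u, min)
```

The truncated normal case is omitted here because it needs rtruncnorm from the truncnorm package.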

**Specific Model Outputs (y_pcs variants):**

  • y_pcs: Normal-Half Normal (N-HN): \(v \sim N(0, \sigma_v^2)\), \(u \sim |N(0, \sigma_u^2)|\).

  • y_pcs_z: N-HN with Heteroskedastic \(\sigma_u\): \(\sigma_{u,i} = \exp(0.9 + 0.6 Z_i)\), where \(Z\) is a uniform variable.

  • y_pcs_t: T-Half T (T-HT): \(v \sim T(\text{df}=a) \cdot \sigma_v\), \(u \sim |T(\text{df}=a)| \cdot \sigma_u\).

  • y_pcs_tn: Normal-Truncated Normal (N-TN): \(v \sim N(0, \sigma_v^2)\), \(u \sim TN(\mu, \sigma_u^2)\) on \([0, \infty)\).

  • y_pcs_e: Normal-Exponential (N-E): \(v \sim N(0, \sigma_v^2)\), \(u \sim Exp(\phi)\), where \(\phi = 1/\sigma_u\).

  • y_pcs_c: Cauchy-Half Cauchy (C-HC): \(v \sim Cauchy(0, \sigma_v)\), \(u \sim |Cauchy(0, \sigma_u)|\).

  • y_pcs_u: Normal-Half Uniform (N-HU): \(v \sim N(0, \sigma_v^2)\), \(u \sim U(0, \sigma_u)\).

  • y_pcs_w: Normal + Cauchy - Half Normal: \(v \sim N(0, \sigma_v^2) + Cauchy(0, \sigma_v)\), \(u \sim |N(0, \sigma_u^2)|\). This introduces a composite \(v\) term.
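As a rough illustration of how three of these outputs could be assembled, the sketch below builds the N-HN, heteroskedastic N-HN, and composite-\(v\) variants in base R. Several details are assumptions rather than the package's implementation: the regressors are plain runif draws (the documentation only says they are transformations of uniform variables), the composite \(v\) reuses the same normal draw as the baseline model, and names such as y_nhn and frontier are hypothetical.

```r
set.seed(123)
N     <- 500
cons  <- 5
beta1 <- 1.5
beta2 <- 2.0
sig_u <- 0.5
sig_v <- 0.2

# Assumed regressors: the package's exact transformation is not documented here
x1 <- runif(N)
x2 <- runif(N)
frontier <- cons + beta1 * x1 + beta2 * x2

# N-HN: normal noise minus half-normal inefficiency
v     <- rnorm(N, mean = 0, sd = sig_v)
u     <- abs(rnorm(N, mean = 0, sd = sig_u))
y_nhn <- frontier + v - u

# Heteroskedastic N-HN: sigma_u varies by observation through z
z       <- runif(N)
sig_u_i <- exp(0.9 + 0.6 * z)
u_z     <- abs(rnorm(N, mean = 0, sd = sig_u_i))  # rnorm is vectorised over sd
y_z     <- frontier + v - u_z

# Composite v: normal plus Cauchy noise, minus half-normal inefficiency
v_w <- v + rcauchy(N, location = 0, scale = sig_v)
y_w <- frontier + v_w - u

head(data.frame(x1, x2, z, y_nhn, y_z, y_w))
```

Because \(u \geq 0\) is subtracted, each simulated outcome lies on or below the corresponding noisy frontier (frontier + v, or frontier + v_w for the composite case).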

**Note:** The rtruncnorm function is required for y_pcs_tn and is loaded with the package. To use it on its own, load it with library(truncnorm).
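For reference, a truncated normal draw on \([0, \infty)\) might look like the sketch below. It calls truncnorm::rtruncnorm when that package is installed and otherwise falls back to inverse-CDF sampling in base R. The fallback is a substitute technique shown for illustration, not something the package is documented to do, and the parameter values are arbitrary.

```r
set.seed(123)
N     <- 1000
mu    <- 0.1
sig_u <- 0.5

if (requireNamespace("truncnorm", quietly = TRUE)) {
  # Lower bound a = 0, no upper bound
  u_tn <- truncnorm::rtruncnorm(N, a = 0, b = Inf, mean = mu, sd = sig_u)
} else {
  # Base R fallback: draw uniformly on the part of the CDF above zero,
  # then map back through the normal quantile function
  p0   <- pnorm(0, mean = mu, sd = sig_u)
  p    <- p0 + runif(N) * (1 - p0)
  u_tn <- qnorm(p, mean = mu, sd = sig_u)
}

# Guard against tiny negative values from floating-point rounding
u_tn <- pmax(u_tn, 0)

range(u_tn)
```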

See Also

rnorm, runif, rt, rexp, rcauchy, rtruncnorm (if available).

Examples

# Generate 100 observations of SFA data
data_sfa <- data_gen_cs(
  N     = 100,
  rand  = 123,
  sig_u = 0.5,
  sig_v = 0.2,
  cons  = 5,
  beta1 = 1.5,
  beta2 = 2.0,
  a     = 5,   # degrees of freedom for T/Half-T
  mu    = 0.1  # mean for Truncated Normal
)

# Display the first few rows of the generated data
head(data_sfa)

# Example of a Normal-Half Normal SFA model data
summary(data_sfa$y_pcs)
plot(density(data_sfa$y_pcs))
