data_transformation: Transforms dependent variables

Description

The function data_transformation transforms the dependent variable named in the formula object fixed within the given sample data set and returns the sample data set with the transformed dependent variable. Three options are available: no transformation, natural log transformation and Box-Cox transformation.

Usage

data_transformation(fixed, smp_data, transformation, lambda)

Arguments

fixed

a two-sided linear formula object describing the fixed-effects part of the nested error linear regression model with the dependent variable on the left of a ~ operator and the explanatory variables on the right, separated by + operators. The argument corresponds to the argument fixed in function lme.

smp_data

a data frame that needs to comprise all variables named in fixed. If transformed data is further used to fit a nested error linear regression model, smp_data also needs to comprise the variable named in smp_domains (see ebp).

transformation

a character string. Three different transformation methods for the dependent variable can be chosen: (i) no transformation ("no"); (ii) natural log transformation ("log"); (iii) Box-Cox transformation ("box.cox").

lambda

a scalar parameter that determines the Box-Cox transformation. For no transformation and the natural log transformation, lambda can be set to NULL.

Value

a named list with two elements: a data frame containing the data set with the transformed dependent variable (transformed_data) and the shift parameter (shift), if present. In case of no transformation, the original data frame is returned and the shift parameter is NULL.

Details

For the natural log and Box-Cox transformations, the dependent variable is shifted so that all values are greater than zero, since these transformations are not defined for values less than or equal to zero. The shift is calculated as follows:

$$\text{shift} = |\min(y)| + 1 \quad \text{if} \quad \min(y) \le 0$$

The function data_transformation works as a wrapper: it selects among the three transformation functions no_transform, log_transform and box_cox.
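The shift rule and the wrapper behaviour described above can be illustrated with a small, self-contained sketch. It is not the package's implementation: the names shift_sketch and data_transformation_sketch are hypothetical, the Box-Cox step uses the textbook form (y^lambda - 1) / lambda, which may differ in detail from the internal box_cox function, and a shift of 0 for strictly positive data is an assumption that the text above does not state.

```r
# Illustrative sketch only; the *_sketch names are hypothetical and are not
# functions of the package.

# Shift rule from the Details section.
# Assumption: the shift is 0 when all values are already positive.
shift_sketch <- function(y) {
  m <- min(y)
  if (m <= 0) abs(m) + 1 else 0
}

# Wrapper that dispatches on the transformation name and returns a list
# shaped like the one described under Value.
data_transformation_sketch <- function(fixed, smp_data, transformation,
                                       lambda = NULL) {
  # Name of the dependent variable (left-hand side of the formula);
  # assumes the left-hand side is a single plain variable.
  y_name <- all.vars(fixed[[2]])
  y <- smp_data[[y_name]]

  # No transformation: return the data unchanged, shift is NULL.
  if (transformation == "no") {
    return(list(transformed_data = smp_data, shift = NULL))
  }

  shift <- shift_sketch(y)
  y_shifted <- y + shift

  y_t <- switch(transformation,
    "log" = log(y_shifted),
    # Textbook Box-Cox form (an assumption, see the lead-in).
    "box.cox" = if (abs(lambda) < 1e-12) {
      log(y_shifted)
    } else {
      (y_shifted^lambda - 1) / lambda
    },
    stop("unknown transformation: ", transformation)
  )

  smp_data[[y_name]] <- y_t
  list(transformed_data = smp_data, shift = shift)
}

# Toy data with a non-positive minimum, so a shift is required.
toy <- data.frame(y = c(-3, 0, 2, 10), x = 1:4)
res <- data_transformation_sketch(y ~ x, toy, "box.cox", lambda = 0.5)
res$shift  # 4, because |min(y)| + 1 = 3 + 1
```

In the toy data the minimum is -3, so every value is moved up by 4 before the transformation is applied, and the smallest shifted value is 1.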

Examples

# Assumes the emdi package, which provides eusilcA_smp and data_transformation
library(emdi)

# Loading data - sample data
data("eusilcA_smp")

# Transform dependent variable in sample data with Box-Cox transformation
transform_data <- data_transformation(
  fixed = eqIncome ~ gender + eqsize + cash + self_empl + unempl_ben +
    age_ben + surv_ben + sick_ben + dis_ben + rent + fam_allow +
    house_allow + cap_inv + tax_adj,
  smp_data = eusilcA_smp,
  transformation = "box.cox",
  lambda = 0.7
)