PriceIndices (version 0.2.5)

data_imputing: Imputing missing and (optionally) zero prices.

Description

This function imputes missing prices and (optionally) zero prices by using one of the following methods: carry forward/backward, overall mean, class mean (targeted mean).

Usage

data_imputing(
  data,
  start,
  end,
  method = "carry forward",
  class = c(),
  formula = "jevons",
  zero_prices = TRUE,
  outlets = FALSE
)

Value

The function returns a monthly aggregated data frame that is ready for price index calculations. Missing prices (unit values) and, optionally, zero prices are imputed by one of the following methods: carry forward/backward, overall mean, or class mean (targeted mean). The imputation can be done for each outlet separately or for aggregated data (see the outlets parameter). The quantities for imputed prices are set to zero.

Carry forward/backward method: if a missing product has a previous price, that price is carried forward until the next real observation. If there is no previous price, the next real observation is found and carried backward.

Overall mean method: the procedure is similar, except that the imputed price is the previously recorded price multiplied by (or the next recorded price divided by) the price index determined for the quoted and imputed periods. The user selects the index formula via the formula parameter.

Class mean method (also known as the targeted mean method): the procedure is analogous to the overall mean method, but the price index is determined for the product class specified by the class parameter.
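The carry forward/backward rule can be sketched in base R for a single product's price series. This is only an illustration under stated assumptions, not the package's implementation: carry_fill is a hypothetical helper, the series is assumed to be ordered by time, and missing prices are assumed to be coded as NA (zero prices could be recoded to NA first if they should be imputed as well).

```r
# Hypothetical helper (not part of PriceIndices): fill the gaps in one
# product's price series, ordered by time, where NA marks a missing price.
carry_fill <- function(p) {
  observed <- which(!is.na(p))
  if (length(observed) == 0) return(p)  # no real observation to carry

  # For each position, count the real observations seen so far.
  # A count of 0 means there is no previous price, so the first real
  # observation is carried backward instead.
  idx <- cumsum(!is.na(p))
  idx[idx == 0] <- 1
  p[observed[idx]]
}

prices <- c(NA, 10, NA, NA, 12, NA)
carry_fill(prices)
# 10 10 10 10 12 12
```

The leading gap takes the next real observation (10), while every later gap repeats the most recent real observation.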

Arguments

data

The user's data frame with information about sold products. It must contain the columns: time (as Date in the format year-month-day, e.g. '2020-12-01'), prices (as numeric), quantities (as numeric, for future calculations) and prodID (as numeric, factor or character). A column retID (as factor, character or numeric) is also needed if the user wants to impute prices over outlets.
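As a rough illustration of the expected layout, a minimal input data frame might look like the sketch below. The values, product identifiers and outlet name are made up for the example; only the column names and types follow the description above.

```r
# Illustrative input: two products observed in two months at one outlet.
# All values are invented for the example.
sales <- data.frame(
  time       = as.Date(c("2020-12-01", "2020-12-01", "2021-01-01", "2021-01-01")),
  prices     = c(2.5, 4.0, 2.7, 0),   # the zero price is a candidate for imputation
  quantities = c(100, 40, 90, 0),
  prodID     = c("a", "b", "a", "b"),
  retID      = c("shop1", "shop1", "shop1", "shop1")  # needed only with outlets = TRUE
)
str(sales)
```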

start

The base period (as character) limited to the year and month, e.g. "2020-03".

end

The research period (as character) limited to the year and month, e.g. "2020-04".

method

A character string indicating the imputation method. Available options are: "carry forward", "overall mean", "class mean". For the class mean method, the class parameter must be specified.

class

A character string indicating the column which describes product classes (homogeneous subgroups).

formula

A character string indicating the index formula which will be used for the overall mean or class mean method. Available options are: "dutot", "carli", "jevons", "fisher", "tornqvist", "walsh".
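To make the role of the formula concrete, the sketch below computes the three unweighted options (Dutot, Carli, Jevons) from their textbook definitions and uses one of them for an overall mean imputation. It is a minimal sketch, not the package's code: the helper names dutot_index, carli_index and jevons_index are hypothetical, the prices are invented, and computing the index over the products observed in both periods is an assumption about how matching is done.

```r
# Textbook unweighted elementary indices (hypothetical helper names).
# p0: prices in the earlier period, p1: prices in the later period,
# both restricted to the same set of products.
dutot_index  <- function(p0, p1) mean(p1) / mean(p0)         # ratio of mean prices
carli_index  <- function(p0, p1) mean(p1 / p0)               # mean of price relatives
jevons_index <- function(p0, p1) exp(mean(log(p1 / p0)))     # geometric mean of relatives

# Invented prices for three products; product "C" is missing in the later period.
p0 <- c(A = 10, B = 20, C = 5)
p1 <- c(A = 11, B = 22, C = NA)

# Assumption: the index is computed over products priced in both periods.
matched <- !is.na(p0) & !is.na(p1)
index <- jevons_index(p0[matched], p1[matched])

# Overall mean imputation: previously recorded price times the index.
p1_imputed <- ifelse(is.na(p1), p0 * index, p1)
p1_imputed
```

For the class mean method, the same calculation would be restricted to the products that share a class with the missing product.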

zero_prices

A logical parameter indicating whether zero prices are to be imputed as well. If TRUE, zero prices are treated like missing prices.

outlets

A logical parameter indicating whether imputations are to be done for each outlet separately. If TRUE, the data frame must contain the retID column.

Examples

library(PriceIndices)

# Creating a small data set with zero prices:
time. <- c("2018-12-01", "2019-01-01")
time <- as.Date(c(time., time., time.))
p1 <- c(0, 23, 10)
p2 <- c(40, 0, 20)
q1 <- c(15, 25, 30)
q2 <- c(44, 79, 30)
quantities <- c(q1, q2)
prices <- c(p1, p2)
prodID <- c(1, 1, 2, 2, 3, 3)
my_data <- data.frame(time, prices, quantities, prodID)
# Price imputing with the default carry forward/backward method:
data_imputing(my_data, start = "2018-12", end = "2019-01",
              zero_prices = TRUE, outlets = FALSE)
# Price imputing with the overall mean method and the Dutot formula:
data_imputing(my_data, start = "2018-12", end = "2019-01",
              zero_prices = TRUE, outlets = FALSE,
              method = "overall mean", formula = "dutot")
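# Preparing a data set with zero and missing prices
# (dataMATCH is an example data set from PriceIndices; dplyr must be installed):
dataMATCH$prodID <- dataMATCH$codeIN
data <- dplyr::select(dataMATCH, time, prices, quantities, prodID, retID)
set1 <- data[1:5, ]
set1$prices <- 0
set2 <- data[6:30, ]
df <- rbind(set1, set2)
# Price imputing for each outlet separately (carry forward/backward):
data_imputing(df, start = "2018-12", end = "2019-02",
              zero_prices = TRUE, outlets = TRUE)
# Price imputing on aggregated data with the overall mean method and the Fisher formula:
data_imputing(df, start = "2018-12", end = "2019-02",
              method = "overall mean", zero_prices = TRUE, formula = "fisher")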