
pricelevels (version 1.4.0)

pricedata: Price data characteristics

Description

Price data typically consist of prices (and purchased quantities) for multiple products and regions. Since not all products are usually available in all regions, the data exhibit gaps. In some situations, the gaps can lead to non-connected data, which prevents a price comparison between all regions.

The following functions are available to derive the characteristics of a data set:

  • is.connected() checks if all regions in the data are connected either directly or indirectly by some bridging region

  • neighbors() divides the regions into groups of connected regions

  • connect() is a simple wrapper of neighbors(), connecting the data using the group of regions with the maximum number of observations

  • gaps() computes the number or percentage of gaps in the data

  • pairs() derives the number of available bilateral index pairs

  • properties() derives data characteristics for each group of connected regions

Usage

is.connected(r, n)

neighbors(r, n, simplify=FALSE)

connect(r, n)

gaps(r, n, relative=TRUE)

pairs(r, n)

properties(r, n)

Value

The function

  • is.connected() returns a single logical indicating whether the data are connected

  • connect() returns a logical vector of the same length as r and n

  • neighbors() gives a list or vector of connected regions

  • pairs() returns a single numeric for the number of bilateral pairs

  • gaps() returns a single numeric for the percentage of data gaps (or the absolute number if relative=FALSE)

The function properties() provides a data.table with the following variables:

variable   type      description
group      integer   group identifier
size       integer   number of regions belonging to that group
regions    list      regions belonging to that group
pairs      integer   number of available non-redundant region pairs, e.g., (AB,AC,BC)=3
nprods     integer   number of unique products
nobs       integer   number of observations
gaps       numeric   percentage of data gaps
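
The pairs count can be illustrated with a small base R sketch. It assumes that a region pair is "available" when the two regions share at least one product; this is an interpretation of the description above, not the package's confirmed implementation, and the toy data are hypothetical.

```r
# Hypothetical toy data: regions A, B and C share product 1,
# while region D prices only product 3.
r <- c("A", "B", "C", "A", "B", "D")
n <- c("1", "1", "1", "2", "2", "3")

# region-by-product incidence matrix (1 where a price is observed)
inc <- (unclass(table(r, n)) > 0) * 1

# number of products shared by each pair of regions
shared <- inc %*% t(inc)

# count each unordered pair once (AB, AC, BC), assuming a pair is
# available when at least one product is shared
n_pairs <- sum(shared[upper.tri(shared)] > 0)
n_pairs
```

Only the upper triangle of the matrix is used, so AB and BA are counted as a single non-redundant pair.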

Arguments

r, n

A character vector or factor of regional entities r and products n, respectively.

simplify

A logical indicating whether the results should be simplified to a vector of group identifiers (TRUE) or not (FALSE). In the latter case the output will be a list of connected regions.

relative

A logical indicating whether the absolute (FALSE) or relative (TRUE) number of data gaps should be computed.

Author

Sebastian Weinand

Details

Before calculations start, missing values are removed from the data. Duplicated observations for r and n are counted as one observation. Products with prices in only one region r do not provide meaningful information for interregional comparisons. Such products are therefore not considered by gaps(), pairs() and properties(). This approach follows the default treatment of all index functions in this package.
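
The three preprocessing steps described above can be sketched in base R as follows. This is a minimal illustration on hypothetical toy data, not the package's internal code. The final step assumes that a gap is an empty cell in the region-by-product grid; the scaling used by gaps() may differ.

```r
# Hypothetical toy data with a duplicate (a,1), a product priced in
# only one region (product 2), and a missing region.
r <- c("a", "a", "b", "a", "c", "a", "b", NA)
n <- c("1", "1", "1", "2", "1", "3", "3", "4")

# 1. drop observations with a missing region or product
keep <- !is.na(r) & !is.na(n)
obs <- data.frame(r = r[keep], n = n[keep], stringsAsFactors = FALSE)

# 2. count duplicated (region, product) combinations once
obs <- unique(obs)

# 3. drop products that are observed in only one region
regions_per_product <- tapply(obs$r, obs$n, function(x) length(unique(x)))
multi_region <- names(regions_per_product)[regions_per_product > 1]
obs <- obs[obs$n %in% multi_region, ]

# share of empty cells in the remaining grid (assumed gap definition)
n_cells <- length(unique(obs$r)) * length(unique(obs$n))
gap_share <- 1 - nrow(obs) / n_cells
```

In this toy example five observations remain in a grid of three regions and two products, so one of the six cells is empty.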

Following World Bank (2013, p. 98), a "price tableau is said to be connected if the price data are such that it is not possible to place the countries in two groups in which no item priced by any country in one group is priced by any other country in the second group".
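
To make this definition concrete, the sketch below groups regions by repeatedly merging all regions that price the same product. The helper connected_groups() is a hypothetical name introduced for illustration only; it is not part of pricelevels and makes no claim about how is.connected() or neighbors() work internally.

```r
# Illustrative helper (hypothetical, not part of pricelevels): assigns
# each region to a group of directly or indirectly connected regions.
connected_groups <- function(r, n) {
  keep <- !is.na(r) & !is.na(n)
  r <- as.character(r[keep])
  n <- as.character(n[keep])
  regions <- unique(r)
  group <- seq_along(regions)   # start with one group per region
  names(group) <- regions
  repeat {
    changed <- FALSE
    # every product pulls all regions pricing it into the same group
    for (p in unique(n)) {
      members <- unique(r[n == p])
      lowest <- min(group[members])
      if (any(group[members] != lowest)) {
        group[members] <- lowest
        changed <- TRUE
      }
    }
    if (!changed) break         # stop once no group label changes
  }
  group
}

# same region and product labels as the non-connected example data
r <- c("a", "a", "h", "b", "a", "a", "c", "c", "d", "e", "e", "f", NA)
n <- c(1, 1, "bla", 1, 2, 3, 3, 4, 4, 5, 6, 6, 7)
g <- connected_groups(r, n)
length(unique(g)) == 1  # FALSE: the data split into more than one group
```

Here regions a, b, c and d are linked through products 1, 3 and 4, regions e and f through product 6, and region h stands alone, so the data are connected only within each of these groups.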

References

World Bank (2013). Measuring the Real Size of the World Economy: The Framework, Methodology, and Results of the International Comparison Program. Washington, D.C.: World Bank.

Examples

library(pricelevels)

### connected price data:
set.seed(123)
dt1 <- rdata(R=4, B=1, N=3)

dt1[, is.connected(r=region, n=product)] # true
dt1[, neighbors(r=region, n=product, simplify=TRUE)]
dt1[, gaps(r=region, n=product)]
dt1[, pairs(r=region, n=product)]
dt1[, properties(r=region, n=product)]

### non-connected price data:
dt2 <- data.table::data.table(
          "region"=c("a","a","h","b","a","a","c","c","d","e","e","f",NA),
          "product"=c(1,1,"bla",1,2,3,3,4,4,5,6,6,7),
          "price"=runif(13,5,6),
          stringsAsFactors=TRUE)

dt2[, is.connected(r=region, n=product)] # false
with(dt2, neighbors(r=region, n=product))
dt2[, properties(region, product)]
# note that the first two observations are treated as one,
# while the observation [NA,7] is dropped. Observation [a,2]
# is still included even though it does not provide valuable
# information for interregional comparisons (the product is
# observed in only one region)

# connect the price data:
dt2[connect(r=region, n=product),]
