MRGmerge: Merge two or more multi-resolution grids to a common resolution

Description

Merge two or more multi-resolution grids to a common resolution

Usage

MRGmerge(
  himg1,
  himg2,
  vars1,
  vars2,
  na.rm = TRUE,
  postProcess = FALSE,
  aggr = "merge",
  ...
)

Value

The function produces a new multi-resolution grid, which is an sf object with polygons.

Arguments

himg1: Either a multi-resolution grid (typically resulting from a call to multiResGrid), or a list of such grids
himg2: A multi-resolution grid, typically resulting from a call to multiResGrid
vars1: Variable(s) of interest that should be merged from the first grid, or a list of variables, one for each grid in the list himg1
vars2: Variable(s) of interest that should be merged from the second grid
na.rm: Logical; should NA values be removed when summing values (essentially treating them as zero)
postProcess: Logical; should the postprocessing be done as part of creation of the multiresolution grid (TRUE), or be done in a separate step afterwards (FALSE). The second option is useful when wanting to check the confidential grid cells of the final map
aggr: Should data be aggregated to the largest grid cell (aggr = "merge"), or should data from larger grid cells be disaggregated to smaller grid cells (aggr = "disaggr")
...: Additional grids (himg3, himg4, ...) and variables (vars3, vars4, ...) to be merged. Additional grids and variables must be named.

Details

This function can merge different multi-resolution grids to a common resolution, i.e., it will select the grid cells with the lowest resolution, as these are the ones defining the restrictions.
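As a rough illustration of "selecting the lowest resolution", the sketch below compares two hypothetical grids at four sample locations. The vectors res1 and res2 are made-up cell sizes in metres, not output from multiResGrid, and the real function works on polygons rather than on point-wise vectors.

```r
# Cell size (in metres) of the grid cell covering each of four sample
# locations, for two hypothetical multi-resolution grids
res1 <- c(1000, 1000, 5000, 5000)
res2 <- c(5000, 1000, 5000, 10000)

# With aggr = "merge", the coarsest cell at each location defines the
# restriction, so the common resolution is the element-wise maximum
common_res <- pmax(res1, res2)
common_res
```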

The function will merge the variable names in vars1, vars2, ... if they exist. If they are missing, the function will look for variable names in the attributes of the grids (attr(himg, "vars")). These are added by multiResGrid, but will often disappear if the grid has been manipulated, or has been exported to another format for transmission.

If the variables are not given as vars or attributes, the function will try to guess them from the column names. Typical column names used by MRG (mostly temporary variables such as small, confidential, etc.) will be ignored. If variable names partly coincide with any of these names, or with count, res or geometry, it is necessary to specify vars.
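The lookup order described above can be sketched in base R as follows. The helper resolve_vars is hypothetical and not part of MRG, the list of reserved names is only illustrative, and the sketch uses exact name matching, whereas the text above suggests that partial matches can also cause problems.

```r
# Hypothetical helper, not part of MRG: mimics the lookup order described above
resolve_vars <- function(himg, vars = NULL) {
  # 1. Explicitly supplied variable names take precedence
  if (!is.null(vars)) return(vars)
  # 2. Otherwise use the "vars" attribute, if it has survived
  avars <- attr(himg, "vars")
  if (!is.null(avars)) return(avars)
  # 3. Otherwise guess from the column names, dropping reserved ones
  #    (illustrative list; the names actually ignored by MRG may differ)
  reserved <- c("small", "confidential", "count", "res", "geometry")
  setdiff(names(himg), reserved)
}

# A plain data.frame standing in for a grid that has lost its attributes
grid <- data.frame(UAA = c(10, 20), count = c(3, 4), res = c(1000, 5000))
guessed <- resolve_vars(grid)            # falls through to step 3

attr(grid, "vars") <- "UAA"
from_attr <- resolve_vars(grid)          # step 2

explicit <- resolve_vars(grid, "count")  # step 1
```

Passing vars explicitly is the only route that does not depend on the state of the object, which is why it is the safer choice for grids that have been exported or manipulated.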

The multi-resolution grids must be passed as named parameters if more than two are given.

Common variable names in different grids should be avoided.

The default of the function is to treat NA-values as zeroes when merging (through na.rm in sums). It will therefore not be possible to separate restricted grid cells from grid cells with zero observations after merging, except for the ones that have been left as they were. The alternative would be a much higher number of NA-values in the merged grids.
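The effect of na.rm on a merged cell can be seen with plain sums. The values below are invented for illustration and represent four small cells that fall inside one larger cell, with NA marking a restricted cell.

```r
# Toy values for four small cells inside one larger cell
restricted <- c(12, NA, 30, NA)  # two cells suppressed
empty      <- c(12, 0, 30, 0)    # two cells genuinely empty

sum_restricted <- sum(restricted, na.rm = TRUE)   # NA treated as zero
sum_empty      <- sum(empty, na.rm = TRUE)        # same total as above
sum_strict     <- sum(restricted, na.rm = FALSE)  # NA: merged cell is lost
```

The first two sums are identical, which is why restricted and empty cells cannot be told apart after merging; the third shows how NA would otherwise propagate to the whole merged cell.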

The resulting grid will most likely not have exactly the same values as a multi-resolution grid produced directly from the microdata. If the input-grids have been post-processed (the normal situation when not having access to the microdata), the grid cell values have usually been rounded, and some might have been suppressed. As these rounded and potentially suppressed values are summed, their values are likely to deviate from those that are computed directly from the microdata through a joint gridding process.
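A small numeric sketch of this deviation, using invented values and assuming that post-processing rounded each cell to the nearest 10 (which is what base R round(x, -1) does):

```r
true_vals <- c(14, 14, 14)          # hypothetical unrounded cell values
published <- round(true_vals, -1)   # each cell rounded to the nearest 10

merged_sum <- sum(published)             # what merging can reconstruct
joint_sum  <- round(sum(true_vals), -1)  # what joint gridding would give
c(merged = merged_sum, joint = joint_sum)
```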

The argument aggr determines the direction of aggregation. If aggr == "merge", the values in high resolution grid cells will be aggregated to match those of lower resolution grid cells in the second grid. If aggr == "disaggr", the values of the lower resolution grid cells will be redistributed evenly among the higher resolution grid cells, in proportion to their area. Note that this will most likely result in grid cell values that appear to be confidential (for example, representing fewer than 10 individuals). These are not actually confidential values, but average values from a larger area. This will in most cases be fine if the data are used for analyses, but such values should be published with care.
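The area-based redistribution for aggr = "disaggr" can be illustrated with simple arithmetic. The numbers are invented: one 10 km cell with a value of 200 is split over three 5 km cells and twenty-five 1 km cells that together cover the same area. This is a sketch of the principle, not of the code inside MRGmerge.

```r
coarse_value <- 200                        # value in one 10 km grid cell
coarse_area  <- 10^2                       # its area in km^2
fine_area    <- c(25, 25, 25, rep(1, 25))  # three 5 km cells, 25 1 km cells

# Each smaller cell receives a share proportional to its area
fine_value <- coarse_value * fine_area / coarse_area

head(fine_value, 4)  # values for the three 5 km cells and the first 1 km cell
sum(fine_value)      # the total is preserved
```

The 1 km cells end up with very small values even though the underlying 10 km cell was not confidential, which is the situation the paragraph above warns about.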

Also note that if more than two MRG grids are merged at the same time, the redistribution will occur more than once. If the resolution of some grid cells becomes higher for each redistribution, with some of the high resolution grid cells missing, then the average values might differ between high resolution grid cells that come from the same low resolution grid cell. See the plotted examples of hh2 and hh22.

Examples

# \donttest{
library(sf)
library(dplyr)
library(ggplot2)
library(viridis)

# These are SYNTHETIC agricultural FSS data 
data(ifs_dk) # Census data
ifs_weight = ifs_dk %>% dplyr::filter(Sample == 1) # Extract weighted subsample

# Create spatial data
ifg = fssgeo(ifs_dk, locAdj = "LL")
fsg = fssgeo(ifs_weight, locAdj = "LL")

# We use the numeric part of the farmtype to create a third variable. This
# is done only for the sake of the example; the value does not have any
# meaning when treated like this
ifg$ft = as.numeric(substr(ifg$FARMTYPE, 3, 4))^2

ress = c(1,5,10,20,40, 80, 160)*1000
# Create regular grid of the variables
ifl = gridData(ifg, vars = c("UAA", "UAAXK0000_ORG", "ft"), res = ress)

# Create the different multi-resolution grids
himg1 = multiResGrid(ifl, vars = "UAA", ifg = ifg, postProcess = FALSE)
himg2 = multiResGrid(ifl, vars = "UAAXK0000_ORG", ifg = ifg, postProcess = FALSE)
himg3 = multiResGrid(ifl, vars = "ft", ifg = ifg, postProcess = FALSE)

# The grids have different numbers of polygons
dim(himg1)
dim(himg2)
dim(himg3)

hh1 = MRGmerge(himg1, himg2, himg3 = himg3)
dim(hh1)
# Postprocessing can also be done on the merged object
hh11 = MRGmerge(himg1, himg2, himg3 = himg3, postProcess = TRUE, rounding = -1)
dim(hh11)
summary(hh1$UAA-hh11$UAA)

# Here the merging will instead redistribute average values to
# the higher resolution grid cells. The second call also shows
# the effect of merging a third layer
hh2 = MRGmerge(himg1, himg2, aggr = "disaggr")
hh22 = MRGmerge(himg1, himg2, himg3 = himg3, aggr = "disaggr")
himg2$orgShare = himg2$UAAXK0000_ORG/himg2$res^2 * 10000
hh2$orgShare = hh2$UAAXK0000_ORG/hh2$res^2 * 10000
hh22$orgShare = hh22$UAAXK0000_ORG/hh22$res^2 * 10000
# Plot the organic share (organic area relative to grid cell area) for
# the original MRG grid for organic area, and after merging with the higher
# resolution maps.
p1 = ggplot(himg2) + geom_sf(aes(fill = orgShare)) + ggtitle("original") +
      scale_fill_viridis()
p2 = ggplot(hh2) + geom_sf(aes(fill = orgShare)) + ggtitle("merged two")+
      scale_fill_viridis() 
p3 = ggplot(hh22) + geom_sf(aes(fill = orgShare)) + ggtitle("merged three")+
      scale_fill_viridis() 
if (require(patchwork)) p1 + p2 + p3 + plot_spacer() + plot_layout(guides = 'collect')

# If two data sets share the same variable, one of them has to be renamed.
# (A comparison of the two can act as an indication of possible errors
# introduced through the post-processing)

himg21 = multiResGrid(ifl, vars = c("UAA", "UAAXK0000_ORG"), ifg = ifg, postProcess = FALSE)
hh3 = try(MRGmerge(himg1, himg21, himg3 = himg3))
himg21 = himg21 %>% rename(UAA2 = UAA, weight_UAA2 = weight_UAA) 
hh3 = MRGmerge(himg1, himg21, himg3 = himg3)


summary(hh3[, c("UAA", "UAA2")])

himg4 = multiResGrid(ifl, vars = c("UAA", "ft", "UAAXK0000_ORG"), ifg = ifg, postProcess = FALSE)
summary(hh1[, c("UAA", "UAAXK0000_ORG", "ft")])
summary(himg4[, c("UAA", "UAAXK0000_ORG", "ft")])
# }
