Note: a newer version (1.2.2) of this package is available.

geomultistar

Multidimensional systems make complex queries easy to express. The geographic dimension, together with the temporal dimension, plays a fundamental role in these systems. The geomultistar package associates vector geographic data layers with the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector geographic data layers. In other words, the package enriches multidimensional queries with geographic data.

The multidimensional structures on which we can define the queries can be created from flat tables with the rolap or starschemar packages, or imported directly using functions from the geomultistar package.

Installation

You can install the released version of geomultistar from CRAN with:

install.packages("geomultistar")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("josesamos/geomultistar")

Example

If we start from a flat table, we can generate a star schema using the rolap package, as described in its vignettes.

If we have a star schema in another tool, we need to import its fact and dimension tables into R as tibbles (mrs_fact_age, mrs_fact_cause, mrs_where, mrs_when and mrs_who in this example). Once we have them in this format, we build a multistar structure from them. A multistar can contain multiple fact and dimension tables, so facts can share dimensions. The definition below adds the tables, declares the measures of each fact table, and establishes the relationships between facts and dimensions.

library(geomultistar)

ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  ) |>
  add_dimension(
    dimension_name = "where",
    dimension_table = mrs_where,
    dimension_key = "where_pk",
    fact_name = "mrs_age",
    fact_key = "where_fk"
  ) |>
  add_dimension(
    dimension_name = "when",
    dimension_table = mrs_when,
    dimension_key = "when_pk",
    fact_name = "mrs_age",
    fact_key = "when_fk",
    key_as_data = TRUE
  ) |>
  add_dimension(
    dimension_name = "who",
    dimension_table = mrs_who,
    dimension_key = "who_pk",
    fact_name = "mrs_age",
    fact_key = "who_fk"
  ) |>
  relate_dimension(dimension_name = "where",
                   fact_name = "mrs_cause",
                   fact_key = "where_fk") |>
  relate_dimension(dimension_name = "when",
                   fact_name = "mrs_cause",
                   fact_key = "when_fk")
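To make the key parameters above more concrete, here is a minimal base R sketch of the relationship that dimension_key and fact_key describe. The miniature tables and their values are hypothetical, and the count column assumes that nrow_agg names a measure holding the number of aggregated rows; this illustrates the idea and is not the package's implementation.

```r
# Hypothetical miniature tables; column names mirror the example, values are made up.
where_dim <- data.frame(
  where_pk = 1:3,
  city = c("Boston", "Albany", "Hartford"),
  state = c("MA", "NY", "CT"),
  stringsAsFactors = FALSE
)
age_fact <- data.frame(
  where_fk = c(1L, 1L, 2L, 3L),
  n_deaths = c(10, 5, 7, 3),
  count = 1L  # assumption: each fact row stands for one source row
)

# Resolve each fact row's foreign key against the dimension's primary key.
joined <- merge(age_fact, where_dim, by.x = "where_fk", by.y = "where_pk")

# Roll the measures up to a dimension attribute (here: state).
by_state <- aggregate(cbind(n_deaths, count) ~ state, data = joined, FUN = sum)
print(by_state)
```

Because two fact tables can point at the same dimension table through their own foreign keys, a second fact table only needs the relationship declared, which is the role relate_dimension() plays in the definition above.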

Once we have a multistar structure, we associate vector geographic data layers with the attributes of the geographic dimension. We can use existing layers or generate them from layers that have already been defined. The result is a geomultistar structure.

gms <-
  geomultistar(ms, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  ) |>
  define_geoattribute(
    attribute = "county",
    from_layer = usa_counties,
    by = c("county" = "county", "state" = "state")
  ) |>
  define_geoattribute(
    attribute = c("state"),
    from_layer = usa_states,
    by = c("state" = "state")
  ) |>
  define_geoattribute(from_attribute = "state")

In the last definition, no target attribute is specified, so all remaining attributes of the dimension are defined automatically from the layer associated with the attribute given in from_attribute (here, state).

Finally, we can define multidimensional queries on this structure using the functions available in this package. When such a query is executed, the layers associated with the selected attributes are used to produce a new vector geographic data layer.

gdqr <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("division_name", "region_name")) |>
  select_dimension(name = "when",
                   attributes = c("year", "week")) |>
  select_fact(name = "mrs_age",
              measures = c("n_deaths")) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "other_deaths")
  ) |>
  filter_dimension(name = "when", week <= "03") |>
  run_geoquery(wider = TRUE)

The result is returned as a list. Its sf component is a vector geographic data layer that we can save or display as a map, using the functions associated with the sf class.

class(gdqr)
#> [1] "list"

plot(gdqr$sf[,"n_deaths_01"])

Although the query selects the attributes division_name and region_name, the result is obtained at the finest level of granularity among them, in this case division_name, as can be seen in the figure.

Only the parts of each division made up of states with recorded data are shown. To show the full extent of each division, we would need to explicitly associate a layer at that level.
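The effect described above can be sketched with the sf package alone. The example below dissolves hypothetical state polygons into divisions, once using only the states flagged as having data and once using all states. The layer, the has_data flag and the helper functions are assumptions made for illustration; this is not how geomultistar is implemented internally.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0).
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Made-up layer: three "states" grouped into two "divisions".
states <- st_sf(
  state = c("S1", "S2", "S3"),
  division = c("D1", "D1", "D2"),
  has_data = c(TRUE, FALSE, TRUE),
  geometry = st_sfc(unit_square(0), unit_square(1), unit_square(3))
)

# Hypothetical helper: union the geometries of all rows sharing a division.
dissolve <- function(layer) {
  ids <- unique(layer$division)
  geoms <- lapply(ids, function(d) {
    st_union(st_geometry(layer)[layer$division == d])
  })
  st_sf(division = ids, geometry = do.call(c, geoms))
}

partial <- dissolve(states[states$has_data, ])  # only states with data
full <- dissolve(states)                        # full extent of each division

st_area(partial)
st_area(full)
```

In this sketch, division D1 covers a smaller area in partial than in full, because state S2 has no data and therefore contributes no geometry.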

The result includes the meaning of each variable in table form.

| id_variable | measure | week |
|---|---|---|
| n_deaths_01 | n_deaths | 01 |
| n_deaths_02 | n_deaths | 02 |
| n_deaths_03 | n_deaths | 03 |
| count_01 | count | 01 |
| count_02 | count | 02 |
| count_03 | count | 03 |
| mrs_cause_pneumonia_and_influenza_deaths_01 | mrs_cause_pneumonia_and_influenza_deaths | 01 |
| mrs_cause_pneumonia_and_influenza_deaths_02 | mrs_cause_pneumonia_and_influenza_deaths | 02 |
| mrs_cause_pneumonia_and_influenza_deaths_03 | mrs_cause_pneumonia_and_influenza_deaths | 03 |
| mrs_cause_other_deaths_01 | mrs_cause_other_deaths | 01 |
| mrs_cause_other_deaths_02 | mrs_cause_other_deaths | 02 |
| mrs_cause_other_deaths_03 | mrs_cause_other_deaths | 03 |
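The variable names in this table combine a measure with a week value. As a rough illustration of how such a wide layout arises, the base R sketch below pivots a small long table so that each week becomes its own column, then builds a matching lookup table. The division names and death counts are made up, and the code only mimics the naming pattern; it is not the package's own procedure.

```r
# Hypothetical long-format result: one row per division and week.
long <- data.frame(
  division_name = rep(c("New England", "Pacific"), each = 3),
  week = rep(c("01", "02", "03"), times = 2),
  n_deaths = c(12, 9, 11, 20, 18, 25),
  stringsAsFactors = FALSE
)

# Pivot to wide format: one row per division, one column per measure and week.
wide <- reshape(
  long,
  idvar = "division_name",
  timevar = "week",
  direction = "wide",
  sep = "_"
)

# Lookup table recording what each generated column means.
variables <- data.frame(
  id_variable = setdiff(names(wide), "division_name"),
  measure = "n_deaths",
  week = sort(unique(long$week)),
  stringsAsFactors = FALSE
)

print(names(wide))
print(variables)
```

Keeping one row per geographic instance means each geometry appears only once in the layer, at the cost of needing the lookup table to interpret the column names.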

It can be saved directly as a GeoPackage, using the save_as_geopackage() function.

save_as_geopackage(gdqr, "division")

Version: 1.2.1

License: MIT + file LICENSE

Maintainer: Jose Samos

Last Published: January 9th, 2024

Functions in geomultistar (1.2.1)

group_facts: Group facts
dimensional_query: dimensional_query S3 class
get_empty_geoinstances: Get empty instances of a geographic attribute
mrs_age_test: Mortality Reporting System by Age Test
dereference_dimension: Dereference a dimension
group_table: Group the records in the table
filter_dimension: Filter dimension
multistar: multistar S3 class
mrs_where: Dimension where
mrs_who: Dimension who
ms_mrs_test: Multistar for Mortality Reporting System Test
mrs_when: Dimension when
ms_mrs: Multistar for Mortality Reporting System
filter_selected_instances: Filter selected instances
geomultistar: geomultistar S3 class
multistar_as_flat_table: Export a multistar as a flat table
get_selected_measure_names: get_selected_measure_names
mrs_fact_age: Fact age
mrs_fact_cause: Fact cause
name_with_nexus: Name with nexus
run_geoquery: Get a geographic vector from a query
remove_duplicate_dimension_rows: Remove duplicate dimension rows
new_multistar_empty: multistar S3 class
prepare_join: Transform a tibble to join
new_multistar: multistar S3 class
new_geomultistar: geomultistar S3 class
reference_dimension: Reference a dimension
select_dimension: Select dimension
select_fact: Select fact
save_as_geopackage: Save as geopackage
run_query: Run query
uk_london_boroughs: UK London Boroughs
st_mrs_age_test: Star Schema for Mortality Reporting System by Age Test
relate_dimension: Relate a dimension table to a fact table in a multistar
usa_cities: USA Cities, 2014
usa_nation: USA Nation, 2018
new_dimensional_query: dimensional_query S3 class
new_fact_table: fact_table S3 class
usa_counties: USA Counties, 2018
usa_regions: USA Regions, 2018
unify_facts_by_grain: Unify facts by grain
widen_flat_table: widen_flat_table
validate_names: Validate names
usa_divisions: USA Divisions, 2018
usa_states: USA States, 2018
add_dimension: Add a dimension table to a multistar
define_geoattribute_from_attribute: Define a geoattribute from another
define_geoattribute_from_layer: Define an attribute from a layer
add_facts: Add a fact table to a multistar
define_geoattribute: Define geographic attributes
define_selected_facts: Define selected facts
add_geodimension_additional_attributes: Add geodimension additional attributes
define_selected_dimensions: Define selected dimensions
default_attribute: Default attribute
delete_unused_foreign_keys: Delete unused foreign keys