tab2xml
The goal of tab2xml
is to convert spreadsheet files (.xlsx or .ods)
into structured XML documents using a predefined template. The package
processes the spreadsheet data, replacing template tokens with
corresponding values, and manages foreign key relationships
automatically.
Installation
You can install the released version of tab2xml
from
CRAN with:
install.packages("tab2xml")
And the development version from GitHub with:
# install.packages("pak")
pak::pak("josesamos/tab2xml")
Example
This example demonstrates how to use tab2xml
to convert an Excel or
ODS file into an XML file, based on a schema example from the Mondrian
Documentation.
library(tab2xml)
# Define file paths
source_xml <- system.file("extdata", "schema_template.xml", package = "tab2xml")
source_xlsx <- system.file("extdata", "schema.xlsx", package = "tab2xml")
temp_file <- tempfile(fileext = ".xml")
Show spreadsheet contents
sheet_names <- readxl::excel_sheets(source_xlsx)
for (sheet in sheet_names) {
cat("\n#### Sheet:", sheet)
data <- readxl::read_excel(source_xlsx, sheet = sheet)
print(knitr::kable(data))
}
Sheet: Cube
cube_pk | name | table_fk |
---|
1 | Sales | 1 |
Sheet: Table
table_pk | name |
---|
1 | sales_fact_1997 |
2 | customer |
3 | time_by_day |
Sheet: Dimension
dimension_pk | name | foreignKey | cube_fk |
---|
1 | Gender | customer_id | 1 |
2 | Time | time_id | 1 |
Sheet: Hierarchy
hierarchy_pk | name | hasAll | allMemberName | primaryKey | dimension_fk | table_fk |
---|
1 | Gender | true | allMemberName=“All Genders” | customer_id | 1 | 2 |
2 | Time | false | NA | time_id | 2 | 3 |
Sheet: Level
name | column | type | uniqueMembers | hierarchy_fk |
---|
Gender | gender | String | true | 1 |
Year | the_year | Numeric | true | 2 |
Quarter | quarter | Numeric | false | 2 |
Month | month_of_year | Numeric | false | 2 |
Sheet: Measure
name | column | aggregator | formatString | cube_fk |
---|
Unit Sales | unit_sales | sum | #,### | 1 |
Store Sales | store_sales | sum | #,###.## | 1 |
Store Cost | store_cost | sum | #,###.00 | 1 |
Sheet: CalculatedMember
calculatedmember_pk | name | dimension | formula | cube_fk |
---|
1 | Profit | Measures | [Measures].[Store Sales] - [Measures].[Store Cost] | 1 |
Sheet: CalculatedMemberProperty
name | value | calculatedmember_fk |
---|
FORMAT_STRING | $#,##0.00 | 1 |
Convert spreadsheet to XML
file <- sheet2xml(source_xlsx, source_xml, temp_file, optimize = TRUE)
Check output
library(xml2)
#> Warning: package 'xml2' was built under R version 4.4.3
xml_content <- readLines(file, warn = FALSE)
cat("```xml\n", paste(xml_content, collapse = "\n"), "\n```", sep = "")
<?xml version="1.0" encoding="UTF-8"?>
<Schema>
<Cube name="Sales">
<Table name="sales_fact_1997"/>
<Dimension name="Gender" foreignKey="customer_id">
<Hierarchy name="Gender" hasAll="true" allMemberName="All Genders" primaryKey="customer_id">
<Table name="customer"/>
<Level name="Gender" column="gender" type="String" uniqueMembers="true"/>
</Hierarchy>
</Dimension>
<Dimension name="Time" foreignKey="time_id">
<Hierarchy name="Time" hasAll="false" primaryKey="time_id">
<Table name="time_by_day"/>
<Level name="Year" column="the_year" type="Numeric" uniqueMembers="true"/>
<Level name="Quarter" column="quarter" type="Numeric" uniqueMembers="false"/>
<Level name="Month" column="month_of_year" type="Numeric" uniqueMembers="false"/>
</Hierarchy>
</Dimension>
<Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="#,###"/>
<Measure name="Store Sales" column="store_sales" aggregator="sum" formatString="#,###.##"/>
<Measure name="Store Cost" column="store_cost" aggregator="sum" formatString="#,###.00"/>
<CalculatedMember name="Profit" dimension="Measures" formula="[Measures].[Store Sales] - [Measures].[Store Cost]">
<CalculatedMemberProperty name="FORMAT_STRING" value="$#,##0.00"/>
</CalculatedMember>
</Cube>
</Schema>
In this way, we can organize and work with the data in tabular form and
generate XML documents directly using the provided templates.