topolow (version 1.0.0)

long_to_matrix: Convert Long Format Data to Distance Matrix

Description

Converts a dataset from long format to a symmetric distance matrix. The function handles antigenic cartography data, where measurements may exist between antigen points and antiserum points. Row and column names can optionally be sorted by a time variable.

Usage

long_to_matrix(
  data,
  chnames,
  chorder = NULL,
  rnames,
  rorder = NULL,
  values_column,
  rc = FALSE,
  sort = FALSE
)

Value

A symmetric matrix of distances with row and column names corresponding to the unique points in the data. NA values represent unmeasured pairs.
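
As a rough illustration of the shape described above, the following builds a small symmetric matrix by hand and checks the two properties a caller would typically rely on: symmetry, and NA for unmeasured pairs. It does not call long_to_matrix; the point names and values are made up, and the diagonal is left as NA only because its contents are not stated here.

```r
# Hand-built stand-in for the kind of matrix described above (illustrative values only)
pts <- c("A", "B", "X")
m <- matrix(NA_real_, nrow = 3, ncol = 3, dimnames = list(pts, pts))

# Fill each measured pair in both directions so the matrix stays symmetric
m["A", "X"] <- m["X", "A"] <- 2.5
m["B", "X"] <- m["X", "B"] <- 1.8

# isSymmetric() treats NA entries as matching when they mirror each other
is_sym <- isSymmetric(m)

# Count unmeasured pairs once each by looking only above the diagonal
n_unmeasured <- sum(is.na(m[upper.tri(m)]))
```

Here the only unmeasured pair is A and B, so counting above the diagonal avoids reporting it twice.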

Arguments

data

Data frame in long format.

chnames

Character. Name of column holding the challenge point names.

chorder

Character. Optional name of column for challenge point ordering.

rnames

Character. Name of column holding reference point names.

rorder

Character. Optional name of column for reference point ordering.

values_column

Character. Name of column containing distance/difference values. The values should be distance-like measurements (e.g., antigenic distance or IC50), not similarity-like measurements (e.g., HI titer).

rc

Logical. If TRUE, reference points are treated as a subset of challenge points. If FALSE, they are treated as distinct sets. Default is FALSE.

sort

Logical. Whether to sort rows/columns by chorder/rorder. Default is FALSE.
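
Because values_column must hold distance-like values, similarity-like measurements such as HI titers need converting before they are passed in. The sketch below shows one convention commonly used in antigenic cartography, assuming distance = log2(highest titer for that serum) - log2(titer). The helper name titer_to_distance and the column name titer are hypothetical, and this is not necessarily the preprocessing topolow itself applies.

```r
# Hypothetical helper: converts titers to distances within each serum.
# Assumes titers are positive numbers (thresholded values such as "<10" are not handled).
titer_to_distance <- function(df, titer_col = "titer", serum_col = "serum") {
  log_titer <- log2(df[[titer_col]])
  # Highest log2 titer observed for each serum, aligned to the rows of df
  max_per_serum <- ave(log_titer, df[[serum_col]], FUN = max)
  df$distance <- max_per_serum - log_titer
  df
}

# Made-up titers for illustration
hi <- data.frame(
  antigen = c("A", "B", "A", "B"),
  serum   = c("X", "X", "Y", "Y"),
  titer   = c(1280, 320, 160, 640)
)
hi <- titer_to_distance(hi)
hi$distance  # 0, 2, 2, 0 up to floating-point error
```

The resulting distance column could then be named in values_column.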

Details

The function expects data in long format with at least three columns:

  • A column for challenge point names

  • A column for reference point names

  • A column containing the distance/difference values

Optionally, ordering columns can be provided to sort the output matrix. The 'rc' parameter determines how names shared between reference and challenge points are handled.
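
To make the long-to-matrix idea concrete, here is a minimal base R sketch of the pivot. The helper sketch_long_to_matrix is hypothetical and is not the topolow implementation: it ignores ordering, pools all names into a single set (so a name appearing in both columns maps to one row and column), and lets the last duplicate measurement win.

```r
# Hypothetical helper for illustration only; not the package's code
sketch_long_to_matrix <- function(df, ch, rf, val) {
  ch_names <- as.character(df[[ch]])
  rf_names <- as.character(df[[rf]])
  pts <- unique(c(ch_names, rf_names))

  # Start with every pair unmeasured
  m <- matrix(NA_real_, length(pts), length(pts), dimnames = list(pts, pts))

  # A two-column character matrix indexes a matrix by row and column name
  idx <- cbind(ch_names, rf_names)
  m[idx] <- df[[val]]                       # challenge -> reference
  m[idx[, 2:1, drop = FALSE]] <- df[[val]]  # mirrored entry keeps it symmetric
  m
}

long_df <- data.frame(
  antigen  = c("A", "B", "A"),
  serum    = c("X", "X", "Y"),
  distance = c(2.5, 1.8, 3.0)
)
m2 <- sketch_long_to_matrix(long_df, "antigen", "serum", "distance")

# Names present in both columns; relevant when deciding how to set 'rc'
shared <- intersect(long_df$antigen, long_df$serum)
```

In this toy data no name is shared between the two columns, so shared is empty and the two sets are distinct.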

Examples

library(topolow)

# Long-format input: one row per measured antigen/serum pair
data <- data.frame(
  antigen = c("A", "B", "A"),
  serum = c("X", "X", "Y"),
  distance = c(2.5, 1.8, 3.0),
  year = c(2000, 2001, 2000)
)

# Basic conversion
mat <- long_to_matrix(data,
                      chnames = "antigen",
                      rnames = "serum",
                      values_column = "distance")

# With sorting by year
mat_sorted <- long_to_matrix(data,
                             chnames = "antigen",
                             chorder = "year",
                             rnames = "serum",
                             rorder = "year",
                             values_column = "distance",
                             sort = TRUE)