gclink (version 1.1)

gc_scale: Scale Gene-Cluster Coordinates for Visualization

Description

Prepares a gene-cluster annotation table for downstream plotting by converting absolute genomic coordinates into relative positions, so that every cluster starts at 0 and is oriented consistently. Missing gene_group values (NA, treated as hypothetical ORFs) and missing gene_label values are replaced with placeholders, and the factor levels of gene_group are set as requested.

Usage

gc_scale(GC_meta = GC_meta, levels_gene_group = levels_gene_group)

Value

The input data frame with the following new or overwritten columns:

gene_label

Empty string ("") if originally NA.

gene_group

Set to "hypothetical ORF" if originally NA, then coerced to a factor using levels_gene_group.

Pgenome

Factor version of gene_cluster; levels follow the order of appearance in the data.

Pstart, Pend

Relative start and end coordinates (numeric) within each cluster, scaled so that the left-most gene starts at 0.

Pdirection

Logical vector: TRUE for forward, FALSE for reverse.
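
The columns listed above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: scale_sketch and the toy table are hypothetical, the left-most coordinate of a cluster is taken to be the smallest of its start/end values, Pdirection is assumed to be simply direction == 1, and any re-orientation of clusters is not reproduced.

```r
# Hypothetical helper that imitates the documented output columns.
# It is NOT gclink's code; see the assumptions in the text above.
scale_sketch <- function(GC_meta, levels_gene_group) {
  out <- GC_meta

  # Placeholders for missing labels and groups
  out$gene_label[is.na(out$gene_label)] <- ""
  out$gene_group[is.na(out$gene_group)] <- "hypothetical ORF"
  out$gene_group <- factor(out$gene_group, levels = levels_gene_group)

  # Cluster factor whose levels follow order of appearance
  out$Pgenome <- factor(out$gene_cluster, levels = unique(out$gene_cluster))

  # Left-most coordinate of each cluster, repeated for every row of that cluster
  offset <- ave(pmin(out$start, out$end), out$gene_cluster, FUN = min)
  out$Pstart <- out$start - offset
  out$Pend   <- out$end - offset

  # Assumption: forward strand is encoded as 1 in `direction`
  out$Pdirection <- out$direction == 1
  out
}

# Invented example data with two clusters
toy <- data.frame(
  gene_cluster = c("c1", "c1", "c2", "c2"),
  gene_group   = c("core", NA, "core", "transport"),
  gene_label   = c("geneA", NA, "geneA", "geneB"),
  start        = c(1000, 2200, 50000, 51500),
  end          = c(2000, 3100, 51200, 52400),
  direction    = c(1, -1, -1, 1),
  stringsAsFactors = FALSE
)

res <- scale_sketch(toy, c("core", "transport", "hypothetical ORF"))
print(res[, c("Pgenome", "Pstart", "Pend", "Pdirection")])
```

In this sketch the original start and end columns are left untouched, and each cluster's first gene ends up with Pstart equal to 0.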

Arguments

GC_meta

A data frame containing gene-cluster information. Must include the columns gene_cluster, gene_group, gene_label, start, end, and direction (numeric: 1 for forward, -1 for reverse).

levels_gene_group

Character vector specifying the desired factor levels for gene_group. Group names should appear in the order required for plotting legends.
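
As an illustration of the two arguments, the sketch below builds an input table with the required columns and a matching levels vector. All values are invented. "hypothetical ORF" is listed in levels_gene_group as a precaution, because base R's factor() turns any value absent from levels into NA; whether gc_scale adds that level on its own is not stated here.

```r
# Toy annotation table with the six required columns (values are invented)
GC_meta <- data.frame(
  gene_cluster = c("clusterA", "clusterA", "clusterA", "clusterB", "clusterB"),
  gene_group   = c("core", NA, "transport", "core", "regulator"),
  gene_label   = c("geneA", NA, "geneB", "geneA", "geneR"),
  start        = c(10500, 11800, 12900, 86100, 87300),
  end          = c(11700, 12600, 14100, 87000, 88200),
  direction    = c(1, 1, -1, -1, -1),
  stringsAsFactors = FALSE
)

# Legend order for plotting; the placeholder group goes last
levels_gene_group <- c("core", "transport", "regulator", "hypothetical ORF")

# Basic input checks before calling the function
required <- c("gene_cluster", "gene_group", "gene_label",
              "start", "end", "direction")
missing_cols <- setdiff(required, names(GC_meta))
direction_ok <- all(GC_meta$direction %in% c(1, -1))
stopifnot(length(missing_cols) == 0, direction_ok)

# With gclink installed, the call would then be:
# scaled <- gclink::gc_scale(GC_meta = GC_meta,
#                            levels_gene_group = levels_gene_group)
```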

Details

  • Absolute start/end values are not modified; scaled values are stored in new columns (Pstart, Pend).

  • In downstream plotting code, Pgenome can be replaced by any other column that uniquely identifies a cluster (e.g., Genome), provided each genome contains only one cluster.
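
Before swapping Pgenome for another identifier, it may be worth confirming that the one-cluster-per-genome condition holds. The sketch below shows one way to do that in base R; the Genome column and the example values are hypothetical and are not produced by gc_scale.

```r
# Invented scaled table that also carries a hypothetical Genome column
scaled <- data.frame(
  Genome       = c("genome1", "genome1", "genome2"),
  gene_cluster = c("g1_c1", "g1_c1", "g2_c1"),
  stringsAsFactors = FALSE
)

# Count distinct clusters per genome
clusters_per_genome <- tapply(
  scaled$gene_cluster, scaled$Genome,
  function(x) length(unique(x))
)
one_to_one <- all(clusters_per_genome == 1)

# Only substitute the identifier when each genome maps to a single cluster
if (one_to_one) {
  scaled$Pgenome <- factor(scaled$Genome, levels = unique(scaled$Genome))
}
```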