
gclink (version 1.1)

gc_range: Determine ORF Range for a Candidate Gene Cluster

Description

Internal helper used by gc_cal. After gc_position has isolated the ORF positions belonging to a single cluster, this function validates and trims that range so that the final span (distance between the first and last retained ORF) does not exceed AllGeneNum. The goal is to retain the largest contiguous block that still satisfies the user-defined size limit.

Usage

gc_range(Norf_position = Norf_position, AllGeneNum = 20, MinConSeq = 10)

Value

A numeric vector containing the final ORF positions that define the validated gene cluster. If no valid block can be produced, the vector will be empty.

Arguments

Norf_position

Numeric vector of ORF positions (ascending) that belong to the current candidate cluster (output from gc_position).

AllGeneNum

Integer. Maximum allowed genomic span (in ORF count) for the final cluster.

MinConSeq

Integer. Minimum number of consecutive reference genes required for the cluster.

Details

  • For every reference gene in Norf_position, the function evaluates whether a window of at least MinConSeq consecutive reference genes centred on that gene can fit within AllGeneNum consecutive ORFs.

  • Genes that pass the test are collected in retain.site.

  • The minimum and maximum positions in retain.site are then used to slice the full ORF range, so that the final cluster length does not exceed AllGeneNum.
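The steps above can be sketched in a few lines of R. This is a hypothetical re-implementation for illustration only, not the gclink source: the name gc_range_sketch is invented, and it rests on three assumptions. The span of a window is taken as last position minus first position plus one. A gene counts as passing if at least one window of MinConSeq consecutive reference genes containing it fits within AllGeneNum ORFs, which approximates the "centred on that gene" wording. The returned vector holds the input positions lying between the outermost retained genes.

```r
# Hypothetical sketch of the windowing logic; NOT the gclink implementation.
gc_range_sketch <- function(Norf_position, AllGeneNum = 20, MinConSeq = 10) {
  pos <- sort(unique(Norf_position))
  n <- length(pos)

  # Too few reference genes to form even one window.
  if (n < MinConSeq) return(numeric(0))

  # Window k covers pos[k] .. pos[k + MinConSeq - 1].
  starts <- seq_len(n - MinConSeq + 1)
  ends <- starts + MinConSeq - 1

  # Assumed span definition: last position - first position + 1.
  fits <- (pos[ends] - pos[starts] + 1) <= AllGeneNum

  # A gene is retained if at least one fitting window contains it.
  retain <- logical(n)
  for (k in starts[fits]) {
    retain[k:(k + MinConSeq - 1)] <- TRUE
  }
  retain.site <- pos[retain]
  if (length(retain.site) == 0) return(numeric(0))

  # Keep the input positions between the outermost retained genes.
  pos[pos >= min(retain.site) & pos <= max(retain.site)]
}

# Five clustered genes plus two distant ones; the distant pair is trimmed.
gc_range_sketch(c(1, 2, 3, 5, 6, 30, 31), AllGeneNum = 6, MinConSeq = 3)
# 1 2 3 5 6

# No window of two genes fits within five ORFs, so the result is empty.
length(gc_range_sketch(c(1, 10, 20), AllGeneNum = 5, MinConSeq = 2))
# 0
```

In this sketch the size limit is checked per window only; it does not re-check the span of the final slice, so the package may handle long chains of overlapping windows differently.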