ICAMS (version 2.0.7)

FindDelMH: Return the length of microhomology at a deletion.

Description

Return the length of microhomology at a deletion.

Usage

FindDelMH(context, deleted.seq, pos, trace = 0)

Arguments

context

The deleted sequence plus ample surrounding sequence on each side (each flank at least as long as deleted.seq).

deleted.seq

The deleted sequence in context.

pos

The position of deleted.seq in context.

trace

If > 0, print various messages with cat.

Value

The length of the maximum microhomology of deleted.seq in context, or -1 if the deletion is in a repeat (see Details).

Details

This function is primarily for internal use, but we export it to document the underlying logic.

Example:

GGCTAGTT aligned to GGCTAGAACTAGTT with a deletion represented as:

GGCTAGAACTAGTT
GG------CTAGTT GGCTAGTT GG[CTAGAA]CTAGTT
                           ----   ----

Presumed repair mechanism leading to this:

  ....
GGCTAGAACTAGTT
CCGATCTTGATCAA

=>

  ....
GGCTAG      TT
CC      GATCAA
        ....

=>

GGCTAGTT
CCGATCAA

Variant-caller software can represent the same deletion in several different, but completely equivalent, ways.

GGC------TAGTT GGCTAGTT GGC[TAGAAC]TAGTT
                          * ---  * ---

GGCT------AGTT GGCTAGTT GGCT[AGAACT]AGTT
                          ** --  ** --

GGCTA------GTT GGCTAGTT GGCTA[GAACTA]GTT
                          *** -  *** -

GGCTAG------TT GGCTAGTT GGCTAG[AACTAG]TT
                          ****   ****
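
The equivalence of these representations can be checked directly. The following is a minimal sketch in base R, not ICAMS code: the variable names are illustrative, and the start positions are assumed to be 1-based offsets into the 14-base context used above. It removes each candidate deleted sequence from the context and confirms that every representation yields the same result.

```r
context <- "GGCTAGAACTAGTT"

# Assumed 1-based start positions and deleted sequences for the five
# representations shown above (illustrative names, not ICAMS objects).
reps <- data.frame(
  pos = c(3, 4, 5, 6, 7),
  deleted.seq = c("CTAGAA", "TAGAAC", "AGAACT", "GAACTA", "AACTAG"),
  stringsAsFactors = FALSE
)

after_deletion <- vapply(seq_len(nrow(reps)), function(i) {
  p <- reps$pos[i]
  n <- nchar(reps$deleted.seq[i])
  # The deleted sequence must actually occur at position p.
  stopifnot(substr(context, p, p + n - 1) == reps$deleted.seq[i])
  # Join the sequence before the deletion to the sequence after it.
  paste0(substr(context, 1, p - 1), substr(context, p + n, nchar(context)))
}, character(1))

unique(after_deletion)  # "GGCTAGTT"
```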

A deletion in a repeat can also be represented in several different ways. A deletion in a repeat is abstractly equivalent to microhomology that spans the entire deleted sequence. For example:

GACTAGCTAGTT
GACTA----GTT GACTAGTT GACTA[GCTA]GTT
                        *** -*** -

is really a repeat

GACTAG----TT GACTAGTT GACTAG[CTAG]TT
                        **** ----

GACT----AGTT GACTAGTT GACT[AGCT]AGTT
                        ** --** --

This function only flags this case with a -1 return; it does not figure out the repeat extent.

This function finds:

  1. The maximum match of undeleted sequence to the left of the deletion that is identical to the right end of the deleted sequence, and

  2. The maximum match of undeleted sequence to the right of the deletion that is identical to the left end of the deleted sequence.

The microhomology sequence is the concatenation of items (1) and (2).
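
Items (1) and (2) can be sketched in a few lines of base R. This is an illustration of the logic described here, not the ICAMS implementation: mh_length_sketch is a hypothetical name, pos is assumed to be 1-based, and the rule that a combined match at least as long as the deleted sequence indicates a repeat (returning -1) is an assumption drawn from the repeat examples above.

```r
# Hypothetical helper, not part of ICAMS.
mh_length_sketch <- function(context, deleted.seq, pos) {
  n <- nchar(deleted.seq)
  # The deleted sequence must sit at `pos` (assumed 1-based) in `context`.
  stopifnot(substr(context, pos, pos + n - 1) == deleted.seq)

  del   <- strsplit(deleted.seq, "")[[1]]
  left  <- strsplit(substr(context, 1, pos - 1), "")[[1]]
  right <- strsplit(substr(context, pos + n, nchar(context)), "")[[1]]

  # (1) Compare the end of the left flank with the right end of the deletion.
  left.len <- 0
  while (left.len < min(n, length(left)) &&
         left[length(left) - left.len] == del[n - left.len]) {
    left.len <- left.len + 1
  }

  # (2) Compare the start of the right flank with the left end of the deletion.
  right.len <- 0
  while (right.len < min(n, length(right)) &&
         right[right.len + 1] == del[right.len + 1]) {
    right.len <- right.len + 1
  }

  # Assumption: microhomology spanning the whole deletion is a repeat.
  if (left.len + right.len >= n) return(-1)
  left.len + right.len
}

mh_length_sketch("GGCTAGAACTAGTT", "CTAGAA", 3)  # 4
mh_length_sketch("GGCTAGAACTAGTT", "TAGAAC", 4)  # 4 (1 on the left + 3 on the right)
mh_length_sketch("GACTAGCTAGTT", "GCTA", 6)      # -1 (deletion in a repeat)
```

Summing the two matches is what makes the result independent of how the variant caller happened to place the deletion: shifting the deletion by one base moves one matching base from one side to the other.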

Examples

# NOT RUN {
# GGAGAGG[CTAGAA]CTAGTTAAAAA
#         ----   ----
FindDelMH("GGAGAGGCTAGAACTAGTTAAAAA", "CTAGAA", 8, trace = 0)  # 4
# }
