sequenza (version 2.1.2)

example.seqz: Example “seqz” data

Description

The “seqz” file is produced by sequenza-utils.py and typically has the file extension .seqz. The data here is representative of an exome-sequenced tumor sample, such as could be obtained from TCGA.

Usage

data(example.seqz)

Format

A data frame with a header row and 14 columns:

chromosome

chromosome name

position

base position

base.ref

base in the reference genome used (usually hg19). Note that base.ref is NOT necessarily the base in the normal specimen.

The remaining 11 columns contain the following information:
depth.normal

read depth observed in the normal sample

depth.tumor

read depth observed in the tumor sample

depth.ratio

ratio of depth.tumor to depth.normal

Af

A-allele frequency observed in the tumor sample

Bf

B-allele frequency observed in the tumor sample at heterozygous positions

zygosity.normal

zygosity of the normal (reference) sample. "hom" corresponds to AA or BB, whereas "het" corresponds to AB or BA

GC.percent

GC-content (percent), calculated from the reference genome in fixed nucleotide windows

good.reads

number of reads in the tumor specimen that passed the quality threshold (the threshold is specified in the pre-processing software)

AB.normal

base(s) found in the germline sample; at heterozygous positions, the A and B bases are ordered according to the values of Af and Bf, respectively

AB.tumor

base(s) found in the tumor sample that are not present in the normal specimen. The field includes all the variants found in the tumor alignment, separated by a colon. Each variant contains the base and its observed frequency

tumor.strand

frequency of the variant nucleotides detected in the forward orientation. The field has the same structure as AB.tumor and gives, for each variant, the fraction of the reads carrying that variant that are oriented in the forward direction
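As a rough illustration of how a few of these columns relate, the sketch below builds a two-row toy data frame, recomputes depth.ratio, and splits an AB.tumor-style string. All values are made up; only the column names come from the format above. The helper parse_variants is hypothetical and not part of sequenza. It assumes each variant is written as one base letter followed directly by its frequency, and that "." marks a position with no variant.

```r
# Toy rows with made-up values; only the column names follow the seqz format.
toy <- data.frame(
  chromosome   = c("chr1", "chr1"),
  position     = c(1000L, 2000L),
  depth.normal = c(40L, 50L),
  depth.tumor  = c(60L, 25L),
  AB.tumor     = c("A0.25:T0.1", "."),
  stringsAsFactors = FALSE
)

# depth.ratio is the tumor depth divided by the normal depth.
toy$depth.ratio <- toy$depth.tumor / toy$depth.normal

# Hypothetical helper: split an AB.tumor-style field into bases and frequencies.
# Assumed layout: "<base><frequency>" entries separated by ":", "." for none.
parse_variants <- function(field) {
  if (is.na(field) || field == ".") {
    return(data.frame(base = character(0), freq = numeric(0),
                      stringsAsFactors = FALSE))
  }
  parts <- strsplit(field, ":", fixed = TRUE)[[1]]
  data.frame(
    base = substr(parts, 1, 1),              # first character is the base
    freq = as.numeric(substring(parts, 2)),  # the rest is the frequency
    stringsAsFactors = FALSE
  )
}

variants <- parse_variants(toy$AB.tumor[1])
```

If tumor.strand follows the same layout, as the description suggests, the same helper could be applied to it, but that should be checked against real data first.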

Details

example.seqz can be loaded in the standard R way via data(example.seqz), or it can be read from a text file using read.seqz. The former is useful for examples and testing, whereas the latter is representative of the standard workflow.
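The text-file route can be sketched without the package installed. The example below writes a two-row table with the 14 column names to a temporary file and reads it back with base R's read.delim, which stands in here for read.seqz. It assumes the file is tab-delimited, which the description above does not state; all values and the file name are made up for illustration.

```r
# Made-up rows carrying the 14 seqz column names described under Format.
toy <- data.frame(
  chromosome      = c("chr1", "chr1"),
  position        = c(1000L, 2000L),
  base.ref        = c("A", "C"),
  depth.normal    = c(40L, 50L),
  depth.tumor     = c(60L, 25L),
  depth.ratio     = c(1.5, 0.5),
  Af              = c(0.6, 1),
  Bf              = c(0.4, 0),
  zygosity.normal = c("het", "hom"),
  GC.percent      = c(50, 45),
  good.reads      = c(58L, 25L),
  AB.normal       = c("AG", "C"),
  AB.tumor        = c(".", "."),
  tumor.strand    = c("0", "0"),
  stringsAsFactors = FALSE
)

# Write a header row plus tab-separated values (assumed layout of a seqz file).
seqz_path <- tempfile(fileext = ".seqz")
write.table(toy, seqz_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Base-R stand-in for read.seqz: read the table back with its header.
seqz <- read.delim(seqz_path, stringsAsFactors = FALSE)

# Keep only positions that are heterozygous in the normal sample.
het <- seqz[seqz$zygosity.normal == "het", ]
```

With sequenza installed, read.seqz would replace read.delim in this sketch, and data(example.seqz) would provide the packaged example directly.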