VariantAnnotation (version 1.18.5)

PolyPhenDbColumns: PolyPhenDb Columns

Description

Description of the PolyPhen Sqlite Database Columns

Arguments

Column descriptions

These column names are displayed when columns() is called on a PolyPhenDb object.
  • rsid : rsid
Original query :
  • OSNPID : original SNP identifier from user input
  • OSNPACC : original protein identifier from user input
  • OPOS : original substitution position in the protein sequence from user input
  • OAA1 : original wild type (reference) aa residue from user input
  • OAA2 : original mutant (alternate) aa residue from user input
Mapped query :
  • SNPID : SNP identifier mapped to dbSNP rsID if available, otherwise same as o_snp_id; this value was used as the rsid column
  • ACC : protein UniProtKB accession if known protein, otherwise same as o_acc
  • POS : substitution position mapped to UniProtKB protein sequence if known, otherwise same as o_pos
  • AA1 : wild type aa residue
  • AA2 : mutant aa residue
  • NT1 : wild type allele nucleotide
  • NT2 : mutant allele nucleotide
PolyPhen-2 prediction :
  • PREDICTION : qualitative ternary classification based on FPR thresholds ("benign", "possibly damaging" or "probably damaging")
PolyPhen-1 prediction :
  • BASEDON : prediction basis
  • EFFECT : predicted substitution effect on the protein structure or function
PolyPhen-2 classifiers :
  • PPH2CLASS : binary classifier outcome ("damaging" or "neutral")
  • PPH2PROB : probability of the variation being damaging
  • PPH2FPR : false positive rate at the pph2_prob level
  • PPH2TPR : true positive rate at the pph2_prob level
  • PPH2FDR : false discovery rate at the pph2_prob level
UniProtKB-SwissProt derived protein sequence annotations :
  • SITE : substitution SITE annotation
  • REGION : substitution REGION annotation
  • PHAT : PHAT matrix element for substitution in the TRANSMEM region
Multiple sequence alignment scores :
  • DSCORE : difference of PSIC scores for two aa variants (Score1 - Score2)
  • SCORE1 : PSIC score for wild type aa residue (aa1)
  • SCORE2 : PSIC score for mutant aa residue (aa2)
  • NOBS : number of residues observed at the substitution position in the multiple alignment (sans gaps)
Protein 3D structure features :
  • NSTRUCT : initial number of BLAST hits to similar proteins with 3D structures in PDB
  • NFILT : number of 3D BLAST hits after identity threshold filtering
  • PDBID : protein structure identifier from PDB
  • PDBPOS : position of substitution in PDB protein sequence
  • PDBCH : PDB polypeptide chain identifier
  • IDENT : sequence identity between query and aligned PDB sequences
  • LENGTH : PDB sequence alignment length
  • NORMACC : normalized accessible surface
  • SECSTR : DSSP secondary structure assignment
  • MAPREG : region of the phi-psi (Ramachandran) map derived from the residue dihedral angles
  • DVOL : change in residue side chain volume
  • DPROP : change in solvent accessible surface propensity resulting from the substitution
  • BFACT : normalized B-factor (temperature factor) for the residue
  • HBONDS : number of sidechain-sidechain and sidechain-mainchain hydrogen bonds formed by the residue
  • AVENHET : average number of contacts with heteroatoms per residue
  • MINDHET : closest contact with heteroatom
  • AVENINT : average number of contacts with other chains per residue
  • MINDINT : closest contact with other chain
  • AVENSIT : average number of contacts with critical sites per residue
  • MINDSIT : closest contact with a critical site
Nucleotide sequence features (CpG/codon/exon junction) :
  • TRANSV : whether substitution is a transversion
  • CODPOS : position of the substitution within the codon
  • CPG : whether or not the substitution changes CpG context
  • MINDJNC : substitution distance from exon/intron junction
Pfam protein family :
  • PFAMHIT : Pfam identifier of the query protein
Substitution scores :
  • IDPMAX : maximum congruency of the mutant aa residue to all sequences in multiple alignment
  • IDPSNP : maximum congruency of the mutant aa residue to the sequence in alignment with the mutant residue
  • IDQMIN : query sequence identity with the closest homologue deviating from the wild type aa residue
Comments :
  • COMMENTS : Optional user comments
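
Examples

The columns listed above can be inspected and subset with columns() and select(). The following is a minimal sketch, not a definitive recipe: it assumes the Bioconductor annotation package PolyPhen.Hsapiens.dbSNP131 is installed, the rsIDs are illustrative keys only, and keep_predictions() is a hypothetical helper written for this example, not part of VariantAnnotation.

```r
# Sketch: inspect and query PolyPhenDb columns.
# Assumption: VariantAnnotation and PolyPhen.Hsapiens.dbSNP131 are installed.

# Hypothetical helper (not part of VariantAnnotation): keep only the rows
# whose PREDICTION value is one of the given labels. Rows with a missing
# PREDICTION are dropped.
keep_predictions <- function(df,
                             labels = c("possibly damaging",
                                        "probably damaging")) {
  df[!is.na(df$PREDICTION) & df$PREDICTION %in% labels, , drop = FALSE]
}

if (requireNamespace("VariantAnnotation", quietly = TRUE) &&
    requireNamespace("PolyPhen.Hsapiens.dbSNP131", quietly = TRUE)) {
  library(VariantAnnotation)
  library(PolyPhen.Hsapiens.dbSNP131)
  db <- PolyPhen.Hsapiens.dbSNP131

  # Column names described on this page
  print(columns(db))

  # Retrieve a few columns for a few example rsIDs
  wanted <- c("AA1", "AA2", "PREDICTION", "PPH2PROB")
  rsids <- c("rs2142947", "rs4995127", "rs3026284")
  res <- select(db, keys = rsids, columns = wanted)

  # Keep only substitutions predicted to be damaging
  print(keep_predictions(res))
}
```

Requesting only the columns of interest keeps the returned data.frame small; omitting the columns argument is expected to return every column described above.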

See Also

?PolyPhenDb