training: Training dataset used to construct the profile scoring matrix in the SCORER 2.0 algorithm.

Description

A data frame containing three columns, which must be named "type", "sequence" and "register". The order of the columns in the data frame does not matter.

column "type": contains the known oligomeric state of the coiled-coil sequences in the training data. Acceptable oligomeric states are "DIMER" and "TRIMER" only.
column "sequence": contains the amino-acid sequences of the coiled coils in the training data. Valid characters are all uppercase letters except ‘B’, ‘J’, ‘O’, ‘U’, ‘X’, and ‘Z’; any other character causes the program to fail.
column "register": contains the register assignments specific to each coiled-coil sequence in the training data. Each register string must therefore have the same length as the matching amino-acid sequence in the "sequence" column. Valid characters are the lowercase letters ‘a’ to ‘g’ only. Register assignments are not required to be in proper order and may start with any of the seven letters.
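
The column rules above can be checked before a data frame is used as training data. The following is a minimal sketch in base R. `validate_training()` is a hypothetical helper written for illustration and is not part of the package, and the toy rows are invented rather than taken from the real training set.

```r
# Hypothetical helper (not part of the package): checks a data frame against
# the column rules described above and stops on the first violation found.
validate_training <- function(df) {
  required <- c("type", "sequence", "register")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing column(s): ", paste(missing_cols, collapse = ", "))
  }

  type     <- as.character(df$type)
  sequence <- as.character(df$sequence)
  register <- as.character(df$register)

  # Only the two documented oligomeric states are accepted.
  if (!all(type %in% c("DIMER", "TRIMER"))) {
    stop("column \"type\" must contain only \"DIMER\" or \"TRIMER\"")
  }

  # Uppercase letters except B, J, O, U, X and Z.
  if (!all(grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", sequence))) {
    stop("column \"sequence\" contains invalid characters")
  }

  # Lowercase register positions a to g only.
  if (!all(grepl("^[a-g]+$", register))) {
    stop("column \"register\" contains invalid characters")
  }

  # Each register string must be as long as its sequence.
  if (any(nchar(register) != nchar(sequence))) {
    stop("\"register\" and \"sequence\" lengths differ in at least one row")
  }

  invisible(TRUE)
}

# Toy rows invented for illustration; the column order is deliberately shuffled.
toy <- data.frame(
  register = c("abcdefgabcdefg", "defgabcdefg"),
  type     = c("DIMER", "TRIMER"),
  sequence = c("LKELEDKLEELLSK", "IEALKQKIAAL"),
  stringsAsFactors = FALSE
)
validate_training(toy)
```

The second toy register starts at ‘d’, which is allowed because a register may begin with any of the seven letters.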

Usage

data(training)

Format

A data frame with three columns, "type", "sequence" and "register", as described above.

Source

DOI: 10.1093/bioinformatics/btr299.

References

Craig T. Armstrong, Thomas L. Vincent, Peter J. Green and Dek N. Woolfson. (2011) SCORER 2.0: an algorithm for distinguishing parallel dimeric and trimeric coiled-coil sequences. Bioinformatics. DOI: 10.1093/bioinformatics/btr299

Examples

	data(training)
	print(training)
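
Printing the whole data frame can be unwieldy. As a rough sketch, assuming the data frame has the "type" and "sequence" columns described above, a per-type summary can be computed with base R. `summarise_training()` is a hypothetical helper, not a package function, and the toy rows are invented for illustration.

```r
# Hypothetical helper (not part of the package): counts the sequences of each
# oligomeric state and reports their mean length.
summarise_training <- function(df) {
  type   <- as.character(df$type)
  lens   <- nchar(as.character(df$sequence))
  counts <- table(type)
  data.frame(
    type        = names(counts),
    n_sequences = as.vector(counts),
    mean_length = as.vector(tapply(lens, type, mean)[names(counts)]),
    stringsAsFactors = FALSE
  )
}

# Toy rows invented for illustration; these are not real training sequences.
toy <- data.frame(
  type     = c("DIMER", "TRIMER", "DIMER"),
  sequence = c("LKELEDKLEELLSK", "IEALKQKIAAL", "LEELKKKLEELK"),
  register = c("abcdefgabcdefg", "defgabcdefg", "gabcdefgabcd"),
  stringsAsFactors = FALSE
)
res <- summarise_training(toy)
print(res)
```

With the package that provides this dataset attached, the same helper could be applied as `summarise_training(training)` after `data(training)`.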
