Data for amino acid compositions, and updates to UniProt IDs.
The columns of human_aa are compatible with the layout used for amino acid compositions in CHNOSZ (see thermo):
protein | character | Identification of protein
organism | character | Identification of organism
ref | character | Reference key for source of compositional data
abbrv | character | Abbreviation or other ID for protein
chains | numeric | Number of polypeptide chains in the protein
Here, the protein column contains the UniProt ID (accession), possibly with a suffix indicating the isoform of the protein (especially for entries from human_additional.csv).
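As a rough illustration of working with IDs that carry an isoform suffix, the sketch below strips the suffix to recover the base accession. It assumes the suffix follows the usual UniProt isoform convention of a hyphen and a number (e.g. `-2`); the IDs shown are made-up placeholders, and `ids` and `base_ids` are hypothetical names, not objects from the package.

```r
# Made-up IDs; the "-<number>" isoform suffix format is an assumption
ids <- c("P00001", "P00001-2", "Q00002-13")

# Remove a trailing hyphen followed by digits, if present
base_ids <- sub("-[0-9]+$", "", ids)

# Flag which entries carried an isoform suffix
is_isoform <- grepl("-[0-9]+$", ids)

base_ids
is_isoform
```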
These amino acid compositions were compiled from amino acid sequences downloaded from UniProt. Amino acid sequences of human proteins were obtained from files in the UniProt reference proteome, dated 2016-04-03, downloaded from ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/reference_proteomes/Eukaryota/.
The amino acid compositions of human proteins are stored in three files:
human_base.Rdata contains amino acid compositions of proteins in the UniProt reference proteome (UP000005640_9606.fasta.gz, containing canonical, manually reviewed sequences).
human_additional.Rdata contains amino acid compositions of additional proteins in the UniProt reference proteome (UP000005640_9606_additional.fasta.gz, containing isoforms and unreviewed sequences).
human_extra.csv contains amino acid compositions of other (“extra”) proteins identified in proteomic experiments but not listed in one of the files above.
On loading the package, the individual data files are read and combined using rbind, and the result is assigned to the human_aa object in the canprot environment.
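The combine-and-assign step can be sketched roughly as follows. This is a minimal illustration using toy data frames, not the package's actual loading code; the names `base`, `additional`, `extra`, and `demo_env`, as well as the IDs, are hypothetical, and the amino acid count columns are omitted for brevity.

```r
# Toy stand-ins for the contents of the three data files (made-up values)
base <- data.frame(protein = "P00001", organism = "HUMAN", ref = NA,
                   abbrv = NA, chains = 1, stringsAsFactors = FALSE)
additional <- data.frame(protein = "P00001-2", organism = "HUMAN", ref = NA,
                         abbrv = NA, chains = 1, stringsAsFactors = FALSE)
extra <- data.frame(protein = "X00001", organism = "HUMAN", ref = NA,
                    abbrv = NA, chains = 1, stringsAsFactors = FALSE)

# Combine the pieces row-wise; rbind needs matching column names
combined <- rbind(base, additional, extra)

# Store the result in a dedicated environment (hypothetical name)
demo_env <- new.env()
assign("human_aa", combined, envir = demo_env)

# Retrieve the object from the environment and count its rows
nrow(get("human_aa", demo_env))
```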
As an aid for processing some datasets that use old (obsolete) UniProt IDs, the corresponding new (current) IDs are stored in uniprot_updates.
uniprot_updates also lists the source (i.e. reference key) that uses each old ID.
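One way such a lookup table might be used to replace old IDs is sketched below. The table here is a toy example: the column names `old`, `new`, and `source` and all IDs are assumptions made for illustration and may not match the actual structure of uniprot_updates.

```r
# Hypothetical lookup table; column names and values are assumptions
updates <- data.frame(old = c("A00001", "A00002"),
                      new = c("B00001", "B00002"),
                      source = c("ref1", "ref2"),
                      stringsAsFactors = FALSE)

# IDs as they might appear in a dataset (made up)
ids <- c("A00001", "C00003", "A00002")

# Locate each ID in the table of old IDs (NA if it has no update)
idx <- match(ids, updates$old)
has_update <- !is.na(idx)

# Substitute only the IDs that have a current replacement
ids[has_update] <- updates$new[idx[has_update]]
ids
```

IDs without an entry in the table are left unchanged, so the same step can be applied safely to datasets that already use current IDs.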
# NOT RUN {
nrow(get("human_aa", canprot))
# }