get.protein(protein, organism, abundance = NULL, pname = NULL,
average = TRUE, digits = 1)
yeastgfp(location, exclusive = TRUE)
get.protein returns the amino acid composition(s) of the specified protein(s), or a single overall composition if abundance is not NULL. yeastgfp returns a list with elements yORF and abundance, unless location is NULL, in which case the function returns (invisibly) the names of all locations.

When protein contains one or more Ordered Locus Names (OLN) or Open Reading Frame names (ORF), get.protein retrieves the amino acid composition of the respective proteins in Escherichia coli or Saccharomyces cerevisiae (for organism equal to ECO or SGD, respectively). The calculation depends on the presence of the objects thermo$ECO and thermo$SGD, which contain the amino acid compositions of proteins in these organisms. If protein is instead the name of one of the stress response experiments contained in thermo$stress, e.g. low.C or heat.up, the function returns the amino acid compositions of the corresponding proteins. If the abundances of the proteins are given in abundance, the individual protein compositions are multiplied by these values and then summed into an overall composition; the average is taken if average is TRUE; then the amino acid frequencies are rounded to the number of decimal places specified in digits. Unless names for the new proteins are given in pname, they are generated using the values in protein.
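The abundance weighting described above can be illustrated with a minimal base-R sketch. It does not call get.protein and is not the package's implementation: the helper combine_compositions and the toy three-residue table are hypothetical, and the sketch assumes that "average" means dividing the weighted sum by the total abundance.

```r
# Toy amino acid counts for two hypothetical proteins (rows) and
# three residue types (columns); real compositions have 20 columns.
aa <- rbind(
  PROT_A = c(Ala = 10, Gly = 4, Ser = 6),
  PROT_B = c(Ala = 2, Gly = 8, Ser = 4)
)

# Hypothetical helper: weight each row by its abundance, sum the rows,
# optionally divide by the total abundance, then round.
# ASSUMPTION: "average" is taken to mean division by sum(abundance).
combine_compositions <- function(aa, abundance, average = TRUE, digits = 1) {
  # recycle a single abundance value across all proteins
  abundance <- rep(abundance, length.out = nrow(aa))
  # row i of aa is multiplied by abundance[i] (column-major recycling)
  overall <- colSums(aa * abundance)
  if (average) overall <- overall / sum(abundance)
  round(overall, digits)
}

# equal weights, averaged: Ala 6, Gly 6, Ser 5
avg <- combine_compositions(aa, abundance = 1)
# 1 of the first protein plus 1/2 of the second, not averaged: Ala 11, Gly 8, Ser 8
wsum <- combine_compositions(aa, abundance = c(1, 0.5), average = FALSE)
print(avg)
print(wsum)
```

With average = FALSE the result is a plain weighted sum, which corresponds to the "1 of one and 1/2 of the other" case.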
The yeastgfp function returns the identities and abundances of proteins with the requested subcellular localization (specified in location) using data from the YeastGFP project that are stored in extdata/abundance/yeastgfp.csv.xz. If exclusive is FALSE, the function grabs all proteins that are localized to a compartment even if they are also localized to other compartments. If exclusive is TRUE (the default shown in the usage above), only those proteins that are localized exclusively to the requested compartment are identified; if there are no such proteins, the non-exclusive localizations are used instead (this applies to the bud localization). The values returned by yeastgfp can be passed to get.protein to obtain the amino acid compositions of the proteins.
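The difference between exclusive and non-exclusive selection, including the fallback when no protein is exclusive to a compartment, can be sketched in base R. The table loc, the ORF labels and the helper select_orfs are hypothetical and only illustrate the selection logic; the real data file may be laid out differently and yeastgfp may implement this in another way.

```r
# Hypothetical localization table: one row per protein-location pair.
# The ORF labels are placeholders, not real identifiers.
loc <- data.frame(
  yORF = c("ORF1", "ORF1", "ORF2", "ORF3", "ORF3"),
  location = c("cytoplasm", "nucleus", "cytoplasm", "bud", "cytoplasm"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the behavior described in the text.
select_orfs <- function(loc, location, exclusive = TRUE) {
  # all proteins seen in the requested location
  hits <- unique(loc$yORF[loc$location == location])
  if (!exclusive) return(hits)
  # number of distinct locations recorded for each protein
  n_loc <- tapply(loc$location, loc$yORF, function(x) length(unique(x)))
  only <- hits[as.vector(n_loc[hits]) == 1]
  # fall back to the non-exclusive set when nothing is exclusive
  if (length(only) == 0) hits else only
}

cyt_excl <- select_orfs(loc, "cytoplasm")                    # "ORF2"
cyt_all  <- select_orfs(loc, "cytoplasm", exclusive = FALSE)  # all three
bud_excl <- select_orfs(loc, "bud")                          # falls back to "ORF3"
```

In this toy table no protein is found only in the bud, so the exclusive request returns the non-exclusive hits, analogous to the bud case mentioned above.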
Dick, J. M. (2009) Calculation of the relative metastabilities of proteins in subcellular compartments of Saccharomyces cerevisiae. BMC Syst. Biol. 3:75.
Richmond, C. S., Glasner, J. D., Mau, R., Jin, H. F. and Blattner, F. R. (1999) Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res. 27, 3821–3835.
Tai, S. L., Boer, V. M., Daran-Lapujade, P., Walsh, M. C., de Winde, J. H., Daran, J.-M. and Pronk, J. T. (2005) Two-dimensional transcriptome analysis in chemostat cultures: Combinatorial effects of oxygen availability and macronutrient limitation in Saccharomyces cerevisiae. J. Biol. Chem. 280, 437–447.
The output of get.protein can be used as input to add.protein to add the proteins to the thermo$protein data frame in preparation for further calculations (see examples below).

data(thermo)
## basic examples of get.protein
# amino acid composition of two proteins
get.protein(c("YML020W","YBR051W"),"SGD")
# average composition of proteins
get.protein(c("YML020W","YBR051W"),"SGD",
abundance=1,pname="PROT1_NEW")
# 1 of one and 1/2 of the other
get.protein(c("YML020W","YBR051W"),"SGD",
abundance=c(1,0.5),average=FALSE,pname="PROT2_NEW")
# compositions of proteins induced in carbon limitation
get.protein("low.C","SGD")
## overall composition of proteins exclusively localized
## to cytoplasm of S. cerevisiae with reported expression levels
y <- yeastgfp("cytoplasm")
p <- get.protein(y$yORF,"SGD",y$abundance,"cytoplasm")
# add the protein and calculate its properties
i <- add.protein(p)
protein(i)
## speciation diagram for ER.to.Golgi proteins (COPII coat
## proteins) as a function of logfO2, after Dick, 2009
y <- yeastgfp("ER.to.Golgi")
# take out proteins with NA experimental abundance
ina <- which(is.na(y$abundance))
y$yORF <- y$yORF[-ina]
y$abundance <- y$abundance[-ina]
# get the amino acid compositions of the proteins
p <- get.protein(y$yORF,"SGD")
ip <- add.protein(p)
# use logarithms of activities of proteins such
# that total activity of residues is unity
pl <- protein.length(-ip)
logact <- unitize(rep(1,length(ip)),pl)
# load the proteins
basis("CHNOS+")
a <- affinity(O2=c(-80,-73),iprotein=ip,loga.protein=logact)
# make a speciation diagram
diagram(a,ylim=c(-4.9,-2.9))
# where we are closest to experimental log activity
logfO2 <- rep(-78,length(ip))
abline(v=logfO2[1],lty=3)
# scale experimental abundances such that
# total activity of residues is unity
logact.expt <- unitize(log10(y$abundance),pl)
# plot experimental log activity
points(logfO2,logact.expt,pch=16)
text(logfO2+0.5,logact.expt,y$yORF)
# add title
title(main=paste("ER.to.Golgi; points - relative abundances",
"from YeastGFP. Figure after Dick, 2009",sep="\n"))
## Chemical activities of model subcellular proteins
# speciation diagram as a function of logfO2, after Dick, 2009
basis("CHNOS+")
names <- yeastgfp()
# calculate amino acid compositions using "get.protein" function
for(i in 1:length(names)) {
  y <- yeastgfp(names[i])
  p <- get.protein(y$yORF,"SGD",y$abundance,names[i])
  add.protein(p)
}
species(names,"SGD")
# set unit activity of residues
pl <- protein.length(thermo$species$name)
species(NULL,unitize(thermo$species$logact,pl))
res <- 200
a <- affinity(O2=c(-82,-65,res))
mycolor <- topo.colors(6)[1:4]
mycolor <- rep(mycolor,times=rep(6,4))
logact <- diagram(a,balance="PBB",names=names,ylim=c(-5,-3),legend.x=NULL,
col=mycolor,lwd=2)$logact
# so far good, but how about labels on the plot?
for(i in 1:length(logact)) {
  myloga <- as.numeric(logact[[i]])
  # don't take values that lie above the plot (vacuole in this example)
  myloga[myloga > -3.1] <- -999
  imax <- which.max(myloga)
  adj <- 0.5
  if(imax > 180) adj <- 1
  if(imax < 20) adj <- 0
  text(seq(-82,-65,length.out=res)[imax],logact[[i]][imax],
    labels=names[i],adj=adj)
}
title(main=paste("Subcellular proteins of S. cerevisiae, after Dick, 2009",
describe(thermo$basis[-5,]),sep="\n"),col.main=par("fg"),cex.main=0.9)
## Oxygen fugacity - activity of H2O predominance
## diagrams for proteologs for 23 YeastGFP localizations
# arranged by decreasing metastability:
# order of this list of locations is based on the
# (dis)appearance of species on the current set of diagrams
names <- c("vacuole","early.Golgi","ER","lipid.particle",
"cell.periphery","ambiguous","Golgi","mitochondrion",
"bud","actin","cytoplasm","late.Golgi",
"endosome","nucleus","vacuolar.membrane","punctate.composite",
"peroxisome","ER.to.Golgi","nucleolus","spindle.pole",
"nuclear.periphery","bud.neck","microtubule")
nloc <- c(4,5,3,4,4,3)
inames <- 1:length(names)
# define the system
basis("CHNOS+")
# calculate amino acid compositions using "get.protein" function
for(i in 1:length(names)) {
  y <- yeastgfp(names[i])
  p <- get.protein(y$yORF,"SGD",y$abundance,names[i])
  add.protein(p)
}
species(names,"SGD")
a <- affinity(H2O=c(-5,0,256),O2=c(-80,-66,256))
# setup the plot
layout(matrix(c(1,1,2:7),byrow=TRUE,nrow=4),heights=c(0.7,3,3,3))
par(mar=c(0,0,0,0))
plot.new()
text(0.5,0.5,paste("Subcellular proteins of S. cerevisiae,",
"after Dick, 2009\n",describe(thermo$basis[-c(2,5),])),cex=1.5)
opar <- par(mar=c(3,4,1,1),xpd=TRUE)
for(i in 1:length(nloc)) {
  cex.axis <- 0.75
  # uncomment the following and dev.off() below to generate png files
  #png(paste(i,"png",sep="."),width=300,height=250); cex.axis <- 1
  diagram(a,balance="PBB",names=names[inames],
    ispecies=inames,cex.axis=cex.axis)
  label.plot(letters[i])
  title(main=paste(length(inames),"locations"))
  #dev.off()
  # take out the stable species
  inames <- inames[-(1:nloc[i])]
}
# make an animated gif from png files (with ImageMagick convert tool)
#system(paste("convert -delay 100 1.png 1.png 1.png 2.png",
# "3.png 4.png 5.png 6.png 6.png 6.png yeast.gif"))
# return to plot defaults
layout(matrix(1))
par(opar)
## Compare calculated and experimental relative abundances
## of proteins in a subcellular location, after Dick, 2009
# get the amino acid composition of the proteins
loc <- "vacuolar.membrane"
y <- yeastgfp(loc)
ina <- which(is.na(y$abundance))
p <- get.protein(y$yORF[-ina],"SGD")
add.protein(p)
# set up the system
basis("CHNOS+")
# this is the logfO2 value that gives the best fit (see paper)
basis("O2",-74)
is <- species(p$protein,p$organism)
np <- length(is)
pl <- protein.length(species()$name)
# we use unitize so total activity of residues is unity
loga <- rep(0,np)
species(1:np,unitize(loga,pl))
a <- affinity()
d <- diagram(a,do.plot=FALSE)
calc.loga <- as.numeric(d$logact)
expt.loga <- unitize(log10(y$abundance[-ina]),pl)
# which ones are outliers
rmsd <- sqrt(sum((expt.loga-calc.loga)^2)/np)
residuals <- abs(expt.loga - calc.loga)
iout <- which(residuals > rmsd)
pch <- rep(16,length(is))
pch[iout] <- 1
# the colors reflect average oxidation number of carbon
# corrects misassigned colors in Figs. 5 and 6 of Dick 2009
ZC <- ZC(thermo$obigt$formula[species()$ispecies])
col <- rgb(0.15-ZC,0,0.35+ZC,max=0.5)
# there is a color-plotting error on line 567 of the plot.R file
# of Dick, 2009 that can be reproduced with
#col <- rep(col,length.out=9)
xlim <- ylim <- extendrange(c(calc.loga,expt.loga))
thermo.plot.new(xlim=xlim,ylim=ylim,xlab=expression(list("log"*italic(a),
"calc")),ylab=expression(list("log"*italic(a),"expt")))
points(calc.loga,expt.loga,pch=pch,col=col)
lines(xlim,ylim+rmsd,lty=2)
lines(xlim,ylim-rmsd,lty=2)
title(main=paste("Calculated and experimental relative abundances of\n",
"proteins in ",loc,", after Dick, 2009",sep=""),cex.main=0.95)
### examples for stress response experiments
## predominance fields for overall protein compositions induced by
## carbon, sulfur and nitrogen limitation
## (experimental data from Boer et al., 2003)
expt <- c("low.C","low.N","low.S")
for(i in 1:length(expt)) {
  p <- get.protein(expt[i],"SGD",abundance=1)
  add.protein(p)
}
basis("CHNOS+")
basis("O2",-75.29)
species(expt,"SGD")
a <- affinity(CO2=c(-5,0),H2S=c(-10,0))
diagram(a,balance="PBB",names=expt,color=NULL)
title(main=paste("Proteins induced by",
"carbon, sulfur and nitrogen limitation",sep="\n"))
## predominance fields for overall protein compositions
## induced and repressed in an/aerobic carbon limitation
## (experiments of Tai et al., 2005)
# the activities of glucose, ammonium and sulfate
# are similar to the non-growth-limiting concentrations
# used by Boer et al., 2003
basis(c("glucose","H2O","NH4+","hydrogen","SO4-2","H+"),
c(-1,0,-1.3,999,-1.4,-7))
# the names of the experiments in thermo$stress
expt <- c("Clim.aerobic.down","Clim.aerobic.up",
"Clim.anaerobic.down","Clim.anaerobic.up")
# here we use abundance to indicate that the protein
# compositions should be summed together in equal amounts
for(i in 1:length(expt)) {
  p <- get.protein(expt[i],"SGD",abundance=1)
  add.protein(p)
}
species(expt,"SGD")
a <- affinity(C6H12O6=c(-35,-20),H2=c(-20,0))
diagram(a,color=NULL,as.residue=TRUE)
title(main=paste("Average protein residue composition in",
"an/aerobic carbon limitation in yeast",sep="\n"))
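Several of the examples scale protein activities "such that total activity of residues is unity". The arithmetic behind that comment can be sketched in base R as follows. The function unitize_sketch is a hypothetical stand-in written from that description, not the package's unitize, and the protein lengths are made-up values.

```r
# Hypothetical stand-in for the scaling step: shift all log activities by
# the same amount so that sum(activity * length) equals 1.
unitize_sketch <- function(logact, length) {
  # total activity of residues before scaling
  total <- sum(10^logact * length)
  # subtracting log10(total) divides every activity by the same factor
  logact - log10(total)
}

logact <- c(0, 0, 0)        # equal starting activities
len <- c(100, 200, 700)     # made-up protein lengths (residues)
scaled <- unitize_sketch(logact, len)
total_after <- sum(10^scaled * len)   # total residue activity after scaling
print(scaled)
print(total_after)
```

Because every value is shifted by the same constant, the relative abundances of the proteins are unchanged; only the total is normalized.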