Learn R Programming

apComplex (version 2.38.0)

mergeComplexes: Iteratively combine columns in initial PCMG estimate

Description

Repeatedly applies the function LCdelta to make combinations of columns in the affiliation matrix representing the protein complex membership graph (PCMG) for AP-MS data.

Usage

mergeComplexes(bhmax,adjMat,VBs=NULL,VPs=NULL,simMat=NULL,sensitivity=.75,specificity=.995,Beta=0,commonFrac=2/3,wsVal = 2e7)

Arguments

bhmax
Initial complex estimates coming from bhmaxSubgraph
adjMat
Adjacency matrix of bait-hit data from an AP-MS experiment. Rows correspond to baits and columns to hits.
VBs
VBs is an optional vector of viable baits.
VPs
VPs is an optional vector of viable prey.
simMat
An optional square matrix with entries between 0 and 1. Rows and columns correspond to the proteins in the experiment, and should be reported in the same order as the columns of adjMat. Higher values in this matrix are interpreted to mean higher similarity for protein pairs.
sensitivity
Believed sensitivity of AP-MS technology.
specificity
Believed specificity of AP-MS technology.
Beta
Optional additional parameter for the weight to give data in simMat in the logistic regression model.
commonFrac
This is the fraction of baits that need to be overlapping for a complex combination to be considered.
wsVal
A numeric. This is the value assigned to the work-space in the call to fisher.test.

Value

A list of character vectors containing the names of the proteins in the estimated complexes.

Details

The local modeling algorithm for AP-MS data described by Scholtens and Gentleman (2004) and Scholtens, Vidal, and Gentleman (2005) uses a two-component measure of protein complex estimate quality, namely P=LxC. Columns in cMat represent individual complex estimates. The algorithm works by starting with a maximal BH-complete subgraph estimate of cMat, and then improves the estimate by combining complexes such that P=LxC increases.

By default commonFrac is set relatively high at 2/3. This means that some potentially reasonable complex combinations could be missed. For smaller data sets, users may consider decreasing the fraction. For larger data sets, this may cause a large increase in computation time.

References

Scholtens D and Gentleman R. Making sense of high-throughput protein-protein interaction data. Statistical Applications in Genetics and Molecular Biology 3, Article 39 (2004).

Scholtens D, Vidal M, and Gentleman R. Local modeling of global interactome networks. Bioinformatics 21, 3548-3557 (2005).

See Also

bhmaxSubgraph,findComplexes

Examples

Run this code

data(apEX)
PCMG0 <- bhmaxSubgraph(apEX)
PCMG1 <- mergeComplexes(PCMG0,apEX,sensitivity=.7,specificity=.75)

Run the code above in your browser using DataLab