## S4 method for signature 'matrix,missing'
apclusterL(s, x, sel, p=NA, q=NA, maxits=1000, convits=100, lam=0.9,
           includeSim=FALSE, nonoise=FALSE, seed=NA)
## S4 method for signature 'character,ANY'
apclusterL(s, x, frac, sweeps, p=NA, q=NA, maxits=1000, convits=100,
           lam=0.9, includeSim=TRUE, nonoise=FALSE, seed=NA, ...)
## S4 method for signature 'function,ANY'
apclusterL(s, x, frac, sweeps, p=NA, q=NA, maxits=1000, convits=100,
           lam=0.9, includeSim=TRUE, nonoise=FALSE, seed=NA, ...)
x: input data to be clustered; if x is a matrix or data frame, rows are interpreted as samples and columns are interpreted as features; apart from matrices or data frames, x may be any other structured data type that [...]
p: input preference; if p=NA, exemplar preferences are initialized according to the distribution of non-Inf values in s, as controlled by q.
q: if p=NA, exemplar preferences are initialized according to the distribution of non-Inf values in s. If q=NA, exemplar preferences are set to the median of non-Inf values in s. If q [...]
convits: the algorithm terminates if the exemplars have not changed for convits iterations.
includeSim: if TRUE, the similarity matrix (either computed internally or passed via the s argument) is stored to the slot sim of the returned APResult object. The default is FALSE for the matrix method and TRUE for the character and function methods.
nonoise: apcluster adds a small amount of noise to s to prevent degenerate cases; if TRUE, this is disabled.
seed: seed of the random number generator, which can be fixed for reproducibility; if NA, the seed remains unchanged.
...: further arguments passed to the similarity function; name conflicts between arguments of apcluster and arguments of the similarity function may occur; therefore, we recommend to write [...]

apclusterL returns an APResult object.

Leveraged Affinity Propagation reduces dynamic and static load for large datasets. Only a subset of the samples is considered in the clustering process, on the assumption that this subset already provides enough information about the cluster structure.
When called with input data and either the name of a package-provided similarity function or a user-provided similarity function, the function selects a random sample subset according to the frac parameter, calculates a rectangular similarity matrix of all samples against this subset, and runs affinity propagation on it. This is repeated sweeps times, with a new sample subset used for each repetition. The clustering result of the sweep with the highest net similarity is returned. Any parameters specific to the chosen method of similarity calculation can be passed to apclusterL in addition to the parameters described above. The similarity matrix for the best sweep is also returned in the result object when requested by the user (argument includeSim).
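The subset selection and the rectangular similarity matrix described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the helper rectNegSqDist is hypothetical, the subset size is assumed to be ceiling(frac * nrow(x)), and negative squared Euclidean distance is used as the similarity (the measure that negDistMat(r=2) stands for in the examples).

```r
## Sketch of one leveraged selection step (base R only).
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

## two Gaussian clouds, as in the examples of this page
x <- rbind(cbind(rnorm(150, 0.2, 0.05), rnorm(150, 0.8, 0.06)),
           cbind(rnorm(100, 0.7, 0.08), rnorm(100, 0.3, 0.05)))
frac <- 0.2

## indices of the randomly selected samples, in increasing order
## (assumption: subset size is ceiling(frac * number of samples))
sel <- sort(sample(nrow(x), ceiling(frac * nrow(x))))

## hypothetical helper: negative squared Euclidean distance of every
## sample (rows) to every selected sample (columns)
rectNegSqDist <- function(x, sel) {
    xs <- x[sel, , drop=FALSE]
    d2 <- outer(rowSums(x^2), rowSums(xs^2), "+") - 2 * x %*% t(xs)
    -pmax(d2, 0)  # clamp tiny negative values caused by rounding
}

s <- rectNegSqDist(x, sel)
dim(s)  # number of samples x number of selected samples
```

Each sweep would repeat this with a fresh sel, which is why the memory footprint grows with nrow(x) * length(sel) rather than nrow(x)^2.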
When called with a rectangular similarity matrix (which represents a column subset of the full similarity matrix), the function performs AP clustering on this similarity matrix. The information about the selected samples is passed to clustering with the parameter sel. This variant is only needed when the user needs full control of distance calculation or sample subset selection.
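A minimal sketch of the matrix variant follows, assuming the apcluster package is installed (the call is skipped otherwise). The rectangular matrix is built by hand in base R to show that the caller controls both the distance calculation and the subset; the fixed subset size of 50 and the preference p=-0.2 are illustrative choices, not recommendations.

```r
## Sketch: pass a user-built rectangular similarity matrix plus sel.
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

x <- rbind(cbind(rnorm(150, 0.2, 0.05), rnorm(150, 0.8, 0.06)),
           cbind(rnorm(100, 0.7, 0.08), rnorm(100, 0.3, 0.05)))

## user-chosen subset, sorted in increasing order
sel <- sort(sample(nrow(x), 50))

## negative squared Euclidean distances, all samples vs. the subset
xs <- x[sel, , drop=FALSE]
s <- -pmax(outer(rowSums(x^2), rowSums(xs^2), "+") - 2 * x %*% t(xs), 0)

if (requireNamespace("apcluster", quietly=TRUE)) {
    library(apcluster)
    ## x is omitted, so the 'matrix,missing' method is used
    apres <- apclusterL(s=s, sel=sel, p=-0.2)
    show(apres)
}
```

Because includeSim defaults to FALSE for this method, the similarity matrix is not stored in the result unless includeSim=TRUE is passed.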
Apart from minor adaptations and optimizations, the implementation of the function apclusterL is largely analogous to Frey's and Dueck's Matlab code.

Frey, B. J. and Dueck, D. (2007) Clustering by passing messages between data points. Science 315, 972-976.
Bodenhofer, U., Kothmeier, A., and Hochreiter, S. (2011) APCluster: an R package for affinity propagation clustering. Bioinformatics 27, 2463-2464.
APResult, show-methods, plot-methods, labels-methods, preferenceRange, apcluster-methods, apclusterK

## define fraction of samples to be used for clustering
## and the number of sweeps
frac <- 0.2
sweeps <- 3
## create two Gaussian clouds
cl1 <- cbind(rnorm(150,0.2,0.05),rnorm(150,0.8,0.06))
cl2 <- cbind(rnorm(100,0.7,0.08),rnorm(100,0.3,0.05))
x <- rbind(cl1,cl2)
## leveraged apcluster
apres <- apclusterL(negDistMat(r=2), x, frac, sweeps, p=-0.2)
## show details of leveraged clustering results
show(apres)
## plot leveraged clustering result
plot(apres, x)
## plot heatmap of clustering result
heatmap(apres)
## show net similarities of single sweeps
apres@netsimLev
## show samples on which best sweep was based
apres@sel