Returns a complex object named truncated.lists containing the Idata vector (see prepare.idata), the estimated truncation index \(j_0=k+1\) (see compute.stream) for each pair of input lists, the overall top-k estimate (see j0.multi), and other objects with necessary plotting information for the aggmap.
calculate.maxK(lists, L, d, v, threshold)
A named list of the following content:
Contains information about the overlap of all pairwise compared lists (structure for the aggmap)
Contains information about the list names
Indicates which objects in a list are consolidated (gray-shaded in the aggmap)
Table of top-k list overlaps containing rank information, the rank sum, the order of objects as a function of the rank sum, the frequency of an object in the input lists, and the frequency of an object in the truncated lists (for plotting in the aggmap)
Contains the top-k objects for each of the input lists (for display in the Venn-diagram)
Contains the overlap information (for display in the Venn-table)
Selected pilot sample size (tuning parameter) \(\nu\)
Number of columns to be plotted in the aggmap
Data frame of Idata vectors (see compute.stream) for each pair of input lists and the associated deltas
Selected delta
Selected threshold
Number of lists
Number of items in the data frame (lists)
Data frame of lists that entered the analysis
Maximal top-k estimate over all pairwise comparisons
The final integrated list of objects, obtained by applying the CEMC algorithm to the lists truncated at maxK
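The rank information, rank sum, and input-list frequency described for the summary table can be illustrated with a small base R sketch. The data and the names lists_toy, ranks, rank_sum, freq_input and summary_tab are hypothetical, and the computation is only an assumption about what such a table holds, not the package's own code.

```r
# Toy input: three ranked lists of the same four objects (hypothetical data).
lists_toy <- data.frame(
  L1 = c("A", "B", "C", "D"),
  L2 = c("B", "A", "D", "C"),
  L3 = c("A", "C", "B", "D"),
  stringsAsFactors = FALSE
)

objects <- sort(unique(unlist(lists_toy)))

# Rank of each object in each list (its row position); NA if the object is absent.
ranks <- sapply(lists_toy, function(col) match(objects, col))
rownames(ranks) <- objects

# Rank sum across lists and number of lists in which each object occurs.
rank_sum   <- rowSums(ranks, na.rm = TRUE)
freq_input <- rowSums(!is.na(ranks))

# Assemble the table and order the objects by increasing rank sum.
summary_tab <- data.frame(ranks, rank_sum, freq_input)
summary_tab <- summary_tab[order(summary_tab$rank_sum), ]
print(summary_tab)
```

Objects with a small rank sum sit near the top of most lists, which is why ordering by rank sum gives a natural row order for display.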
Data frame containing two or more columns that represent input lists of ordered objects subject to comparison
Number of input lists that are compared
The maximal distance delta between object ranks required for the estimation of \(j_0\)
The pilot sample size (tuning parameter) \(\nu\) required for the estimation of \(j_0\)
The percentage of occurrences of an object in the top-k selections among all comparisons required for it to be gray-shaded in the aggmap as a consolidated object
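As a rough illustration of the threshold argument, the following base R sketch computes the percentage of comparisons in which each object appears among the top-k selections and flags those that reach the threshold. The data and the names topk_sets, pct and consolidated are hypothetical, and the use of >= as the cut-off rule is an assumption rather than the package's documented behaviour.

```r
# Hypothetical top-k selections from three pairwise comparisons (toy data).
topk_sets <- list(
  c("A", "B", "C"),
  c("A", "C", "D"),
  c("A", "B", "E")
)
threshold <- 50  # percent, playing the role of the 'threshold' argument

# Percentage of comparisons in which each object appears in the top-k selection.
objects <- sort(unique(unlist(topk_sets)))
pct <- sapply(objects, function(o) {
  100 * mean(sapply(topk_sets, function(s) o %in% s))
})

# Assumption: an object counts as consolidated when its percentage reaches the threshold.
consolidated <- names(pct)[pct >= threshold]
print(round(pct, 1))
print(consolidated)
```

Raising the threshold shrinks the set of consolidated objects; at 100, only objects selected in every comparison remain.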
Eva Budinska <budinska@iba.muni.cz>, Michael G. Schimek <michael.schimek@medunigraz.at>
Hall, P. and Schimek, M. G. (2012). Moderate deviation-based inference for random degeneration in paired rank lists. J. Amer. Statist. Assoc., 107, 661-672.
CEMC, prepare.idata
set.seed(1234)
data(breast)
truncated.lists = calculate.maxK(breast, d=6, v=10, L=3, threshold=50)
if (FALSE) {
aggmap(truncated.lists)
}