mancie(mat_main,mat_supp,cutoff1=0.5,cutoff2=0)
mat_supp must have the same dimensions as mat_main. If the supplementary dataset has the same rows as mat_main, it can be directly fed to mancie. An example is RNA-Seq data of the same cell lines from two labs. If the supplementary dataset has different rows from mat_main, it needs to be first summarized using summarize_mat to be compatible with mat_main. An example is RNA-Seq data and DNase-seq data of the same tissue types.
The underlying rationale for using MANCIE is that the variation of genomic features in mat_supp is concordant with, and can be used to remove noise in, the variation of genomic features in mat_main.
(a) If the correlation between row i of mat_main and row i of mat_supp is larger than cutoff1, the new row vector will be the first principal component (PC) of the matrix formed by these two row vectors.
(b) If the correlation is between cutoff2 and cutoff1, the new row vector will be the weighted average of these two rows. The weight for row i of mat_main is 1 and the weight for row i of mat_supp is the correlation between these two row vectors.
(c) If the correlation is smaller than cutoff2, the new row vector is the original row i of mat_main.
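Rules (a) to (c) above can be sketched for a single pair of rows as follows. This is only an illustration in base R, not the package's source: combine_row is a hypothetical helper, and the scaling of the first PC, the sign alignment, the division by the sum of weights, and the handling of correlations exactly equal to a cutoff are all assumptions.

```r
# Hypothetical helper illustrating rules (a)-(c); NOT part of the MANCIE package.
combine_row <- function(main_row, supp_row, cutoff1 = 0.5, cutoff2 = 0) {
  r <- cor(main_row, supp_row)

  # (c) Low (or undefined) correlation: keep the original main row.
  if (is.na(r) || r < cutoff2) {
    return(main_row)
  }

  # (a) High correlation: first PC of the two rows, treated as two
  #     variables observed across the samples (assumed centered and scaled).
  if (r > cutoff1) {
    pc <- prcomp(cbind(main_row, supp_row), center = TRUE, scale. = TRUE)
    pc1 <- pc$x[, 1]
    # The sign of a PC is arbitrary; orient it to agree with the main row.
    if (cor(pc1, main_row) < 0) pc1 <- -pc1
    return(unname(pc1))
  }

  # (b) Intermediate correlation: weighted average with weight 1 for the
  #     main row and weight r for the supplementary row.
  #     Assumes 1 + r is not zero, which holds for the default cutoff2 = 0.
  (main_row + r * supp_row) / (1 + r)
}

main_row <- c(1, 2, 3, 4)
supp_row <- c(3, 1, 2, 4)        # correlation with main_row is 0.4
combine_row(main_row, supp_row)  # falls under rule (b)
```

Applying such a helper to every row index i would give a matrix with the same dimensions as mat_main.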
There should be a reasonable portion of rows that fall into the first and second categories. If not, the user should check whether the data they would like to try MANCIE on really fits the aforementioned rationale. The user may also change the default values of cutoff1 and cutoff2 if they see fit. The mancie function will report the percentage of rows falling into each category.
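To preview how rows would split across the three categories before running mancie, the row-wise correlations can be tabulated directly. The sketch below uses only base R. category_shares is a hypothetical helper, not a MANCIE function, and its treatment of boundary values and of undefined correlations (counted under category c) is an assumption.

```r
# Hypothetical diagnostic; NOT part of the MANCIE package.
category_shares <- function(mat_main, mat_supp, cutoff1 = 0.5, cutoff2 = 0) {
  stopifnot(all(dim(mat_main) == dim(mat_supp)))

  # Correlation between row i of mat_main and row i of mat_supp.
  r <- vapply(seq_len(nrow(mat_main)),
              function(i) cor(mat_main[i, ], mat_supp[i, ]),
              numeric(1))

  # Assign each row to a category; undefined correlations count as "c".
  ok <- !is.na(r)
  category <- ifelse(ok & r > cutoff1, "a",
                     ifelse(ok & r >= cutoff2, "b", "c"))

  # Percentage of rows in each category, rounded to one decimal place.
  counts <- table(factor(category, levels = c("a", "b", "c")))
  round(100 * prop.table(counts), 1)
}

mat_main <- rbind(c(1, 2, 3, 4), c(1, 2, 3, 4), c(1, 2, 3, 4))
mat_supp <- rbind(c(1, 2, 3, 4), c(3, 1, 2, 4), c(4, 3, 2, 1))
category_shares(mat_main, mat_supp)
```

If nearly all rows land in category c, the supplementary data may not fit the rationale described above.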
summarize_mat
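# Load the example data shipped with the MANCIE package
data(mancie_example,package="MANCIE")
# Summarize the DNase-seq matrix so that it is compatible with exp
sum_DNase=summarize_mat(exp,ann_exp,DNase,ann_DNase)
# Run MANCIE with exp as mat_main and the summarized matrix as mat_supp
lev_exp=mancie(exp,sum_DNase)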