
betaclust (version 1.0.4)

beta_kn: Fit the KN. model

Description

Fit the KN. model from the betaclust family of beta mixture models for DNA methylation data. The KN. model analyses a single DNA sample type and identifies the thresholds between the different methylation states.

Usage

beta_kn(data, M = 3, parallel_process = FALSE, seed = NULL)

Value

A list containing:

  • cluster_size - The total number of CpG sites in each of the K clusters.

  • llk - A vector containing the log-likelihood value at each step of the EM algorithm.

  • alpha - The first shape parameter for the beta mixture model.

  • delta - The second shape parameter for the beta mixture model.

  • tau - The estimated mixing proportion for each cluster.

  • z - A matrix of dimension \(C \times K\) containing the posterior probability of each CpG site belonging to each of the \(K\) clusters.

  • classification - The classification corresponding to z, i.e. map(z).

  • uncertainty - The uncertainty of each CpG site's clustering.
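To illustrate how classification, uncertainty and cluster_size relate to z, here is a minimal base R sketch on a made-up posterior matrix. It does not call betaclust. The definition of uncertainty as one minus the largest posterior probability is an assumption borrowed from the usual model-based clustering convention, not something stated on this page.

```r
# Toy posterior matrix z for C = 4 CpG sites and K = 3 clusters (rows sum to 1).
z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.10, 0.30, 0.60,
              0.40, 0.35, 0.25),
            ncol = 3, byrow = TRUE)

# MAP classification: index of the largest posterior probability in each row.
classification <- apply(z, 1, which.max)

# Assumed definition of uncertainty: one minus the largest posterior probability.
uncertainty <- 1 - apply(z, 1, max)

# Number of CpG sites assigned to each of the K clusters.
cluster_size <- as.vector(table(factor(classification, levels = seq_len(ncol(z)))))
```

Under this assumed definition, a site whose posterior probabilities are spread evenly across clusters has high uncertainty, while a site assigned almost entirely to one cluster has uncertainty close to zero.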

Arguments

data

A dataframe of dimension \(C \times N\) containing methylation values for \(C\) CpG sites from \(R = 1\) sample type collected from \(N\) patients. Samples are grouped together in the dataframe such that the columns are ordered as Sample1_Patient1, Sample1_Patient2, etc.
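As a rough illustration of the expected layout, the sketch below builds a small data frame with one row per CpG site and one column per patient. The values are simulated, and the object name toy_data and the row names are hypothetical; only the column naming pattern comes from the description above.

```r
# Hypothetical toy data: C = 5 CpG sites, N = 4 patients, R = 1 sample type.
set.seed(1)
C <- 5
N <- 4

# Simulated beta values in (0, 1); real input would be measured methylation values.
toy_data <- as.data.frame(matrix(rbeta(C * N, shape1 = 2, shape2 = 5),
                                 nrow = C, ncol = N))

# Columns are ordered by patient within the single sample type.
colnames(toy_data) <- paste0("Sample1_Patient", seq_len(N))
rownames(toy_data) <- paste0("CpG", seq_len(C))
```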

M

Number of methylation states to be identified in a DNA sample type.

parallel_process

If TRUE, the models are fitted using parallel processing for increased computational efficiency. The default is FALSE due to package testing limitations.

seed

Seed to allow for reproducibility (default = NULL).

Details

The KN. model clusters each of the \(C\) CpG sites into one of \(K\) methylation states, based on data from \(N\) patients for one DNA sample type (i.e. \(R = 1\)). As each CpG site can belong to any of the \(M = 3\) methylation states (hypomethylated, hemimethylated or hypermethylated), the default is \(K = M = 3\). The KN. model is less parsimonious than the K.. model, as it allows cluster- and patient-specific shape parameters. The returned object can be passed to the threshold function in this package to calculate the thresholds between the methylation states.
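To make the idea of cluster- and patient-specific shape parameters concrete, the following base R sketch computes posterior cluster probabilities for a few CpG sites. It is not the package's implementation. It assumes that, given the cluster, the beta values from different patients are independent, so that the posterior for site c and cluster k is proportional to tau_k multiplied by the product over patients of the Beta(alpha_kn, delta_kn) density. All parameter values and data are made up for illustration.

```r
# Hypothetical parameters for K = 3 clusters and N = 2 patients.
# alpha[k, n] and delta[k, n] are the shape parameters for cluster k and patient n.
alpha <- matrix(c(2, 2.5,
                  5, 4.5,
                  9, 8), nrow = 3, byrow = TRUE)
delta <- matrix(c(9, 8,
                  5, 4.5,
                  2, 2.5), nrow = 3, byrow = TRUE)
tau <- c(0.3, 0.4, 0.3)

# Beta values for C = 3 CpG sites (rows) from N = 2 patients (columns).
x <- matrix(c(0.10, 0.15,
              0.50, 0.55,
              0.90, 0.85), nrow = 3, byrow = TRUE)

# log(tau_k) plus the sum over patients of the log beta density,
# evaluated for every site (rows) and cluster (columns).
log_num <- sapply(seq_along(tau), function(k) {
  log(tau[k]) + rowSums(sapply(seq_len(ncol(x)), function(n) {
    dbeta(x[, n], shape1 = alpha[k, n], shape2 = delta[k, n], log = TRUE)
  }))
})

# Normalise each row on the log scale to obtain posterior probabilities.
z <- exp(log_num - apply(log_num, 1, max))
z <- z / rowSums(z)
```

Working on the log scale and subtracting the row maximum before exponentiating avoids numerical underflow when many patients contribute to the product.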

See Also

beta_k

betaclust

threshold

Examples

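library(betaclust)

# Fit the KN. model to rows 1-30 and columns 2-5 of pca.methylation.data
my.seed <- 190
M <- 3
data_output <- beta_kn(pca.methylation.data[1:30,2:5], M,
                       parallel_process = FALSE, seed = my.seed)

# Pass the fitted object to threshold() to obtain the thresholds between states
thresholds <- threshold(data_output, pca.methylation.data[1:30,2:5], "KN.")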
