- data_mat
Matrix of normalized gene expression (default) or PCA
embeddings (see do_pca).
Cells can be rows or columns.
- meta_data
Either (1) a data frame with variables to integrate or (2) a
vector of labels.
- vars_use
If meta_data is a data frame, this defines which variable(s)
to remove (character vector).
- do_pca
Whether to perform PCA on input matrix.
- npcs
If doing PCA on input matrix, number of PCs to compute.
- theta
Diversity clustering penalty parameter. Specify for each
variable in vars_use. Default theta=2. theta=0 does not encourage any
diversity. Larger values of theta result in more diverse clusters.
- lambda
Ridge regression penalty parameter. Specify for each variable
in vars_use.
Default lambda=1. Lambda must be strictly positive. Smaller values result
in more aggressive correction.
- sigma
Width of soft k-means clusters. Default sigma=0.1. Sigma scales
the distance from a cell to cluster centroids. Larger values of sigma
result in cells being assigned to more clusters. Smaller values of sigma
make soft k-means clustering approach hard clustering.
- nclust
Number of clusters in the model. nclust=1 is equivalent to simple
linear regression.
- tau
Protection against overclustering small datasets with large ones.
tau is the expected number of cells per cluster.
- block.size
Proportion of cells to update during clustering.
Between 0 and 1, default 0.05. Larger values may be faster but less accurate.
- max.iter.harmony
Maximum number of rounds to run Harmony. One round
of Harmony involves one clustering and one correction step.
- max.iter.cluster
Maximum number of rounds to run clustering at each
round of Harmony.
- epsilon.cluster
Convergence tolerance for clustering round of
Harmony. Set to -Inf to never stop early.
- epsilon.harmony
Convergence tolerance for Harmony. Set to -Inf to
never stop early.
- plot_convergence
Whether to print the convergence plot of the
clustering objective function. TRUE to plot, FALSE to suppress. This can be
useful for debugging.
- return_object
(Advanced Usage) Whether to return the Harmony object
or only the corrected PCA embeddings.
- verbose
Whether to print progress messages. TRUE to print,
FALSE to suppress.
- reference_values
(Advanced Usage) Defines reference dataset(s).
Cells whose batch variable values match reference_values will not
be moved.
- cluster_prior
(Advanced Usage) Provides user-defined clusters for
cluster initialization. If the number of provided clusters C is less than K
(the number of clusters in the model), Harmony will initialize the remaining
K-C clusters with k-means. C cannot exceed K.
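
To make the effect of sigma more concrete, the following is a minimal Python sketch of a soft k-means assignment step. It is an illustration only, not Harmony's actual implementation: the function name `soft_assign`, the toy data, and the specific form of the weights (exponential of negative cosine distance divided by sigma) are assumptions made for this example.

```python
import numpy as np

def soft_assign(cells, centroids, sigma):
    """Return an (n_cells, n_clusters) matrix of soft cluster memberships.

    Hypothetical helper for illustration; the weighting scheme is an assumption.
    """
    # L2-normalise rows so the distance below is 2 * (1 - cosine similarity).
    cells = cells / np.linalg.norm(cells, axis=1, keepdims=True)
    centroids = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    dist = 2.0 * (1.0 - cells @ centroids.T)

    # sigma scales the distances: a small sigma exaggerates differences
    # (near-hard assignment), a large sigma flattens them (softer assignment).
    logits = -dist / sigma
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Toy data: three cells in two dimensions, two cluster centroids.
cells = np.array([[1.0, 0.2], [0.6, 0.6], [0.1, 1.0]])
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])

for sigma in (0.01, 0.1, 1.0):
    print(f"sigma={sigma}")
    print(np.round(soft_assign(cells, centroids, sigma), 3))
```

In this sketch, each row of the result sums to 1. As sigma shrinks, the first and third cells move toward membership in a single cluster, while the middle cell, which is equally distant from both centroids, stays split evenly at any sigma.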