singleCellHaystack (version 1.0.2)

A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data

Description

One key exploratory step in single-cell genomics data analysis is the prediction of features with different activity levels. For example, we want to predict differentially expressed genes (DEGs) in single-cell RNA-seq data, spatial DEGs in spatial transcriptomics data, or differentially accessible regions (DARs) in single-cell ATAC-seq data. 'singleCellHaystack' predicts differentially active features in single-cell omics datasets without relying on the clustering of cells into arbitrary clusters. 'singleCellHaystack' uses Kullback-Leibler divergence to find features (e.g., genes, genomic regions, etc.) that are active in subsets of cells that are non-randomly positioned inside an input space (such as 1D trajectories, 2D tissue sections, multi-dimensional embeddings, etc.). For the theoretical background of 'singleCellHaystack' we refer to our original paper, Vandenbon and Diez (Nature Communications, 2020), and our update, Vandenbon and Diez (Scientific Reports, 2023).
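
The idea of scoring a feature by Kullback-Leibler divergence can be illustrated with a small base R sketch. This is a simplified illustration and not the package's implementation: it uses simulated cells, a coarse 4 x 4 grid of bins instead of kernel densities, and a hypothetical helper named kl_divergence(). The reference distribution is the distribution of all cells over the bins; a feature that is active in a non-randomly positioned subset of cells should diverge from it more than a feature that is active in randomly chosen cells.

```r
set.seed(42)
n_cells <- 500

# Hypothetical 2D embedding of cells (simulated, uniform in the unit square)
coords <- cbind(x = runif(n_cells), y = runif(n_cells))

# One feature active only in a corner of the space, and one feature active
# in the same number of cells but at random positions
localized <- coords[, "x"] < 0.3 & coords[, "y"] < 0.3
random <- sample(localized)

# Assign every cell to one bin of a coarse 4 x 4 grid
breaks <- seq(0, 1, by = 0.25)
bin <- interaction(cut(coords[, "x"], breaks, include.lowest = TRUE),
                   cut(coords[, "y"], breaks, include.lowest = TRUE))

# KL divergence between the distribution of active cells (p) and the
# reference distribution of all cells (q); a pseudocount avoids log(0)
kl_divergence <- function(active, bin, pseudo = 1e-6) {
  q <- as.numeric(table(bin)) + pseudo
  p <- as.numeric(table(bin[active])) + pseudo
  q <- q / sum(q)
  p <- p / sum(p)
  sum(p * log(p / q))
}

d_localized <- kl_divergence(localized, bin)
d_random <- kl_divergence(random, bin)
cat("localized feature:", d_localized, "\n")
cat("random feature:   ", d_random, "\n")
```

In this toy setting the localized feature receives the larger divergence. The randomly placed feature still gets a small positive value from sampling noise, which is why the observed divergence has to be compared against randomizations before it is called significant.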

Install

install.packages('singleCellHaystack')
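
Once the package is installed, a first run on the bundled toy data might look like the sketch below. It assumes that the dat.tsne and dat.expression datasets listed on this page are available after loading the package, and that haystack() takes the coordinates first and the expression matrix second; the argument names passed to show_result_haystack() are likewise assumptions to check against the installed version. The block is skipped when the package is not available.

```r
# Flag that records whether the example actually ran
ran_haystack <- FALSE

if (requireNamespace("singleCellHaystack", quietly = TRUE)) {
  library(singleCellHaystack)

  # The method relies on randomizations, so fix the seed for reproducibility
  set.seed(1)

  # dat.tsne: 2D coordinates of the cells; dat.expression: expression of
  # genes (rows) in cells (columns). Both are toy datasets from the package.
  res <- haystack(dat.tsne, dat.expression)

  # Show the top-ranked features (argument names assumed, see lead-in)
  print(show_result_haystack(res.haystack = res, n = 5))

  ran_haystack <- TRUE
}
```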

Monthly Downloads

258

Version

1.0.2

License

MIT + file LICENSE

Maintainer

Alexis Vandenbon

Last Published

January 11th, 2024

Functions in singleCellHaystack (1.0.2)

kmeans_haystack

Function for k-means clustering of genes according to their expression distribution in 2D or multi-dimensional space
kmeans_haystack_highD

Function for k-means clustering of genes according to their distribution in a higher-dimensional space.
get_log_p_D_KL

Estimates the significance of the observed Kullback-Leibler divergence by comparing to randomizations.
haystack_highD

The main Haystack function, for higher-dimensional spaces.
hclust_haystack_raw

Function for hierarchical clustering of genes according to their distribution on a 2D plot.
kde2d_faster

Based on the MASS kde2d() function, but heavily simplified; it's just tcrossprod() now.
plot_rand_fit

plot_rand_fit
plot_rand_KLD

plot_rand_KLD
write_haystack

Function to write haystack result data to file.
singleCellHaystack-package

singleCellHaystack: A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data
hclust_haystack

Function for hierarchical clustering of genes according to their expression distribution in 2D or multi-dimensional space
hclust_haystack_highD

Function for hierarchical clustering of genes according to their distribution in a higher-dimensional space.
plot_gene_set_haystack

Visualizing the detection/expression of a set of genes in a 2D plot
plot_gene_set_haystack_raw

Visualizing the detection/expression of a set of genes in a 2D plot
plot_gene_haystack

Visualizing the detection/expression of a gene in a 2D plot
get_reference

Get reference distribution
plot_gene_haystack_raw

Visualizing the detection/expression of a gene in a 2D plot
get_parameters_haystack

Function that decides most of the parameters that will be used during the "Haystack" analysis.
show_result_haystack

show_result_haystack
get_log_p_D_KL_continuous

Estimates the significance of the observed Kullback-Leibler divergence by comparing to randomizations for the continuous version of haystack.
plot_compare_ranks

plot_compare_ranks
kmeans_haystack_raw

Function for k-means clustering of genes according to their distribution on a 2D plot.
read_haystack

Function to read haystack results from file.
get_D_KL_continuous_highD

Calculates the Kullback-Leibler divergence between distributions for the high-dimensional continuous version of haystack.
extract_row_dgRMatrix

Returns a row of a sparse matrix of class dgRMatrix. Function made by Ben Bolker and Ott Toomet (see https://stackoverflow.com/questions/47997184/)
get_D_KL_highD

Calculates the Kullback-Leibler divergence between distributions for the high-dimensional version of haystack().
get_dist_two_sets

Calculate the pairwise Euclidean distances between the rows of 2 matrices.
dat.expression

Single-cell RNA-seq dataset.
get_D_KL

Calculates the Kullback-Leibler divergence between distributions.
get_density

Function to get the density of points with value TRUE in the (x,y) plot
default_bandwidth.nrd

Default function given by function bandwidth.nrd in MASS. No changes were made to this function.
extract_row_lgRMatrix

Returns a row of a sparse matrix of class lgRMatrix. Function made by Ben Bolker and Ott Toomet (see https://stackoverflow.com/questions/47997184/)
haystack

The main Haystack function
get_grid_points

A function to decide grid points in a higher-dimensional space
get_euclidean_distance

Calculate the Euclidean distance between x and y.
dat.tsne

Single-cell tSNE coordinates.
haystack_continuous_highD

The main Haystack function, for higher-dimensional spaces and continuous expression levels.
haystack_2D

The main Haystack function, for 2-dimensional spaces.
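
Several helpers in the list above compute distances between cells and grid points. As an illustration of what "pairwise Euclidean distances between the rows of 2 matrices" means, here is a minimal base R sketch. The function name pairwise_euclidean() is hypothetical and this is not the package's own code; it only shows one common way to get all distances at once with tcrossprod().

```r
# Pairwise Euclidean distances between the rows of a and the rows of b.
# Uses the identity ||a_i - b_j||^2 = ||a_i||^2 + ||b_j||^2 - 2 * a_i . b_j
pairwise_euclidean <- function(a, b) {
  stopifnot(is.matrix(a), is.matrix(b), ncol(a) == ncol(b))
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  # Rounding errors can produce tiny negative values; clamp before sqrt
  sqrt(pmax(sq, 0))
}

# Two points (rows of a) against three points (rows of b)
a <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
b <- matrix(c(0, 0,
              6, 8,
              3, 0), ncol = 2, byrow = TRUE)

d <- pairwise_euclidean(a, b)
print(d)  # one row per row of a, one column per row of b
```

The result has one row per row of the first matrix and one column per row of the second, so d[i, j] is the distance between a[i, ] and b[j, ].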