E-spatial

Beta

New application is live now

E-spatial

Single-cell spatial explorer

Notebooks

Premium

Harmony: fast, sensitive, and accurate integration of single cell data
lock icon

BioTuring

Single-cell RNA-seq datasets in diverse biological and clinical conditions provide great opportunities for the full transcriptional characterization of cell types. However, the integration of these datasets is challeging as they remain biological and techinical differences. **Harmony** is an algorithm allowing fast, sensitive and accurate single-cell data integration.
Only CPU
harmonpy
Hierarchicell: estimating power for tests of differential expression with single-cell data
lock icon

BioTuring

Power analyses are considered important factors in designing high-quality experiments. However, such analyses remain a challenge in single-cell RNA-seq studies due to the presence of hierarchical structure within the data (Zimmerman et al., 2021). As cells sampled from the same individual share genetic and environmental backgrounds, these cells are more correlated than cells sampled from different individuals. Currently, most power analyses and hypothesis tests (e.g., differential expression) in scRNA-seq data treat cells as if they were independent, thus ignoring the intra-sample correlation, which could lead to incorrect inferences. Hierarchicell (Zimmerman, K.D. and Langefeld, C.D., 2021) is an R package proposed to estimate power for testing hypotheses of differential expression in scRNA-seq data while considering the hierarchical correlation structure that exists in the data. The method offers four important categories of functions: data loading and cleaning, empirical estimation of distributions, simulating expression data, and computing type 1 error or power. In this notebook, we will illustrate an example workflow of Hierarchicell. The notebook is inspired by Hierarchicell's vignette and modified to demonstrate how the tool works on BioTuring's platform.
Geneformer: a deep learning model for exploring gene networks
lock icon

BioTuring

Geneformer is a foundation transformer model pretrained on a large-scale corpus of ~30 million single cell transcriptomes to enable context-aware predictions in settings with limited data in network biology. Here, we will demonstrate a basic workflow to work with ***Geneformer*** models. These notebooks include the instruction to: 1. Prepare input datasets 2. Finetune Geneformer model to perform specific task 3. Using finetuning models for cell classification and gene classification application
Doublet Detection: Detect doublets (technical errors) in single-cell RNA-seq count matrices
lock icon

BioTuring

Doublets are a characteristic error source in droplet-based single-cell sequencing data where two cells are encapsulated in the same oil emulsion and are tagged with the same cell barcode. Across type doublets manifest as fictitious phenotypes that can be incorrectly interpreted as novel cell types. DoubletDetection present a novel, fast, unsupervised classifier to detect across-type doublets in single-cell RNA-sequencing data that operates on a count matrix and imposes no experimental constraints. This classifier leverages the creation of in silico synthetic doublets to determine which cells in the input count matrix have gene expression that is best explained by the combination of distinct cell types in the matrix. In this notebook, we will illustrate an example workflow for detecting doublets in single-cell RNA-seq count matrices.

Trends

Celltypist: A tool for semi-automatic cell type classification

BioTuring

CellTypist is an automated cell type annotation tool for scRNA-seq datasets on the basis of logistic regression classifiers optimised by the stochastic gradient descent algorithm. CellTypist allows for cell prediction using either built-in (with a current focus on immune sub-populations)or custom models, in order to assist in the accurate classification of different cell types and subtypes. CellTypist can identify 101 cell types or states from more than one million cells, including previously underappreciated cell states. For the CellTypist pre-trained models, immune cells from 20 tissues of 19 studies were collected and harmonized into consistent labels. These cells were split into equal-sized mini-batches, and these batches were sequentially trained by the l2-regularized logistic regression using stochastic gradient descent learning. Feature selection was performed to choose the top 300 genes from each cell type, and the union of these genes was supplied as the input for a second round of training.