Notebooks

Premium

expiMap: Biologically informed deep learning to query gene programs in single-cell atlases

BioTuring

The development of large-scale single-cell atlases has made it possible to describe cell states in greater detail. Meanwhile, current deep learning methods enable rapid analysis of newly generated query datasets by mapping them onto reference atlases. expiMap (‘explainable programmable mapper’; Lotfollahi, Mohammad, et al.) is one of the methods proposed for single-cell reference mapping. It also incorporates prior knowledge from gene set databases or from users to analyze query data in the context of known gene programs (GPs).
Required GPU
expiMap
SpaCET: Cell type deconvolution and interaction analysis

BioTuring

Spatial transcriptomics (ST) technology can capture topographical gene expression profiles of tumor tissues, but single-cell resolution is potentially lost. Identifying cell identities in ST datasets from tumors or other samples remains challenging for existing cell-type deconvolution methods. Spatial Cellular Estimator for Tumors (SpaCET) is an R package for analyzing cancer ST datasets to estimate cell lineages and intercellular interactions in the tumor microenvironment. In general, SpaCET infers the malignant cell fraction through a gene pattern dictionary, then calibrates local cell densities and determines immune and stromal cell lineage fractions using a constrained regression model. Finally, the method can reveal putative cell-cell interactions in the tumor microenvironment. In this notebook, we will illustrate an example workflow for cell type deconvolution and interaction analysis on breast cancer ST data from 10X Visium. The notebook is inspired by SpaCET's vignettes and modified to demonstrate how the tool works on BioTuring's platform.
Hierarchicell: estimating power for tests of differential expression with single-cell data

BioTuring

Power analyses are considered important factors in designing high-quality experiments. However, such analyses remain a challenge in single-cell RNA-seq studies due to the presence of hierarchical structure within the data (Zimmerman et al., 2021). As cells sampled from the same individual share genetic and environmental backgrounds, these cells are more correlated than cells sampled from different individuals. Currently, most power analyses and hypothesis tests (e.g., differential expression) in scRNA-seq data treat cells as if they were independent, thus ignoring the intra-sample correlation, which could lead to incorrect inferences. Hierarchicell (Zimmerman, K.D. and Langefeld, C.D., 2021) is an R package proposed to estimate power for testing hypotheses of differential expression in scRNA-seq data while considering the hierarchical correlation structure that exists in the data. The method offers four important categories of functions: data loading and cleaning, empirical estimation of distributions, simulating expression data, and computing type 1 error or power. In this notebook, we will illustrate an example workflow of Hierarchicell. The notebook is inspired by Hierarchicell's vignette and modified to demonstrate how the tool works on BioTuring's platform.
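To see why ignoring intra-sample correlation matters for power, the classical design effect from cluster sampling gives a rough idea of how many independent observations a set of correlated cells is worth. This is a minimal sketch under the assumption of equal numbers of cells per individual and a single intra-class correlation; it is not Hierarchicell's own procedure, and the function name `effective_sample_size` is hypothetical.

```python
def effective_sample_size(n_individuals: int, cells_per_individual: int, icc: float) -> float:
    """Approximate number of independent observations for equal-sized clusters.

    Uses the classical design effect 1 + (m - 1) * icc, where m is the number
    of cells per individual and icc is the intra-class correlation.
    This is an illustrative approximation, not Hierarchicell's method.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    total_cells = n_individuals * cells_per_individual
    design_effect = 1.0 + (cells_per_individual - 1) * icc
    return total_cells / design_effect


if __name__ == "__main__":
    # 10 individuals with 500 cells each, for increasing correlation
    for icc in (0.0, 0.05, 0.5, 1.0):
        print(icc, round(effective_sample_size(10, 500, icc), 1))
```

With no correlation, all 5,000 cells count as independent; with perfect correlation, the data are worth only as many observations as there are individuals, which is why treating cells as independent can overstate power.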
Inference and analysis of cell-cell communication using CellChat

BioTuring

Understanding global communications among cells requires accurate representation of cell-cell signaling links and effective systems-level analyses of those links. We construct a database of interactions among ligands, receptors and their cofactors that accurately represent known heteromeric molecular complexes. We then develop **CellChat**, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets. Applying **CellChat** to mouse and human skin datasets shows its ability to extract complex signaling patterns.
Required GPU
CellChat

Trends

MUON: multimodal omics analysis framework

BioTuring

Advances in multi-omics have led to an explosion of multimodal datasets to address questions from basic biology to translation. While these data provide novel opportunities for discovery, they also pose management and analysis challenges, thus motivating the development of tailored computational solutions. `muon` is a Python framework for multimodal omics. It introduces a multimodal data container, the `MuData` object. The package also provides state-of-the-art methods for multi-omics data integration. `muon` supports the analysis of both unimodal and multimodal omics data.
Required GPU
muon
FunPat: Function-based Pattern analysis on RNA-seq time series data

BioTuring

Dynamic expression data, nowadays obtained using high-throughput RNA sequencing (RNA-seq), are essential to monitor transient gene expression changes and to study the dynamics of their transcriptional activity in the cell or response to stimuli. FunPat is an R package designed to provide:
- a useful tool to analyze time series genomic data;
- a computational pipeline which integrates gene selection, clustering and functional annotations into a single framework to identify the main temporal patterns associated to functional groups of differentially expressed (DE) genes;
- an easy way to exploit different types of annotations from currently available databases (e.g. Gene Ontology) to extract the most meaningful information characterizing the main expression dynamics;
- a user-friendly organization and visualization of the outcome, automatically linking the DE genes and their temporal patterns to the functional information for an easy biological interpretation of the results.
Only CPU
FunPat
BioTuring Data Converter: Seurat <=> Scanpy for single-cell transcriptomics and spatial transcriptomics data

BioTuring

This notebook illustrates how to convert a Seurat object into a Scanpy annotated data object, and a Scanpy annotated data object into a Seurat object, using the BioStudio data transformation library (currently under development). This makes it easier to continue an analysis with libraries that interact with Scanpy in Python and Seurat in R. The seurat.to.adata function can retain information about reductions (such as PCA, t-SNE, UMAP and Seurat Clusters) and spatial information.
BPCells: Scaling Single Cell Analysis to Millions of Cells

BioTuring

BPCells is a package for high performance single cell analysis on RNA-seq and ATAC-seq datasets. It can analyze a 1.3M cell dataset with 2GB of RAM in under 10 minutes. This makes analysis of million-cell datasets practical on a laptop. BPCells provides:
* Efficient storage of single cell datasets via bitpacking compression
* Fast, disk-backed RNA-seq and ATAC-seq data processing powered by C++
* Downstream analysis such as marker genes, and clustering
* Interoperability with AnnData, 10x datasets, R sparse matrices, and GRanges
Only CPU
BPCells
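The bitpacking compression mentioned in the BPCells description can be illustrated with a small sketch: when all values fit in a few bits, they can be stored side by side instead of using a full machine word each. The code below is a generic fixed-width bitpacking example and makes no claim about BPCells' actual on-disk format; the function names `pack` and `unpack` are hypothetical.

```python
from typing import List, Sequence


def pack(values: Sequence[int], bit_width: int) -> bytes:
    """Pack non-negative integers into bytes, using bit_width bits per value."""
    if bit_width <= 0:
        raise ValueError("bit_width must be positive")
    buffer = 0
    for i, value in enumerate(values):
        if value < 0 or value >= (1 << bit_width):
            raise ValueError(f"value {value} does not fit in {bit_width} bits")
        # Place each value at its own bit offset inside one large integer
        buffer |= value << (i * bit_width)
    n_bytes = (len(values) * bit_width + 7) // 8
    return buffer.to_bytes(n_bytes, "little")


def unpack(data: bytes, bit_width: int, count: int) -> List[int]:
    """Recover count integers of bit_width bits each from packed bytes."""
    buffer = int.from_bytes(data, "little")
    mask = (1 << bit_width) - 1
    return [(buffer >> (i * bit_width)) & mask for i in range(count)]


if __name__ == "__main__":
    counts = [0, 1, 3, 2, 0, 0, 1, 7]  # small counts, as is typical of sparse data
    packed = pack(counts, bit_width=3)
    print(len(packed), unpack(packed, 3, len(counts)))
```

Here eight small counts occupy three bytes instead of eight 32-bit integers, which conveys the general idea behind storing large count matrices compactly.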
Bulk RNA-seq analysis with limma and edgeR

BioTuring

The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene-level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis with the results obtained informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. The complete analysis offered by these packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights.
Analyzing RNA-seq data with DESeq2

BioTuring

A basic task in the analysis of count data from RNA-seq is the detection of differentially expressed genes. The count data are presented as a table which reports, for each sample, the number of sequence fragments that have been assigned to each gene. Analogous data also arise for other assay types, including comparative ChIP-Seq, HiC, shRNA screening, and mass spectrometry. An important analysis question is the quantification and statistical inference of systematic changes between conditions, as compared to within-condition variability. The package DESeq2 provides methods to test for differential expression by use of negative binomial generalized linear models; the estimates of dispersion and logarithmic fold changes incorporate data-driven prior distributions. This notebook explains the use of the package and demonstrates typical workflows.
Only CPU
DESeq2
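The negative binomial model mentioned in the DESeq2 description accounts for count variance that exceeds the Poisson variance. A minimal sketch of the mean-variance relationship, variance = mean + dispersion × mean², is shown below; the function name `nb_variance` is hypothetical and this is not code from the DESeq2 package.

```python
def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial count with the given mean and dispersion.

    With dispersion 0 the variance equals the mean (the Poisson case);
    larger dispersion adds a term that grows with the square of the mean.
    """
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2


if __name__ == "__main__":
    for dispersion in (0.0, 0.01, 0.25):
        print(dispersion, nb_variance(100.0, dispersion))
```

For highly expressed genes the quadratic term dominates, which is why estimating dispersion well is central to testing for differential expression.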
Cellpose: A generalist algorithm for cell and nucleus segmentation

BioTuring

Cell segmentation is the process of identifying and isolating individual cells in an image, typically a microscopic image. This is a crucial step in many biological studies, as it allows researchers to analyze individual cells and their properties. In this notebook, we will introduce Cellpose, a deep learning algorithm for segmenting cells from microscopy images. Cellpose was trained on a diverse dataset of over 70,000 manually segmented cells from various imaging modalities. It is designed as a "generalist" model that can segment new image types without retraining or parameter tuning. Cellpose is a powerful tool for segmenting biological images of cells. It can be used to identify and isolate individual cells in images, even when they are crowded together or have complex shapes. This makes it a valuable tool for researchers studying cell biology, neuroscience, and other fields.
Required GPU
Cellpose
CopyKAT: Delineating copy number and clonal substructure in human tumors from single-cell transcriptomes

BioTuring

Classification of tumor and normal cells in the tumor microenvironment from scRNA-seq data is an ongoing challenge in human cancer research. Copy number karyotyping of aneuploid tumors (***copyKAT***) (Gao, Ruli, et al., 2021) is a method proposed for identifying copy number variations in single-cell transcriptomics data. It is used to predict aneuploid tumor cells and delineate the clonal substructure of different subpopulations that coexist within the tumor mass. In this notebook, we will illustrate a basic workflow of CopyKAT based on the tutorial provided in CopyKAT's repository. We will use a dataset of triple-negative cancer tumors sequenced by 10X Chromium 3'-scRNAseq (GSM4476486) as an example. The dataset contains 20,990 features across 1,097 cells. We have modified the notebook to demonstrate how the tool works on BioTuring's platform.
Identifying tumor cells at the single-cell level using machine learning - inferCNV

BioTuring

Tumors are complex tissues of cancerous cells surrounded by a heterogeneous cellular microenvironment with which they interact. Single-cell sequencing enables molecular characterization of single cells within the tumor. However, cell annotation (the assignment of cell type or cell state to each sequenced cell) is a challenge, especially identifying tumor cells within single-cell or spatial sequencing experiments. Here, we propose ikarus, a machine learning pipeline aimed at distinguishing tumor cells from normal cells at the single-cell level. We test ikarus on multiple single-cell datasets, showing that it achieves high sensitivity and specificity in multiple experimental contexts. **InferCNV** is a Bayesian method that aggregates the expression signal of genomically adjacent genes to ascertain whether there is a gain or loss of a larger genomic segment. We have used **inferCNV** to call copy number variations in all samples used in the manuscript.
Only CPU
inferCNV
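The idea of aggregating expression over genomically adjacent genes, as described for inferCNV above, can be illustrated with a centered moving average over genes sorted by genomic position. This is only a simplified sketch of the smoothing idea; it does not reproduce inferCNV's actual procedure, and the function name `smooth_along_genome` and the window size are assumptions made for illustration.

```python
from typing import List, Sequence


def smooth_along_genome(expression: Sequence[float], window: int) -> List[float]:
    """Centered moving average over genes ordered by genomic position.

    Windows are truncated at the ends of the sequence, so edge genes are
    averaged over fewer neighbours.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(expression)):
        lo = max(0, i - half)
        hi = min(len(expression), i + half + 1)
        segment = expression[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


if __name__ == "__main__":
    # Relative expression of genes in genomic order; a run of elevated
    # values hints at a gained segment, while a lone spike is damped.
    relative_expression = [0.0, 0.1, 1.2, 1.0, 1.1, 0.0, 2.0, 0.0, 0.1]
    print(smooth_along_genome(relative_expression, window=3))
```

Averaging over neighbours suppresses single-gene noise while preserving shifts that span many adjacent genes, which is the signal a copy number gain or loss would be expected to leave in expression data.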
scGPT: Towards Building a Foundational Model for Single-Cell Multi-omics Using Generative AI

BioTuring

Generative pre-trained models have demonstrated exceptional success in various fields, including natural language processing and computer vision. In line with this progress, scGPT has been developed as a foundational model tailored specifically for the field of single-cell biology. It employs the generative pre-training transformer framework on an extensive dataset comprising more than 33 million cells. scGPT effectively extracts valuable biological insights related to genes and cells and can be fine-tuned to excel in numerous downstream applications.
Required GPU
scgpt
Seurat
Geneformer: a deep learning model for exploring gene networks

BioTuring

Geneformer is a foundation transformer model pretrained on a large-scale corpus of ~30 million single cell transcriptomes to enable context-aware predictions in settings with limited data in network biology. Here, we will demonstrate a basic workflow for working with ***Geneformer*** models. These notebooks include instructions to:
1. Prepare input datasets
2. Fine-tune a Geneformer model to perform a specific task
3. Use the fine-tuned models for cell classification and gene classification
Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata.

BioTuring

SCANPY integrates the analysis possibilities of established R-based frameworks and provides them in a scalable and modular form. Specifically, SCANPY provides preprocessing comparable to SEURAT and CELL RANGER, visualization through TSNE, graph-drawing and diffusion maps, clustering similar to PHENOGRAPH, identification of marker genes for clusters via differential expression tests and pseudotemporal ordering via diffusion pseudotime, which compares favorably with MONOCLE 2, and WISHBONE.
Only CPU
Scanpy
edgeR: Differential analysis of sequence read count data

BioTuring

This notebook provides an overview of the Bioconductor package edgeR for differential expression analyses of read counts arising from RNA-Seq, SAGE or similar technologies. The package can be applied to any technology that produces read counts for genomic features. edgeR implements statistical methods based on generalized linear models (glms), suitable for multifactor experiments of any complexity. The glm functions can test for differential expression using either likelihood ratio tests or quasi-likelihood F-tests. A particular feature of edgeR functionality, both classic and glm, are empirical Bayes methods that permit the estimation of gene-specific biological variation, even for experiments with minimal levels of biological replication. edgeR can be applied to differential expression at the gene, exon, transcript or tag level. In fact, read counts can be summarized by any genomic feature. edgeR analyses at the exon level are easily extended to detect differential splicing or isoform-specific differential expression.
Only CPU
edgeR
Inference and analysis of cell-cell communication using CellChat

BioTuring

Understanding global communications among cells requires accurate representation of cell-cell signaling links and effective systems-level analyses of those links. We construct a database of interactions among ligands, receptors and their cofactors that accurately represent known heteromeric molecular complexes. We then develop **CellChat**, a tool that is able to quantitatively infer and analyze intercellular communication networks from single-cell RNA-sequencing (scRNA-seq) data. CellChat predicts major signaling inputs and outputs for cells and how those cells and signals coordinate for functions using network analysis and pattern recognition approaches. Through manifold learning and quantitative contrasts, CellChat classifies signaling pathways and delineates conserved and context-specific pathways across different datasets. Applying **CellChat** to mouse and human skin datasets shows its ability to extract complex signaling patterns.
Required GPU
CellChat
Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram

BioTuring

Charting an organ's biological atlas requires us to spatially resolve the entire single-cell transcriptome, and to relate such cellular features to the anatomical scale. Single-cell and single-nucleus RNA-seq (sc/snRNA-seq) can profile cells comprehensively, but lose spatial information. Spatial transcriptomics allows for spatial measurements, but at lower resolution and with limited sensitivity. Targeted in situ technologies solve both issues, but are limited in gene throughput. To overcome these limitations we present Tangram, a method that aligns sc/snRNA-seq data to various forms of spatial data collected from the same region, including MERFISH, STARmap, smFISH, Spatial Transcriptomics (Visium) and histological images. **Tangram** can map any type of sc/snRNA-seq data, including multimodal data such as those from SHARE-seq, which we used to reveal spatial patterns of chromatin accessibility. We demonstrate Tangram on healthy mouse brain tissue, by reconstructing a genome-wide anatomically integrated spatial map at single-cell resolution of the visual and somatomotor areas.
Required GPU
Tangram