Following the official Seurat website to learn spatial transcriptomics (using a 10x dataset as an example, covering cell mapping and unsupervised clustering)

Here is a summary of the tutorial on spatial transcriptomics analysis using Seurat, including code for each section:

This tutorial demonstrates how to use Seurat (>=3.2) to analyze spatially resolved RNA-seq data. Although the analysis pipeline is similar to the Seurat workflow for single-cell RNA-seq analysis, we introduce newer interaction and visualization tools with a special emphasis on the integration of spatial and molecular information. This tutorial will cover the following tasks, which we believe are common to many spatial analyses:

Normalization
Dimensionality reduction and clustering
Detection of spatially variable features
Interactive visualization
Integration with single-cell RNA-seq data
Working with multiple slices
For our first vignette, we analyze a dataset generated with the Visium technology from 10x Genomics. We will be extending Seurat to work with additional data types in the near future, including SLIDE-seq, STARmap, and MERFISH.

First, we load Seurat and the other packages required for this vignette.

library(Seurat)
library(SeuratData)
library(ggplot2)
library(patchwork)
library(dplyr)

Dataset

Here, we will use a recently released dataset of sagittal mouse brain slices generated with the Visium v1 chemistry. There are two serial anterior sections and two (matched) serial posterior sections.

You can download the data here and load it into Seurat using the Load10X_Spatial() function. This reads the output of the spaceranger pipeline and returns a Seurat object containing the spot-level expression data along with the associated image of the tissue section. You can also use our SeuratData package to easily access the data, as shown below. After installing the dataset, you can type ?stxBrain to learn more.

InstallData("stxBrain")
brain <- LoadData("stxBrain", type = "anterior1")
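
If you have run spaceranger yourself rather than using SeuratData, a minimal sketch of loading the output directly with Load10X_Spatial() could look like the following (the directory path is hypothetical):

# A sketch of loading spaceranger output directly (hypothetical path), as an
# alternative to the SeuratData call above; Load10X_Spatial() returns a Seurat
# object with spot-level expression data plus the associated tissue image
brain <- Load10X_Spatial(
    data.dir = "path/to/spaceranger/outs",       # hypothetical spaceranger output directory
    filename = "filtered_feature_bc_matrix.h5",  # default matrix file name
    assay = "Spatial",
    slice = "anterior1"
)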

Data preprocessing

The initial preprocessing steps that we perform on the spot-level gene expression data are similar to those for a typical scRNA-seq experiment. We first need to normalize the data to account for variance in sequencing depth across spots. We note that for spatial datasets, the variance in molecular counts per spot can be substantial, particularly if there are differences in cell density across the tissue. We see substantial heterogeneity here, which requires effective normalization.

These figures demonstrate that the variance in molecular counts across spots is not only technical in nature, but also depends on tissue anatomy. For example, regions of the tissue that are depleted of neurons, such as the cortical white matter, reproducibly exhibit lower molecular counts. As a result, standard approaches (such as the LogNormalize() function), which force each spot to have the same underlying "size" after normalization, can be problematic.

plot1 <- VlnPlot(brain, features = "nCount_Spatial", pt.size = 0.1) + NoLegend()
plot2 <- SpatialFeaturePlot(brain, features = "nCount_Spatial") + theme(legend.position = "right")
wrap_plots(plot1, plot2)

As an alternative, we recommend using sctransform (Hafemeister and Satija, Genome Biology 2019), which builds regularized negative binomial models of gene expression in order to account for technical artifacts while preserving biological variance. For more details on sctransform, see the paper here and the Seurat vignette here. sctransform normalizes the data, detects high-variance features, and stores the results in the SCT assay.

brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
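
As a quick check (a small sketch, not part of the original vignette), you can confirm that SCTransform() created and activated the SCT assay and populated the variable features:

# After SCTransform(), the normalized data live in the "SCT" assay,
# which is set as the default assay for downstream steps
DefaultAssay(brain)
head(VariableFeatures(brain))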

Gene expression visualization

In Seurat, we have functionality to explore and interact with the inherently visual nature of spatial data. The SpatialFeaturePlot() function in Seurat extends FeaturePlot() and can overlay molecular data on top of tissue histology. For example, in this dataset of the mouse brain, the gene Hpca is a strong hippocampus marker and Ttr is a marker of the choroid plexus.

SpatialFeaturePlot(brain, features = c("Hpca", "Ttr"))


library(ggplot2)
plot <- SpatialFeaturePlot(brain, features = c("Ttr")) + theme(legend.text = element_text(size = 0),
    legend.title = element_text(size = 20), legend.key.size = unit(1, "cm"))
jpeg(filename = "../output/images/spatial_vignette_ttr.jpg", height = 700, width = 1200, quality = 50)
print(plot)
dev.off()

Default parameters in Seurat emphasize visualization of molecular data. However, you can also adjust the size of the spots (and their transparency) to improve visualization of histology images by changing the following parameters:

pt.size.factor - this scales the size of the spots. Defaults to 1.6.
alpha - minimum and maximum transparency. Defaults to c(1, 1).
Try setting alpha to c(0.1, 1) to downweight the transparency of spots with lower expression.

p1 <- SpatialFeaturePlot(brain, features = "Ttr", pt.size.factor = 1)
p2 <- SpatialFeaturePlot(brain, features = "Ttr", alpha = c(0.1, 1))
p1 + p2


Dimensionality reduction, clustering and visualization

We can then proceed to dimensionality reduction and clustering of the RNA expression data, using the same workflow we used for scRNA-seq analysis.

brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30)
brain <- FindClusters(brain, verbose = FALSE)
brain <- RunUMAP(brain, reduction = "pca", dims = 1:30)
p1 <- DimPlot(brain, reduction = "umap", label = TRUE)
p2 <- SpatialDimPlot(brain, label = TRUE, label.size = 3)
p1 + p2


We can then visualize the results of the clustering either in UMAP space (with DimPlot()) or overlaid on the image with SpatialDimPlot().

Since there are multiple colors, it is difficult to visualize which voxel belongs to which cluster. We have some strategies to help with this. Setting the label parameter places a colored box in the middle of each cluster (see image above).

You can also use the cells.highlight parameter to demarcate particular cells of interest on a SpatialDimPlot(). This can be very useful for distinguishing the spatial localization of individual clusters, as shown below:

SpatialDimPlot(brain, cells.highlight = CellsByIdentities(object = brain, idents = c(2, 1, 4, 3,
    5, 8)), facet.highlight = TRUE, ncol = 3)


Interactive plotting

We have also built in a number of interactive plotting features. Both SpatialDimPlot() and SpatialFeaturePlot() now have an interactive parameter which, when set to TRUE, will open the RStudio viewer pane with an interactive Shiny plot. The example below demonstrates an interactive SpatialDimPlot() in which you can hover over a spot and view the cell name and current identity class (analogous to the previous do.hover behavior).

SpatialDimPlot(brain, interactive = TRUE)

For SpatialFeaturePlot(), setting interactive to TRUE brings up an interactive pane in which you can adjust the transparency of the spots, the point size, as well as the assay and feature being plotted. After exploring the data, selecting the Done button will return the last active plot as a ggplot object.

SpatialFeaturePlot(brain, features = "Ttr", interactive = TRUE)

The LinkedDimPlot() function links the UMAP representation to the tissue image representation and allows interactive selection. For example, you can select a region in the UMAP plot and the corresponding spots in the image representation will be highlighted.

LinkedDimPlot(brain)

Identification of spatially variable features

Seurat offers two workflows to identify molecular features that correlate with spatial location within a tissue. The first is to perform differential expression based on pre-annotated anatomical regions within the tissue, which may be determined either from unsupervised clustering or from prior knowledge. This strategy works well in this case, as the clusters above exhibit clear spatial restriction.

de_markers <- FindMarkers(brain, ident.1 = 5, ident.2 = 6)
SpatialFeaturePlot(object = brain, features = rownames(de_markers)[1:3], alpha = c(0.1, 1), ncol = 3)
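
The call above compares two specific clusters. As a hedged sketch of the same region-based strategy applied more broadly, FindAllMarkers() can test every cluster against all remaining spots (the thresholds below are illustrative, not from the original vignette):

# A sketch: positive markers for every cluster versus all other spots
# (illustrative thresholds; adjust for your own data)
all_markers <- FindAllMarkers(brain, only.pos = TRUE,
    min.pct = 0.25, logfc.threshold = 0.25)
head(all_markers)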


An alternative approach, implemented in FindSpatiallyVariableFeatures(), is to search for features exhibiting spatial patterning in the absence of pre-annotation. The default method (method = 'markvariogram') is inspired by Trendsceek, which models spatial transcriptomics data as a mark point process and computes a "variogram", identifying genes whose expression level depends on their spatial location. More specifically, this process calculates gamma(r) values measuring the dependence between two spots a certain "r" distance apart. By default, we use an r value of "5" in these analyses, and only compute these values for variable genes (where variation is calculated independently of spatial location) to save time.

We note that there are several methods in the literature that accomplish this task, including SpatialDE and Splotch. We encourage interested users to explore these methods and hope to add support for them in the near future.
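
The paragraphs above describe the mark-variogram method, while the call we run below uses Moran's I instead. For reference, here is a hedged sketch of the variogram-based variant on a smaller feature set (it is considerably slower), stored in a separate object so the main pipeline is left untouched:

# A sketch of the Trendsceek-inspired mark-variogram method described above;
# run on fewer features because it is computationally intensive
brain_mv <- FindSpatiallyVariableFeatures(brain, assay = "SCT",
    features = VariableFeatures(brain)[1:100],
    selection.method = "markvariogram", r.metric = 5)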

brain <- FindSpatiallyVariableFeatures(brain, assay = "SCT", features = VariableFeatures(brain)[1:1000],
    selection.method = "moransi")

Now we visualize the expression of the first 6 features identified by this metric.

top.features <- head(SpatiallyVariableFeatures(brain, selection.method = "moransi"), 6)
SpatialFeaturePlot(brain, features = top.features, ncol = 3, alpha = c(0.1, 1))


Subset out anatomical regions

As with single-cell objects, you can subset the object to focus on a subset of the data. Here, we approximately subset the frontal cortex. This process also facilitates the integration of these data with a cortical scRNA-seq dataset in the next section. First, we take a subset of clusters, and then further segment based on exact position. After subsetting, we can visualize the cortical cells either on the full image or a cropped image.

cortex <- subset(brain, idents = c(1, 2, 3, 4, 6, 7))
# now remove additional cells, use SpatialDimPlots to visualize what to remove
# SpatialDimPlot(cortex,cells.highlight = WhichCells(cortex, expression = image_imagerow > 400
# | image_imagecol < 150))
cortex <- subset(cortex, anterior1_imagerow > 400 | anterior1_imagecol < 150, invert = TRUE)
cortex <- subset(cortex, anterior1_imagerow > 275 & anterior1_imagecol > 370, invert = TRUE)
cortex <- subset(cortex, anterior1_imagerow > 250 & anterior1_imagecol > 440, invert = TRUE)
p1 <- SpatialDimPlot(cortex, crop = TRUE, label = TRUE)
p2 <- SpatialDimPlot(cortex, crop = FALSE, label = TRUE, pt.size.factor = 1, label.size = 3)
p1 + p2


Integrate with single-cell data

At ~50um, spots from the Visium assay will contain expression profiles from multiple cells. For the growing list of systems where scRNA-seq data are available, users may be interested in "deconvolving" each of the spatial voxels to predict the underlying composition of cell types. In preparing this vignette, we tested a wide variety of deconvolution and integration methods, using a reference scRNA-seq dataset of ~14,000 adult mouse cortical cells from the Allen Institute, generated with the SMART-Seq2 protocol. We consistently found superior performance using integration methods (as opposed to deconvolution methods), likely because the noise models that characterize spatial and single-cell datasets are substantially different, and integration methods are specifically designed to be robust to these differences. We therefore apply the "anchor"-based integration workflow introduced in Seurat v3, which enables the probabilistic transfer of annotations from a reference to a query set. We follow the label transfer workflow presented here, taking advantage of sctransform normalization, but anticipate that new methods will be developed to accomplish this task.

We first load the data (downloadable here), pre-process the scRNA-seq reference, and then perform label transfer. The procedure outputs, for each spot, a probabilistic classification for each of the scRNA-seq derived classes. We add these predictions as a new assay in the Seurat object.

allen_reference <- readRDS("../data/allen_cortex.rds")
# note that setting ncells=3000 normalizes the full dataset but learns noise models on 3k cells;
# this speeds up SCTransform dramatically with no loss in performance
library(dplyr)
allen_reference <- SCTransform(allen_reference, ncells = 3000, verbose = FALSE) %>%
    RunPCA(verbose = FALSE) %>%
    RunUMAP(dims = 1:30)
# After subsetting, we renormalize cortex
cortex <- SCTransform(cortex, assay = "Spatial", verbose = FALSE) %>%
    RunPCA(verbose = FALSE)
# the annotation is stored in the 'subclass' column of object metadata
DimPlot(allen_reference, group.by = "subclass", label = TRUE)


anchors <- FindTransferAnchors(reference = allen_reference, query = cortex, normalization.method = "SCT")
predictions.assay <- TransferData(anchorset = anchors, refdata = allen_reference$subclass, prediction.assay = TRUE,
    weight.reduction = cortex[["pca"]], dims = 1:30)
cortex[["predictions"]] <- predictions.assay
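
As a small sanity check (a sketch, not part of the original vignette), you can peek at the transferred scores stored in the new assay; each column is a spot and each row is a transferred class score:

# Inspect the transferred prediction scores for the first few spots
GetAssayData(cortex, assay = "predictions")[, 1:3]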

Now we get prediction scores for each spot for each class. Of particular interest in the frontal cortex region are the laminar excitatory neurons. Here we can distinguish between distinct sequential layers of these neuronal subtypes, for example:

DefaultAssay(cortex) <- "predictions"
SpatialFeaturePlot(cortex, features = c("L2/3 IT", "L4"), pt.size.factor = 1.6, ncol = 2, crop = TRUE)

Based on these prediction scores, we can also predict cell types whose location is spatially restricted. We use the same approach based on marked point processes that we used to define spatially variable features, but use the cell type prediction scores as the "marks" rather than gene expression.

cortex <- FindSpatiallyVariableFeatures(cortex, assay = "predictions", selection.method = "moransi",
    features = rownames(cortex), r.metric = 5, slot = "data")
top.clusters <- head(SpatiallyVariableFeatures(cortex, selection.method = "moransi"), 4)
SpatialPlot(object = cortex, features = top.clusters, ncol = 2)

Finally, we show that our integrative procedure is capable of recovering the known spatial localization patterns of both neuronal and non-neuronal subsets, including laminar excitatory neurons, layer-1 astrocytes, and the cortical grey matter.

SpatialFeaturePlot(cortex, features = c("Astro", "L2/3 IT", "L4", "L5 PT", "L5 IT", "L6 CT", "L6 IT",
    "L6b", "Oligo"), pt.size.factor = 1, ncol = 2, crop = FALSE, alpha = c(0.1, 1))



Source: blog.csdn.net/coffeeii/article/details/130955116