Spateo Tool Usage Guide (1): Cell Segmentation from Nuclear Staining

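I recently discovered Spateo, a remarkable tool that can perform cell segmentation and that also includes many other interesting algorithms.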

Install

Note that Spateo currently cannot be installed on Windows! The method below applies only to Linux systems!

Python 3.7 or later is required.

# Here we use Python 3.9
conda create -n spa python=3.9

There are three ways to install this package: pip from PyPI, pip from the GitHub repository, or an offline install. Because of various version conflicts, however, only the plain pip install worked. The commands are:

# Method 1
pip install spateo-release
# Method 2 (try it yourself if your network allows it)
pip install git+https://github.com/aristoteleo/spateo-release

Official Known Issues and Fixes

If you run into a version conflict, you can pin these versions:

pip install h5py==3.7.0
pip install anndata==0.8.0
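
Before pinning anything, it can help to check which versions are actually installed. The following is a minimal sketch using only the standard library (it needs Python 3.8 or later for `importlib.metadata`). The helper name `installed_version` is made up for this illustration, and the `pins` dictionary simply mirrors the two versions listed above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The versions suggested above; compare them with what is installed.
pins = {"h5py": "3.7.0", "anndata": "0.8.0"}
for name, wanted in pins.items():
    found = installed_version(name)
    status = "ok" if found == wanted else "differs"
    print(f"{name}: installed={found}, suggested={wanted} ({status})")
```

If a package is reported as "differs", that is only a hint, not proof of a conflict; pin it only if the import actually fails.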


Getting to the point: using Spateo for cell segmentation from nuclear staining

In this tutorial we assume that we have paired RNA and nuclear stain images of the same tissue section and that the two coordinate systems are (roughly) registered. We will use the nuclear staining as the true location of the nucleus and use this information to obtain cell segmentation. We will do this in the following steps.

1. As a preprocessing step, we will improve the alignment between stained images and RNA coordinates.
2. Identify and label individual nuclei using a watershed-based approach.
3. Perform the same operation using a deep learning-based method called StarDist.
4. [Optional] Augment the StarDist labels with watershed labels by copying labels that exist in the watershed result but do not overlap any label in the StarDist result.
5. [Optional] Expand nuclear labeling into the cytoplasm.
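
Step 4 is easier to picture with a toy example. The sketch below is not Spateo's implementation; it is a plain NumPy illustration of the idea, assuming both label images are 2D integer arrays in which 0 means background. The function name `augment_labels_sketch` and the two small arrays are hypothetical.

```python
import numpy as np


def augment_labels_sketch(primary, secondary):
    """Copy labels from `secondary` that touch no labeled pixel in `primary`.

    Both inputs are 2D integer arrays where 0 means background.
    """
    out = primary.copy()
    next_label = int(primary.max()) + 1
    for label in np.unique(secondary):
        if label == 0:
            continue
        region = secondary == label
        # Skip any secondary object that overlaps a primary object.
        if (primary[region] > 0).any():
            continue
        out[region] = next_label
        next_label += 1
    return out


# Toy stand-ins for the StarDist and watershed label images.
stardist = np.array([[1, 1, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]])
watershed = np.array([[5, 0, 0, 0],
                      [0, 0, 0, 0],
                      [0, 0, 7, 7]])
print(augment_labels_sketch(stardist, watershed))
```

Here watershed label 5 overlaps StarDist label 1 and is dropped, while watershed label 7 overlaps nothing and is copied over under a new label number.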

Import the installed package

Judging by the 'retina' figure format set below, the author of the original tutorial was probably working on an Apple computer.

import spateo as st
import matplotlib.pyplot as plt

st.config.n_threads = 8
# A %config line magic must fit on a single line
%config InlineBackend.print_figure_kwargs = {'facecolor' : "w"}
%config InlineBackend.figure_format = 'retina'

Download Data

We will use a truncated version of the mouse coronal section dataset from Chen et al., 2021. This is a Stereo-seq dataset that comes with ssDNA (nuclear) staining.

!wget "https://drive.google.com/uc?export=download&id=1nONOaUy7utvtXQ3ZPx7R3TePq2Oo4JFM" -nc -O SS200000135IL-D1.ssDNA.tif
!wget "https://drive.google.com/uc?export=download&id=18sM-5LmxOgt-3kq4ljtq_EdWHjihvPUx" -nc -O SS200000135TL_D1_all_bin1.txt.gz

Load the downloaded UMI count and nuclear stain images into an AnnData object. For the purpose of cell segmentation, we will use an aggregate count matrix, where the obs and vars of AnnData correspond to the spatial X and Y coordinates, and each element of the matrix contains the total number of UMIs captured for each X and Y coordinate.

adata = st.io.read_bgi_agg(
    'SS200000135TL_D1_all_bin1.txt.gz', 'SS200000135IL-D1.ssDNA.tif',
)
adata

|-----> Constructing count matrices.
|-----> <insert> __type to uns in AnnData Object.
|-----> <insert> pp to uns in AnnData Object.
|-----> <insert> spatial to uns in AnnData Object.
AnnData object with n_obs × n_vars = 2000 × 2000
    uns: '__type', 'pp', 'spatial'
    layers: 'stain', 'spliced', 'unspliced'
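
To make the aggregate count matrix described above more concrete, here is a small NumPy sketch. It does not use Spateo; the three arrays are made-up records of the form (x, y, UMI count), and the matrix simply sums the counts that fall on the same coordinate.

```python
import numpy as np

# Hypothetical bin-level records: (x, y, UMI count).
x = np.array([0, 0, 2, 2, 1])
y = np.array([0, 0, 1, 1, 2])
umi = np.array([3, 1, 2, 5, 4])

# Rows index X, columns index Y; repeated coordinates are summed.
agg = np.zeros((x.max() + 1, y.max() + 1), dtype=int)
np.add.at(agg, (x, y), umi)
print(agg)
```

`np.add.at` is used instead of `agg[x, y] += umi` because the latter would count a repeated coordinate only once.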

st.pl.imshow(adata, 'stain')

|-----> <select> stain layer in AnnData Object

Optimize alignment

The stained image should already be roughly aligned with the RNA coordinates, but it may still be slightly off. Large misalignments lead to incorrect UMI aggregation (and therefore incorrect cells!), so it is good practice to refine the alignment that comes directly with the spatial transcriptomics data.

Spateo offers two alignment strategies, but here we will use the simpler rigid alignment because testing shows that this performs well for this sample. For other samples, non-rigid alignment methods may perform better.

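# Keep a copy of the stain image for the before/after comparison
before = adata.layers['stain'].copy()
# Refine the alignment in rigid mode and transform the 'stain' layer
st.cs.refine_alignment(adata, mode='rigid', transform_layers=['stain'])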

|-----> <select> stain layer in AnnData Object
|-----> <select> unspliced layer in AnnData Object
|-----> Refining alignment in rigid mode.
|-----> Transforming layers ['stain']
|-----> <select> stain layer in AnnData Object
|-----> <insert> stain to layers in AnnData Object.
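
As a reminder of what "rigid" means in the usual sense of the word, the sketch below rotates and shifts a few points without stretching them. This is only an illustration in NumPy; it is not how Spateo finds the alignment, and Spateo's rigid mode may parameterize the transform differently. The function `rigid_transform` and the sample points are hypothetical.

```python
import numpy as np


def rigid_transform(points, angle_deg, shift):
    """Rotate (N, 2) points about the origin, then translate them."""
    theta = np.deg2rad(angle_deg)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    return points @ rotation.T + np.asarray(shift)


corners = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 5.0]])
moved = rigid_transform(corners, angle_deg=90, shift=(2.0, 1.0))
print(np.round(moved, 6))

# Distances between points are unchanged by a rigid transform.
dist_before = np.linalg.norm(corners[1] - corners[0])
dist_after = np.linalg.norm(moved[1] - moved[0])
print(dist_before, dist_after)
```

A non-rigid method, by contrast, is allowed to deform the image locally, which is why it may do better on samples that are distorted.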

fig, axes = plt.subplots(ncols=2, figsize=(8, 4), tight_layout=True)
axes[0].imshow(before)
st.pl.imshow(adata, 'unspliced', ax=axes[0], alpha=0.6, cmap='Reds', vmax=2, use_scale=False, save_show_or_return='return')
axes[0].set_title('before alignment')
st.pl.imshow(adata, 'stain', ax=axes[1], use_scale=False, save_show_or_return='return')
st.pl.imshow(adata, 'unspliced', ax=axes[1], alpha=0.6, cmap='Reds', vmax=2, use_scale=False, save_show_or_return='return')
axes[1].set_title('after alignment')

|-----> <select> unspliced layer in AnnData Object
|-----> <select> stain layer in AnnData Object
|-----> <select> unspliced layer in AnnData Object
Text(0.5, 1.0, 'after alignment')

Watershed-based approach

Spateo includes a custom watershed-based method for segmenting and labeling nuclei from stained images. At a high level, it uses a combination of global and local thresholds to first obtain a nuclei mask (the segmentation step), and then uses watershed to assign a label to each nucleus (the labeling step).
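
The idea of combining a global and a local threshold can be sketched in a few lines of NumPy. This is not Spateo's code: the choice of the image mean as the global threshold, the local window mean as the local threshold, and the rule of requiring both are all assumptions made for this illustration, as are the function names.

```python
import numpy as np


def box_mean(image, k):
    """Mean over a k x k window (k odd), using edge padding and an integral image."""
    pad = k // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    # Prepend a zero row and column so each window sum is four lookups.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = image.shape
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)


def threshold_mask(image, k=5):
    """Keep pixels brighter than both a global and a local threshold."""
    global_t = image.mean()        # assumption: plain mean as global threshold
    local_t = box_mean(image, k)   # assumption: local mean as local threshold
    return (image > global_t) & (image > local_t)


# Toy "stain" image: one bright 3 x 3 nucleus on a dark background.
stain = np.zeros((9, 9))
stain[3:6, 3:6] = 10.0
mask = threshold_mask(stain)
print(mask.astype(int))
```

The global threshold removes the dark background, while the local threshold adapts to uneven brightness across the image; on real images the window size matters a lot.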

Segmentation

st.cs.mask_nuclei_from_stain(adata)
st.pl.imshow(adata, 'stain_mask')

|-----> <select> stain layer in AnnData Object
|-----> Constructing nuclei mask from staining image.
|-----> <insert> stain_mask to layers in AnnData Object.
|-----> <select> stain_mask layer in AnnData Object

Labeling

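# Find marker peaks from the mask, at least 7 pixels apart
st.cs.find_peaks_from_mask(adata, 'stain', 7)
# Run watershed from those markers and store the result in 'watershed_labels'
st.cs.watershed(adata, 'stain', 5, out_layer='watershed_labels')

# Overlay the labels on the stain image
fig, ax = st.pl.imshow(adata, 'stain', save_show_or_return='return')
st.pl.imshow(adata, 'watershed_labels', labels=True, alpha=0.5, ax=ax)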

|-----> <select> stain_mask layer in AnnData Object
|-----> Finding peaks with minimum distance 7.
|-----> <insert> stain_distances to layers in AnnData Object.
|-----> <insert> stain_markers to layers in AnnData Object.
|-----> <select> stain layer in AnnData Object
|-----> <select> stain_mask layer in AnnData Object
|-----> <select> stain_markers layer in AnnData Object
|-----> Running Watershed.
|-----> <insert> watershed_labels to layers in AnnData Object.
|-----> <select> stain layer in AnnData Object
|-----> <select> watershed_labels layer in AnnData Object

That completes the watershed-based segmentation! We still need to understand the resulting data structure better, though. Note that data read in through a different format may not have spliced and unspliced layers.
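
As a first look at that data structure, a label layer can be treated as an ordinary integer array. The sketch below uses a small made-up array in place of a real layer such as `adata.layers['watershed_labels']`, and it assumes the common convention that 0 is background and each positive integer marks one nucleus.

```python
import numpy as np

# Toy stand-in for a label layer; 0 is assumed to be background.
labels = np.array([[0, 1, 1, 0],
                   [0, 1, 0, 2],
                   [3, 0, 0, 2]])

# Count the labeled objects and the number of pixels in each one.
ids, areas = np.unique(labels[labels > 0], return_counts=True)
print("number of nuclei:", len(ids))
for i, a in zip(ids, areas):
    print(f"label {i}: {a} pixels")
```

The same two lines applied to a real label layer give a quick sanity check on how many nuclei were found and whether any of them are suspiciously small or large.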


Origin blog.csdn.net/coffeeii/article/details/130396112