5 Spatial RNA: VisiumHD

Visium HD Spatial Gene Expression (10x Genomics)

The Visium HD Spatial Gene Expression platform offers whole-transcriptome spatial profiling at single-cell resolution. Below is a concise summary of the key features and benefits.

Technology Key Features

High-Resolution Capture
- Slides contain two 6.5 x 6.5 mm capture areas.
- Each area includes millions of contiguous 2 x 2 µm barcoded squares.
- Enables single-cell–scale spatial resolution.
Broad Sample Compatibility
- Supports human and mouse tissue.
- Compatible with:
  - Fresh frozen
  - Fixed frozen
  - FFPE sections
  - Archived blocks and pre-sectioned slides
CytAssist Workflow Integration
- Uses the Visium CytAssist instrument.
- Transfers transcriptomic probes from standard slides to Visium HD slides.
- Streamlines sample preparation and improves precision.
Enhanced Data Analysis
- Spatial gene expression data at 2 µm resolution.
- Recommended binning: 8 x 8 µm for analysis.
- Compatible with:
  - Space Ranger
  - Loupe Browser

Applications

Spatially resolved transcriptomics at single-cell resolution.
Analyze cell identity, function, and tissue architecture.
Ideal for research in cancer, neuroscience, and developmental biology.

Learn more at: 10x Genomics Visium HD

The difference of VisiumHD and Visium

Feature	Visium HD	Visium (Standard)
Spatial Resolution	2 × 2 µm barcoded squares (~10 million features)	55 µm diameter spots (~5,000 features)
Sample Types	FFPE, fresh frozen, fixed frozen	Fresh frozen (v1), FFPE (v2 with CytAssist)
Data Analysis	High-resolution, single-cell level	Lower resolution, may need scRNA-seq integration

Choosing the Right Bin Size in Visium HD: Implications from Human Cell Size

Human cell sizes vary widely across tissues and cell types—from small red blood cells (~6–8 µm) to large adipocytes or neurons (50–100+ µm). Visium HD provides raw data start at 2 × 2 µm resolution, allowing users to flexibly define bin sizes during analysis.

Why Bin Size Matters

The bin size determines how raw 2 µm pixel data are aggregated spatially. Choosing the appropriate bin size impacts:

Signal strength: Larger bins capture more transcripts (UMIs), reducing sparsity.
Cell boundary resolution: Smaller bins preserve spatial detail but may increase data dropout.

Smaller bins often have lower total UMI counts (nCount) and fewer detected genes (nFeature), which can pose challenges for clustering, normalization, and downstream statistical power.

Best Practices

Use 8 × 8 µm binning for most human tissues to balance resolution and data quality.
Use exploratory binning at multiple scales (e.g. 8 µm, 16 µm) to assess resolution vs. noise trade-off.
Align bin size with known cell morphology and tissue architecture.
Consider downstream filtering thresholds and quality control—small bins may require more stringent preprocessing or imputation.
Use segmentation tools like:
- StarDist – for accurate nuclear segmentation from DAPI or H&E images.
- bin2cell – to assign spatial bins to cells using segmentation-informed boundaries.

Quick prep before we dive in!!

Data Downloads

Before starting this tutorial, please making sure to downloaded/find VisiumHD output.

In this tutorial, we will use one of 10X public datasets, Human Breast Cancer (Fresh Frozen).

More different tissue type data can be found at here

# standard spaceranger output for visiumHD
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_web_summary.html
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_cloupe_008um.cloupe
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_feature_slice.h5
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_metrics_summary.csv
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_molecule_info.h5
curl -O https://cf.10xgenomics.com/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_spatial.tar.gz
curl -O https://s3-us-west-2.amazonaws.com/10x.files/samples/spatial-exp/3.1.2/Visium_HD_FF_Human_Breast_Cancer/Visium_HD_FF_Human_Breast_Cancer_binned_outputs.tar.gz

## unzip
tar -xvf Visium_HD_FF_Human_Breast_Cancer_spatial.tar.gz
tar -xvf Visium_HD_FF_Human_Breast_Cancer_binned_outputs.tar.gz

Just a quick chekcing without coding, check cloupe with Loupe Browser 8.1.2

Check 10X Web_Summary

Always the easiest and first step to understand your sample with XXXX_web_summary.html

Let’s dive in !!

installing R packages

#increase timeout because some of these packages take longer to download
options(timeout=9999999)
#the extra arguments are so that it won't stop to ask you any questions
remotes::install_github("mojaveazure/seurat-disk", upgrade = "always")
remotes::install_github("cellgeni/schard", upgrade = "always")
remotes::install_github("huayc09/SeuratExtend", upgrade = "always")
remotes::install_github("dmcable/spacexr", upgrade = "always")
remotes::install_github("RubD/Giotto", upgrade = "always") 
remotes::install_github("jinworks/CellChat", upgrade = "always")
install.packages("arrow", repos='http://cran.us.r-project.org')
remotes::install_github("immunogenomics/presto", upgrade = "always")

loading R packages

Click to expand R pacakges list

{
  library(matrixStats)  ## remotes::install_version("matrixStats", version="1.1.0") ## packageVersion('matrixStats')
  library(scCustomize)
  
  library(Seurat) # packageVersion('Seurat'), 5.2.0
  library(SeuratDisk)

  library(fields) 
  library(grid)
  library(ggplot2)
  library(schard)
  library(cowplot)
  library(paletteer)
  library(SeuratExtend)
  library(reshape2)
  library(patchwork) 
  library(plotly)
  library(RColorBrewer)
  library(arrow)
  library(magick) 
  library(ggpubr)
  
  library(kableExtra)
  library(rvest)
  
  library(ggrepel) 
  
  ## visiumHD annotation
  library(spacexr)
  library(Rfast)
  
  ## copykat
  library(copykat)
  
  ##spatial
  library(deldir)
  library(Rfast2)
  library(deldir)
  library(igraph)
  library(pheatmap)
  library(Giotto)
  library(CellChat)

  set.seed(1234) ## for Reproducible
}

Note, this tutorial was prepared using Seurat 5

you might need to install the packages if you are running your analysis on a local computer

Functions in R

Here is some pre-defined Functions will be used in this tutorial, which help to organize, reuse and simplify your code.

Click to expand R function code

Filter_cluster <- function(Object_8um){
## Extract metadata with cluster info
before <- Object_8um@meta.data %>%
  count(cluster = default_10X_cluster, name = "n_before")

after <- Object_8um_filtered@meta.data %>%
  count(cluster = default_10X_cluster, name = "n_after")

## Join and calculate filtered-out percentage
filtered_stats_8bin <- left_join(before, after, by = "cluster") %>%
  mutate(
    n_after = ifelse(is.na(n_after), 0, n_after),
    n_filtered = n_before - n_after,
    percent_filtered = round(100 * n_filtered / n_before, 1),
    percent_Keep = 100-percent_filtered
  )

if( any(is.na(filtered_stats_8bin$cluster)) ){
filtered_stats_8bin <- filtered_stats_8bin[-which(is.na(filtered_stats_8bin$cluster)),]
}
 
p2_8u_BF <- ggplot(filtered_stats_8bin, aes(x = factor(cluster), y = percent_Keep)) +
  geom_bar(stat = "identity", fill = "#3182bd", width = 0.7) +
  geom_text(
    aes(label = n_after),
    vjust = -0.6,
    size = 4.5,
    fontface = "bold"
  ) +
  labs(
    title = "Qualified bins per Cluster",
    x = "Cluster",
    y = "Percentage of bins Pass QC"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text( size = 16, hjust = 0.5),
    axis.text = element_text(size = 12),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )

return(p2_8u_BF)
}

PieChart_Spot <- function(objectPie){
    # objectPie <-Object_8um_filteredQ
    spot_counts <- as.data.frame(table(objectPie$spot_class))  # Count cell types
    colnames(spot_counts) <- c("spot_class","count")
    spot_counts$fraction = round(spot_counts$count / sum(spot_counts$count) * 100,1)
    spot_counts$ymax = cumsum(spot_counts$fraction)
    spot_counts$ymin = c(0, head(spot_counts$ymax, n=-1))
    spot_counts$labelPosition <- (spot_counts$ymax + spot_counts$ymin) / 2
    spot_counts$label <- paste0(spot_counts$spot_class, "\n ", 
                            spot_counts$count," (",spot_counts$fraction,"%)")
  
  ## label only on the legend
  set1_colors <- brewer.pal(n = length(unique(spot_counts$spot_class)), name = "Set1")
  spot_counts$label_legend <- paste0(spot_counts$spot_class," (",spot_counts$fraction,"%)")
      
  Bin8_RCTD_spot_Pie <-plot_ly(
  data = spot_counts,
  labels = ~label_legend,
  values = ~count,
  type = 'pie',
  textinfo = 'none',  # hide labels on chart
  #hoverinfo = 'label+value+percent',  # keep tooltip
  hoverinfo = 'label+value',  # keep tooltip
  marker = list(colors = set1_colors)
  ) %>%
  layout(
    showlegend = TRUE,  # show the legend
    legend = list(orientation = "v", x = 1.05, y = 1)  # position legend to right
  )
    
    return(Bin8_RCTD_spot_Pie)
}
  
## pie chart for cell type
PieChart_CellType <- function(objectPie){
    #objectPie<-Object_8um_filteredQ
    spot_counts <- as.data.frame(table(objectPie$RCTD_CellType))  # Count cell types
    colnames(spot_counts) <- c("RCTD_CellType","count")
    spot_counts$fraction = round(spot_counts$count / sum(spot_counts$count) * 100,1)
    spot_counts$ymax = cumsum(spot_counts$fraction)
    spot_counts$ymin = c(0, head(spot_counts$ymax, n=-1))
    spot_counts$labelPosition <- (spot_counts$ymax + spot_counts$ymin) / 2
    spot_counts$label <- paste0(spot_counts$RCTD_CellType, "\n ", 
                            spot_counts$count," (",spot_counts$fraction,"%)")
    
    spot_counts$label_legend <- paste0(spot_counts$RCTD_CellType," (",spot_counts$fraction,"%)")
 
   ## label only on the legend
   Bin8_RCTD_CellType_Pie <-plot_ly(
  data = spot_counts,
  labels = ~label_legend,
  values = ~count,
  type = 'pie',
  textinfo = 'none',  # hide labels on chart
  #hoverinfo = 'label+value+percent',  # keep tooltip
  hoverinfo = 'label+value',  # keep tooltip
  marker = list(colors = DNA_col[seq_len(nrow(spot_counts))])
  ) %>%
  layout(
    showlegend = TRUE,  # show the legend
    legend = list(orientation = "v", x = 1.05, y = 1)  # position legend to right
  )
   
    return(Bin8_RCTD_CellType_Pie)
}

## TME density
TMEDensity <- function(Object, cellstateXY_TME, distance_matrix, distance_cutoff, query) {
  
  ## chose the cell-state or cell type for density calcuate
  all_states <- sort(unique(cellstateXY_TME[, which(colnames(cellstateXY_TME) == query)]))
  
  ## cell-state
  result_matrix <- matrix(0, 
                          nrow = nrow(cellstateXY_cancer), 
                          ncol = length(all_states))
  
  colnames(result_matrix) <- all_states
  rownames(result_matrix) <- rownames(cellstateXY_cancer)
  
  for (i in 1:nrow(distance_matrix)) {
    # i=1
    close_indices <- which(distance_matrix[i, ] <= distance_cutoff)
    
    if (length(close_indices) > 0) {
      # Subset the states of the nearby TME cells.
      tme_states <- as.character(cellstateXY_TME[, which(colnames(cellstateXY_TME) ==query)][close_indices])

      # Create a table of counts for the states.
      state_counts <- table(tme_states)
      
      # Fill in the corresponding columns for the current cancer cell.
      for (state in names(state_counts)) {
        result_matrix[i, state] <- state_counts[[state]]
      }
    }
  }
  
  colnames(result_matrix) <- paste("TME_",colnames(result_matrix),sep="")

  ## add meta ---------------------------------------------
  Object_cancer <- subset(Object, cells = rownames(result_matrix))
  
  ## add count
  result_matrix_NA <- result_matrix
  result_matrix_NA[result_matrix_NA == 0] <- NA
  colMeans(is.na(result_matrix_NA)) * 100
  Object_cancer <- AddMetaData(object = Object_cancer, metadata = result_matrix_NA)
  
  Object_cancer@meta.data$x <- GetTissueCoordinates(Object_cancer)$y
  Object_cancer@meta.data$y <- -GetTissueCoordinates(Object_cancer)$x
  tissue_coords <- as.matrix(Object_cancer@meta.data[,c("x","y")])
  colnames(tissue_coords) <- c("image_1", "image_2")

  Object_cancer[["image"]] <- CreateDimReducObject(
    embeddings = tissue_coords,
    assay = DefaultAssay(Object_cancer),
    key = "image_"
  )
  
  return(Object_cancer)
}

## clone color 
DNA_col <- c("#ffe5cd","#F70373","#822E1CFF","#AA0DFEFF","#1CBE4FFF",
             "#ff89cc","#1CFFCEFF","#ed968c","#f35d36","#bbb1ff","#D3FC5E",
             "#7c4b73","#CD69C9","#31c7ba","#5f5c0b","#4d1da9","#f4d451","#d0ffb7","#239f6e","#1b80ad","#2F4F4F",
             "#acb1b4","#080000","#2ab2e5","#b97e45","#f03c2b","#daa9d9","#63ff55","#ebc2b9",
             "#fae70f","#c9ceda","#564c6c","#4539dd","#dd0cc5","#c6662f","#105c13","#dd7d6d","#b1d8ff","#FEAF16FF",
             "#ffd000","#6596cd","#b90303","#aabf88","#534e46","#974949","#828282","#bd8399","#5373a7")

Data Preprocessing with Seurat 5

Create Seurat spatial Objects

#data_path = "Path to the cell ranger output folder"
#SampleName = "Your Sample Name"
Object <- Load10X_Spatial(data.dir = data_path, bin.size = c(8, 16))

load 10X default graph cluster

we only load and demonstrate 8um here, feel free to try other bin size.

## split into 8 um and load 10X umap and cluster
DefaultAssay(Object) <- "Spatial.008um"
Object_8um <- DietSeurat(Object,
                       assays = "Spatial.008um")
projection_8um <- read.csv(paste0(data_path,"/binned_outputs/square_008um/analysis/clustering/gene_expression_graphclust/clusters.csv",sep=""), row.names = 1,stringsAsFactors=F,check.names = FALSE)
cluster_8um <- read.csv(paste0(data_path,"/binned_outputs/square_008um/analysis/umap/gene_expression_2_components/projection.csv"), row.names = 1,stringsAsFactors=F,check.names = FALSE)
projection_cluster_8um <- cbind(projection_8um,cluster_8um)
#head(projection_cluster_8um)

Object_8um@reductions$umap <- CreateDimReducObject(embeddings = as.matrix(projection_cluster_8um[, c("UMAP-1", "UMAP-2")]),
                                             key = "UMAP-", assay = DefaultAssay(Object_8um))
Object_8um$default_10X_cluster <- projection_cluster_8um$Cluster[match(colnames(Object_8um), rownames(projection_cluster_8um))]

p1_8u <- DimPlot(Object_8um, reduction = "umap", 
        group.by = c("default_10X_cluster"),
        ncol = 1,label=T,raster = T)

p2_8u <-SpatialDimPlot(Object_8um, 
               label = F, repel = F,  
               pt.size.factor = 1.5, 
               image.alpha = 0.6,
               stroke = 0.01,
               group.by = c("default_10X_cluster"))


plot_grid(p1_8u,p2_8u,ncol=1)

rm(Object)

QC metrics Visualization in 8 um Bin

Object_8um[["percent.mt"]] <- PercentageFeatureSet(object = Object_8um, pattern = "^MT-")


bin8_bin_ncount <- ggplot(Object_8um@meta.data, aes(x = nCount_Spatial.008um)) +
  xlim(0,500) + 
  #ylim(0,100000) +
  geom_histogram(binwidth = 10, fill = "steelblue", color = "black") +
  labs(x = "nCount_Spatial.008um",
       y = "Cell Count") + theme_minimal()

bin8_bin_nFeature <- ggplot(Object_8um@meta.data, aes(x = nFeature_Spatial.008um)) +
  xlim(0,500) + 
  #ylim(0,100000) +
  geom_histogram(binwidth = 10, fill = "steelblue", color = "black") +
  labs(x = "nFeature_Spatial.008um",
       y = "Cell Count") + theme_minimal()

Bin8_cellViability4 <-ggplot(Object_8um@meta.data,aes(x=percent.mt))+stat_ecdf(aes(colour=orig.ident))+ scale_x_continuous(name = "percent.Mitochondrial",breaks=seq(0,100,10),limits=c(0,100))+theme_bw()+ylab("The percentage of bins")+ theme(legend.position="none")

Bin8_QC <-plot_grid(bin8_bin_nFeature,bin8_bin_ncount,Bin8_cellViability4,ncol=3)

Bin8_Feature <- FeaturePlot(Object_8um,features = c("nFeature_Spatial.008um","nCount_Spatial.008um")) + theme(legend.position = "right")
Bin8_S_Feature <- SpatialFeaturePlot(Object_8um, features = c("nFeature_Spatial.008um","nCount_Spatial.008um")) + theme(legend.position = "top")

plot_grid(Bin8_QC,Bin8_Feature,Bin8_S_Feature,ncol=1)

Filtering criterion for 8um Bin

Bins detected in fewer than 50 spots were excluded
Bins with fewer than 50 total detected genes were excluded

Object_8um_filtered <- subset(
  Object_8um,
  subset = nFeature_Spatial.008um > 50 &
           nCount_Spatial.008um > 50) #& percent.mt < 20)

Bin8_pass <- (round(ncol(Object_8um_filtered) / ncol(Object_8um)*100,2))

table_8um <-data.frame(`8um Bin`=ncol(Object_8um_filtered),
                       Bin8_pass,
                       round(median(Object_8um_filtered@meta.data$nFeature_Spatial.008um)) )

colnames(table_8um)<-c("Number of Bin","Filter Ratio(%)","Median Genes per Bin")
table_8um

checking the Filtering result

## check the filter cell in each cluster 
p2_8u_BF <- Filter_cluster(Object_8um)

Keep_cells <- colnames(Object_8um) %in% colnames(Object_8um_filtered)
Object_8um$QC_passed <- Keep_cells

p1_8u_F <- DimPlot(Object_8um, reduction = "umap", 
        group.by = c("QC_passed"),
        ncol = 1,label=T,raster = T)

p2_8u_F <-SpatialDimPlot(Object_8um, 
               label = F, repel = F,  
               pt.size.factor = 1.5, 
               image.alpha = 0.6,
               stroke = 0.01,
               group.by = c("QC_passed"))

plot_grid(p2_8u_BF, plot_grid(p1_8u_F,p2_8u_F,ncol=2),
          rel_heights=c(0.4,0.6),ncol=1)

UMAP Cluster and Marker Genes for 8um Bin (10k bins subset)

## subset for 10k cell
sampled_cells <- sample(Cells(Object_8um_filtered), size = 10000, replace = FALSE)
Object_8um_filteredQ <- subset(Object_8um_filtered, cells = sampled_cells)

## standard seurat pre-processing like scRNA
{
Object_8um_filteredQ <- NormalizeData(Object_8um_filteredQ)
Object_8um_filteredQ <- FindVariableFeatures(Object_8um_filteredQ,nfeatures = 2000)
Object_8um_filteredQ <- ScaleData(Object_8um_filteredQ)

Object_8um_filteredQ <- RunPCA(Object_8um_filteredQ, verbose = FALSE)
Object_8um_filteredQ <- RunUMAP(Object_8um_filteredQ, dims = 1:30, verbose = FALSE)
Object_8um_filteredQ <- FindNeighbors(Object_8um_filteredQ, dims = 1:30, verbose = FALSE)
Object_8um_filteredQ <- FindClusters(Object_8um_filteredQ, verbose = FALSE,resolution = 0.3)
}

bin8_F_d <- DimPlot(Object_8um_filteredQ, label = TRUE)

bin8_F_sd <-SpatialDimPlot(Object_8um_filteredQ, 
               label = F, repel = F,  
               pt.size.factor = 10, 
              image.alpha = 0.5,
              stroke = 0.1)

Bin8_filter_VP <- VlnPlot(Object_8um_filteredQ, features = c("nCount_Spatial.008um"), pt.size = 0, log = TRUE)

plot_grid(plot_grid(bin8_F_d,bin8_F_sd,ncol = 2), 
          Bin8_filter_VP, 
          rel_heights=c(0.3,0.2),ncol=1)

UMAPplot_CSHL )

## cluster markers
bin8_markers <- FindAllMarkers(Object_8um_filteredQ, only.pos = TRUE)
bin8_markers %>%
  group_by(cluster) %>%
  dplyr::filter(avg_log2FC > 1) %>%
  slice_head(n = 10) %>%
  ungroup() -> bin8_top10

bin8_F_G <- DoHeatmap(Object_8um_filteredQ, features = bin8_top10$gene, size = 2.5) + theme(axis.text = element_text(size = 8)) + NoLegend()
bin8_F_G

CSHL_Heatmap

Due to the probe-based chemistry (similar to FLEX) used in Visium HD, and the platform’s inherent characteristics (such as targeted transcript capture, limited probe design coverage, and reduced sensitivity to low-abundance transcripts), not all canonical scRNA-seq markers are reliably captured in Visium HD data.

## canonical scRNA markers

Major_CellType_Markers  <- list(
  Endo_genes = c("CDH5","VWF","PECAM1","ESAM","THBD","CD36", "CD34", "HEG1"),
  Immun_genes = c("PTPRC","SRGN","TYROBP"),
  myeloid_genes = c("CD14", "CD33", "TREM2"),
  Epi_genes = c("EPCAM","KRT5","KRT17","KRT6","KRT18","KRT19", "KRT8", "KRT14"),
  Adipo_genes = c("APMAP","ADIPOQ","ADIPOQ-AS1","TPRA1"),
  Fibro_genes = c("COL1A1","LUM","FBLN1", "DCN", "FBN1", "COL1A2", "PCOLCE")
)

Dotplot_Group <- function(objectQ_subset,CellType_Markers) {
  marker_genes <- unlist(CellType_Markers)
  gene_annotation <- stack(marker_genes)
  colnames(gene_annotation) <- c("Gene", "Category")
  gene_annotation$Category <- gsub("_genes[0-9]", "", gene_annotation$Category)
  head(gene_annotation)
  
  dotplot <- DotPlot(objectQ_subset, features = as.character(marker_genes)) +
    coord_flip() +  # Rotate the plot
    theme_minimal()
  
  dotplot$data <- merge(dotplot$data, gene_annotation, by.x = "features.plot", by.y = "Gene", all.x = TRUE)
  
  # Re-plot with facets for annotation
  P <- dotplot + facet_grid(rows = vars(Category), scales = "free_y", space = "free_y")
  
  return(P)
}

bin8_F_C <- Dotplot_Group(Object_8um_filteredQ,Major_CellType_Markers)
bin8_F_C

CSHLWorkshop_Dotplot

## Subsetting cluster 2

Object_8um_filteredQ_subset<-subset(Object_8um_filteredQ, idents = '2', invert = TRUE)
DimPlot(Object_8um_filteredQ_subset, label = TRUE)

CSHL_UMAPsubset

Specific mouse markers to explore in your own data

Click to expand mouse brain cell type markers list

# A curated list of relevant brain cell types and the genes that define them
# Neural
Neural_genes <- c("Cadm2","Pcdh9","Syt1","Dlg2","Dlgap1","Negr1", "Fgf14")
# Oligodendrocyte
Oligo_genes <- c("Mbp","Plp1","Cnp","Mag","Mobp", "Olig1", "Mal") 
# Astrocytes
Astro_genes <- c("Slc39a12","Ntsr2","Fgfr3", "Bcan", "Aqp4", "Gfap") 
# Microglial 
Microglial_genes <- c("Hexb","C1qa","C1qb", "C1qc", "Csf1r", "Ctss") 
# Purkinje
Purkinje_genes <- c("Auts2","Foxp2","Ebf1","Grid2", "Dner", "Khdrsb2") 
# Ependymal
Ependymal_genes <- c("Dynlrb2","Tmem212","Rsph1","Foxj1", "Rarres2", "Tmem47")
# Hypendymal
Hypendymal_genes <- c("Rmst","Epha7","Zfhx4", "Efnb3", "Zic4", "Wls")
# Choroid 
Choroid_genes <- c("Rbm47","Htr2c","Enpp2", "Ttr", "Fras1", "Glis3")
#Endothelial Mural
Mural_genes <- c("Adap2","Pdgfrb","Pde8b", "Myo1b", "Rbpms", "Dlc1")
#Endothelial Stalk
Stalk_genes <- c("Slco1a4", "Adgrl4", "Abcg2", "Nostrin", "Cldn5", "Ptprb")

# Granular neuron types - Location (Suggest to just use Cerbral cortex, thalamus and hypothalamus markers)
# Cerebellar granule cell - Cerebellum
CBgranular_genes <- c("Kcnd2","Grik2","Fgf14", "Chn2", "Tenm1", "Olfm3", "Astn2") #Not present in this section
# Cerebellar golgi cell - Cerebellum
CBgolgi_genes <- c("Sgcd","Lgi2","Megf11", "Elavl2", "Cacna1d", "Fgf13", "Cdh4") #Not present in this section
# Unipolar brush cell - Cerebral cortex and Cochlear nucleus
CTX_CN_UBC_genes <- c("Unc5c","Tmem108","Nnat","Marchf1","Ccser1", "Sorbs2")
#Cerebral cortex neuron
CTXneuron_genes <-  c("Vamp2","Selenow","Grin2b","Snap25","Atp6v1g2","Norad", "Syp")
#Hippocamal neuron
HIPneuron_genes <- c("Ncdn","Ppfia2","Rfx3", "Ahcyl2", "Epha7")
#Cortical Interneuron
CTXInterneuron_genes <- c("Galntl6","Gad2","Nap1l5", "Grip1", "Tspyl4", "Cplx1")
#Medium Spiny neuron
CTX_TH_MSN_genes <- c("Rarb","Snhg14","Rian", "Rgs9", "Ryr3", "Gnal")
Pallium_spiny_genes <- c("Ppp1r1b")
#Cajal-Retzius cells
CTX_HIP_CRC_genes <- c("Ndnf","Reln","Ramp1", "Thsd7b", "Fat3", "Kcnb2")
#Excitary projection neurons
EP_genes <- c("Tbr1", "Neurod6", "Satb2")
#Hypothalamus
HY_genes <- c("Nkx2-1", "Sim1", "Lhx6", "Lhx8")
#Thalamus
TH_genes <- c("Tcf7l2", "Six3", "Plekhg1")
#Midbrain and Pontine neurons
MBP_genes <- c("Otx2", "Gata3", "Pax5", "Pax7", "Sox14")

# Granular cell types - Neurotransmitters
#GABAergic neurons
GABAergic_genes <- c("Slc32a1","Gabra1","Gad2","Galntl6", "Grip1", "Gbx1")
#Glutamergic neurons
Glutamate_genes <- c("Slc17a6", "Slc17a7", "Slc17a8", "Arpp21","Car10","Sv2b","Camk4", "Pcsk2", "Brinp1")
#Cholinergic neurons
Choline_genes <- c("Chat","Slc18a3","Tbx20","Chrnb3","Nppb","Ush1c", "Npy4r")
#Dopaminergic neurons
Dopa_genes <- c("Slc6a3", "Chrna6", "Gucy2c", "Aldh1a7", "Tdh")
#Serotonin neurons
Sero_genes <- c("Slc6a4","Tph2") #not in this section
#Noradrenergic neurons 
Noradren_genes <- c("Slc6a2", "Dbh") #not in this section

Spot Decomposition with RCTD

Robust Cell Type Decomposition (RCTD) is a probabilistic method used to infer cell type composition in spatial transcriptomics data by mapping expression profiles to a single-cell RNA-seq reference. Each spatial bin is classified based on how confidently the model matches the expression to one or more cell types.

For the current human breast tissue VisiumHD data, we will utilize a single-cell reference dataset sourced from: Kumar, T., Nee, K., Wei, R. et al. A spatially resolved single-cell genomic atlas of the adult human breast. Nature 620, 181–191 (2023)

## load the RCTD reference made from Navin lab HBCA scRNA data
reference_path <- paste("THE_PATH_TO_Navin_HBCA_rds") 
reference <- readRDS(reference_path) #Read the mouse brain dataset - allen_scRNAseq_ref.Rds

### Create the RCTD reference object (for mouse scRNAseq data)
Idents(reference) <- "subclass_label"
counts <- reference[["RNA"]]$counts
cluster <- as.factor(reference$subclass_label)
nUMI <- reference$nCount_RNA
levels(cluster) <- gsub("/", "-", levels(cluster))
cluster <- droplevels(cluster)
reference <- Reference(counts, cluster, nUMI)

## prepare the query 
counts_hd <- GetAssayData(Object_8um_filteredQ, slot = "counts")
cortex_cells_hd <- colnames(Object_8um_filteredQ)
coords <- GetTissueCoordinates(Object_8um_filteredQ)[cortex_cells_hd, 1:2]
query <- SpatialRNA(coords, counts_hd, colSums(counts_hd))
  
## run RCTD (it take around 4 hours to run on my server)
Bin8_RCTD <- create.RCTD(query, reference, max_cores = 1, UMI_min = 10) ## the number of core need to set as 1 on core server, but it take all cores anyway
Bin8_RCTD <- run.RCTD(Bin8_RCTD, doublet_mode = "doublet")

## RCTD Result
Object_8um_filteredQ <- AddMetaData(Object_8um_filteredQ, metadata = Bin8_RCTD@results$results_df)
Object_8um_filteredQ$RCTD_CellType <- as.character(Object_8um_filteredQ$first_type)
Object_8um_filteredQ$RCTD_CellType[is.na(Object_8um_filteredQ$RCTD_CellType)] <- "Unknown"

## Skip waiting time by load pre-run rds object
Object_8um_filteredQ <- readRDS( paste(wd, SampleName,"_8um_Filtered_Subset10k.rds",sep=""))

Spot Decomposition with RCTD: Cell Type

## cell type
Idents(Object_8um_filteredQ) <- "RCTD_CellType"

## fix the cell type order
celltypes_8 <- as.character(Idents(Object_8um_filteredQ))
current_levels_8 <- unique(sort(celltypes_8))
unknown_levels_8 <- sort(current_levels_8[grepl("(?i)^unknown$", current_levels_8, perl = TRUE)])
other_levels_8 <- setdiff(current_levels_8, unknown_levels_8)
new_levels_8 <- c(other_levels_8, unknown_levels_8)
Object_8um_filteredQ$RCTD_CellType <- factor(celltypes_8, levels = new_levels_8)

Bin8_RCTD_celltype_VP <- VlnPlot(Object_8um_filteredQ, features = c("nCount_Spatial.008um"), group.by = "RCTD_CellType",pt.size = 0, log = TRUE, cols = DNA_col) + NoLegend()

Bin8_RCTD_dimplot_celltype <- DimPlot(Object_8um_filteredQ, reduction = "umap",pt.size = 1 , group.by = c("RCTD_CellType"), cols = DNA_col, ncol = 1) + ggtitle("UMAP of RCTD clustering")

## cluster markers
bin8_celltye_markers <- FindAllMarkers(Object_8um_filteredQ, only.pos = TRUE)
bin8_celltye_markers %>%
  group_by(cluster) %>%
  dplyr::filter(avg_log2FC > 1) %>%
  slice_head(n = 5) %>%
  ungroup() -> bin8_ct_top5

bin8_CT_Heat <- DoHeatmap(Object_8um_filteredQ, features = bin8_ct_top5$gene, size = 2.5) + theme(axis.text = element_text(size = 8)) + NoLegend()

Bin8_RCTD_CellType_DP <- SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,  
                                        group.by = "RCTD_CellType",
                                        pt.size.factor = 10, 
                                        image.alpha = 0.5,
                                        stroke = 0.1)  + 
                                        scale_fill_manual(values = DNA_col)

Bin8_RCTD_CellType_Pie <- PieChart_CellType(Object_8um_filteredQ)

Bin8_RCTD_CellType_ALL <- plot_grid(Bin8_RCTD_CellType_Pie,
                                    Bin8_RCTD_celltype_VP,
                                    bin8_CT_Heat,
                                    Bin8_RCTD_dimplot_celltype,
                                    Bin8_RCTD_CellType_DP, 
                                    rel_heights = c(0.2,0.2,0.3,0.3,0.3), ncol=1)
Bin8_RCTD_CellType_ALL

RCTD_VlnPlot

CSHLRCTD_Heatmap

UMAP_RCTD

RCTD_Tissue

Integrated Cell Type Mapping in VisiumHD

In Visium HD data, we combine canonical DEGs from unbiased graph-based clustering with RCTD reference inference for accurate cell type annotation. DEGs provide biologically interpretable markers, while RCTD deconvolves mixed spots and detects subtle populations

celltype_df <- Object_8um_filteredQ@meta.data %>%
  count(RCTD_CellType, seurat_clusters)

celltype_df_prop <- celltype_df %>%
  group_by(seurat_clusters) %>%
  mutate(freq = n / sum(n))

celltype_df_prop

Cell_type_Bar <- ggplot(celltype_df_prop, aes(x = seurat_clusters, y = freq, fill = RCTD_CellType)) +
  geom_bar(stat = "identity", position = "stack") +
  theme_minimal() +
  labs(x = "seurat_clusters", y = "Proportion", fill = "RCTD_CellType") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_manual(values = DNA_col)


Dimplot_Both <- DimPlot(Object_8um_filteredQ, group.by = c("seurat_clusters","RCTD_CellType"),label = TRUE)

SDimplot_Both <-  SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F, 
                                         group.by = c("seurat_clusters","RCTD_CellType"),
                                         pt.size.factor = 10, 
                                         image.alpha = 0.5,
                                         stroke = 0.1) + 
                                         scale_fill_manual(values = DNA_col)

plot_grid(Dimplot_Both, SDimplot_Both, Cell_type_Bar,
          rel_heights = c(0.3,0.4,0.3), ncol=1)

Naming Cluster for Cell-Type

Note: Cancer-Associated Fibroblasts (CAFs) in breast cancer support tumor growth and immune evasion through extracellular matrix (ECM) remodeling and cytokine secretion, characterized by the expression ofexpressed both fibroblast and tumor-associated signatures markers,

Idents(Object_8um_filteredQ) <- "seurat_clusters"

new.cluster.ids <- c("fibroblasts_0", "lumhr_1", "CAFs_2",
                     "bcells_3", "Endothelial_4", 
                     "tcells_5", "lumhr_6", 
                     "lymphatic_7","lumhr_8")

names(new.cluster.ids) <- levels(Object_8um_filteredQ)

Object_8um_filteredQ <- RenameIdents(Object_8um_filteredQ, new.cluster.ids)
Object_8um_filteredQ$CellType <- Idents(Object_8um_filteredQ)

Dimplot_CT <- DimPlot(Object_8um_filteredQ,label = TRUE,group.by = c("CellType"), cols = DNA_col,) + theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

SDimplot_CT <-  SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F, 
                                group.by = c("CellType"),pt.size.factor = 10, 
                                image.alpha = 0.5,stroke = 0.1) + 
                                scale_fill_manual(values = DNA_col) + 
                                theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

plot_grid(Dimplot_CT, SDimplot_CT, ncol=2)

Advance Spatial Analysis

Pathologist annotation of H&E images

Pathologists annotate H&E images by assessing key histological features such as cell shape, nuclear size, tissue architecture, and staining patterns to define regions like tumor, stroma, or immune areas. These expert-defined regions serve as a reference to validate spatial clustering, guide spot selection, and interpret gene expression patterns in Visium HD analysis.

Feature	TLS (Tertiary Lymphoid Structure)	DCIS (Ductal Carcinoma In Situ)	IDC (Invasive Ductal Carcinoma)
Nature	Immune structure	Pre-invasive lesion	Invasive cancer
Location	Near or within tumor microenvironment	Inside breast ducts	Infiltrates surrounding stroma
Invasiveness	Non-malignant	Non-invasive	Invasive
Biological Role	Reflects immune response	Early-stage precursor to IDC	Main form of malignant progression
Clinical Relevance	Associated with better prognosis	May progress to invasive cancer	Most common breast cancer type

copykat Infering CNV

Copykat is the Navin lab tool that infers genome-wide copy number variations (CNVs) from single-cell RNA-seq data, distinguishing aneuploid tumor cells from diploid normal cells. Please refer to our published paper Delineating copy number and clonal substructure in human tumors from single cell transcriptomes

SampleName <- "BreastCancer_10X_FF_copykat_8um"

## run copykat
objectQS_cpk <- copykat(rawmat=Object_8um_filteredQ@assays$Spatial.008um$counts,
        ngene.chr=5,
        win.size=25,
        KS.cut=0.2, ##  segmentation parameter 0.05-0.15, Increasing KS.cut decreases sensitivity, i.e. less segments/breakpoints
        min.gene.per.cell = 50, ## min gene
        sam.name=SampleName, distance="euclidean", n.cores=50)

knitr::include_graphics(paste0(SampleName,"_copykat_heatmap.jpeg"))

Cell Ploidy Mapping

ck <- read.table(paste(SampleName,"_copykat_prediction.txt",sep=""),header = T)

Object_8um_filteredQ@meta.data$CopyKat<-"Undefined"
diploid<-ck[which((ck$copykat.pred=="diploid")|(ck$copykat.pred=="c1:low.confidence")),]
aneuploid<-ck[which((ck$copykat.pred=="aneuploid")|(ck$copykat.pred=="c2:low.confidence")),]

Object_8um_filteredQ@meta.data[which(row.names(Object_8um_filteredQ@meta.data) %in% aneuploid$cell.names),"CopyKat"]<-"Aneuploid"
Object_8um_filteredQ@meta.data[which(row.names(Object_8um_filteredQ@meta.data) %in% diploid$cell.names),"CopyKat"]<-"Diploid"

saveRDS(Object_8um_filteredQ, paste(SampleName,"_8um_Filtered_Subset10k.rds",sep=""))

Bin8_dimplot_copykat <- DimPlot(Object_8um_filteredQ, reduction = "umap",pt.size = 1 , group.by = c("CopyKat"), ncol = 1) + ggtitle("UMAP of RCTD clustering") + theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

Bin8_Sdimplot_copykat <- SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,  
                        pt.size.factor = 10, 
                        image.alpha = 0.5,
                        stroke = 0.1,
                        group.by = c("CopyKat") ) + theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

plot_grid(Bin8_dimplot_copykat, Bin8_Sdimplot_copykat, ncol=2)

Input copykat ratio matrix to copykit for further analysis

library(copykit)
source('/Users/admin/Desktop/CSHL_2025/scRNA/Rscript/Copykit_copykat_functions.R')

CNA.test <- read.table(paste(SampleName,"_copykat_CNA_results.txt",sep="") ,header = T, check.names = FALSE)
colnames(CNA.test) <- sub("\\.1$", "-1", colnames(CNA.test))

cpkobj <- Copykit_copykat(CNA_result = CNA.test, genome_version = 'hg19',resolution = '200k',method = 'copykat')
assay(cpkobj, "segratio_unlog") <- 2^assay(cpkobj, "segment_ratios")
assay(cpkobj, "logr") <- assay(cpkobj, "segment_ratios")

meta <- colData(cpkobj) %>% data.frame()

#add celltype annotation
meta_8Q <- Object_8um_filteredQ@meta.data
meta_8Q$cell.name <- rownames(meta_8Q)

meta <- left_join(meta,meta_8Q)
colData(cpkobj)$CopyKat = meta$CopyKat
colData(cpkobj)$CellType = meta$CellType

color_heat = circlize::colorRamp2(breaks = c(-0.5,0,0.5), c("dodgerblue3", "white", "firebrick3"))
getwd()
#subset the data for quick demonstration
#cpkobj_sub = cpkobj[,sample(colnames(cpkobj),300,replace = F)]
pdf('Copykit_Heatmap_celltype.pdf',width = 10,height = 10)
print(plotHeatmap_copykatST(cpkobj,n_threads = 40, pt.name = 'test', order_cells = "hclust",
                            use_raster = T, assay = "segratio_unlog", col = color_heat,
                            row_split = 'CellType',
                            label = c('CopyKat','CellType')))
dev.off()

Key Observations

All three tumor clusters (lumhr_1, lumhr_6, lumhr_8) show strong CNV signals, and their spatial locations match well with pathologist annotations of IDC and DCIS, confirming their malignant nature.
lumhr_1 is located in the IDC region, while lumhr_6 and lumhr_8 are in DCIS areas. Although DCIS is non-invasive, both clusters already show clear CNV patterns, indicating early genetic changes.
CAFs_2 are strongly enriched around lumhr_1 in the IDC region, suggesting stromal involvement in tumor invasion.
The surrounding cell types differ between IDC and DCIS, reflecting changes in the tumor microenvironment during cancer progression.

Identification of Spatially Variable Features (Moran's I)

Moran’s I was used to identify genes with similar expression levels in neighboring spots across Visium HD data. This high-resolution spatial analysis highlights genes influenced by local tissue architecture and the tumor microenvironment, revealing region-specific biological activity within the tissue.

## the FindSpatiallyVariableFeatures only working with SCT assay for now
Object_8um_filteredQ <- SCTransform(Object_8um_filteredQ, assay = "Spatial.008um", verbose = FALSE)

Object_8um_filteredQ <- FindSpatiallyVariableFeatures(Object_8um_filteredQ, 
                                                      assay = "SCT",
                                                      features = VariableFeatures(Object_8um_filteredQ)[1:500],
                                                      selection.method = "moransi")

SpatialFeaturePlot(Object_8um_filteredQ, 
                   features = head(SpatiallyVariableFeatures(Object_8um_filteredQ, selection.method = "moransi"), 12),
                   ncol = 4,  max.cutoff = "q95", pt.size.factor = 12)

Spatial Domain Analysis

BANKSY BANKSY is a method for clustering spatial omics data by augmenting the features of each cell with both an average of the features of its spatial neighbors along with neighborhood feature gradients. BANKSY is applicable to a wide array of spatial technologies (e.g. 10x Visium, Slide-seq, MERFISH, CosMX, CODEX) and scales well to large datasets.

For more details, please refer to the paper - https://www.nature.com/articles/s41588-024-01664-3

##Load the Libraries
library(Banksy)
library(Banksyobjecturat)
library(BanksyobjecturatWrappers)
library(SummarizedExperiment)
library(SpatialExperiment)
library(scuttle)

library(scater)
library(cowplot)
library(ggplot2)

## Read in the Seurat object as a Banksy Spatial Experiment object
gcm1 <- Object_8um_filteredQ@assays$Spatial.008um$counts
coords <- GetTissueCoordinates(Object_8um_filteredQ)
coords<-coords[,-3]
Banksyobject <- SpatialExperiment(assay = list(counts = gcm1), spatialCoords = as.matrix(coords))

## Do basic normalization necessary for running Banksy
Banksyobject <- computeLibraryFactors(Banksyobject)
aname <- "normcounts"
assay(Banksyobject, aname) <- normalizeCounts(Banksyobject, log = FALSE)

## Run Banksy algorithm
lambda <- c(0.2)
k_geom <- c(15, 30)
Banksyobject <- Banksy::computeBanksy(Banksyobject, assay_name = aname, compute_agf = TRUE, k_geom = k_geom)
set.seed(1000)
Banksyobject <- Banksy::runBanksyPCA(Banksyobject, use_agf = TRUE, lambda = lambda)
Banksyobject <- Banksy::runBanksyUMAP(Banksyobject, use_agf = TRUE, lambda = lambda)
Banksyobject <- Banksy::clusterBanksy(Banksyobject, use_agf = TRUE, lambda = lambda, resolution = 1.2)
#Banksyobject <- Banksy::connectClusters(Banksyobject) [Not needed when working with just one lambda value]

## Put Banksy spatial domain information back into the Seurat Object for plotting and DE analysis

Object_8um_filteredQ$clust_M1_lam0.2_k50_res1.2<-Banksyobject$clust_M1_lam0.2_k50_res1.2

p1<-SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,
                   group.by = c("CellType"),pt.size.factor = 10,
                   image.alpha = 0.5,stroke = 0.1) +
  scale_fill_manual(values = DNA_col) +
  theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

p2<-SpatialDimPlot(Object_8um_filteredQ, label = F, repel = F,
                   group.by = c("clust_M1_lam0.2_k50_res1.2"),pt.size.factor = 10,
                   image.alpha = 0.5,stroke = 0.1) +
  scale_fill_manual(values = DNA_col) +
  theme(plot.margin = margin(0.2, 0.2, 0.2, 0.2), aspect.ratio = 1)

plot_grid(p1,p2)

## Identify DE genes for the Spatial Domains and plot heatmap
Idents(Object_8um_filteredQ)<-"clust_M1_lam0.2_k50_res1.2"

Banksy8_markers <- FindAllMarkers(Object_8um_filteredQ, only.pos = TRUE)
Banksy8_markers %>%
  group_by(cluster) %>%
  dplyr::filter(avg_log2FC > 1) %>%
  slice_head(n = 5) %>%
  ungroup() -> bin8_top5

DoHeatmap(Object_8um_filteredQ, features = bin8_top5$gene, size = 2.5) + theme(axis.text = element_text(size = 8)) + NoLegend()

Tumor Microenvironment(TME) Density Analysis

coords <- GetTissueCoordinates(Object_8um_filteredQ)
Object_8um_filteredQ$x <- coords$x
Object_8um_filteredQ$y <- coords$y

cellstateXY <- subset(Object_8um_filteredQ@meta.data,select = c("CellType","x","y"))
head(cellstateXY)

## Define the Tumor and Non-Tumor sections
cellstateXY_cancer <- cellstateXY[which(cellstateXY$CellType %in% c("lumhr_1","lumhr_6","lumhr_8")),] 
cellstateXY_TME <- cellstateXY[-which(cellstateXY$CellType %in% c("lumhr_1","lumhr_6","lumhr_8")),] 
head(cellstateXY_cancer)
 
Object_8um_filteredQ@meta.data$CancerAnnotation <- ifelse(colnames(Object_8um_filteredQ) %in% rownames(cellstateXY_cancer), "Cancer", "TME")
table(Object_8um_filteredQ@meta.data$CancerAnnotation)

print(paste("Tumor Cells: ",nrow(cellstateXY_cancer),
            "; TME Cells: ",nrow(cellstateXY_TME),sep=""))

## distance matrix
cancer_coords <- as.matrix(cellstateXY_cancer[, c("x", "y")])
rownames(cancer_coords) <- rownames(cellstateXY_cancer)

TME_coords <- as.matrix(cellstateXY_TME[, c("x", "y")])
rownames(TME_coords) <- rownames(cellstateXY_TME)
  
distance_matrix <- rdist(cancer_coords, TME_coords)
rownames(distance_matrix) <- rownames(cellstateXY_cancer)
colnames(distance_matrix) <-rownames(cellstateXY_TME)
  
distance_matrix[1:10,1:10]

Here, we first examined the dominant surrounding cell types within a 300-micron radius of each tumor cell. You may adjust the radius (or cell state) parameter to explore how local cell-type/cell-state composition changes.

distance_cutoff = 300
query = "CellType"
distance_matrix = distance_matrix
cellstateXY_TME = cellstateXY_TME
Object_8um_filteredQ = Object_8um_filteredQ

Object_8um_filteredQ_cancer <- TMEDensity(Object_8um_filteredQ,cellstateXY_TME,distance_matrix,"300","CellType")

result_matrix_TME<-Object_8um_filteredQ_cancer@meta.data[,28:33]

DimPlot2(Object_8um_filteredQ_cancer,pt.size = 0.1 ,reduction = "image", ncol = 4,theme = NoAxes(),
                 features = c("CellType","CopyKat",
                              colnames(result_matrix_TME))) + theme_umap_arrows()

Delaunay-Based Network

Giotto is an R package for analyzing and visualizing spatial transcriptomics data. In our study, we use it to build spatial networks and perform neighborhood enrichment analysis to explore cell–cell interactions in the tumor microenvironment.

A Delaunay-based network was used to connect each spot in the Visium HD data to nearby neighbors by forming triangles without overlaps. This creates a balanced and accurate map of local spatial relationships, even when spot positions are uneven. It helps capture the tissue structure and is useful for analyzing how cells interact within their neighborhood.

## convert seurat object for Giotto
Object_8um_filteredQG<-seuratToGiottoV5(sobject =Object_8um_filteredQ, spatial_assay ="Spatial.008um")

## Create Delaunay Network
Object_8um_filteredQG <- createSpatialDelaunayNetwork(Object_8um_filteredQG)

spatPlot2D(
  gobject = Object_8um_filteredQG,
  show_network = TRUE,
  point_size = 1.2,
  network_color = "grey",
  save_plot = TRUE,
 cell_color ="CellType"
)

Neighborhood Enrichment Analysis

Neighborhood Enrichment Analysis using a Delaunay-based network was used to study how different cell types are positioned near each other in Visium HD data. By comparing real cell–cell interactions to random patterns, we identified which cell types tend to cluster together or stay apart. These spatial patterns help reveal immune infiltration, stromal changes, and cell communication in the tumor microenvironment.

Object_8um_filteredQG_Enrich <- cellProximityEnrichment(
  gobject = Object_8um_filteredQG,
  cluster_column = "CellType",           # column in metadata
  spatial_network_name = "Delaunay_network",  # default name
  adjust_method = "fdr"                   # multiple test correction
)

## heatmap from cell-cell proximity scores
cellProximityHeatmap( Object_8um_filteredQG, 
                      Object_8um_filteredQG_Enrich, 
                      save_plot =TRUE)

## network from cell-cell proximity scores
cellProximityNetwork( Object_8um_filteredQG,
                      Object_8um_filteredQG_Enrich,
                      save_plot =TRUE)

## barplot from cell-cell proximity scores
cellProximityBarplot( Object_8um_filteredQG,
                      Object_8um_filteredQG_Enrich,
                      save_plot =TRUE)

Spatially proximal cell-cell communication (ligand-receptor Analysis)

Cell–cell communication analysis with CellChat v2 enables studying interactions from single-cell RNA-seq and spatial transcriptomics data using a simple quantitative ligand–receptor model that accounts for cofactors. It supports inference of spatially proximal communication between interacting cell groups from spatial data, and also includes an expanded CellChatDB with over 1,000 protein and non-protein interactions (e.g. metabolic, synaptic), and improved compatibility with other single-cell analysis tools.

Please check CellChat paper and Github for detail information.

Create Cellchat object from Seurat

Object_8um_filteredQ
Idents(Object_8um_filteredQ) <- "CellType"

## prepare the input from Seurat
data.input = Seurat::GetAssayData(Object_8um_filteredQ, slot = "data", assay = "SCT") 
meta = data.frame(labels = Seurat::Idents(Object_8um_filteredQ), samples = "sample1", row.names = names(Seurat::Idents(Object_8um_filteredQ)))
meta$samples <- factor(meta$samples)
head(meta)

# load spatial transcriptomics information
spatial.locs = Seurat::GetTissueCoordinates(Object_8um_filteredQ, scale = NULL) 
spatial.locs <- spatial.locs[,c(1,2)]
colnames(spatial.locs) <- c("imagerow", "imagecol")
head(spatial.locs)

## Spatial factors of spatial coordinates
spot.size = 8 # the theoretical spot size (um) selected in 10X VisiumHD
spot_diameter_fullres = 14.610919474051226 ## the spot_diameter_fullres value in scalefactors_json.json from VisiumHD output folder
conversion.factor = spot.size/spot_diameter_fullres
spatial.factors = data.frame(ratio = conversion.factor, tol = spot.size/2)
d.spatial <- computeCellDistance(coordinates = spatial.locs, ratio = spatial.factors$ratio, tol = spatial.factors$tol)
min(d.spatial[d.spatial!=0]) ## should close to 8, reflects the real physical center-to-center distance between adjacent bins

## Create a CellChat object
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels",
                           datatype = "spatial", coordinates = data.matrix(spatial.locs), spatial.factors = spatial.factors)

Set the ligand-receptor interaction database

CellChatDB <- CellChatDB.human # use CellChatDB.mouse if running on mouse data
showDatabaseCategory(CellChatDB)

# use a subset of CellChatDB for cell-cell communication analysis
#CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling", key = "annotation") # use Secreted Signaling
# use all CellChatDB except for "Non-protein Signaling" for cell-cell communication analysis (Recommend)
CellChatDB.use <- subsetDB(CellChatDB)

# set the used database in the object
cellchat@DB <- CellChatDB.use

Processing the cell-cell communication analysis

This processing step takes approximately 4–5 hours on our server; To save time, you can load the pre-processed object instead.

# subset the expression data of signaling genes for saving computation cost
cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database
## future::plan("multisession", workers = 80), there is a bug in cellchat v2, using one core is much faster than multisession

cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat, variable.both = F)
## The number of highly variable ligand-receptor pairs used for signaling inference is 1495 

cellchat <- computeCommunProb(
  cellchat,
  type = "truncatedMean",     # better for low-expression smoothing
  trim = 0.1,                 # less aggressive trimming
  nboot = 20,                 # fast check; increase to ≥100 for robust results
  distance.use = TRUE,        # essential for spatial modeling
  scale.distance = 0.1,       # calibrated for 8µm bins:  1 / min(d.spatial[d.spatial != 0]) ≈ 1 / 8 = 0.125
  contact.range = 10          # typical human cell size.
)

cellchat <- filterCommunication(cellchat, min.cells = 10)

cellchat <- computeCommunProbPathway(cellchat)
cellchat <- aggregateNet(cellchat)

Visualization of cell-cell communication network

groupSize <- as.numeric(table(cellchat@idents))
par(mfrow = c(1,2), xpd=TRUE)
netVisual_circle(cellchat@net$count, vertex.weight = rowSums(cellchat@net$count), weight.scale = T, label.edge= F, title.name = "Number of interactions")
netVisual_circle(cellchat@net$weight, vertex.weight = rowSums(cellchat@net$weight), weight.scale = T, label.edge= F, title.name = "Interaction weights/strength")

netVisual_heatmap(cellchat, measure = "count", color.heatmap = "Blues")

Infer the cell-cell communication at A signaling pathway level

The TGFβ pathway was chosen because it is critically involved in breast cancer progression, where it can promote invasion and metastasis during the transition from ductal carcinoma in situ (DCIS; lumhr_6) to invasive ductal carcinoma (IDC; lumhr_1).

## for specific pathway
cellchat@netP$pathways
pathways.show <- c("TGFb") 
# Circle plot
par(mfrow=c(1,1), xpd = TRUE) # `xpd = TRUE` should be added to show the title
netVisual_aggregate(cellchat, signaling = pathways.show, layout = "circle")

# Spatial plot
par(mfrow=c(1,1))
netVisual_aggregate(cellchat, signaling = pathways.show, layout = "spatial", edge.width.max = 2, vertex.size.max = 1, alpha.image = 0.2, vertex.label.cex = 3.5)

# Compute the network centrality scores
cellchat <- netAnalysis_computeCentrality(cellchat, slot.name = "netP") # the slot 'netP' means the inferred intercellular communication network of signaling pathways
# Visualize the computed centrality scores using heatmap, allowing ready identification of major signaling roles of cell groups
par(mfrow=c(1,1))
netAnalysis_signalingRole_network(cellchat, signaling = pathways.show, width = 8, height = 2.5, font.size = 10)

save your cellchat object

saveRDS(cellchat, file = "cellchat_visiumHD_BC_FF.rds")

Convert to Python Universe

library(SeuratDisk)
Object_8um_filteredQP <- Object_8um_filteredQ

coords <- GetTissueCoordinates(Object_8um_filteredQP)
Object_8um_filteredQP$x <- coords$x
Object_8um_filteredQP$y <- coords$y

meta <- Object_8um_filteredQP@meta.data
meta[] <- lapply(meta, function(col) if (is.factor(col)) as.character(col) else col)
Object_8um_filteredQP@meta.data <- meta
str(Object_8um_filteredQP@meta.data)

SaveH5Seurat(Object_8um_filteredQP, filename = "./Object_8um_filteredQ.h5Seurat",overwrite = TRUE)
Convert("Object_8um_filteredQ.h5Seurat", dest = "h5ad",overwrite = TRUE)

Spatial ToolBox

Software	Language	Unique Key Feature	GitHub-Link
Seurat	R	Flexible integration and analysis of single-cell and spatial transcriptomics data	Link
Giotto	R	process, analyze and visualize spatial multi-omics data at all scales and multiple resolutions	Link
Semla	R	Efficient spatial transcriptomics visualization and analysis tailored for Seurat objects	Link
CellChat	R	inference, visualization and analysis of cell-cell communication from single-cell and spatially resolved transcriptomics	Link
Squidpy	Python	Graph-based spatial analysis with image integration and neighborhood statistics	Link
StLearn	Python	Joint modeling of spatial distance, tissue morphology, and gene expression patterns	Link
StarDist	Python	Deep learning-based cell segmentation from microscopy images using star-convex polygons	Link
Bin2Cell	Python	Assigns VisiumHD gene expression bins to segmented cells (expansion with on StarDist result) for single-cell spatial resolution	Link

Session Information

To keep your analysis reproducible for yourself or others in the future, it is important to know which version of R you used, which packages you used, and the version of each package. Therefore, a good method to keep track of this is to use the sessionInfo() function. You should always output this at the end of your sessions.

sessionInfo()

## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Red Hat Enterprise Linux

## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.s

## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods   base

## other attached packages:
##  [1] future_1.40.0          arrow_17.0.0.1         lubridate_1.9.3        forcats_1.0.0          stringr_1.5.1          dplyr_1.1.4            purrr_1.0.4           
##  [8] readr_2.1.5            tidyr_1.3.1            tibble_3.2.1           tidyverse_2.0.0        Matrix_1.6-4           copykat_1.1.0          Rfast_2.0.6           
## [15] RcppZiggurat_0.1.8     Rcpp_1.0.14            ggrepel_0.9.6          rvest_1.0.3            kableExtra_1.3.4       ggpubr_0.6.0           magick_2.8.6          
## [22] RColorBrewer_1.1-3     plotly_4.10.4          patchwork_1.3.0        reshape2_1.4.4         SeuratExtend_1.1.6     SeuratExtendData_0.2.1 paletteer_1.5.0       
## [29] cowplot_1.1.3          fields_16.3.1          viridisLite_0.4.2      spam_2.11-1            flexdashboard_0.6.2    scCustomize_3.0.1      matrixStats_1.1.0     
## [36] dbplyr_2.3.4           biomaRt_2.54.1         schard_0.0.1           spacexr_2.2.1          ggplot2_3.5.2          Seurat_5.2.1           SeuratObject_5.0.2    
## [43] sp_2.2-0 

## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3         SparseM_1.84-2         ggprism_1.0.5          scattermore_1.2        coda_0.19-4            bit64_4.6.0-1          knitr_1.45            
##   [8] irlba_2.3.5.1          data.table_1.14.8      KEGGREST_1.38.0        RCurl_1.98-1.12        doParallel_1.0.17      generics_0.1.3         BiocGenerics_0.44.0   
##  [15] RSQLite_2.3.1          RANN_2.6.2             bit_4.6.0              tzdb_0.5.0             spatstat.data_3.1-6    webshot_0.5.5          xml2_1.3.5            
##  [22] httpuv_1.6.15          assertthat_0.2.1       xfun_0.40              hms_1.1.3              jquerylib_0.1.4        evaluate_0.23          promises_1.3.2        
##  [29] progress_1.2.3         igraph_2.1.4           DBI_1.1.3              htmlwidgets_1.6.4      mcmc_0.9-8             spatstat.geom_3.3-6    stats4_4.2.2          
##  [36] RSpectra_0.16-2        crosstalk_1.2.1        backports_1.4.1        MCMCpack_1.6-3         RcppParallel_5.1.10    deldir_2.0-4           vctrs_0.6.5           
##  [43] Biobase_2.58.0         quantreg_6.1           ROCR_1.0-11            abind_1.4-8            cachem_1.1.0           withr_3.0.2            progressr_0.15.1      
##  [50] sctransform_0.4.1      prettyunits_1.2.0      goftest_1.2-3          svglite_2.1.2          cluster_2.1.4          dotCall64_1.2          segmented_2.0-1       
##  [57] lazyeval_0.2.2         crayon_1.5.3           spatstat.explore_3.4-2 pkgconfig_2.0.3        labeling_0.4.3         GenomeInfoDb_1.34.9    nlme_3.1-163          
##  [64] vipor_0.4.7            rlang_1.1.6            globals_0.16.3         lifecycle_1.0.4        miniUI_0.1.1.1         MatrixModels_0.5-4     filelock_1.0.3        
##  [71] fastDummies_1.7.5      BiocFileCache_2.6.1    ggrastr_1.0.2          polyclip_1.10-7        RcppHNSW_0.6.0         lmtest_0.9-40          carData_3.0-5         
##  [78] zoo_1.8-14             beeswarm_0.4.0         ggridges_0.5.6         GlobalOptions_0.1.2    png_0.1-8              bitops_1.0-9           KernSmooth_2.23-22    
##  [85] Biostrings_2.66.0      blob_1.2.4             shape_1.4.6            spatstat.univar_3.1-2  parallelly_1.43.0      spatstat.random_3.3-3  rstatix_0.7.2         
##  [92] S4Vectors_0.36.2       ggsignif_0.6.4         scales_1.3.0           memoise_2.0.1          magrittr_2.0.3         plyr_1.8.9             ica_1.0-3             
##  [99] zlibbioc_1.44.0        compiler_4.2.2         fitdistrplus_1.2-2     snakecase_0.11.1       cli_3.6.2              XVector_0.38.0         listenv_0.9.1         
## [106] pbapply_1.7-2          MASS_7.3-60            tidyselect_1.2.1       stringi_1.8.7          yaml_2.3.10            sass_0.4.10            tools_4.2.2           
## [113] timechange_0.2.0       future.apply_1.11.3    parallel_4.2.2         parallelDist_0.2.6     circlize_0.4.15        rstudioapi_0.15.0      foreach_1.5.2         
## [120] janitor_2.2.0          gridExtra_2.3          farver_2.1.2           Rtsne_0.17             digest_0.6.33          shiny_1.10.0           car_3.1-2             
## [127] broom_1.0.5            later_1.4.2            RcppAnnoy_0.0.22       httr_1.4.7             AnnotationDbi_1.60.2   kernlab_0.9-32         colorspace_2.1-1      
## [134] XML_3.99-0.18          tensor_1.5             reticulate_1.42.0      IRanges_2.32.0         splines_4.2.2          uwot_0.2.3             rematch2_2.1.2        
## [141] spatstat.utils_3.1-3   systemfonts_1.0.5      xtable_1.8-4           jsonlite_2.0.0         R6_2.6.1               pillar_1.10.2          htmltools_0.5.8.1     
## [148] mime_0.13              glue_1.8.0             fastmap_1.2.0          codetools_0.2-19       maps_3.4.2.1           lattice_0.22-4         bslib_0.9.0           
## [155] spatstat.sparse_3.1-0  mixtools_2.0.0         curl_5.2.0             ggbeeswarm_0.7.2       survival_3.5-7         rmarkdown_2.29         munsell_0.5.1         
## [162] GenomeInfoDbData_1.2.9 iterators_1.0.14       gtable_0.3.6

For more info about us click here.

5 Spatial RNA: VisiumHD

Visium HD Spatial Gene Expression (10x Genomics)

Technology Key Features

Applications

The difference of VisiumHD and Visium

Choosing the Right Bin Size in Visium HD: Implications from Human Cell Size

Why Bin Size Matters

Best Practices

Quick prep before we dive in!!

Data Downloads

Check 10X Web_Summary

Let’s dive in !!

installing R packages

loading R packages

Functions in R

Data Preprocessing with Seurat 5

Create Seurat spatial Objects

load 10X default graph cluster

QC metrics Visualization in 8 um Bin

Filtering criterion for 8um Bin

checking the Filtering result

UMAP Cluster and Marker Genes for 8um Bin (10k bins subset)

Specific mouse markers to explore in your own data

Spot Decomposition with RCTD

Spot Decomposition with RCTD: Cell Type

Integrated Cell Type Mapping in VisiumHD

Naming Cluster for Cell-Type

Advance Spatial Analysis

Pathologist annotation of H&E images

copykat Infering CNV

Cell Ploidy Mapping

Input copykat ratio matrix to copykit for further analysis

Key Observations

Identification of Spatially Variable Features (Moran's I)

Spatial Domain Analysis

Tumor Microenvironment(TME) Density Analysis

Delaunay-Based Network

Neighborhood Enrichment Analysis

Spatially proximal cell-cell communication (ligand-receptor Analysis)

Create Cellchat object from Seurat

Set the ligand-receptor interaction database

Processing the cell-cell communication analysis

Visualization of cell-cell communication network

Infer the cell-cell communication at A signaling pathway level

save your cellchat object

Convert to Python Universe

Spatial ToolBox

Session Information

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally