Skip to contents

Recently, more and more spatial multimodal data are becoming available, however, the alignment and processing of some of these modalities can be challenging. One such example is the integration of Visium data with Mass Spectrometry Imaging (MSI) data, which can experimentally either be performed on the same or adjacent tissue sections. These two technologies produce data with different spatial resolution and mismatched spatial coordinate systems. In the case of MSI data, there is further a challenge with non-standardized file formats.

To perform the task of integrating two spatial omics datasets into shared coordinates, we have developed a new function within semla that joins two semla objects into a single object, with each modality present as separate assays/layers.

A few pre-requisites are required before joining the two samples, the primary one being that the spatial coordinates of the objects first needs to have been aligned and transformed into the same pixel dimensions. And secondly, both data sets needs to be available in the standardized Space Ranger output file formats. To for instance obtained aligned and properly preprocessed Visium and MSI spatial multimodal datasets, the pipeline MAGPIE can be used to perform these tasks in a streamlined fashion.

For this tutorial, we have processed mouse lung tissue with both Visium and MSI, from consecutive sections. The MSI data was first preprocessed and thereafter spatially aligned into the matching Visium coordinate system using MAGPIE, which produces output files in the Space Ranger format for the aligned MSI data. In this dataset, we also have image of the H&E stained tissue section that used to obtain the MSI data.

Let’s get started!

col_scale_rocket <- viridis::rocket(11, direction = -1)
col_scale_mako <- viridis::mako(11, direction = -1)

Load data

We will start by loading the two datasets, starting with the Visium data.

infoTable_visium <- data.frame(
  samples = file.path("multimodal_data/mm_bleo_visium", "filtered_feature_bc_matrix.h5"),
  imgs = file.path("multimodal_data/mm_bleo_visium", "spatial/tissue_hires_image.png"),
  spotfiles = file.path("multimodal_data/mm_bleo_visium", "spatial/tissue_positions_list.csv"),
  json = file.path("multimodal_data/mm_bleo_visium", "spatial/scalefactors_json.json")
)

se_visium <- ReadVisiumData(infoTable = infoTable_visium, 
                            assay = "Visium", 
                            remove_spots_outside_HE = T, 
                            remove_spots_outside_tissue = T)
## 
## ── Reading 10x Visium data ──
## 
##  Loading matrices:
## →   Finished loading expression matrix 1
##    There are 32285 features and 4108 spots in the expression matrix.
##  Loading coordinates:
## →   Finished loading coordinates for sample 1
##  Collected coordinates for 4108 spots.
## 
## ── Creating `Seurat` object
##  Expression matrices and coordinates are compatible
##  Created `Seurat` object
##  Created `Staffli` object
##  Returning a `Seurat` object with 32285 features and 4108 spots
MapFeatures(se_visium, features = "nCount_Visium", colors = col_scale_rocket)



Next, we read the MSI data available in the same standarised Space Ranger format. To make it easier to separate the two objects, we can pass a unique name to the assay argument.

infoTable_msi <- data.frame(
  samples = file.path("multimodal_data/mm_bleo_msi", "filtered_feature_bc_matrix.h5"),
  imgs = file.path("multimodal_data/mm_bleo_msi", "spatial/tissue_hires_image.png"),
  spotfiles = file.path("multimodal_data/mm_bleo_msi", "spatial/tissue_positions_list.csv"),
  json = file.path("multimodal_data/mm_bleo_msi", "spatial/scalefactors_json.json")
)

se_msi <- ReadVisiumData(infoTable = infoTable_msi, 
                         assay = "MSI", 
                         remove_spots_outside_HE = T, 
                         remove_spots_outside_tissue = T)
## 
## ── Reading 10x Visium data ──
## 
##  Loading matrices:
## →   Finished loading expression matrix 1
##    There are 2633 features and 14824 spots in the expression matrix.
##  Loading coordinates:
## →   Finished loading coordinates for sample 1
##  Collected coordinates for 14824 spots.
## 
## ── Creating `Seurat` object
##  Expression matrices and coordinates are compatible
## ! Found 627 spot(s) with x coordinates outside of the H&E image for sample 1
## ! Found 2855 spot(s) with y coordinates outside of the H&E image for sample 1
## ! Removing 3482 spots from the dataset
##  Created `Seurat` object
##  Created `Staffli` object
##  Returning a `Seurat` object with 2633 features and 11719 spots
MapFeatures(se_msi, features = "nCount_MSI", colors = col_scale_mako, pt_size = 0.6, min_cutoff = 0.05)


Here we see the transformed MSI data visualized in the same manner as we are used to see Visium data. The spatial resolution of MSI data is a lot higher, and since the coordinates here have been aligned to match the Visium H&E image, it is no longer present in a perfect grid but can rather appear slightly distorted.

If we want, we can load the MSI H&E image and view it together with the data.

se_msi_he <- LoadImages(se_msi)
## 
## ── Loading H&E images ──
## 
##  Loading image from multimodal_data/mm_bleo_msi/spatial/tissue_hires_image.png
##  Scaled image from 1945x2000 to 400x411 pixels
##  Saving loaded H&E images as 'rasters' in Seurat object
MapFeatures(se_msi_he, features = "nCount_MSI", colors = col_scale_mako, 
            pt_size = 0.4,
            image_use = "raw")

Data processing

Next, we will perform the standard processing of the expression data, starting with the Visium data.

se_visium <- se_visium |> 
  NormalizeData() |>
  ScaleData() |>
  FindVariableFeatures()
## Normalizing layer: counts
## Centering and scaling data matrix
## Finding variable features for layer counts

For the MSI data, the intensity values have already been normalized based on total ion count prior to creating the input data. However, to ease the integration and shorten the processing time, we will filter the data and only include the most spatially variable metabolites.

First we can have a look at the distribution of metabolites and peak intensities across pixels to thereafter determine reasonable cutoffs.

VlnPlot(se_msi, features = c("nCount_MSI", "nFeature_MSI"), 
        pt.size = 0, 
        ncol = 2)
## Warning: Default search for "data" layer in "MSI" assay yielded no results;
## utilizing "counts" layer instead.
## Warning: The `slot` argument of `FetchData()` is deprecated as of SeuratObject 5.0.0.
##  Please use the `layer` argument instead.
##  The deprecated feature was likely used in the Seurat package.
##   Please report the issue at <https://github.com/satijalab/seurat/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: `PackageCheck()` was deprecated in SeuratObject 5.0.0.
##  Please use `rlang::check_installed()` instead.
##  The deprecated feature was likely used in the Seurat package.
##   Please report the issue at <https://github.com/satijalab/seurat/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

#' Pixel filter
min_count_per_spot <- 14e5
min_peaks_per_spot <- 300
msi_coord_keep <- rownames(subset(se_msi@meta.data, 
                                  nFeature_MSI > min_peaks_per_spot & nCount_MSI > min_count_per_spot))

se_msi_filtered <- SubsetSTData(se_msi, spots = msi_coord_keep)
se_msi_filtered
## An object of class Seurat 
## 2633 features across 10770 samples within 1 assay 
## Active assay: MSI (2633 features, 0 variable features)
##  1 layer present: counts
# Metabolite filter
se_msi_filtered <- se_msi_filtered |> FindVariableFeatures()
## Finding variable features for layer counts
cor_features <- CorSpatialFeatures(se_msi_filtered, assay_use = "MSI", slot_use = "counts", verbose = T)
## 
## ── Computing spatial autocorrelation ──
## 
##  Sample 1:
## →   Cleaned out spots with 0 adjacent neighbors
## →   Computed feature lag expression
## →   Computed feature spatial autocorrelation scores
##    Returning results
top_n <- 500
top_metabolites <- cor_features[[1]] |> head(n = top_n) |> pull(gene)

se_msi_filtered <- SubsetSTData(se_msi, features = top_metabolites)

Create the spatial multimodal object

Now, each object is ready for being joined into a single object by using the CreateMultiModalObject() function. What this function does, is to use the second modality object (MSI in this case) and map it into the reference spatial object (here Visium). Since the spatial resolution of MSI is higher than for Visium, the function will detect and aggregate data from multiple MSI pixels per Visium spot, taking either mean of the sum of the MSI measurements (as specified with the agg_func argument).

se_mmo <- CreateMultiModalObject(object_ref = se_visium, 
                                 object_map = se_msi_filtered,
                                 agg_func = "mean", 
                                 new_assay_name = "MSI")
## 
## ── Joining second modality data to reference object ──
## 
##  Excluding 4651 coordinates from the mapping dataset due to being outside of reference coordinate scope
##  Aggregating assay data in mapping coordinates per reference coordinate (this step may take long if the data is large)
##  Using 7 threads
##  Data aggregation finished in 0.09 minutes
##  Storing aggregated data summarized by mean values
##  Adding second modality data to a new 'Assay' named 'MSI', containing 500 features
##  Returning a `Seurat` object with 32285 features and 4086 spots
MapFeatures(se_mmo, 
            features = c("nCount_Visium", "nFeature_Visium"), 
            ncol = 1, 
            colors = col_scale_rocket) |
  MapFeatures(se_mmo, 
              features = c("nCount_MSI", "nFeature_MSI"), 
              ncol = 1, 
              colors = col_scale_mako)

As seen here, we now have the MSI and Visium data within the same object and coordinate system, allowing us to perform any desired downstream analyses, e.g. factor analysis with NMF or MOFA.

As with any semla object, we can read in the tissue image and in this case the Visium H&E image will be selected by default as it acted as our reference dataset.

se_mmo <- LoadImages(se_mmo)
## 
## ── Loading H&E images ──
## 
##  Loading image from multimodal_data/mm_bleo_visium/spatial/tissue_hires_image.png
##  Scaled image from 1945x2000 to 400x411 pixels
##  Saving loaded H&E images as 'rasters' in Seurat object
spot_size <- 0.6
MapFeatures(se_mmo, 
            features = c("nCount_Visium"), pt_size = spot_size,
            colors = col_scale_rocket, image_use = "raw") |
  MapFeatures(se_mmo, 
              features = c("nCount_MSI"), pt_size = spot_size,
              colors = col_scale_mako, image_use = "raw")


Package version
  • semla: 1.4.1
Session info
## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20.0.0
## Running under: macOS Sequoia 15.7.3
## 
## Matrix products: default
## BLAS/LAPACK: /Users/javierescudero/miniconda3/envs/r-semlaupd/lib/libopenblas.0.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Stockholm
## tzcode source: system (macOS)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] patchwork_1.3.1    semla_1.4.1        ggplot2_3.5.2      dplyr_1.1.4       
## [5] Seurat_5.3.0       SeuratObject_5.1.0 sp_2.2-0          
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3     rstudioapi_0.17.1      jsonlite_1.9.0        
##   [4] magrittr_2.0.3         spatstat.utils_3.1-4   magick_2.8.7          
##   [7] farver_2.1.2           rmarkdown_2.29         fs_1.6.5              
##  [10] ragg_1.3.3             vctrs_0.6.5            ROCR_1.0-11           
##  [13] spatstat.explore_3.4-3 htmltools_0.5.8.1      forcats_1.0.0         
##  [16] sass_0.4.9             sctransform_0.4.2      parallelly_1.42.0     
##  [19] KernSmooth_2.23-26     bslib_0.9.0            htmlwidgets_1.6.4     
##  [22] desc_1.4.3             ica_1.0-3              plyr_1.8.9            
##  [25] plotly_4.11.0          zoo_1.8-13             cachem_1.1.0          
##  [28] igraph_2.1.4           mime_0.12              lifecycle_1.0.4       
##  [31] pkgconfig_2.0.3        Matrix_1.7-2           R6_2.6.1              
##  [34] fastmap_1.2.0          fitdistrplus_1.2-3     future_1.34.0         
##  [37] shiny_1.10.0           digest_0.6.37          colorspace_2.1-1      
##  [40] tensor_1.5.1           RSpectra_0.16-2        irlba_2.3.5.1         
##  [43] textshaping_0.4.0      labeling_0.4.3         progressr_0.15.1      
##  [46] spatstat.sparse_3.1-0  httr_1.4.7             polyclip_1.10-7       
##  [49] abind_1.4-5            compiler_4.4.2         bit64_4.5.2           
##  [52] withr_3.0.2            viridis_0.6.5          fastDummies_1.7.5     
##  [55] float_0.3-3            MASS_7.3-64            tools_4.4.2           
##  [58] lmtest_0.9-40          httpuv_1.6.15          future.apply_1.11.3   
##  [61] goftest_1.2-3          glue_1.8.0             dbscan_1.2.2          
##  [64] nlme_3.1-167           promises_1.3.2         grid_4.4.2            
##  [67] Rtsne_0.17             cluster_2.1.8          reshape2_1.4.4        
##  [70] generics_0.1.3         hdf5r_1.3.12           gtable_0.3.6          
##  [73] spatstat.data_3.1-6    tidyr_1.3.1            data.table_1.17.0     
##  [76] spatstat.geom_3.4-1    RcppAnnoy_0.0.22       ggrepel_0.9.6         
##  [79] RANN_2.6.2             pillar_1.10.1          stringr_1.5.1         
##  [82] spam_2.11-1            RcppHNSW_0.6.0         later_1.4.1           
##  [85] splines_4.4.2          lattice_0.22-6         bit_4.5.0.1           
##  [88] survival_3.8-3         deldir_2.0-4           tidyselect_1.2.1      
##  [91] miniUI_0.1.1.1         pbapply_1.7-2          knitr_1.50            
##  [94] gridExtra_2.3          scattermore_1.2        RhpcBLASctl_0.23-42   
##  [97] xfun_0.53              matrixStats_1.5.0      stringi_1.8.4         
## [100] lazyeval_0.2.2         yaml_2.3.10            evaluate_1.0.5        
## [103] codetools_0.2-20       tibble_3.2.1           cli_3.6.4             
## [106] uwot_0.2.3             xtable_1.8-4           reticulate_1.42.0     
## [109] systemfonts_1.3.1      munsell_0.5.1          jquerylib_0.1.4       
## [112] Rcpp_1.1.0             globals_0.16.3         spatstat.random_3.4-1 
## [115] zeallot_0.2.0          png_0.1-8              spatstat.univar_3.1-3 
## [118] parallel_4.4.2         pkgdown_2.1.1          dotCall64_1.2         
## [121] listenv_0.9.1          viridisLite_0.4.2      MatrixExtra_0.1.15    
## [124] scales_1.3.0           ggridges_0.5.6         crayon_1.5.3          
## [127] purrr_1.0.4            rlang_1.1.5            cowplot_1.1.3         
## [130] shinyjs_2.1.0