Create dataset

In this notebook, we’ll have a look at how you can create a Seurat object compatible with semla.

Load libraries

library(semla)
library(magick)
library(jsonlite)
library(tibble)

Load coordinates

The Staffli object requires image and coordinate data that can be retrieved from the spaceranger output files.

Below we’ll use the mouse brain and mouse colon example data:

he_imgs <- c(system.file("extdata/mousebrain", 
                         "spatial/tissue_lowres_image.jpg", 
                          package = "semla"),
             system.file("extdata/mousecolon", 
                         "spatial/tissue_lowres_image.jpg", 
                         package = "semla"))
spotfiles <- c(system.file("extdata/mousebrain", 
                           "spatial/tissue_positions_list.csv", 
                           package = "semla"),
               system.file("extdata/mousecolon", 
                           "spatial/tissue_positions_list.csv", 
                           package = "semla"))
jsonfiles <- c(system.file("extdata/mousebrain", 
                           "spatial/scalefactors_json.json", 
                           package = "semla"),
               system.file("extdata/mousecolon", 
                           "spatial/scalefactors_json.json", 
                           package = "semla"))

When loading coordinates from multiple samples, you need to change the barcode IDs so that the suffix matches their respective sampleID. For instance, a barcode in sample 1 might be called CATACAAAGCCGAACC-1 and in sample 2 the same barcode should be CATACAAAGCCGAACC-2. The coordinates tibble below contains the barcode IDs, the pixel coordinates and a column with the sampleIDs.

semla provides LoadSpatialCoordinates to make the task a little easier. Note that the suffix for the barcodes have changed for sample 2.

# Read coordinates
coordinates <- LoadSpatialCoordinates(spotfiles) |> 
  select(all_of(c("barcode", "pxl_col_in_fullres", "pxl_row_in_fullres", "sampleID")))

## ℹ Loading coordinates:

## →   Finished loading coordinates for sample 1

## →   Finished loading coordinates for sample 2

## ℹ Collected coordinates for 5164 spots.

head(coordinates, n = 2)

## # A tibble: 2 × 4
##   barcode            pxl_col_in_fullres pxl_row_in_fullres sampleID
##   <chr>                           <int>              <int>    <int>
## 1 CATACAAAGCCGAACC-1               6086               4117        1
## 2 CTGAGCAAGTAACAAG-1               5062               4472        1

tail(coordinates, n = 2)

## # A tibble: 2 × 4
##   barcode            pxl_col_in_fullres pxl_row_in_fullres sampleID
##   <chr>                           <int>              <int>    <int>
## 1 GAAGGAGTCGAGTGCG-2               4854               6829        2
## 2 TGACCAAATCTTAAAC-2               4957               6829        2

# Check number of spots per sample
table(coordinates$sampleID)

## 
##    1    2 
## 2560 2604

Fetch image info

Next, we’ll fetch meta data for the sample H&E images. Here, the tibble must contain the width and height of he_imgs as well as a sampleID column. Use LoadImageInfo to load the image info:

# Create image_info
image_info <- LoadImageInfo(he_imgs)
image_info

## # A tibble: 2 × 9
##   format width height colorspace matte filesize density sampleID type         
##   <chr>  <int>  <int> <chr>      <lgl>    <int> <chr>   <chr>    <chr>        
## 1 JPEG     565    600 sRGB       FALSE   106233 72x72   1        tissue_lowres
## 2 JPEG     600    541 sRGB       FALSE   141710 72x72   2        tissue_lowres

Define image dimensions

The jsonfiles contain scaling factors that allows us to find the original H&E image dimensions. We can load the scale factors with read_json and store them in a tibble. The tibble should also contain a ‘sampleID’ column. Use LoadScaleFactors to load the scaling factors:

# Read scalefactors
scalefactors <- LoadScaleFactors(jsonfiles)
scalefactors

## # A tibble: 2 × 5
##   spot_diameter_fullres tissue_hires_scalef fiducial_diameter_fullres
##                   <dbl>               <dbl>                     <dbl>
## 1                 143.                0.104                      215.
## 2                  67.0               0.202                      108.
## # ℹ 2 more variables: tissue_lowres_scalef <dbl>, sampleID <chr>

Once we have the scale factors, we can add additional columns to image_info which specify the dimensions of the original H&E images. Use UpdateImageInfo to update image_info with the scaling factors:

# Add additional columns to image_info using scale factors
image_info <- UpdateImageInfo(image_info, scalefactors)
image_info

## # A tibble: 2 × 10
##   format width height full_width full_height colorspace filesize density
##   <chr>  <int>  <int>      <dbl>       <dbl> <chr>         <int> <chr>  
## 1 JPEG     565    600     18120.      19242. sRGB         106233 72x72  
## 2 JPEG     600    541      9901.       8927. sRGB         141710 72x72  
## # ℹ 2 more variables: sampleID <chr>, type <chr>

Finally, we are ready to create the Staffli object. Here we’ll provide the he_imgs so that we can load the images later with LoadImages. The coordinates are stored in the meta_data slot. image_info will be used by the plot functions provided in semla to define the dimensions of the plot area.

# Create Staffli object
staffli_object <- CreateStaffliObject(imgs = he_imgs, 
                                      meta_data = coordinates, 
                                      image_info = image_info, 
                                      scalefactors = scalefactors)
staffli_object

## An object of class Staffli 
## 5164 spots across 2 samples.

Use a `Staffli` object in `Seurat`

Now let’s see how we can incorporate our Staffli object in Seurat. First, we’ll load the gene expression matrices for our samples and merge them.

# Get paths for expression matrices
expr_matrix_files <- he_imgs <- c(system.file("extdata/mousebrain", 
                         "filtered_feature_bc_matrix.h5", 
                          package = "semla"),
             system.file("extdata/mousecolon", 
                         "filtered_feature_bc_matrix.h5", 
                         package = "semla"))

Before merging the matrices, it’s important to rename the barcodes so that they match the barcodes in Staffli_object.

exprMatList <- lapply(seq_along(expr_matrix_files), function(i) {
  exprMat <- Seurat::Read10X_h5(expr_matrix_files[i])
  colnames(exprMat) <- gsub(pattern = "-\\d*", # Replace barcode suffix with sampleID
                          replacement = paste0("-", i), 
                          x = colnames(exprMat))
  return(exprMat)
})

# Merge expression matrices
mergedExprMat <- SeuratObject::RowMergeSparseMatrices(exprMatList[[1]], exprMatList[[2]])

Alternatively, use the LoadAndMergeMatrices function from semla:

mergedExprMat <- LoadAndMergeMatrices(samplefiles = expr_matrix_files, verbose = FALSE)

# Create Seurat object
se <- SeuratObject::CreateSeuratObject(counts = mergedExprMat)

Rearrange Staffli object to match Seurat object. If the barcode are mismatched, make sure to subset the two object to contain intersecting barcodes.

staffli_object@meta_data <- staffli_object@meta_data[match(colnames(se), staffli_object@meta_data$barcode), ]

Check that the Staffli object matches Seurat object:

all(staffli_object@meta_data$barcode == colnames(se))

## [1] TRUE

Place Staffli object in the Seurat object:

se@tools$Staffli <- staffli_object

Now the semla functions should be compatible with the Seurat object!

MapFeatures(se, features = "nFeature_RNA")

se <- LoadImages(se, verbose = FALSE)
MapFeatures(se, features = "nFeature_RNA", image_use = "raw", override_plot_dims = TRUE)

Package version

semla: 1.4.0

Session info

sessionInfo()

## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20.0.0
## Running under: macOS Sequoia 15.5
## 
## Matrix products: default
## BLAS/LAPACK: /Users/javierescudero/miniconda3/envs/r-semlaupd/lib/libopenblas.0.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Stockholm
## tzcode source: system (macOS)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tibble_3.2.1       jsonlite_1.9.0     magick_2.8.7       semla_1.4.0       
## [5] ggplot2_3.5.2      dplyr_1.1.4        Seurat_5.3.0       SeuratObject_5.1.0
## [9] sp_2.2-0          
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3     rstudioapi_0.17.1      magrittr_2.0.3        
##   [4] spatstat.utils_3.1-4   farver_2.1.2           rmarkdown_2.29        
##   [7] fs_1.6.5               ragg_1.3.3             vctrs_0.6.5           
##  [10] ROCR_1.0-11            spatstat.explore_3.4-3 htmltools_0.5.8.1     
##  [13] forcats_1.0.0          sass_0.4.9             sctransform_0.4.2     
##  [16] parallelly_1.42.0      KernSmooth_2.23-26     bslib_0.9.0           
##  [19] htmlwidgets_1.6.4      desc_1.4.3             ica_1.0-3             
##  [22] plyr_1.8.9             plotly_4.11.0          zoo_1.8-13            
##  [25] cachem_1.1.0           igraph_2.1.4           mime_0.12             
##  [28] lifecycle_1.0.4        pkgconfig_2.0.3        Matrix_1.7-2          
##  [31] R6_2.6.1               fastmap_1.2.0          fitdistrplus_1.2-3    
##  [34] future_1.34.0          shiny_1.10.0           digest_0.6.37         
##  [37] colorspace_2.1-1       patchwork_1.3.1        tensor_1.5.1          
##  [40] RSpectra_0.16-2        irlba_2.3.5.1          textshaping_0.4.0     
##  [43] labeling_0.4.3         progressr_0.15.1       spatstat.sparse_3.1-0 
##  [46] httr_1.4.7             polyclip_1.10-7        abind_1.4-5           
##  [49] compiler_4.4.2         bit64_4.5.2            withr_3.0.2           
##  [52] fastDummies_1.7.5      MASS_7.3-64            tools_4.4.2           
##  [55] lmtest_0.9-40          httpuv_1.6.15          future.apply_1.11.3   
##  [58] goftest_1.2-3          glue_1.8.0             dbscan_1.2.2          
##  [61] nlme_3.1-167           promises_1.3.2         grid_4.4.2            
##  [64] Rtsne_0.17             cluster_2.1.8          reshape2_1.4.4        
##  [67] generics_0.1.3         hdf5r_1.3.12           gtable_0.3.6          
##  [70] spatstat.data_3.1-6    tidyr_1.3.1            data.table_1.17.0     
##  [73] utf8_1.2.4             spatstat.geom_3.4-1    RcppAnnoy_0.0.22      
##  [76] ggrepel_0.9.6          RANN_2.6.2             pillar_1.10.1         
##  [79] stringr_1.5.1          spam_2.11-1            RcppHNSW_0.6.0        
##  [82] later_1.4.1            splines_4.4.2          lattice_0.22-6        
##  [85] bit_4.5.0.1            survival_3.8-3         deldir_2.0-4          
##  [88] tidyselect_1.2.1       miniUI_0.1.1.1         pbapply_1.7-2         
##  [91] knitr_1.50             gridExtra_2.3          scattermore_1.2       
##  [94] xfun_0.53              matrixStats_1.5.0      stringi_1.8.4         
##  [97] lazyeval_0.2.2         yaml_2.3.10            evaluate_1.0.5        
## [100] codetools_0.2-20       cli_3.6.4              uwot_0.2.3            
## [103] xtable_1.8-4           reticulate_1.42.0      systemfonts_1.2.1     
## [106] munsell_0.5.1          jquerylib_0.1.4        Rcpp_1.1.0            
## [109] globals_0.16.3         spatstat.random_3.4-1  zeallot_0.2.0         
## [112] png_0.1-8              spatstat.univar_3.1-3  parallel_4.4.2        
## [115] pkgdown_2.1.1          dotCall64_1.2          listenv_0.9.1         
## [118] viridisLite_0.4.2      scales_1.3.0           ggridges_0.5.6        
## [121] purrr_1.0.4            rlang_1.1.5            cowplot_1.1.3         
## [124] shinyjs_2.1.0

Last compiled: 08 September 2025

Load libraries

Load coordinates

Fetch image info

Define image dimensions

Use a Staffli object in Seurat

Use a `Staffli` object in `Seurat`