Read spaceranger output files

This function serves as a wrapper for LoadAndMergeMatrices and LoadSpatialCoordinates to load spaceranger output files and create a Seurat object. The spatial information, i.e. images and spot coordinates, are stored inside the tools slot of the Seurat object in Staffli object.

Usage

ReadVisiumData(
  infoTable,
  assay = "Spatial",
  remove_spots_outside_HE = FALSE,
  remove_spots_outside_tissue = TRUE,
  verbose = TRUE,
  ...
)

Arguments

infoTable: A data.frame or tbl with paths to spaceranger output files
assay: Assay name (default = "Spatial")
remove_spots_outside_HE: Should spots outside the H&E be removed? This option can be useful for CytAssist data when the H&E image only cover a smaller part of the entire tissue section.
remove_spots_outside_tissue: Should spots outside the tissue be removed?
verbose: Print messages
...: Parameters passed to CreateSeuratObject

Value

A Seurat object with additional spatial information stored in a Staffli object

Details

ReadVisiumData takes a data.frame like table as input that should hold certain spaceranger output file paths. The table should consist of four columns: "samples", "imgs", "spotfiles" and "json".

"samples" : file paths to expression matrices, e.g. filtered_bc_matrix.h5
"imgs" : file paths to images, e.g. tissue_hires_image.jpg
"spotfiles" : file paths to spot coordinate CSV files tissue_positions_list.csv
"json" : file paths to scale factor JSON files, e.g. scalefactors_json.json. It is also possible to provide custom scale factors (more info below).

Custom scale factors

You can provide an additional column named 'scalefactor' with custom scale factor values. The values should be between 0 to 1, where a 1 would correspond to the original H&E image used as input for spaceranger. For instance, if your image was scaled from a height of 30,000 pixels to a height of 3,000 pixels (with the same aspect ratio), the scalefactor would be 0.1.

Load data outside tissue section

Sometimes it can be useful to load data for all spots in a 10x Visium dataset, if you need to explore transcripts captured outside of the tissue. In this case, you can provide paths to the raw_feature_bc_matrix.h5 files in the spaceranger output folders and set remove_spots_outside_tissue = FALSE.

Filter data

If you want to filter out spots and features, you can pass the min.cells and min.features parameters (see CreateSeuratObject for more details); however, it is recommended to use the SubsetSTData function for filtering after the object has been created.

Additional annotations

You can add a column named 'annotation_files' with paths to CSV files containing annotations. These files should have the same structure as the CSV the files exported by Loupe Browser. One column with the barcode IDs and one column with the annotation labels. The annotations will be added to the meta.data slot of the returned Seurat object.

Examples

# Assemble spaceranger output files
samples <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/filtered_feature_bc_matrix.h5"))
imgs <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/tissue_lowres_image.jpg"))
spotfiles <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/tissue_positions_list.csv"))
json <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/scalefactors_json.json"))

# Create a tibble/data.frame with file paths
library(tibble)
infoTable <- tibble(samples, imgs, spotfiles, json, 
                    sample_id = c("mousebrain", "mousecolon"))

# Create Seurat object
se <- ReadVisiumData(infoTable = infoTable)
#> 
#> ── Reading 10x Visium data ──
#> 
#> ℹ Loading matrices:
#> →   Finished loading expression matrix 1
#> →   Finished loading expression matrix 2
#> ! There are only 188 gene shared across all matrices:
#> →   Are you sure that the matrices share the same gene IDs?
#> →   Are the datasets from the same species?
#> 
#> ℹ Merging expression matrices:
#> ✔ There are 188 features and 5164 spots in the merged expression matrix.
#> ℹ Loading coordinates:
#> →   Finished loading coordinates for sample 1
#> →   Finished loading coordinates for sample 2
#> ℹ Collected coordinates for 5164 spots.
#> 
#> ── Creating `Seurat` object 
#> ✔ Expression matrices and coordinates are compatible
#> ℹ Created `Seurat` object
#> ℹ Created `Staffli` object
#> ✔ Returning a `Seurat` object with 188 features and 5164 spots
se
#> An object of class Seurat 
#> 188 features across 5164 samples within 1 assay 
#> Active assay: Spatial (188 features, 0 variable features)
#>  1 layer present: counts

# Add additional annotations from CSV files
annotation_file <- 
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/galt_spots.csv"))
annotation_files <- c(NA_character_, annotation_file)

# Create a tibble/data.frame with file paths
infoTable <- tibble(samples, imgs, spotfiles, json, 
                    sample_id = c("mousebrain", "mousecolon"), annotation_files)

se <- ReadVisiumData(infoTable)
#> 
#> ── Reading 10x Visium data ──
#> 
#> ℹ Loading annotatons from CSV files
#> ℹ Loading matrices:
#> →   Finished loading expression matrix 1
#> →   Finished loading expression matrix 2
#> ! There are only 188 gene shared across all matrices:
#> →   Are you sure that the matrices share the same gene IDs?
#> →   Are the datasets from the same species?
#> 
#> ℹ Merging expression matrices:
#> ✔ There are 188 features and 5164 spots in the merged expression matrix.
#> ℹ Loading coordinates:
#> →   Finished loading coordinates for sample 1
#> →   Finished loading coordinates for sample 2
#> ℹ Collected coordinates for 5164 spots.
#> 
#> ── Creating `Seurat` object 
#> ✔ Expression matrices and coordinates are compatible
#> ℹ Created `Seurat` object
#> ℹ Created `Staffli` object
#> ✔ Returning a `Seurat` object with 188 features and 5164 spots