Digital unrolling
Last compiled: 18 September 2024
digital_unrolling.rmd
In this tutorial, we will look at how to perform digital unrolling of folded “swiss roll” tissue sections.
The data set comes from a “swiss roll” of a mouse colon which was published by M. Parigi et al. (including yours truly) The spatial transcriptomic landscape of the healing mouse intestine following damage.
Mouse colon data
Let’s load and have a quick look at the data.
se_mcolon <- readRDS(system.file("extdata/mousecolon",
"se_mcolon",
package = "semla"))
se_mcolon <- se_mcolon |>
LoadImages()
##
## ── Loading H&E images ──
##
## ℹ Loading image from /Users/lovisafranzen/Library/Caches/org.R-project.R/R/renv/library/semla-41c099bc/R-4.4/aarch64-apple-darwin20/semla/extdata/mousecolon/spatial/tissue_lowres_image.jpg
## ℹ Scaled image from 541x600 to 400x444 pixels
## ℹ Saving loaded H&E images as 'rasters' in Seurat object
ImagePlot(se_mcolon)
In the H&E image, we can clearly see the organization of large part of the lower intestine. The outermost part of the roll is the ascending colon and the center of the roll is the rectum.
We can identify genes with a regionalized pattern pretty easily but it would we great if we could somehow unroll the tissue so that we can investigate gene expression patterns along the proximal-distal axis.
Below is a spatial map of the expression of Fxyd4 which is up-regulated closer to the distal part of the colon.
MapFeatures(se_mcolon, features = "Fxyd4",
image_use = "raw", override_plot_dims = TRUE,
colors = c("darkblue", "cyan", "yellow", "red", "darkred"))
Pre-processing
The goal of this tutorial is to show how semla
can be
used to find an “unrolled” coordinate system. To achieve this, we can
start a shiny application with CutSpatialNetwork()
which
will allow us to do some interesting stuff. More about that later.
Before we get started, we need to do a little bit of pre-processing.
First, we will tile the H&E image using TileImage()
.
Tiling is necessary to make it possible to visualize the H&E images
in a more interactive way. The path to the raw H&E image can be
found with GetStaffli(se_mcolon)@imgs
.
NB: If the image file has been moved, TileImage
the path
might no longer be valid. Make sure that the path is valid before
proceeding. To get decent resolution, we you should use the
“tissue_hires_image.png” output by spaceranger.
im <- GetStaffli(se_mcolon)@imgs[1]
sprintf("File exists: %s", file.exists(im))
## [1] "File exists: TRUE"
Now we can load the image with magick
, tile the image
and export it to a local directory. The function returns the path to the
directory the tiles were exported to. We will save this to a variable
called tilepath
library(magick)
im <- image_read(im)
tilepath <- TileImage(im = im, outpath = "~/Downloads/")
The CutSpatialNetwork()
function requires another file
to work properly. This file holds information about the “spatial
network” (see GetSpatialNetwork()
for details) which will
be used in the shiny application to define how spots in the Visium data
are connected to each other.
To demonstrate what this means, we can generate a spatial network and visualize it.
spatnet <- GetSpatialNetwork(se_mcolon)[[1]]
library(ggplot2)
ggplot() +
geom_segment(data = spatnet, aes(x = x_start, xend = x_end, y = y_start, yend = y_end)) +
scale_y_reverse() +
coord_fixed()
In the plot above, you can see the edges that connects neighboring spots to each other.
Now let’s export this spatial network to make it available for our shiny application.
export_graph(se_mcolon, sampleID = 1, outdir = tilepath$datapath)
Prepare network
Now we are ready to run the app! Once the app loads, you should see
the H&E image with spots and edges. The goal is to disconnect the
layer of the roll by cutting the edges between spots in different
layers. Cutting can be done by holding the SHIFT
key while
move the cursor across edges which will make them red. You can also
“repair” edges by holding the CTRL
key.
The image below illustrates how edges between two layers have been sliced.
Once all edges has been cut, you have to press the
Quit & Save
button. If you just close the app without
pressing “Quit & Save”, the results will not be returned back to
R.
NB: We need to make sure to save the output to a variable. Here, we
will save the output to tidy_network
.
tidy_network <- CutSpatialNetwork(se_mcolon, datadir = tilepath$datapath)
For downstream steps, it is crucial that the layers have been separated appropriately. Even a single edge between two layer will make it impossible to run down stream steps reliably. The image below demonstrates a situation where the separation will fail.
Luckily, any changes you made in the app are by default written to the spatial network file, so if you miss something, you can simply run the app again and fix the issue.
Let’s plot our full network and our filtered network:
library(tidygraph)
library(patchwork)
library(dplyr)
edges_full <- tidy_network |>
activate(edges) |>
as_tibble()
edges_filtered <- tidy_network |>
activate(edges) |>
as_tibble() |>
filter(keep)
p1 <- ggplot(edges_full,
aes(x, xend = x_end, y, yend = y_end)) +
geom_segment() +
scale_y_reverse() +
coord_fixed() +
ggtitle("Full network")
p2 <- ggplot(edges_filtered,
aes(x, xend = x_end, y, yend = y_end)) +
geom_segment() +
scale_y_reverse() +
coord_fixed() +
ggtitle("Filtered network")
p1 + p2
And just like that, we have separated the the layers!
Unrolling the network
Now comes the part where we do the actual “unrolling” using
AdjustTissueCoordinates()
. There are a few important things
to note here. AdjustTissueCoordinates
applies an algorithm
to sort nodes in the spatial graph between the end points. Distances
between nodes are used as a proxy for the actual distances in the
tissue. These distances between nodes are the shortest paths, i.e. the
minimum number of edges needed to visit (geodesic). For obvious reasons,
distances cannot be calculated between nodes that are not connected, and
for this reason, the function only handles connected graphs.
Unfortunately, our roll is not fully connected, so the function will only deal with the largest sub graph.
all_networks <- to_components(tidy_network |>
activate(edges) |>
filter(keep))
edges_labeled <- do.call(bind_rows, lapply(seq_along(all_networks), function(i) {
all_networks[[i]] |>
activate(edges) |>
as_tibble() |>
mutate(graph = paste0("sub graph", i))
}))
ggplot(edges_labeled, aes(x, xend = x_end, y, yend = y_end)) +
geom_segment() +
scale_y_reverse() +
coord_fixed() +
facet_grid(~graph)
The second, smaller sub graph will be discarded.
Another thing to note is that this function relies on a few assumptions about the shape of the folded tissue. If you were to try it on somethings completely different, let’s say a tissue folded in a zig-zag shape, it might not work as well.
tidy_network_adjusted <- AdjustTissueCoordinates(full_graph = tidy_network)
## ℹ Removing 620 edges
## ℹ Checking for disconnected graphs
## ℹ More than 1 subgraph identified. Keeping the largest subgraph
## ℹ Calculating pairwise geodesics between nodes in graph
## ℹ Identifying end points
## ℹ Finding shortest path beetween end points
## ℹ Checking location of nodes relative to shortest path nodes
## ✔ Finished!
Our tidy_network_adjusted
variable holds a
tibble
with the node coordinates and the new
x_dist
and y_dist
coordinates. As a sanity
check, we can color our original spatial map by the new distance values
to see if they make sense.
p1 <- ggplot(tidy_network_adjusted, aes(x, y, color = x_dist)) +
geom_point() +
scale_color_gradientn(colours = RColorBrewer::brewer.pal(n = 9, name = "Spectral") |>
rev()) +
coord_fixed() +
scale_y_reverse()
p2 <- ggplot(tidy_network_adjusted, aes(x, y, color = y_dist)) +
geom_point() +
scale_color_gradientn(colours = RColorBrewer::brewer.pal(n = 9, name = "Spectral") |>
rev()) +
coord_fixed() +
scale_y_reverse()
p1 + p2
The x_dist
values, which is arguably the most relevant
for regionalization patterns, looks quite good. y_dist
corresponds to the geodesic distances from the nodes along the shortest
path, where negative values indicate that the spots are located outside
the shortest path and positive values indicate that the spots are
located inside the shortest path.
Now that we have our new coordinate system, we can start looking at regionalization of gene expression along the proximal-distal axis.
library(tibble)
library(tidyr)
selected_genes <- c("Fxyd4", "Slc37a2", "Hmgcs2", "Cyp4b1")
se_mcolon <- ScaleData(se_mcolon)
gg <- FetchData(se_mcolon, vars = selected_genes, slot = "scale.data") |>
rownames_to_column(var = "name") |>
left_join(y = tidy_network_adjusted, by = "name") |>
pivot_longer(all_of(selected_genes), names_to = "variable", values_to = "value")
ggplot(gg, aes(x_dist, value, color = variable)) +
geom_smooth()
With these coordinates in our hands, it is quite straightforward to extract genes whose expression depend on distance along the proximal-distal axis. This will not be part of this tutorial, but below is a rather simple way of sorting genes by expression in a heatmap. Here, it becomes quite clear what genes are regionalized. There is one part of the heatmap that stands out, just on the right side of the center, which are the genes that are expressed in a lymphoid structure located in that part of the colon.
sorted_spots <- tidy_network_adjusted |>
arrange(x_dist) |>
pull(name)
# Get scale data
scale_data <- GetAssayData(se_mcolon,
slot = "scale.data",
assay = "Spatial")[VariableFeatures(se_mcolon), sorted_spots] |>
as.matrix()
# Bin data
bins <- cut(1:ncol(scale_data), breaks = 50)
binMat <- do.call(cbind, lapply(levels(bins), function(lvl) {
rowMeans(scale_data[, bins == lvl])
}))
pheatmap::pheatmap(binMat, cluster_cols = FALSE, border_color = FALSE,
breaks = seq(-1, 1, length.out = 50),
color = scico::scico(palette = "bam", n = 51))
Other applications
The tools described here could also be used for other applications.
Maybe we are interested in a specific part of a tissue where we want to
define an axis. Let’s make a manual selection of the outermost layer of
the mouse brain and subset the Seurat
object to contain
spots from this layer.
se_mbrain <- readRDS(system.file("extdata/mousebrain", "se_mbrain", package = "semla"))
se_mbrain <- LoadImages(se_mbrain)
se_mbrain$region <- "NA"
se_mbrain <- FeatureViewer(se_mbrain)
se_mbrain_subset <- SubsetSTData(se_mbrain, expression = region == "outer_layer")
MapLabels(se_mbrain_subset, column_name = "region", image_use = "raw")
Then we can create a spatial network from this layer and modify it a
bit to make it compatible with AdjustTissueCoordinates
:
spatial_network <- GetSpatialNetwork(se_mbrain_subset)[[1]]
nodes <- spatial_network |> select(from, x_start, y_start) |> group_by(from) |>
slice_head(n = 1) |> rename(name = from, x = x_start, y = y_start)
edges <- spatial_network |> select(from, to) |> mutate(keep = TRUE)
spatial_network <- tidygraph::tbl_graph(nodes = nodes, edges = edges, directed = FALSE)
Now that we have the spatial network, we can run
AdjustTissueCoordinates
to compute distances between the
end points of the selected region:
# Run adjust tissue coordinates
test <- AdjustTissueCoordinates(full_graph = spatial_network)
se_mbrain_subset$x_dist <- test[match(test$name, colnames(se_mbrain_subset)), "x_dist", drop = TRUE]
# Plot x_dist
MapFeatures(se_mbrain_subset, features = "x_dist", image_use = "raw",
colors = RColorBrewer::brewer.pal(n = 9, name = "Spectral"),
override_plot_dims = TRUE, pt_size = 2.5) &
theme(plot.title = element_blank())
Package version
-
semla
: 1.1.6
Session info
## R version 4.4.0 (2024-04-24)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Stockholm
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices datasets utils methods base
##
## other attached packages:
## [1] tidyr_1.3.1 tibble_3.2.1 patchwork_1.3.0 tidygraph_1.3.1
## [5] semla_1.1.6 ggplot2_3.5.0 dplyr_1.1.4 SeuratObject_4.1.4
## [9] Seurat_4.3.0.1
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.16.0 jsonlite_1.8.8
## [4] magrittr_2.0.3 magick_2.8.4 spatstat.utils_3.1-0
## [7] farver_2.1.2 rmarkdown_2.28 fs_1.6.4
## [10] ragg_1.3.3 vctrs_0.6.5 ROCR_1.0-11
## [13] spatstat.explore_3.3-2 forcats_1.0.0 htmltools_0.5.8.1
## [16] sass_0.4.9 sctransform_0.4.1 parallelly_1.38.0
## [19] KernSmooth_2.23-24 bslib_0.8.0 htmlwidgets_1.6.4
## [22] desc_1.4.3 ica_1.0-3 plyr_1.8.9
## [25] plotly_4.10.4 zoo_1.8-12 cachem_1.1.0
## [28] igraph_2.0.3 mime_0.12 lifecycle_1.0.4
## [31] pkgconfig_2.0.3 Matrix_1.7-0 R6_2.5.1
## [34] fastmap_1.2.0 fitdistrplus_1.2-1 future_1.34.0
## [37] shiny_1.9.1 digest_0.6.37 colorspace_2.1-1
## [40] tensor_1.5 irlba_2.3.5.1 textshaping_0.4.0
## [43] labeling_0.4.3 progressr_0.14.0 fansi_1.0.6
## [46] spatstat.sparse_3.1-0 mgcv_1.9-1 httr_1.4.7
## [49] polyclip_1.10-7 abind_1.4-8 compiler_4.4.0
## [52] withr_3.0.1 highr_0.11 MASS_7.3-60.2
## [55] tools_4.4.0 lmtest_0.9-40 httpuv_1.6.15
## [58] future.apply_1.11.2 goftest_1.2-3 glue_1.7.0
## [61] dbscan_1.2-0 nlme_3.1-164 promises_1.3.0
## [64] grid_4.4.0 Rtsne_0.17 cluster_2.1.6
## [67] reshape2_1.4.4 generics_0.1.3 gtable_0.3.5
## [70] spatstat.data_3.1-2 data.table_1.16.0 sp_2.1-4
## [73] utf8_1.2.4 spatstat.geom_3.3-2 RcppAnnoy_0.0.22
## [76] ggrepel_0.9.6 RANN_2.6.2 pillar_1.9.0
## [79] stringr_1.5.1 spam_2.10-0 later_1.3.2
## [82] splines_4.4.0 lattice_0.22-6 renv_1.0.2
## [85] survival_3.6-4 deldir_2.0-4 tidyselect_1.2.1
## [88] miniUI_0.1.1.1 pbapply_1.7-2 knitr_1.48
## [91] gridExtra_2.3 scattermore_1.2 xfun_0.47
## [94] matrixStats_1.4.1 pheatmap_1.0.12 stringi_1.8.4
## [97] lazyeval_0.2.2 yaml_2.3.10 evaluate_0.24.0
## [100] codetools_0.2-20 BiocManager_1.30.25 cli_3.6.3
## [103] uwot_0.2.2 xtable_1.8-4 reticulate_1.39.0
## [106] systemfonts_1.1.0 munsell_0.5.1 jquerylib_0.1.4
## [109] Rcpp_1.0.13 globals_0.16.3 spatstat.random_3.3-1
## [112] zeallot_0.1.0 png_0.1-8 spatstat.univar_3.0-1
## [115] parallel_4.4.0 pkgdown_2.1.0 dotCall64_1.1-1
## [118] listenv_0.9.1 viridisLite_0.4.2 scales_1.3.0
## [121] ggridges_0.5.6 leiden_0.4.3.1 purrr_1.0.2
## [124] scico_1.5.0 rlang_1.1.4 cowplot_1.1.3
## [127] shinyjs_2.1.0