Subset and merge
Last compiled: 18 September 2024
subset_and_merge.rmd
Subsetting and merging data is done using the two functions
SubsetSTData()
and MergeSTData()
.
If you use the generic functions subset
and
merge
, these will work fine on Seurat
objects
but they will not be able to handle the spatial data that has been
placed inside the Seurat
object by semla
,
causing the spatial data to be lost or corrupted.
Let’s load an example mouse brain 10x Visium data:
library(semla)
se_mcolon <- readRDS(system.file("extdata/mousecolon",
"se_mcolon",
package = "semla"))
se_mcolon
## An object of class Seurat
## 188 features across 2604 samples within 1 assay
## Active assay: Spatial (188 features, 182 variable features)
MapFeaturesSummary(se_mcolon, features = "nFeature_Spatial", subplot_type = "histogram")
Subset by selecting spots
The data can be subseted by specifying which spots to keep using their barcode ID.
spots_to_keep <- c("AAACAAGTATCTCCCA-1", "AAACACCAATAACTGC-1",
"AAACATTTCCCGGATT-1", "AAACCCGAACGAAATC-1",
"AAACCGGGTAGGTACC-1", "AAACCGTTCGTCCAGG-1")
# Subset using selected spots
se_mcolon_small <- SubsetSTData(se_mcolon, spots = spots_to_keep)
se_mcolon_small
## An object of class Seurat
## 188 features across 3 samples within 1 assay
## Active assay: Spatial (188 features, 182 variable features)
MapFeaturesSummary(se_mcolon_small, features = "nFeature_Spatial", subplot_type = "histogram")
Subset by selecting features
We can also extract data corresponding to a few selected genes of interest.
genes_to_keep <- c("Hbb-bs", "Hba-a1", "Hba-a2", "Hbb-bt", "Slc6a3", "Th")
# Subset using selected genes
se_mcolon_fewgenes <- SubsetSTData(se_mcolon, features = genes_to_keep)
se_mcolon_fewgenes
## An object of class Seurat
## 6 features across 2604 samples within 1 assay
## Active assay: Spatial (6 features, 6 variable features)
MapFeaturesSummary(se_mcolon_fewgenes, features = "nFeature_Spatial", subplot_type = "histogram")
Subset with an expression
If we want to subset data using information from the
meta.data
slot, it might be easier to use an expression
instead. This comes in handy when we want to perform QC filtering on our
data.
# Filter by number of unique genes
se_mcolon_filtered <- SubsetSTData(se_mcolon, expression = nFeature_Spatial > 30)
se_mcolon_filtered
## An object of class Seurat
## 188 features across 2545 samples within 1 assay
## Active assay: Spatial (188 features, 182 variable features)
MapFeaturesSummary(se_mcolon_filtered, features = "nFeature_Spatial", subplot_type = "histogram")
Merge two data sets
Finally, we can use MergeSTData()
to join two objects.
In this example, our colon data set will be merged with a brain data
set.
se_mbrain <- readRDS(system.file("extdata/mousebrain",
"se_mbrain",
package = "semla"))
se_merged <- MergeSTData(se_mcolon, se_mbrain)
unique(se_merged$sample_id)
## [1] "mousecolon" "mousebrain"
se_merged
## An object of class Seurat
## 188 features across 5164 samples within 1 assay
## Active assay: Spatial (188 features, 0 variable features)
MapFeatures(se_merged, features = "nFeature_Spatial")
Package version
-
semla
: 1.1.6
Session info
## R version 4.4.0 (2024-04-24)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Stockholm
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices datasets utils methods base
##
## other attached packages:
## [1] semla_1.1.6 ggplot2_3.5.0 dplyr_1.1.4 SeuratObject_4.1.4
## [5] Seurat_4.3.0.1
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.16.0 jsonlite_1.8.8
## [4] magrittr_2.0.3 magick_2.8.4 spatstat.utils_3.1-0
## [7] farver_2.1.2 rmarkdown_2.28 fs_1.6.4
## [10] ragg_1.3.3 vctrs_0.6.5 ROCR_1.0-11
## [13] spatstat.explore_3.3-2 forcats_1.0.0 htmltools_0.5.8.1
## [16] sass_0.4.9 sctransform_0.4.1 parallelly_1.38.0
## [19] KernSmooth_2.23-24 bslib_0.8.0 htmlwidgets_1.6.4
## [22] desc_1.4.3 ica_1.0-3 plyr_1.8.9
## [25] plotly_4.10.4 zoo_1.8-12 cachem_1.1.0
## [28] igraph_2.0.3 mime_0.12 lifecycle_1.0.4
## [31] pkgconfig_2.0.3 Matrix_1.7-0 R6_2.5.1
## [34] fastmap_1.2.0 fitdistrplus_1.2-1 future_1.34.0
## [37] shiny_1.9.1 digest_0.6.37 colorspace_2.1-1
## [40] patchwork_1.3.0 tensor_1.5 irlba_2.3.5.1
## [43] textshaping_0.4.0 labeling_0.4.3 progressr_0.14.0
## [46] fansi_1.0.6 spatstat.sparse_3.1-0 httr_1.4.7
## [49] polyclip_1.10-7 abind_1.4-8 compiler_4.4.0
## [52] withr_3.0.1 highr_0.11 MASS_7.3-60.2
## [55] tools_4.4.0 lmtest_0.9-40 httpuv_1.6.15
## [58] future.apply_1.11.2 goftest_1.2-3 glue_1.7.0
## [61] dbscan_1.2-0 nlme_3.1-164 promises_1.3.0
## [64] grid_4.4.0 Rtsne_0.17 cluster_2.1.6
## [67] reshape2_1.4.4 generics_0.1.3 gtable_0.3.5
## [70] spatstat.data_3.1-2 tidyr_1.3.1 data.table_1.16.0
## [73] sp_2.1-4 utf8_1.2.4 spatstat.geom_3.3-2
## [76] RcppAnnoy_0.0.22 ggrepel_0.9.6 RANN_2.6.2
## [79] pillar_1.9.0 stringr_1.5.1 spam_2.10-0
## [82] later_1.3.2 splines_4.4.0 lattice_0.22-6
## [85] renv_1.0.2 survival_3.6-4 deldir_2.0-4
## [88] tidyselect_1.2.1 miniUI_0.1.1.1 pbapply_1.7-2
## [91] knitr_1.48 gridExtra_2.3 scattermore_1.2
## [94] xfun_0.47 matrixStats_1.4.1 stringi_1.8.4
## [97] lazyeval_0.2.2 yaml_2.3.10 evaluate_0.24.0
## [100] codetools_0.2-20 tibble_3.2.1 BiocManager_1.30.25
## [103] cli_3.6.3 uwot_0.2.2 xtable_1.8-4
## [106] reticulate_1.39.0 systemfonts_1.1.0 munsell_0.5.1
## [109] jquerylib_0.1.4 Rcpp_1.0.13 globals_0.16.3
## [112] spatstat.random_3.3-1 zeallot_0.1.0 png_0.1-8
## [115] spatstat.univar_3.0-1 parallel_4.4.0 pkgdown_2.1.0
## [118] dotCall64_1.1-1 listenv_0.9.1 viridisLite_0.4.2
## [121] scales_1.3.0 ggridges_0.5.6 leiden_0.4.3.1
## [124] purrr_1.0.2 rlang_1.1.4 cowplot_1.1.3
## [127] shinyjs_2.1.0