Load and merge multiple gene expression matrices
LoadAndMergeMatrices.Rd
Gene expression matrices should have features in rows and spots in columns.
Details
The merging process ensures that every gene detected in any dataset is present in the merged output. If a gene is missing from a dataset, the spots in that dataset are assigned 0 expression for that gene.
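The zero-filling behaviour described above can be illustrated with a minimal base R sketch. This is not the internal implementation of LoadAndMergeMatrices; the toy matrices and the helper `pad_to_genes` are hypothetical and exist only for illustration.

```r
# Toy matrices (hypothetical data): features in rows, spots in columns
m1 <- matrix(c(1, 2, 3, 4), nrow = 2,
             dimnames = list(c("GeneA", "GeneB"), c("spot1", "spot2")))
m2 <- matrix(c(5, 6, 7, 8), nrow = 2,
             dimnames = list(c("GeneB", "GeneC"), c("spot3", "spot4")))

# Union of all genes detected in any matrix
all_genes <- union(rownames(m1), rownames(m2))

# Hypothetical helper: expand a matrix to the full gene set,
# leaving 0 for genes that are absent from that matrix
pad_to_genes <- function(m, genes) {
  out <- matrix(0, nrow = length(genes), ncol = ncol(m),
                dimnames = list(genes, colnames(m)))
  out[rownames(m), ] <- m
  out
}

# Column-bind the padded matrices into one merged matrix
merged <- cbind(pad_to_genes(m1, all_genes), pad_to_genes(m2, all_genes))
```

In this sketch, GeneC is absent from the first matrix, so `merged["GeneC", c("spot1", "spot2")]` holds zeros, while the values for genes present in a matrix are carried over unchanged.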
Spot IDs are renamed to be unique. Usually, the spots are named something similar to:
"ACGCCTGACACGCGCT-1", "TACCGATCCAACACTT-1"
Since spot barcodes are shared across datasets, there is a risk that some of the spot IDs will be duplicated after merging. To avoid this, the suffix (e.g. "-1") is replaced by a unique suffix for each loaded matrix: "-1", "-2", "-3", ...
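The renaming step can be sketched in base R as follows. This is an assumption about how such renaming could be done, not the function's actual code; the spot IDs in the second dataset are made up so that one barcode collides with the first dataset.

```r
# Hypothetical spot IDs from two datasets that share one barcode
ids <- list(
  c("ACGCCTGACACGCGCT-1", "TACCGATCCAACACTT-1"),
  c("ACGCCTGACACGCGCT-1", "GGTTTACAATCTCAAT-1")
)

# Replace the trailing "-<number>" with the index of the dataset
renamed <- lapply(seq_along(ids), function(i) {
  sub("-[0-9]+$", paste0("-", i), ids[[i]])
})

# After renaming, the combined IDs contain no duplicates
all_ids <- unlist(renamed)
stopifnot(anyDuplicated(all_ids) == 0)
```

Here the shared barcode "ACGCCTGACACGCGCT" ends in "-1" for the first dataset and "-2" for the second, so the merged column names stay unique.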
IF data
If the provided h5 files store antibody capture data, LoadAndMergeMatrices will
return a list of matrices. If multiple samples are loaded, the RNA expression matrices and
antibody capture matrices will be merged and returned as separate elements of the list.
Note that if one or more samples only have RNA expression data, the function will add empty
values for those samples in the merged antibody capture matrix.
See also
Other pre-process:
LoadAnnotationCSV(), LoadImageInfo(), LoadImages(), LoadScaleFactors(), LoadSpatialCoordinates(), ReadVisiumData(), UpdateImageInfo()
Examples
# Load and merge two gene expression matrices
samples <-
  c(
    system.file(
      "extdata/mousebrain",
      "filtered_feature_bc_matrix.h5",
      package = "semla"
    ),
    system.file(
      "extdata/mousecolon",
      "filtered_feature_bc_matrix.h5",
      package = "semla"
    )
  )
mergedMatrix <- LoadAndMergeMatrices(samples)
#> ℹ Loading matrices:
#> → Finished loading expression matrix 1
#> → Finished loading expression matrix 2
#> ! There are only 188 gene shared across all matrices:
#> → Are you sure that the matrices share the same gene IDs?
#> → Are the datasets from the same species?
#>
#> ℹ Merging expression matrices:
#> ✔ There are 188 features and 5164 spots in the merged expression matrix.