Annotate single cells using scmap.

Usage

RunScmap(
  srt_query,
  srt_ref,
  ref_group = NULL,
  query_assay = "RNA",
  ref_assay = "RNA",
  method = "scmapCluster",
  nfeatures = 500,
  threshold = 0.5,
  k = 10
)

Arguments

srt_query

An object of class Seurat to be annotated with cell types.

srt_ref

An object of class Seurat storing the reference cells.

ref_group

A character string naming the column in the `srt_ref` metadata that holds the cell grouping. The default value is NULL.

query_assay

A character string specifying the assay to use for the query data. The default value is "RNA".

ref_assay

A character string specifying the assay to use for the reference data. The default value is "RNA".

method

The scmap method to use, either "scmapCluster" or "scmapCell". The default value is "scmapCluster".

nfeatures

The number of top features to be selected. The default value is 500.

threshold

The minimum similarity required for a cell to be assigned to a cluster. Must be a value between 0 and 1. The default value is 0.5.

k

Number of clusters per group for k-means clustering when `method` is "scmapCell". The default value is 10.
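
To make the role of `threshold` more concrete, here is a minimal base R sketch of centroid-based assignment in the spirit of "scmapCluster". It is an illustration under simplifying assumptions, not the scmap implementation: the helpers `build_centroids()` and `assign_by_centroid()` are hypothetical, the toy matrices are invented, and only cosine similarity is used, whereas scmap's own scoring is more involved.

```r
# Hypothetical helpers for illustration; they are not part of scmap or scop.

# Summarise each reference group by its per-gene median expression.
build_centroids <- function(ref, ref_labels) {
  groups <- split(seq_along(ref_labels), ref_labels)
  sapply(groups, function(idx) apply(ref[, idx, drop = FALSE], 1, median))
}

# Label each query cell with its most similar centroid, or "unassigned"
# when the best cosine similarity falls below `threshold`.
assign_by_centroid <- function(query, centroids, threshold = 0.5) {
  cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
  vapply(seq_len(ncol(query)), function(i) {
    sims <- apply(centroids, 2, cosine, b = query[, i])
    best <- which.max(sims)
    if (sims[best] >= threshold) colnames(centroids)[best] else "unassigned"
  }, character(1))
}

# Toy data (assumed): 3 genes in rows, 4 reference cells and 2 query cells in columns.
ref <- cbind(c(5, 0, 0), c(7, 1, 0), c(0, 6, 0), c(1, 4, 0))
ref_labels <- c("alpha", "alpha", "beta", "beta")
query <- cbind(c(4, 1, 0), c(0, 0, 3))

centroids <- build_centroids(ref, ref_labels)
result <- assign_by_centroid(query, centroids, threshold = 0.5)
print(result)
```

The first query cell resembles the "alpha" centroid and is labelled accordingly. The second expresses only a gene that neither centroid expresses, so its best similarity is 0 and it stays "unassigned". Raising `threshold` makes the assignment stricter and leaves more cells unlabelled.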

Examples

data("panc8_sub")

genenames <- make.unique(
  capitalize(
    rownames(panc8_sub),
    force_tolower = TRUE
  )
)
names(genenames) <- rownames(panc8_sub)
panc8_sub <- RenameFeatures(
  panc8_sub,
  newnames = genenames
)
#>  [2025-07-02 03:04:53] Rename features for the assay: RNA
panc8_sub <- check_srt_merge(
  panc8_sub,
  batch = "tech"
)[["srt_merge"]]
#>  [2025-07-02 03:04:54]  Spliting srt_merge into srt_list by column tech...
#>  [2025-07-02 03:04:55] Checking srt_list...
#>  [2025-07-02 03:04:55] Data 1/5 of the srt_list has been log-normalized.
#>  [2025-07-02 03:04:55] Perform FindVariableFeatures on the data 1/5 of the srt_list...
#>  [2025-07-02 03:04:55] Data 2/5 of the srt_list has been log-normalized.
#>  [2025-07-02 03:04:55] Perform FindVariableFeatures on the data 2/5 of the srt_list...
#>  [2025-07-02 03:04:56] Data 3/5 of the srt_list has been log-normalized.
#>  [2025-07-02 03:04:56] Perform FindVariableFeatures on the data 3/5 of the srt_list...
#>  [2025-07-02 03:04:56] Data 4/5 of the srt_list has been log-normalized.
#>  [2025-07-02 03:04:56] Perform FindVariableFeatures on the data 4/5 of the srt_list...
#>  [2025-07-02 03:04:57] Data 5/5 of the srt_list has been log-normalized.
#>  [2025-07-02 03:04:57] Perform FindVariableFeatures on the data 5/5 of the srt_list...
#>  [2025-07-02 03:04:57] Use the separate HVF from srt_list...
#>  [2025-07-02 03:04:58] Number of available HVF: 2000
#>  [2025-07-02 03:04:58] Finished checking.

data("pancreas_sub")
pancreas_sub <- standard_scop(pancreas_sub)
#>  [2025-07-02 03:05:01] Start standard_scop
#>  [2025-07-02 03:05:01] Checking srt_list...
#>  [2025-07-02 03:05:03] Data 1/1 of the srt_list has been log-normalized.
#>  [2025-07-02 03:05:03] Perform FindVariableFeatures on the data 1/1 of the srt_list...
#>  [2025-07-02 03:05:04] Use the separate HVF from srt_list...
#>  [2025-07-02 03:05:04] Number of available HVF: 2000
#>  [2025-07-02 03:05:04] Finished checking.
#>  [2025-07-02 03:05:04] Perform ScaleData on the data...
#>  [2025-07-02 03:05:04] Perform linear dimension reduction (pca) on the data...
#>  [2025-07-02 03:05:04] linear_reduction(pca) is already existed. Skip calculation.
#>  [2025-07-02 03:05:05] Perform FindClusters (louvain) on the data...
#>  [2025-07-02 03:05:05] Reorder clusters...
#> ! [2025-07-02 03:05:05] Using 'Seurat::AggregateExpression()' to calculate pseudo-bulk data for 'Assay5'.
#>  [2025-07-02 03:05:05] Perform nonlinear dimension reduction (umap) on the data...
#>  [2025-07-02 03:05:05] Non-linear dimensionality reduction(umap) using Reduction(Standardpca, dims:1-50) as input
#>  [2025-07-02 03:05:09] Non-linear dimensionality reduction(umap) using Reduction(Standardpca, dims:1-50) as input
#>  [2025-07-02 03:05:14] Run standard_scop done
#>  [2025-07-02 03:05:14] Elapsed time:12.22 secs
pancreas_sub <- RunScmap(
  srt_query = pancreas_sub,
  srt_ref = panc8_sub,
  ref_group = "celltype",
  method = "scmapCluster"
)
#>  [2025-07-02 03:05:14] Detected srt_query data type: log_normalized_counts
#>  [2025-07-02 03:05:16] Detected srt_ref data type: log_normalized_counts
#>  [2025-07-02 03:05:18] Perform selectFeatures on the data...
#>  [2025-07-02 03:05:18] Perform indexCluster on the data...
#>  [2025-07-02 03:05:18] Perform scmapCluster on the data...
#> Warning: Features Mt-atp6, Mt-co1, Mt-co2, Mt-co3, Mt-nd1, Mt-nd2, Mt-nd4, Mt-nd4l, Mt-nd5 are not present in the 'SCESet' object and therefore were not set.
CellDimPlot(
  pancreas_sub,
  group.by = "scmap_annotation"
)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.


pancreas_sub <- RunScmap(
  srt_query = pancreas_sub,
  srt_ref = panc8_sub,
  ref_group = "celltype",
  method = "scmapCell"
)
#>  [2025-07-02 03:05:19] Detected srt_query data type: log_normalized_counts
#>  [2025-07-02 03:05:21] Detected srt_ref data type: log_normalized_counts
#>  [2025-07-02 03:05:23] Perform selectFeatures on the data...
#>  [2025-07-02 03:05:23] Perform indexCell on the data...
#>  [2025-07-02 03:05:23] Perform scmapCell on the data...
#>  [2025-07-02 03:05:24] Perform scmapCell2Cluster on the data...
CellDimPlot(
  pancreas_sub,
  group.by = "scmap_annotation"
)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.
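
The "scmapCell" run above logs a nearest-neighbour search followed by a `scmapCell2Cluster` step. The base R sketch below shows one way such a cell-to-cluster vote can work: a query cell receives a label only if its `w` nearest reference cells share that label and the best similarity reaches `threshold`. It rests on assumptions: the helper `assign_by_neighbours()` and its `w` argument are hypothetical, the toy data are invented, and an exact cosine search stands in for the approximate index that scmap builds.

```r
# Hypothetical helper for illustration; it is not part of scmap or scop.
assign_by_neighbours <- function(query, ref, ref_labels, w = 3, threshold = 0.5) {
  # Scale every column to unit length so that crossprod() yields cosine similarity.
  unit <- function(m) sweep(m, 2, sqrt(colSums(m^2)), "/")
  sims <- crossprod(unit(ref), unit(query)) # reference cells x query cells
  apply(sims, 2, function(s) {
    nn <- order(s, decreasing = TRUE)[seq_len(w)] # indices of the w nearest reference cells
    agreed <- length(unique(ref_labels[nn])) == 1
    if (agreed && max(s[nn]) >= threshold) ref_labels[nn[1]] else "unassigned"
  })
}

# Toy data (assumed): 3 genes in rows, 6 reference cells and 2 query cells in columns.
ref <- cbind(
  r1 = c(5, 0, 0), r2 = c(4, 1, 0), r3 = c(5, 1, 0),
  r4 = c(0, 5, 0), r5 = c(0, 4, 1), r6 = c(1, 5, 0)
)
ref_labels <- c("alpha", "alpha", "alpha", "beta", "beta", "beta")
query <- cbind(q1 = c(6, 1, 0), q2 = c(3, 3, 0))

result <- assign_by_neighbours(query, ref, ref_labels, w = 3)
print(result)
```

Here `q1` sits among the "alpha" cells and is labelled "alpha". `q2` lies between the two groups, so its three nearest neighbours disagree and it is left "unassigned" even though each individual similarity exceeds the threshold.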