scVI_integrate

Usage

scVI_integrate(
  srt_merge = NULL,
  batch = NULL,
  append = TRUE,
  srt_list = NULL,
  assay = NULL,
  do_normalization = NULL,
  normalization_method = "LogNormalize",
  do_HVF_finding = TRUE,
  HVF_source = "separate",
  HVF_method = "vst",
  nHVF = 2000,
  HVF_min_intersection = 1,
  HVF = NULL,
  scVI_dims_use = NULL,
  nonlinear_reduction = "umap",
  nonlinear_reduction_dims = c(2, 3),
  nonlinear_reduction_params = list(),
  force_nonlinear_reduction = TRUE,
  neighbor_metric = "euclidean",
  neighbor_k = 20L,
  cluster_algorithm = "louvain",
  cluster_resolution = 0.6,
  model = "SCVI",
  SCVI_params = list(),
  PEAKVI_params = list(),
  num_threads = 1,
  verbose = TRUE,
  seed = 11
)
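
As a rough illustration of how the signature above might be used, the sketch below collects a few arguments in a list and passes them with `do.call`. The `"tech"` batch column and the `srt` object are hypothetical placeholders, not names taken from this page, and the call is guarded so the snippet stays runnable when the package or the object is not available.

```r
# Argument names and values mirror the Usage signature above.
# "tech" is a hypothetical metadata column holding the batch labels.
integrate_args <- list(
  batch = "tech",
  model = "SCVI",
  nHVF = 2000,
  nonlinear_reduction = "umap",
  cluster_algorithm = "louvain",
  cluster_resolution = 0.6,
  seed = 11
)

# `srt` stands in for a merged Seurat object prepared elsewhere (assumption).
# The guard skips the call when the function or the object is missing.
if (exists("scVI_integrate") && exists("srt")) {
  srt <- do.call(scVI_integrate, c(list(srt_merge = srt), integrate_args))
}
```

Keeping the arguments in a list makes it easy to reuse the same settings across several runs.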

Arguments

srt_merge

A merged `Seurat` object that includes the batch information.

batch

A character string specifying the batch variable name.

append

Whether to append the integrated data to the original Seurat object (srt_merge). Default is TRUE.

srt_list

A list of Seurat objects to be checked and preprocessed.

assay

The name of the assay to be used for downstream analysis.

do_normalization

Whether data normalization should be performed. Default is NULL.

normalization_method

The normalization method to be used. Possible values are "LogNormalize", "SCT", and "TFIDF". Default is "LogNormalize".

do_HVF_finding

Whether highly variable feature (HVF) finding should be performed. Default is TRUE.

HVF_source

The source of highly variable features. Possible values are "global" and "separate". Default is "separate".

HVF_method

The method for selecting highly variable features. Default is "vst".

nHVF

The number of highly variable features to select. Default is 2000.

HVF_min_intersection

The minimum number of batches in which a feature must be selected as highly variable for it to be kept. Default is 1.
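
The following base R sketch illustrates one way such a filter could work, assuming features are selected per batch and then counted across batches. It is not the package's implementation; the gene names and the `keep_hvf` helper are made up for illustration.

```r
# Hypothetical per-batch HVF selections (gene names are made up).
hvf_by_batch <- list(
  batch1 = c("GeneA", "GeneB", "GeneC"),
  batch2 = c("GeneB", "GeneC", "GeneD"),
  batch3 = c("GeneC", "GeneD", "GeneE")
)

# Count how many batches selected each feature.
batch_counts <- table(unlist(hvf_by_batch))

# Keep features selected in at least `min_intersection` batches.
keep_hvf <- function(counts, min_intersection) {
  names(counts)[counts >= min_intersection]
}

keep_hvf(batch_counts, 1)  # union of all per-batch selections
keep_hvf(batch_counts, 2)  # features selected in two or more batches
```

With a threshold of 1 every feature selected in any batch is retained; raising the threshold restricts the set to features that are variable in several batches.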

HVF

A vector of highly variable features. Default is NULL.

scVI_dims_use

A vector specifying the dimensions returned by scVI that will be utilized for downstream cell cluster finding and non-linear reduction. If set to NULL, all the returned dimensions will be used by default.
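
A minimal sketch of the selection described above, using a random matrix in place of a real scVI embedding. The `select_dims` helper and the `scVI_` column names are hypothetical and only illustrate the NULL-means-all behaviour.

```r
set.seed(11)

# Stand-in for a latent embedding: 5 cells by 10 latent dimensions.
latent <- matrix(
  rnorm(5 * 10),
  nrow = 5,
  dimnames = list(paste0("cell", 1:5), paste0("scVI_", 1:10))
)

# Use all columns when dims_use is NULL, otherwise only the requested ones.
select_dims <- function(embedding, dims_use = NULL) {
  if (is.null(dims_use)) {
    dims_use <- seq_len(ncol(embedding))
  }
  embedding[, dims_use, drop = FALSE]
}

dim(select_dims(latent))        # all 10 dimensions kept
dim(select_dims(latent, 1:4))   # only the first 4 dimensions kept
```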

nonlinear_reduction

The nonlinear dimensionality reduction method to use. Options are "umap", "umap-naive", "tsne", "dm", "phate", "pacmap", "trimap", "largevis", or "fr". Default is "umap".

nonlinear_reduction_dims

The number of dimensions to keep after nonlinear dimensionality reduction. If a vector is provided, a reduction is computed for each listed number of dimensions. Default is c(2, 3).

nonlinear_reduction_params

A list of parameters to pass to the nonlinear dimensionality reduction method.

force_nonlinear_reduction

Whether to force nonlinear dimensionality reduction even if the specified reduction is already present in the Seurat object. Default is TRUE.

neighbor_metric

The distance metric to use for finding neighbors. Options are "euclidean", "cosine", "manhattan", or "hamming". Default is "euclidean".

neighbor_k

The number of nearest neighbors to use when building the neighbor graph. Default is 20.

cluster_algorithm

The clustering algorithm to use. Options are "louvain", "slm", or "leiden". Default is "louvain".

cluster_resolution

The resolution parameter to use for clustering. Larger values result in more clusters. Default is 0.6.

model

A string indicating the scVI model to be used. Options are "SCVI" and "PEAKVI". Default is "SCVI".

SCVI_params

A list of parameters for the SCVI model. Default is an empty list.

PEAKVI_params

A list of parameters for the PEAKVI model. Default is an empty list.
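
The sketch below shows what such a parameter list might look like. It assumes the list is forwarded to the scvi-tools `SCVI` model constructor, which this page does not state; `n_latent`, `n_layers`, and `gene_likelihood` are constructor arguments in scvi-tools, and the values here are arbitrary.

```r
# Assumption: these entries are passed on to the scvi-tools SCVI constructor.
# Integer-valued settings use the L suffix so they stay integers rather than
# doubles when handed to Python.
scvi_params <- list(
  n_latent = 20L,
  n_layers = 2L,
  gene_likelihood = "nb"
)

# Quick check that the integer-valued entries really are integers.
vapply(scvi_params[c("n_latent", "n_layers")], is.integer, logical(1))
```

A list like this would be supplied as `SCVI_params = scvi_params`; a PEAKVI run would use `PEAKVI_params` in the same way.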

num_threads

An integer setting the number of threads for scVI. Default is 1.

verbose

Whether to print progress messages. Default is TRUE.

seed

An integer specifying the random seed for reproducibility. Default is 11.