The scVI integration function

Usage

scVI_integrate(
  srt_merge = NULL,
  batch = NULL,
  append = TRUE,
  srt_list = NULL,
  assay = NULL,
  do_normalization = NULL,
  normalization_method = "LogNormalize",
  do_HVF_finding = TRUE,
  HVF_source = "separate",
  HVF_method = "vst",
  nHVF = 2000,
  HVF_min_intersection = 1,
  HVF = NULL,
  scVI_dims_use = NULL,
  nonlinear_reduction = "umap",
  nonlinear_reduction_dims = c(2, 3),
  nonlinear_reduction_params = list(),
  force_nonlinear_reduction = TRUE,
  neighbor_metric = "euclidean",
  neighbor_k = 20L,
  cluster_algorithm = "louvain",
  cluster_resolution = 0.6,
  model = "SCVI",
  SCVI_params = list(),
  PEAKVI_params = list(),
  num_threads = 1,
  verbose = TRUE,
  seed = 11
)
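
A minimal sketch of a call, overriding only a few of the defaults shown above. It assumes that `scVI_integrate` is exported by the SCP package, that a merged Seurat object named `srt` already exists in the session, and that its metadata has a column named "batch". None of those three names comes from this page. The call is guarded so the snippet is harmless when those assumptions do not hold.

```r
# Hypothetical inputs: `srt` (a merged Seurat object) and its metadata
# column "batch" are illustrative names, not taken from this page.
args <- list(
  batch = "batch",          # name of the batch variable in the metadata
  model = "SCVI",           # or "PEAKVI"
  nHVF = 2000,              # number of highly variable features
  cluster_resolution = 0.6,
  num_threads = 1,
  seed = 11
)

# Assumption: the function lives in the SCP package. Run the integration
# only if the package is installed and `srt` is defined.
if (requireNamespace("SCP", quietly = TRUE) && exists("srt")) {
  srt <- do.call(SCP::scVI_integrate, c(list(srt_merge = srt), args))
}
```

Collecting the arguments in a list first makes it easy to reuse the same settings across several runs, for example when comparing "SCVI" with "PEAKVI".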

Arguments

srt_merge: A merged `Seurat` object that includes the batch information.
batch: A character string specifying the batch variable name.
append: Whether the integrated data will be appended to the original Seurat object (srt_merge). Default is TRUE.
srt_list: A list of Seurat objects to be checked and preprocessed.
assay: Which assay to use. If NULL, the default assay of the Seurat object will be used.
do_normalization: Whether data normalization should be performed. Default is NULL.
normalization_method: The normalization method to be used. Possible values are "LogNormalize", "SCT", and "TFIDF". Default is "LogNormalize".
do_HVF_finding: Whether to find highly variable features (HVF). If TRUE, the function will find the HVF using the specified HVF method. Default is TRUE.
HVF_source: The source of highly variable features. Possible values are "global" and "separate". Default is "separate".
HVF_method: The method to use for finding highly variable features. Options are "vst", "mvp", or "disp". Default is "vst".
nHVF: The number of highly variable features to select. If NULL, all highly variable features will be used. Default is 2000.
HVF_min_intersection: The minimum number of batches in which a feature must be identified as highly variable in order to be kept as a highly variable feature. Default is 1.
HVF: A vector of feature names to use as highly variable features. If NULL, the function will use the highly variable features identified by the HVF method.
scVI_dims_use: A vector specifying the dimensions returned by scVI that will be utilized for downstream cell cluster finding and non-linear reduction. If set to NULL, all the returned dimensions will be used by default.
nonlinear_reduction: The nonlinear dimensionality reduction method to use. Options are "umap", "umap-naive", "tsne", "dm", "phate", "pacmap", "trimap", "largevis", or "fr". Default is "umap".
nonlinear_reduction_dims: The number of dimensions to keep after nonlinear dimensionality reduction. If a vector is provided, different numbers of dimensions can be specified for each method. Default is c(2, 3).
nonlinear_reduction_params: A list of parameters to pass to the nonlinear dimensionality reduction method.
force_nonlinear_reduction: Whether to force nonlinear dimensionality reduction even if the specified reduction is already present in the Seurat object. Default is TRUE.
neighbor_metric: The distance metric to use for finding neighbors. Options are "euclidean", "cosine", "manhattan", or "hamming". Default is "euclidean".
neighbor_k: The number of nearest neighbors to find for each cell. Default is 20.
cluster_algorithm: The clustering algorithm to use. Options are "louvain", "slm", or "leiden". Default is "louvain".
cluster_resolution: The resolution parameter to use for clustering. Larger values result in more clusters. Default is 0.6.
model: A string indicating the scVI model to be used. Options are "SCVI" and "PEAKVI". Default is `"SCVI"`.
SCVI_params: A list of parameters for the SCVI model. Default is `list()`.
PEAKVI_params: A list of parameters for the PEAKVI model. Default is `list()`.
num_threads: An integer setting the number of threads for scVI. Default is `1`.
verbose: Whether to print progress messages. Default is TRUE.
seed: Random seed for reproducibility. Default is 11.
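
The rule described for HVF_min_intersection can be illustrated with a small base-R sketch. This is not the function's actual implementation. It only shows the idea of counting, per feature, the number of batches in which it was selected and keeping features that reach the threshold. The gene names, the `hvf_per_batch` list, and the `select_hvf` helper are all hypothetical.

```r
# Hypothetical per-batch HVF lists (gene names invented for illustration).
hvf_per_batch <- list(
  batch1 = c("GeneA", "GeneB", "GeneC"),
  batch2 = c("GeneB", "GeneC", "GeneD"),
  batch3 = c("GeneC", "GeneD", "GeneE")
)

# Count in how many batches each feature was selected as highly variable.
hvf_counts <- table(unlist(lapply(hvf_per_batch, unique)))

# Keep features selected in at least `min_intersection` batches.
select_hvf <- function(counts, min_intersection = 1) {
  names(counts)[as.vector(counts) >= min_intersection]
}

select_hvf(hvf_counts, 1)  # every feature seen in any batch
select_hvf(hvf_counts, 2)  # features shared by at least two batches
select_hvf(hvf_counts, 3)  # features shared by all three batches
```

With the default of 1, any feature selected in at least one batch is eligible. Raising the value restricts the set to features that are variable in several batches.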