Run GRNBoost2-like regulatory network inference and return a standardized adjacency table with columns `TF`, `target`, and `importance`.

Usage

RunGRNBoost2(object, ...)

# S3 method for class 'Seurat'
RunGRNBoost2(
  object,
  assay = NULL,
  layer = "counts",
  regulators = NULL,
  targets = NULL,
  backend = c("cpp", "python"),
  max_edges_per_target = Inf,
  n_rounds = 5000,
  learning_rate = 0.01,
  max_depth = 3,
  max_features = 0.1,
  subsample = 0.9,
  early_stop_window_length = 25,
  exclude_self = TRUE,
  importance_norm_power = 0,
  correlation_fill = FALSE,
  boost_weight = 0.5,
  correlation_weight = 0.8,
  covariance_weight = 0.4,
  correlation_method = c("pearson", "spearman"),
  output_file = NULL,
  work_dir = tempdir(),
  prefix = "grnboost2",
  envname = "scenic_env",
  conda = "auto",
  prepare_env = FALSE,
  cores = 1,
  seed = 1234,
  force = FALSE,
  verbose = TRUE,
  ...
)

# S3 method for class 'matrix'
RunGRNBoost2(object, ...)

# Default S3 method
RunGRNBoost2(
  object,
  regulators = NULL,
  targets = NULL,
  genes_in = c("rows", "columns"),
  backend = c("cpp", "python"),
  max_edges_per_target = Inf,
  n_rounds = 5000,
  learning_rate = 0.01,
  max_depth = 3,
  max_features = 0.1,
  subsample = 0.9,
  early_stop_window_length = 25,
  exclude_self = TRUE,
  importance_norm_power = 0,
  correlation_fill = FALSE,
  boost_weight = 0.5,
  correlation_weight = 0.8,
  covariance_weight = 0.4,
  correlation_method = c("pearson", "spearman"),
  seed = 1234,
  output_file = NULL,
  work_dir = tempdir(),
  prefix = "grnboost2",
  envname = "scenic_env",
  conda = "auto",
  prepare_env = FALSE,
  cores = 1,
  force = FALSE,
  verbose = TRUE,
  ...
)

Arguments

object

A Seurat object or expression matrix.

...

Additional backend-specific arguments.

assay

Assay used when `object` is a Seurat object.

layer

Assay layer used when `object` is a Seurat object.

regulators

Candidate transcription factor genes.

targets

Optional target genes. If `NULL`, all genes in the GRN matrix are considered as candidate targets.

backend

Runtime backend. Supports `"cpp"` and `"python"`.

max_edges_per_target

Maximum incoming regulator edges retained per target. The default `Inf` keeps all positive-importance links, matching arboreto GRNBoost2 output.
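
The per-target cap can be pictured as a filter on the adjacency table. The following is a minimal base R sketch, not the package's implementation: `keep_top_edges()` and the toy `adj` table are hypothetical, and it assumes that edges are ranked by `importance` within each target.

```r
# Hypothetical adjacency table in the documented TF/target/importance layout.
adj <- data.frame(
  TF = c("TF1", "TF2", "TF3", "TF1", "TF2"),
  target = c("G1", "G1", "G1", "G2", "G2"),
  importance = c(0.5, 2.0, 1.0, 0.3, 0.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep positive-importance edges, then at most the
# k strongest regulators per target (k = Inf keeps all of them).
keep_top_edges <- function(adj, k = Inf) {
  adj <- adj[adj$importance > 0, , drop = FALSE]
  # Sort by target, then by decreasing importance within each target.
  adj <- adj[order(adj$target, -adj$importance), , drop = FALSE]
  # Rank of each edge within its target (1 = strongest).
  rank_in_target <- ave(seq_len(nrow(adj)), adj$target, FUN = seq_along)
  out <- adj[rank_in_target <= k, , drop = FALSE]
  rownames(out) <- NULL
  out
}

top2 <- keep_top_edges(adj, k = 2)
all_edges <- keep_top_edges(adj, k = Inf)
```

With `k = Inf` only the zero-importance edge is dropped, which mirrors the documented default of keeping all positive-importance links.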

n_rounds

Number of boosting rounds for GRNBoost2-like tree ensemble inference. The default follows arboreto `SGBM_KWARGS`.

learning_rate

GRNBoost2-like tree ensemble learning rate.

max_depth

Maximum depth of each regression tree.

max_features

Fraction of candidate regulators sampled at each tree split.

subsample

Fraction of cells sampled for each boosting round.

early_stop_window_length

Out-of-bag improvement window used for GRNBoost2 early stopping.
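
To illustrate how a windowed early-stopping rule might behave, here is a small base R sketch. It assumes a rule modeled on arboreto's early-stop monitor (stop once the mean out-of-bag improvement over the most recent window is negative); the native backend may apply a different criterion. `should_stop()` and the toy `oob` trace are hypothetical.

```r
# Assumed rule: once at least `window_length` rounds have run, stop when the
# mean out-of-bag improvement over the most recent window is negative.
should_stop <- function(oob_improvement, window_length = 25) {
  n <- length(oob_improvement)
  if (n < window_length) {
    return(FALSE)
  }
  recent <- oob_improvement[(n - window_length + 1):n]
  mean(recent) < 0
}

# Toy trace: improvements shrink over ten rounds and then turn negative.
oob <- c(seq(1, 0.1, length.out = 10), rep(-0.5, 10))

# Replay the trace round by round and record where the rule first fires.
stop_round <- NA_integer_
for (i in seq_along(oob)) {
  if (should_stop(oob[seq_len(i)], window_length = 5)) {
    stop_round <- i
    break
  }
}
```

A longer window makes the rule more tolerant of short runs of negative improvement, at the cost of more boosting rounds per target.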

exclude_self

Whether native GRNBoost2-like inference excludes a target gene from its own regulator feature set.

importance_norm_power

Power used to normalize native GRNBoost2-like edge importance by the total importance for each target. Set to `0` to disable normalization.
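
One plausible reading of this normalization, shown as a sketch only: each edge importance is divided by the per-target total raised to `importance_norm_power`. The exact formula used by the package is an assumption here, and `normalize_importance()` is a hypothetical helper.

```r
# Hypothetical adjacency table with two targets.
adj <- data.frame(
  TF = c("TF1", "TF2", "TF1", "TF2"),
  target = c("G1", "G1", "G2", "G2"),
  importance = c(2, 6, 1, 3),
  stringsAsFactors = FALSE
)

# Assumed formula: importance / (sum of importance for the target)^power.
# With power = 0 the divisor is 1, so the values are left unchanged.
normalize_importance <- function(adj, power = 0) {
  total <- ave(adj$importance, adj$target, FUN = sum)
  adj$importance <- adj$importance / total^power
  adj
}

unchanged <- normalize_importance(adj, power = 0)
scaled <- normalize_importance(adj, power = 1)
```

Under this assumption, `power = 1` rescales the importances so that they sum to 1 for each target, which makes edges comparable across targets with very different total importance.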

correlation_fill

Whether native GRNBoost2-like inference should fill missing TF-target candidates with expression-correlation support before applying `max_edges_per_target`.

boost_weight

Weight of the native boosting score used when correlation fill is enabled.

correlation_weight

Weight of the correlation-fill score relative to native boosting importance.

covariance_weight

Weight of the covariance score used when correlation fill is enabled.

correlation_method

Correlation method used for native GRNBoost2-like candidate filling.
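
The two correlation methods correspond to the options of base R's `cor()`. The sketch below only shows how regulator-to-target correlations differ by method on simulated data; how the package combines them with `boost_weight`, `correlation_weight`, and `covariance_weight` is not specified here, so no combined score is attempted. All object names are hypothetical.

```r
set.seed(42)
n_cells <- 50

# Simulated cells x genes expression: G1 tracks TF1 closely, G2 is unrelated.
tf <- rnorm(n_cells)
expr <- cbind(
  TF1 = tf,
  G1 = 2 * tf + rnorm(n_cells, sd = 0.1),
  G2 = rnorm(n_cells)
)

# Correlation of the regulator with each candidate target, per method.
pearson <- cor(expr[, "TF1"], expr[, c("G1", "G2")], method = "pearson")
spearman <- cor(expr[, "TF1"], expr[, c("G1", "G2")], method = "spearman")
```

Spearman correlation works on ranks, so it is less sensitive to outlier cells than Pearson, which may matter for sparse count data.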

output_file

Optional path where the adjacency table is written.

work_dir

Working directory used by Python backends.

prefix

Prefix for temporary Python backend files.

envname

Python environment used by Python backends.

conda

Conda-compatible executable used by Python backends.

prepare_env

Whether to prepare the Python environment before running Python backends.

cores

Number of parallel workers used by the native or Python GRNBoost2 backend.

seed

Random seed passed to supported backends.

force

Whether to rebuild `output_file` if it already exists.

verbose

Whether to print progress messages.

genes_in

Matrix orientation for matrix inputs. `"rows"` means genes x cells; `"columns"` means cells x genes.
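
The two orientations can be illustrated with a small base R matrix. This is a sketch of the documented meaning of `genes_in`, not the package's internal code; `as_genes_by_cells()` is a hypothetical helper.

```r
set.seed(1)
genes <- c("GeneA", "GeneB", "GeneC")
cells <- paste0("cell", 1:4)

# genes x cells: one row per gene, one column per cell (genes_in = "rows").
genes_by_cells <- matrix(
  rpois(12, lambda = 5),
  nrow = 3,
  dimnames = list(genes, cells)
)

# cells x genes: the transpose (genes_in = "columns").
cells_by_genes <- t(genes_by_cells)

# Hypothetical helper that brings either orientation to genes x cells.
as_genes_by_cells <- function(x, genes_in = c("rows", "columns")) {
  genes_in <- match.arg(genes_in)
  if (genes_in == "columns") t(x) else x
}

from_rows <- as_genes_by_cells(genes_by_cells, genes_in = "rows")
from_columns <- as_genes_by_cells(cells_by_genes, genes_in = "columns")
```

Checking `rownames()` against the expected gene names before running inference is a cheap way to catch a wrong `genes_in` setting.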

Value

A data frame with columns `TF`, `target`, and `importance`.
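
Because the result is a plain data frame, it can be post-processed with base R. The sketch below uses a hand-made table in the documented `TF`, `target`, `importance` layout rather than real output; the values are invented for illustration.

```r
# Hypothetical adjacency table standing in for the returned data frame.
adj <- data.frame(
  TF = c("TF1", "TF2", "TF1", "TF3"),
  target = c("G1", "G1", "G2", "G2"),
  importance = c(3.2, 0.7, 1.5, 2.4),
  stringsAsFactors = FALSE
)

# Rank edges from strongest to weakest.
ranked <- adj[order(-adj$importance), , drop = FALSE]
rownames(ranked) <- NULL

# Group targets by regulator, strongest edges first within each group.
targets_by_tf <- split(ranked$target, ranked$TF)
```

The list returned by `split()` gives one character vector of targets per regulator, which is a convenient starting point for building candidate regulons.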