Computes the regularization path for the specified loss function and penalty function.
Usage
sparse_regression(
  x,
  y,
  penalty = "L0",
  algorithm = c("CD", "CDPSI"),
  regulators_num = ncol(x),
  cross_validation = FALSE,
  n_folds = 5,
  seed = 1,
  loss = "SquaredError",
  nLambda = 100,
  nGamma = 5,
  gammaMax = 10,
  gammaMin = 1e-04,
  partialSort = TRUE,
  maxIters = 200,
  rtol = 1e-06,
  atol = 1e-09,
  activeSet = TRUE,
  activeSetNum = 3,
  maxSwaps = 100,
  scaleDownFactor = 0.8,
  screenSize = 1000,
  autoLambda = NULL,
  lambdaGrid = list(),
  excludeFirstK = 0,
  intercept = TRUE,
  lows = -Inf,
  highs = Inf,
  verbose = TRUE,
  ...
)
Arguments
- x
The matrix of regulators.
- y
The vector of target.
- penalty
The type of regularization, default is "L0". One of "L0", "L0L1", or "L0L2". For high-dimensional and sparse data, "L0L2" is more effective.
- algorithm
The type of algorithm used to minimize the objective function, default is "CD". Currently "CD" and "CDPSI" are supported. The "CDPSI" algorithm may yield better results, but it also increases running time.
- regulators_num
The number of non-zero coefficients, that is, the maximum support size at which to terminate the regularization path. This value affects the final performance.
- cross_validation
Logical value, default is FALSE, whether to use cross-validation.
- n_folds
The number of folds for cross-validation, default is 5.
- seed
The random seed for cross-validation, default is 1.
- loss
The loss function, default is "SquaredError".
- nLambda
The number of Lambda values to select.
- nGamma
The number of Gamma values to select.
- gammaMax
The maximum value of Gamma when using the "L0L2" penalty. For the "L0L1" penalty this is automatically selected.
- gammaMin
The minimum value of Gamma when using the "L0L2" penalty. For the "L0L1" penalty, the minimum value of Gamma in the grid is set to gammaMin * gammaMax. Note that this should be a strictly positive quantity.
- partialSort
If TRUE, partial sorting will be used for sorting the coordinates to do greedy cycling. Otherwise, full sorting is used.
- maxIters
The maximum number of iterations (full cycles) for "CD" per grid point.
- rtol
The relative tolerance which decides when to terminate optimization, based on the relative change in the objective between iterations.
- atol
The absolute tolerance which decides when to terminate optimization, based on the absolute L2 norm of the residuals.
- activeSet
If TRUE, performs active set updates.
- activeSetNum
The number of consecutive times a support should appear before declaring support stabilization.
- maxSwaps
The maximum number of swaps used by "CDPSI" for each grid point.
- scaleDownFactor
This parameter decides how close the selected Lambda values are.
- screenSize
The number of coordinates to cycle over when performing initial correlation screening.
- autoLambda
Ignored parameter. Kept for backwards compatibility.
- lambdaGrid
A grid of Lambda values to use in computing the regularization path.
- excludeFirstK
This parameter takes non-negative integers.
- intercept
If FALSE, no intercept term is included in the model.
- lows
Lower bounds for coefficients.
- highs
Upper bounds for coefficients.
- verbose
Logical value, default is TRUE, whether to print progress messages.
- ...
Parameters for other methods.
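As a rough illustration of how the grid arguments (nGamma, gammaMin, gammaMax, lambdaGrid) might fit together, the sketch below builds a Gamma grid and a matching Lambda grid in base R. It assumes, following the L0Learn documentation rather than anything stated on this page, that Gamma values are spaced geometrically between gammaMax and gammaMin and that lambdaGrid is a list holding one decreasing vector of Lambda values per Gamma value. The helpers make_gamma_grid and make_lambda_grid, and their arguments, are hypothetical and not part of the package.

```r
# Hypothetical helper: geometric grid running from gamma_max down to gamma_min.
# The spacing rule is an assumption, not something this page confirms.
make_gamma_grid <- function(n_gamma = 5, gamma_max = 10, gamma_min = 1e-04) {
  stopifnot(n_gamma >= 1, gamma_min > 0, gamma_max >= gamma_min)
  if (n_gamma == 1) {
    return(gamma_max)
  }
  exp(seq(log(gamma_max), log(gamma_min), length.out = n_gamma))
}

# Hypothetical helper: one decreasing Lambda sequence per Gamma value.
# Each Lambda is the previous one multiplied by `ratio` (0 < ratio < 1).
make_lambda_grid <- function(n_gamma, lambda_max = 1, n_lambda = 10, ratio = 0.8) {
  stopifnot(ratio > 0, ratio < 1)
  lambdas <- lambda_max * ratio^(seq_len(n_lambda) - 1)
  rep(list(lambdas), n_gamma)
}

gamma_grid <- make_gamma_grid()
lambda_grid <- make_lambda_grid(n_gamma = length(gamma_grid))

print(signif(gamma_grid, 3))  # Gamma values, largest first
print(length(lambda_grid))    # one list element per Gamma value
```

If the function accepts the same grid layout as L0Learn, a list built this way could be passed as lambdaGrid = lambda_grid together with nGamma = length(lambda_grid); that compatibility is an assumption and worth checking against the package source.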
References
Hazimeh, Hussein et al. “L0Learn: A Scalable Package for Sparse Learning using L0 Regularization.” J. Mach. Learn. Res. 24 (2022): 205:1-205:8.
Hazimeh, Hussein and Rahul Mazumder. “Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms.” Oper. Res. 68 (2018): 1517-1537.
https://github.com/hazimehh/L0Learn/blob/master/R/fit.R
Examples
data("example_matrix")
fit <- sparse_regression(
  example_matrix[, -1],
  example_matrix[, 1]
)
head(coef(fit))
#> 6 x 9 sparse Matrix of class "dgCMatrix"
#>
#> intercepts 1.985357 2.123864 1.668191 1.664802 1.66545500 1.66245399
#> . . . . . .
#> . . . . . .
#> . . . . 0.01047924 0.01345722
#> . . . . . .
#> . . . . . .
#>
#> intercepts 1.66218384 1.662335607 1.662162452
#> 0.01029233 0.007523953 0.006073601
#> . 0.005339557 0.006290166
#> 0.01271126 0.012662684 0.010752371
#> . -0.002983770 -0.001518502
#> -0.01337606 -0.013514219 -0.013052947
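
The call above relies on the defaults. A minimal sketch of a call with non-default settings follows; it uses only arguments listed in the Usage section, but the simulated data, the variable names, and the chosen values are assumptions made for illustration. The fit is guarded so the snippet still runs in a session where sparse_regression is not loaded.

```r
# Simulated data (an assumption): 100 samples, 20 candidate regulators,
# of which only the first three have a non-zero effect on the target.
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
colnames(x) <- paste0("g", seq_len(ncol(x)))
beta <- c(2, -1.5, 1, rep(0, 17))
y <- as.numeric(x %*% beta + rnorm(100, sd = 0.5))

# Argument names come from the Usage section; the values are illustrative.
args <- list(
  x = x,
  y = y,
  penalty = "L0L2",      # L0 penalty combined with an L2 term
  algorithm = "CDPSI",   # may give better results at a higher running time
  regulators_num = 5,    # stop the path at a support size of 5
  verbose = FALSE
)

# Only run the fit when the function is available in the session.
if (exists("sparse_regression", mode = "function")) {
  fit <- do.call(sparse_regression, args)
  print(head(coef(fit)))
}
```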