Regression algorithms
Regression is the task of predicting a continuous quantity. QuickML features multiple regression algorithms, including:
AdaBoost Regression
This regression begins by fitting a regressor on the original dataset, followed by fitting additional copies of the regressor on the same dataset. The weights of these instances are adjusted according to the error of the current prediction. As such, subsequent regressors focus more on difficult cases.
AdaBoost is a machine-learning algorithm that builds a series of small, shallow decision trees, adapting each tree to predict the difficult cases missed by the previous trees and combining all trees into a single model.
Boosting in machine learning is a way of combining multiple simple models into a single composite model. This is also why boosting is known as an additive model, since simple models (also known as weak learners) are added one at a time, while keeping existing trees in the model unchanged. As we combine more and more simple models, the complete final model becomes a stronger predictor.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
base_estimator | The base estimator from which the boosted ensemble is built. If None, then the base estimator is DecisionTreeRegressor initialized with max_depth=3. | object | Any regression model | None |
n_estimators (number of estimators) | The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early. | int | [1, 500] | 50 |
learning_rate | Weight applied to each regressor at each boosting iteration. A higher learning rate increases the contribution of each regressor. | float | (0.0, +Inf) | 1.0 |
loss | The loss function to use when updating the weights after each boosting iteration. | string | {‘linear’, ‘square’, ‘exponential’} | "linear" |
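The parameter names above mirror those of scikit-learn's AdaBoostRegressor. As an illustrative sketch outside QuickML (assuming scikit-learn is available), the same settings could be exercised like this:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# Toy data; in practice the dataset would come from your own pipeline.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# With the default base estimator, each boosting stage fits a depth-3 decision tree.
model = AdaBoostRegressor(n_estimators=50, learning_rate=1.0, loss="linear", random_state=0)
model.fit(X, y)
print(model.predict(X[:3]))
```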
CatBoost Regression
CatBoost is based on gradient boosted decision trees. During training, a set of decision trees is built consecutively. Each successive tree is built with reduced loss compared to the previous trees. The number of trees is controlled by the starting parameters.
Its prediction time is much lower than that of many other boosting implementations.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
learning_rate | The learning rate used for training. | float | (0,1] | 0.03 |
l2_leaf_reg (l2_leaf_regularization) | Coefficient at the L2 regularization term of the cost function. | float | [0,+Inf) | 3.0 |
rsm (random subspace method) | The percentage of features to use at each split selection, when features are selected over again at random. | float | (0,1] | None |
loss_function | The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters. | string | {'RMSE', 'MAE', 'Quantile:alpha=value, 'LogLinQuantile: alpha=value', 'Poisson', 'MAPE', 'Lq:q=value', 'SurvivalAft:dist=value; scale=value'} Note : range of value = [0, 1] | 'RMSE' |
nan_mode | The method for processing missing values in the input dataset. | string | {'Forbidden', 'Min', 'Max'} | Min |
leaf_estimation_method | The method used to calculate the values in leaves. | string | {"Newton", "Gradient"} | None |
score_function | The score type used to select the next split during the tree construction. | string | {L2, Cosine} | Cosine |
max_depth | Maximum depth of the tree. | int | [1,+Inf) | None |
n_estimators (number of estimators) | The maximum number of trees that can be built when solving machine learning problems. When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter. | int | [1, 500] | None |
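For illustration, the analogous settings on the open-source CatBoostRegressor (assumed installed; not QuickML's own API) look like this:

```python
from catboost import CatBoostRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# n_estimators is CatBoost's alias for the number of trees (iterations).
model = CatBoostRegressor(
    n_estimators=200,
    learning_rate=0.03,
    l2_leaf_reg=3.0,
    loss_function="RMSE",
    nan_mode="Min",
    verbose=0,
)
model.fit(X, y)
print(model.predict(X[:3]))
```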
Decision-Tree Regression
A decision tree builds classification or regression models in the form of a tree structure. It breaks a dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed. Decision trees can handle both categorical and numerical data. When predicting the output for a new set of feature values, the tree returns a prediction based on the leaf (subset) that those values fall into.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
criterion | The function to measure the quality of a split. | string | {"mse", "friedman_mse", "mae"} | "mse” |
splitter | The strategy used to choose the split at each node. | string | {“best”, “random”} | ”best” |
max_depth | The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. | int | (0, +Inf) | None |
min_samples_split | The minimum number of samples required to split an internal node | int or float | [2, +Inf) or (0, 1.0] | 2 |
min_samples_leaf | The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. | int or float | [1, +Inf) or (0, 0.5] | 1 |
min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. | float | [0, 0.5] | 0 |
max_features | The number of features to consider when looking for the best split | int, float or string | (0, n_features] or { “sqrt”, “log2”}, | None |
max_leaf_nodes | Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. | int | (1, +Inf) | None |
min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
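The parameters above correspond to scikit-learn's DecisionTreeRegressor; a minimal sketch under that assumption:

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# max_depth and min_samples_leaf limit how finely the tree partitions the data.
model = DecisionTreeRegressor(max_depth=4, min_samples_leaf=5, random_state=0)
model.fit(X, y)
print(model.predict(X[:3]))
```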
ElasticNet Regression
Elastic net is a popular form of regularized linear regression that combines two penalties: the L1 (lasso regression) and L2 (ridge regression) penalty functions. Elastic net is an extension of linear regression that adds these regularization penalties to the loss function during training.
Regularization is a technique to prevent the model from over-fitting by adding extra information to it. With regularization, the magnitude of the coefficients is reduced while the number of features stays the same.
Sometimes, lasso regression can introduce a small bias (difference between predicted and actual values) into the model when the prediction depends too heavily on a particular variable. In these cases, elastic net performs better by combining the regularization of both lasso and ridge regression.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
alpha | Constant that multiplies the penalty terms. | float | (0, +Inf) | 1.0 |
l1_ratio | The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2. | float | [0, 1] | 0.5 |
fit_intercept | Whether the intercept should be estimated or not. | bool | True or False | True |
normalize | This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. | bool | True or False | False |
tol (tolerance) | The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol. | float | [0.0, +Inf) | 1e-4 |
warm_start | When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. | bool | True or False | False |
positive | When set to True, it forces the coefficients to be positive. | bool | True or False | False |
selection | If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. | string | {"cyclic", "random"} | "cyclic" |
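As a sketch of how alpha and l1_ratio interact, here is the equivalent scikit-learn ElasticNet estimator (illustrative only; the penalty formula in the comment is scikit-learn's formulation):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=20, noise=5.0, random_state=0)

# Penalty = alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2):
# l1_ratio=0.5 blends the lasso (L1) and ridge (L2) penalties equally.
model = ElasticNet(alpha=1.0, l1_ratio=0.5, fit_intercept=True)
model.fit(X, y)
print(model.coef_)
```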
GB Regression
Gradient-boosting regression calculates the difference between the current prediction and the known correct target value.
This difference is called residual. After obtaining this value, gradient-boosting regression trains a weak model (Decision Tree) that maps features to that residual. This residual predicted by a weak model is added to the existing model input, nudging the model towards the correct target. Repeating this step multiple times improves the overall model prediction.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
loss | Loss function to be optimized. ‘ls’ refers to least squares regression. ‘lad’ (least absolute deviation) is a highly robust loss function solely based on order information of the input variables. ‘huber’ is a combination of the two. ‘quantile’ allows quantile regression (use alpha to specify the quantile). | string | {'ls', 'lad', 'huber', 'quantile'} | ’ls’ |
learning_rate | Learning rate shrinks the contribution of each tree by learning_rate. | float | (0.0, +inf) | 0.1 |
n_estimators (number of estimators) | The number of boosting stages to perform. Gradient boosting is fairly robust to over-fitting so a large number usually results in better performance. | int | [1, 500) | 100 |
criterion | The function to measure the quality of a split. | string | {'friedman_mse', 'mse', 'mae'} | ’friedman_mse’ |
subsample | The fraction of samples to be used for fitting the individual base learners. | float | (0.0, 1.0] | 1.0 |
max_depth | Maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in the tree. | int | (0, +Inf) | None |
min_samples_split | The minimum number of samples required to split an internal node | int or float | [2, +Inf) or (0, 1.0] | 2 |
min_samples_leaf | The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. | int or float | [1, +Inf) or (0, 0.5] | 1 |
min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. | float | [0, 0.5] | 0 |
max_features | The number of features to consider when looking for the best split | int, float or string | (0, n_features] or { “sqrt”, “log2”} | None |
max_leaf_nodes | Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. | int | (1, +Inf) | None |
min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
init | An estimator object that is used to compute the initial predictions. init has to provide fit and predict. If ‘zero’, the initial raw predictions are set to zero. | object | estimator (Regression model except cat boost ) or ‘zero’ | None |
warm_start | When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just erase the previous solution. | bool | True or False | False |
tol (tolerance) | Tolerance for the early stopping. When the loss is not improving by at least tol for n_iter_no_change iterations (if set to a number), the training stops. | float | [0.0, +Inf) | 1e-4 |
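A minimal NumPy/scikit-learn sketch of the residual-fitting idea described above (illustrative only, not QuickML's implementation):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Start from a constant prediction (the mean), then repeatedly fit a small tree
# to the residuals and add its shrunken output to the running prediction.
learning_rate = 0.1
prediction = np.full(len(y), y.mean())
for _ in range(100):
    residual = y - prediction                      # what the current model still gets wrong
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    prediction += learning_rate * tree.predict(X)  # nudge the model towards the target

print("training MSE:", np.mean((y - prediction) ** 2))
```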
KNN Regression
KNN regression works by finding the distances between a query (data instance) and all the examples in the data, selecting the specified number of examples (K) closest to the query, and then predicting the average of the target values of those neighbours.
In other words, it approximates the association between independent variables (input variables) and the continuous outcome (target) by averaging the observations in the same neighbourhood.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
n_neighbors (number of neighbours) | Number of neighbors to use by default for kneighbors queries. | int | [1, n], where n = total number of records in the dataset | 5 |
weights | Weight function used in prediction. | string | {‘uniform’, ‘distance’} | ’uniform’ |
algorithm | Algorithm used to compute the nearest neighbors | string | {‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’} | ’auto’ |
leaf_size | Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem. | int | (1, +Inf) | 30 |
p | Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used. | int | [1,3] | 2 |
metric | Metric to use for distance computation. Default is “minkowski”, which results in the standard Euclidean distance when p = 2. | str | {‘cityblock’, ‘cosine’, 'euclidean', 'l1', 'l2', 'manhattan', 'nan_euclidean', ’minkowski’} | ’minkowski’ |
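These parameters match scikit-learn's KNeighborsRegressor, so a hedged sketch might look like this:

```python
from sklearn.datasets import make_regression
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Each prediction is the average target of the 5 nearest training points
# (Euclidean distance, since metric='minkowski' with p=2).
model = KNeighborsRegressor(n_neighbors=5, weights="uniform", metric="minkowski", p=2)
model.fit(X, y)
print(model.predict(X[:3]))
```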
Kernel-Ridge Regression
This regression fits a flexible curve to the scatter of data points. Kernel values are used to derive weights that predict outputs from given inputs. Kernel regression is a non-parametric technique to estimate the conditional expectation of a random variable; the objective is to find a non-linear relation between a pair of random variables X and Y.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
alpha | Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. | float | [0, +Inf) | 1.0 |
kernel | Kernel mapping used internally. This parameter is directly passed to pairwise_kernel. If kernel is a string, it must be one of the metrics in pairwise. PAIRWISE_KERNEL_FUNCTIONS or “precomputed”. If kernel is “precomputed”, X is assumed to be a kernel matrix. | string | {‘additive_chi2’,'chi2' ‘linear’, ‘poly’, ‘polynomial’, ‘rbf’, ‘laplacian’, ‘sigmoid’, 'cosine’} | ”linear” |
gamma | Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. | float | [0, +Inf) | None |
degree | Degree of the polynomial kernel. | float | [0, +Inf) | 3 |
coef0 | Zero coefficient for polynomial and sigmoid kernels. | float | (-Inf, +Inf) | 1 |
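The parameters listed (alpha, kernel, gamma, degree, coef0) match scikit-learn's KernelRidge, so, under that assumption, a minimal sketch:

```python
from sklearn.datasets import make_regression
from sklearn.kernel_ridge import KernelRidge

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# An RBF kernel lets the model capture a non-linear relation between X and y;
# alpha controls the ridge-style regularization strength.
model = KernelRidge(alpha=1.0, kernel="rbf", gamma=0.1)
model.fit(X, y)
print(model.predict(X[:3]))
```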
LGBM Regression
LGBM works by starting with an initial estimate that is updated using the output of each tree. The learning parameter controls the magnitude of this change in the estimates. It can be used on any data and provides a high degree of accuracy, as it contains many built-in preprocessing steps.
The LightGBM algorithm grows trees leaf-wise (vertically), while most other algorithms grow level-wise. LightGBM chooses the leaf with the largest loss reduction to grow, so when growing the same leaf it can reduce the loss more than a level-wise algorithm.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
boosting_type | Method of Boosting. | string | {‘gbdt’, ‘dart’, ‘goss’} | 'gbdt' |
num_leaves | Maximum tree leaves for base learners. | int | (1, +Inf) | 31 |
max_depth | Maximum tree depth for base learners, <= 0 means no limit. | int | (-Inf, +Inf) | -1 |
learning_rate | Boosting learning rate. | float | (0.0, +Inf) | 0.1 |
n_estimators (number of estimators) | Number of boosted trees to fit. | int | [1, 500] | 100 |
subsample_for_bin | Number of samples for constructing bins. | int | (0, +Inf) | 200000 |
min_split_gain | Minimum loss reduction required to make a further partition on a leaf node of the tree. | float | [0.0, +Inf) | 0.0 |
min_child_weight | Minimum sum of instance weight (Hessian) needed in a child (leaf). | float | [0.0, +Inf) | 1e-3 |
min_child_samples | Minimum number of data needed in a child (leaf). | int | [0, +Inf) | 20 |
subsample | Subsample ratio of the training instance. | float | (0.0, 1.0] | 1.0 |
subsample_freq (subsample_frequency) | Frequency of subsample, <= 0 means no enable. | int | (-Inf, +Inf) | 0 |
colsample_bytree (column sample by tree) | Subsample ratio of columns when constructing each tree. | float | (0.0, 1.0] | 1.0 |
reg_alpha (alpha) | L1 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
reg_lambda (lambda) | L2 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
importance_type | The type of feature importance to be filled into feature_importances_. If ‘split’, result contains numbers of times the feature is used in a model. If ‘gain’, result contains total gains of splits which use the feature. | string | { ‘gain’, 'split'} | 'split' |
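For illustration, the equivalent open-source LGBMRegressor (assumed installed; not QuickML's own API) accepts the same parameter names:

```python
from lightgbm import LGBMRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# num_leaves caps the leaf-wise growth of each tree; max_depth=-1 means no depth limit.
model = LGBMRegressor(
    boosting_type="gbdt",
    num_leaves=31,
    max_depth=-1,
    learning_rate=0.1,
    n_estimators=100,
)
model.fit(X, y)
print(model.predict(X[:3]))
```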
Lasso Regression
Lasso regression is a regularization technique, used over plain regression methods for a more accurate prediction. Lasso regression is a type of linear regression that uses shrinkage: data values are shrunk towards a central point, such as the mean. The lasso procedure encourages simple, sparse models (i.e. models with fewer parameters).
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
alpha | Constant that multiplies the L1 term, controlling regularization strength. alpha must be a non-negative float. | float | (0, +Inf) | 1.0 |
fit_intercept | Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations | bool | True or False | True |
normalize | This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. | bool | True or False | False |
tol (tolerance) | The tolerance for the optimization: if the updates are smaller than tol, the optimization code checks the dual gap for optimality and continues until it is smaller than tol. | float | [0.0, +Inf) | 1e-4 |
warm_start | When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. | bool | True or False | False |
positive | When set to True, forces the coefficients to be positive. | bool | True or False | False |
selection | If set to ‘random’, a random coefficient is updated every iteration rather than looping over features sequentially by default. | string | {"cyclic", "random"} | "cyclic" |
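A short scikit-learn Lasso sketch (illustrative only) that makes the sparsity effect visible:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Only 5 of the 20 features are actually informative.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0)

model = Lasso(alpha=1.0)
model.fit(X, y)

# The L1 penalty drives uninformative coefficients to exactly zero (a sparse model).
print("zero coefficients:", np.sum(model.coef_ == 0))
```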
Linear Regression
Linear regression is a regression model that estimates the linear relationship between the independent variables (inputs) and the dependent variable (target) using a straight line. It is the most basic algorithm for regression problems.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
fit_intercept | Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations. | bool | True or False | True |
normalize | This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. | bool | True or False | False |
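A minimal scikit-learn sketch (illustrative only) showing the fitted line's coefficients and intercept:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=100, n_features=3, noise=5.0, random_state=0)

model = LinearRegression(fit_intercept=True)
model.fit(X, y)

# The fitted model is y = coef_ . x + intercept_.
print(model.coef_, model.intercept_)
```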
Random-Forest Regression
The random forest is a classification and regression algorithm consisting of many decision trees. It uses bagging and feature randomness when building individual trees to try to create an uncorrelated forest of trees whose prediction by committee is more accurate than that of any individual tree.
Bagging is an ensemble meta-estimator that fits base regressors on random subsets of the original dataset and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
n_estimators | The number of trees in the forest. | int | [1, 500] | 100 |
criterion | The function to measure the quality of a split. Supported criteria are "mse" for the mean squared error, which is equal to variance reduction as a feature-selection criterion, and "mae" for the mean absolute error. | string | {"mse", "mae"} | ”mse” |
max_depth | The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. | int | (0, +Inf) | None |
min_samples_split | The minimum number of samples required to split an internal node | int or float | [2, +Inf) or (0, 1.0] | 2 |
min_samples_leaf | The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. | int or float | [1, +Inf) or (0, 0.5] | 1 |
min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Samples have equal weight when sample_weight is not provided. | float | [0, 0.5] | 0.0 |
max_features | The number of features to consider when looking for the best split | int, float or string | (0, n_features] or { “sqrt”, “log2”}, None | None |
max_leaf_nodes | Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. | int | (1, +Inf) | None |
min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
bootstrap | Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree. | bool | True or False | True |
oob_score (out of bag score) | Whether to use out-of-bag samples to estimate the generalization score. Only available if bootstrap=True. | bool | True or False | False |
warm_start | When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest. | bool | True or False | False |
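As a sketch of how bootstrap and oob_score work together, here is the equivalent scikit-learn RandomForestRegressor (illustrative only):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# bootstrap=True draws a random sample per tree; oob_score uses the left-out
# rows of each bootstrap sample to estimate generalization performance.
model = RandomForestRegressor(n_estimators=100, bootstrap=True, oob_score=True, random_state=0)
model.fit(X, y)
print("OOB R^2:", model.oob_score_)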
Ridge Regression
Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated with one another.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
alpha | Constant that multiplies the L2 term, controlling regularization strength. | float | (0, +Inf) | 1.0 |
fit_intercept | Whether to fit the intercept for this model. If set to false, no intercept will be used in calculations. | bool | True or False | True |
normalize | This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. | bool | True or False | False |
tol (tolerance) | Precision of the solution. | float | [0.0, +Inf) | 1e-4 |
solver | Solver to use in the computational routines: | string | {‘auto’, ‘svd’, ‘cholesky’, ‘lsqr’, ‘sparse_cg’, ‘sag’, ‘saga’} | ’auto’ |
- ‘auto’ chooses the solver automatically based on the type of data.
- ‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. It is the most stable solver, in particular more stable for singular matrices than ‘cholesky’ at the cost of being slower.
- ‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution.
- ‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter).
- ‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.
- ‘sag’ uses a Stochastic Average Gradient descent, and ‘saga’ uses its improved, unbiased version named SAGA. Both methods also use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that ‘sag’ and ‘saga’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.
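A brief scikit-learn Ridge sketch (illustrative only) using the parameters above:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

# solver='auto' picks a routine based on the data; alpha scales the L2 penalty.
model = Ridge(alpha=1.0, solver="auto", tol=1e-4)
model.fit(X, y)
print(model.coef_)
```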
SVM Regression
Support vector regression is used to predict continuous values. It uses the same principle as SVMs for classification. The basic idea is to find the best-fit line; in SVR, the best-fit line is the hyperplane that contains the maximum number of points within the epsilon margin.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
C | Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. | float | (0.0, +Inf) | 1.0 |
kernel | Specifies the kernel type to be used in the algorithm. If none is given, rbf will be used. If a callable is given it is used to precompute the kernel matrix. | string | {‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’} | ’rbf’ |
degree | Degree of the polynomial kernel function (‘poly’). | int | [0, +Inf) | 3 |
gamma | Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. | string or float | {‘scale’, ‘auto’} or (0.0, +Inf) | ’scale’ |
coef0 | Independent term in kernel function. It is only significant in ‘poly’ and ‘sigmoid’. | float | (-Inf, +Inf) | 0.0 |
shrinking | Whether to use the shrinking heuristic. | bool | True or False | True |
tol (tolerance) | Tolerance for stopping criterion. | float | [0.0, +Inf) | 1e-3 |
epsilon | Epsilon in the epsilon-SVM model. It specifies the epsilon-tube within which no penalty is associated in the training loss function with points predicted within a distance epsilon from the actual value. | float | [0, +Inf) | 0.1 |
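The parameters above correspond to scikit-learn's SVR; a minimal sketch under that assumption:

```python
from sklearn.datasets import make_regression
from sklearn.svm import SVR

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Points predicted within +/- epsilon of the true value incur no training penalty;
# C trades off model flatness against violations of that epsilon-tube.
model = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma="scale")
model.fit(X, y)
print(model.predict(X[:3]))
```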
XGB Regression
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine-learning algorithms under the gradient-boosting framework. It provides parallel tree boosting to solve many data science problems quickly and accurately. It uses L1 and L2 regularization and trains quickly.
Hyper Parameters:
Parameter | Description | Data Type | Possible Values | Default Values |
---|---|---|---|---|
booster | Decides which booster to use. | string | {‘gbtree', 'gblinear', 'dart' } | ’gbtree’ |
learning_rate | Step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. | float | [0,1] | 0.1 |
n_estimators (number of estimators) | Number of trees to fit. | int | [1, 500] | 100 |
objective | The learning objective (loss function) to be optimized. | string | Listed below the table. | "reg:linear" |
subsample | The fraction of the training samples used for growing each tree. | float | (0,1] | 1 |
max_depth | Maximum depth of a tree. | int | (0, +Inf) | 3 |
max_delta_step | If the value is set to 0, there is no constraint. If it is set to a positive value, it can help make the update step more conservative. This parameter is usually not needed, but it might help in logistic regression when the classes are extremely imbalanced. | int or float | [0, +Inf) | 0 |
colsample_bytree (column sample by tree) | The fraction of columns randomly sampled when constructing each tree. | float | (0, 1] | 1.0 |
colsample_bylevel (column sample by level) | It is the subsample ratio of columns for each level. Subsampling occurs once for every new depth level reached in a tree. Columns are subsampled from the set of columns chosen for the current tree. | float | (0, 1] | 1.0 |
min_child_weight | Minimum sum of instance weights (Hessian) needed in a child. | int | [0, +Inf) | 1 |
reg_alpha (alpha) | L1 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
reg_lambda (lambda) | L2 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
scale_pos_weight (scale positive weight) | Control the balance of positive and negative weights, useful for unbalanced classes. | int | [0, +Inf) | 1 |
Possible values for the “objective” parameter:
{“rank:pairwise”, “reg:tweedie”, “reg:gamma”, “reg:linear”, “count:poisson”}
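For illustration, the equivalent open-source XGBRegressor (assumed installed; not QuickML's own API) can be configured with the same parameter names:

```python
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

# 'reg:squarederror' is the current name of the squared-loss objective that
# older releases (and the table above) call 'reg:linear'.
model = XGBRegressor(
    booster="gbtree",
    objective="reg:squarederror",
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    reg_alpha=0.0,
    reg_lambda=1.0,
)
model.fit(X, y)
print(model.predict(X[:3]))
```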