ML Algorithms in QuickML
QuickML focuses on powering ML pipelines with Machine Learning Operations, so that pipelines execute smoothly and with little effort. It is therefore integrated with a wide range of ML algorithms and features to get the best analytical results out of data.
ML algorithms are programs that can learn from data and improve from experience, without any external intervention. The following algorithms and operations are all available in QuickML as stages that can be configured in one or more pipeline executions.
The most widely used algorithms in the data science domain are:
 Classification algorithms
 Regression algorithms
Classification Algorithms
Classification is the task of predicting a discrete class label. QuickML features the following classification algorithms:

AdaBoost Classification
AdaBoost is a machine-learning algorithm that builds a series of small, one-step (one-level) decision trees, adapting each tree to predict difficult cases missed by the previous trees and combining all trees into a single model. It begins by fitting a classifier on the original dataset, then fits additional copies of the classifier on the same dataset. The weights of incorrectly classified instances are adjusted according to the error of the current prediction, so that subsequent classifiers focus more on difficult cases.
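The reweighting loop described above can be sketched in plain Python. This is a minimal illustration of classic two-class discrete AdaBoost on a single numeric feature, not QuickML's implementation; the function names `fit_weighted_stump`, `adaboost_fit`, and `adaboost_predict` are hypothetical, and labels are assumed to be -1 or +1.

```python
import math

def fit_weighted_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best one-level tree.

    The stump predicts `polarity` when x <= threshold and `-polarity` otherwise.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if (polarity if x <= threshold else -polarity) != y
            )
            if best is None or error < best[2]:
                best = (threshold, polarity, error)
    return best

def adaboost_fit(xs, ys, n_estimators=3, learning_rate=1.0):
    """Train up to n_estimators stumps; labels must be -1 or +1."""
    n = len(xs)
    weights = [1.0 / n] * n  # every sample starts with the same weight
    ensemble = []
    for _ in range(n_estimators):
        threshold, polarity, error = fit_weighted_stump(xs, ys, weights)
        if error == 0:
            ensemble.append((1.0, threshold, polarity))  # perfect fit: stop early
            break
        if error >= 0.5:
            break  # the stump is no better than chance
        alpha = learning_rate * 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * (polarity if x <= threshold else -polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(
        alpha * (polarity if x <= threshold else -polarity)
        for alpha, threshold, polarity in ensemble
    )
    return 1 if score >= 0 else -1
```

No single stump can separate a pattern such as `[+1, +1, -1, -1, +1, +1]`, but the weighted combination of three stumps can, which is the point of focusing later stumps on the cases earlier ones missed.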
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| base_estimator | The base estimator from which the boosted ensemble is built. If None, then the base estimator is DecisionTreeClassifier initialized with max_depth=1. | object | Any classification model except KNN Classification model | None |
| n_estimators (number of estimators) | The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early. | int | [1, 500] | 50 |
| learning_rate | Weight applied to each classifier at each boosting iteration. A higher learning rate increases the contribution of each classifier. | float | (0.0, +Inf) | 1.0 |
| algorithm | If 'SAMME.R' then use the SAMME.R real boosting algorithm. base_estimator must support calculation of class probabilities. If 'SAMME' then use the SAMME discrete boosting algorithm. The SAMME.R algorithm typically converges faster than SAMME, achieving a lower test error with fewer boosting iterations. | string | {'SAMME', 'SAMME.R'} | 'SAMME.R' |
CatBoost Classification
CatBoost is based on gradient-boosted decision trees. During training, a set of decision trees is built consecutively. Each successive tree is built with reduced loss compared to the previous trees. The number of trees is controlled by the starting parameters.
Prediction with CatBoost typically takes much less time than with the other classifiers.
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| learning_rate | Used for reducing the gradient step. | float | (0, 1] | 0.03 |
| l2_leaf_reg (l2_leaf_regularization) | Coefficient at the L2 regularization term of the cost function. | float | [0, +Inf) | 3.0 |
| rsm (random subspace method) | The percentage of features to use at each split selection, when features are selected over again at random. | float | (0, 1] | None |
| loss_function | The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters. | string | {'Logloss', 'CrossEntropy', 'MultiClass', 'MultiClassOneVsAll'} | 'MultiClass' |
| nan_mode | The method for processing missing values in the input dataset. | string | {'Forbidden', 'Min', 'Max'} | 'Min' |
| leaf_estimation_method | The method used to calculate the values in leaves. | string | {'Newton', 'Gradient'} | None |
| score_function | The score type used to select the next split during the tree construction. | string | {'L2', 'Cosine'} | 'Cosine' |
| max_depth | Maximum depth of the tree. | int | [1, +Inf) | None |
| n_estimators (number of estimators) | The maximum number of trees that can be built when solving machine learning problems. When using other parameters that limit the number of iterations, the final number of trees may be less than the number specified in this parameter. | int | [1, 500] | None |
DecisionTree Classification
Decision tree builds classification or regression models in the form of a tree structure. It breaks down a dataset into smaller and smaller subsets, while at the same time an associated decision tree is incrementally developed.
Decision trees can handle both categorical and numerical data. When predicting the output value for a set of features, the tree predicts the output based on the subset that the set of features falls into.
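How a tree decides where to break a dataset into subsets can be illustrated with the two split-quality measures this stage offers, "gini" and "entropy". The sketch below is an illustration only and assumes a single numeric feature; `gini`, `entropy`, and `best_split` are hypothetical helper names.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure subset, higher for mixed subsets."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: 0.0 for a pure subset."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )

def best_split(values, labels, criterion=gini):
    """Return (threshold, weighted_impurity) of the best `value <= threshold` split."""
    n = len(values)
    best = None
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = len(left) / n * criterion(left) + len(right) / n * criterion(right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

A tree builder would apply `best_split` recursively to each resulting subset until a stopping rule such as max_depth or min_samples_split is reached.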
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| criterion | The function to measure the quality of a split. | string | {'gini', 'entropy'} | 'gini' |
| splitter | The strategy used to choose the split at each node. | string | {'best', 'random'} | 'best' |
| max_depth | The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. | int | (0, +Inf) | None |
| min_samples_split | The minimum number of samples required to split an internal node. | int or float | [2, +Inf) or (0, 1.0] | 2 |
| min_samples_leaf | The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. | int or float | [1, +Inf) or (0, 0.5] | 1 |
| min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. | float | [0, 0.5] | 0 |
| max_features | The number of features to consider when looking for the best split. | int, float or string | (0, n_features] or {'sqrt', 'log2'} | None |
| max_leaf_nodes | Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None, then the number of leaf nodes is unlimited. | int | (1, +Inf) | None |
| min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
GB Classification
Gradient boosting classification calculates the difference between the current prediction and the known correct target value. This difference is called the residual. The gradient boosting classifier then trains a weak model (a decision tree) that maps features to that residual. The residual predicted by the weak model is added to the existing model's prediction, which nudges the model towards the correct target. Repeating this step multiple times improves the overall model prediction.
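The residual-fitting loop can be sketched as follows. To keep the example short it uses plain squared-error residuals on a numeric target and a single feature; a real classifier works on class probabilities instead, so treat this as an illustration of the mechanism only. The names `fit_residual_stump`, `fit_gradient_boosting`, and `gb_predict` are hypothetical, and the data must contain at least two distinct feature values.

```python
def fit_residual_stump(xs, residuals):
    """Weak model: one threshold, predicting the mean residual on each side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_gradient_boosting(xs, ys, n_estimators=50, learning_rate=0.1):
    """Return (base_prediction, stumps, learning_rate)."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_estimators):
        # Residual = known target minus current prediction.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_residual_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Add a shrunken copy of the predicted residual to the current prediction.
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate

def gb_predict(model, x):
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left_mean if x <= threshold else right_mean)
        for threshold, left_mean, right_mean in stumps
    )
```

A smaller learning_rate shrinks each correction, so more estimators are needed to reach the same fit. This is the trade-off between learning_rate and n_estimators.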
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| loss | The loss function to be optimized. 'deviance' refers to deviance (= logistic regression) for classification with probabilistic outputs. | string | {'deviance', 'exponential'} | 'deviance' |
| learning_rate | Learning rate shrinks the contribution of each tree by learning_rate. There is a trade-off between learning_rate and n_estimators. | float | (0.0, +Inf) | 0.1 |
| n_estimators (number of estimators) | The number of boosting stages to perform. | int | [1, 500] | 100 |
| criterion | The function to measure the quality of a split. | string | {'friedman_mse', 'mse', 'mae'} | 'friedman_mse' |
| subsample | The fraction of samples to be used for fitting the individual base learners. | float | (0.0, 1.0] | 1.0 |
| max_depth | The maximum depth of the individual regression estimators. | int | (0, +Inf) | None |
| min_samples_split | The minimum number of samples required to split an internal node. | int or float | [2, +Inf) or (0, 1.0] | 2 |
| min_samples_leaf | The minimum number of samples required to be at a leaf node. | int or float | [1, +Inf) or (0, 0.5] | 1 |
| min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. | float | [0, 0.5] | 0 |
| max_features | The number of features to consider when looking for the best split. | int, float or string | (0, n_features] or {'sqrt', 'log2'} | None |
| max_leaf_nodes | Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. | int | (1, +Inf) | None |
| min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
| init | An estimator object that is used to compute the initial predictions. | object or string | estimator (any classification model except SVM classification and CatBoost) or 'zero' | None |
| warm_start | When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, just erase the previous solution. | bool | True or False | False |
| tol (tolerance) | Tolerance for the early stopping. When the loss is not improving by at least tol for n_iter_no_change iterations (if set to a number), the training stops. | float | [0.0, +Inf) | 1e-4 |
KNN Classification
KNN works by finding the distances between a query (data instance) and all the examples in the data, selecting the specified number of examples (K) closest to the query, and then voting for the most frequent label in the neighbourhood.
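The three steps (measure distances, keep the K closest examples, vote) can be sketched in a few lines. This is an illustration using the Minkowski distance with uniform voting; the function names `minkowski` and `knn_predict` are hypothetical, while `n_neighbors` and `p` mirror the hyperparameters of this stage.

```python
from collections import Counter

def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is Manhattan distance, p=2 is Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_points, train_labels, query, n_neighbors=5, p=2):
    """Label the query by majority vote among its n_neighbors closest examples."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: minkowski(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]
```

With weights set to 'distance', closer neighbours would count for more than distant ones; the sketch above corresponds to 'uniform' voting.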
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| n_neighbors (number of neighbours) | Number of neighbors to use by default for kneighbors queries. | int | [1, n], where n = total number of records in the dataset | 5 |
| weights | Weight function used in prediction. | string | {'uniform', 'distance'} | 'uniform' |
| algorithm | Algorithm used to compute the nearest neighbors. | string | {'auto', 'ball_tree', 'kd_tree', 'brute'} | 'auto' |
| leaf_size | Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. | int | (1, +Inf) | 30 |
| p | Power parameter for the Minkowski metric. | int | [1, 3] | 2 |
| metric | Metric to use for distance computation. Default is 'minkowski', which results in the standard Euclidean distance when p = 2. | string | {'cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan', 'nan_euclidean', 'minkowski'} | 'minkowski' |
LGBM Classification
LGBM works by starting with an initial estimate that is updated using the output of each tree. The learning parameter controls the magnitude of this change in the estimates. It can be used on any data and provides a high degree of accuracy, as it contains many built-in preprocessing steps.
The LightGBM algorithm grows vertically, meaning it grows leaf-wise, while other algorithms grow level-wise. LightGBM chooses the leaf with the largest loss reduction to grow. For the same number of leaves, it can reduce loss more than a level-wise algorithm.
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| boosting_type | Method of boosting. | string | {'gbdt', 'dart', 'goss'} | 'gbdt' |
| num_leaves | Maximum tree leaves for base learners. | int | (1, +Inf) | 31 |
| max_depth | Maximum tree depth for base learners; <= 0 means no limit. | int | (-Inf, +Inf) | -1 |
| learning_rate | Boosting learning rate. You can use the callbacks parameter of the fit method to shrink/adapt the learning rate in training using the reset_parameter callback. | float | (0.0, +Inf) | 0.1 |
| n_estimators (number of estimators) | Number of boosted trees to fit. | int | [1, 500] | 100 |
| subsample_for_bin | Number of samples for constructing bins. | int | (0, +Inf) | 200000 |
| min_split_gain | Minimum loss reduction required to make a further partition on a leaf node of the tree. | float | [0.0, +Inf) | 0.0 |
| min_child_weight | Minimum sum of instance weight (Hessian) needed in a child (leaf). | float | [0.0, +Inf) | 1e-3 |
| min_child_samples | Minimum number of data needed in a child (leaf). | int | [0, +Inf) | 20 |
| subsample | Subsample ratio of the training instance. | float | (0.0, 1.0] | 1.0 |
| subsample_freq (subsample_frequency) | Frequency of subsample; <= 0 means subsampling is not enabled. | int | (-Inf, +Inf) | 0 |
| colsample_bytree (column sample by tree) | Subsample ratio of columns when constructing each tree. | float | (0.0, 1.0] | 1.0 |
| reg_alpha (alpha) | L1 regularization term on weights. | float | (0.0, +Inf) | 0.0 |
| reg_lambda (lambda) | L2 regularization term on weights. | float | (0.0, +Inf) | 0.0 |
| importance_type | The type of feature importance to be filled into feature_importances_. If 'split', the result contains the number of times the feature is used in a model. If 'gain', the result contains the total gains of splits which use the feature. | string | {'gain', 'split'} | 'split' |
Logistic Regression
When the target is a binary value, logistic regression can be used. It maps the predicted value to a probability between 0 and 1.
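The mapping to the range between 0 and 1 is done by the logistic (sigmoid) function applied to a weighted sum of the features. A minimal sketch, assuming the weights and intercept have already been learned; the names `sigmoid`, `predict_proba`, and `logistic_predict` are illustrative and the 0.5 cut-off is an assumption:

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, intercept=0.0):
    """Probability of the positive class for one record."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def logistic_predict(features, weights, intercept=0.0, threshold=0.5):
    """Binary label: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, intercept) >= threshold else 0
```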
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| penalty | Specify the norm of the penalty. 'none': no penalty is added; 'l2': add an L2 penalty term (the default choice); 'l1': add an L1 penalty term; 'elasticnet': both L1 and L2 penalty terms are added. | string | {'l1', 'l2', 'elasticnet', 'none'} | 'l2' |
| dual | Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. | bool | True or False | False |
| tol (tolerance) | Tolerance for stopping criteria. | float | [0.0, +Inf) | 1e-4 |
| C | Inverse of regularization strength; must be a positive float. | float | [0.0, +Inf) | 1.0 |
| solver | Algorithm to use in the optimization problem. | string | {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'} | 'lbfgs' |
| fit_intercept | Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function. | bool | True or False | True |
| l1_ratio | The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. | float | [0, 1] | None |
| multi_class | If the option chosen is 'ovr', then a binary problem is fit for each label. For 'multinomial' the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. | string | {'auto', 'ovr', 'multinomial'} | 'auto' |
| intercept_scaling | Useful only when the solver 'liblinear' is used and self.fit_intercept is set to True. The intercept becomes intercept_scaling * synthetic_feature_weight. | float | (0, +Inf) | 1.0 |

Note: Each value of the "solver" parameter supports only some of the values of the "penalty" parameter. The penalties supported by each solver are listed below:
 'newton-cg' - ['l2', 'none']
 'lbfgs' - ['l2', 'none']
 'liblinear' - ['l1', 'l2']
 'sag' - ['l2', 'none']
 'saga' - ['elasticnet', 'l1', 'l2', 'none']
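The solver and penalty combinations listed in the note can be checked before a stage is configured. The helper below is a hypothetical sketch, not part of QuickML; it encodes only the combinations given in the note.

```python
# Penalties supported by each solver, as listed in the note above.
SUPPORTED_PENALTIES = {
    "newton-cg": {"l2", "none"},
    "lbfgs": {"l2", "none"},
    "liblinear": {"l1", "l2"},
    "sag": {"l2", "none"},
    "saga": {"elasticnet", "l1", "l2", "none"},
}

def check_solver_penalty(solver, penalty):
    """Raise ValueError if the solver is unknown or does not support the penalty."""
    if solver not in SUPPORTED_PENALTIES:
        raise ValueError(f"unknown solver: {solver!r}")
    if penalty not in SUPPORTED_PENALTIES[solver]:
        allowed = sorted(SUPPORTED_PENALTIES[solver])
        raise ValueError(
            f"solver {solver!r} does not support penalty {penalty!r}; "
            f"supported penalties: {allowed}"
        )
```

For example, `check_solver_penalty("saga", "elasticnet")` passes silently, while `check_solver_penalty("lbfgs", "l1")` raises a ValueError.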

Naive Bayes Classification
Naive Bayes is a classifier based on Bayes' theorem. It predicts membership probabilities for each class, that is, the probability that a given record or data point belongs to a particular class. The class with the highest probability is considered the most likely class.
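The membership probabilities follow from Bayes' theorem: each class prior is multiplied by the likelihood of every observed feature value, and the results are normalised to sum to 1. A minimal sketch with made-up numbers; the function name `class_posteriors` and the input layout are assumptions for illustration.

```python
def class_posteriors(priors, likelihoods):
    """Return {class: P(class | features)}.

    priors:      {class: P(class)}
    likelihoods: {class: [P(feature_i | class), ...]} for the observed features
    """
    unnormalized = {}
    for label, prior in priors.items():
        score = prior
        for probability in likelihoods[label]:
            # "Naive" assumption: features are independent given the class.
            score *= probability
        unnormalized[label] = score
    total = sum(unnormalized.values())
    return {label: score / total for label, score in unnormalized.items()}

posteriors = class_posteriors(
    priors={"spam": 0.4, "ham": 0.6},
    likelihoods={"spam": [0.8, 0.5], "ham": [0.1, 0.5]},
)
most_likely = max(posteriors, key=posteriors.get)  # class with the highest probability
```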

RandomForest Classification
Random forest is a classification algorithm consisting of many decision trees. It uses bagging and feature randomness when building individual trees to try to create an uncorrelated forest of trees whose prediction by committee is more accurate than that of any individual tree.
Bagging is an ensemble meta-estimator that fits base classifiers/regressors on random subsets of the original dataset, then aggregates their individual predictions (either by voting or by averaging) to form a final prediction.
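Bagging itself can be sketched independently of decision trees. The example below trains a deliberately simple threshold classifier on each bootstrap sample and aggregates by majority vote. It leaves out the per-split feature randomness that a random forest adds, and all names (`bootstrap_sample`, `fit_threshold_model`, `fit_bagging`, `bagging_predict`) are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_threshold_model(sample):
    """Toy base classifier for (value, label) rows with labels 0 and 1.

    Assumes class 1 tends to have larger values than class 0.
    """
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    if not zeros or not ones:
        constant = 1 if ones else 0  # sample contained a single class
        return lambda x: constant
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0

def fit_bagging(rows, n_estimators=25, seed=0):
    """Fit one base classifier per bootstrap sample."""
    rng = random.Random(seed)
    return [
        fit_threshold_model(bootstrap_sample(rows, rng))
        for _ in range(n_estimators)
    ]

def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

An odd number of estimators avoids tied votes in a two-class problem.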
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| n_estimators (number of estimators) | The number of trees in the forest. | int | [1, 500] | 100 |
| criterion | The function to measure the quality of a split. | string | {'gini', 'entropy'} | 'gini' |
| max_depth | The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. | int | (0, +Inf) | None |
| min_samples_split | The minimum number of samples required to split an internal node. | int or float | [2, +Inf) or (0, 1.0] | 2 |
| min_samples_leaf | The minimum number of samples required to be at a leaf node. A split point at any depth will only be considered if it leaves at least min_samples_leaf training samples in each of the left and right branches. | int or float | [1, +Inf) or (0, 0.5] | 1 |
| min_weight_fraction_leaf | The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. | float | [0, 0.5] | 0.0 |
| max_features | The number of features to consider when looking for the best split. | int, float or string | (0, n_features] or {'sqrt'} | None |
| max_leaf_nodes | Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. | int | (1, +Inf) | None |
| min_impurity_decrease | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | float | [0, +Inf) | 0.0 |
| bootstrap | Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree. | bool | True or False | True |
| oob_score (out of bag score) | Whether to use out-of-bag samples to estimate the generalization score. Only available if bootstrap=True. | bool | True or False | False |
| warm_start | When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, just fit a whole new forest. | bool | True or False | False |
SVM Classification
SVM, or Support Vector Machine, is a linear model for classification and regression problems. It can solve linear and nonlinear problems and works well for many practical problems. The idea of SVM is simple: the algorithm creates a line or a hyperplane that separates the data into classes.
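Once the separating hyperplane is known, classifying a point only requires checking which side of it the point falls on. The sketch below assumes a linear kernel and already-learned `weights` and `bias`; it does not show training, and the function names are illustrative.

```python
def decision_function(x, weights, bias):
    """Signed score of x relative to the hyperplane w . x + b = 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def svm_classify(x, weights, bias):
    """+1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_function(x, weights, bias) >= 0 else -1
```

With the non-linear kernels ('poly', 'rbf', 'sigmoid') the same idea applies, but the hyperplane lives in a transformed feature space rather than in the original one.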
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| C | Regularization parameter. The strength of the regularization is inversely proportional to C. | float | [0.0, +Inf) | 1.0 |
| kernel | Specifies the kernel type to be used in the algorithm. If none is given, 'rbf' will be used. | string | {'linear', 'poly', 'rbf', 'sigmoid'} | 'rbf' |
| degree | Degree of the polynomial kernel function ('poly'). Ignored by all other kernels. | int | [0, +Inf) | 3 |
| gamma | Kernel coefficient for 'rbf', 'poly' and 'sigmoid'. | string or float | {'scale', 'auto'} or (0.0, +Inf) | 'scale' |
| coef0 | Independent term in kernel function. It is only significant in 'poly' and 'sigmoid'. | float | (-Inf, +Inf) | 0.0 |
| shrinking | Whether to use the shrinking heuristic. | bool | True or False | True |
| probability | Whether to enable probability estimates. | bool | True or False | False |
| tol (tolerance) | Tolerance for stopping criterion. | float | [0.0, +Inf) | 1e-3 |
| decision_function_shape | Whether to return a one-vs-rest ('ovr') decision function of shape (n_samples, n_classes) as all other classifiers, or the original one-vs-one ('ovo') decision function of libsvm which has shape (n_samples, n_classes * (n_classes - 1) / 2). | string | {'ovo', 'ovr'} | 'ovr' |
| break_ties | If True, decision_function_shape='ovr', and number of classes > 2, predict will break ties according to the confidence values of decision_function; otherwise the first class among the tied classes is returned. | bool | True or False | False |
XGB Classification
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine-learning algorithms under the Gradient Boosting framework. It provides parallel tree boosting to solve many data science problems quickly and accurately. It uses L1 and L2 regularisation to reduce overfitting, and it is fast to train.
Each model has fit (to train the model), predict (to predict new data), get metrics (to get the model's accuracy and other metrics), and feature_importances (importances of the input features for the prediction).
The basic working principles of AdaBoost, CatBoost, Decision Tree, Gradient Boosting (GB), LGBM, RandomForest, SVM, and XGB are almost identical for both regression and classification.
Hyper Parameters:
| Parameter | Description | Data Type | Possible Values | Default Values |
| --- | --- | --- | --- | --- |
| booster | Decides which booster to use. | string | {'gbtree', 'gblinear', 'dart'} | 'gbtree' |
| learning_rate | Step size shrinkage used in update to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. | float | [0, 1] | 0.1 |
| n_estimators (number of estimators) | Number of trees to fit. | int | [1, 500] | 100 |
| objective | Logistic regression for binary classification. | string | Listed below the table | 'binary:logistic' |
| subsample | Controls the proportion of samples used. | int | (0, 1] | 1 |
| max_depth | Maximum depth of a tree. | int | (0, +Inf) | 3 |
| max_delta_step | If the value is set to 0, it means there is no constraint. If it is set to a positive value, it can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when the classes are extremely imbalanced. | int or float | [0, +Inf) | 0 |
| colsample_bytree (column sample by tree) | Fraction of columns randomly sampled for each tree. | float | (0.0, 1.0] | 1.0 |
| colsample_bylevel (column sample by level) | The subsample ratio of columns for each level. Subsampling occurs once for every new depth level reached in a tree. Columns are subsampled from the set of columns chosen for the current tree. | float | (0.0, 1.0] | 1.0 |
| min_child_weight | Minimum sum of weights. | int | [0, +Inf) | 1 |
| reg_alpha (alpha) | L1 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
| reg_lambda (lambda) | L2 regularization term on weights. | float | [0.0, +Inf) | 0.0 |
| scale_pos_weight (scale positive weight) | Controls the balance of positive and negative weights; useful for unbalanced classes. | int | [0, +Inf) | 1 |

Possible values for the "objective" parameter:
{binary:logistic, binary:logitraw, binary:hinge, multi:softmax, multi:softprob}
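The documented ranges lend themselves to a quick sanity check before a stage is configured. The helper below is a hypothetical sketch, not a QuickML API; it covers only a few XGB parameters, with ranges and objective values copied from the table and list above.

```python
INF = float("inf")

# (low, high, low_inclusive, high_inclusive), copied from the XGB table above.
XGB_RANGES = {
    "learning_rate": (0.0, 1.0, True, True),   # [0, 1]
    "n_estimators": (1, 500, True, True),      # [1, 500]
    "subsample": (0.0, 1.0, False, True),      # (0, 1]
    "max_depth": (0, INF, False, False),       # (0, +Inf)
}

XGB_OBJECTIVES = {
    "binary:logistic", "binary:logitraw", "binary:hinge",
    "multi:softmax", "multi:softprob",
}

def validate_xgb_params(params):
    """Return a list of problems; an empty list means every checked value is valid."""
    problems = []
    for name, value in params.items():
        if name == "objective":
            if value not in XGB_OBJECTIVES:
                problems.append(f"objective={value!r} is not a documented value")
            continue
        if name not in XGB_RANGES:
            continue  # parameter not covered by this sketch
        low, high, low_inclusive, high_inclusive = XGB_RANGES[name]
        above = value >= low if low_inclusive else value > low
        below = value <= high if high_inclusive else value < high
        if not (above and below):
            problems.append(f"{name}={value!r} is outside the documented range")
    return problems
```

The same pattern extends to the other algorithms by adding their ranges from the corresponding tables.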
Last Updated 2023-10-09 18:18:15 +0530