Create an ML pipeline

In this section, we will be building a prediction ML model using the pre-processed dataset from the previous section. This dataset will be the input to the ML pipeline builder which enables you to define the model’s architecture and select a target column for prediction.

To create an ML pipeline, please make sure to follow the below steps.

  1. Navigate to the Pipelines component in the left menu and click Create Pipeline. ML Pipeline Creation 1

  2. In the pop-up that appears, name the pipeline “Deal Prediction ML Pipeline” and choose the input dataset as Zoho_CRM_Deal_Prediction_Sample. In our case, the target column should be “Stage”. The model name will be auto-populated based on the pipeline name. Click Create Pipeline. ML Pipeline Creation Meta

  3. The ML pipeline builder interface page will be displayed as shown in the screenshot below. ML Pipeline Creation 3

Now that we have created our ML pipeline, we will proceed to configure the pipeline by defining the nodes in the ML pipeline builder interface.

Encoding categorical columns

Since our target column “Stage”, Type", and “Lead Source” contains categorical data of type String, we will encode them for further ML training standards. Please make sure to follow the below steps to encode the columns.

  1. In the Operations menu, navigate to ML operations->Encoding->Ordinal Encoder. Drag and drop the Ordinal Encoder node to the ML pipeline builder interface. In the Ordinal Encoder configuration section in the right panel, choose the column as “Stage” and click Save. Ordinal Encoding

  2. In the same manner, navigate to ML operations->Encoding->One-Hot Encoder. Drag and drop the One-Hot encodernode to the ML pipeline builder interface. In the configuration section on the right panel, choose the columns as “Type”, and “Lead Source”, and click Save. One-Hot Encoding
    These encoding operations will convert the column values of type String to Integer, while maintaining the order and preserving data accuracy.

Normalize the columns

Since the values of the all the features are in various ranges, we will use the Mean - Std Normalization component to scale down the values of the features to a common range, typically between 0 and 1. Navigate to ML operations->Normalization. Drag and drop the Mean-Std Normalizationnode to the ML pipeline builder interface. In the configuration box on the right panel, choose all the columns except “Stage” and click Save.

Once normalization is applied, the ML pipeline builder page will be displayed as below: Mean-Std Normalization

ML algorithm and Hyperparameter tuning

For any ML model, it’s necessary to implement an ML algorithm based on which the model will be trained. In this tutorial, we will be implementing the Random-Forest Classification algorithm to configure the tuning parameters for the ML model to ensure it is optimized for our pre-processed dataset.

  1. In the Operations menu, expand ML operations->Algorithm->Classification. Drag and drop the Random-Forest Classification node into the pipeline builder. The node will be automatically connected to the Destination node. Make an input connection with the Mean-Std Normalization and the Random-Forest Classification node. complete-ml-pipeline

  2. For the Random-Forest Classification node, we will go with the default configuration and click Save.

Now, we have completed making the required node connections and configurations. We can proceed to execute the pipeline by clicking Execute for further evaluation and deployment. executed-ml-pipeline

Click Execution Stats to view more details about each stage of the execution in detail. execution-stats-ml-pipeline

Upon successful execution of the ML pipeline, the Deal Prediction model is created and will be displayed under the Models section.

You can view the details of the model on the model’s details page by clicking on the model name.

Model

Last Updated 2024-04-05 23:41:34 +0530 +0530