Create ML Pipeline

In this section, we will build an ML prediction model using the datasets preprocessed in the previous section. These datasets serve as the input to the ML Pipeline Builder, which lets you define the model’s architecture and select a target column for prediction.

To create an ML pipeline:

  1. Navigate to the Pipelines component in the left menu and click Create Pipeline.

create-ml-pipeline

  2. In the pop-up that appears, enter “Pipeline_B” as the pipeline name and choose Cancer_detection_A as the input dataset. In our case, the target column is “diagnosis”. The model name will be auto-populated based on the pipeline name. Click Create Pipeline.

ml-pipeline-name

The Retrain model when the dataset is updated option retrains the created pipeline every time the dataset is updated; see the documentation on periodic sync for details. The Create an Auto-generated pipeline using AutoML option creates an ML pipeline automatically; see the documentation on AutoML pipelines for details.

  3. The pipeline details page will be displayed, as shown in the screenshot below.

ml-pipeline-interface

Now that we have created our ML pipeline, we will proceed to configure the pipeline by defining the nodes in the ML Pipeline Builder interface.

Data Type Conversion

Since our target column “diagnosis” contains categorical data of type String, we will encode it as numeric values so that it can be used for ML training.

  1. In the Operations menu, navigate to ML operations->Encoding->Label Encoder. Drag and drop the Label Encoder node onto the ML Pipeline Builder interface. Label encoding can be applied only to the target column, so it is executed automatically.

ordinal-encoder

This operation converts the column values from type String to Integer by mapping each distinct category to a consistent integer code, so no class information is lost.
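To make the idea of label encoding concrete, here is a minimal sketch in plain Python. It is not the platform's implementation; the helper names and the sample values "M" and "B" are assumptions chosen for illustration, and the codes are assigned in sorted order of the distinct labels.

```python
def fit_label_encoder(values):
    """Map each distinct label to an integer code, in sorted label order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


def encode(values, mapping):
    """Replace every label with its integer code."""
    return [mapping[value] for value in values]


# Hypothetical sample of the "diagnosis" column (values assumed for illustration).
diagnosis = ["M", "B", "B", "M", "B"]

mapping = fit_label_encoder(diagnosis)
encoded = encode(diagnosis, mapping)

print(mapping)  # {'B': 0, 'M': 1}
print(encoded)  # [1, 0, 0, 1, 0]
```

Every occurrence of the same string receives the same integer, so the encoded column can be decoded back to the original labels by inverting the mapping.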

Hyperparameter Tuning

Every ML model is trained using an ML algorithm, so the pipeline must include one. In this tutorial, we will use the Logistic Regression classification algorithm and configure its tuning parameters so that the model is optimized for our preprocessed dataset.

  1. In the Operations menu, expand ML operations->Algorithm->Classification->Logistic Regression. Drag and drop the Logistic Regression node into the Pipeline Builder. The node will be connected to the Destination node automatically. Connect the Label Encoder node to the Logistic Regression node.

logistic-reg

  2. For the Logistic Regression node, we will keep the default configuration and click Save.

logistic-reg
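For readers curious about what a Logistic Regression node does conceptually, the following is a minimal sketch of binary logistic regression trained with batch gradient descent in plain Python. It is an illustration under stated assumptions, not the algorithm or default configuration used by the Pipeline Builder: the `learning_rate` and `epochs` values, the function names, and the tiny two-feature dataset are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            z = sum(w * x for w, x in zip(weights, row)) + bias
            error = sigmoid(z) - target  # derivative of the log loss w.r.t. z
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return class 1 when the predicted probability reaches the threshold."""
    return [
        1 if sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) >= threshold
        else 0
        for row in X
    ]


# Hypothetical, already-scaled feature rows with label-encoded targets.
X = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
predictions = predict(X, weights, bias)
print(predictions)
```

The hyperparameters exposed here (learning rate, number of epochs, decision threshold) are the kind of settings a tuning step adjusts; the parameters actually offered by the node may differ.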

We have now completed the required node connections and configurations. Click Execute to run the pipeline so that the model can then be evaluated and deployed.

status-success

Click Execution Stats to view details about each stage of the execution.

pipelineb-stats

Upon successful execution of the ML Pipeline, the prediction model is created and will be displayed under the Models section.

You can view the details of the model in the Models details page by clicking on the model name.

select-ml-pipeline

Additionally, the accuracy of the generated model can be viewed in the Metrics section of the Models details page. This provides valuable insight into how well the model performs when making predictions on the data.

metrics
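As a rough guide to how figures like these are derived, the sketch below computes accuracy and a few commonly reported classification metrics from true and predicted labels in plain Python. The function name and the sample labels are hypothetical, and the exact set of metrics shown in the Metrics section is not assumed here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical label-encoded targets and model predictions.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics["accuracy"])  # 0.75
```

Accuracy alone can be misleading when one class is much more frequent than the other, which is why precision and recall are often read alongside it.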

Last Updated 2024-10-10 12:38:19 +0530