Implementation

This implementation section acts as a step-by-step procedure guide to working with Zia AutoML. As discussed earlier, you can configure and train your model from the Catalyst console. Refer to the SDKand API documentation sections for implementing AutoML in your application.

Access AutoML

To access AutoML in your Catalyst console:

  1. Navigate to Zia Services in the left pane of the Catalyst console and click AutoML. This will open the AutoML feature where you can create the training models as required.
    automl

Create a Model

The process of creating a model and training it is divided into three stages. We will discuss them in order.

To create and train an AutoML model in Catalyst:

  1. Click Create Model from the AutoML page.
    catalyst_automl_blank_page
  2. The first step is to upload a dataset. You can import a dataset by either selecting a CSV file from one of your folders in the File Store or uploading a CSV file from your computer.
    You can select a CSV file from the File Store by navigating to the folder it is in and clicking it.
    catalyst_automl_create_model_1
    You can upload a CSV file from your computer by browsing for it or dragging it to the drop box.
    catalyst_automl_create_model_2
    You must then save the CSV file in one of your folders in the File Store. You can select an existing folder or create a new folder.
    catalyst_automl_create_model_3
Notes:
  • The File Store selection window only displays CSV files in your folders.
  • There is no size limit for the CSV file that you upload through your computer. The storage limits of the File Store apply to storing the CSV file in the File Store.
  1. Click Save and Next.

The next stage is to fix the target column.

catalyst_automl_create_model_4

As discussed earlier, Zia analyzes the dataset you uploaded and determines the data type of each column in the CSV file after it is uploaded. The data types are determined based on the values in the column. For example, if a column has only two distinct values repeated in all the records, Zia determines its data type to be binary-class categorical.

AutoML also calculates and displays the values for missing percentage and distinct entries for all columns, mean and SD for numerical columns. You can hover over the tooltip of a column for a short description of it.

You can also find other information such as the name, total number of columns, and the number of records in the dataset in the page.

  1. Select a target column from the dataset in the dropdown list. catalyst_automl_create_model_5
    As mentioned earlier, you can only select a numerical or a categorical column as the target column to be predicted. The dropdown list only displays these columns.
Note: You can only select a column as a target if its missing percentage is zero. Any column that has incomplete data cannot be selected as the target for prediction.

Once you have selected a target column, the value of the correlation with the target will be calculated and displayed for every other column, except for the columns of the String type.
catalyst_automl_create_model_6
You can change the data type of a column if you think Zia predicted it incorrectly. However, if the data type you choose does not match the values of the column, you will receive an “invalid update” error message. For example, if you select the data type of a column whose values are completely numerical as Date, you will receive the error message.
catalyst_automl_create_model_7

Note: You cannot change the data type of a column once you have set it as a target column.
You can also view the overall stats of the dataset in a pictorial graph by clicking View Overall Stats in the page.

catalyst_automl_create_model_overall_stats
You can filter the columns displayed in the page by clicking on Filters.
catalyst_automl_create_model_filter_1
You can select the filters for each column characteristic based on the results you require.

catalyst_automl_create_model_filter_2

Select the filters and click Apply Filters to view the results.

catalyst_automl_create_model_filter_3

  1. After you have configured the target column, click Next.

The final stage is to select the inputs. This page displays the model type and the name of the target column. As mentioned earlier, the model type is based on the data type of the target column that you select.

catalyst_automl_create_model_8

  1. Enter a name for the model.
  2. Select the columns to be used for training the model from the dropdown list. All columns in the dataset, other than the String type, are selected for training by default.

catalyst_automl_create_model_9

Note: You can exclude columns from being used for training based on the percentage of missing data in them, their correlation to the target, or other factors. This is because, columns with high missing percentages or low correlations to the target can alter the accuracy of a model prediction.
  1. After you have selected the required columns for training, click Train Model.

The console displays a training progress message while the model training is in progress.

catalyst_automl_create_model_10

When the training is completed, you will get a notification in your Catalyst console alerting you of the success or failure of the model’s training.

catalyst_automl_notification

You can now view the Evaluation Report and Model Prediction sections of the model. We will discuss these at the end of this section.

The created model is listed in the AutoML page. A unique Model ID is created for the model, which is used to refer to the model working with the API.

The page also displays details like the name of the dataset that is associated with the model, model type, created time, and status for each model. The status is shown as Completed for the models that have completed the training successfully. You can search for a model by its name using the search bar.

catalyst_automl_page

Rename a Model

To rename an AutoML model:

  1. Click the ellipsis icon for the model you need to rename from the AutoML page and click Rename.

  1. Enter the new name for the model and press Enter.

catalyst_automl_rename_2

Delete a Model

To delete an AutoML model:

  1. Click the ellipsis icon for the model you need to delete from the AutoMl page and click Delete.

  2. Click Yes, Proceed in the confirmation window.
    catalyst_automl_delete_confirmation

View a Model’s Evaluation Report

You will be automatically navigated to a model’s Evaluation Report section once the training process is complete. You can also open it by clicking on the model’s name from the AutoML page.

As discussed earlier, the information provided in the evaluation report differs for each model type.

The following evaluation report was generated for the multi-class classification model that we created earlier in this section.

catalyst_automl_eval_report_multi

You can refer to this section to learn about the accuracy, F1 score, precision, recall, and log loss statistics. The report also displays the columns that were used for the training under Selected Columns.

catalyst_automl_eval_report_selected_columns

The Evaluation Report section also provides you a shortcut to train a new model with the same dataset that you used for this model. You can train a new model if you find the accuracy of this model’s prediction to be too low for your purposes. You can make changes to the new model, such as including or excluding different columns in the dataset this time or changing the data type of a column while training it.

If you click Train, Catalyst will redirect you to the Fix a Target stage of the model creation. The same dataset will be included again automatically. You can then make the necessary changes and train the new model.

The Evaluation Report section displays a graph named Feature Importance for all three types of model. This displays the importance of each feature or column in the dataset for training this model, in terms of relative percentages. That is, it shows which features AutoML found the most useful while building the model.

catalyst_automl_eval_report_feature_importance

The evaluation report of a binary-class classification model includes additional information, such as the confusion matrix with the TP, TN, FP, FN values, as discussed in this Key Concepts section.

catalyst_automl_eval_report_binary

Similarly, the evaluation report of a regression model includes the Mean Absolute Error, Mean Squared Error, and Root Mean Squared Error statistics.

catalyst_automl_eval_report_regression

Predict Values

The Model Prediction section provides a test console for you to provide input values and obtain the predictive output.

catalyst_automl_model_prediction

The model applies the knowledge it obtained during its training with the dataset to predict the value of the target for a selection of input that you provide. The accuracy level of this prediction is displayed in the Evaluation Report section. The test console will display the columns that were found to be required for building the model as input fields.

To predict the value of the target of an AutoML model for a set of conditions:

  1. Click the model’s name from the AutoML page and click Model Prediction.
  2. Enter input values for the fields in the JSON format.
    catalyst_automl_model_prediction_2
Notes:
  • If you enter a value in a format that does not match the data type of the column, such as a numerical value for the date type, the console will display a “cannot parse” error. Ensure that you provide the data in the right format.
  • You must provide the value for atleast one valid column while testing the prediction.
  • If you don’t enter the value for an input field, a default value will be entered for the column by Zia automatically. However, this will affect the accuracy of the prediction.
  1. Click Predict.

AutoML will display the predicted value of the target column in a pop-up window.

The prediction of a multi-class classification model will contain the possibility of the occurrence of each class in the target, as a percentage value. For example, the model we created earlier predicts the following outcome of the payment type by the customers for the given input data.

catalyst_automl_model_prediction_window_multi

The prediction of a binary-class classification model will contain the possibility of the occurrence of the positive class and the negative class in the target. For example, in a model predicting the possibility of one of the two delivery types, Standard and Express, where the target is an “Express Delivery” positive class, the prediction results estimate that 81% of the time the customers will not use the express delivery, and they will use it only 19% of the time, for a given set of input conditions.

catalyst_automl_model_prediction_window_binary

The prediction of a regression model will contain a single numerical result as the value of the target. For example, a model predicts the number of customers of a business for the next year. The Predict Label is the name of the target column, and the Prediction Result column holds the value of the predicted number of customers as 4,017,890 for a given set of input conditions.

catalyst_automl_model_prediction_window_regression

AutoML also provides you with the API request template that you can use from your terminal in cURL format.

catalyst_automl_model_prediction_api

You can copy this code using the copy icon and paste it in your terminal to test your model’s prediction from your local system. You can also implement this request code in your Catalyst application to enable predictions.

You must replace the values of the Project ID and Model ID in the request URL, and the value of the Zoho authorization token in the code. You can refer to the API documentation for more information. You must provide the data for all the columns that are requested in the test console as key-value pairs in JSON format in this request API query.

Note: You cannot create, configure, or train a model using APIs. This can only be done from the Catalyst console.

Last Updated 2023-06-09 21:17:32 +0530 +0530