Understanding stage reconfiguration in QuickML pipelines
Building machine learning pipelines often involves chaining multiple operations, each dependent on the outcome of the previous. In complex pipelines with 20+ stages, modifying a single stage may trigger unintended configuration changes in downstream stages. This can lead to tedious reconfiguration work, loss of previous settings, and disrupted data flow.
To address this challenge and simplify the maintenance of complex workflows, QuickML provides Stage Reconfiguration, an automated mechanism that runs in the background as you build pipelines.
What is Stage Reconfiguration?
Stage Reconfiguration is a user-guided mechanism in QuickML that identifies the downstream stages impacted by a change in an earlier stage. Rather than automatically resetting the configurations of all successor stages, the system shows you the affected stages and offers options to reconfigure, skip, or reset only those stages, enabling efficient and controlled pipeline management.
This ensures:
- Reduced impact of configuration changes on subsequent stages, enabling a smoother pipeline building process.
- Transparency about which stages are affected.
- Flexibility to handle changes according to business needs.
Let’s consider a scenario to see how Stage Reconfiguration works in practice.
Imagine a retail company building a sales forecasting pipeline in QuickML to analyze product performance across different regions. The pipeline includes over 25 stages, covering everything from data ingestion to algorithm selection. In the initial pipeline version, the team selects the amount column in Stage 4 (Select Columns) and later applies a Type conversion to it in Stage 7.
During refinement, a team member decides to remove the amount column in Stage 4, believing it’s no longer needed for analysis. QuickML’s Stage Reconfiguration mechanism detects that this change impacts Stage 7, which still depends on the amount column for type conversion. Instead of resetting all subsequent stages, the system pinpoints Stage 7 as affected and presents options. Since the column is no longer required, the team chooses to skip and merge Stage 7. This keeps the pipeline valid and avoids unnecessary disruption, showing how Stage Reconfiguration helps teams make precise changes without reworking the entire flow.
Pipeline Types and Stage Reconfiguration Compatibility
Stage Reconfiguration is available in the QuickML pipeline types that have editable, user-configurable stages. The mechanism is especially useful in Classic mode pipelines, where changes made in one stage can directly affect subsequent stages and require reconfiguration. Reconfiguration is typically triggered only by a schema change (such as adding, removing, or renaming columns) or a value change that affects how the data is processed in the following stages. In Smart mode pipelines, this feature is typically not required, because the stages do not involve manual configuration. Instead of resetting or reconfiguring stages, Smart mode handles updates through simple re-execution, since the logic remains consistent and controlled by the system.
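To make the idea of a schema change more concrete, here is a minimal Python sketch that compares the column lists before and after an edit. It is an illustration only and not part of QuickML: `diff_schema`, `is_schema_change`, and their arguments are hypothetical names. Renames are passed in explicitly because a rename cannot be told apart from a drop plus an add by column names alone, and value changes are not modeled.

```python
def diff_schema(before, after, renames=None):
    """Return the columns added, removed, and renamed by an edit.

    `renames` maps old column names to new ones (hypothetical input).
    """
    renames = dict(renames or {})
    before_set, after_set = set(before), set(after)
    # Keep only the renames that are consistent with both column lists.
    renamed = {
        old: new
        for old, new in renames.items()
        if old in before_set and new in after_set
    }
    removed = sorted(before_set - after_set - set(renamed))
    added = sorted(after_set - before_set - set(renamed.values()))
    return {"added": added, "removed": removed, "renamed": renamed}


def is_schema_change(diff):
    """True if the edit added, removed, or renamed at least one column."""
    return bool(diff["added"] or diff["removed"] or diff["renamed"])


# Example column names are made up for illustration.
diff = diff_schema(
    before=["id", "amount", "region"],
    after=["id", "area", "discount"],
    renames={"region": "area"},
)
```

In this example, `diff` reports `amount` as removed, `discount` as added, and `region` as renamed to `area`, so `is_schema_change(diff)` is true.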
The table below outlines which pipeline types support Stage Reconfiguration and explains the reason for its applicability or exclusion in each case:
Pipeline Type | Stage Reconfiguration Applicable | Reason |
---|---|---|
Data Transformation | Yes | Stages are user-configurable; reconfiguration is supported |
Prediction | Yes | Stages are user-configurable; reconfiguration is supported |
Text Analytics | Only in Classic Mode | In Smart mode, no stage-level edits required |
Recommendation | Yes | Stages are user-configurable; reconfiguration is supported |
Forecasting | No | Uses Smart mode by default; no stage-level edits required |
Clustering | Yes | Stages are user-configurable; reconfiguration is supported |
Anomaly Detection | No | Uses Smart mode by default; no stage-level edits required |
How Stage Reconfiguration works
When you make a change in a pipeline stage, QuickML automatically checks whether this change impacts the configurations of the stages that follow. Depending on what you changed, there are two possible outcomes.
Case 1: The change has no impact on subsequent stage configurations
If your change doesn’t alter the schema of the dataset used in subsequent stages, QuickML will recognize that those stages are still valid. However, to maintain consistency in the pipeline, the system will re-execute the subsequent stages. In this case, you’ll see two options:
- Ignore and Reset All – This resets the configurations of all subsequent stages, even though they aren’t affected by your change.
- Proceed – This re-executes the subsequent stages using their existing configurations without requiring any rework.
For example, suppose in Stage 2 you use a Select/Drop Columns operation to drop a column named membership_category, which is not used in any of the subsequent stages. Since none of the later stages rely on this column, your change doesn’t impact their configuration, and QuickML recognizes that the schema used by the subsequent stages remains intact. You can simply choose Proceed, and the pipeline will re-execute without requiring any reconfiguration.
Case 2: The change impacts the subsequent stage configurations
If your change affects the schema of the dataset used in later stages, QuickML will highlight which stages are impacted. You’ll then get the following options for handling the affected stages:
- Ignore and Reset All – This ignores the configurations of all impacted stages and resets them.
- Configure Now – This opens a pop-up that lists the affected stages and shows you exactly what has changed. Within the configuration pop-up, you can:
  - Review and Edit each impacted stage to update the configuration based on your recent change.
  - Cancel to undo the change you just made, taking the pipeline back to its previous state.
  - Skip and Merge a stage if it’s no longer required in the new flow.
- Discard – This reverts the change made in the current stage itself and restores it to its original configuration, effectively undoing the edit you just applied. This is useful when you want to cancel the recent change due to its impact on the rest of the pipeline.
Here’s an example: Suppose in Stage 2 you use a Select Columns operation to include the churn_risk_score column, and in Stage 4 you apply a Type Conversion to that column. If you remove the churn_risk_score column in Stage 2, Stage 4 will break because it depends on that column. QuickML will detect this and show you that Stage 4 is affected. You’ll then have the choice to either re-add the column in Stage 2, update Stage 4 to use a different column, or skip Stage 4 entirely if it’s no longer necessary.
This approach gives you full control over how to handle changes without losing previous work unnecessarily. It also helps you avoid time-consuming reconfiguration by focusing only on what truly needs your attention.
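The two cases can be sketched in Python using the membership_category and churn_risk_score examples above. This is a simplified model under stated assumptions, not QuickML's implementation: the `Stage` class, its `uses_columns` field, and both functions are hypothetical, Stage 3 is a made-up filler stage, and only direct column dependencies are checked.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    number: int
    operation: str
    # Columns this stage's configuration refers to (hypothetical field).
    uses_columns: frozenset = field(default_factory=frozenset)


def impacted_stages(stages, edited_stage, dropped_columns):
    """Return the numbers of later stages whose configuration uses a dropped column."""
    dropped = set(dropped_columns)
    return [
        s.number
        for s in sorted(stages, key=lambda s: s.number)
        if s.number > edited_stage and s.uses_columns & dropped
    ]


def prompt_options(impacted):
    """Map the outcome to the options listed for Case 1 and Case 2."""
    if impacted:
        return ["Ignore and Reset All", "Configure Now", "Discard"]
    return ["Ignore and Reset All", "Proceed"]


pipeline = [
    Stage(2, "Select Columns", frozenset({"churn_risk_score", "membership_category"})),
    Stage(3, "Filler stage", frozenset({"age"})),  # made up for illustration
    Stage(4, "Type Conversion", frozenset({"churn_risk_score"})),
]

# Case 1: no later stage uses membership_category.
case_1 = impacted_stages(pipeline, edited_stage=2, dropped_columns=["membership_category"])
# Case 2: Stage 4 converts the type of churn_risk_score.
case_2 = impacted_stages(pipeline, edited_stage=2, dropped_columns=["churn_risk_score"])
```

Here `case_1` is empty, so `prompt_options(case_1)` offers Ignore and Reset All or Proceed, while `case_2` contains Stage 4 and leads to the Case 2 options.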
Stage-Specific Edge Cases
Behavior of Added Columns in the Select/Drop Stage: Columns added during reconfiguration will not impact the configuration of subsequent stages. However, they will be listed in the Select Columns dropdown of any affected stages, allowing you to include them if needed.
Behavior of Dropped Columns in the Merge Columns Stage: If a dropped column was previously used in a Merge Columns operation, you must initiate reconfiguration starting from the Merge Columns stage to clear the dependency and update the configuration accordingly.
Behavior of Dropped Columns in the Split Columns Stage: If a dropped column is used in Split Columns and shared across two outputs, two separate reconfiguration layers will be shown based on how the column was modified.
Add Dataset Impact: Changes in the Add Dataset stage can impact the entire pipeline.
Points to remember
Keep the following points in mind when using Stage Reconfiguration.
- Page Reload Handling: Reloading the page during stage configuration will prompt you to either revert all changes or resume from where you left off.
- Effect of Delete Actions: Delete Connection and Delete Stage reset the configurations of successor stages to default.
- Reconfiguration Cancellation: Cancelling the reconfiguration during re-execution will revert all intermediate changes.
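The effect of Delete Stage on successor stages can be pictured with a short Python sketch. It assumes a pipeline is a simple list of stage dictionaries and that an empty dictionary stands in for a stage's default configuration. Both are assumptions made for illustration, and `delete_stage` is a hypothetical helper, not a QuickML function.

```python
from copy import deepcopy


def delete_stage(stages, index):
    """Return a new stage list with stages[index] removed.

    Stages before the deleted one keep their configuration. Stages after it
    are reset; an empty dict stands in for "default configuration" (assumption).
    """
    if not 0 <= index < len(stages):
        raise IndexError("no stage at that position")
    kept = deepcopy(stages[:index])
    reset = [
        {"operation": s["operation"], "config": {}}
        for s in stages[index + 1:]
    ]
    return kept + reset
```

The function returns a new list and leaves its input untouched, so the original configuration stays available if the deletion is later reverted.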
Last Updated 2025-08-11 15:44:23 +0530 IST