Conditional Split
The Conditional Split node lets users divide data into segments based on predefined conditions. Segmenting the data this way helps users filter and organize it according to specific criteria.
Configuration
Selecting the Conditional Split node opens its configuration panel, with one condition open by default and the following options:
Add Rule: Users can add individual conditions to define segmentation criteria.
Add Group: Users can group conditions together using logical operators like AND, OR, or NOT to create more complex segmentation rules.
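How rules and groups combine can be illustrated with a small sketch. This is not the node's implementation: the `Group` class, the `evaluate` function, and the choice to have NOT negate the conjunction of its members are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Row = Dict[str, object]
Rule = Callable[[Row], bool]  # a single condition, modelled as a predicate


@dataclass
class Group:
    """Hypothetical model of a condition group."""
    operator: str  # "AND", "OR", or "NOT"
    members: List[Union["Group", Rule]]


def evaluate(node: Union[Group, Rule], row: Row) -> bool:
    """Evaluate a rule or a (possibly nested) group against one row."""
    if not isinstance(node, Group):
        return node(row)
    results = [evaluate(member, row) for member in node.members]
    if node.operator == "AND":
        return all(results)
    if node.operator == "OR":
        return any(results)
    if node.operator == "NOT":
        # Assumption: NOT negates the AND of everything inside the group.
        return not all(results)
    raise ValueError(f"unknown operator: {node.operator}")


# "At least 800 AND NOT at least 1200", built from two rules and a nested group.
moderate = Group("AND", [
    lambda row: row["PurchaseAmount"] >= 800,
    Group("NOT", [lambda row: row["PurchaseAmount"] >= 1200]),
])
```

With this sketch, `evaluate(moderate, {"PurchaseAmount": 1000})` is true, while rows with 1500 or 750 evaluate to false.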
Creating Conditions
Field Selection: Users choose the column or field from the dataset on which the condition will be applied.
Logic Condition: Users select a comparison operator, such as:
equal to (==)
not equal to (!=)
Contains
Not contains
Is empty
Is not empty
Is null
Is not null
Input Value: Users enter the value against which the selected column is compared.
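The three parts of a condition (field, operator, value) can be pictured as a lookup followed by a comparison. The sketch below is illustrative only: the `OPERATORS` table and `evaluate_condition` are hypothetical names, and the way null values are handled by "contains", "not contains", and "is not empty" is an assumption, not documented behavior.

```python
# Each operator takes the field's value from the row and the user's input value.
OPERATORS = {
    "==": lambda field, value: field == value,
    "!=": lambda field, value: field != value,
    # Assumption: a null field never satisfies contains / not contains.
    "contains": lambda field, value: field is not None and str(value) in str(field),
    "not contains": lambda field, value: field is not None and str(value) not in str(field),
    "is empty": lambda field, _: field == "",
    # Assumption: a null field is not counted as "not empty".
    "is not empty": lambda field, _: field is not None and field != "",
    "is null": lambda field, _: field is None,
    "is not null": lambda field, _: field is not None,
}


def evaluate_condition(row, field, op, value=None):
    """Return True if row[field] satisfies the operator against value."""
    return OPERATORS[op](row.get(field), value)


row = {"CustomerName": "Jane Smith", "PurchaseAmount": 1500}
name_matches = evaluate_condition(row, "CustomerName", "contains", "Smith")
amount_matches = evaluate_condition(row, "PurchaseAmount", "==", 1000)
```

Here `name_matches` is true and `amount_matches` is false. Operators such as "is null" ignore the input value, which is why it defaults to `None`.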
Logic Representation
The SQL Query Display section shows the SQL query generated from the conditions the user has created, so users can see the SQL logic behind their segmentation rules.
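The exact SQL the node generates is not shown here, so the following is only a guess at its general shape. It renders a condition as a SQL predicate for display; `to_sql`, `sql_literal`, and the `TEMPLATES` mapping are hypothetical names, and the mapping of "contains" to `LIKE` is an assumption.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (numbers bare, text quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


# Assumed operator-to-SQL templates; the real product may differ.
TEMPLATES = {
    "==": "{col} = {val}",
    "!=": "{col} <> {val}",
    "is empty": "{col} = ''",
    "is not empty": "{col} <> ''",
    "is null": "{col} IS NULL",
    "is not null": "{col} IS NOT NULL",
}


def to_sql(field, op, value=None):
    """Render one condition as a SQL predicate string, for display only."""
    if op == "contains":
        return f"{field} LIKE {sql_literal('%' + str(value) + '%')}"
    if op == "not contains":
        return f"{field} NOT LIKE {sql_literal('%' + str(value) + '%')}"
    return TEMPLATES[op].format(col=field, val=sql_literal(value))


predicates = [
    to_sql("PurchaseAmount", "==", 1000),
    to_sql("CustomerName", "contains", "Smith"),
    to_sql("PurchaseDate", "is not null"),
]
where_clause = " AND ".join(predicates)
```

The string is meant for reading, not for execution with untrusted input: wildcard characters in the value are not escaped, and a production system would normally use parameterized queries.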
Adding New Conditions
If required, users can click on + Add New Condition to define more conditions.
Data Segregation
When the node is configured with split conditions and connected to another node, the connecting edge shows a dropdown menu listing the conditions defined in the node, along with a default 'others' condition. Selecting an option from the dropdown filters the data passed along that edge according to the selected condition.
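One way to picture the per-edge filtering is as a routing step that places each row into named branches. The sketch below rests on two assumptions that are not confirmed by this page: that 'others' receives the rows matching none of the defined conditions, and that a row matching several conditions appears on every matching branch. `split_rows` is a hypothetical helper.

```python
def split_rows(rows, conditions):
    """Route rows into one list per named condition, plus 'others'.

    conditions maps a branch name to a predicate taking a row.
    """
    branches = {name: [] for name in conditions}
    branches["others"] = []
    for row in rows:
        matched = False
        for name, predicate in conditions.items():
            if predicate(row):
                branches[name].append(row)
                matched = True
        if not matched:
            # Assumption: 'others' collects rows that match no condition.
            branches["others"].append(row)
    return branches


rows = [{"PurchaseAmount": 1500}, {"PurchaseAmount": 500}]
conditions = {"high": lambda row: row["PurchaseAmount"] >= 1200}
branches = split_rows(rows, conditions)
```

In this sketch the row with 1500 lands in `branches["high"]` and the row with 500 lands in `branches["others"]`; choosing an option in the edge dropdown corresponds to reading one of these lists.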
Example Usage
Consider a scenario where a business wants to segment its customers based on their purchase behavior. Here is the dataset:
Dataset
| CustomerID | CustomerName | PurchaseAmount | PurchaseDate |
|---|---|---|---|
| 1 | John Doe | 1000 | 15/06/2023 |
| 2 | Jane Smith | 1500 | 20/07/2023 |
| 3 | Michael Johnson | 800 | 05/08/2023 |
| 4 | Emily Brown | 1200 | 10/09/2023 |
| 5 | Sarah Lee | 750 | 15/09/2023 |
| 6 | David Wilson | 500 | 10/10/2023 |
| 7 | Chris Black | 3000 | 20/11/2023 |
| 8 | Anna White | 1100 | 25/12/2023 |
Problem Statement: We need to segment our customers into three groups based on their purchase amounts: high spenders, moderate spenders, and low spenders.
High Spenders: Customers who have spent 1200 or more.
Moderate Spenders: Customers who have spent between 800 and 1199.
Low Spenders: Customers who have spent less than 800.
Condition 1: High Spenders
Field Selection: PurchaseAmount
Logic Condition: Greater than or equal to (>=)
Value: 1200
Logic Representation
Resultant Output
| CustomerID | CustomerName | PurchaseAmount | PurchaseDate |
|---|---|---|---|
| 2 | Jane Smith | 1500 | 20/07/2023 |
| 4 | Emily Brown | 1200 | 10/09/2023 |
| 7 | Chris Black | 3000 | 20/11/2023 |
Condition 2: Moderate Spenders
Field Selection: PurchaseAmount
Logic Condition: Between
Value: 800 and 1199
Logic Representation
Resultant Output
| CustomerID | CustomerName | PurchaseAmount | PurchaseDate |
|---|---|---|---|
| 1 | John Doe | 1000 | 15/06/2023 |
| 3 | Michael Johnson | 800 | 05/08/2023 |
| 8 | Anna White | 1100 | 25/12/2023 |
Condition 3: Low Spenders
Field Selection: PurchaseAmount
Logic Condition: Less than (<)
Value: 800
Logic Representation
Resultant Output
| CustomerID | CustomerName | PurchaseAmount | PurchaseDate |
|---|---|---|---|
| 5 | Sarah Lee | 750 | 15/09/2023 |
| 6 | David Wilson | 500 | 10/10/2023 |
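The three conditions above can be checked outside the tool with a short script. This sketch loads the example dataset into an in-memory SQLite database and applies one `WHERE` clause per segment. The table name `customers` and the exact SQL text are assumptions; the node's own generated query may look different.

```python
import sqlite3

# The example dataset from this page.
customers = [
    (1, "John Doe", 1000, "15/06/2023"),
    (2, "Jane Smith", 1500, "20/07/2023"),
    (3, "Michael Johnson", 800, "05/08/2023"),
    (4, "Emily Brown", 1200, "10/09/2023"),
    (5, "Sarah Lee", 750, "15/09/2023"),
    (6, "David Wilson", 500, "10/10/2023"),
    (7, "Chris Black", 3000, "20/11/2023"),
    (8, "Anna White", 1100, "25/12/2023"),
]

# One assumed WHERE clause per condition; BETWEEN is inclusive at both ends.
segments = {
    "High Spenders": "PurchaseAmount >= 1200",
    "Moderate Spenders": "PurchaseAmount BETWEEN 800 AND 1199",
    "Low Spenders": "PurchaseAmount < 800",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "CustomerID INTEGER, CustomerName TEXT, "
    "PurchaseAmount INTEGER, PurchaseDate TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", customers)

# Collect the CustomerIDs that fall into each segment.
results = {}
for name, where in segments.items():
    query = f"SELECT CustomerID FROM customers WHERE {where} ORDER BY CustomerID"
    results[name] = [row[0] for row in conn.execute(query).fetchall()]
conn.close()

print(results)
```

The resulting CustomerIDs are 2, 4, 7 for High Spenders, 1, 3, 8 for Moderate Spenders, and 5, 6 for Low Spenders, matching the output tables above. Because the three ranges cover every amount, no row would fall into the 'others' branch for this dataset.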