xflow Help

Join

The Join component is utilized to merge rows from two or more datasets based on specified criteria. It combines columns from input tables using one or multiple joining columns, offering various join modes to tailor the merging process according to user requirements.

Types of Joins

In the dropdown menu for the Join Type section, users can choose from the following types of joins:

  • Inner Join: Includes only rows that have matching values in both input tables.

  • Outer Join: Includes all rows from both input tables, filling in missing values with NULL where necessary.

  • Left Join: Includes all rows from the left input table and matching rows from the right input table.

  • Right Join: Includes all rows from the right input table and matching rows from the left input table.

Configuration

Upon clicking the Join node, users are presented with the following fields in the configuration section:

Defining Join Conditions and Selectors:

  • Operation Type: Select the type of join from a dropdown menu.

  • Source Column: Choose the left and right data sources for the join.

  • On Conditions: Define the condition for datasets based on specified column relationships.

  • Selectors: Selectors allow users to specify what should be included in the output by providing aliases for both data sources.

    • Users must provide an alias and enter a unique prefix for both the left and right aliases to ensure column names are distinct. These prefixes are automatically appended to each column name in the result.

Example Usage

Let's consider a scenario where we have two datasets: one containing sales transaction details and another containing customer information.

Transaction Data

Transaction ID

Customer ID

Product

Quantity

Total Amount

1

101

Laptop

2

140000

2

102

Smartphone

1

56000

3

103

Tablet

3

105000

4

104

Headphones

2

7500

Customer Data

Customer ID

Customer Name

Age

City

101

Alice

30

New York

102

Bob

28

Los Angeles

103

Charlie

35

Chicago

105

Emily

27

Boston

We want to merge these datasets to gain insights into customer behavior and their corresponding sales activities. To accomplish this, we'll explore various types of joins to combine the data effectively.

Configuration:

Upon clicking the Join node, users are presented with the following fields in the configuration section:

  • Operation Type: Select 'Join Type' from the dropdown menu.

  • Source Column: Choose the left source as "Transaction Data" and the right source as "Customer Data".

  • On Conditions: Define the condition to match records based on the common column "Customer ID".

  • Selectors: Use 'L_' as the alias prefix for the left source and 'R_' for the right source to ensure distinct column names in the result.

Inner Join

Operation Type: Select "Inner Join" from the dropdown menu.

Combines rows where there's a matching Customer ID in both datasets. We get a dataset with only transactions made by existing customers.

Resultant Output

L_Trans ID

L_Customer ID

L_Product

L_Quantity

L_Total Amount

R_Customer ID

R_Customer Name

R_Age

R_City

1

101

Laptop

2

140000

101

Alice

30

New York

2

102

Smartphone

1

56000

102

Bob

28

Los Angeles

3

103

Tablet

3

105000

103

Charlie

35

Chicago

Outer Join

Operation Type: Select "Outer Join" from the dropdown menu.

Includes all rows from both datasets, ensuring no information is lost. Provides a complete view of sales transactions and customer details.

Resultant Output

L_Transaction ID

L_Customer ID

L_Product

L_Quantity

L_Total Amount

R_Customer ID

R_Customer Name

R_Age

R_City

1.0

101.0

Laptop

2.0

140000.0

101.0

Alice

30.0

New York

2.0

102.0

Smartphone

1.0

56000.0

102.0

Bob

28.0

Los Angeles

3.0

103.0

Tablet

3.0

105000.0

103.0

Charlie

35.0

Chicago

4.0

104.0

Headphones

2.0

7500.0

Null

Null

Null

Null

Null

105.0

Emily

27.0

Boston

Left Join

Operation Type: Select "Left Join" from the dropdown menu.

Includes all sales transactions and adds customer details where available. Gives us sales information along with customer demographics.

Resultant Output

L_Transaction ID

L_Customer ID

L_Product

L_Quantity

L_Total Amount

R_Customer ID

R_Customer Name

R_Age

R_City

1

101

Laptop

2

140000

101.0

Alice

30.0

New York

2

102

Smartphone

1

56000

102.0

Bob

28.0

Los Angeles

3

103

Tablet

3

105000

103.0

Charlie

35.0

Chicago

4

104

Headphones

2

7500

Null

Null

Null

Null

Right Join

Operation Type: Select "Right Join" from the dropdown menu.

Includes all customer details and adds sales information where available. Provides insights into customer profiles alongside transaction history.

Resultant Output

L_Transaction ID

L_Customer ID

L_Product

L_Quantity

L_Total Amount

R_Customer ID

R_Customer Name

R_Age

R_City

1.0

101.0

Laptop

2.0

140000.0

101

Alice

30

New York

2.0

102.0

Smartphone

1.0

56000.0

102

Bob

28

Los Angeles

3.0

103.0

Tablet

3.0

105000.0

103

Charlie

35

Chicago

Null

Null

Null

Null

Null

105

Emily

27

Boston

Last modified: 21 February 2025