xflow Help

Lineage

What is Lineage?

In data management, lineage is the ability to trace data from its origin (source) through every transformation and calculation to its final destination (output). It captures the relationships and dependencies between data elements within a process.
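The idea can be illustrated with a small sketch. This is not how xflow stores lineage internally; the `LineageRecord` class, the `trace_to_origin` function, and the row and step names are all hypothetical and exist only to show how parent links let an output row be traced back to its sources.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """One row's history entry: the step that produced it and its parent rows.

    Hypothetical structure for illustration only, not an xflow type.
    """
    row_id: str
    step: str
    parents: list = field(default_factory=list)


def trace_to_origin(row_id, records):
    """Follow parent links from a row back to its source rows.

    Returns (row_id, step) pairs, starting with the requested row.
    """
    by_id = {rec.row_id: rec for rec in records}
    visited = set()
    path = []
    stack = [row_id]
    while stack:
        current = stack.pop()
        if current in visited or current not in by_id:
            continue
        visited.add(current)
        rec = by_id[current]
        path.append((rec.row_id, rec.step))
        stack.extend(rec.parents)
    return path


# Two source rows are aggregated into one row, which becomes the output row.
records = [
    LineageRecord("src-1", "source"),
    LineageRecord("src-2", "source"),
    LineageRecord("agg-1", "aggregate", parents=["src-1", "src-2"]),
    LineageRecord("out-1", "output", parents=["agg-1"]),
]

path = trace_to_origin("out-1", records)
for row, step in path:
    print(f"{row} produced by step '{step}'")
```

Tracing `out-1` visits the output step, then the aggregation, then both source rows, which is the same source-to-output relationship that lineage exposes, read in reverse.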

Benefits of Lineage

Insight into Data Flow
Lineage shows how data moves through each stage of processing, which makes every transformation transparent and accountable.

Tracking Data Origin and History
Lineage records where data came from and how it has changed, which supports data quality checks and regulatory compliance.

Troubleshooting and Optimization
By visualizing data lineage, users can quickly identify bottlenecks, errors, or discrepancies, enabling efficient troubleshooting and optimization of workflows.

How to Create Lineage for a Process?

Method 1: Toggle Option during Process Run

  • When you run a process, turn on the lineage toggle.

Method 2: Using Process Runs

  • Navigate to the process runs and select a process that has lineage.

  • Click a node, open its configuration, switch to the preview tab, right-click a value, and select "Investigate".

Method 3: From Source Data Version

  • Select the data version and check whether it has a row_lineage_id.

  • Right-click a value and select "Investigate" to analyze the data flow.
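The check in the first step can be pictured with a short sketch. It assumes, without confirmation from xflow, that `row_lineage_id` appears as an ordinary column with a non-empty value on each row of the data version. The helper name `has_row_lineage` and the sample rows are hypothetical.

```python
def has_row_lineage(rows, column="row_lineage_id"):
    """Return True if there is at least one row and every row has a non-empty value in `column`.

    Assumption: lineage is exposed as a regular column on each row.
    """
    return bool(rows) and all(row.get(column) not in (None, "") for row in rows)


# Sample rows standing in for an exported data version (made-up values).
traced = [
    {"row_lineage_id": "a1", "amount": 10},
    {"row_lineage_id": "a2", "amount": 7},
]
untraced = [
    {"amount": 10},
    {"amount": 7},
]

print(has_row_lineage(traced))
print(has_row_lineage(untraced))
```

Only a data version that passes this kind of check would have something to investigate; a version without the identifier has no row-level history to follow.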

How to Analyze Lineage?

After you run a process, its lineage status (for example, failed, in queue, or available) appears beside the process name in the navigation bar.

  • Click any node (a source or any other node), open its configuration, and go to the preview tab.

  • Right-click a record and select "Investigate" to view the flow of data.

Investigation Details

  • Step Details: Shows node configuration details.

  • Data: Displays the entire row, with affected columns highlighted.

  • Affected Columns: Highlights new columns created during calculations, aggregations, etc., and shows how they impact the data flow.
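As a rough illustration of what "affected columns" means, the sketch below compares one row before and after a calculation step and lists the columns the step created. It is a simplified model, not xflow's implementation; the `new_columns` helper and the sample row are hypothetical.

```python
def new_columns(row_before, row_after):
    """Return the columns present after a step that were absent before it."""
    return [col for col in row_after if col not in row_before]


# A row entering a calculation step (made-up values).
before = {"order_id": 17, "quantity": 3, "unit_price": 4.5}

# The step adds a derived column computed from two existing ones.
after = {**before, "total": before["quantity"] * before["unit_price"]}

created = new_columns(before, after)
print(created)
```

Here `total` is the only column the step introduces, so it is the one that would be highlighted for this row.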

Last modified: 21 February 2025