Data Lineage vs Business Lineage

Hi all…

Thought it would be a good idea to go into some detail about Data Lineage and Business Lineage.

Data Lineage is the more “technical”, detailed lineage from sources to targets. It includes ETL Jobs, FTP processes and detailed column-level flow activity. It should trace everything from source to target, and be flexible enough to encompass legacy sources, any RDBMS, abstract sources and targets such as message queues and real-time transactions, and every sort of transformation and movement process that touches the data along the way.

Business Lineage is a summary of that lineage, showing only the sources, targets and reporting assets. I think of those assets as sort of “floating” to the top for display in Business Lineage (the same lineage is created under the covers). Business Lineage is often more useful to business members of your teams who aren’t interested in all of the gory detail at the lower level, but want to see a few critical business rules along with the ultimate sources for their reports and portal displays.

With Information Server and Foundation Tools, Lineage between sources and targets (Shared Tables) is established in a variety of ways:

a) via DataStage Jobs that illustrate the transformation of data from source(s) to target(s).

b) via parsing of SQL in RDBMS views, or as an extension of the above when custom SQL is used in a Job.

c) via Extension Mappings, an 8.1.1 feature of Metadata Workbench (June, 2009) that supports the illustration and lineage for any source and target.

DataStage Jobs are just “there”. As a developer builds and edits a Job, lineage just “happens”, either as a matter of course as you follow the links through a complex Job, or via the algorithms that Metadata Workbench uses to parse through Jobs and find their relationships (for example, JobA writes to a target that becomes the source for JobB).

Views are picked up during import from an RDBMS catalog. Metadata for a view is contained in the SQL SELECT that defines it. This SQL is parsed by Metadata Workbench to determine the actual base tables used.

Extension Mappings allow you to be creative and define anything you want. This is important for illustrating any type of source-to-target relationship. A mapping can represent anything from a COBOL program to a shell script, or no process at all: it may simply be important to illustrate a from/to relationship between two or more objects in your enterprise.
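To make the JobA/JobB idea above a bit more concrete, here is a minimal sketch in Python of how jobs could be linked through a shared table. This is only an illustration of the concept, not how Metadata Workbench actually does it; the job names, table names, the `jobs` dictionary and the `link_jobs` function are all made up for the example.

```python
from collections import defaultdict

# Hypothetical job descriptions: every job and table name here is invented.
jobs = {
    "JobA": {"reads": ["SRC.CUSTOMERS"], "writes": ["STG.CUSTOMERS"]},
    "JobB": {"reads": ["STG.CUSTOMERS"], "writes": ["DW.DIM_CUSTOMER"]},
    "JobC": {"reads": ["SRC.ORDERS"], "writes": ["DW.FACT_ORDERS"]},
}


def link_jobs(jobs):
    """Return (upstream_job, shared_table, downstream_job) triples."""
    # Index every table by the jobs that read it.
    readers = defaultdict(list)
    for name, job in jobs.items():
        for table in job["reads"]:
            readers[table].append(name)

    # A link exists when one job writes a table that another job reads.
    links = []
    for name, job in jobs.items():
        for table in job["writes"]:
            for reader in readers.get(table, []):
                if reader != name:
                    links.append((name, table, reader))
    return sorted(links)


for upstream, table, downstream in link_jobs(jobs):
    print(f"{upstream} -> {table} -> {downstream}")
```

With the sample data, the only link found is JobA to JobB through STG.CUSTOMERS. JobC stands alone because nothing else reads or writes its tables.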

Business Lineage was an 8.1.2 capability (December 2009) that expands on the lineage concept by displaying only the sources, targets and reporting resources. The DataStage Jobs or Extension Mappings are “just under the surface”, but their detail, which could be extensive, is suppressed by the Business Lineage display. This makes those lineage screens simpler to review for certain categories of users.
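One way to picture that suppression is as a graph operation: take the detailed lineage, hide the process nodes, and connect the assets on either side of them directly. The sketch below does this in Python. It is a conceptual illustration under my own assumptions, not the product's algorithm; the `edges` list, the `processes` set and the `business_lineage` function are hypothetical names.

```python
# Hypothetical detailed lineage: edges run asset -> process -> asset.
edges = [
    ("SRC.ORDERS", "JobA"),
    ("JobA", "STG.ORDERS"),
    ("STG.ORDERS", "JobB"),
    ("JobB", "DW.FACT_ORDERS"),
    ("DW.FACT_ORDERS", "Sales Report"),
]
processes = {"JobA", "JobB"}  # nodes to suppress in the business view


def business_lineage(edges, hidden):
    """Collapse hidden nodes so visible assets connect to each other directly."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    def visible_targets(node, seen):
        # Walk forward through hidden nodes until visible assets are reached.
        found = set()
        for nxt in successors.get(node, ()):
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in hidden:
                found |= visible_targets(nxt, seen)
            else:
                found.add(nxt)
        return found

    result = set()
    for node in successors:
        if node not in hidden:
            for target in visible_targets(node, {node}):
                result.add((node, target))
    return sorted(result)


for src, dst in business_lineage(edges, processes):
    print(f"{src} -> {dst}")
```

The jobs disappear from the output, but the path from SRC.ORDERS to the Sales Report is still intact, which is the point: the same lineage exists underneath, only the display changes.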

Ernie
