Open IGC is here!

Hi Everyone….

It’s been a while since I’ve posted anything. I’ve been busy researching and supporting the many new things that have been added in the past year: data lineage, advanced governance (stewardship and workflow), and now “Open IGC”. This is the ability to create nearly “any” type of new object within the Information Governance Catalog and then connect it to other objects with a whole new lineage paradigm. If you are a user of Extensions (Extension Mapping Documents and Extended Data Sources), think of Open IGC as the “next evolution” for extending the Information Server repository. If you are a user of DataStage, think of what it would be like to create your own nested objects and hierarchies, with their own icons and their own level of “Expand” (like zoom) capability for drilling into further detail.

This new capability is available at Fix Central for 11.3 with Roll-up 16 (RU 16) and all of its pre-requisites (FP 2 among other things) and is immediately available in 11.5.

[…and is formally documented here: http://www-01.ibm.com/support/docview.wss?uid=swg21699130 ]

[also see here for tips on the IGC API for general management of assets:  http://www-01.ibm.com/support/docview.wss?uid=swg27047054&aid=1  ]

So exactly what is this “Open IGC”?

Open IGC (you may also hear or see “Open IGC for Lineage” or “Open IGC API”) gives us the ability to define our “own” object types entirely. This means having them exist with their own names, their own icons, and their own set of dedicated properties. They can have their own containment relationships and represent just about “anything” you want. They are available via the detailed “browse” option and appear in the query tool. They can be assigned to Terms and vice versa, participate in Collections, and be included in Extension Mappings. Once you have defined them, you can describe your own lineage among these objects, also via the same API, and define what you perceive as “Operational” vs “Design” based lineage (lineage without needing to use Extensions, and supporting “drill down” capabilities as we see with DataStage lineage).

Here are some use cases:

a) Represent a data integration/transformation process…or “home grown” ETL.    This is the classic use case.  Define what you call a “process” (like a DataStage Job)….and its component parts…the subparts like columns and transformations, and properties that are critical.   Outline the internal and external flows between such processes and their connections to other existing objects (tables, etc.) in the repository.

b)  Represent some objects that are “like” Extended Data Sources, but you want more definition…..such as (for example) all the parts of an MQ Series or other messaging system configuration…objects for the Servers, the Queue Managers, and individual Queues.  Give them their own icons, and their own “containment” depths and relationships.   Yes — you could use Extensions for this, but at some point it becomes desirable to have your own custom properties, your own object names for the user interface, and your own creative icons!

c)  Overload the catalog and represent some logical “concept” that lends itself to IGC’s graphical layout features, but isn’t really in the direct domain of Information Integration. One site I know of wants to show something with “ownership”, but illustrate it graphically. They are interested in having “responsibility roles” illustrated as objects whose “lineage” is really just relationships to the objects that they control. Quite a stretch, and it would need some significant justification vs using tooling more appropriate for this use case, but very do-able via this API.

It’s all done based on XML and REST, and does not require that you re-install or otherwise re-configure the repository.  You design and register a “bundle” with your new assets and their properties, and then use other REST invocations to “POST” new instances of the objects you are representing.
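The second step, posting instances, can be sketched roughly in Python as follows. This is only an illustration under stated assumptions: the host name, the endpoint path, the XML element and attribute names, and the `$MQ-QueueManager` / `$MQ-Queue` class names (borrowed from the messaging use case above) are all hypothetical, so check the real format against the documentation linked earlier before relying on any of it. The sketch builds the XML body and prepares the request, but does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed values, for illustration only.
IGC_BASE = "https://igc.example.com:9445/ibm/iis/igc-rest/v1"  # hypothetical host
ASSETS_PATH = "/bundles/assets"  # assumed endpoint path for posting instances


def build_asset_doc(queue_manager, queues):
    """Build an XML body describing one queue manager and the queues it contains.

    Element, attribute, and class names here are assumptions, not the
    documented Open IGC format.
    """
    doc = ET.Element("doc")
    assets = ET.SubElement(doc, "assets")

    # The parent object: one queue manager.
    qm = ET.SubElement(
        assets, "asset",
        {"class": "$MQ-QueueManager", "repr": queue_manager, "ID": "qm1"},
    )
    ET.SubElement(qm, "attribute", {"name": "name", "value": queue_manager})

    # Each queue points back at its containing queue manager.
    for index, queue in enumerate(queues, start=1):
        asset = ET.SubElement(
            assets, "asset",
            {"class": "$MQ-Queue", "repr": queue, "ID": f"q{index}"},
        )
        ET.SubElement(asset, "attribute", {"name": "name", "value": queue})
        ET.SubElement(asset, "reference", {"name": "$QueueManager", "assetIDs": "qm1"})

    return ET.tostring(doc, encoding="utf-8")


def build_post(body):
    """Prepare (but do not send) the POST request that would carry the XML body."""
    return urllib.request.Request(
        IGC_BASE + ASSETS_PATH,
        data=body,
        method="POST",
        headers={"Content-Type": "application/xml"},
    )


body = build_asset_doc("QM1", ["ORDERS.IN", "ORDERS.OUT"])
request = build_post(body)
```

Sending the request would additionally need credentials and certificate handling for your environment, which are left out here.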

Quite cool…….and more to come…..I will be documenting my experiences with the API and the various use cases that I encounter.

What use cases do YOU have in mind?    🙂

Next post in this series: Open IGC: a Simple Messaging Use Case

Ernie

11 Responses to “Open IGC is here!”

  1. Wallace Wong Says:

    To accomplish lineage we are leveraging SDLC artifacts, i.e. source-to-target mappings, either entered directly into FastTrack or imported from spreadsheets into FastTrack. The good thing is that we use physical metadata for the mappings, and this produces the lineage. The criticism is that things get out of sync between FastTrack and the implemented code for whatever reason (quick changes, human error, etc.). What I described is the interim state. So we are looking for ways to ingest the lineage from various implemented technologies (some non-IBM SQL, SAS, Informatica, etc.) versus relying on manually maintained source-to-target mappings, but also to be able to decipher from a business context what was loaded, versus a lot of technical mumbo-jumbo that third-party bridges seem to bring in. This looks like a potential gap filler in our end-state solution.

    • dsrealtime Says:

      Some of the key things for you to consider are whether the sources and targets are some “abstract” concept (like a “green screen”) or just a file or table like any other, and then, for the “processes” that you want to illustrate, how much “internal” structure you want to show (sub-processes that require calling out, or that you want to “drill down” into). In each case, also consider how important custom icons, custom names, and relationships/properties among those objects are.

  2. Christopher Grote (@cmgrote) Says:

    How about NoSQL stores as a use case? e.g. using Cloudant as just one example: a cluster containing databases, containing tables, containing JSON docs, etc.

    • dsrealtime Says:

      Hi Chris. Thanks for the note. The NoSQL stores are certainly a possibility. A lot depends on exactly what you want to represent and whether it truly needs a whole new structure. In the case of many of the NoSQL’s, one could argue that in the end, it is just “columns”, with datatypes and other such things, and could be illustrated by a regular Database Table or Data File. But absolutely — if you wanted a whole new set of objects that actually represent the cluster (and maybe then, they point to real “tables” whose object types already exist — a cool hybrid possibility) this is a perfect use case! Some qualifying questions might be things like: …Is the cluster something that I need/want to separately “govern”? Does it have its own set of characteristics and properties that I would like to manage within IGC? Would it have its own “Stewards”? Would I want to do lineage against the Cluster itself (above the Tables)? Would it be nice to have my own icons for them?

      Keep in mind also that (I haven’t tried this with Cloudant specifically) where a structure “looks” relational, and supports at least ODBC and/or JDBC, we can “automate” the import of the metadata via a Connector. In those cases, using Open IGC for “auxiliary” metadata (like the Cluster, etc., which does not appear in the normal “Host…Database…” tree of the Implementation Model) will work nicely, but we might want to leave regular tabular metadata to the Metadata Asset Manager and an automated Connector-based import.

      Ernie

  3. Open IGC (Information Governance Catalog) documentation | IBM Brian Says:

    […] Our friend Ernie has an excellent write-up about this exciting feature released late July 2015: https://dsrealtime.wordpress.com/2015/07/29/open-igc-is-here/ […]

  4. thuan72 Says:

    Great with Open IGC.
    Do you have a bundle documenting AD groups assigned to reports and tables?

    • dsrealtime Says:

      Sorry I missed this! I don’t have a bundle that points to reports and tables, but that shouldn’t be hard to do, with Database Tables being the simplest one to try initially. The accessGroups bundle that I do have only goes down to database, but it should be fairly easy to follow and then extend down to the schema and table level (look at the asset nodes in the “flow” xml upload). Report assets are a bit trickier but follow the same concept; the identity can be more challenging because they tend to have some fairly extensive folder structures.

      –ernie

  5. IGC_Beginner Says:

    Hello! Below is the JSON payload. I tried to create a new term and a new category using IBM IGC.
    It throws a “400” error for both payloads. However, I was able to update an existing term to add new libraries using the lib RID.

    Payload for creating a new term
    {
    "_type":"Term",
    "Short_description":"MR Data elements",
    "name":"MRA"
    "status":"Accepted",
    "Parent_Category":"6662c0f2.2enetefl6u6dq9kgonh",
    }
    Error code is 400 - Term not supported

    Payload for creating a new category
    {
    "_type":"Category",
    "Short description":"MR Data elements",
    "name":"MRA"
    }
    Error code is 400 - Category not supported

    Please advise. Thanks

  6. kumar Says:

    Hello there! Below is the JSON to create terms. I understand that we can create one resource with one POST call. Is there any way to create multiple resources with a single POST?

    Below is the JSON. Your help is much appreciated.
    {
    "_type":"term",
    "name":"cast",
    "parent_category":"RID of the category",
    "status":"CANDIDATE",
    "assigned_assets":{
    "items":["RID of the tech asset1","RID of the asset"]
    }
    }

    {
    "_type":"term",
    "name":"ct",
    "parent_category":"RID of the category",
    "status":"CANDIDATE",
    "assigned_assets":{
    "items":["RID of the tech asset1","RID of the asset"]
    }
    }

    • dsrealtime Says:

      There isn’t any way, with the IGC REST API, to create multiple Terms in a single call. Probably best that you create a file and then use istool and import multiple terms from .csv or .xml.
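If scripting against the REST API is still preferred over an istool import, the one-Term-per-call limit simply means looping and issuing one POST per Term. A minimal sketch in Python, assuming a hypothetical host name and an assumed `/assets` endpoint path (verify both for your installation); the Term fields are the placeholders from the question above. The requests are only prepared here, not sent.

```python
import json
import urllib.request

# Hypothetical host and assumed endpoint path, for illustration only.
IGC_ASSETS_URL = "https://igc.example.com:9445/ibm/iis/igc-rest/v1/assets"


def build_term_requests(terms):
    """Prepare one POST request per Term, since each call creates a single resource."""
    prepared = []
    for term in terms:
        body = json.dumps(term).encode("utf-8")
        prepared.append(
            urllib.request.Request(
                IGC_ASSETS_URL,
                data=body,
                method="POST",
                headers={"Content-Type": "application/json"},
            )
        )
    return prepared


# Placeholder values copied from the question; real RIDs would go here.
terms = [
    {"_type": "term", "name": "cast",
     "parent_category": "RID of the category", "status": "CANDIDATE"},
    {"_type": "term", "name": "ct",
     "parent_category": "RID of the category", "status": "CANDIDATE"},
]

pending = build_term_requests(terms)
# Each prepared request would then be sent individually, with whatever
# authentication and certificate handling the environment requires.
```

For large glossaries the file-based import is likely still the simpler route; the loop is mostly useful when Terms are generated programmatically.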


