Open IGC is here!

July 29, 2015 — dsrealtime

Hi Everyone….

It’s been a while since I’ve posted anything. I’ve been busy researching and supporting the many new things added in the past year: data lineage, advanced governance (stewardship and workflow), and now “Open IGC”. This is the ability to create nearly “any” type of new object within the Information Governance Catalog and then connect it to other objects with a whole new lineage paradigm. If you are a user of Extensions (Extension Mapping Documents and Extended Data Sources), think of Open IGC as the “next evolution” for extending the Information Server repository. If you are a user of DataStage, think of what it would be like to create your own nested objects and hierarchies, with their own icons and their own level of “Expand” (like zoom) capability for drilling into further detail.

This new capability is available at Fix Central for 11.3 with Roll-up 16 (RU 16) and all of its pre-requisites (FP 2 among other things) and is immediately available in 11.5.

[…and is formally documented here: http://www-01.ibm.com/support/docview.wss?uid=swg21699130 ]

[also see here for tips on the IGC API for general management of assets: http://www-01.ibm.com/support/docview.wss?uid=swg27047054&aid=1 ]

So exactly what is this “Open IGC”?

Open IGC (you may also hear or see “Open IGC for Lineage” or “Open IGC API”) gives us the ability to define entirely our “own” object types. They exist with their own names, their own icons, and their own set of dedicated properties. They can have their own containment relationships and represent just about “anything” you want. They are available via the detailed “browse” option and appear in the query tool. They can be assigned to Terms and vice versa, participate in Collections, and be included in Extension Mappings. Once you have defined them, you can describe your own lineage among these objects via the same API, and define what you perceive as “Operational” vs “Design” based lineage (lineage without needing to use Extensions, and supporting “drill down” capabilities as we see with DataStage lineage).

Here are some use cases:

a) Represent a data integration/transformation process…or “home grown” ETL. This is the classic use case. Define what you call a “process” (like a DataStage Job)….and its component parts…the subparts like columns and transformations, and properties that are critical. Outline the internal and external flows between such processes and their connections to other existing objects (tables, etc.) in the repository.

b) Represent some objects that are “like” Extended Data Sources, but you want more definition…..such as (for example) all the parts of an MQ Series or other messaging system configuration…objects for the Servers, the Queue Managers, and individual Queues. Give them their own icons, and their own “containment” depths and relationships. Yes — you could use Extensions for this, but at some point it becomes desirable to have your own custom properties, your own object names for the user interface, and your own creative icons!

c) Overload the catalog and represent some logical “concept” that lends itself to IGC’s graphical layout features, but isn’t really in the direct domain of Information Integration. One site I know of wants to show something with “ownership”…but illustrate it graphically. They are interested in having “responsibility roles” illustrated as objects…whose “lineage” is really just relationships to the objects that they control. Quite a stretch, and it would need some significant justification vs using tooling more appropriate for this use case, but very do-able via this API.

It’s all done based on XML and REST, and does not require that you re-install or otherwise re-configure the repository. You design and register a “bundle” with your new assets and their properties, and then use other REST invocations to “POST” new instances of the objects you are representing.
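The second half of that pattern (describing new instances in XML once a bundle is registered) might look roughly like the sketch below, using the messaging example from use case (b). It only builds the XML text and does not call a server. The bundle ID `MQDemo`, the class names, the `$QueueManager` reference name, the namespace, and the element layout are all assumptions for illustration; the formal documentation linked above is the authority on the real format.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace and element names below follow the general shape
# of an Open IGC asset document; check them against the formal documentation.
FLOW_NS = "http://www.ibm.com/iis/flow-doc"
ET.register_namespace("", FLOW_NS)


def tag(name):
    """Qualify an element name with the (assumed) document namespace."""
    return f"{{{FLOW_NS}}}{name}"


def build_asset_doc(bundle_id, instances):
    """Return XML text describing new instances of bundle-defined classes.

    Each instance is a dict with "id", "cls", "name", and optionally
    "parent": a (reference_name, parent_id) pair for containment.
    """
    doc = ET.Element(tag("doc"))
    assets = ET.SubElement(doc, tag("assets"))
    for inst in instances:
        asset = ET.SubElement(assets, tag("asset"), {
            "class": f"${bundle_id}-{inst['cls']}",
            "repr": inst["name"],
            "ID": inst["id"],
        })
        ET.SubElement(asset, tag("attribute"),
                      {"name": "name", "value": inst["name"]})
        if "parent" in inst:
            ref_name, parent_id = inst["parent"]
            ET.SubElement(asset, tag("reference"),
                          {"name": ref_name, "assetIDs": parent_id})
    # List every instance as a complete asset for the import.
    all_ids = " ".join(inst["id"] for inst in instances)
    ET.SubElement(doc, tag("importAction"), {"completeAssetIDs": all_ids})
    return ET.tostring(doc, encoding="unicode")


# Hypothetical "MQDemo" bundle: one queue manager containing one queue.
xml_text = build_asset_doc("MQDemo", [
    {"id": "qm1", "cls": "QueueManager", "name": "QM_PROD"},
    {"id": "q1", "cls": "Queue", "name": "ORDERS.IN",
     "parent": ("$QueueManager", "qm1")},
])
print(xml_text)
```

The resulting text would be the body of the REST “POST” that creates the instances; the endpoint and authentication are left out because they depend on your installation.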

Quite cool…….and more to come…..I will be documenting my experiences with the API and the various use cases that I encounter.

What use cases do YOU have in mind? 🙂

Next post in this series: Open IGC: a Simple Messaging Use Case

Ernie

Posted in RealTime. 26 Comments »

Wallace Wong Says:
August 6, 2015 at 7:59 am

To accomplish lineage we are leveraging SDLC artifacts, i.e. source-to-target mappings, which we either enter directly into FastTrack or import into FastTrack from spreadsheets. The good thing is that we use physical metadata for the mappings, and this produces the lineage. The criticism is that things get out of sync between FastTrack and the implemented code for whatever reason (quick changes, human error, etc.). What I described is our interim state. So we are looking for ways to ingest the lineage from the various implemented technologies (some non-IBM SQL, SAS, Informatica, etc.) instead of relying on manually maintained source-to-target mappings, while still being able to decipher from a business context what was loaded, rather than the technical mumbo-jumbo that third-party bridges seem to bring in. This looks like a potential gap filler in our end-state solution.

dsrealtime Says:
August 7, 2015 at 11:52 am
One of the key things for you to consider is whether the sources and targets are some “abstract” concept (like a “green screen”) or just a file or table like any other. Then, for the “processes” that you want to illustrate, how much “internal” structure do you want to show (sub-processes that require calling out, or that you want to “drill down” into)? And in each case, how important are custom icons, custom names, and relationships/properties among those objects?

Christopher Grote (@cmgrote) Says:
August 11, 2015 at 11:44 am

How about NoSQL stores as a use case? e.g. using Cloudant as just one example: a cluster containing databases, containing tables, containing JSON docs, etc.

dsrealtime Says:
August 11, 2015 at 1:54 pm
Hi Chris. Thanks for the note. The NoSQL stores are certainly a possibility. A lot depends on exactly what you want to represent and whether it truly needs a whole new structure. For many of the NoSQL stores, one could argue that in the end it is just “columns”, with datatypes and other such things, and could be illustrated by a regular Database Table or Data File. But absolutely, if you wanted a whole new set of objects that actually represent the cluster (and maybe then they point to real “tables” whose object types already exist, a cool hybrid possibility), this is a perfect use case! Some qualifying questions might be: Is the cluster something that I need/want to separately “govern”? Does it have its own set of characteristics and properties that I would like to manage within IGC? Would it have its own “Stewards”? Would I want to do lineage against the Cluster itself (above the Tables)? Would it be nice to have my own icons for them?

Keep in mind also (I haven’t tried this with Cloudant specifically) that where a structure “looks” relational and supports at least ODBC and/or JDBC, we can “automate” the import of the metadata via a Connector. In those cases, using Open IGC for “auxiliary” metadata (like the Cluster, etc., which do not appear in the normal “Host…Database…” tree of the Implementation Model) will work nicely, but we might want to leave regular tabular metadata to the Metadata Asset Manager and an automated Connector-based import.

Ernie

Open IGC (Information Governance Catalog) documentation | IBM Brian Says:
August 13, 2015 at 7:54 am

[…] Our friend Ernie has an excellent write-up about this exciting feature released late July 2015: https://dsrealtime.wordpress.com/2015/07/29/open-igc-is-here/ […]

thuan72 Says:
October 28, 2015 at 1:21 pm

Great with open igc.
Do you have a bundle documenting AD groups assigned to reports and tables?

dsrealtime Says:
November 6, 2015 at 5:54 am
Sorry I missed this! I don’t have a bundle that points to reports and tables, but that shouldn’t be hard to do, with Database Tables being the simplest one to try initially. The accessGroups bundle that I do have only goes down to the database level, but it should be fairly easy to follow and then extend down to the schema and table level (look at the asset nodes in the “flow” XML upload). Report assets are a bit trickier. The concept is the same, but the identity can be more challenging because they tend to have some fairly extensive folder structures.

–ernie

IGC_Beginner Says:
August 1, 2016 at 3:10 pm

Hello! Below are the JSON payloads. I tried to create a new term and a new category using IBM IGC.
Both payloads return a “400” error. However, I was able to update an existing term to add new libraries using the lib RID.

Payload for creating a new term
{
"_type":"Term",
"Short_description":"MR Data elements",
"name":"MRA"
"status":"Accepted",
"Parent_Category":"6662c0f2.2enetefl6u6dq9kgonh",
}
Error code is 400 - Term not supported

Payload for creating a new category
{
"_type":"Category",
"Short description":"MR Data elements",
"name":"MRA"
}
Error code is 400 - Category not supported

Please advise. Thanks
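One thing worth ruling out before looking at the API itself: as posted, the term payload has no comma after the "name" entry and a trailing comma before the closing brace, so a strict JSON parser rejects it before any property is examined (curly quotes pasted from a formatted page would break it as well). The sketch below checks a payload locally with Python's standard `json` module. The lowercase type and property names in the corrected version are an assumption based on the style of other payloads in this thread, not a confirmed fix for the 400 error.

```python
import json

# The term payload as posted, with straight quotes: there is no comma after
# the "name" entry and a trailing comma before the closing brace.
posted = '''{
"_type":"Term",
"Short_description":"MR Data elements",
"name":"MRA"
"status":"Accepted",
"Parent_Category":"6662c0f2.2enetefl6u6dq9kgonh",
}'''


def check_json(text):
    """Return (True, parsed) for valid JSON, else (False, error description)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


ok_posted, detail = check_json(posted)
print("posted payload valid:", ok_posted, "-", detail)

# Assumption: lowercase "_type" value and snake_case property names, as used
# in other payloads in this thread. Building the text with json.dumps
# guarantees the commas and quoting are syntactically correct.
corrected = json.dumps({
    "_type": "term",
    "short_description": "MR Data elements",
    "name": "MRA",
    "status": "Accepted",
    "parent_category": "6662c0f2.2enetefl6u6dq9kgonh",
}, indent=2)

ok_corrected, parsed = check_json(corrected)
print("corrected payload valid:", ok_corrected)
```

Generating the request body from a dictionary, rather than typing JSON by hand, avoids this class of syntax error entirely.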

kumar Says:
September 19, 2016 at 3:20 pm

Hello there! Below is the JSON to create terms. I understand that we can create one resource with one POST call. Is there any way to create multiple resources with a single POST call?

Below is the JSON. Your help is much appreciated.
{
"_type":"term",
"name":"cast",
"parent_category":"RID of the category",
"status":"CANDIDATE",
"assigned_assets":{
"items":["RID of the tech asset1","RID of the asset"]
}
}

{
"_type":"term",
"name":"ct",
"parent_category":"RID of the category",
"status":"CANDIDATE",
"assigned_assets":{
"items":["RID of the tech asset1","RID of the asset"]
}
}

dsrealtime Says:
September 21, 2016 at 7:06 am
There isn’t any way, with the IGC REST API, to create multiple Terms in a single call. Probably best that you create a file and then use istool and import multiple terms from .csv or .xml.
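If staying with REST is preferred over an istool import, the alternative is a client-side loop that issues one POST per Term. The sketch below prepares those requests without sending anything. The host name, the `/ibm/iis/igc-rest/v1/assets` path, and the RID placeholder are assumptions for illustration; substitute the values for your own environment and add authentication before sending.

```python
import json
import urllib.request

# Assumption: placeholder host and endpoint path, not confirmed in this thread.
BASE_URL = "https://igc.example.com:9443/ibm/iis/igc-rest/v1"


def build_term_requests(names, category_rid, status="CANDIDATE"):
    """Return one prepared POST request per term name (nothing is sent)."""
    prepared_requests = []
    for name in names:
        payload = {
            "_type": "term",
            "name": name,
            "parent_category": category_rid,
            "status": status,
        }
        prepared_requests.append(urllib.request.Request(
            url=f"{BASE_URL}/assets",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        ))
    return prepared_requests


# Term names taken from the question above; the RID is a placeholder.
prepared = build_term_requests(["cast", "ct"], "RID of the category")
for req in prepared:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To send for real: urllib.request.urlopen(req), once credentials are added.
```

For large glossaries the file-based import is likely to be simpler, since a loop like this makes one round trip per Term and needs its own error handling.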

Gopal Says:
November 9, 2016 at 12:25 pm

Hi Ernie, Is it possible to have a work-flow for the new object that we create through this Open IGC? Like the same work-flow that we have already for Terms and Categories?
Let’s say we create a new object like a Term to store semantic information of any Tables and Columns and link them with actual physical fields, would it be possible to use the same IGC work-flow process for this?

Open IGC (Information Governance Catalog) documentation | Midwest Metadata Group Says:
November 11, 2016 at 12:25 pm

Neeraj Sinha Says:
August 9, 2017 at 7:23 am

Hello,
I am using IGC version 11.5, and one of my client’s requirements is to export and import the asset (term history) development logs from the UAT environment to production. I have a PMR open with IBM, and they said the Development Glossary cannot be exported to the XML or XMI file format. I also do not see any option to export term history under IGC > Administration > Tools > Export on the Development Glossary tab; however, term history can be exported for the Catalog (published items).

Question:
What is the best practice for exporting and importing term history or development logs to the same version of IGC? Can Open IGC be a viable option? If yes, can we leverage the REST API to export and import all the term development logs?

Can you please add this use case, with an example, to this Open IGC forum? I would appreciate it if this can be achieved. I heard from IBM that this is a burning requirement from other customers as well, and that they may include this feature in the IGC portal in an upcoming release.

Thank You!
NKS

dsrealtime Says:
August 21, 2017 at 10:05 am
Sorry I didn’t reply to this sooner; you probably already have a solution. There is no direct way to perform this operation, though the Dev Glossary Query tool, which was added last year, might be the best approach. Dump the necessary information from the Dev Glossary via the Query Tool (the textual comments, statuses, etc.), and then load that into your production glossary, using XML or possibly the IGC API, by putting it into the “Notes” field. That way it will be kept and promoted as more formal textual entries.

- Neeraj Sinha Says:
  August 21, 2017 at 1:39 pm
  Thank you. I appreciate your suggestion, and this is really helpful.

Pavan Says:
November 20, 2018 at 7:19 am

Hi Ernie, can we change the primary key (name) in an Open IGC bundle? We have a requirement to use System Code as the primary key because names can repeat. Any idea how to handle such cases?
The problem is that we loaded the assets to the bundle, but it created multiple assets with the same name. Since System Code is unique, we need to change the primary key in the asset XML. Is that possible?

dsrealtime Says:
November 20, 2018 at 7:41 am
If you mean the bundle ID, there is no way I know of that you can change it (except to delete the bundle and re-load). It is key to the uniqueness of that bundle, and then also utilized as you know, throughout all of the publish and flowXMLs….

- Pavan Says:
  November 21, 2018 at 2:50 am
  Hi Ernie, thanks for your response. I don’t mean the bundle ID. Within the asset XML, I believe the name of the asset is the primary key. Can we instead make another field, for example “System Code”, the primary key in the asset XML?
  
  Scenario:
  2 assets are created in IGC with same name
  Sample Asset1 with System Code AG1276
  Sample Asset1 with System Code GE3483
  
  To avoid this situation, can we make System Code the primary key and load the assets to IGC?

Laurent Says:
January 22, 2019 at 8:14 am

Hello Ernie,

we use version 11.7 and want to use Open IGC.

Is it possible to use igc-rest-explorer to create new Information Assets entries like, for example, descriptions of Web Services or Data Science Projects?

We do this for Data Classes, for example, but can not do the same for Data Science Projects; the igc-rest-explorer interface allows us to see the attributes of Data classes for the creation, but for Data Science Projects, there is no significant information.
We tried this syntax but the return is 403 Access Denied:
{
   "_type": "analytics_project",
   "short_description": "our description",
   "name": "aaaaapLTGT AP",
   "class_code": "dsx.AnalyticsProject"
}

Could you tell us what to use in this case, for example?

Thanking you,

Laurent

Tristan Lefloch Says:
March 12, 2019 at 11:08 am

I am currently working on a system using Open IGC, and I don’t see a way to create a link between two bundle-defined assets to have a relation similar to the term relations “Related Terms” or “Synonyms”. Could you help me with that?

Thank you!
Tristan LEFLOCH

dsrealtime Says:
March 12, 2019 at 11:34 am
Hi Tristan… best way to do that is to establish a Custom Attribute of the “Relationship” type. It is meant for creating your own user managed “one to many” relationships between assets, as you might with Related Terms. Synonyms are a bit different, because they create a group of peers, but you can perfectly mimic, with your own Relationship Name and hyperlink in each direction, the behavior of asset relationships that is most like “Related Terms”.

- Tristan Lefloch Says:
  March 12, 2019 at 11:43 am
  Yes I figured it was a good solution for this type of relation, but unfortunately when I create a Custom Attribute via the Administration area I only have the choice of “Text”, “Predefined Values”, “Date” and “Number” for my attribute type, I don’t see any “Relationship” type like you and the Knowledge Center suggest.

dsrealtime Says:
March 12, 2019 at 11:48 am

Could be you are on an older release. I don’t recall exactly when that came along, but it was probably some time in the wave of updates to 11.5.

dsrealtime Says:
March 12, 2019 at 11:49 am

Best guess I can tell from looking at my archived notes is that it was part of 11.5 rollup 7.

Tristan Lefloch Says:
March 13, 2019 at 3:31 am
OK, I will take a look at that, thank you very much.

Igor Franco Says:
May 31, 2019 at 10:48 am

Hi Ernie,
I am working on a PoC at a customer in Brazil, and I am creating a big flow that involves a number of cloud-based environments. I am now trying to create a good visual lineage using Open IGC, but I have some questions:
1 - How do I create assets for which I can select a data field in the lineage (like tables can via “select columns”)?
https://ibb.co/JxfxtwY (image)
https://ibb.co/WGPC7vs (image)
2 - I have an asset that has NO sourceid field in the subflow, but it appears as a source with no target arrow pointing to it. What am I doing wrong?
https://ibb.co/3h24fVR (image)

Thanks in advance
Igor Franco
