IBM and Hortonworks!

Hi everyone…

Some exciting recent news, if you haven’t seen it yet: a few days ago at the DataWorks Summit/Hadoop Summit in San Jose, IBM and Hortonworks announced a new relationship! Read about it here to learn how IBM and Hortonworks are partnering to further the efforts of our customers to expand their big data solutions.

More important for this blogger is the increased attention this brings to Apache Atlas. Apache Atlas, if you aren’t already familiar, is an evolving open source approach to enterprise information governance, metadata management, and lineage […go here for a general overview: ]. One highlight from the news above draws particular attention to the contributions IBM and Hortonworks are making to this effort:

“Partnering On Apache

As part of their wide-ranging partnership, the companies will also team to advance the development of Unified Governance (IBM BigIntegrate, IBM BigQuality and IBM Information Governance Catalog) on the Apache Atlas open platform. …”

It’s all a work-in-progress, but this is significant news that will hopefully accelerate the initiative.   Have any of you started working heavily with Atlas?   Which release?  Are you using it exclusively with Hadoop, or externally?   Have you interchanged metadata with Atlas and IGC?  Considering it?    Share your experiences!







Re-defining Data Lineage

Well..not so much “re-defining” as re-fining, and adding clarity to the definition and the discussion.  Please find the time to review this excellent blog entry by my IBM colleague, Distinguished Engineer and thought leader, Mandy Chessell…


OpenIGC Accelerator

Hi Everyone…

Happy Spring! [for those of you in the northern hemisphere  ; )  ].   Great time to start “cleaning out” and “fixing up” things….whether around the house, or in the corners of our special projects.    In that latter category, I have “tidied up” a little utility I have been working on to assist everyone in building their OpenIGC prototypes or to assist in “getting to know” OpenIGC — a “form builder” for the “Publishing XML” needed to realize instances of your newly modeled and registered OpenIGC artifacts.

A lot of you have expressed the desire to get deeper into OpenIGC, but have found it difficult to get your arms around the xml aspects of it.  Either that, or cutting and pasting xml in a text editor is just not your thing.   For those reasons and others, I have been exploring various ways that a user interface could be created for OpenIGC assets — without resorting to an elegant albeit complex and lengthy GUI development effort.

Digging around, I found some open source javascript tooling to assist, and brushed off enough javascript and html skills to put it together.     At the url listed below you will find a tool that allows you to upload your bundle descriptor and generate a self-populating “form” to construct a publishing xml document for OpenIGC.   It also provides options to save the publishing xml to disk (for future use/editing) or to directly cut and paste into the igc-rest-explorer page.

It’s not “perfect” (I suspect it probably has its share of anomalies if you click on things out of order), but is hopefully a “helper” that will accelerate your efforts to implement custom assets for governance within IGC.

Please carefully READ the instructions (there is a link to instructions and a simple screen shot on the initial page).    The tool does not entirely “hide” your xml, and it REQUIRES that you understand your bundle (if you don’t know what I am talking about regarding OpenIGC and bundles, please review the blog series starting with )! ….still, it does a few nice things for you:

  • Performs all the xml tagging/formatting, ensuring that your xml remains “well-formed”
  • Presents a “pull-down” select list for your classNames and attribute enumerations
  • Generates the list of attributes (properties) for whatever class you select
  • Automatically generates the unique “assetIDs” for the asset instances that you define
  • Generates and presents a pull-down list for selecting “parent” assetIDs
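
For readers curious what that kind of generation involves, here is a minimal Python sketch of the same idea. It is not the tool's code. The element and attribute names (doc, assets, asset, attribute, reference, importAction) and the namespace follow my recollection of the publishing XML format, so treat them as assumptions and check them against your own bundle. The bundle, class, and attribute names in the example are hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET

# Assumption: namespace of the OpenIGC publishing ("flow") document.
FLOW_NS = "http://www.ibm.com/iis/flow-doc"


def build_publishing_xml(asset_specs):
    """Build a publishing XML string from a list of asset descriptions.

    Each spec is a dict with:
      class_name  - asset class from your bundle (hypothetical names below)
      name        - display name of the instance
      attributes  - optional dict of extra attribute name/value pairs
      parent      - optional index of an EARLIER spec that is the parent
      parent_ref  - reference name to use when parent is given
    """
    counter = itertools.count(1)
    doc = ET.Element("doc", {"xmlns": FLOW_NS})
    assets_el = ET.SubElement(doc, "assets")
    asset_ids = []

    for spec in asset_specs:
        asset_id = f"a{next(counter)}"  # unique ID, generated for the user
        asset_ids.append(asset_id)
        asset_el = ET.SubElement(
            assets_el,
            "asset",
            {"class": spec["class_name"], "repr": spec["name"], "ID": asset_id},
        )
        ET.SubElement(asset_el, "attribute", {"name": "name", "value": spec["name"]})
        for attr_name, attr_value in spec.get("attributes", {}).items():
            ET.SubElement(asset_el, "attribute", {"name": attr_name, "value": attr_value})
        if spec.get("parent") is not None:
            # Point at the generated ID of the parent instead of typing it by hand.
            ET.SubElement(
                asset_el,
                "reference",
                {"name": spec["parent_ref"], "assetIDs": asset_ids[spec["parent"]]},
            )

    ET.SubElement(doc, "importAction", {"partialAssetIDs": " ".join(asset_ids)})
    # ElementTree escapes special characters, so the output stays well-formed.
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(build_publishing_xml([
        {"class_name": "$MyBundle-Folder", "name": "Sales & Marketing"},
        {"class_name": "$MyBundle-Report", "name": "Q1 summary",
         "attributes": {"owner": "ernie"}, "parent": 0, "parent_ref": "$Folder"},
    ]))
```

Letting a library write the tags is what keeps the document well-formed even when a name contains characters such as & or quotes, and generating the IDs in one place is what makes the parent pull-down possible.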

As noted above, I can’t promise that it is entirely bug-free, but I can say that it has already helped me accelerate the prototyping of several bundles that I have been building recently to illustrate the power of OpenIGC for extending the repository.    Have fun, good luck, and please let me know how you make out in using this tool!       –ernie


Accessing IGC via cURL

Hi Everyone…

This is a long overdue post …pointing to an article written by one of my IBM colleagues about accessing IGC metadata via its REST APIs — using cURL as your tooling.   He provides some excellent examples, complete with screen shots and recommendations.  Enjoy!
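
If you would rather script the same kind of calls than type them at the command line, here is a small Python sketch that builds the URLs and the Basic authentication header that a cURL call with -u would send. It is not taken from the article. The host name, port, and credentials are placeholders, and the paths and query parameter names (igc-rest/v1, assets, search, types, text) are written from my recollection of the IGC REST API, so verify them in the igc-rest-explorer page on your own system.

```python
import base64
from urllib.parse import quote, urlencode

# Assumptions: placeholder host, a typical port, and the v1 REST root path.
IGC_BASE = "https://igc-host:9445/ibm/iis/igc-rest/v1"


def basic_auth_header(user, password):
    """Return the header that `curl -u user:password` would send."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def asset_url(rid, base=IGC_BASE):
    """URL for fetching one asset by its repository ID (RID)."""
    return f"{base}/assets/{quote(rid, safe='')}"


def search_url(asset_type, text, base=IGC_BASE):
    """URL for a simple text search restricted to one asset type."""
    return f"{base}/search/?{urlencode({'types': asset_type, 'text': text})}"


if __name__ == "__main__":
    # Hypothetical credentials and search values, for illustration only.
    print(basic_auth_header("isadmin", "secret"))
    print(search_url("term", "customer"))
```

The resulting URL and header can be passed to whatever HTTP client you prefer.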





Tech Talk on Information Analyzer: Virtual Tables

Hi all.

Just wanted to pass along news of another Tech Talk.  This one on Information Analyzer and Virtual Tables.   Here are the details and the link to Eventbrite to register…

October 20, 2016
Time: 9:00 EST
Topic:  Virtual Table Feature in Information Analyzer

This presentation will provide a comprehensive overview of the Virtual Table feature in Information Analyzer. A Virtual Table is essentially a way to filter or limit the data from the source repository while performing IA analysis such as Column Analysis, Key Analysis, and Data Rules Analysis. Virtual Tables have been available in the IA workbench since version 8.1, with quite a few limitations. A new type of Virtual Table called ‘SQL Virtual Table’ was introduced in 11.5; it eliminates those limitations and allows users to define any complex SQL query to filter the data during IA analysis. It also allows users to query exceptions directly from the source repository with known queries. At this moment, a SQL Virtual Table can only be defined using the IA REST API or CLI. In this session, we will also see a demonstration of this feature.

Who should attend this session? – For all skill levels of current and prospective Information Analyzer users from both IT and line of business.

This topic will be presented by Suresh Tirumalasetti, Software Developer, Information Analyzer. Suresh has extensive experience in software development and customer support, especially for Information Analyzer. Suresh is located in Bangalore, India.

To attend you must register here:

Password: Governance

Tech Talk on OpenIGC !

The session outlined below was held last week. Marc did a fabulous job outlining OpenIGC and its value for helping you achieve governance for ALL of your important metadata assets. The recording can be found at



Hi all… wanted everyone to hear about the upcoming “Tech Talk” that is scheduled for next week. Marc Haber, Offering Manager for our metadata offerings, will be presenting, while others and I will be monitoring the chat room for questions and discussion.

Here are the details:

Event Name : Information Governance Catalog
Event Date : Wednesday, Sept 14
Event Time :  1 PM – 2 PM US (EDT) Eastern Daylight Time
Presented by : Marc Haber, Offering Manager
This presentation will provide a comprehensive overview of the ability to extend the Information Governance Catalog and support governance across new and alternate Data Sources or Systems. Understand how customers satisfy their requirements for a comprehensive Governance implementation or metadata management system with Information Governance Catalog. We will explore the process for defining and structuring new Asset Types and publishing information specific to Assets. Lastly, we will explore the process to govern such Assets: lending meaning through Glossary Terms, documenting requirements through Governance Rules, and mapping information to support Data Lineage and Compliance Reporting.

This topic will be presented by Marc Haber, Offering Manager for Information Governance Catalog and Data Governance in general across Information Server. Marc has extensive experience with Business Glossary, Metadata Workbench and Governance Catalog, helping customers implement governance initiatives or satisfy metadata management requirements.

Registration –

Password:  Governance

Apache Atlas: GET-ting familiar with the REST API

Hi everyone. Just posted the second in a series of recordings related to Apache Atlas, the Open Source initiative for metadata management and governance for Hadoop. Many of you have been asking about how to get metadata “out” of Apache Atlas so that you can load it into IGC or other repositories, or just use it for special governance reporting purposes. In this recording we take a quick look at some of the key “GET” functions of the Apache Atlas REST API, and how you can easily do testing and prototyping of these calls using only your browser.   –ernie
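
For anyone who wants to move from the browser to a script, here is a minimal Python sketch of the same kind of GET calls. It is not taken from the recording. The host and port are placeholders, the paths are the older (pre-v2) Atlas REST paths as I recall them, and hive_table is just an example type name, so compare everything with the REST documentation for your Atlas release. A secured Atlas instance will also expect credentials, which this sketch leaves out.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumptions: placeholder host, default Atlas port, legacy REST root path.
ATLAS_BASE = "http://localhost:21000/api/atlas"


def types_url(base=ATLAS_BASE):
    """URL that lists the type names known to Atlas."""
    return f"{base}/types"


def entities_by_type_url(type_name, base=ATLAS_BASE):
    """URL that lists the entity GUIDs for one type."""
    return f"{base}/entities?{urlencode({'type': type_name})}"


def entity_url(guid, base=ATLAS_BASE):
    """URL that returns the full definition of one entity."""
    return f"{base}/entities/{quote(guid, safe='')}"


def get_json(url, timeout=10):
    """Issue a plain GET, as the browser would, and parse the JSON reply."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Paste these into a browser, or pass them to get_json() against a live server.
    print(types_url())
    print(entities_by_type_url("hive_table"))
```

Keeping the URL building separate from the network call makes it easy to try the addresses in a browser first, then hand the same strings to get_json once they return what you expect.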