IBM and Hortonworks!

Hi everyone…

Some exciting recent news, if you haven’t seen it yet: a few days ago at the DataWorks Summit/Hadoop Summit in San Jose, IBM and Hortonworks announced a new relationship!   Read about it here to learn how IBM and Hortonworks are partnering to help our customers expand their big data solutions.

http://www-03.ibm.com/press/us/en/pressrelease/52572.wss?platform=hootsuite

More important for this blogger is the increased attention this brings to Apache Atlas.  Apache Atlas, if you aren’t already familiar, is an evolving open source approach to enterprise information governance, metadata management, and lineage [for a general overview, go here:  https://hortonworks.com/apache/atlas/ ].   One highlight from the news above draws particular attention to the contributions IBM and Hortonworks are making to this effort:

“Partnering On Apache

As part of their wide-ranging partnership, the companies will also team to advance the development of Unified Governance (IBM BigIntegrate, IBM BigQuality and IBM Information Governance Catalog) on the Apache Atlas open platform. …”

It’s all a work-in-progress, but this is significant news that will hopefully accelerate the initiative.   Have any of you started working heavily with Atlas?   Which release?  Are you using it exclusively with Hadoop, or externally?   Have you interchanged metadata with Atlas and IGC?  Considering it?    Share your experiences!

Ernie

Re-defining Data Lineage

Well..not so much “re-defining” as re-fining, and adding clarity to the definition and the discussion.  Please find the time to review this excellent blog entry by my IBM colleague, Distinguished Engineer and thought leader, Mandy Chessell…  https://poimnotes.blog/2017/03/19/understanding-the-origin-of-data/

–ernie

OpenIGC Accelerator

Hi Everyone…

Happy Spring! [for those of you in the northern hemisphere  ; )  ].   Great time to start “cleaning out” and “fixing up” things….whether around the house, or in the corners of our special projects.    In that latter category, I have “tidied up” a little utility I have been working on to assist everyone in building their OpenIGC prototypes or to assist in “getting to know” OpenIGC — a “form builder” for the “Publishing XML” needed to realize instances of your newly modeled and registered OpenIGC artifacts.

A lot of you have expressed the desire to get deeper into OpenIGC, but have found it difficult to get your arms around the xml aspects of it.  Either that, or cutting and pasting xml in a text editor is just not your thing.   For those reasons and others, I have been exploring various ways that a user interface could be created for OpenIGC assets — without resorting to an elegant albeit complex and lengthy GUI development effort.

Digging around, I found some open source JavaScript tooling to assist, and dusted off enough JavaScript and HTML skills to put it together.     At the URL listed below you will find a tool that allows you to upload your bundle descriptor and generate a self-populating “form” to construct a publishing XML document for OpenIGC.   It also provides options to save the publishing XML to disk (for future use/editing) or to cut and paste it directly into the igc-rest-explorer page.

It’s not “perfect” (I suspect it probably has its share of anomalies if you click on things out of order), but is hopefully a “helper” that will accelerate your efforts to implement custom assets for governance within IGC.

Please carefully READ the instructions (there is a link to instructions and a simple screen shot on the initial page).    The tool does not entirely “hide” your xml, and it REQUIRES that you understand your bundle (if you don’t know what I am talking about regarding OpenIGC and bundles, please review the blog series starting with https://dsrealtime.wordpress.com/2015/07/29/open-igc-is-here/ )!   Still, it does a few nice things for you:

  • Performs all the xml tagging/formatting, ensuring that your xml remains “well-formed”
  • Presents a “pull-down” select list for your classNames and attribute enumerations
  • Generates the list of attributes (properties) for whatever class you select
  • Automatically generates the unique “assetIDs” for the asset instances that you define
  • Generates and presents a pull-down list for selecting “parent” assetIDs
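To make the list above a little more concrete, here is a minimal Python sketch of the same ideas: let an XML library do the tagging so the document stays well-formed, generate the asset IDs automatically, and resolve “parent” references against IDs that were already generated. This is not the tool’s actual code. The function name `build_publishing_xml`, the `$MyBundle-…` class names, and the element and attribute names (`doc`, `assets`, `asset`, `attribute`, `reference`, `$parent`) are all assumptions for illustration, so check them against the publishing XML that your own bundle expects.

```python
import xml.etree.ElementTree as ET


def build_publishing_xml(instances):
    """Build a publishing-XML-style document from a list of asset instances.

    Each instance is a dict with:
      "cls"    - class name from your bundle (hypothetical values here)
      "name"   - display name of the asset
      "attrs"  - optional dict of extra attribute name/value pairs
      "parent" - optional index of an EARLIER instance in the list
    Element and attribute names are illustrative assumptions only.
    """
    doc = ET.Element("doc")
    assets = ET.SubElement(doc, "assets")
    generated_ids = []

    for number, inst in enumerate(instances, start=1):
        # Generate a unique ID per instance instead of typing it by hand.
        asset_id = f"a{number}"
        generated_ids.append(asset_id)

        asset = ET.SubElement(
            assets,
            "asset",
            {"class": inst["cls"], "repr": inst["name"], "ID": asset_id},
        )
        ET.SubElement(asset, "attribute", {"name": "name", "value": inst["name"]})
        for attr_name, attr_value in inst.get("attrs", {}).items():
            ET.SubElement(asset, "attribute", {"name": attr_name, "value": attr_value})

        # Point at the parent by its generated ID (parent must come first).
        parent_index = inst.get("parent")
        if parent_index is not None:
            ET.SubElement(
                asset,
                "reference",
                {"name": "$parent", "assetIDs": generated_ids[parent_index]},
            )

    # ElementTree escapes characters such as "&" so the output stays well-formed.
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(build_publishing_xml([
        {"cls": "$MyBundle-Folder", "name": "Sales & Marketing"},
        {"cls": "$MyBundle-File", "name": "q1.csv", "parent": 0,
         "attrs": {"format": "csv"}},
    ]))
```

Because the library handles escaping and closing tags, a name like “Sales & Marketing” cannot break the document, which is the same benefit you get from a form instead of a text editor.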

As noted above, I can’t promise that it is entirely bug-free, but I can say that it has already helped me accelerate the prototyping of several bundles that I have been building recently to illustrate the power of OpenIGC for extending the repository.    Have fun, good luck, and please let me know how you make out in using this tool!       –ernie

http://www.openigcaccelerator.com

 

Accessing IGC via cURL

Hi Everyone…

This is a long overdue post …pointing to an article written by one of my IBM colleagues about accessing IGC metadata via its REST APIs — using cURL as your tooling.   He provides some excellent examples, complete with screen shots and recommendations.  Enjoy!

https://developer.ibm.com/recipes/tutorials/interact-with-your-governance-metadata-in-igc-using-rest-apis-with-curl/
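If you would rather script the same kind of call than type it at a command line, the sketch below shows roughly what a cURL request with basic authentication looks like when built in Python. It only constructs the request and does not send anything. The host and port, the credentials, the `/ibm/iis/igc-rest/v1/` base path, and the `search` parameters shown are assumptions for illustration; take the real values from the article above and from your own environment. The helper name `igc_get_request` is hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def igc_get_request(host, path, user, password, params=None):
    """Build (but do not send) an authenticated GET request.

    The base path below is an assumption; adjust it to match your
    Information Server installation.
    """
    query = "?" + urllib.parse.urlencode(params) if params else ""
    url = f"https://{host}/ibm/iis/igc-rest/v1/{path.lstrip('/')}{query}"

    # Basic authentication: the same thing cURL does for "-u user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    # Hypothetical host, user, and password, for illustration only.
    req = igc_get_request("myserver:9443", "search",
                          "isadmin", "secret",
                          params={"types": "term", "pageSize": 5})
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping the request construction separate from the sending makes it easy to print and inspect the URL first, much as you would test a cURL command before putting it in a script.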

Ernie

 

 

 

Tech Talk on Information Analyzer: Virtual Tables

Hi all.

Just wanted to pass along news of another Tech Talk.  This one is on Information Analyzer and Virtual Tables.   Here are the details and the link to Eventbrite to register…

October 20, 2016
Time: 9:00 EST
Topic:  Virtual Table Feature in Information Analyzer

This presentation will provide a comprehensive overview of the Virtual Table feature in Information Analyzer.  A Virtual Table is essentially a way to filter or limit the data from the source repository while performing IA analyses such as Column Analysis, Key Analysis, and Data Rules Analysis. Virtual Tables have been available in the IA workbench since version 8.1, with quite a few limitations. Version 11.5 introduces a new type, the ‘SQL Virtual Table’, which eliminates all of those limitations and allows users to define any complex SQL query to filter the data during IA analysis. It also allows users to query exceptions directly from the source repository using known queries. At this moment, a SQL Virtual Table can be defined only through the IA REST API or CLI. In this session, we will also see a demonstration of this feature.

Who should attend this session? – For all skill levels of current and prospective Information Analyzer users from both IT and line of business.

This topic will be presented by Suresh Tirumalasetti, Software Developer, Information Analyzer.  Suresh has extensive experience in software development and customer support, especially for Information Analyzer.  Suresh is located in Bangalore, India.

To attend you must register here:

https://www.eventbrite.com/e/virtual-table-feature-in-information-analyzer-tickets-28227856278

Password: Governance

Tech Talk on OpenIGC !

The session outlined below was held last week.    Marc did a fabulous job outlining OpenIGC and its value for helping you achieve governance for ALL of your important metadata assets.     The recording can be found at https://youtu.be/0Tzz3fQYpRY

Ernie

 

Hi all…  I wanted everyone to hear about the upcoming “Tech Talk” that is scheduled for next week.   Marc Haber, Offering Manager for our metadata offerings, will be presenting, while others and I will be monitoring the chat room for questions and discussion.

Here are the details:

Event Name : Information Governance Catalog
Event Date : Wednesday, Sept 14
Event Time :  1 PM – 2 PM US (EDT) Eastern Daylight Time
Presented by : Marc Haber, Offering Manager
This presentation will provide a comprehensive overview of the ability to extend the Information Governance Catalog and support governance across new and alternate Data Sources or Systems. Understand how customers satisfy their requirements for a comprehensive Governance implementation or metadata management system with Information Governance Catalog. We will explore the process for defining and structuring new Asset Types and publishing information specific to Assets. Lastly, we will explore the process to govern such Assets, lending meaning thru Glossary Terms, documenting requirements thru Governance Rules and mapping information to support Data Lineage and Compliance Reporting. This topic will be presented by Marc Haber, Offering Manager for Information Governance Catalog and Data Governance in general across Information Server.  Marc has extensive experience with Business Glossary, Metadata Workbench and Governance Catalog, helping customers implement governance initiatives or satisfy metadata management requirements.

Registration –
https://www.eventbrite.com/e/ibm-tech-talk-is-open-igc-tickets-27329302680

Password:  Governance

Apache Atlas: GET-ting familiar with the REST API

Hi everyone.  Just posted the second in a series of recordings related to Apache Atlas, the Open Source initiative for metadata management and governance for Hadoop.  Many of you have been asking about how to get metadata “out” of Apache Atlas so that you can load it into IGC or other repositories, or just use it for special governance reporting purposes.   In this recording we take a quick look at some of the key “GET” functions of the Apache Atlas REST API, and how you can easily do testing and prototyping of these calls using only your browser.   –ernie

https://youtu.be/6Us2zG-WvS8
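Once the GET calls work in a browser, the same URLs can be built and fetched from a script. The following is a minimal Python sketch, not something taken from the recording. The base URL (`localhost`, port 21000), the `types` and `entities` paths, and the `hive_table` type name are assumptions based on an early Atlas sandbox; newer releases expose a different set of paths, so verify them against the REST documentation for your release. The helper names `atlas_url` and `atlas_get` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed sandbox location; change this to match your own Atlas server.
ATLAS_BASE = "http://localhost:21000/api/atlas"


def atlas_url(*segments, **params):
    """Join path segments and query parameters into one GET URL."""
    path = "/".join(urllib.parse.quote(str(s), safe="") for s in segments)
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return f"{ATLAS_BASE}/{path}{query}"


def atlas_get(url):
    """Fetch a URL and decode the JSON body (needs a reachable server)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # These are the same URLs you could paste into a browser address bar.
    print(atlas_url("types"))
    print(atlas_url("entities", type="hive_table"))
    print(atlas_url("entities", "your-entity-guid-here"))
```

Printing the URLs first and pasting them into the browser is a quick way to confirm each one before calling `atlas_get` and loading the results into IGC or a report.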

 

Check out this “Recipe” for integrating Oracle ODI metadata into IGC!

Hi Everyone…

An IBM colleague has published an excellent use case on constructing an OpenIGC bundle  and publishing metadata and lineage for ETL processes represented by Oracle ODI.  She very nicely shows how to illustrate important structures and properties of a 3rd party ETL tool.   Ultimately, this leads to publishing of actual metadata instances so that IGC users can perform lineage reports and also “govern” (assign Terms, Stewards, etc.) their critical metadata.

Enjoy!

-ernie

https://developer.ibm.com/recipes/tutorials/creation-of-new-bundle-on-infosphere-information-governance-catalog/

Apache Atlas: “your first look!”

Hi Everyone.

Just finished uploading the initial video in a series of recordings concerning Apache Atlas, the evolving open source initiative for metadata management and governance in Hadoop.

This recording is primarily designed for viewers who aren’t comfortable doing their own builds of open source solutions and also need some guidance on how to get started with the VMware images that are available for download.  It introduces the concept and helps validate what needs to be done so that the viewer can be successful with available Apache Atlas resources on the web.  It starts with the download of existing images at the Hortonworks web site, and helps validate your environment so that you can continue with tutorials that are on the Hortonworks site, and/or start playing and exploring on your own.  This is the first in a series of recordings on Apache Atlas that share early experiences and discoveries regarding this important open source initiative for governance and metadata management in Hadoop.

Recording can be found at:  https://youtu.be/C4lf_EFduqU

IBM Partners with Creative Solutions Using Open IGC !

Many of you come to these pages to understand how to extend the Information Server repository and use the various Information Governance Catalog APIs to enhance your users’ experiences and increase your governance capabilities.   But for some of you, there are too many interfaces, not enough time, not enough resources (or the right skilled resources) to complete the effort.   Please let me introduce you to various trusted IBM partners who have been trained on, and are using,  Open IGC and related techniques to help customers around the world reach their information governance goals.  Many of these partners have built formal “bridges” from various 3rd party tools, to automate the metadata import process, and most of them also offer expert consulting on IGC and governance strategies in general.

To our partners…thank you for your efforts to spread the word about Open IGC and for helping our customers make even greater progress towards their governance objectives.

To our customers…I invite you to visit these partners’ web pages, ask them about how they can assist you with Open IGC and IGC issues in general, and challenge them to further expand their offerings to extend the repository for all your governance needs.

To our future partners…if you have built or are building a creative solution for achieving governance with the Information Governance Catalog, reach out to me or my IBM teammates around the world so that we can introduce your efforts to the overall IGC community and ensure your listing is on this page.

Thank you!      –ernie

 

Compact Solutions  http://www.compactbi.com/solutions/data-lineage/

Lucid  http://www.lucidtechsol.com/get-stronger-data-governance-with-lucids-ibm-info-server-enhancements/

Manta  https://getmanta.com

Orion https://www.oriongovernance.com/

Prolifics  http://www.prolifics.com/solutions/information-management-analytics

INFORMATION-ASSET, LLC http://information-asset.com/

 

Other Vendor partners who have integrated their own direct solutions with the Information Governance Catalog via OpenIGC include:

ZALONI      www.zaloni.com

Data Migrators        www.datamigrators.com

DiYOTTA  www.diyotta.com

Pentaho (Hitachi Vantara)     www.hitachivantara.com

Denodo     https://www.denodo.com/en

AxiomSL  https://www.axiomsl.com/