Lost in translation?

editor’s note:  It gives me great pleasure to introduce Beate Porst, a good friend and colleague, who is the Offering Manager for DataStage and other parts of the Information Server platform.  Beate will be sharing her insights into Unified Governance and Integration, based on many years of experience with this platform and the issues surrounding data transformation and management.  Today she introduces some of the key new capabilities of Information Server v11.7.  Please welcome Beate to dsrealtime!   –ernie

How IBM Information Server v11.7 could have saved NASA’s 125-million-dollar Mars orbiter from becoming lost.

We all know the slogan: Measure twice, cut once. But what if we measure twice and still don’t know the context of our data?

That is what happened to NASA in 1999. The numbers were right, but the units were not: the 125-million-dollar Mars orbiter was designed to use the metric system, while mission control performed course corrections using the imperial system. The result was an altitude that was too low, and contact with the orbiter was lost. An embarrassing moment for NASA.
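To make "right numbers, wrong context" concrete, here is a minimal Python sketch. It assumes the quantity being exchanged is a thruster impulse and that each value carries a unit label. The `Impulse` class, its field names, and the value of 100 are made up for illustration; they are not mission data and not part of any Information Server API.

```python
from dataclasses import dataclass

# Conversion factor: 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical value-plus-unit record (illustration only)."""
    value: float
    unit: str  # expected labels in this sketch: "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        # Convert only when the unit is known; refuse to guess otherwise.
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


# Illustrative number, reported in imperial units.
reported = Impulse(100.0, "lbf*s")

# Without context, the consumer reads the bare number as metric.
misread_as_newton_seconds = reported.value

# With context, the same number is converted before it is used.
actual_newton_seconds = reported.to_newton_seconds()

print(misread_as_newton_seconds, actual_newton_seconds)
```

In this sketch the bare number understates the real value by a factor of roughly 4.45, which is the kind of gap that only shows up when the unit travels with the data.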

But it wasn’t the only incident. In 2003, German and Swiss engineers started to build a bridge over the river Rhine in the border town of Laufenburg. Each country started to build the bridge on its side with the goal of meeting in the middle. That was the plan. Engineers used “sea level” as the reference point. The problem is that sea level in Germany is based on the North Sea, whereas in Switzerland it is based on the Mediterranean, resulting in a 27 cm difference. Now, builders in Germany knew about the difference but apparently not whether to add it to or subtract it from their base. So they made the wrong choice.
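The arithmetic behind that wrong choice is simple enough to sketch in Python. The 27 cm figure comes from the story above; which direction counts as positive is an assumption made only for this example, and the function name `mismatch_cm` is hypothetical.

```python
# Difference between the two sea-level references, in centimetres (from the text).
OFFSET_CM = 27.0


def mismatch_cm(applied_correction_cm: float, true_correction_cm: float) -> float:
    """Height gap where the two halves meet, given the correction that was
    applied versus the correction that was actually needed."""
    return abs(applied_correction_cm - true_correction_cm)


# Assumption for this sketch: the needed correction is +27 cm.
correct = mismatch_cm(+OFFSET_CM, +OFFSET_CM)    # right sign: halves line up
wrong_sign = mismatch_cm(-OFFSET_CM, +OFFSET_CM)  # wrong sign: error doubles
no_correction = mismatch_cm(0.0, +OFFSET_CM)      # ignoring the offset entirely

print(correct, wrong_sign, no_correction)
```

Under these assumptions, applying the offset with the wrong sign is worse than not applying it at all: the gap grows to twice the original difference.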


Historical documents show that using out-of-context, incomplete or inaccurate data has caused problems ever since mankind started to develop different units of measurement.

Now the question is: how can you avoid costly incidents like these and successfully conquer your data problems, and how can IBM Information Server help you on that journey?

Whether you want to build a bridge, send an orbiter to Mars or simply try to identify new markets, you will only be as good as the data you use. This means it must be complete, in context, trusted and easily accessible in order to drive insights. As if this isn’t challenging enough, your competitiveness also depends on your organization’s ability to quickly adapt to changing conditions.

For more than a decade, IBM InfoSphere Information Server has been one of the market-leading platforms for data integration and governance. Users have relied on its powerful and scalable integration, quality and governance capabilities to deliver trusted information to their mission-critical business initiatives.

John Muir once wrote: “The power of imagination makes us infinite.” We have applied our power of imagination to once again reinvent the Information Server platform.

As business agility depends on the flexibility, autonomy, competency, and productiveness of the tools that power your business, we have infused Information Server’s newest release with a number of game-changing inventions. These include deeper insights into the context of and relationships among your data, increased automation so your users can complete their work faster and more safely, and more flexible workloads for better resource optimization. All of them are aimed at making your business more successful when tackling your most challenging data problems.

Let’s look at four of those game-changing inventions and how they are going to help your business:

  1. Contextual Search: Out-of-context data was the leading cause of error for NASA’s failed mission. The new contextual search feature, called Enterprise Search, provides your users with the context to avoid such costly mistakes. It greatly simplifies and accelerates the understanding, integration, and governance of enterprise data. Users can visually search, explore and easily gain insights through an enriched search experience powered by a knowledge graph. The graph provides context, insight and visibility across enterprise information, giving you a much better understanding and awareness of how data is related, linked, and used.
  2. Cognitive Design: Getting trusted data to your end users quickly is imperative. This process starts with your integration design environment. To help address your data integration, transformation or curation needs quickly, Information Server V11.7 now includes a brand-new, versatile designer called DataStage™ Flow Designer. It features an intuitive, modern, and secure interface accessible to all users through a no-install, browser-based experience. It accelerates your users’ productivity through automatic schema propagation, highlighted design errors and powerful type-ahead search, and it is fully backward compatible with the desktop version of the DataStage™ Designer.
  3. Hybrid Execution: Data Warehouse optimization is one of the leading use cases to address growing data volumes while simplifying and accelerating data analytics. Once again, Information Server V11.7 has strengthened its ability to run on Hadoop with a set of novel features to more efficiently operationalize your Data Lake environment. Among those is an industry-unique hybrid execution feature, which lets you balance integration workloads across Hadoop and non-Hadoop environments with the aim of minimizing data movement and optimizing your integration resources.
  4. Automation powered by machine learning: Poor data quality is known to cost businesses millions of dollars each year. The inadvertent use of different units of measurement for the Mars orbiter was ultimately a data quality problem. However, the high level of manual work combined with exponential data growth continues to keep businesses from maintaining high data quality. To counter this, Information Server V11.7 further automates the data quality process by underpinning data discovery and classification with machine learning, so that you can spend your time focusing on your business goals. The two innovative aspects are:

Automation rules, which let business users define graphical rules that then automatically apply data rule definitions and quality dimensions to data sets based on business term assignments, and

One-click automated discovery, which enables discovery and analysis of all data from a connection in one click, providing easy and fast analysis of hundreds or thousands of data sets.

Don’t want to get lost in translation? Choose IBM Information Server V11.7 for your next data project.

…another way to load Terms into InfoSphere Business Glossary


Here are a few other Jobs for loading new Terms and Categories into Business Glossary. Like the earlier post on Business Glossary, these DataStage Jobs read a potential source of Terms (just alter the source stage as needed) and then create a target csv file that is in the correct format for loading into Business Glossary using the new 8.1 csv import/export features available at the Information Server Web Console… Glossary tab. The Jobs are fairly well annotated and should be self-explanatory. I haven’t yet set them up for Custom Attributes, nor have they been widely tested, but they are already being implemented at a variety of locations. Please let me know if you find them useful.


(The one with “XML” in the name is the same as the prior blog entry. Each is named .doc, but is actually a .dsx file.)

I need “Google House” (how I found some valuable .dsx’s).

I just recently re-installed Google Desktop. What a life saver. A prior installation was causing me some problems with email, so a month ago I uninstalled it, and only this week found the time to download another edition and have it index my machine... and “lo!” I found some .dsx files (DataStage Exports) that I’d forgotten I created a few years back. In particular, I’ve been meaning to write up some instructions on how to use arrays in Web Services, among other complicated Web Service invocations, and wanted to avoid re-inventing the examples from scratch. I found ’em, and will see if I can clean them up, test them in version 8, and explain them here.

Bless all of you who can keep your hard drives in a perfectly sorted arrangement of easy-to-locate subdirectories and well-documented files! If there were a “Google House,” maybe I’d be able to find some tools I misplaced after the last renovation project... 😉


learning more about blogging

Whew. Needed to put an entry in here for myself as a reminder and to keep track of the overwhelming set of concepts and issues that I’m going through to figure out how best to manage blogging so that I can be productive and yet still enjoy leaving informative bits and pieces here on the web. In just two weeks’ time, I’ve learned a wealth of information and also piled a lot more things on my doorstep. Thanks to the bloggers and non-bloggers alike who have pointed me in various directions.

Where to put your blog?  Why did you pick WordPress?   Well, it was free, for starters, and seemed to have some good features, after doing a few reviews. 

Are you going to host it yourself?  I’m trying to find time to spend on the blog — host my own web site?  Not happening.  Hats off to all of you who do.

How many blogs?   Personal one, technical, a little bit of both?   Internal to IBM or external?  A lot of thoughts here, and I’m still formulating ideas.   I’m leaning now towards having several.  Maybe no one else will read ’em, but I need a place to rant about the NHL!

Technorati Tags. That’s a whole new one to me. Blogging begets blogging... and a need to find other bloggers... and to index your own blogs. Still learning about this; see www.technorati.com.

Blog Clients.  Who knew?   What if you want to blog “offline?”  There are tools made for the purpose!  About to try BlogDesk.   And here I thought Notepad would be effective for cut/paste.  Ha!

Categories, RSS, etiquette, security, blogrolls. It’ll take a while. Can’t spend so much time learning about blogging if it takes away from continually learning more about realtime with Information Server! 🙂