You need a command line tool for “event” based real-time

…by the way, in relation to my previous post, especially when triggering ETL for “kinda” real-time functionality from scheduling environments, make sure the tool you are using or choosing has a command-line mechanism for starting jobs and passing values. All of the major tools have one: a quick search gave me examples in DataStage, Informatica, Business Objects Data Integrator, and Ab-I. Any data integration tool worth its salt should have such a mechanism. Learn the command-line method that is pertinent to your favorite. I thought this answer on an Ab-I user collaboration site was great. I took it a bit out of context, because it was somewhat specific to Ab-I, but the rest makes sense for any of these offerings:

…People then make the mistake of putting these things in the start and end script of the graph, but the best place for them is in a single wrapper that invokes all graphs, effectively adapting Ab Initio invocation to Autosys rather than adapting Autosys to Ab Initio. In the end, Ab Initio is along for the ride (as it should be) and Autosys is the controlling harness.
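To make the "single wrapper" idea concrete, here is a minimal, tool-agnostic sketch in Python. Everything in it is my own illustration rather than anything from the quoted answer: the `JOB_COMMANDS` registry, the job names, and the `run_job` helper are hypothetical, and the demo entries simply start a Python child process so the sketch can run anywhere. In practice each entry would hold your ETL tool's real command line.

```python
import subprocess
import sys

# Hypothetical registry: logical job name -> command line to execute.
# The demo entries just exit with a fixed code; replace them with the
# real command line of your ETL tool.
JOB_COMMANDS = {
    "demo_ok": [sys.executable, "-c", "import sys; sys.exit(0)"],
    "demo_fail": [sys.executable, "-c", "import sys; sys.exit(3)"],
}


def run_job(name, extra_args=()):
    """Run one registered job; return 0 on success, 1 on any failure.

    The scheduler only ever sees this 0/1 contract, so tool-specific
    exit codes and setup/teardown logic stay inside the wrapper.
    """
    if name not in JOB_COMMANDS:
        print(f"unknown job: {name}", file=sys.stderr)
        return 1
    cmd = list(JOB_COMMANDS[name]) + list(extra_args)
    result = subprocess.run(cmd)
    return 0 if result.returncode == 0 else 1
```

A scheduler job definition would then call one small script that passes `run_job(...)` to `sys.exit`, so every ETL job is started the same way and the scheduler stays the controlling harness.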

In DataStage, we might suggest that the “first” call from a tool like Autosys would invoke a Graphical Job Sequence, and then endeavor to have as much Job-to-Job-to-Job control and overall process logic managed by the Sequence itself… but it’s still the same concept for complex “event” functions, especially the ones that approximate “always on.”

DataStage does all of this with the dsjob command. You’ll find it documented in Version 8 in Chapter 23 of the Server Job Developer Guide (i46desjd.pdf).
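As a rough sketch of what starting a job and passing values looks like with dsjob, the Python below builds a `dsjob -run` command line with `-param name=value` pairs and translates the result for a scheduler. The project name, job name, and parameter shown in the usage note are made up, and the helper functions are my own. The exit-code mapping is an assumption to check against the guide for your release: with `-jobstatus`, dsjob waits for the job and returns a code derived from the job's status, which I understand to be 1 for finished OK and 2 for finished with warnings.

```python
import subprocess

# Assumption to verify for your release: with -jobstatus, dsjob exits with
# 1 (finished OK) or 2 (finished with warnings) when the job succeeded.
DSJOB_OK_CODES = {1, 2}


def build_dsjob_command(project, job, params=None, dsjob_path="dsjob"):
    """Build the argument list for running one job with parameters."""
    cmd = [dsjob_path, "-run", "-mode", "NORMAL", "-jobstatus"]
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]
    cmd += [project, job]
    return cmd


def scheduler_exit_code(dsjob_code):
    """Translate a dsjob status code into the 0/non-zero a scheduler expects."""
    return 0 if dsjob_code in DSJOB_OK_CODES else 1


def run_datastage_job(project, job, params=None):
    """Run the job via dsjob (must be on PATH) and return 0 or 1."""
    result = subprocess.run(build_dsjob_command(project, job, params))
    return scheduler_exit_code(result.returncode)
```

For example, `build_dsjob_command("dstage1", "LoadOrders", {"RunDate": "2008-01-31"})` produces the equivalent of `dsjob -run -mode NORMAL -jobstatus -param RunDate=2008-01-31 dstage1 LoadOrders`. Whether warnings should count as success is a local policy decision; drop 2 from `DSJOB_OK_CODES` if they should not.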

…but there is an exception… what if you want the “event” to be something really creative, like a user pushing a button on a portal built via .NET and running in a remote location? What if the developers for that button don’t know the first thing about Unix or shells? What if the configuration is so dynamic that the transformation server endpoint changes frequently? Being able to do this via Web Services or other SOA Service invocation becomes equally important. The subject for my next entry…

2 Responses to “You need a command line tool for “event” based real-time”

  1. Todd Robinson Says:

    I can’t agree with the suggestion of calling a “Master” Sequence from a Scheduling tool and having the Sequence manage the job to job to job of DataStage. The only exception would be if using a Sequence truly represents a Unit of work of more than one DataStage job or if a Sequence brings some functionality that is not easily achievable from within a DataStage job. Even then, only in the most restrictive and reviewed cases. Enterprises buy a Scheduling tool for a reason. DataStage’s Sequence functionality is no Scheduling tool. I can see a real-time “event” triggering a Sequence which then follows some process flow from DataStage job to job to job; however, if Autosys is being invoked first, then the case must be very strong to NOT allow Autosys to control the entire flow from job to job to job.

  2. dsrealtime Says:

    Hi Todd. Thanks for the comment. You are right: I was being too generic. It’s certainly not cut and dried, one call and be done with it. I’ve seen lots of hybrid approaches. The Job Sequencer is definitely NOT a scheduler… on the other hand, it’s worthy of isolated DataStage to DataStage to DataStage flows, and its introduction in release 5 reduced a significant amount of scripting and job tool work that was being performed earlier. Careful review and consideration are needed.
