Incorporating Java classes into your DataStage Jobs

Java comes up a lot when we talk about “real time.” Not that Java in particular has any special dibs on the term, but when a site is interested in things like Service Oriented Architecture (SOA), Web Services, messaging, and XML, it is often also interested in Java, J2EE, Application Servers and other things related to Sun’s language standard.

Integrating Java with your ETL processing becomes the next logical discussion, whether or not “real time” even applies. There may be some functionality, some existing algorithms worth re-using, or some remote Java-oriented or Java-managed system or message queue that contains valuable source data (or would be a valuable target) that you’d like to integrate into a data integration flow. DataStage can easily be extended to include your Java functionality or take advantage of your Java experience.

Two Stages that used to be referred to as JavaPack are included with DataStage: JavaClient and JavaTransformer. Both allow you to integrate the functionality of a Java class into the flow of a DataStage Job. JavaClient is used for sources or targets (only an output link or only an input link), and JavaTransformer is used for row-by-row processing where you have something you’d like to invoke for each row that passes through.

DataStage provides a simple API for including Java classes in your Jobs. This API allows your class to interact directly with the DataStage engine at run-time: to obtain metadata about the columns and links that exist in the currently executing job, and to read and write rows from and to those links when called upon to do so. You define several special methods in your class, such as process(), that the engine calls whenever it needs a row from you or is ready to give you one. Within that method you have various calls to make, such as readRow (from an input link) and writeRow (to an output link). You can control what comes in and goes out, and also process rejections based on logic in your class. Other than that, your class can do whatever it wants: read messages from JMS queues, invoke remote EJBs, whatever.
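
To make that calling pattern a bit more concrete, here is a minimal sketch in plain Java. It deliberately does not use the real JavaPack classes (those ship in tr4j.jar); `SketchStage`, its `READY` and `END_OF_DATA` status codes, and the in-memory links are hypothetical stand-ins, so the example compiles and runs anywhere. Only the shape is the point: the engine calls a process method once per row, and that method reads a row, rejects it or transforms it, and writes the result.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

// Illustrative only: these types mimic the calling pattern, not the real JavaPack API.
class ProcessLoopSketch {

    // Hypothetical stand-in for the base class a JavaPack stage would extend.
    abstract static class SketchStage {
        static final int READY = 0;        // assumed status: call process() again
        static final int END_OF_DATA = 1;  // assumed status: no more rows

        final Deque<String[]> inputLink = new ArrayDeque<>();  // plays the input link
        final List<String[]> outputLink = new ArrayList<>();   // plays the output link
        final List<String[]> rejectLink = new ArrayList<>();   // plays the reject link

        String[] readRow() { return inputLink.poll(); }        // null once input is exhausted
        void writeRow(String[] row) { outputLink.add(row); }
        void rejectRow(String[] row) { rejectLink.add(row); }

        // The "engine" calls this repeatedly, once per row.
        abstract int process();

        // Toy engine loop: keep calling process() until the stage reports end of data.
        void run() {
            while (process() == READY) {
                // nothing to do here; all the work happens inside process()
            }
        }
    }

    // A row-by-row transformer: upper-cases column 0 and rejects empty values.
    static class UpperCaseStage extends SketchStage {
        @Override
        int process() {
            String[] row = readRow();
            if (row == null) {
                return END_OF_DATA;
            }
            if (row[0] == null || row[0].isEmpty()) {
                rejectRow(row);
                return READY;
            }
            writeRow(new String[] { row[0].toUpperCase(Locale.ROOT) });
            return READY;
        }
    }

    public static void main(String[] args) {
        UpperCaseStage stage = new UpperCaseStage();
        stage.inputLink.add(new String[] { "ernie" });
        stage.inputLink.add(new String[] { "" });
        stage.run();
        System.out.println(stage.outputLink.size() + " written, "
                + stage.rejectLink.size() + " rejected");
    }
}
```

In a real stage, the row objects, link handling and status values come from the JavaPack API documentation rather than from this sketch.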

The JavaPack is very well documented, with examples and descriptions of all the API calls. However, I’ve included an additional example here for anyone who is interested, including the Java class, source, .dsx and usage notes. Have fun!

-ernie

btw…I haven’t exactly figured out yet how best to get the names of the files below represented here on this blog, but if you save them from here, each file except the Readme begins with “ExamineRows”: ExamineRows.dsx (the export), ExamineRows.java (the source) and ExamineRows.class (the compiled class). I haven’t had a chance to re-try it after downloading from here, so worst case, you’ll need to recompile the class yourself in your environment. Otherwise, it should run in v8 “as is”. See the file at the Readme link for details on the expected classpath in the Job, etc., and read the annotations in the Job itself after you import it. -e

Examine Rows Class, Examine Rows Java Source, Examine Rows Readme, Examine Rows DataStage Export

60 Responses to “Incorporating Java classes into your DataStage Jobs”

  1. Yuan Says:

    Hi ernie, thanks for your post. Does JavaPack need extra purchase in DataStage Enterprise 7.5 version? How could I know if this package has been installed? Thanks!

  2. dsrealtime Says:

    It used to be a cost item, but now is downloadable for no charge if you are at a 7.5 site — but you’ll have to speak with whoever you have your support with to find out where to get it. Once installed there will be two new stages…JavaClient and JavaTransformer, and you will also have a java/bin subdirectory under DSEngine, if I recall correctly….. in it you’ll find server java bits, including tr4j.jar ……

    Ernie

  3. Tracy Says:

    Ernie,
    Thanks for creating this blog, it has helped me understand accessing web services through datastage.

    I’m sorry this is off-topic, but I don’t know if you’re still monitoring the web services posts you did late last year, so I thought I’d post to this very recent thread instead.

    I’m accessing a web service for Daptiv (formerly eProject) which is coded to the SOAP 1.2 standard. Now, the Web Service Metadata Importer can’t import message and metadata information for SOAP 1.2, and even says so in the documentation:

    The following standards and specifications are not supported by the Web Service Meta Data Importer. In the plug-in stages, you can access web services that use these standards and specifications. However, you must manually create table definitions from their WSDL documents.

    My question is: How can you manually create table definitions from WSDL documents, and furthermore, how do you access web services which were not imported from the WS Metadata Importer? It seems like the only place you can bring these web service definitions in is through the Web Service Browser, and that is filled only by the Metadata Importer. There is one place where the documentation on the Web Services Pack is trying to explain how to generate SOAP messages from the WSDL of the service, and it first says that you have to use the metadata importer to import the operation, then tells you what to do if you didn’t import the message that way:

    Generating the SOAP message from the WSDL of the Web Service First use the Web Service Meta Data Importer to import the operation. 1. Click the Input Message tab. 2. To load namespace information and input parameters for the web service operation listed as a Stage property, click Load Message Information. One of the following conditions applies: v If you used the Web Services Meta Data Importer to import the web services, the generated table definition created is automatically selected and loaded. A window opens if there is conflict with an existing column. v If you did not use the Web Services Meta Data Importer to import the web service, see “Loading input table definitions.”

    This all has me quite confused, I guess I just need to know the best way to set up a web service job if you need to address SOAP 1.2. Any help you can provide would be greatly appreciated.

    Thanks,
    Tracy

  4. dsrealtime Says:

    Hi Tracy…

    Fortunately, wordpress does a nice job of notifying me about comments, so I would have seen it either way! The blogging is still new to me, and I’m already finding it desirable to have more ways to review old entries. But just the same, I’d have seen the comment on old or new threads.

    Interesting issues here…SOAP 1.2 is certainly something to consider, although I suspect that all should be ok provided it’s not also using one of the concepts in 1.2 such as “attachments”. …Arrays and complex structures could bite as well…. Give me some time to absorb your thoughts above and together we’ll see what ideas we can come up with…

    Ernie

  5. Ayodeji Olasoji Says:

    We are trying to compile the Java code without having any Java experience. The compiler can’t seem to locate the following classes:
    import com.ascentialsoftware.jds.Row;
    import com.ascentialsoftware.jds.Stage;
    import com.ascentialsoftware.jds.Column;
    Can you help? We are currently running on IIS 8.0.1.

    Regards

    Deji

  6. dsrealtime Says:

    My bad. This is an old comment. Sorry for missing it. I’m sure you found all of these already. But for others’ reference, you should be able to find them for release 8.x in /IBM/InformationServer/Server/DSEngine/java/lib (or close; I did this from memory). For 7.x it’s nearly the same, but in the main DataStage directory structure instead of /IBM/InformationServer ….

    Ernie

  7. AJ Says:

    where is your tr4j.jar located?

    The path of this JAR should be in your system CLASSPATH variable.

  8. dsrealtime Says:

    Same place. C:\IBM\InformationServer\Server\DSEngine\java\lib

    I don’t have immediate access to a 7.x DS machine, but it should be found in a similar location.

    Ernie

  9. Ricky Bannerjee Says:

    Following on from Ayodeji Olasoji’s comments:

    We have added the location of the tr4j.jar file to the CLASSPATH, but it still doesn’t seem to compile. The error message we receive is:
    “cannot find symbol”. Can you help? We are currently running on IIS 8.0.1

  10. Ayodeji Olasoji Says:

    Thanks Ernie for your reply.

    I sort of left this for a while to fix another problem; now I am back to it. I was able to compile the code by setting the classpath as you instructed.
    Due to my very limited knowledge of Java and JMS messaging, I have run into another problem. When I tried executing the Java class from within IIS 8.1 (DS), I got the following error:
    “LoadQueue_test..Java_Client_2: TJClient::initialize: unable to create Java Virtual Machine; classpath = /da01/apps/IBM/infoserver/Server/DSEngine/java/lib/tr4j.jar”
    The environment setting of job the last time I ran is as follows:

    DATASTAGE_JRE=/apps/IBM/infoserver/ASBNode/apps
    DATASTAGE_JVM=jre/bin/j9vm
    JAVA_HOME=/apps/IBM/infoserver/ASBNode/apps/jre

    I noticed that I could not find j9vm in the DATASTAGE_JVM path. I have listed the contents of the directory below:

    dsadm>:/apps/IBM/infoserver/ASBNode/apps/jre/bin>ls -1
    ControlPanel
    java
    java_vm
    keytool
    orbd
    policytool
    rmid
    rmiregistry
    servertool
    tnameserv

    The closest file by name is java_vm. I also did a find on the box and could not find the j9vm library.

    Also within the job my classpath was set as:

    /apps/IBM/infoserver/Server/DSEngine/java/lib/tr4j.jar:/apps/IBM/infoserver/Server/DSEngine/java/lib/api.jar:/apps/IBM/AppServer/profiles/default/classes/log4j.jar:/export/home/dsadm

    Do you have any idea how to fix this?

    Thanks again for your help.

  11. Andy Sorrell Says:

    Just a note – It took me a couple of tries to get the job to work because I placed the java class file on the client because of the Windows pathname. Once I realized you were probably developing on a stand-alone Windows box I moved them to the Linux server, modified the path in the job and it worked great.

  12. shiv Says:

    I’m a WebSphere DataStage developer. Can you show any examples of how you integrate Java code using the Java Client or Java Transformer?

    • dsrealtime Says:

      Hi Shiv…did you find the java pack example here on the blog? You can download a simple one that illustrates how it works. Check the table of contents….

      • shiv Says:

        The example here didn’t work for me. I have DataStage 8.1 Server Edition.
        I have the class, and in the transformer I have given the path of the classes, but the transformer has an option called Java Virtual Machine Options and I don’t know what to put there. Is there any plug-in or jar file I have to import in the Java Transformer? Can you tell me what I have to define in the Java Transformer? If you want more details about my job, tell me and I will let you know.
        Thanks for your reply.

      • dsrealtime Says:

        You shouldn’t need to put anything in the jvm options….those might only be necessary in case you had to increase heap size or other such things. Depending on your platform, you may need to set some variables for the location of the jre used by JavaPack….. what sorts of errors are you getting?

      • shiv Says:

        I am getting these errors in the Director:
        1.Java_Transformer_1,0: Warning: java_transformer_try.Java_Transformer_1: ASCL-DSJNI-00001`:`TJStage::initialize: initializing Transformer for Java stage; resource path not found
        2.DB2_UDB_API_2,0: Warning: java_transformer_try.Java_Transformer_1: ASCL-DSJNI-00001`:`TJStage::initialize: initializing Transformer for Java stage; resource path not found
        3.Java_Transformer_1,0: Error: TJClient::initialize: unable to create Java Virtual Machine; classpath = java/lib/tr4j.jar
        ASCL-DSJNI-00011`:`JNIWrapper: load library failed: directory java/jre\bin/classic, name jvm
        The specified module could not be found.
        4.DB2_UDB_API_2,0: Error: TJClient::initialize: unable to create Java Virtual Machine; classpath = java/lib/tr4j.jar
        ASCL-DSJNI-00011`:`JNIWrapper: load library failed: directory java/jre\bin/classic, name jvm
        The specified module could not be found.
        5.APT_CombinedOperatorController,0: Resource bundle corresponding to message key DSTAGE-TODC-00017 not found! Check that DSHOME or APT_RESPATH is set.

        These are the errors I am getting in the DataStage Director.

      • dsrealtime Says:

        Yes…those are the initialization errors I was referring to. Contact your support provider for the actual details — it depends a bit on your platform and exact release, but you will probably need to set variables for DATASTAGE_JVM and DATASTAGE_JRE, and maybe one or two others. Otherwise, the example here should run fine and give you an idea of how the api in JavaPack functions.

        Ernie
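
One quick way to sanity-check those two variables is a few lines of Java. This sketch assumes, based only on the values and the “load library failed: directory …, name jvm” messages quoted in this thread, that the engine looks for the JVM library in the directory formed by DATASTAGE_JRE followed by DATASTAGE_JVM. That is an inference, not documented behavior, and the library file names checked (`libjvm.so`, `jvm.dll`) are likewise assumptions that vary by platform. The class and method names are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Diagnostic sketch; the way the two variables are combined is an assumption.
class JvmLocationCheck {

    // Assumed rule: the JVM library directory is DATASTAGE_JRE + "/" + DATASTAGE_JVM.
    static Path jvmDirectory(String datastageJre, String datastageJvm) {
        return Paths.get(datastageJre).resolve(datastageJvm);
    }

    // Assumed library names: libjvm.so (Linux/UNIX) or jvm.dll (Windows).
    static boolean containsJvmLibrary(Path dir) {
        return Files.isRegularFile(dir.resolve("libjvm.so"))
                || Files.isRegularFile(dir.resolve("jvm.dll"));
    }

    public static void main(String[] args) {
        String jre = System.getenv("DATASTAGE_JRE");
        String jvm = System.getenv("DATASTAGE_JVM");
        if (jre == null || jvm == null) {
            System.out.println("DATASTAGE_JRE and/or DATASTAGE_JVM is not set");
            return;
        }
        Path dir = jvmDirectory(jre, jvm);
        System.out.println(dir + (containsJvmLibrary(dir)
                ? " contains a JVM library"
                : " does not contain a JVM library"));
    }
}
```

Run it as the same user, and with the same environment, that the DataStage job uses; otherwise the variables it reads may not be the ones the engine sees.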

      • shiv Says:

        The main thing I have to ask here: do I have to implement the Java Transformer stage in a parallel job or a server job?

  13. shiv Says:

    I am an IBM employee and DataStage is an IBM product, so I am using the latest version, 8.1, with all fixes. If I have to set the DATASTAGE_JVM and DATASTAGE_JRE variables, where exactly do I set them?
    I am not using the example you have shown; I have created my own classes with my own options. If you want to know more about what my job is doing, tell me and I will let you know.

  14. 2010 in review « Real-Time Data Integration Says:

    […] The busiest day of the year was January 5th with 152 views. The most popular post that day was Incorporating Java classes into your DataStage Jobs. […]

    • bhuvnesh2703 Says:

      Hello Ernie,

      Thanks for the post.

      I am trying to work out a test job using a Java transformer stage for learning how to configure it.

      When I try to run the job, I get the following error:
      Error: java.lang.ClassNotFoundException: com.ascentialsoftware.jds.test.UpperCase.

      Steps I have done:
      I have taken a Java class from the DataStage documentation (the UpperCase class) and compiled it externally in a tool like Eclipse, with the tr4j.jar file included in the lib before compiling. The class compiled successfully. I then placed the compiled UpperCase.class file in a directory on the Linux server.

      In the Java transformer, I have given the User’s Classpath as the directory holding the class file.
      Ex: User Classpath:
      /etl/dev/MyDirectory
      I included the classpath entry for the tr4j.jar file too.

      & Class name as
      Ex: Transformer Class name: com.ascentialsoftware.jds.test.UpperCase
      (Is this the right way to provide the class name? If not, please suggest. UpperCase is the class name. I have also tried giving the class name as simply “UpperCase”.)

      I have provided the input metadata as a column “EmpName” and output metadata as “EmpOutName”.

      In the DataStage Administrator, I have set the following two variables:
      DATASTAGE_JRE
      DATASTAGE_JVM

      Please suggest what should be done.
      It would be great if you could mail a dsx of a job containing the Java Transformer and Java Client stages.

      Regards,
      Bhuvnesh
      Email: bhuvnesh2703@gmail.com

  15. bhuvnesh2703 Says:

    Hello Ernie,

    I was able to solve the above issue: I had placed the files in a folder on which the DS user did not have permissions. Please ignore the above message.

    However, I am now stuck with the following error:
    Java_Transformer_3,0: Error: java.lang.NoClassDefFoundError: UpperCase (wrong name: com/ascentialsoftware/jds/test/UpperCase)

    The Class File “UpperCase.class” is present in the following directory: /etl/dev/source_files/adw

    In the User’s Classpath: I have added
    – /etl/dev/source_files/adw
    -/opt/IBM/dev/APP/InformationServer/Server/DSEngine/java/lib/tr4j.jar
    -/opt/IBM/dev/APP/InformationServer/ASBNode/apps/jre/bin

    Please share a dsx.

    Regards,
    bhuvnesh2703@gmail.com

  16. Sushi Says:

    Hi Ernie,

    We have a requirement to export data from an API call. It is a URL used for bulk export; I get a file back, which I need to store on the system and then process. Can you please help me with the options for getting that data? Is there a way to use the Java Transformer for this?

    Thanks
    Sushi

    • dsrealtime Says:

      Is it a REST based API call? (one that returns pure xml?)….JavaPack would certainly be one way…but it’s possible that you might be able to leverage the XML Stage in 8.5 for this also…..

  17. tmmcnicol1723 Says:

    Trying to get the .dsx for this. How does one become a ‘user’ of this blog? I have a wordpress.com account.

    • dsrealtime Says:

      Not sure what you mean by a “user” of the blog….do you mean of WordPress in general? They do a great job supporting bloggers thru their hosted site, although I know of other folks who have used WordPress’ open source code and hosted their blogs elsewhere…………as for the .dsx, it should be available in the post…but I’ll check it and see if there is an issue.

      Ernie

      • tmmcnicol1723 Says:

        This is the message I get when I click on the example links.

        — 403: Access Denied —

        This file requires authorization:

        You must both be a user of this blog as well as be currently logged into WordPress.com

        I am logged on to wordpress.com


      • dsrealtime Says:

        hmm. ok. Try this instead: hover your mouse over the link, then use your right mouse and choose “save as”…..the files can then be saved wherever you want…..just rename them to their formal suffixes (the DS export should be renamed to .dsx, etc.).

        Ernie

  18. Serina Adickes Says:

    you’re really a good webmaster.The web site loading speed is incredible.It seems that you are doing any unique trick.In addition, The contents are masterpiece.you’ve done a wonderful job on this topic!

  19. tmmcnicol1723 Says:

    Thanks for the guidance as I was able to get the XML transformer configured and it now takes in three inputs from a DS job and calls the Java class files to insert into an LDAP DB. Now I have to capture the log message that is created in the Java program (which is in the director log) and evaluate it to determine if the process was successful.

  20. htamboli Says:

    How do I get those files? I am getting Access Denied.

    • dsrealtime Says:

      make sure you right mouse click on them and say “save as”…

      • 0352grunt Says:

        I am doing that, but receiving “Access Denied” even when I am logged in.

      • arulkmm Says:

        Hi Ernie,
        This is a great blog! It really explains how to integrate Java with DS. This is the first time I am trying it, but I wasn’t able to download the source files. Is there any way I could do it? I get the authentication error as others do. I tried the “save as”, but it didn’t work either.
        Thanks,
        Arul

  21. tdeuchler Says:

    nice blog here…. very useful

    • abhishekumargupta Says:

      Good blog. Can you please point me to some links where I can find real scenarios?

      • dsrealtime Says:

        There are lots of people using this for not-so-common integrations into DataStage. JMS queues from various vendors are a classic example, but I’ve also seen sites use the JavaPack for integration with custom 3rd party solutions such as RFID readers or for access to home grown “already written and in-production” EJBs.

        Ernie

  22. sheetalkoul Says:

    Can someone help me with downloading the files at the links below?

    1>Examine Rows Class
    2>Examine Rows Java Source
    3>Examine Rows Readme
    4>Examine Rows DataStage Export

    This is the message I get when I click on the example links.

    — 403: Access Denied —

    This file requires authorization:

    I also tried hovering the mouse over the link and using the right mouse button to choose “save as”, but the file is saved blank.

    Can someone please send these to my email address:
    sneha.koul@gmail.com

  23. reeth04 Says:

    Hi Ernie,
    I’m trying to download your sample programs (using right-click -> Save link), but it ends in an error, as the files no longer seem to exist. It would be much appreciated if you could re-upload them.
    thanks

  24. padmakm Says:

    Hi Ernie, how do I get the ExamineRows files you attached here? I’m a member, but when I try to click and download, it says “403: Access Denied”.

    This file requires authorization:

    You must be logged in
    and a member of this blog.
    I am already logged in. Anything more to be done? Please let me know

  25. polliekrismis Says:

    I’m also having difficulty downloading the attached files…

    I’ve registered my own blog, “followed” the blog, am currently logged in and still I’m getting the message that Access is denied.

    What is meant by “be a member of this blog”?

    Thanks

    Paul

  26. kottinaresh Says:

    I’m unable to download, appreciate if someone can mail those samples…. kottinaresh@gmail.com

    Thanks in advance!

  27. arulkmm Says:

    Same here. I was trying to download, but it says “authentication failed”. I am following this blog, so I’m not sure why.

  28. dsrealtime Says:

    Oops. Sorry about that. Glitch in WordPress? No matter…I’ll just put the URL here in text directly…..

    First some background. The samples in that post are for the Java Transformer and Java Client Stages…developed over a decade ago. They haven’t been deprecated, but they have been replaced by a solution that is far more efficient and has greater capabilities. This is called the “Java Integration Stage” and was made fully available with release 9.1 of Information Server and DataStage. That new Stage can even run the old code……. One of these days I will get around to updating or re-writing this post, though some people are still “pre-9.1” (if you are, let me know and we’ll figure out a way to get you that code).

    The formal documentation for the Java Integration Stage can be found at:

    http://www-01.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.javastage.usage.doc/topics/javastage_overview.html

    In addition to that documentation (which has its own links) you should also be able to retrieve some of the samples that are out on Developer Works……

    JDBC

    https://www.ibm.com/developerworks/mydeveloperworks/files/app/person/0600007N2J/file/f3dc889f-7f87-4d3d-8ff0-664e359538f5

    JMS

    https://www.ibm.com/developerworks/mydeveloperworks/files/app/person/0600007N2J/file/b766b89b-5745-4c82-bf0f-d21c95f250f6

    HIVE

    https://www.ibm.com/developerworks/mydeveloperworks/files/app/person/0600007N2J/file/53d98608-99fd-470d-ae17-aa5c03f4ff13

    MongoDB

    https://www.ibm.com/developerworks/mydeveloperworks/files/app/person/0600007N2J/file/eb9cbbd8-d467-43fc-a27f-73c284e0d251

    Ernie

  29. arulkmm Says:

    Thanks, Ernie. We are actually using 8.0.1 still 🙂 and trying to use the java clients in there.

  30. oursamw Says:

    Hi Ernie,

    I am trying to make the example ExamineRows work in Datastage 8.5 on linux environment but facing issues.
    I have placed the files ExamineRows.class and ExamineRows.java inside a directory SampleJavaPlugIns on Datastage server.

    I gave the following parameters and values in the job
    DATASTAGE_JRE=/IBM/InformationServer/Server/_jvm/jre
    DATASTAGE_JVM=bin/j9vm
    UserClassPath=/ourserverpath/SampleJavaPlugIns
    TransformerClassName=SampleJavaPlugIns.ExamineRows

    Despite giving those values I am getting the error as below.

    JTransformer_0: java.lang.ClassNotFoundException: SampleJavaPlugIns.ExamineRows

    Please advise on how I should get rid of the error.

    Thanks!

    • dsrealtime Says:

      It depends on the package name that you used inside your source code prior to compilation, and then on the classpath that you are using. If you are writing a standalone class (no jar file), point the User Classpath property to the directory that is the “parent” of the Java “package” directory, and specify package.classname for the Transformer Class property. (Just edited that to make it clearer.) –ernie
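
That rule is plain Java class loading rather than anything specific to JavaPack: for a directory on the classpath, the class `a.b.C` is looked up at `a/b/C.class` beneath it. A small sketch that prints where a class file would be expected (the helper name `expectedClassFile` is made up for illustration; the paths and class names are the ones quoted in the comments above):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Shows where the JVM expects a class file, given one directory entry on the classpath.
class ClassLocationSketch {

    // Maps a fully qualified class name to its expected file under a classpath directory.
    static Path expectedClassFile(String classpathDir, String fullyQualifiedName) {
        String relative = fullyQualifiedName.replace('.', '/') + ".class";
        return Paths.get(classpathDir).resolve(relative);
    }

    public static void main(String[] args) {
        // Class declared in package com.ascentialsoftware.jds.test:
        System.out.println(expectedClassFile(
                "/etl/dev/MyDirectory", "com.ascentialsoftware.jds.test.UpperCase"));

        // Class name given as SampleJavaPlugIns.ExamineRows, so the classpath must be
        // the PARENT of the SampleJavaPlugIns directory:
        System.out.println(expectedClassFile(
                "/ourserverpath", "SampleJavaPlugIns.ExamineRows"));
    }
}
```

If the class was compiled with a `package` line, the file has to sit in the matching subdirectory below the classpath entry. If it was compiled without one, the class is referenced by its bare name and the classpath entry should be the directory that holds the .class file itself.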

      • oursamw Says:

        Hi Ernie,
        We now have several different packages and a driver class in one of the packages. When we provide the package.classname for the Transformer class name and the jar file path for the class path, it doesn’t seem to be able to access code in other packages in the jar. We are getting the error below. Would you please advise how to handle this?
        Thanks!

  31. oursamw Says:

    Here is the error. Forgot to paste it in the earlier post.
    Error: java.lang.NoClassDefFoundError: com.ford.it.properties.PropertyException

  32. kamarthiblog Says:

    Reblogged this on kamarthiblog and commented:
    I am unable to access any of the source code in this blog. Can anyone please help me get it?

  33. kamarthiblog Says:

    Hello, I am unable to access any of the Jar files in this blog. It is giving me access denied even though I am a member of WordPress. Can anyone please help me get these attached files? Thank you!

  34. David Simmons Says:

    I have implemented the JMSStage class and it works great if the timeout is > 0. Running in debug, when I set it to -1, the messages are read from the queue and the count on the output link increments. The problem is that the messages do not seem to be released and never reach the input of the follow-on stage. When the timeout is set to a large value, >100, the messages are stuck until the timeout occurs, and then they flow through the rest of the job. Am I missing a setting in the Java Integration stage? Currently using 9.1 and will be upgrading to 11.5 shortly. Any help would be appreciated.

    Dave Simmons

    • dsrealtime Says:

      I suspect that it has something to do with end of wave. You may need to play some tricks with the end of wave operator and then see if you can get it to combine with the Java Integration stage under the covers, or simply be placed downstream from your JMS implementation. I am not sure if the Java Integration stage has an API call to submit end of wave markers.

      Ernie

  35. datastagetipsblog Says:

    Hi Ernie,

    Sorry to go off-track and post on this blog. I have a requirement to send XML posts to, and receive the XML responses from, ActiveMQ over JMS, and I need to implement this using DataStage 11.5. Could you please guide me to any documentation regarding this?

    The approach I was thinking of is to write Java code and integrate it using the Java Integration Stage. I don’t know whether this would help or not.

    On DSXchange I found that you have helped Andrew with some information regarding setting up a connection to ActiveMQ.

    Awaiting your positive reply.

    Thanks,
    Mohsin Khan

    • dsrealtime Says:

      A source code sample for the Java Integration stage and JMS is on developerWorks. That is a good place to start….but that is just the source and target….you will also need to work out the construction and parsing of your XML.

      I don’t have access right now to the URLs for the developerWorks site, but it should be fairly easy to find.

      Ernie

