Incorporating Java classes into your DataStage Jobs

Java comes up a lot when we talk about “real time.” Not that Java in particular has any special dibs on the term, but frequently when a site is interested in things like Service Oriented Architecture (SOA), Web Services, messaging, and XML, it is often also interested in Java, J2EE, Application Servers and other things related to Sun’s language standard.

Integrating Java with your ETL processing becomes the next logical discussion, whether or not “real time” even applies. There may be some functionality, some existing algorithms worth re-using, or some remote Java-oriented or Java-managed system or message queue that contains valuable source data (or would be a valuable target) that you’d like to integrate into a data integration flow. DataStage can easily be extended to include your Java functionality or take advantage of your Java experience.

There are two Stages included with DataStage that used to be referred to as JavaPack: JavaClient and JavaTransformer. Both allow you to integrate the functionality of a Java class into the flow of a DataStage Job. JavaClient is used for sources or targets (only an output link or only an input link), and JavaTransformer is used for row-by-row processing, where you have something you’d like to invoke for each row that passes through.

DataStage provides a simple API for including Java classes in your Jobs. This API allows your class to interact directly with the DataStage engine at run time: to obtain metadata about the columns and links that exist in the currently executing job, and to read and write rows from and to those links when called upon to do so. You define several special methods in your class, such as Process(), that the engine calls whenever it needs a row or is handing your class control because it’s ready to give you a row. Within that method you have various calls to make, such as readRow [from an input link] and writeRow [to an output link]. You can control what comes in and goes out, and also process rejections based on logic in your class. Other than that, your class can do whatever it wants: read messages from JMS queues, invoke remote EJBs… whatever.
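
To make that call pattern concrete, here is a minimal sketch of the read/process/write loop described above. It is deliberately self-contained so that it compiles without tr4j.jar: SketchStage is a hypothetical stand-in for the engine side, and its method signatures, status constants and String[] row type are assumptions made for illustration, not the real JavaPack API (the real classes live in the com.ascentialsoftware.jds package, as the imports quoted in the comments below show). Check the JavaPack documentation for the actual calls.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Queue;

// Hypothetical stand-in for the engine-side base class, so this sketch
// compiles on its own. Names and signatures here are assumptions.
abstract class SketchStage {
    static final int READY = 0;        // assumed status: call process() again
    static final int END_OF_DATA = 1;  // assumed status: no more rows

    private final Queue<String[]> input = new ArrayDeque<>();
    final List<String[]> output = new ArrayList<>();
    final List<String[]> rejects = new ArrayList<>();

    // Test helper: load the rows that would arrive on the input link.
    void feed(List<String[]> rows) { input.addAll(rows); }

    // Returns the next input row, or null once the input link is exhausted.
    String[] readRow() { return input.poll(); }

    // Sends a row down the output link.
    void writeRow(String[] row) { output.add(row); }

    // Sends a row down the reject path.
    void rejectRow(String[] row) { rejects.add(row); }

    // Called repeatedly by the "engine"; handles at most one row per call.
    abstract int process();

    // Plays the role of the engine: keep calling process() until it is done.
    void runToCompletion() {
        while (process() == READY) {
            // nothing else to do between calls in this sketch
        }
    }
}

// A row-by-row transformer: upper-cases column 0, rejects empty values.
class UpperCaseSketch extends SketchStage {
    @Override
    int process() {
        String[] row = readRow();
        if (row == null) {
            return END_OF_DATA;
        }
        if (row[0] == null || row[0].isEmpty()) {
            rejectRow(row);            // rejection driven by logic in the class
            return READY;
        }
        String[] out = row.clone();
        out[0] = out[0].toUpperCase(Locale.ROOT);
        writeRow(out);
        return READY;
    }
}

class JavaPackSketchDemo {
    public static void main(String[] args) {
        UpperCaseSketch stage = new UpperCaseSketch();
        stage.feed(Arrays.asList(
                new String[] {"ernie"},
                new String[] {""},
                new String[] {"tracy"}));
        stage.runToCompletion();
        for (String[] r : stage.output) {
            System.out.println("out: " + r[0]);
        }
        System.out.println("rejected: " + stage.rejects.size());
    }
}
```

Running main prints out: ERNIE, out: TRACY and rejected: 1. The point of the sketch is the shape: the engine owns the loop, and your class handles one row per call and says whether it is ready for more.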

The JavaPack is very well documented, with examples and descriptions of all the API calls. However, I’ve included an additional example here for anyone who is interested, including the Java class, source, .dsx and usage notes. Have fun!

-ernie

btw…I haven’t exactly figured out yet how to best get the names of the files below represented here on this blog, but if you save them from here, each file except the Readme begins with “ExamineRows” and should be ExamineRows.dsx (for the export), ExamineRows.java (for the Source) and ExamineRows.class for the actual compiled class.   I haven’t had a chance to re-try it after downloading from here, so worst case, you’ll need to recompile the class yourself in your environment.  Otherwise, it should run in v8 “as is”.  See the file at the Readme link for details on the expected classpath in the Job, etc., and read the annotations in the Job itself after you import it.  -e

Examine Rows Class, Examine Rows Java Source, Examine Rows Readme, Examine Rows DataStage Export

33 Responses to “Incorporating Java classes into your DataStage Jobs”

  1. Yuan Says:

    Hi ernie, thanks for your post. Does JavaPack need extra purchase in DataStage Enterprise 7.5 version? How could I know if this package has been installed? Thanks!

  2. dsrealtime Says:

    It used to be a cost item, but now is downloadable for no charge if you are at a 7.5 site — but you’ll have to speak with whoever you have your support with to find out where to get it. Once installed there will be two new stages…JavaClient and JavaTransformer, and you will also have a java/bin subdirectory under DSEngine, if I recall correctly….. in it you’ll find server java bits, including tr4j.jar ……

    Ernie

  3. Tracy Says:

    Ernie,
    Thanks for creating this blog, it has helped me understand accessing web services through datastage.

    I’m sorry this is off-topic, but I don’t know if you’re still monitoring the web services posts you did late last year, so I thought I’d post to this very recent thread instead.

    I’m accessing a web service for Daptiv (formerly eProject) which is coded to the SOAP 1.2 standard. Now, the Web Service Metadata Importer can’t import message and metadata information for SOAP 1.2, and even says so in the documentation:

    The following standards and specifications are not supported by the Web Service Meta Data Importer. In the plug-in stages, you can access web services that use these standards and specifications. However, you must manually create table definitions from their WSDL documents.

    My question is: How can you manually create table definitions from WSDL documents, and furthermore, how do you access web services which were not imported from the WS Metadata Importer? It seems like the only place you can bring these web service definitions in is through the Web Service Browser, and that is filled only by the Metadata Importer. There is one place where the documentation on the Web Services Pack is trying to explain how to generate SOAP messages from the WSDL of the service, and it first says that you have to use the metadata importer to import the operation, then tells you what to do if you didn’t import the message that way:

    Generating the SOAP message from the WSDL of the Web Service First use the Web Service Meta Data Importer to import the operation. 1. Click the Input Message tab. 2. To load namespace information and input parameters for the web service operation listed as a Stage property, click Load Message Information. One of the following conditions applies: v If you used the Web Services Meta Data Importer to import the web services, the generated table definition created is automatically selected and loaded. A window opens if there is conflict with an existing column. v If you did not use the Web Services Meta Data Importer to import the web service, see “Loading input table definitions.”

    This all has me quite confused, I guess I just need to know the best way to set up a web service job if you need to address SOAP 1.2. Any help you can provide would be greatly appreciated.

    Thanks,
    Tracy

  4. dsrealtime Says:

    Hi Tracy…

    Fortunately, wordpress does a nice job of notifying me about comments, so I would have seen it either way! The blogging is still new to me, and I’m already finding it desirable to have more ways to review old entries. But just the same, I’d have seen the comment on old or new threads.

    Interesting issues here…SOAP 1.2 is certainly something to consider, although I suspect that all should be ok provided it’s not also using one of the concepts in 1.2 such as “attachments”. …Arrays and complex structures could bite as well…. Give me some time to absorb your thoughts above and together we’ll see what ideas we can come up with…

    Ernie

  5. Ayodeji Olasoji Says:

    We are trying to compile the Java code without having any Java experience. The compiler can’t seem to locate the following:
    import com.ascentialsoftware.jds.Row;
    import com.ascentialsoftware.jds.Stage;
    import com.ascentialsoftware.jds.Column;
    Can you help? We are currently running on IIS 8.0.1

    Regards

    Deji

  6. dsrealtime Says:

    My bad. This is an old comment. Sorry for missing it. I’m sure you found all of these already. But for others’ reference, you should be able to find them for release 8.x in /IBM/InformationServer/Server/DSEngine/java/lib (or close; I did this from memory). For 7.x it’s nearly the same, but in the main DataStage directory structure instead of /IBM/InformationServer ….

    Ernie

  7. AJ Says:

    where is your tr4j.jar located?

    The path of this JAR should be in your system CLASSPATH variable.

  8. dsrealtime Says:

    Same place. C:\IBM\InformationServer\Server\DSEngine\java\lib

    I don’t have immediate access to a 7.x DS machine, but it should be found in a similar location.

    Ernie

  9. Ricky Bannerjee Says:

    Following on from Ayodeji Olasoji comments:

    We have added the location of the tr4j.jar file to the CLASSPATH, but it still doesn’t seem to compile. The error message we receive is:
    “cannot find symbol”. Can you help? We are currently running on IIS 8.0.1

  10. Ayodeji Olasoji Says:

    Thanks Ernie for your reply.

    I sort of left this for a while to fix other problems; now I am back to it. I was able to compile the code by setting the classpath as you instructed.
    Due to my very limited knowledge of Java and JMS messaging, I have run into another problem. When I tried executing the Java class from within IIS 8.1 (DS) I got the following error:
    “LoadQueue_test..Java_Client_2: TJClient::initialize: unable to create Java Virtual Machine; classpath = /da01/apps/IBM/infoserver/Server/DSEngine/java/lib/tr4j.jar”
    The environment setting of job the last time I ran is as follows:

    DATASTAGE_JRE=/apps/IBM/infoserver/ASBNode/apps
    DATASTAGE_JVM=jre/bin/j9vm
    JAVA_HOME=/apps/IBM/infoserver/ASBNode/apps/jre

    I noticed that I could not find j9vm in the DATASTAGE_JVM path. I have listed the contents of the directory below:

    dsadm>:/apps/IBM/infoserver/ASBNode/apps/jre/bin>ls -1
    ControlPanel
    java
    java_vm
    keytool
    orbd
    policytool
    rmid
    rmiregistry
    servertool
    tnameserv

    The closest file by name is java_vm, I also did a find on the box and I could not find the j9vm library.

    Also within the job my classpath was set as:

    /apps/IBM/infoserver/Server/DSEngine/java/lib/tr4j.jar:/apps/IBM/infoserver/Server/DSEngine/java/lib/api.jar:/apps/IBM/AppServer/profiles/default/classes/log4j.jar:/export/home/dsadm

    Do you have any idea how to fix this?

    Thanks again for your help.

  11. Andy Sorrell Says:

    Just a note – It took me a couple of tries to get the job to work because I placed the java class file on the client because of the Windows pathname. Once I realized you were probably developing on a stand-alone Windows box I moved them to the Linux server, modified the path in the job and it worked great.

  12. shiv Says:

    I’m a WebSphere DataStage developer. Can you show any examples of how you integrate Java code using the Java Client or Java Transformer?

    • dsrealtime Says:

      Hi Shiv…did you find the java pack example here on the blog? You can download a simple one that illustrates how it works. Check the table of contents….

      • shiv Says:

        The example here didn’t work for me. I have DataStage 8.1 Server Edition.
        I have the class, and in the transformer I have given the path of the classes, but in the transformer there is an option called Java Virtual Machine Options, and I don’t know what to put there. Is there any plugin or JAR file I have to import in the Java Transformer? Can you tell me what I have to define in the Java Transformer? If you want more details about my job, tell me and I will let you know.
        Thanks for your reply.

      • dsrealtime Says:

        You shouldn’t need to put anything in the jvm options….those might only be necessary in case you had to increase heap size or other such things. Depending on your platform, you may need to set some variables for the location of the jre used by JavaPack….. what sorts of errors are you getting?

      • shiv Says:

        I am getting these errors in the Director:
        1.Java_Transformer_1,0: Warning: java_transformer_try.Java_Transformer_1: ASCL-DSJNI-00001`:`TJStage::initialize: initializing Transformer for Java stage; resource path not found
        2.DB2_UDB_API_2,0: Warning: java_transformer_try.Java_Transformer_1: ASCL-DSJNI-00001`:`TJStage::initialize: initializing Transformer for Java stage; resource path not found
        3.Java_Transformer_1,0: Error: TJClient::initialize: unable to create Java Virtual Machine; classpath = java/lib/tr4j.jar
        ASCL-DSJNI-00011`:`JNIWrapper: load library failed: directory java/jre\bin/classic, name jvm
        The specified module could not be found.
        4.DB2_UDB_API_2,0: Error: TJClient::initialize: unable to create Java Virtual Machine; classpath = java/lib/tr4j.jar
        ASCL-DSJNI-00011`:`JNIWrapper: load library failed: directory java/jre\bin/classic, name jvm
        The specified module could not be found.
        5.APT_CombinedOperatorController,0: Resource bundle corresponding to message key DSTAGE-TODC-00017 not found! Check that DSHOME or APT_RESPATH is set.

        These are the errors I am getting in the DataStage Director.

      • dsrealtime Says:

        Yes…those are the initialization errors I was referring to. Contact your support provider for the actual details — it depends a bit on your platform and exact release, but you will probably need to set variables for DATASTAGE_JVM and DATASTAGE_JRE, and maybe one or two others. Otherwise, the example here should run fine and give you an idea of how the api in JavaPack functions.

        Ernie

      • shiv Says:

        The main thing I have to ask here: do I have to implement a Java Transformer stage in a parallel job or a server job?

  13. shiv Says:

    As I am an IBM employee and DataStage is an IBM product, I am using the latest version, i.e. 8.1 with all fixes. Even if I have to set the DATASTAGE_JVM and DATASTAGE_JRE variables, where exactly do I have to set them?
    Also, I am not using the examples you have shown; I have created my own classes where I have given my options. If you want to know more about what I am doing in my job, tell me and I will let you know.

  14. 2010 in review « Real-Time Data Integration Says:

    [...] The busiest day of the year was January 5th with 152 views. The most popular post that day was Incorporating Java classes into your DataStage Jobs. [...]

    • bhuvnesh2703 Says:

      Hello Ernie,

      Thanks for the post.

      I am trying to work out a test job using a Java transformer stage for learning how to configure it.

      When I try to run the job, I get the following error:
      Error: java.lang.ClassNotFoundException: com.ascentialsoftware.jds.test.UpperCase.

      Steps I have done:
      I have taken a Java class from the DataStage documentation (the UpperCase class) and compiled it externally in a tool like Eclipse, with the tr4j.jar file included in the lib before compiling. The Java class compiled successfully. I have now placed the compiled UpperCase.class file in a directory on the Linux server.

      In the Java Transformer, I have given the User’s Classpath as the directory holding the class file.
      Ex: User Classpath:
      /etl/dev/MyDirectory
      I included the classpath entry for the tr4j.jar file too.

      The class name is given as:
      Ex: Transformer Class name: com.ascentialsoftware.jds.test.UpperCase
      (Is this the right way to provide the class name? If not, please suggest. UpperCase is the class name. I have also tried giving the class name as simply “UpperCase”.)

      I have provided the input metadata as a column “EmpName” and output metadata as “EmpOutName”.

      In the DataStage Administrator, I have set the following two variables:
      DATASTAGE_JRE
      DATASTAGE_JVM

      Please suggest what shall be done.
      It would be great, if you can mail dsx of a job containing Java Transformer & Java client stage.

      Regards,
      Bhuvnesh
      Email: bhuvnesh2703@gmail.com

  15. bhuvnesh2703 Says:

    Hello Ernie,

    I was able to solve the above issue: I had placed the files in a folder for which the DS user did not have permissions. Please ignore the above message.

    However, I am now stuck with the following error:
    Java_Transformer_3,0: Error: java.lang.NoClassDefFoundError: UpperCase (wrong name: com/ascentialsoftware/jds/test/UpperCase)

    The Class File “UpperCase.class” is present in the following directory: /etl/dev/source_files/adw

    In the User’s Classpath: I have added
    - /etl/dev/source_files/adw
    -/opt/IBM/dev/APP/InformationServer/Server/DSEngine/java/lib/tr4j.jar
    -/opt/IBM/dev/APP/InformationServer/ASBNode/apps/jre/bin

    Please share a dsx.

    Regards,
    bhuvnesh2703@gmail.com

  16. Sushi Says:

    Hi Ernie,

    We have a requirement to export data from an API call. It is a URL used for bulk export, and it returns a file that I need to store on the system and then process. Can you please help me with the options for getting that data? Is there a way I can use the Java Transformer for this?

    Thanks
    Sushi

    • dsrealtime Says:

      Is it a REST based API call? (one that returns pure xml?)….JavaPack would certainly be one way…but it’s possible that you might be able to leverage the XML Stage in 8.5 for this also…..

  17. tmmcnicol1723 Says:

    Trying to get the .dsx for this. How does one become a ‘user’ of this blog? I have a wordpress.com account.

    • dsrealtime Says:

      Not sure what you mean by a “user” of the blog….do you mean of WordPress in general? They do a great job supporting bloggers thru their hosted site, although I know of other folks who have used WordPress’ open source code and hosted their blogs elsewhere…………as for the .dsx, it should be available in the post…but I’ll check it and see if there is an issue.

      Ernie

      • tmmcnicol1723 Says:

        This is the message I get when I click on the example links.

        — 403: Access Denied —

        This file requires authorization:

        You must both be a user of this blog as well as be currently logged into WordPress.com

        I am logged on to wordpress.com

      • dsrealtime Says:

        hmm. ok. Try this instead: hover your mouse over the link, then use your right mouse and choose “save as”…..the files can then be saved wherever you want…..just rename them to their formal suffixes (the DS export should be renamed to .dsx, etc.).

        Ernie

  19. tmmcnicol1723 Says:

    Thanks for the guidance as I was able to get the XML transformer configured and it now takes in three inputs from a DS job and calls the Java class files to insert into an LDAP DB. Now I have to capture the log message that is created in the Java program (which is in the director log) and evaluate it to determine if the process was successful.

  20. htamboli Says:

    How do I get those files? I am getting Access Denied.

  21. tdeuchler Says:

    nice blog here…. very useful

