Thought it was about time for another set of entries on Web Services. A few months ago I wrote about how we can invoke or “call” a Web Service from within DataStage, with a brief overview of the process. Now it’s time to address more difficult scenarios. I’ve had some comments from some of you who have run into issues — hopefully these details will help.
What exactly is a “complex” web service? Well, the word “complex” is relative, but I guess for our purposes here, it means one that you couldn’t easily get to work the first time out-of-the-box!
Seriously though, Web Services come in a lot of flavors. The industry standards are in place, but not always rigorously followed. Web Services may have been designed and written before the standards were solid, or there may simply be constructs within, and issues around, a particular web service and its pattern that make it difficult to implement in any client tool. In fact, understanding the pattern of the service may be the most important research you can do before trying to utilize a web service in your applications. How often is it called? What does it do? What does it deliver? Does it send back one row or many rows? Does it expect you to “give it” one row or many rows when called? Does it send back data directly or just a big chunk of XML? Does it need you to send it a big chunk of XML?
Let’s start with defining what I’d call a “simple” web service and then go on from there. A perfectly simple web service is one with single row input and single row output, with maybe just a couple of columns in/out, no out-of-the-ordinary datatypes, and located on a nearby machine behind the firewall without any security. It will be invoked for every row that flows past. For DataStage purposes, it will use the Web Services Transformer Stage. In contrast, here are many of the factors I’d use to qualify a web service as “complex”:
Output or Input only. Some web services are sources or targets. Call it once and it delivers rows, or receives information at a final target. In DataStage, we use the Web Services Client Stage for these. The GUI lets you choose inputs that are static, which is particularly important if this is a “source”. For what it’s worth though, I tend to use the Web Services Transformer stage for these also, because I prefer to send my input via a link…it allows me to be more creative with the source parameters. What’s important though is that you understand the “pattern” of your desired Web Service.
Security and proxies. Can you get there from here? What do you need to get past your firewall? Are HTTPS and SSL part of the web service you need to invoke? In basic scenarios, DataStage provides properties directly for these. WS-Security, on the other hand, is more difficult, and involves a lot more hand-shaking and coordination between the SOAP client and the provider of the Service.
Complex SOAP Bodies. XML comes in a lot of varieties. Sometimes the SOAP body being sent or received is in a hierarchical form. If that’s the case, we’ll need to decipher that XML into its relational rows and columns after we receive it at the client.
Embedded XML. A SOAP envelope is XML, but sometimes it also “contains” XML. In other words, the web service might call for a simple string, “myWebServiceResponse,” with no further detail; it’s simply one big chunk of XML that is passed back for further deciphering by the client. This is similar to the complex SOAP body, except that in this case the WSDL contract knows nothing about the structure except that a single string is being sent back. Again, we’ll need to decipher that XML into its appropriate parts after reception.
Arrays. The request or response is looking for, or sending back, an “array,” or “list” of values. There may be multiple columns in each entry of the list, or just one. This is most easily identified by looking at the WSDL in a browser — developers usually name the data areas as “ArrayOf……” or “ListOf…..”, although not always. A weather example I saw recently had “ListOfCities,” for example.
SOAP Headers. Ah. These can be tricky. Getting SOAP Headers right usually means having to know a bit more about the Service. The authors of the service will hopefully have documented what they expect in a SOAP Header, and when. Complex Web Services APIs may have a sequence of calls…one to “login” and perhaps get an access code, and then another to get started and do real work. The access code and other details are often parts of the header. Formal SOAP Headers are one thing, and what you might call “pseudo-headers” are another. In certain cases, the authors of a Service may have chosen, for various reasons, to place userID and access details inside the Body instead. It depends on the age of the service, how mature the standards, and the SOAP clients using them were when the service was first put into production, and the creativity of the original developer. We can retrieve these from the response, or if necessary, build them for the input request.
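To make the embedded XML and array cases above a little more concrete, here is a minimal sketch in Python rather than DataStage. The response document, the element names (ListOfCities, City, Name, Temp) and the sample values are all made up for illustration; only “myWebServiceResponse” comes from the description above. It shows the two-step idea: pull the single string out of the SOAP body, then parse that string as XML and flatten the repeating entries into rows.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical payload: an "array" of cities, invented for this sketch.
EMBEDDED = (
    "<ListOfCities>"
    "<City><Name>Boston</Name><Temp>71</Temp></City>"
    "<City><Name>Denver</Name><Temp>64</Temp></City>"
    "</ListOfCities>"
)

# Hypothetical response: the body holds one string whose text is escaped XML.
RESPONSE = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body>'
    f"<myWebServiceResponse>{escape(EMBEDDED)}</myWebServiceResponse>"
    "</soap:Body></soap:Envelope>"
)


def shred_cities(envelope_xml):
    """Return one (name, temp) tuple per City entry in the embedded XML."""
    envelope = ET.fromstring(envelope_xml)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    # Step 1: the WSDL only promises a string, so read it as plain text.
    embedded_text = body.find("myWebServiceResponse").text
    # Step 2: parse that text as its own XML document and flatten the list.
    inner = ET.fromstring(embedded_text)
    return [(c.findtext("Name"), c.findtext("Temp")) for c in inner.findall("City")]


rows = shred_cities(RESPONSE)
for name, temp in rows:
    print(name, temp)
```

In a DataStage job the same two steps would be handled by the Web Services stage followed by XML parsing downstream; the sketch is only meant to show the shape of the problem.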
A fellow DataStage user has shared a WSDL with some of the issues above. As we work thru it and get it operational, I’ll share the techniques required.
Ernie


August 23, 2008 at 3:48 am
Hi,
While processing embedded XML in a SOAP envelope in DataStage, I’m getting the error below. Please advise.
xml_job1..XML_Input_0: Xalan fatal error (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:NetAccessorException, Message:Could not connect to the socket for URL ‘{0}’. Error={1}
xml_job1..XML_Input_0: Xalan fatal error (publicId: , systemId: , line: 1, column: 230): Fatal error encountered during schema scan
August 23, 2008 at 4:08 am
My gmail id is : gopalopposite@gmail.com
September 9, 2008 at 8:12 pm
Venu …
You should have received the .dsx from me with the solution. Not sure exactly what was thwarting you in your installation, but as we discussed, there are a variety of solutions. For everyone else, it appeared to be a namespace issue. In my 8.x with FP1, I was able to read the documents just fine, but if there are issues and you are reading XML to put into an rdbms or other target, there are lots of shortcuts to get you moving….the easiest is to just pass your xml string thru a Transformer and edit out the offending namespace prefixes. You can’t always do that, but very often, especially if you are just shredding the document anyway, it will be fine. DataStage has rich text functions, especially in the BASIC Transformer or Server Jobs, that can manipulate lengthy text strings, with eReplace being one of my favorites.
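For anyone who wants to see the “edit out the offending namespace prefixes” idea outside of DataStage, here is a rough Python sketch of the same string manipulation. The function name, the prefix “ns1” and the sample document are hypothetical, and this is not how eReplace itself is written; it only illustrates the kind of edit involved. It handles prefixed element tags and a double-quoted xmlns declaration, and does not attempt prefixed attributes.

```python
import re


def strip_prefix(xml_text, prefix):
    """Remove one namespace prefix and its declaration from an XML string."""
    p = re.escape(prefix)
    # Drop the xmlns:prefix="..." declaration (double-quoted form only).
    xml_text = re.sub(r'\s+xmlns:%s="[^"]*"' % p, "", xml_text)
    # Turn <prefix:Tag and </prefix:Tag into <Tag and </Tag.
    return re.sub(r"(</?)%s:" % p, r"\1", xml_text)


# Hypothetical sample document for illustration.
sample = '<ns1:Order xmlns:ns1="http://example.com/orders"><ns1:Id>42</ns1:Id></ns1:Order>'
cleaned = strip_prefix(sample, "ns1")
print(cleaned)
```

As noted above, this kind of shortcut is only reasonable when you are shredding the document anyway and do not need the namespaces preserved.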
Ernie
April 15, 2009 at 6:44 am
Not that I’m totally impressed, but this is more than I expected when I stumbled upon a link on Digg saying that the info is quite decent. Thanks.
July 31, 2009 at 1:03 pm
Hi,
We have been reading your articles on DataStage web services and they have been very helpful. I have a question and was wondering if you can help us.
We need to connect Datastage 8.0 to an application via a URL by passing username and password in the HTTP headers. This application will return the cookie and a session Id within the cookie. This session Id should be captured for further transactions.
We imported the WSDL published by this application and created a Web Services Transformer Stage, but we have not been able to figure out how to pass the username and password in the HTTP header to connect with this app. Any help?
July 31, 2009 at 3:30 pm
It can be done, but it is not particularly easy. The WSTransformer has the ability to pass a header, but it has to be in the absolutely correct format, and the namespaces have to be right. The best way to approach it is to use a separate testing tool (Actional, SOAPscope from Mindreef, etc.) and formally capture a “working” SOAP envelope and header. Then you will be able to tell what you need to build and send. It gets more difficult than that, because you may also have to custom craft the details of the SOAP body, and then do the same with the response and any other queries/commands you may be issuing. WSTransformer is best for single-row, stateless, no-session invocations that return a transformed result. If you need to do complex processing, it is best to put it together in a Java client and invoke it via the Java Transformer. In the Java client you will have all the control that you need, and you can more easily unit test it outside of DS and then include it in DS for integration testing once you know how Java Pack works.
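As a rough idea of what a standalone client has to do for the HTTP-level part of the question, here is a small sketch, written in Python for brevity rather than Java. It makes several assumptions that the original question does not state: that the application accepts HTTP Basic authentication, that the session cookie is named JSESSIONID, and the URL and credentials are placeholders. Nothing is sent over the network; the sketch only builds the request and parses a sample Set-Cookie value.

```python
import base64
import urllib.request
from http.cookies import SimpleCookie


def build_login_request(url, user, password):
    """Build (but do not send) a POST request carrying Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="POST")
    # Assumption: the application expects a standard Basic Authorization header.
    request.add_header("Authorization", f"Basic {token}")
    return request


def session_id_from_cookie(set_cookie_value, cookie_name="JSESSIONID"):
    """Pull the session ID out of a Set-Cookie header value, or return None."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    morsel = cookie.get(cookie_name)
    return morsel.value if morsel is not None else None


# Placeholder URL, credentials, and cookie value for illustration only.
login = build_login_request("http://example.com/app/login", "user", "pass")
session_id = session_id_from_cookie("JSESSIONID=abc123; Path=/; HttpOnly")
print(login.get_header("Authorization"), session_id)
```

The captured session ID would then be sent back on later calls, which is exactly the kind of stateful sequence that is easier to unit test in a separate client first.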
Ernie
January 25, 2010 at 6:11 am
Thank you for your help!
February 18, 2011 at 6:34 pm
Hi Ernie, could you please tell me where I can get information or an example about web services with arrays? Thanks.
February 22, 2011 at 10:19 am
Hi Sergio… Dealing with arrays can be tricky, but it can be done. Here are some general guidelines:
a) Be sure that you are 100% successful with web services that do NOT have arrays. Able to call them easily from WSTransformer or WSClient — just to make sure that everything is ok in the environment.
b) Be sure to have a test tool, such as SOAP UI or Actional’s Soap Testing tool, so that you know EXACTLY what the behavior of the service is, and what the arrays look like when it is working, etc.
c) Start if you can with a service that takes one row in and sends back an array (these are easier to work with initially).
d) Import the WSDL and do as you would normally, except then, go back into the OUTPUT link of the WSTransformer, DELETE all the columns that come in automatically when you import at the “Message” Tab, and put in one giant column of your own (mySOAPresponse with longvarchar and a long length of your choice). Put a single “/” in the Description property, and then at the Message Tab, find the pull down for “user defined column” and select your new column.
e) Send that output link to disk, with no formatting (no commas or quotes).
f) Look at it. It should be a valid chunk of XML.
g) Now you have to parse out that xml. Do you have experience with the xml Input Stage?
h) Put a Transformer between your WSTransformer and an XMLInput Stage, and another Sequential Stage after that.
i) Push your single column thru the Transformer and into XMLInput. Check that this is XML “content” on the XML Input Stage.
j) On the output link for the XMLInput Stage, load the _OUT table definition that was imported from your WSDL. This will have the details for the lower level structure inside the SOAP body.
It will take some trial and error and testing, especially if you have never used the xml stages….. good luck!
…and if your service needs an array on input, it’s the same idea in reverse. Learning how to write XML with the xmlOutput Stage is more difficult, though, so give yourself time to learn that. Send the XML you produce to a flat file and compare it to the XML used in one of your testing tools before trying to send it to the WS directly.
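One way to do that flat-file comparison without being thrown off by whitespace or attribute order is to canonicalize both documents first. The following is a minimal Python sketch (Python 3.8 or later for `ET.canonicalize`); the function name and both sample documents are hypothetical stand-ins for the XML your job wrote and the XML captured from your testing tool.

```python
import xml.etree.ElementTree as ET


def same_xml(produced, reference):
    """Compare two XML strings after C14N canonicalization.

    strip_text=True ignores indentation-only differences between the two.
    """
    return (ET.canonicalize(produced, strip_text=True)
            == ET.canonicalize(reference, strip_text=True))


# Hypothetical samples: same content, different layout and attribute order.
produced = '<ListOfCities><City name="Boston" id="1"/></ListOfCities>'
reference = '<ListOfCities>\n  <City id="1" name="Boston"></City>\n</ListOfCities>'
different = '<ListOfCities><City name="Denver" id="1"/></ListOfCities>'

print(same_xml(produced, reference))
print(same_xml(different, reference))
```

If the two do not match, diffing the canonical forms usually points straight at the element or namespace that is off.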
Ernie
July 20, 2011 at 9:12 am
Hi Ernie,
How do I send the session ID while invoking the Salesforce web service? My Salesforce team says the session ID is mandatory to invoke it.
-Kumar
August 22, 2011 at 5:39 pm
I believe we covered this offline in a separate discussion. You can obtain the session ID in an upstream call, and then build the header needed for the next invocation.
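For readers who were not part of that discussion, the two-step pattern looks roughly like the Python sketch below. The service namespace, the element names (loginResponse, SessionHeader, sessionId, query) and the sample values are placeholders chosen for illustration, not the contract of any particular service; check the WSDL and a captured working envelope for the real names.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "urn:example:service"  # hypothetical service namespace


def extract_session_id(login_response_xml):
    """Step 1: read the session ID out of the upstream login response."""
    root = ET.fromstring(login_response_xml)
    return root.findtext(f".//{{{SVC_NS}}}sessionId")


def build_envelope(session_id, body_xml):
    """Step 2: wrap the next request in an envelope whose header carries the ID."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    session_header = ET.SubElement(header, f"{{{SVC_NS}}}SessionHeader")
    ET.SubElement(session_header, f"{{{SVC_NS}}}sessionId").text = session_id
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(ET.fromstring(body_xml))
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical login response and follow-up request body.
login_response = (
    f'<soapenv:Envelope xmlns:soapenv="{SOAP_NS}"><soapenv:Body>'
    f'<loginResponse xmlns="{SVC_NS}"><sessionId>ABC123</sessionId></loginResponse>'
    "</soapenv:Body></soapenv:Envelope>"
)
session_id = extract_session_id(login_response)
next_request = build_envelope(session_id, f'<query xmlns="{SVC_NS}"><text>q</text></query>')
print(next_request)
```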
November 16, 2011 at 9:45 am
Hi Ernie,
I asked this question on DSXchange as well, but I am trying my luck here too, since I think you may be the only one who can help. Can we call a web service using the Web Services Client stage without passing any input arguments? It works fine in SoapUI, but in DataStage it gives an error. I strongly believe it is because of the void input argument, because another method in the same web service works fine through DataStage. Please help.
November 16, 2011 at 9:48 am
Not much I can offer outside of the thoughts I put into dsxchange…. what happens with tracing, and with capture of the entire output payload?
November 16, 2011 at 9:59 am
If I change the log level from fatal to trace, the job is successful but returns 0 rows. I did not understand “capture of the entire output payload”. I am using the Web Services metadata to import table definitions.