How to Invoke Complex Web Services

Thought it was about time for another set of entries on Web Services. A few months ago I wrote about how we can invoke or “call” a Web Service from within DataStage, with a brief overview of the process. Now it’s time to address more difficult scenarios. I’ve had some comments from some of you who have run into issues — hopefully these details will help.

What exactly is a “complex” web service? Well, the word “complex” is relative, but I guess for our purposes here, it means one that you couldn’t easily get to work the first time out-of-the-box! ;) Seriously though, Web Services come in a lot of flavors. The industry standards are in place, but not always rigorously followed. Web Services may have been designed and written before the standards were solid, or there may simply be constructs within, and issues around, a particular web service and its pattern that make it difficult to implement in any client tool. In fact, understanding the pattern of the service may be the most important research you can do before trying to utilize a web service in your applications. How often is it called? What does it do? What does it deliver? Does it send back one row or many rows? Does it expect you to “give it” one row or many rows when called? Does it send back data directly or just a big chunk of XML? Does it need you to send it a big chunk of XML?

Let’s start with defining what I’d call a “simple” web service and then go on from there. A perfectly simple web service is one with single row input and single row output, with maybe just a couple of columns in/out, no out-of-the-ordinary datatypes, and located on a nearby machine behind the firewall without any security. It will be invoked for every row that flows past. For DataStage purposes, it will use the Web Services Transformer Stage. In contrast, here are many of the factors I’d use to qualify a web service as “complex”:

Output or Input only. Some web services are sources or targets. Call it once and it delivers rows, or receives information at a final target. In DataStage, we use the Web Services Client Stage for these. The GUI lets you choose inputs that are static, which is particularly important if this is a “source”. For what it’s worth though, I tend to use the Web Services Transformer stage for these also, because I prefer to send my input via a link…it allows me to be more creative with the source parameters. What’s important though is that you understand the “pattern” of your desired Web Service.

Security and proxies. Can you get there from here? What do you need to get past your firewall? Are HTTPS and SSL part of the web service you need to invoke? In basic scenarios, DataStage provides properties directly for these. WS-Security, on the other hand, is more difficult, and involves a lot more hand-shaking and coordination between the SOAP client and the provider of the Service.

Complex SOAP Bodies. XML comes in a lot of varieties. Sometimes the SOAP body being sent or received is in a hierarchical form. If that’s the case, we’ll need to decipher that XML into its relational rows and columns after we receive it at the client.
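
To picture what “deciphering” a hierarchical body into rows and columns means, here is a minimal sketch in Python rather than DataStage. The envelope, the `GetOrderResponse` element, and the `http://example.com/orders` namespace are all made up for illustration; the point is only that each repeating child becomes a row and the parent values are repeated on every row.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchical SOAP body: one order with nested line items.
SOAP_RESPONSE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOrderResponse xmlns="http://example.com/orders">
      <Order id="1001">
        <Customer>Acme</Customer>
        <Item><Sku>A-1</Sku><Qty>2</Qty></Item>
        <Item><Sku>B-7</Sku><Qty>5</Qty></Item>
      </Order>
    </GetOrderResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefix-to-namespace map used in the path expressions below (assumed names).
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "o": "http://example.com/orders",
}


def flatten_orders(xml_text):
    """Return one flat row per Item, repeating the parent Order columns."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.findall("./soap:Body/o:GetOrderResponse/o:Order", NS):
        order_id = order.get("id")
        customer = order.findtext("o:Customer", namespaces=NS)
        for item in order.findall("o:Item", NS):
            rows.append({
                "order_id": order_id,
                "customer": customer,
                "sku": item.findtext("o:Sku", namespaces=NS),
                "qty": int(item.findtext("o:Qty", namespaces=NS)),
            })
    return rows


rows = flatten_orders(SOAP_RESPONSE)
for row in rows:
    print(row)
```

The choice of which element repeats (here `Item`) decides how many rows come out, which is the same decision you make when shredding the document in any client tool.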

Embedded XML. A SOAP envelope is XML, but sometimes it “contains” XML. In other words, we might have a simple string called for in the web service, “myWebServiceResponse,” but there is no detail for it. It’s simply a big giant chunk of XML that is being passed back for further deciphering by the client. This is similar to the complex SOAP body, except in this case the WSDL contract knows nothing about the structure except that a single string is being sent back. Again, we’ll need to decipher that XML into its appropriate parts after reception.
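
The two-step nature of embedded XML can be sketched like this in Python (not DataStage). The envelope below is invented: it reuses the “myWebServiceResponse” string from above, and the `Cities`/`City` payload inside it is purely an assumption. The first parse only yields a string; a second parse is needed to get at the structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical response: the payload is an escaped XML document carried
# inside a single string element, so the WSDL only sees "a string".
ENVELOPE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <myWebServiceResponse>&lt;Cities&gt;&lt;City name="Boston" temp="61"/&gt;&lt;City name="Austin" temp="88"/&gt;&lt;/Cities&gt;</myWebServiceResponse>
  </soap:Body>
</soap:Envelope>"""


def extract_embedded_rows(envelope_text):
    """Parse the envelope, then parse the string it carries as a second document."""
    outer = ET.fromstring(envelope_text)
    # Step 1: the parser unescapes the entities and hands back plain text.
    payload = outer.find(".//myWebServiceResponse").text
    # Step 2: that text is itself XML, so it is parsed again.
    inner = ET.fromstring(payload)
    return [(city.get("name"), int(city.get("temp"))) for city in inner.findall("City")]


embedded_rows = extract_embedded_rows(ENVELOPE)
print(embedded_rows)
```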

Arrays. The request or response is looking for, or sending back, an “array,” or “list” of values. There may be multiple columns in each entry of the list, or just one. This is most easily identified by looking at the WSDL in a browser — developers usually name the data areas as “ArrayOf……” or “ListOf…..”, although not always. A weather example I saw recently had “ListOfCities,” for example.
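
As a rough illustration of the array pattern, this Python sketch wraps a list of values in a “ListOfCities” element and reads it back. Only the `ListOfCities` name comes from the example above; the `City` child element name is an assumption, and a real service would dictate its own names and namespaces in the WSDL.

```python
import xml.etree.ElementTree as ET


def build_list_of_cities(cities):
    """Wrap plain values in a ListOfCities element, one City child per value."""
    parent = ET.Element("ListOfCities")
    for name in cities:
        ET.SubElement(parent, "City").text = name  # "City" is an assumed child name
    return parent


def read_list_of_cities(parent):
    """Turn the repeating City children back into a list, one entry per row."""
    return [child.text for child in parent.findall("City")]


element = build_list_of_cities(["Boston", "Austin"])
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
print(read_list_of_cities(ET.fromstring(xml_text)))
```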

SOAP Headers. Ah. These can be tricky. Getting SOAP Headers right usually means having to know a bit more about the Service. The authors of the service will hopefully have documented what they expect in a SOAP Header, and when. Complex Web Services APIs may have a sequence of calls…one to “login” and perhaps get an access code, and then another to get started and do real work. The access code and other details are often parts of the header. Formal SOAP Headers are one thing, and what you might call “pseudo-headers” are another. In certain cases, the authors of a Service may have chosen, for various reasons, to place userID and access details inside the Body instead. It depends on the age of the service, how mature the standards and the SOAP clients using them were when the service was first put into production, and the creativity of the original developer. We can retrieve these from the response, or if necessary, build them for the input request.
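
The login-then-work sequence might look something like the following Python sketch. Everything service-specific here is hypothetical: the `http://example.com/service` namespace, the `LoginResponse` and `AccessCode` elements, and the `GetCustomers` operation. A real service documents its own header layout, and this only shows the shape of pulling a value out of one response and placing it in the header of the next request.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.com/service"  # hypothetical service namespace

# Hypothetical response to a "login" call that returns an access code.
LOGIN_RESPONSE = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <LoginResponse xmlns="{SVC_NS}"><AccessCode>abc123</AccessCode></LoginResponse>
  </soap:Body>
</soap:Envelope>"""


def extract_access_code(login_xml):
    """Retrieve the access code from the login response."""
    root = ET.fromstring(login_xml)
    return root.findtext(f".//{{{SVC_NS}}}AccessCode")


def build_request(access_code, operation):
    """Build a follow-up envelope that carries the access code in the SOAP Header."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{SVC_NS}}}AccessCode").text = access_code
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    return ET.tostring(envelope, encoding="unicode")


code = extract_access_code(LOGIN_RESPONSE)
request_xml = build_request(code, "GetCustomers")
print(request_xml)
```

For a “pseudo-header” service, the same value would simply be added under the Body element instead of the Header.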

A fellow DataStage user has shared a WSDL with some of the issues above. As we work thru it and get it operational, I’ll share the techniques required.

Ernie

6 Responses to “How to Invoke Complex Web Services”

  1. venu Says:

    Hi,
    While I’m processing embedded XML from a SOAP envelope in DataStage, I’m getting the error below. Please advise.
    xml_job1..XML_Input_0: Xalan fatal error (publicId: , systemId: , line: 0, column: 0): An exception occurred! Type:NetAccessorException, Message:Could not connect to the socket for URL ‘{0}’. Error={1}

    xml_job1..XML_Input_0: Xalan fatal error (publicId: , systemId: , line: 1, column: 230): Fatal error encountered during schema scan

  2. dsrealtime Says:

    Venu …

    You should have received the .dsx from me with the solution. Not sure exactly what was thwarting you in your installation, but as we discussed, there are a variety of solutions. For everyone else, it appeared to be a namespace issue. In my 8.x with FP1, I was able to read the documents just fine, but if there are issues and you are reading XML to put into an rdbms or other target, there are lots of shortcuts to get you moving….the easiest is to just pass your xml string thru a Transformer and edit out the offending namespace prefixes. You can’t always do that, but very often, especially if you are just shredding the document anyway, it will be fine. DataStage has rich text functions, especially in the BASIC Transformer or Server Jobs, that can manipulate lengthy text strings, with eReplace being one of my favorites.

    Ernie

  4. Anu Says:

    Hi,
    We have been reading your articles on DataStage web services and they have been very helpful. I have a question and was wondering if you can help us.
    We need to connect DataStage 8.0 to an application via a URL by passing a username and password in the HTTP headers. This application will return a cookie with a session ID inside it. This session ID should be captured for further transactions.
    We imported the WSDL published by this application and created a Web Services Transformer Stage, but have not been able to figure out how to pass the username and password in the HTTP header to connect with this app. Any help?

    • dsrealtime Says:

      It can be done, but it is not particularly easy. The WSTransformer has the ability to pass a header, but it has to be in the absolutely correct format, and namespaces have to be right….. the best way to approach it would be to use some separate testing tool (Actional, SOAPscope from Mindreef, etc.) and formally capture a “working” SOAP envelope and header. Then you will be able to tell what you need to build and send. It gets more difficult than that, because you may also have to custom craft the details of the SOAP body, and then do the same with the response and any other queries/commands you may be issuing. WSTransformer is best for single row, stateless and no-session invocations that return a transformed result. If you need to do complex processing, it is best to put it together in a Java client and invoke it via the Java Transformer. In the Java client you will have all the control that you need, and can more easily unit test it outside of DS and then include it in DS for integration testing once you know how Java Pack works.

      Ernie

