Thought it was about time for another set of entries on Web Services. A few months ago I wrote about how we can invoke or “call” a Web Service from within DataStage, with a brief overview of the process. Now it’s time to address more difficult scenarios. I’ve had some comments from some of you who have run into issues — hopefully these details will help.
What exactly is a “complex” web service? Well, the word “complex” is relative, but I guess for our purposes here, it means one that you couldn’t easily get to work the first time out-of-the-box! 😉 Seriously though, Web Services come in a lot of flavors. The industry standards are in place, but not always rigorously followed. Web Services may have been designed and written before the standards were solid, or there may simply be constructs within and issues around a particular web service and its pattern that may it difficult to implement in any client tool. In fact, understanding the pattern of the service may be the most important research you can do before trying to utilize a web service in your applications. How often is it called? What does it do? What does it deliver? Does it send back one row or many rows? Does it expect you to “give it” one row or many rows when called? Does it send back data directly or just a big chunk of xml? Does it need you to send it a big chunk of xml?
Let’s start with defining what I’d call a “simple” web service and then go on from there. A perfectly simple web service is one with single row input and single row output, with maybe just a couple of columns in/out, no out-of-the-ordinary datatypes, and located on a nearby machine behind the firewall without any security. It will be invoked for every row that flows past. For DataStage purposes, it will use the Web Services Transformer Stage. In contast, here are many of the factors I’d use to qualify a web service as “complex”:
Output or Input only. Some web services are sources or targets. Call it once and it delivers rows, or receives information at a final target. In DataStage, we use the Web Services Client Stage for these. The gui lets you choose inputs that are static, particularly important if this is a “source”. For what it’s worth though, I tend to use the Web Services Transformer stage for these also, because I prefer to send my input via a link…it allows me to be more creative with the source parameters. What’s important though is that you understand the “pattern” of your desired Web Service
Security and proxies. Can you get there from here? What do you need to get past your firewall? Is HTTPS and SSL part of the web service you need to invoke? In basic scenarios, DataStage provides properties directly for these. WS-Security on the other hand, is more difficult, and involves a lot more hand-shaking and coordination between SOAP client and the provider of the Service.
Complex SOAP Bodies. XML comes in a lot of varieties. Sometimes the SOAP body being sent or received is in a hierarchical form. If that’s the case, we’ll need to decipher that XML into its relational rows and columns after we receive it at the client.
Embedded XML. A SOAP envelope is xml, but sometimes “contains” xml. In otherwords, we might have a simple string called for in the web service “myWebServiceResponse,” but there is no detail for it — it’s simply a big giant chunk of XML that is being passed back for further deciphering by the client. This is similar to the complex SOAP body, except in this case the WSDL contract knows nothing about the structure expect that a single string is being sent back. Again, we’ll need to decipher that XML into its appropriate parts after reception.
Arrays. The request or response is looking for, or sending back, an “array,” or “list” of values. There may be multiple columns in each entry of the list, or just one. This is most easily identified by looking at the WSDL in a browser — developers usually name the data areas as “ArrayOf……” or “ListOf…..”, although not always. A weather example I saw recently had “ListOfCities,” for example.
SOAP Headers. Ah. These can be tricky. Getting SOAP Headers right usually means having to know a bit more about the Service. The authors of the service will hopefully have documented what they expect in a SOAP Header, and when. Complex Web Services APIs may have a sequence of calls…one to “login” and perhaps get an access code, and then another to get started and do real work. The access code and other details are often parts of the header. Formal SOAP Headers are one thing, and what you might call “pseudo-headers” are another. In certain cases, the authors of a Service may have chosen, for various reasons, to place userID and access details inside the Body instead. It depends on the age of the service, how mature the standards, and the SOAP clients using them were when the service was first put into production, and the creativity of the original developer. We can retrieve these from the response, or if necessary, build them for the input request.
A fellow DataStage user has shared a WSDL with some of the issues above. As we work thru it and get it operational, I’ll share the techniques required.