About a year ago the Hierarchical Stage (used to be called the “XML” Stage) added the capability of invoking REST based Web Services. REST based Web Services are increasing in popularity, and are a perfect fit for this Stage, because most REST based services use payloads in XML or JSON for their requests and responses.
REST based Web Services have a couple of challenges, however, because they do not use SOAP, and consequently, they rarely have a schema that defines their input and output structures. There is no “WSDL” like their is for a classic SOAP based service. On the other hand, they are far less complex to work with. The payloads are clean and obvious, and lack the baggage that comes with many SOAP based systems. We won’t debate that here…both kinds of Web Services are with us these days, and we need to know how to handle all of them from our DataStage/QualityStage environments.
Here are some high level suggestions and steps I have for working with REST and the Hierarchical Stage:
1. Be sure that you are comfortable with the Hierarchical Stage and its ability to parse or create JSON and XML documents. Don’t even think about using the REST step until you are comfortable parsing and reading the XML or JSON that you anticipate receiving from your selected service.
2. Start with a REST service “GET” call that you are able to run directly in your browser. Start with one that has NO security attached. Run it in your browser and save the output payload that is returned.
3. Put that output in a .json or .xml file, and write a Job that reads it (using the appropriate XML and/or JSON parser Steps in the Assembly) Make sure the Job works perfectly and obtains all the properties, elements, attributes, etc. that you are expecting. If the returned response has multiple instances within it, be sure you are getting the proper number of rows. Set that Job aside.
4. Write another Job that uses the REST Step and just tries to return the payload, intact, and save it to a file. I have included a .dsx for performing this validation. Make sure that Job runs successfully producing the output that you expect, and that matches the output from using the call in your browser.
5. NOW you can work on putting them together. You can learn how to pass the payload from one step to another, and include your json or xml parsing steps in the same Assembly as the REST call, or you could just pass the response downstream to be picked up by another instance of the Hierarchical Stage. Doing it in the same Assembly might be more performant, but you may have other reasons that you want to pass this payload further along in the Job before parsing.
One key technique when using REST with DataStage is the ability to “build” the URL that you will be using for your invocations. You probably aren’t going to be considering DataStage/QualityStage for your REST processes if you only need to make one single call. You probably want to repeat the call, using different input parameters each time, or a different input payload. One nice thing about REST is that you can pass arguments within the URL, if the REST API you are targeting was written that way by its designers.
In the Job that I have provided, you will see that the URL is set within the upstream Derivation. It is very primitive here — just hard coded. It won’t work in your environment, as this is a very specific call to the Information Governance Catalog API, with an asset identifier unique to one of my systems. But it illustrates how you might build YOUR url for the working REST call that you are familiar with from testing inside of your browser or other testing tool. Notice in the assembly that I create my own “argument” within the REST step which is then “mapped” at the Mappings section to one of my input columns (the one with the Derivation). The Job is otherwise very primitive — without Job Parameters and such, but simply an example to help you get started with REST.
…another good reference is this developerWorks article by one of my colleagues: