A seldom used object in Metadata Workbench is a “Data File”. It is not as common because it has to be manually created. Database Tables are created whenever you use a Connector or other bridge to import relational tables from a database. Data Files, however, can only be created manually, using the istool workbench generate feature, or from inside of the DataStage/QualityStage Designer.
Why create Data Files?
A Data File is the object available in the Metadata Workbench that represents flat files, .csv files or DataSets. It is able to connect to the Sequential Stage or Dataset Stage for data lineage purposes. A Data File object might be used also for pure governance reasons — a special transaction file might be defined by a particular Business Term, or you might want to assign a Steward to the Data File object — the subject matter expert on one particular file. Of course, if you are a DataStage user, you probably use regular Sequential Table Definitions all the time. Data Files are similar but are more “fixed” — they are designed to represent a specfic flat file, on a given machine, and in a particular sub-directory, as opposed to being a general metadata mapping with proper column offsets for any file that matches the selected schema.
The simplest way to create a formal Data File is to start with a DataStage Table Definition. You may already have one that was created when you imported a sequential file, or can easily create one using the “Save” button on any column list within most Stages. Once you have the Table Definition, double click on it. Review all of the tabs across the top. Pay special attention to the “Locator” Tab. Click on that one. Look at its detail properties. Values at the Locator tab control the creation of Data Files or Database Tables.
Set the pull-down option at the top to “Sequential”. If that value is not already in your pull-down list, type it in… Towards the bottom you will see an entry for the Data Collection — put in the name you want for your file. Close the Table Definition.
Now put your cursor on that Table Definition in the “tree”. Right mouse and select “Shared Table Creation Wizard”. When that dialog opens, click Next. Then open the pull-down dialog and select “create new”, and click Next. Notice the properties at this new page….you have the Filename, the Host (pick a machine or enter a new one) and Path. Make the filename the SAME as what you have hard coded in your Sequential or Dataset Stage, or the filename of any fully expanded Job Parameter default values that you are passing into it. Then set the “Path” value to the fully qualified path of the expanded Job Paramters or what you have in the same filename property. For example, if your filename in the Stage looks like this:
/tmp/myfile/#myfilename# …and #myfilename# has a default value of mySequentialFile.txt
Then use mySequentialFile.txt as the Filename and /tmp/myfile (without the final slash) for the path. Now you will have a Data File inside of Metadata Workbench that you can govern with Steward and Term assignments, and it also will stitch to the Stages that use its name in hard coded fashion or expanded Job Parameters for Design time or Operational lineage.
Ernie