Not having to do all the legwork necessary to arrive at table-level data lineage in the PeopleSoft EPM Warehouse is a great testimony to good documentation and a great culture back there at big PS. For those of you who enjoy this benefit, here are some suggestions on how to leverage it:
1) [Re]organize the Excel spreadsheet so that it reflects the sequence of your ETL schedule and, if time allows, add simple job dependencies.
2) Yes, consolidate all the lineage worksheets into a single centralized list (or at least one per business domain [hr, fin, scm]).
3) Add a second worksheet to your consolidated lineage file to track hash file dependencies and read/write operations.
4) Add a third worksheet to track lookup operations and their keys.
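If you export the job dependencies from the first suggestion, a few lines of Python can check that your schedule order is actually consistent with them. This is only a rough sketch: the job names below are made up for illustration (they are not real EPM jobs), and it assumes Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each job maps to the set of jobs that must run before it.
job_dependencies = {
    "J_Stage_DEPT": set(),
    "J_Hash_DEPT_Lookup": {"J_Stage_DEPT"},
    "J_Dim_DEPT": {"J_Hash_DEPT_Lookup"},
    "J_Fact_HEADCOUNT": {"J_Dim_DEPT", "J_Stage_DEPT"},
}


def run_order(dependencies):
    """Return one valid execution order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, job in enumerate(run_order(job_dependencies), start=1):
        print(position, job)
```

Sorting the spreadsheet rows by that position gives you a list that reads in the same order the schedule runs.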
The benefits of this approach far outweigh the initial investment. When you realize how tedious this task is, just keep in mind all those hours that you’ve spent in agony debugging DataStage programs in the past. This spreadsheet is a real time saver when trying to find out where a particular column came from or which jobs perform write operations on the sequential file that is messing up your dimension.
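To make those two lookups concrete, here is a minimal sketch of how they might work once the worksheets are exported to rows. Everything in it is an assumption for illustration: the column layout, the job names, and the table and file names are hypothetical, not the actual layout of the delivered lineage spreadsheets.

```python
from collections import namedtuple

# Hypothetical worksheet layouts; adjust the fields to match your own consolidated file.
FileOp = namedtuple("FileOp", "job file operation")
ColumnLineage = namedtuple(
    "ColumnLineage", "job target_table target_column source_table source_column"
)

# Made-up sample rows standing in for the exported worksheets.
file_ops = [
    FileOp("J_Stage_DEPT", "HASH_DEPT", "write"),
    FileOp("J_Hash_DEPT_Refresh", "HASH_DEPT", "write"),
    FileOp("J_Dim_DEPT", "HASH_DEPT", "read"),
]
column_lineage = [
    ColumnLineage("J_Stage_DEPT", "STG_DEPT", "DEPTID", "SRC_DEPT_TBL", "DEPTID"),
    ColumnLineage("J_Dim_DEPT", "DIM_DEPT", "DEPT_ID", "STG_DEPT", "DEPTID"),
]


def writers_of(file_name, ops):
    """List the jobs that perform write operations on the given file."""
    return sorted(
        {op.job for op in ops if op.file == file_name and op.operation == "write"}
    )


def trace_column(table, column, lineage):
    """Walk backwards from a target column through every recorded source."""
    path, seen, frontier = [], set(), [(table, column)]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue  # guard against circular lineage entries
        seen.add(current)
        for row in lineage:
            if (row.target_table, row.target_column) == current:
                path.append(row)
                frontier.append((row.source_table, row.source_column))
    return path


if __name__ == "__main__":
    print(writers_of("HASH_DEPT", file_ops))
    for step in trace_column("DIM_DEPT", "DEPT_ID", column_lineage):
        print(step.job, step.source_table, step.source_column)
```

The same two functions work unchanged on rows read from a CSV export, as long as the fields are mapped to the tuples above.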
While I work on it, I keep my sanity by thinking about the new episode of Scrubs next week and being able to make my deadline. I’m looking forward to a big celebration and a weekend in Vegas with my friends…