I wasn't going to write any more blogs about the new SAP Information Steward Metadata Management immediately, but I confess to being intrigued by the range of third-party metadata integrators available when preparing my blog on adding new SAP integrators. In particular, as both an Oracle and SAP BI Partner, I was interested in the potential of integrating the metadata available within Oracle BI into SAP Information Steward. It's proven to be an interesting investigation, so I thought I'd put up a little blog on what's possible and what, frustratingly, seems not to be.
In previous posts looking at the new SAP Information Steward Metadata Management I've given an overview of how metadata can be brought together from different sources and then walked through how to set up a new Metadata Integrator that will interrogate those sources. In this post I'll give a quick overview of how SAP BusinessObjects developers can use this gathered metadata to quick-start the process of Universe Design. Of course, building Universes will be bread and butter stuff to most SAP BusinessObjects developers, but I think there's a potentially productive new discipline on offer through this SAP Information Steward feature. As I've mentioned before in these blogs, SAP Information Steward is an environment where both Business and IT users can come together. By sourcing the tables and columns they are going to use from those verified in SAP Information Steward, Universe Designers can be sure that they are using the business-preferred data sources or, alternatively, use the tool to suggest preferable options.
In my previous post I gave an overview of the Metadata Management module of the new SAP Information Steward. This provided a single directory structure for a number of Metadata Integrators providing a unified view of metadata across your BI environment. Metadata Integrators are available for a wide variety of integration points including SAP BusinessObjects, Data Services, Data Insight, CWM Models and Third Party interfaces such as Oracle Data Integrator or Microsoft SQL Server. In this blog I'm going to look at the process for setting up a new Metadata Integrator, specifically one which will gather the metadata details from an SAP Data Services Repository. The process is similar regardless of which Metadata Integrator you are setting up so hopefully of use for most scenarios.
In previous blogs about the new SAP Information Steward I've looked at the Data Profiling and Quality Scorecarding capabilities. Both are useful for developing a true picture of the quality of your data and ongoing initiatives to improve it - key requirements for any Data Governance programme. But what about the use to which that data is put? How is it transformed? Where is it deployed? What reports rely upon it? It's to answer these questions and more that SAP Information Steward also includes a Metadata Management module and this is going to be my subject for the next few blogs in this series.
In my previous posts in this series (here and here) I looked first at scheduling the SAP Information Steward Data Profiling tasks and then at building and processing the Business Rules that validate this data against specific tests. Both are essential prerequisites for the design and build of the SAP Information Steward Quality Scorecards reviewed in this post.
The principal benefit of the Quality Scorecards, to my mind, is that they provide a front window to any organisation's Data Quality initiative. They bring to life what is too often an abstract concept by using proven dashboard communication principles. Aggregating the weighted results of the previously defined Business Rules allows Data Stewards to present a single view of any data domain (e.g. a single score recording the quality of our Product Data) and, crucially, the progress being made over time in improving this quality. Combined with the other features of SAP Information Steward (the Metadata Integrators, Metapedia definition resources and Cleansing Package Builders), these Quality Scorecards have the potential to become the central point of Data Governance team management, prioritising and monitoring activity.
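To make the idea of aggregating weighted rule results a little more concrete, here is a minimal sketch of how a single domain score could be derived. It is an illustration only and not how SAP Information Steward calculates its scorecards internally: the `RuleResult` class, the rule names, the weights and the weighted-average formula are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RuleResult:
    """Outcome of one business rule run (hypothetical structure)."""
    name: str
    passed: int    # records that satisfied the rule
    total: int     # records tested
    weight: float  # relative importance chosen by the Data Steward


def rule_score(result: RuleResult) -> float:
    """Percentage of tested records that passed a single rule."""
    return 100.0 * result.passed / result.total if result.total else 0.0


def domain_score(results: List[RuleResult]) -> float:
    """Weighted average of the rule scores for one data domain."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("at least one rule needs a non-zero weight")
    return sum(rule_score(r) * r.weight for r in results) / total_weight


# Made-up results for a Product Data domain.
product_rules = [
    RuleResult("colour_spelling", passed=900, total=1000, weight=2.0),
    RuleResult("price_present", passed=990, total=1000, weight=1.0),
]
product_score = domain_score(product_rules)
print(f"Product Data quality score: {product_score:.1f}")
```

Recording a score like this after each scheduled run is what would give the trend over time that the scorecard is there to show.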
In my previous blog about using SAP Information Steward to Build Scorecards, I reviewed the fairly straightforward process by which Data Profile tasks are scheduled. This is important as it automates the delivery of profiling results, which too many Data Governance environments are currently producing on a costly, manual basis. It also provides the raw data profile results to which a Data Governance team can then apply Business Rules that validate business constraints and requirements. It is the setup of this process in SAP Information Steward which I'll cover in this blog.
The distinction between profiling data and applying business rules to data is a useful one for any Data Governance team. It may be, for example, that your profiling of a product data set reveals there to be lots of misspellings in the Product Colour field. In some businesses that may be fine, whilst for others this could be a real cause of concern. The former group will be relaxed about the occasional key slip and will not need to prioritise correction. The latter group, however, will perhaps be grouping the results of queries by colour and will create a business rule that checks how many records comply with the correct spelling. They will want to use this business rule to monitor the accuracy of colour on an ongoing basis. This may, for example, help them decide whether to change the input of colour on the source system screen from free text to a fixed list of values.
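As a rough illustration of the colour example, the sketch below expresses a business rule as a simple function and measures what percentage of records comply with it. This is plain Python rather than the SAP Information Steward rule editor, and the list of allowed colours, the record layout and the function names are all hypothetical.

```python
# Hypothetical list of correctly spelled colours agreed by the business.
ALLOWED_COLOURS = {"red", "green", "blue", "black", "white"}


def colour_is_valid(value):
    """Business rule: the colour must be one of the agreed spellings."""
    return value is not None and value.strip().lower() in ALLOWED_COLOURS


def compliance(records, rule, field):
    """Percentage of records whose field satisfies the rule."""
    if not records:
        return 0.0
    passed = sum(1 for record in records if rule(record.get(field)))
    return 100.0 * passed / len(records)


# Made-up product records, including a misspelling and a missing value.
products = [
    {"id": 1, "colour": "Red"},
    {"id": 2, "colour": "bleu"},
    {"id": 3, "colour": "green "},
    {"id": 4, "colour": None},
]
colour_compliance = compliance(products, colour_is_valid, "colour")
print(f"Colour rule compliance: {colour_compliance:.0f}%")
```

A business that is relaxed about spelling would simply never define `colour_is_valid`; the profiling results would look the same, but nothing would be monitored against them.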
SAP Information Steward allows Data Governance teams to apply their business rules (whether already known or 'discovered' through the data profiling process) to data, automate their calculation and monitor the trends of compliance. With the rules built, it is then possible to aggregate them into the Quality Domains shown in the SAP Information Steward Dashboards. That is a process I'll review in the third of these Building Scorecard blogs. This post is solely concerned with measuring business rule compliance.
When I was first introduced to the concept of Data Quality tools (many moons ago admittedly) my reaction was naively indifferent. So what, I thought. Just a bunch of SQL in a box that checks missing values, formats, etc... Fair enough reaction for one so very young but, of course, there's more to it than that. Simply profiling data is worthwhile but the value is undeniably limited when compared to Data Auditing where those profile tasks are conducted on a regular, automated schedule and the Data Governance team can begin to see the development of data quality issues and/or improvements over time.
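For anyone who hasn't seen what those basic checks look like, here is a minimal sketch of a column profile covering missing values, distinct values and format patterns. It is only an illustration of the general technique, not what SAP Information Steward does under the covers; the `profile_column` function and the postcode sample are assumptions made for the example.

```python
import re
from collections import Counter


def profile_column(values):
    """Basic profile of one column: counts, missing values and formats."""
    non_null = [v for v in values if v not in (None, "")]

    def pattern(value):
        # Reduce a value to its shape: letters become A, digits become 9.
        shape = re.sub(r"[A-Za-z]", "A", str(value))
        return re.sub(r"[0-9]", "9", shape)

    return {
        "count": len(values),
        "missing": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "patterns": Counter(pattern(v) for v in non_null),
    }


# Made-up postcode column with gaps and one value in an unexpected format.
postcodes = ["LS1 4AP", "M1 1AE", None, "", "12345", "LS1 4AP"]
profile = profile_column(postcodes)
print(profile)
```

Run once, this is just profiling. Storing the output of each scheduled run and comparing it with the previous one is the step that turns it into the kind of auditing described above.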
SAP Information Steward does, of course, have just this functionality included and, critically, has introduced Quality Domains where aggregated quality scores for related entities can be accessed and reviewed. In my first post about Using Information Steward, I looked at the process of Profiling. In this and a further two blogs, I'll take the story forward to explain how the Information Steward Scorecards are built, examining how business data rules are set up against scheduled profiling results before themselves being scheduled and aggregated into a Quality Domain based scorecard.
We'll start with the scheduling of the Information Steward Data Profiling jobs.
SAP Information Steward is one of the more intriguing components of the recently released 4.0 Product Suite. It's positioned as an aid to Data Governance programmes, providing tools to help manage data quality, resolve metadata definition conflicts and plan data cleansing routines. The tool integrates not only into your existing SAP BusinessObjects environment but also with other metadata sources such as Oracle Warehouse Builder. In later blogs I'll review some of this functionality in detail, but this blog will review the relatively straightforward process of installing the SAP Information Steward environment.
So, it's almost August 2011 and we're still waiting for the general availability of the new SAP BusinessObjects 4.0 platform. At Maxima, we've been getting used to the Ramp Up release for a while now and I've been paying particular attention to the new Information Design Tool. It doesn't replace the Universe Designer as such but, if you were building a new data source for your Web Intelligence, Crystal or Dashboards (formerly Xcelsius...I'll stop saying that soon!), then the IDT is where you'd start. The headline news is that it allows for federated data sources, where the traditional Universes have only ever allowed a single source system, but there is lots more detail in the tool than that. This blog focuses on the new Data Profiling capability.
Over the last few months I've been preparing for the SAP BusinessObjects 4.0 ramp-up and release, specifically looking into the Data Services and Information Steward sections of the new release.
For those of you who haven't come across these products before, here's a quick summary.