Wordless Wednesday: April 14, 2010
Photo via Flickr under terms of Creative Commons License: thor is a laptop
A Data Quality Riot Act
As the son of a former Marine Chief Warrant Officer (CW2), I have not only seen the occasional riot act being doled out to my brother; I have been on the receiving end of a few myself. I have discovered that for your data quality initiatives to be successful, you need someone who is willing to play the role of drill sergeant on occasion. There will be times when someone needs to take control because a business process or system provides the latitude to introduce bad data into the environment. This person needs to be motivated and passionate about rooting out bad data and the processes that allow it to be introduced.
Recently I received an email from a business analyst inquiring about a discrepancy between the operational system and the analytical application that his team had recently distributed across the enterprise. The discrepancy was that a name change in the operational system was not being reflected in the analytical application. On the surface, this seemed fairly benign. We took a look at the table responsible for this particular entity; the table below depicts what we discovered.
| Id | First Name | Last Name | Employee Id | User Id |
| 10087632 | Olivia | Johnson | 257303 | johnsono |
| 10108465 | Olivia | Matthews | 0257303 | matthews |
Since this table behaves as a Type 4 slowly changing dimension, an update of the last name and user id would simply have moved the original record to an audit table and introduced a new record with the correct last name. Instead, a new record was created that not only introduced a new surrogate key for the same person but also bypassed the unique constraint on the employee id by inserting a leading zero into the field. This had to have been done by a user who was not properly trained on handling employee name changes in the operational system. A well-written email could prevent this from happening again. The user could even be instructed to go back into the system, delete the new record, and update the existing record properly. But that was not the case.
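To make the intended behavior concrete, here is a minimal Python sketch of a Type 4 style name change. It is an illustration only: the `Employee` class, the `apply_name_change` function, and the in-memory `current` and `audit` structures are hypothetical stand-ins, not the schema or code of the operational system discussed here. The point it shows is that the old version of the row goes to the audit store while the surrogate key and employee id stay the same.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Employee:
    id: int            # surrogate key
    first_name: str
    last_name: str
    employee_id: str   # natural key, expected to be unique
    user_id: str


def apply_name_change(
    current: Dict[str, Employee],
    audit: List[Employee],
    employee_id: str,
    new_last_name: str,
    new_user_id: str,
) -> Employee:
    """Archive the existing row, then store the changed row under the same keys."""
    old = current[employee_id]
    audit.append(old)  # the previous version moves to the audit store
    updated = replace(old, last_name=new_last_name, user_id=new_user_id)
    current[employee_id] = updated  # same surrogate key, same employee id
    return updated


# Hypothetical in-memory stand-ins for the current table and the audit table.
current = {"257303": Employee(10087632, "Olivia", "Johnson", "257303", "johnsono")}
audit: List[Employee] = []

updated = apply_name_change(current, audit, "257303", "Matthews", "matthews")
```

After the call, `current` still holds exactly one row for employee id 257303, and the Johnson version of that row sits in `audit`.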
The users responsible for managing user accounts in the operational system have been instructed to create a new user account when someone submits a name change. These instructions also explicitly state to insert a leading zero in the employee id to bypass the uniqueness constraint placed on the field. What happens when an employee gets married, then divorced, and no longer wishes to retain her now ex-husband's last name, or marries again? Will they add a second leading zero to the employee id?
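One way to find records created under this workaround is to compare employee ids after stripping leading zeros. The sketch below assumes the rows are available as plain dictionaries with an `employee_id` string field; the function name `find_padded_duplicates` and the row layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def find_padded_duplicates(rows: List[dict]) -> Dict[str, List[dict]]:
    """Group rows whose employee ids differ only by leading zeros."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        # Strip leading zeros; keep a single "0" if the id was all zeros.
        normalized = row["employee_id"].lstrip("0") or "0"
        groups[normalized].append(row)
    # Only ids that map to more than one row are suspicious.
    return {key: found for key, found in groups.items() if len(found) > 1}


rows = [
    {"id": 10087632, "last_name": "Johnson", "employee_id": "257303"},
    {"id": 10108465, "last_name": "Matthews", "employee_id": "0257303"},
]

dupes = find_padded_duplicates(rows)
```

For the two rows shown in the table, `dupes` contains a single group keyed by `257303` that holds both surrogate keys. A second or third leading zero would land in the same group.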
It was upon hearing the details of the business process that I could feel my blood pressure rise. This operational system has been live enterprise-wide for less than six months, and the business processes that support it are already showing signs of failure. History did not provide the learning experience you would have hoped for; it will take someone putting on their drill instructor uniform and breaking the privates in their platoon of their bad habits. Only then can they be built back up with the understanding that they are to protect the enterprise's data as if it were their homeland.
Teradata Viewpoint – DBMS Management meets Web 2.0
Teradata released a white paper today highlighting the release of Viewpoint, a Web 2.0 based portal for managing their data warehouse offering. If it lives up to the hype from the 2007 Partner's conference, it will be a welcome replacement for the outdated Teradata Manager offering. Viewpoint is based on the technologies behind many other Web 2.0 offerings, including JavaScript and AJAX, to provide end users with a fairly customizable and clean user interface.
Out of the box, Viewpoint provides Teradata customers with 11 portlets that target a combination of end users, DBAs, and managers. These portlets give users visibility into their current queries, the current capacity heatmap, current problem queries, and current system health, among other things. Also included with Viewpoint is a Portlet Development Kit (PDK) that allows third parties to develop their own portlets. It would be nice to see Teradata provide a library section on their website of approved third-party portlets.



