Rob Paller A blog about databases, business intelligence, and an outlet for my inner geek

Let the Data Geeks Play

Posted on February 25, 2010

Recently there have been several music parodies posted by members of the MDM, Data Quality, and Data Governance community. These parodies have used songs from Rush, Styx, and George Thorogood as their muse. Jim Harris and I discussed the idea of holding a contest where members of the community could submit their own parody.

What's in it for you?

Bragging rights! (That is, until the next challenge...) Sure, there is always the chance your YouTube video goes viral. (You were going to submit a YouTube video, right?)

Rules?

We don't need no stinkin' rules... Just a few suggested guidelines are all we have to offer:

  • Your song can be original or may parody an existing song.
  • Your song has to be related to MDM, Data Governance, or Data Quality.
  • You may submit your song via a comment to this post or as a post on your own blog. If you post it on your own blog you must link back to this post or risk not being included in the vote.
  • Video or audio submissions are permitted but not required.

Celebrity Panel of Judges?

Due to prior engagements and personality conflicts we were unable to arrange for Simon Cowell, Ellen DeGeneres, or Howard Stern to participate in the judging of the best song. So instead we are going to let you, the community, be the judge of the contest. In a subsequent post we will provide a list of all of the submissions and a poll to allow you to vote.

Submission Deadline

In order to be included in the poll, your submission must be completed by March 20, 2010 at 20:00 EST. Any submission received after that will be made available for everyone to enjoy, but will not be included in the poll.

MDM to the Bone

Posted on February 18, 2010

As I was reading Jim Harris's excellent post, Data Quality is a Rush, over at the Data Flux Community of Experts, I started thinking about theme songs for data-related technologies such as data quality, master data management, data governance, or data warehousing. If you uttered the name of one of those technologies in the company of a non-data geek, their eyes would quickly glaze over. But set to the tune of a head-bobbing, foot-tapping song?

The following music parody is set to the tune of George Thorogood and the Destroyers' Bad to the Bone (via Last.fm).

MDM to the Bone


On the day I was born
The analysts all gathered 'round
And they gazed in wide wonder
At the joy they had found
The lead analyst spoke
Said "leave this one alone".
She could tell right away
That I was MDM to the bone.

MDM to the bone
MDM to the bone
M-M-M-M-MDM
M-M-M-M-MDM
M-M-M-M-MDM
MDM to the bone

I mastered countless rows
Before I met you
I'll master countless more data
Before I am through
I'll master your messy data
Yours and yours alone
I'm here to tell ya honey
That I'm MDM to the bone
MDM to the bone
M-M-M-M-MDM
M-M-M-M-MDM
MDM to the bone

I make data analysts beg
I'll make the data warehouse steal
I'll make board members blush
And make the auditors squeal
I wanna integrate your data
Yours and yours alone
I'm here to tell ya honey
That I'm MDM to the bone
MDM to the bone
M-M-M-M-MDM
M-M-M-M-MDM
MDM to the bone

And when I master your data
SMEs and Stewards step aside
Every system I beta
They all stay satisfied
Well Ya see I make my own
I'm here to tell ya honey
That I'm MDM to the bone
MDM to the bone
M-M-M-M-MDM
M-M-M-M-MDM
MDM to the bone

What song would you parody to get your fellow geeks tapping their feet and bobbing their heads?

Data Quality Mad Libs

Posted on February 16, 2010


Jim Harris has started a Data Quality Mad Lib series over at OCDQ Blog. Instead of leaving my version of the Mad Lib in his comment section, I decided to share it with everyone here.

In case you have forgotten how a Mad Lib works, it is a short story or phrase with several key words or phrases missing. Often you are asked to provide the missing items ahead of time before you see the phrase or short story. This increases the chances that the result is purely random and often amusing to read.

For Jim's Mad Lib he asked for the following to be provided by his readers:

  • an adjective - unflattering
  • a noun - a data warehouse
  • a verb - propagates
  • a noun or phrase - multiple versions of the truth
  • a phrase - ineffectively managing the master data produced by

A Data Quality Riot Act

Posted on January 14, 2010

As the son of a former Marine Chief Warrant Officer (CW2), I have not only seen the occasional riot act being doled out to my brother; I have been on the receiving end of a few myself. I have discovered that in order for your data quality initiatives to be successful, you need someone who is willing to play the role of drill sergeant on occasion. There are going to be occasions when someone needs to take control because a business process or system provides the latitude to introduce bad data into the environment. This person is going to need to be motivated and passionate about rooting out bad data and the processes that allow it to be introduced.

Recently I received an email from a business analyst inquiring about a discrepancy between the operational system and the analytical application that his team had recently distributed across the enterprise. The discrepancy was that a name change in the operational system was not being reflected in the analytical application. On the surface, this seemed fairly benign. We took a look at the table responsible for this particular entity; the table below depicts what was discovered.

Id        First Name  Last Name  Employee Id  User Id
10087632  Olivia      Johnson    257303       johnsono
10108465  Olivia      Matthews   0257303      matthews

Since this table behaves as a Type 4 slowly changing dimension, an update of the last name and user id would have simply moved the original record to an audit table and introduced a new record with the correct last name. Instead, a new record was created, which not only introduced a new surrogate key for the same person but also bypassed the unique constraint on the employee id by inserting a leading zero into the field. I assumed this had to have been done by a user who was not properly trained on handling employee name changes in the operational system. A well-written email could prevent this from happening again. The user could even be instructed to go back into the system, delete the new record, and update the existing record properly. But that was not the case.
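For the curious, here is a rough sketch of what the expected Type 4 behavior might look like, using Python's built-in sqlite3 module. The table names (person, person_audit), the column names, and the apply_name_change helper are my own inventions for illustration and not the actual operational system's schema. I am also assuming the corrected record keeps its original surrogate key.

```python
import sqlite3

def apply_name_change(conn, employee_id, last_name, user_id):
    """Hypothetical Type 4 style change: archive the current row, then update it."""
    with conn:  # one transaction: both statements commit or neither does
        # Copy the row as it stands today into the audit (history) table.
        conn.execute(
            "INSERT INTO person_audit (id, first_name, last_name, employee_id, user_id) "
            "SELECT id, first_name, last_name, employee_id, user_id "
            "FROM person WHERE employee_id = ?",
            (employee_id,),
        )
        # Apply the change to the current row; surrogate key and employee id stay put.
        conn.execute(
            "UPDATE person SET last_name = ?, user_id = ? WHERE employee_id = ?",
            (last_name, user_id, employee_id),
        )

# Assumed schema, loosely modeled on the columns shown in the table above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        first_name TEXT, last_name TEXT,
        employee_id TEXT UNIQUE, user_id TEXT
    );
    CREATE TABLE person_audit (
        id INTEGER, first_name TEXT, last_name TEXT,
        employee_id TEXT, user_id TEXT,
        archived_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    INSERT INTO person VALUES (10087632, 'Olivia', 'Johnson', '257303', 'johnsono');
""")

apply_name_change(conn, "257303", "Matthews", "matthews")
current = conn.execute("SELECT id, last_name, user_id FROM person").fetchall()
history = conn.execute("SELECT id, last_name, user_id FROM person_audit").fetchall()
```

Handled this way, the person keeps a single surrogate key and a single employee id, and the old last name lives on only in the audit table.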

The users responsible for managing user accounts in the operational system have been instructed to create a new user account when someone submits a name change. These instructions also explicitly state to insert a leading zero for the employee id to bypass the uniqueness constraint placed on the field. What happens when an employee gets married then divorced and no longer wishes to retain her now ex-husband's last name or marries again? Will they add a second leading zero to the employee id?
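One way to find out how far the damage has spread might be to look for employee ids that collide once the leading zeros are stripped. The sketch below is only an illustration: the find_padded_duplicates helper and the dictionary layout are hypothetical, and the third row is made up to show a record that should not be flagged.

```python
from collections import defaultdict

def find_padded_duplicates(rows):
    """Group rows whose employee ids match once leading zeros are stripped."""
    groups = defaultdict(list)
    for row in rows:
        # "0257303" and "00257303" both normalize to "257303".
        normalized = row["employee_id"].lstrip("0") or "0"
        groups[normalized].append(row)
    # Keep only the ids that are shared by more than one record.
    return {key: members for key, members in groups.items() if len(members) > 1}

rows = [
    {"id": 10087632, "last_name": "Johnson", "employee_id": "257303"},
    {"id": 10108465, "last_name": "Matthews", "employee_id": "0257303"},
    {"id": 10200001, "last_name": "Nguyen", "employee_id": "300411"},  # made-up row
]

duplicates = find_padded_duplicates(rows)
for employee_id, members in duplicates.items():
    print(employee_id, [member["id"] for member in members])
```

A check like this only reports the symptom; the business process that tells users to pad the id is still the thing that needs fixing.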

It was upon hearing the details of the business process that I could feel the increase in my blood pressure. This operational system has been live enterprise-wide for less than six months, and the business processes that support it are already showing signs of failure. History did not provide the learning experience that you would have hoped; it will take someone putting on their drill instructor uniform to break the privates in their platoon of their bad habits. Only then can they be built back up with the understanding that they are to protect the enterprise's data like it was their homeland.
