2010-12-27

DAMA-DMBOK book review

“What do you do?”

I am asked this question frequently. Family members, church friends, even recruiters and coworkers ask it.

Depending on the audience, I will say something like “I work with computers,” “I’m a DBA,” or “I’m a database developer.”

Dr. Richard Feynman is often quoted as saying: “If you can't explain something to a first year student, then you haven't really understood it.”

The DAMA Guide to the Data Management Body of Knowledge (DAMA-DMBOK) is a work that attempts to document and formalize the definition of the Data Management profession.

According to the book, a Data Management Professional is responsible for the planning, controlling and delivery of data and information assets.


The thing that impressed me the most is that it brings together formal definitions of so many of the concepts that I work with on a daily basis. Whole books can be, and indeed have been, written on each component of data management touched on in the body of knowledge. One of the values of this book is its bibliography. Anyone who acquired every book referenced in this work would have an impressive library of data management knowledge.

Another thing that impressed me is that this book advocates the role of the Data Management Executive. The Data Management Executive is defined as: “The highest-level manager of Data Management Services organizations in an IT department. The DM Executive reports to the CIO and is the manager most directly responsible for data management, including coordinating data governance and data stewardship activities, overseeing data management projects and supervising data management professionals. May be a manager, director, AVP or VP.” I have worked with and in many organizations; very few actually had an “official” data management executive. As a result, data movement into and out of those organizations was something of a haphazard process. Each project that required movement of data was approached differently. Had a single official point of contact for all data management activities existed, these projects could have been streamlined to fit into an overarching design for the enterprise as a whole.

Each chapter covers a different aspect of the overall Data Management profession. The first chapter gives an overview of why data is a corporate asset. The definition of data as a corporate asset is the foundation of all data management activities. Once data is seen as an asset, the follow-on activities discussed in the major component chapters can be seen as value-add activities.
The major components of Data Management covered by the chapters and the definitions the DMBOK provides are:


Data Governance: The exercise of authority and control (planning, monitoring and enforcement) over the management of data assets. The chapter gives an overview of the data governance function and how it impacts all of the other functions. Data Governance is the foundation for the other functions.

Data Architecture: An integrated set of specification artifacts used to define data requirements, guide integration and control of data assets, and align data investments with business strategy.

Data Development: The subset of project activities within the system development lifecycle (SDLC) focused on defining data requirements, designing the data solution components and implementing these components.

Data Operations Management: The development, maintenance and support of structured data to maximize the value of the data resources to the enterprise. Data operations management includes two sub-functions: database support and data technology management.

Data Security Management: The planning, development and execution of security policies and procedures to provide proper authentication, authorization, access and auditing of data and information assets.

Reference and Master Data Management: The ongoing reconciliation and maintenance of reference data and master data.

Data Warehouse and Business Intelligence Management: This is a combination of two primary components. The first is an integrated decision support database. The second is the related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources. These two parts combine to support historical, analytical and business intelligence requirements.

Document and Content Management: The control over capture, storage, access, and use of data and information stored outside relational databases. Document and Content Management focuses on integrity and access. Therefore, it is roughly equivalent to data operations management for relational databases.

Meta-data Management: The set of processes that ensure proper creation, storage, integration and control to support associated usage of meta-data.

Data Quality Management: A critical support process in organizational change management. Changing business focus, corporate business integration strategies, and mergers, acquisitions, and partnering can mandate that the IT function blend data sources, create gold data copies, retrospectively populate data or integrate data. The goals of interoperability with legacy or B2B systems need the support of a DQM program.


The last chapter covers professional development, ethics, and how DAMA (Data Management International) provides a professional society, or guild, for the communal support of information and data management professionals.

Overall this is an outstanding book for defining the roles associated with data management. While it is light on details for implementing the programs, processes and projects that it defines, it is nevertheless a great book for creating a common vocabulary amongst professionals who work day-to-day in the data management profession.

The more consistently we, as data management professionals, communicate with business users, executives, and the public about what we do, the better it will be for all of us when one of us is asked, “What do you do?”

My answer now is: “I am a Data Management Professional. I can assist you with better understanding, delivery, analysis, security and integrity of your data.”




2010-12-01

Data as an Enterprise Asset

According to Wikipedia, an asset is: "Anything tangible or intangible that is capable of being owned or controlled to produce value and that is held to have positive economic value is considered an asset."

Data is the most valuable enterprise asset in existence. The recent release of documents on the WikiLeaks site is a prime example of this. What will be the final cost associated with the release of these documents? How many man-hours will be devoted to changing procedures, implementing new security protocols, and trying to recover from the loss of face suffered by many government agencies?


Your address book


What would happen if you were to lose your phone?

You would just replace it, right?

What about your address book?

How many people keep their contacts in multiple locations for "safekeeping"?

You want to make sure that you keep your contacts regardless of what happens to your phone, right? Data Management professionals feel the same way about the data that they safeguard.

The CEO view

It is 11:43 p.m. on a Friday night. The alcohol from the dinner meeting with investors won’t wear off for a few more hours. You should be fine for the 7:12 a.m. tee time with the next group of potential clients. When the phone rings you just curse and pick it up.

“What!” you yell into the phone.

“Hey, boss,” you hear the head of your IT department say.

“Listen, there is no easy way to say this. In the storm that we had earlier tonight, we took a handful of lightning strikes and had a tornado touch down on the building itself. The lightning strikes then caused a fire that wasn’t caught until it was too late. The building is pretty much destroyed.

“We have already updated DNS to point to our DR site. Some of the DBAs and server admins are on the way there. Our main network guy is unavailable since he is out of town. We are supposed to have our backup tapes there in a few hours. The server guys will get our servers back up, and the DBAs will restore the databases and validate where we are with the data.”

What do you do?

If you trust the DR plan and your DBAs, then you can go back to sleep.

Would you sleep well?

How valuable is your data now that you don't know whether you have it or not?

Valuating your data

One way to determine the value of your data is to identify the direct and indirect business benefits derived from use of the data. Another way is to calculate the cost of its loss: what would be the impact of losing the data you have, or its current level of quality?

What if you only lost a year's worth of data?

What change to revenue would occur?

What would be the cost to recover it? Man-hours, and potentially consultant hours if you need to hire outside expertise, would factor into the cost.
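The questions above can be turned into rough arithmetic. The following is only a minimal sketch: the `DataLossScenario` fields, the function name, and every figure in the example are hypothetical placeholders, not numbers from any real organization.

```python
from dataclasses import dataclass


@dataclass
class DataLossScenario:
    """Hypothetical inputs for a rough cost-of-loss estimate."""
    lost_revenue: float                  # change to revenue caused by the missing data
    staff_hours: float                   # internal man-hours spent on recovery
    staff_hourly_rate: float
    consultant_hours: float = 0.0        # outside expertise, if it has to be hired
    consultant_hourly_rate: float = 0.0


def estimated_loss_cost(scenario: DataLossScenario) -> float:
    """Lost revenue plus the labor cost of recovering the data."""
    recovery_cost = (
        scenario.staff_hours * scenario.staff_hourly_rate
        + scenario.consultant_hours * scenario.consultant_hourly_rate
    )
    return scenario.lost_revenue + recovery_cost


# Made-up figures for losing one year of data.
scenario = DataLossScenario(
    lost_revenue=250_000,
    staff_hours=400,
    staff_hourly_rate=60,
    consultant_hours=80,
    consultant_hourly_rate=150,
)
total = estimated_loss_cost(scenario)
print(f"Estimated cost of loss: {total:,.0f}")
```

A real valuation would also have to account for the indirect benefits mentioned earlier, which are much harder to put a number on.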

Data Management Professionals protect your Assets


Data Management Professionals are the ones that protect your data assets. By protecting and safeguarding your data assets, they are protecting and safeguarding the enterprise itself.

2010-11-23

Data never dies

Do you ever wish you could forget something?

Certainly there are traumatic events in some of our lives that we may wish we could forget. More often than not, though, people wish they could remember something.

A taste, a name, the name of a restaurant, these are all things that some of us try very hard to remember at times.

For computer data management it is a different story. I have been in many discussions where we question how far back in time we are required to retain data.

Formally, this is called the organization's data retention policy. There are laws in place that require certain records to be retained for 7 years. Some data must be kept indefinitely.

This simple term, “data retention policy,” can have an impact on software purchases, hardware purchases, man-hours and processing time for an analytics application.

Why is my application taking so long?


As the volume of data within an application grows, the performance profile of the application will change. Things that previously ran fast will begin to run slowly. I once worked for an application vendor, where I handled more data management issues than software development issues. On one occasion, shortly after I started there, I received a call about the application not working. Upon reviewing the environment where “nothing had changed,” I discovered the problem: the SQL Server database was severely underpowered. Simply executing the SQL manually through Query Analyzer showed dreadful performance.

We had to recommend an upgrade in hardware. Once the customer purchased new hardware I took a team to the customer site and we did a migration of the data from the old server to the new server. When I left the site, I heard numerous people talking about how the application had not worked that well since it had been freshly installed.

A simpler answer might have been to “archive” the data, or even just delete it, so that performance returned to its freshly installed state. We could not do that because this application was a time tracking system that recorded the time-in and time-out of employees working at a refinery. Employee data is not something that should just be purged, especially data that directly affects how contractors and employees are paid.

The data would be studied for some time to report on cost expenditures for the site where the work was recorded.

But simply upgrading the back-end database server was really only a short-term solution. This was a case where we kept all of the historical data within the application itself for reporting and study.

Reporting systems can help


Now, as a data warehouse engineer, I would suggest an alternative solution: “warm” data should be moved to a reporting application for reporting and study.

A threshold should be established for the data that is pertinent and needed within the application itself on a daily and weekly basis. This is the “hot,” fresh data that is currently being processed. The data that is important for business continuity and for reporting to auditors, vendors, other business units and executives does not necessarily need to be kept within the application itself. We should have spun off a reporting system that could retain that data and allow study and reporting without bogging down the application itself.

Building specific reporting systems is essential to maintaining optimal application performance. By offloading this data into an Operational Data Store, Data Mart, or Data Warehouse, you will keep your “hot” data hot and fresh, and your “warm” data will be available for use in an environment that does not interfere in any way with the day-to-day work of your business.
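As a rough illustration of the hot/warm split, here is a minimal sketch using Python's built-in sqlite3 module. It is not the system described above: the table names, the columns, and the 90-day threshold are all assumptions made for the example, and a single in-memory database stands in for both the application and the reporting system.

```python
import sqlite3
from datetime import date, timedelta

HOT_DAYS = 90  # assumed threshold; the business owner would set the real one


def offload_warm_rows(conn: sqlite3.Connection, today: date,
                      hot_days: int = HOT_DAYS) -> int:
    """Copy rows older than the threshold into the reporting table, then
    remove them from the application table. Returns the number of rows moved."""
    cutoff = (today - timedelta(days=hot_days)).isoformat()
    with conn:  # both statements commit together, or roll back together on error
        conn.execute(
            "INSERT INTO reporting_time_entries (id, employee, work_date, hours) "
            "SELECT id, employee, work_date, hours FROM app_time_entries "
            "WHERE work_date < ?",
            (cutoff,),
        )
        moved_count = conn.execute(
            "DELETE FROM app_time_entries WHERE work_date < ?", (cutoff,)
        ).rowcount
    return moved_count


# Demo: hypothetical time-tracking rows with ISO-formatted dates.
conn = sqlite3.connect(":memory:")
for table in ("app_time_entries", "reporting_time_entries"):
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, employee TEXT, "
        "work_date TEXT, hours REAL)"
    )
conn.executemany(
    "INSERT INTO app_time_entries (employee, work_date, hours) VALUES (?, ?, ?)",
    [
        ("A. Smith", "2010-01-15", 8.0),
        ("B. Jones", "2010-06-01", 7.5),
        ("A. Smith", "2010-11-20", 8.0),
    ],
)
conn.commit()

moved = offload_warm_rows(conn, today=date(2010, 11, 23))
hot_left = conn.execute("SELECT COUNT(*) FROM app_time_entries").fetchone()[0]
warm_kept = conn.execute("SELECT COUNT(*) FROM reporting_time_entries").fetchone()[0]
print(moved, hot_left, warm_kept)
```

Nothing is deleted outright: the older rows are still there for auditors and analysts, just no longer in the tables the application touches every day.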

How long do I keep data?


How long data is kept for each business unit within an organization is a question for each business owner. Laws need to be examined for requirements, analysts need to make it clear how much data they need to see for trending analysis, and data management personnel need to contribute to the discussion by showing alternatives for data storage, retrieval and reporting.

Keep your corporate “memory” alive by having a current data retention policy that is reviewed every time a new application is brought online. Reviewing your data retention policy at least annually keeps this issue fresh in all of the stakeholders' minds. Disaster recovery planning and performance optimization are both influenced by the data retention policy.

Since the data of your company is really all of the information about your customers, vendors, suppliers, employees and holdings, data never dying is a good thing!


2010-08-18

The Architect vs. Superman.

I believe there are really two types of workers in the IT field: Architects and Supermen. A mature Superman realizes that he cannot keep racing against time and performing super feats to save the enterprise. An immature Superman thinks this is the way things should be done.


There is an old joke I heard when I was in the Marine Corps. A young bull walks up to an old bull and says excitedly, “Hey, let’s run down this hill, jump over the fence and have our way with one of those cows down there!” The old bull looks up from chewing some grass, looks at the fence, gazes over the cows in the field, and then looks at his young friend. “No, let’s walk down the hill, crawl under the fence and have our way with all of them.”


A little planning can go a long way towards a successful project. In my opinion, architected solutions with built-in flexibility are easier to extend. I once wrote a system that required data to be loaded from a flat file into a very specific structure. Instead of writing a five-page stored procedure to move the data straight into the target structure, I inserted the data into an intermediate table with some extra columns for housekeeping and data matching.


We were able to reuse that structure for a half-dozen later projects. Instead of figuring out each time how to put the data into the target structure, and duplicating the code to update the housekeeping and matching, we simply put the new data into the intermediate structures and let the normal process work.
I was told later that, before I got there, the developers would have rewritten all of that code for each project, with each version coded slightly differently.
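The intermediate-table idea can be sketched in a few lines. This is a hypothetical illustration using Python's built-in sqlite3 and csv modules, not the original system: the `customers`, `staging` and `target` tables, their columns, and the sample feed are all invented for the example.

```python
import csv
import io
import sqlite3

SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, code TEXT UNIQUE);
CREATE TABLE target (customer_id INTEGER, amount REAL, source_name TEXT);
CREATE TABLE staging (
    customer_code TEXT,
    amount REAL,
    source_name TEXT,                          -- housekeeping: which feed the row came from
    loaded_at TEXT DEFAULT CURRENT_TIMESTAMP,  -- housekeeping: when it was staged
    customer_id INTEGER,                       -- matching: filled in by the shared process
    processed INTEGER DEFAULT 0                -- housekeeping: 1 once copied to target
);
"""


def stage_rows(conn, source_name, flat_file):
    """The only feed-specific step: copy flat-file rows into the staging table."""
    rows = [
        (source_name, row["customer_code"], row["amount"])
        for row in csv.DictReader(flat_file)
    ]
    with conn:
        conn.executemany(
            "INSERT INTO staging (source_name, customer_code, amount) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def process_staging(conn):
    """The shared process: match staged rows, load the matches, flag them done."""
    with conn:
        conn.execute(
            "UPDATE staging SET customer_id = "
            "(SELECT id FROM customers WHERE customers.code = staging.customer_code) "
            "WHERE processed = 0"
        )
        conn.execute(
            "INSERT INTO target (customer_id, amount, source_name) "
            "SELECT customer_id, amount, source_name FROM staging "
            "WHERE processed = 0 AND customer_id IS NOT NULL"
        )
        conn.execute(
            "UPDATE staging SET processed = 1 "
            "WHERE processed = 0 AND customer_id IS NOT NULL"
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO customers (code) VALUES (?)", [("A1",), ("B2",)])
conn.commit()

feed = io.StringIO("customer_code,amount\nA1,12.50\nB2,7.25\nZ9,3.00\n")
staged = stage_rows(conn, "feed_one", feed)
process_staging(conn)

loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
unmatched = conn.execute(
    "SELECT COUNT(*) FROM staging WHERE processed = 0"
).fetchone()[0]
print(staged, loaded, unmatched)
```

In this sketch a new feed only needs its own version of `stage_rows`; the matching and housekeeping in `process_staging` are written once. Rows that fail to match stay in the staging table, where they can be reviewed instead of being silently dropped.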


There are times when it is tempting to break out the Mountain Dew and pull an all-nighter. But if you are doing this repeatedly to “keep the business going,” then I contend that someone in your organization needs to go back to the drawing board, or modeling tool of choice, with your business users and make sure that everyone understands the fundamental business and technical processes that keep the business running.


Those of you who say that the business process and the technical processes are totally separate need to consider the following: If the business is not making money, then who is paying your check? For those business users who think they don’t need to be involved in technical decisions, consider this: If you shut down your data center for 2 days, or even 2 hours, would your business process continue? We are all in this together.


I titled this article “The Architect vs. Superman”; however, there really is no battle. If you have an architecture for your IT applications, data, security, quality, testing and infrastructure, then you don’t need Superman.