In the last issue, we looked at data classification and some of the main attributes used to differentiate data classes. In this issue, we continue with strategies for classifying data, with the ultimate goal of providing the IT infrastructure to support these data silos.
In Part One we examined what I call the ‘relevancy’ of data – in short:
- Mission Critical
- Non-Business Related
In Part Two, we complete the classification process with the following three concepts: Volatility, Structure, and Security.
Data Volatility
In a typical organization, data falls into the following volatility profiles:
Dynamic data includes all information that changes rapidly over a short period of time, generally within a business day. This includes accounting and sales information, as well as communications and scheduling.
Periodic data does not have the volatility of Dynamic data, but it is still subject to change – generally over a 30-day cycle.
Archival data is static in nature and should not or cannot be changed – generally due to compliance requirements. This data may include accounting history and communications archives.
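To make the distinction concrete, here is a minimal sketch that buckets a file into one of these three profiles based on how recently it was modified. The one-day and 30-day thresholds mirror the profiles above but are otherwise hypothetical, not a prescription.

```python
from datetime import datetime, timedelta
from pathlib import Path

# Hypothetical thresholds: changed within a business day -> Dynamic,
# within a 30-day cycle -> Periodic, anything older -> Archival.
DYNAMIC_WINDOW = timedelta(days=1)
PERIODIC_WINDOW = timedelta(days=30)

def volatility_profile(path: Path, now: datetime | None = None) -> str:
    """Classify a file as Dynamic, Periodic, or Archival by last-modified age."""
    now = now or datetime.now()
    age = now - datetime.fromtimestamp(path.stat().st_mtime)
    if age <= DYNAMIC_WINDOW:
        return "Dynamic"
    if age <= PERIODIC_WINDOW:
        return "Periodic"
    return "Archival"

# Example: survey the current directory and count files in each profile.
if __name__ == "__main__":
    counts = {"Dynamic": 0, "Periodic": 0, "Archival": 0}
    for f in Path(".").rglob("*"):
        if f.is_file():
            counts[volatility_profile(f)] += 1
    print(counts)
```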
Relevance and Volatility together determine how ‘close at hand’ data needs to be. For highly relevant, dynamic data, the “Recovery Point Objective” (RPO) must be short, meaning that very little data loss between backups is acceptable. As such, local storage devices with high availability are generally the rule. The “Recovery Time Objective” (RTO) must likewise be short: the data has to be restored and back in service quickly. This has a very specific influence on how data backups are performed, and how the data is safeguarded from corruption. More on this later.
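As a rough illustration of how relevance and volatility translate into recovery targets, the sketch below keys hypothetical RPO/RTO values to a (relevance, volatility) pair and checks whether the newest backup is within the RPO. The specific durations are invented for the example; real targets come from the business.

```python
from datetime import timedelta

# Illustrative (hypothetical) recovery targets keyed by (relevance, volatility).
# A short RPO tolerates little data loss; a short RTO demands a fast restore.
RECOVERY_TARGETS = {
    ("Mission Critical", "Dynamic"):      {"rpo": timedelta(minutes=15), "rto": timedelta(hours=1)},
    ("Mission Critical", "Periodic"):     {"rpo": timedelta(hours=4),    "rto": timedelta(hours=4)},
    ("Non-Business Related", "Archival"): {"rpo": timedelta(days=30),    "rto": timedelta(days=5)},
}

def meets_rpo(last_backup_age: timedelta, relevance: str, volatility: str) -> bool:
    """True if the age of the newest backup is within the class's RPO."""
    target = RECOVERY_TARGETS[(relevance, volatility)]
    return last_backup_age <= target["rpo"]

print(meets_rpo(timedelta(minutes=10), "Mission Critical", "Dynamic"))  # True
print(meets_rpo(timedelta(hours=2), "Mission Critical", "Dynamic"))     # False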
Data Structure (Complexity)
Data generally falls into two broad categories, Structured and Unstructured, the difference being how the data is accessed. Structured Data is usually housed within a database architecture (Exchange, Oracle, SQL, etc.) and accessed through a custom application or an Application Programming Interface (API). This ‘structured’ approach has a significant impact on how data is migrated and the manner in which it is recovered; it may be completely dependent on a specific application or application version.
Unstructured Data is generally stored in discrete data packets or documents and has no interconnectivity with other documents, application modules or interfaces. Most people are familiar with this data type as the documents, spreadsheets, pictures, and multimedia files they use daily.
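A short sketch of the contrast, under assumed names (an SQLite database with a hypothetical ‘invoices’ table, and a JSON document): the structured rows only make sense through the database engine and its schema, while the unstructured document can be opened by any tool that understands the file format.

```python
import json
import sqlite3
from pathlib import Path

# Structured: access goes through the database engine, the schema, and a query.
# The 'invoices' table and its columns are hypothetical examples.
def read_structured(db_path: str) -> list[tuple]:
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT id, customer, total FROM invoices WHERE total > ?", (1000,)
        ).fetchall()

# Unstructured: each document is a self-contained file; no surrounding
# application is required to read it back.
def read_unstructured(doc_path: str) -> dict:
    return json.loads(Path(doc_path).read_text())
```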
A key point to consider when dealing with data classified as ‘Structured’ is the Application Framework necessary to access that information once it is recovered. Think of it this way: you might have all your 1988 tax information faithfully backed up on a Commodore 64 disk, but how would you ever access that information if the hardware and application that created it were unavailable?
Of course, this is a very simple example. Imagine dealing with a multi-module ERP or accounting system that is two or three versions old: could your current version make use of that recovered data from five years ago, if necessary?
Data Security
The final key component of data classification is security, which generally relates to who has access to the data and to regulatory compliance. It also covers safeguarding the data against corruption caused by flaws and errors in the storage medium itself.
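One common safeguard against silent corruption in the storage medium is to record a checksum when data is backed up and verify it on restore. Here is a minimal sketch of that idea using SHA-256; the function names are illustrative, and real backup products handle this (and much more) internally.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute a SHA-256 digest of a file in chunks (safe for large files)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_copy(original: Path, backup: Path) -> bool:
    """True only if the backup's checksum matches the original's."""
    return sha256_of(original) == sha256_of(backup)
```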
In Part Three of this series, we will look at specific mechanisms for using data classification to design systems for data storage and recovery.