Have any of you watched the TV show “Hoarding: Buried Alive”? In case you haven’t, the people on the show fill their house and any other buildings on their property with stuff. That stuff could be random and from garage sales, or very specific and bought new. The bottom line: they are compelled to keep everything! When any of us think about the files we keep and the data our companies keep, in many cases we could be considered file and data hoarders! But in this case, being a hoarder is a good thing. Deleting the “stuff” that we may end up needing later would be more costly and would disrupt business processes. So, what can we do to keep the files and data but reduce the storage costs?
With all these old files and data, who honestly has the time or desire to comb through hundreds of folders and thousands of files and move or delete them? This is a massive undertaking when one considers all the company data that could be archived. Consider the following: according to IDC, only 40% of fixed data is active; the rest is accessed infrequently, if at all. Forrester Research says that 85% of production data is inactive, with 68% not having been accessed in at least 90 days. For a decent-sized enterprise with 2 PB of primary storage, this means that between 1.2 PB (60%) and 1.7 PB (85%) of primary storage is doing NOTHING!
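The arithmetic behind that range is easy to rerun for your own environment. The sketch below is only an illustration: the function name `inactive_range_pb` is made up for this example, and it reads the IDC figure as 60% inactive, which is an assumption consistent with the 1.2 PB lower bound.

```python
def inactive_range_pb(primary_pb, low_share, high_share):
    """Return the (low, high) estimate of inactive capacity in PB.

    low_share and high_share are the inactive fractions taken from
    the two analyst figures quoted above.
    """
    return primary_pb * low_share, primary_pb * high_share


# 2 PB of primary storage; 60% inactive (assumed reading of the IDC
# figure) and 85% inactive (the Forrester figure).
low, high = inactive_range_pb(2.0, 0.60, 0.85)
print(f"Inactive primary storage: {low:.1f} to {high:.1f} PB")
```

Swap in your own capacity figure to get a rough sense of how much primary storage an archive could free up.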
This massive amount of inactive data unnecessarily increases storage needs, the size of the backup infrastructure, and the cost of maintaining both. Furthermore, the companies that we all work for are spending more and more hard-earned dollars to store all those files and data from end users, application and database servers, email, etc. To add to this growing challenge, big data will increase the demand for storage even more.
Key areas of evaluation and decisions to make
Begin by identifying the use case(s) and workflow(s) for which an archive would help. In one use case, completed projects and the associated files are prime candidates for an archive. In another use case, quarterly and annual financial reporting files can be placed in an archive. Consider creating an inventory of use cases and/or workflows and ranking them in order of priority based on amount of data, ease of archival, and any other criteria.
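One way to make such an inventory concrete is a small scoring script. The following is a minimal sketch, not a prescribed method: the `UseCase` fields, the equal weights, the 1-to-5 ease scale, and the data volumes are all hypothetical placeholders to be replaced with your own criteria and figures.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    data_tb: float  # estimated amount of data in TB (hypothetical figure)
    ease: int       # ease of archival, from 1 (hard) to 5 (easy)


def rank_use_cases(cases, weight_data=0.5, weight_ease=0.5):
    """Sort use cases from highest to lowest archive priority.

    The score is a weighted sum of normalized data volume and ease of
    archival; the weights are illustrative assumptions.
    """
    if not cases:
        return []
    max_tb = max(c.data_tb for c in cases) or 1.0

    def score(c):
        return weight_data * (c.data_tb / max_tb) + weight_ease * (c.ease / 5)

    return sorted(cases, key=score, reverse=True)


# Example inventory built from the two use cases mentioned above.
inventory = [
    UseCase("Quarterly and annual financial reports", data_tb=15, ease=5),
    UseCase("Completed project files", data_tb=120, ease=4),
]

for case in rank_use_cases(inventory):
    print(case.name)
```

With these placeholder numbers the completed project files rank first because of their much larger volume; changing the weights shifts the order toward whichever criterion matters more to your business.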
Once the use case(s) and workflow(s) are identified and prioritized, a decision must be made as to how and when the data and files will be moved to archive. This is where business conditions, needs, and policies play a critical role. Business policies to consider include how long data will be retained, how frequently files will be accessed, which employees have access, etc. From a business conditions and needs perspective, factors such as rising storage costs, maintenance, and the need for online access to an indefinite number of years of files should be identified and documented.
Another key area to evaluate and decide upon is how to move the data and files. A comprehensive approach is to implement a digital asset management (DAM) software package and associated process. The available DAM packages enable the classification of files, the setting of policies that dictate when files move between primary and archive storage tiers, and uniform user access to the files across those storage environments. Alternatively, a manual approach is also a possibility. In this case, the users will need to move the files from a production folder or directory to an archive folder or directory. In most cases, evaluating the costs and benefits of the differing approaches will result in a combination of methods being implemented. A word of caution: try to limit the number of archive approaches to a couple.
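Between a full DAM package and a purely manual move sits a simple scripted policy. The sketch below assumes a 90-day idle threshold and hypothetical production and archive directories; the function names are invented for illustration and do not come from any DAM product. It treats a file as idle when both its last-access and last-modified times are older than the threshold, since access times are not reliably updated on every filesystem.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_idle_days=90, now=None):
    """Return files under root not accessed or modified within max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    stale = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            info = path.stat()
            # Use the most recent of access and modification time.
            if max(info.st_atime, info.st_mtime) < cutoff:
                stale.append(path)
    return stale


def archive_files(files, production_root, archive_root, dry_run=True):
    """Move files to archive_root, preserving their relative folder layout.

    With dry_run=True nothing is moved; the planned destinations are
    returned so the result can be reviewed first.
    """
    destinations = []
    for src in files:
        dest = Path(archive_root) / Path(src).relative_to(production_root)
        if not dry_run:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
        destinations.append(dest)
    return destinations
```

A typical run would call `find_stale_files` on the production folder, review the dry-run output of `archive_files`, and only then repeat the call with `dry_run=False`. The archive folder should sit outside the production folder so that archived files are not scanned again.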
Although one might think an archive storage solution would mean much lower performance, there are several archive options available with different levels of performance. The level of performance will be a key driver of the cost per GB. If taking minutes to retrieve archived files is acceptable, a cloud or locally deployed tape-based solution is a likely fit. If the files need to be retrieved more quickly, with response times in seconds, then a spinning disk solution can meet the need. A wide range of performance levels is available from a variety of vendors to meet business needs. Decide on the level of performance your archive storage should deliver based on your application and/or end user needs.
Two other areas that must be evaluated are security and Service Level Agreements (SLAs). It is very important to understand the sensitivity of the data in the files that will be archived. For files that require high levels of security, for business and regulatory reasons, one would most likely want the data to be hosted in one’s own data center and be surrounded and embedded with security controls, monitoring, and alerting. For files that still need strong security but can be hosted externally, there are archive storage services from service providers that will meet the security requirements, but your contract with them must cover all of your requirements. For files that require lower levels of security, there are a large number of cloud and hosted options to consider that will deliver cost and operational efficiencies. Lastly, SLAs must not be forgotten. What service levels for performance, availability, incident response, etc. are needed in order to meet the technical and business requirements? Each of these SLAs should be evaluated and should incorporate the application, user, security, workflow, and network requirements and/or constraints.
Summary of Archive Recommendation
As stated earlier, most files are accessed infrequently, if ever, after 90 days. Archival storage is an important storage solution that will help your organization in many ways. First, it will reduce your storage costs by moving files from high-cost primary storage to low-cost secondary storage. Second, by sizing primary storage to hold only active files, the burden on the backup infrastructure can be reduced and the backup window shortened. Third, for any data warehouses, data lakes, etc., archival storage will improve performance by reducing the amount of data that must be tracked and analyzed. So, if you haven’t started already, now is the time to consider and possibly implement an archival storage solution.