Cloud Optimized Storage Solutions: Part 3b – Service Level Agreements

by dave on February 12, 2009

In Part 3a of the Cloud Optimized Storage Solution series, I covered the concept of data tiering within COSS.  In this post, I’m going to start the conversation on how SLAs may tie into data tiering as well as infrastructure access.  This post is more of a “working edition” than anything else, so comments are certainly welcome and warranted.

Service Level Agreements provide an additional framework for data storage and access, with particular sensitivity to how the method of access is driven by compliance.  Understandably, this subject is very broad in scope, so for the purpose of clarity the focus here is on two basic SLA metrics: data storage and data access.  These SLAs serve two purposes: they structure the relationship between a customer and their data within the cloud, and they provide a legal framework in which customer and provider understand the risks and benefits and agree on remediation.

Data Storage SLAs

Within any company’s datacenter, there is an implicit understanding from the employees that data that is stored on an array will have the appropriate mechanisms for timely and appropriate access.  While this model works well behind the firewall (indeed, this is truly the first “cloud storage layer” of sorts), when it is moved outside the relatively secure and comfortable NOC walls to a hosted service (a la EC2/S3), it becomes harder to maintain and enforce.  

One SLA model in use today is Amazon’s S3 agreement, which includes provisions for storage uptime and error rates and ties them to a specific service credit model should there be any proven interruptions.  It also carves out exclusions for various cataclysms such as fires and other “acts of God” that would otherwise be outside the general control of the S3 group.  Additionally, connectivity issues are specifically excluded because the communication lines are owned by third parties.  This raises the question: what good is a storage SLA if the data cannot be accessed?
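To make the uptime-to-credit idea concrete, here is a minimal sketch of how such a calculation could work.  The thresholds, credit percentages, and function names are my own placeholders and are not quoted from Amazon’s agreement or anyone else’s.  I’m also measuring downtime in minutes for simplicity, where a real agreement may define availability differently (by error rate, for example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditTier:
    min_uptime_pct: float  # inclusive lower bound for this tier
    credit_pct: float      # share of the monthly bill credited back


# Hypothetical schedule, ordered from best uptime to worst.
CREDIT_SCHEDULE = [
    CreditTier(min_uptime_pct=99.9, credit_pct=0.0),
    CreditTier(min_uptime_pct=99.0, credit_pct=10.0),
    CreditTier(min_uptime_pct=0.0, credit_pct=25.0),
]


def monthly_uptime_pct(total_minutes, downtime_minutes, excluded_minutes=0):
    """Uptime percentage after removing downtime covered by exclusions.

    excluded_minutes models the carve-outs (acts of God, third-party
    connectivity) that the provider does not count against itself.
    """
    counted_downtime = max(downtime_minutes - excluded_minutes, 0)
    return 100.0 * (total_minutes - counted_downtime) / total_minutes


def service_credit_pct(uptime_pct):
    """Return the credit for the first tier whose lower bound is met."""
    for tier in CREDIT_SCHEDULE:
        if uptime_pct >= tier.min_uptime_pct:
            return tier.credit_pct
    return CREDIT_SCHEDULE[-1].credit_pct


if __name__ == "__main__":
    minutes_in_month = 30 * 24 * 60
    uptime = monthly_uptime_pct(minutes_in_month, downtime_minutes=120)
    print(round(uptime, 3), service_credit_pct(uptime))
```

Note how the exclusions work in the provider’s favor: the same 120 minutes of outage earns no credit at all if 100 of those minutes are attributed to a third-party network.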

Data Access SLAs

As the previous question notes, data is meaningless if it cannot be accessed.  Where storage SLAs would typically be seen in a corporate environment, access SLAs would be more visible for resources outside of the corporate environment.  While this type of definition does not exclude VPNs, for example, it does tend to look at the method of access a bit more critically, with an eye towards conditions that would make access untenable.  Because there are so many variables that can prevent data access over public infrastructure (e.g. storms, fires, and environmental destruction, to name a few), it becomes impossible to guarantee that access will always be available.  As such, data access SLAs have to be tied to the underlying COSS systems and their management of data (as noted previously in the data tiering section).

Expendable Data SLA

Part of the issue when tying SLAs to data tiering is that data has to be given a level of importance to best determine initial data placement.  Coupled with that is an inherent data “aging” process that can be used to reclaim space within a given Cloud File System (cFS).  This type of data can be considered to be “expendable” and thus deserves its own SLA and understanding.  

Expendable data is designed to be a “scratch” pool, used in a similar fashion to Gmail’s Spam folder: a pre-determined time limit is imposed on the data within a given cFS, after which it is aged out.  The benefit of this level of SLA is that it allows for defined storage reclamation periods and relaxes the SLA requirements for data access and retention.  cFS objects of this type are simply tagged as such during the ingest process provided through the SOAP/REST interface.  They can subsequently be considered for base-level replication to other COSS systems for protection or geo-dissemination, or can be left on the source COSS system.
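The aging process described above could be sketched roughly as follows.  Everything here is illustrative: the `CfsObject` record, the `sla_class` tag values, and the 30-day limit are assumptions of mine, not part of any existing cFS interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CfsObject:
    key: str
    sla_class: str         # hypothetical tag set at ingest: "expendable" or "critical"
    ingested_at: datetime


# Hypothetical retention limit for the expendable pool.
EXPENDABLE_TTL = timedelta(days=30)


def reclaimable(objects, now, ttl=EXPENDABLE_TTL):
    """Return expendable objects whose age has reached the time limit.

    Objects with any other SLA tag are never selected, regardless of age.
    """
    return [
        obj for obj in objects
        if obj.sla_class == "expendable" and now - obj.ingested_at >= ttl
    ]


if __name__ == "__main__":
    now = datetime(2009, 2, 12)
    pool = [
        CfsObject("scratch/old.tmp", "expendable", now - timedelta(days=45)),
        CfsObject("scratch/new.tmp", "expendable", now - timedelta(days=2)),
        CfsObject("finance/ledger.db", "critical", now - timedelta(days=400)),
    ]
    for obj in reclaimable(pool, now):
        print("age out:", obj.key)
```

A reclamation job run on a schedule would then delete (or demote) whatever this returns, which is what gives the SLA its defined reclamation period.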

Critical Data SLA

Like the Expendable Data SLA, the Critical Data SLA is tied to a class of data: in this case, the Tier0/Tier1 application sets or data groups that require ongoing retention and protection within the cFS.  These objects are similarly meta-tagged at ingest and have priority access to Tier0/1 data tiers for storage and management.  Object protection is assumed to be provided by multiple object replicas and geo-dissemination amongst multiple COSS systems.
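One way to picture the two SLA classes side by side is as a lookup from the ingest tag to a placement policy.  This is only a sketch: the `sla` metadata key, the replica counts, the 30-day limit, and the “Tier3” landing tier for expendable data are all assumptions on my part.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PlacementPolicy:
    tiers: Tuple[str, ...]    # tiers the object may be placed on
    min_replicas: int         # copies kept across COSS systems
    geo_disseminate: bool     # spread replicas across locations
    ttl_days: Optional[int]   # None means retained indefinitely


# Hypothetical policies keyed by the SLA tag applied at ingest.
POLICIES = {
    "critical": PlacementPolicy(("Tier0", "Tier1"), 3, True, None),
    "expendable": PlacementPolicy(("Tier3",), 1, False, 30),
}


def policy_for(metadata):
    """Look up the placement policy from an object's ingest metadata."""
    sla = metadata.get("sla")
    if sla not in POLICIES:
        raise ValueError(f"unknown or missing SLA tag: {sla!r}")
    return POLICIES[sla]


if __name__ == "__main__":
    print(policy_for({"sla": "critical", "owner": "finance"}))
    print(policy_for({"sla": "expendable"}))
```

Rejecting untagged objects rather than guessing a default is a deliberate choice in this sketch, since a wrong guess in either direction either wastes Tier0/1 capacity or ages out data someone needed.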

Closing Thoughts

As noted at the beginning, this list is by no means exhaustive.  Rather, this is meant to at least show how SLAs can be tied to data as well as to the underlying “Tiers” used to optimize the cFS data.  Thoughts and comments are absolutely welcome!
