storage — Flickerdown Dave Graham's Weblog


Why Policy is the future of storage

by dave on September 20, 2009

As many of you may know, I work for EMC's Cloud Infrastructure Group as part of the Atmos solution team. In this role, I've been blessed with a closer look at where the future of cloud storage is going, as well as some of the drivers that will get it there. In this post, I'd like to talk a bit about policy and how it will shape the future of storage. I'm going to keep this as abstracted from product as possible, but where appropriate, I'll try to show you how products are implementing this technology TODAY.

What is Policy?

By definition, policy is "[an] action or procedure conforming to or considered with reference to prudence or expediency."  When viewed in the context of storage systems and management, policy, then, is the set of actions (scripted or otherwise) applied to data to provide for its retrieval, performance, or manipulation by systems.  In other words, policy is an engine that manages data from start to finish.  To see why this is important, we need to look at what the typical management stack looks like today.

Data is created by users accessing programs that are tied to physical and virtual resources.  This generated data is then processed and stored by the programs and their underlying storage I/O layers (LVMs, hypervisor I/O stacks, etc.) onto some sort of storage device (SAN, NAS, DAS, etc.), where it sits until the next access.  In essence, once data is created it is considered to be “at rest” until it is next accessed (if ever).  Within this data generation and storage continuum, the process is fundamentally simple: generated data is put directly to storage.  However, if the data continues to sit in the same place indefinitely, it typically becomes inefficient to retrieve and access.  Managing this data used to be a manual process in which data, LUNs, and their topologies had to be moved around using array- or host-based tools to provide a better “fit” for data at rest, or better performance on access.  This is where policy steps in.

Policy uses hooks into data (also known as metadata) in order to enact controls.  Please see this post for a more detailed explanation of metadata.

Why use Policies?

If the previous example shows anything, it’s that the management of data is fundamentally…well, boring and manual.  Policy provides a method of controlling the stack of data ingest AND data management while allowing the business to continue to generate, retrieve, and manipulate data.  For example, a simple policy that could be enacted against data could be as follows:

if data < 14 days old, store on EFD drives, LUN 11; if data > 14 days old, store on SATA drives, LUN 33

Obviously, that’s a high-level abstraction of what the actual process for data control would look like, but it drives the point home.  What used to be a manual LUN migration process to “performance” or “store” data is now set based on a logical control structure that can be automagically enacted on the storage system itself.  A working example of this type of policy can be seen in the automated tiering provided by Compellent and by EMC’s FAST systems for storage management.  Pretty cool, huh?
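To make the rule above a little more concrete, here’s a minimal sketch of an age-based placement policy in Python.  The tier names and the 14-day threshold come from the example rule; everything else (the `place` function, the `TIERS` table) is hypothetical scaffolding, not how FAST or Compellent actually implement it internally:

```python
from datetime import datetime, timedelta

# Hypothetical tiers standing in for the EFD (flash) and SATA LUNs
# from the example rule: newest data lands on fast media.
TIERS = [
    (timedelta(days=14), "EFD, LUN 11"),   # data less than 14 days old
    (timedelta.max,      "SATA, LUN 33"),  # everything older
]

def place(created: datetime, now: datetime) -> str:
    """Return the target tier for a piece of data based on its age."""
    age = now - created
    for threshold, tier in TIERS:
        if age < threshold:
            return tier
    return TIERS[-1][1]

now = datetime(2009, 9, 20)
print(place(datetime(2009, 9, 10), now))  # recent data -> EFD, LUN 11
print(place(datetime(2009, 8, 1), now))   # stale data  -> SATA, LUN 33
```

The point isn’t the code itself; it’s that the decision is expressed once, declaratively, and the storage system evaluates it continuously instead of an admin running LUN migrations by hand.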

An alternative method of control that isn’t necessarily tied to the storage array is the recent introduction of VMware's Storage DRS (Distributed Resource Scheduler), which is enacted against the storage I/O stack of VMware’s vSphere hypervisor.

The Future of Policy

Obviously, my examples are very simplistic in nature, but hopefully they make policy technology somewhat more accessible.  As far as the future of policy is concerned, this is where storage technologies (and even host process management) are going.  In the future, simple policy creation and enforcement will be a necessary part of storage pool creation and integration, as well as the ongoing maintenance and support of storage arrays.

As always, feedback is welcome!

edit: 9/21/09: removed a mis-aligned reference to Atmos storage policy.



Micro-burst: Retrofit or Net-New?

by dave on August 12, 2009

I’ve been ruminating on a conversation that I was part of at the recent Cloud Camp – Boston “un-conference.”  In this particular case, a customer (a VAR, NOT a manufacturer) was talking about leveraging cloud storage for a particular customer of theirs who had the following “essential criteria” that needed design help:  multiple petabytes of storage, significant unstructured data, low cost of entry, data primacy/ownership (e.g. privately controlled assets/data), and very little need for typical NAS/SAN implementations.  The questions that this VAR brought up were related to designing for this type of storage.  Let’s explore this a little more (remember, I’m just thinking out loud here) by looking at retrofitting cloud-type storage (a la Atmos) versus a “net-new” installation of a completely cloud-storage-based infrastructure.


The concept of retrofitting is to shoehorn a “new” product into a space where the “old” product was either unsatisfactory or incapable of servicing the ongoing data needs of a company’s infrastructure.  In this case, the goal is to use as much of the existing infrastructure as possible to minimize cost while at the same time providing the much-needed boost in management and capability brought to the table by the new technology.  In these types of cases, the ability of the storage product (in my case, Atmos) to integrate seamlessly is vital to bringing the “cloud” to the table.  Atmos, for what it’s worth, offers the ability to integrate into traditional NAS/SAN environments through CIFS, NFS, and IFS connectivity options (IFS is through a RHEL 5.x client) while also allowing the customer to develop connectivity and SOA options through REST/SOAP API interfaces.  This way, Atmos allows you to granularly “grow” into an API-based storage model without completely getting rid of (dare I say it? 😉 ) legacy NAS/SAN environments.
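As a rough illustration of what that API-based storage model looks like from the application side, here’s a Python sketch that builds (but doesn’t send) an HTTP PUT for an object store.  The endpoint, namespace path, and metadata header name are all hypothetical stand-ins for illustration; they are not the actual Atmos REST API:

```python
import urllib.request

def build_put(endpoint: str, object_path: str, data: bytes,
              metadata: dict) -> urllib.request.Request:
    """Construct (but do not send) a REST PUT that stores an object,
    carrying user metadata along in a custom header."""
    req = urllib.request.Request(
        url=f"{endpoint}/rest/namespace/{object_path}",  # hypothetical path
        data=data,
        method="PUT",
    )
    req.add_header("Content-Type", "application/octet-stream")
    # Metadata rides along with the object; server-side policies
    # (tiering, retention, etc.) can key off it later.
    req.add_header("x-object-meta",  # hypothetical header name
                   ",".join(f"{k}={v}" for k, v in metadata.items()))
    return req

req = build_put("https://storage.example.com", "reports/q3.csv",
                b"col1,col2\n", {"retention": "7y", "tier": "archive"})
print(req.get_method(), req.full_url)
```

The contrast with a NAS mount is the point: instead of a filesystem path, the application addresses objects over HTTP and attaches the metadata that policy engines act on.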


The net-new concept really thrives when the customer is at a crossroads: the need for new technology and infrastructure outstrips the need to preserve the current environment (and this obviously isn’t limited to just the infrastructure discussion).  The idea here is that by adding a “cloud-capable” infrastructure, the company can look to potentially minimize the recurring OpEx that it experiences as part of its normal buy cycles. (That was a painful sentence to write.)  Objectively, a net-new architecture allows a clean-slate, “ground-up” approach to storage architecture, where careful design and planning can be based around hybrid cloud capabilities (e.g. federation between Atmos and Atmos Online) as well as the scalable growth that is offered by those platforms.  Again, provision is made for integrating into the infrastructure where needed via the aforementioned NAS capabilities (CIFS/NFS/IFS), but the emphasis is placed on self-service through the API interface.

Your Choice

The cool part about this evolution is that the choice of how and when to implement is ultimately up to you.  The ability to integrate and grow now cannot be overlooked but, obviously, there are challenges with any type of new integration.  Similarly, tossing out the old and bringing in the new has its own set of challenges, such as the internal SLAs that IT has with its “internal customers,” etc.

Comments and feedback (as always) are welcome!




So, it’s 12:01am and…well, now you know what we’ve been talking about, hinting about, laughing about, and honestly…what we’ve been waiting for.  The Symmetrix V-Max is here and it’s ready to drop the hammer on what you know about storage.




Hybridizing DR for the Cloud: Concerns

March 2, 2009

Over at Information Playground, Steve Todd has started down the path of no return: private clouds.  (Incidentally, I find it quite ironic that private clouds are no more private than public clouds in that they’re essentially run on the same infrastructure and face the exact same challenges for security, data mobility, and permanence that the […]


Refreshing Celerra: New Models + New Features

February 23, 2009

As you probably have heard by now, EMC has refreshed the Celerra line a bit. With today’s announcement, we’ve added the following hardware models to the fold:

NS-480 with 2 or 4 datamovers
NS-960 with support for up to 960 drives
NS-G8 with support for up to 8 datamovers and connectivity to Symmetrix & CLARiiON systems

Now that that’s […]
