Emulex E3S: Fitment

by dave on June 30, 2009


In my first article on the Emulex E3S Gateway, I relayed a high-level look at the device along with a sample diagram of how integration would look within a very simplistic block-based environment. Overall, the concept of a hardware "gateway" to the cloud appears to be well accepted as a mechanism for enterprise cloud integration, but it's not without its fitment issues. In this post, I'm going to evaluate why this approach is appropriate and where the integration points could get somewhat sticky.

It's really no secret that the cloud (depending on your definition, of course) is the 'next big thing' when it comes to managing and storing your data. Whether or not you buy into the greater "private cloud" vision, the question remains: how will you get your data to the cloud? Emulex's E3S provides a rather creative answer, but it also requires a supporting structure to be in place for its methods. Let's dig into that first.

As I noted before, the E3S emulates SAS or FC drives as a "target" and uses this target-based approach to move data from your primary block storage to the cloud destination of your choosing (EMC Atmos, for example). This requires that the Emulex software be present somewhere within the storage destination or that it be the destination to begin with. For customers who want to keep their data on primary storage arrays like EMC's CLARiiON or Symmetrix (V-Max, of course 😉 ), this would require a bit of retrofitting to make work. Either the E3S maintains a level of pass-through data pathing within the fabric (injecting a potentially disruptive operation into the I/O stream), or it inserts its code into the storage array stack as an emulated target (injecting potentially disruptive code into the core OS of the array). In either case, the risk is compounded by the controls the storage vendors keep over their platforms.
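
To make the target-emulation idea a little more concrete, here's a minimal sketch of how an emulated block target might map LBA-addressed writes onto cloud objects. This is purely my own illustration (the chunk size, object naming, and the in-memory "cloud" client are assumptions to keep the sketch runnable), not Emulex's actual implementation:

```python
# Hypothetical sketch of the target-emulation idea: an emulated block "target"
# that maps LBA-addressed writes onto cloud objects. All names here are
# illustrative, not any vendor's actual API.

CHUNK_SIZE = 4 * 1024 * 1024  # carve the LUN into 4 MB objects (assumption)


class InMemoryCloud:
    """Stand-in for a real object store (Atmos, etc.); exists only for the demo."""

    def __init__(self):
        self.objects = {}

    def put_object(self, key, data):
        self.objects[key] = data

    def get_object(self, key):
        return self.objects.get(key)


class EmulatedCloudTarget:
    def __init__(self, lun_id, cloud):
        self.lun_id = lun_id
        self.cloud = cloud

    def _object_key(self, chunk_index):
        # Deterministic naming so later reads can locate their data.
        return f"{self.lun_id}/chunk-{chunk_index:010d}"

    def write(self, lba, data, block_size=512):
        """Handle a block write by updating the cloud object(s) it touches."""
        offset = lba * block_size
        data = bytes(data)
        while data:
            chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
            span = min(CHUNK_SIZE - chunk_offset, len(data))
            # Simplest possible scheme: read-modify-write the affected object.
            obj = bytearray(self.cloud.get_object(self._object_key(chunk_index))
                            or bytes(CHUNK_SIZE))
            obj[chunk_offset:chunk_offset + span] = data[:span]
            self.cloud.put_object(self._object_key(chunk_index), bytes(obj))
            offset += span
            data = data[span:]


# Example: a 64 KB write at LBA 2048 lands inside the LUN's first 4 MB object.
target = EmulatedCloudTarget("lun-0001", InMemoryCloud())
target.write(lba=2048, data=b"\xab" * 65536)
```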

Rather than trying to retrofit storage TO the E3S, I think the implementation focus should be on how to pull new data across to the E3S without re-inventing the storage wheel. There are a couple of ways to do this. First and foremost is host- or fabric-based write splitting. EMC's RecoverPoint uses this method of integration (adding array-based write splitting on the CX3 and CX4 CLARiiON arrays), which simply initiates a synchronous write to a secondary target that is then sent across the wire. If the E3S code could act as that secondary target (which it should, given that Emulex is running a target-mode Fibre Channel HBA inside the appliance), then every outgoing write could be sent to the cloud with a minimal amount of overhead.
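
As a rough illustration of the write-splitting idea (again, my own sketch, not RecoverPoint or E3S code), the splitter simply hands every write to the primary array and an identical copy to a secondary target:

```python
# Conceptual sketch of host/fabric write splitting: every write goes to the
# primary array and an identical copy is handed to a secondary target
# (e.g. a cloud gateway). Illustrative only.


class RecordingTarget:
    """Stand-in for any write target (production array or cloud gateway)."""

    def __init__(self, name):
        self.name = name
        self.writes = []

    def write(self, lba, data):
        self.writes.append((lba, bytes(data)))


class WriteSplitter:
    def __init__(self, primary, secondary):
        self.primary = primary      # the production array
        self.secondary = secondary  # e.g. the E3S acting as the secondary target

    def write(self, lba, data):
        # Synchronous split: the write lands on the primary as usual, and the
        # same payload rides alongside to the secondary to cross the wire.
        self.primary.write(lba, data)
        self.secondary.write(lba, data)


splitter = WriteSplitter(RecordingTarget("array"), RecordingTarget("e3s"))
splitter.write(lba=0, data=b"hello, cloud")
assert splitter.primary.writes == splitter.secondary.writes
```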

A second method of integration is squarely pointed at VMware. With the emergence of Pluggable Storage Architecture (PSA) integration (Dell's MEM and EMC's PowerPath Virtual Edition being the exemplars), it is completely feasible (in my mind) for the E3S software to be integrated in that way. The advantages of this particular integration point would be pretty powerful, as any VMDK could be replicated to the cloud intact. This will need to be fleshed out a little more (it's completely conjecture on my part right now), but the implications could be considerable.

Answering some of the questions

Chris Evans poses some very interesting questions regarding Emulex's E3S over at Gestalt IT (I'll list them here as well as link to his post):

* Is the E3S a backup solution or a replication solution?
* How is data integrity maintained in replicating LUNs to “the cloud”?
* Does the E3S appliance cache data as it is forwarded to “the cloud”?
* What redundancy and integrity is built into the E3S appliance to ensure no data loss?
* What level of throughput can the E3S appliance maintain?
* How is multi-LUN replication integrity managed?
* How is multi-array replication integrity managed?
* Can data be accessed directly from “the cloud”?

Obviously, some of these questions I can answer and some I can't (by virtue of not knowing the product as fully as its architects), but I'll do the best I can. From a very simple perspective, the E3S is a disk-emulation product that creates a new tier of storage in the cloud.

Is the E3S a backup solution or a replication solution?
The E3S is, in my mind, both/and rather than either/or. In the backup vein, you could use EMC's Atmos Online, for example, to maintain copies of a given LUN. In another sense, it's a replication solution, since you're essentially replicating data to another data store elsewhere and retain the ability to act upon it. E3S makes cloud storage usable by encrypting, and optionally compressing and dispersing, the data to one or more endpoints; in this way, data availability is improved while the data is protected through industry-standard encryption. Strictly speaking, though, E3S is neither a backup nor a replication solution; it is simply a gateway for block data of any type into cloud storage. The application at the heart of the concept is archival of cold data. In that use case, the user is not replicating or backing up but rather moving data off of an expensive asset to a pay-as-you-go elastic storage solution that is much less expensive.
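
To illustrate the compress/encrypt/disperse flow in the abstract, here's a short sketch. The zlib/Fernet choices and the endpoint interface are my own stand-ins; the actual algorithms and dispersal policy in E3S aren't documented in this post:

```python
# Sketch of a compress -> encrypt -> disperse pipeline. Illustrative only.
# Requires the third-party 'cryptography' package for Fernet.
import zlib

from cryptography.fernet import Fernet


def disperse(lun_id, chunk_index, chunk, key, endpoints):
    """Compress and encrypt a chunk, then copy it to one or more cloud endpoints."""
    payload = Fernet(key).encrypt(zlib.compress(chunk))
    obj_key = f"{lun_id}/chunk-{chunk_index:010d}"
    for endpoint in endpoints:          # e.g. several providers or regions
        endpoint.put_object(obj_key, payload)
    return obj_key


def retrieve(obj_key, key, endpoints):
    """Reverse the pipeline using the first endpoint that still has the object."""
    for endpoint in endpoints:
        payload = endpoint.get_object(obj_key)
        if payload is not None:
            return zlib.decompress(Fernet(key).decrypt(payload))
    raise KeyError(obj_key)


# Usage (assuming endpoints expose put_object/get_object):
#   key = Fernet.generate_key()
#   disperse("lun-0001", 0, b"cold data", key, [endpoint_a, endpoint_b])
```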

What redundancy and integrity is built into the E3S appliance to ensure no data loss?
The basic prototype design presented at EMC World was based around a Dell PowerEdge R200; as a prototype, redundancy was not provided. However, enterprise customers are accustomed to redundancy and failover across nearly their entire datacenter infrastructure, and enterprise-class data integrity, availability, and reliability requirements are at the core of the E3S design, with a suite of features and capabilities aimed at satisfying the most demanding enterprise customers.

Can data be accessed directly from “the cloud”?
Data that is sent to the cloud can be accessed directly via that particular cloud vendor's interface and control systems; in an Atmos world, that data would be viewable as objects. But because the data is compressed, encrypted, and dispersed to multiple cloud endpoints (per customer configuration), it is usable only through the E3S devices.

What level of throughput can the E3S appliance maintain?
Performance is a function of a number of elements, including host server performance and the customer's available network bandwidth to the storage cloud. The E3S is not intended to replace an on-premises array, but rather to provide a target for archive data (write once, read rarely). E3S is about creating a transparent way to get to cheaper, pay-as-you-go cloud storage.

Does the E3S appliance cache data as it is forwarded to “the cloud”?
Caching between a high-speed SAN and the WAN connection is an absolute necessity to ensure that the use of cloud storage, with its inherent performance characteristics, does not inhibit the performance of the existing data-center storage infrastructure.
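
One plausible way to picture that cache (a sketch under my own assumptions, not the E3S internals): writes are acknowledged as soon as they land in a local staging area, and a background worker drains them to the cloud at whatever rate the WAN allows.

```python
# Sketch of a write-back staging cache between the SAN-facing target and the
# WAN: writes are acknowledged once staged locally, and a background thread
# drains them to the cloud. Illustrative only.
import queue
import threading


class StagingCache:
    def __init__(self, cloud_uploader):
        self.pending = queue.Queue()
        self.uploader = cloud_uploader  # anything with a put_object(key, data) method
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, obj_key, payload):
        # Acknowledge at local (SAN) speed; the WAN transfer happens later.
        self.pending.put((obj_key, payload))

    def _drain(self):
        while True:
            obj_key, payload = self.pending.get()
            self.uploader.put_object(obj_key, payload)  # paced by WAN bandwidth
            self.pending.task_done()

    def flush(self):
        # Block until everything staged so far has reached the cloud.
        self.pending.join()
```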

Hopefully this continues to highlight the strengths of the Emulex E3S as a viable integration point for "cloud-ready" commercial and enterprise businesses.

Special thanks to Michael and Tim @ Emulex for their patience in answering my (and others') questions.

  • Dave

    Thanks for taking the time to reply to my questions. It sounds like E3S is pretty much how I imagined it to be. I think it will be one of many “staging” products that moves us from current storage architectures to a much more diversified infrastructure. On the one hand I'm excited by the potential of making storage ubiquitous; on the other I retain my cynical self which wants to ensure the technology hype doesn't get out of control. Please keep us all up to date as things progress!

  • Chris, as always, your viewpoints are admired. Word has it that Emulex is releasing a whitepaper in a few weeks that will go over the architecture in more detail. Stay tuned!

  • ianhf

    Dave,

    As always a good blog – raises a few more 'top of head' questions though :-
1) How does the 'adapter' handle and treat mutable / changing blocks? (e.g. does it write a new block object and retire the old, or something more optimised, e.g. a mini delta block?)

2) What are the scale targets for the adapter (or groupings of adapters) re quantity of objects, capacity abstracted, throughput & cache, etc.?

3) What underlying cloud APIs are used? Are these 'pluggable / changeable'? And can multiple be used at once?

    4) What specific encryption & KMS system is used?

    5) How does the adapter work with authentication, authorisation & accounting / billing attributes that may need to be handled re cloud storage?

6) What policy management framework is used to control the behaviours of the adapters? And how does this relate / compete / cooperate with other policy frameworks (e.g. Atmos's or Symm FAST, etc.)?

7) Does the adapter maintain a checksum history of the cloud objects written, in order to validate that their retrieval matches? If so, where is this data stored and how is it protected / made resilient?

    Cheers

    Ian

  • Ian,

    Great questions re: E3S. Will pass them along and see what Emulex can get answered for you!

    cheers,

    Dave

  • Pingback: Transition to the Cloud — Dave Graham's Weblog
