Moving a Fabric forward: FCoE Adoption and other Questions

by dave on December 9, 2008

It’s always exciting to me to see the chatter around a new standard or new “way” of doing storage. Fibre Channel over Ethernet (FCoE) has certainly had its share of chatter, as well as the inherent criticisms from detractors. There have also been questions about its inevitable overlap with technologies like iSCSI, which can accomplish much of the same functionality without, perhaps, having to reinvent the wheel. What follows in this post is a series of responses to several questions and postings on the web about FCoE. I’ll note up front that my comments are mostly reductive and don’t go as deep as the topic perhaps merits.

To begin, I’d like to take a look at a question posted by Rich Hintz (@rjhintz) in response to my article on FCoE vs. Infiniband from yesterday:

Over time, regardless of *current* technical merit, what can the industry do to reduce transition costs for today’s installs? (of FCoE or other “new” fabrics)

The difficulty for ANY new technology, fabric or not, is going to be initial resistance to implementation (i.e. “what will this do to my current environment?”) and cost.  These two factors obviously aren’t the entire story, but they’re significant, especially given the current economic climate.  To mitigate them, it behooves both the technology partners and the manufacturers to understand how their product fits within existing environments, without broad-brushing the technology as the “salvation of the NOC or DC.”  Let’s break these two factors down a bit more:

  • Resistance to Implementation:  in actuality, resistance is based on any number of factors, not just the technical prowess of an IT staff or resource group.  There is a heightened awareness that the current LAN/SAN fabric isn’t capable of sustaining additional data growth, along with the inevitable management issues that accompany data and infrastructure growth.  Obviously, tools like EMC ControlCenter and SMARTS can handle some of the initial data management and monitoring duties but, from an infrastructure standpoint, there are still fabrics to manage.  So a proper approach to minimizing this resistance is to provide infrastructure “audits” or “reviews” with the customer, to adequately understand their capabilities and capacity for future growth on both the existing infrastructure and the proposed one.  I’ll go ahead and assume (and recommend) that this engagement be done IN PERSON, not over the phone or within a WebEx; it’s critical both from a relational standpoint and for an “in the thick of things” view.  Along with the in-person engagement, appropriate modeling must be done, whether via Visio or another type of graphical demonstration.  Similar in approach to VMware‘s Capacity Planner, a deliverable to the customer denoting ROI/TCO as well as CapEx/OpEx will be critical to the final push for acceptance.  I know this is more sales-focused, but it’s built on a solid foundation of technical examination and validation.
  • Cost.  Talk about the 800-lb gorilla in the room.  With a paucity of vendors in the general market promoting these “new” fabrics and standards, cost control becomes the critical link between resistance to a new product and adoption of it.  Since the R&D behind a given product set is usually part and parcel of a new technology’s higher initial cost, for adoption to occur with any rapidity, either margins on the hardware must be lowered to a threshold that represents a “value” to early adopters or, if prices are to remain higher, there must be a “light at the end of the tunnel” where the price is expected to drop significantly as the technology ages.  A recent example is Cisco’s “Lab Pack” for the Nexus 5020 FCoE switch, which uses a margin-lowering scheme to encourage acceptance.  Combined with the pre-sales architecture and design work mentioned above, this gives the customer a level of comfort that allows the technology to be implemented with significantly fewer issues than if it were adopted without design or validation.

Hopefully the explanation I’ve provided to Rich Hintz’s question rings true. If you have any further thoughts or comments on this subject, please let me know via the comment system.

To look at another facet of FCoE and adoption, there are the questions posed by Scott Lowe (@scott_lowe) on his blog.

Here’s the question: how is FCoE any better than iSCSI?

Scott lays out a very intelligent examination using the Socratic method (asking GOOD questions that require actual neural pulses to be firing in the brain 😉 ), covering the following aspects of the argument:

  1. FCoE is always mentioned hand-in-hand with 10 Gigabit Ethernet. Can’t iSCSI take advantage of 10 Gigabit Ethernet too?
  2. FCoE is almost always mentioned in the same breath as “low latency” and “lossless operation”. Truth be told, it’s not FCoE that’s providing that functionality, it’s CEE (Converged Enhanced Ethernet). Does that mean that FCoE without CEE would suffer from the same “problems” as iSCSI?
  3. If iSCSI was running on a CEE network, wouldn’t it exhibit predictable latencies and lossless operation like FCoE?

A quick answer to these questions wouldn’t get to the real heart of the matter (or the initial question Scott asks), but to put it bluntly: 1.) yes, 2.) perhaps, 3.) not necessarily.  Let me explain.

1.) Since every CNA includes a 10GbE IP or TOE engine (usually in the form of an Intel 10GbE PHY on Gen1 parts; EDIT: thanks to Stu Miniman for clarifying…PHY only), iSCSI isn’t an issue on 10GbE.  Yes, it will run, and yes, it will perform within the norm of what 10GbE provides.

2.) Conceivably, if CEE didn’t provide low latency and lossless delivery, you would be encumbered by Ethernet frame unpacking, FC frame routing and, finally, FC frame unpacking at the storage target.  All of those processes add latency and overhead to an otherwise decently optimized storage fabric.

3.) iSCSI still requires TCP/IP headers to package the SCSI payload, even at 10GbE speeds.  As such, there is still an element of latency for the offload (whether handled by a dedicated TOE ASIC or the general-purpose CPU) before command processing and data placement.  Now, while the 10GbE protocol stack lowers the latency threshold that was present at Gigabit speeds, it still doesn’t achieve the same latency as FCoE or, better yet, InfiniBand. (OK, InfiniBand is the halo latency product here; it’s thrown in for reference.) Further, depending on which study you’ve looked at (and these are mostly manufacturer-driven), 8Gb/s Fibre Channel can offer the same level of latency while obviously giving up a small amount of bandwidth to 10GbE iSCSI.
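To put some rough wire-level numbers on the encapsulation discussion above, here’s an illustrative Python sketch. The per-header byte counts reflect my reading of the FC-BB-5 FCoE encapsulation and the Ethernet/IPv4/TCP/iSCSI specs, so treat the exact figures as an assumption; and remember that wire overhead is only part of the story, since the offload/processing cost discussed above is the bigger latency factor.

```python
# Back-of-the-envelope per-frame encapsulation overhead (in bytes) for a
# SCSI payload carried over FCoE vs. iSCSI. Counts wire overhead only;
# it deliberately ignores TCP/IP protocol processing, which is where most
# of iSCSI's added latency actually comes from.

def fcoe_overhead_bytes() -> int:
    # Ethernet header (14) + FCoE header incl. SOF (14) + FC header (24)
    # + FC CRC (4) + FCoE trailer incl. EOF (4) + Ethernet FCS (4)
    return 14 + 14 + 24 + 4 + 4 + 4

def iscsi_overhead_bytes() -> int:
    # Ethernet header (14) + IPv4 header (20) + TCP header (20)
    # + iSCSI Basic Header Segment (48) + Ethernet FCS (4)
    return 14 + 20 + 20 + 48 + 4

if __name__ == "__main__":
    payload = 2048  # roughly a full FC data frame's worth of SCSI data
    for name, oh in (("FCoE", fcoe_overhead_bytes()),
                     ("iSCSI", iscsi_overhead_bytes())):
        pct = 100 * oh / (payload + oh)
        print(f"{name:5s}: {oh:3d} overhead bytes ({pct:.1f}% of wire bytes)")
```

Note that jumbo frames and large negotiated iSCSI PDU sizes amortize the TCP/IP and BHS overhead across more payload bytes, so the gap narrows in tuned environments; the per-packet processing cost is the part that doesn’t amortize away.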

There are a ton of really good posts in the comment section at Scott’s site on this particular topic.  I’d recommend taking a look at them for more details.

Last, but certainly not least, I’ve been asked to review the objections to FCoE noted here by Julian Satran. His objections center on the following:

It ignores the networking layering practice, build an application protocol directly above a link and thus limits scaling, mandates elements at the link layer and application layer that make applications more expensive and leaves aside the whole “ecosystem” that accompanies TCP/IP (and not Ethernet).

I don’t have the time (realistically) to unpack all of this but, suffice it to say, some of these points carry an element of truth.  I’ve covered some of these objections elsewhere, but one could reasonably argue against every single point Julian makes.  For example, Julian’s last point about “leaving aside the whole ‘ecosystem’ that accompanies TCP/IP” sounds almost fatalistic.  With converged fabrics, however, one can manage both the SAN and LAN fabrics from a common interface.  Ideally, a data center driven by DCE (and I’m not trying to be a cheerleader here) would limit the management issues that come with multiple fabrics.  Oh, and that would influence the “applications being more expensive” argument as well.

Thoughts?  Comments?


Dave Graham
