Fibre Channel over Ethernet or InfiniBand: A Response

by dave on December 8, 2008


[Image: InfiniBand ports on a Voltaire ISR-6000 switch (via Wikipedia)]

Over at Cisco, there is a post from April 2008 by Dante Malagrino on the FCoE vs. IB argument. What follows is MY perspective (not my employer's) on the argument he puts forth.

a.) Historically, both IB and FC have been more difficult to manage than competing IP-based solutions (though I'd argue that iSCSI IQNs are as much of a pain as node addresses and WWNs).  With the advent of truly GUI-driven switching solutions, a LOT of the legwork has been reduced for both FC and IB.  Matter of fact, a true novice to either protocol can get a solution set up and running within an hour or two (cabling and hardware setup included).

b.) FCoE does much to assuage some of the performance concerns associated with current-generation GigE-based iSCSI by bringing 10GbE into the equation and quashing the latency issues that were terrible with GigE.  However, it STILL pales in comparison to IB from a latency and bandwidth perspective.  Coupled with the added cost (and the hardware changes required), I see more of a case for using converged IB-to-FC solutions like those from Voltaire, Xsigo, or QLogic for companies that are currently using IB for node-to-node clustering.  If they're refreshing their NOC, then FCoE makes some level of sense, but you're still going to have to provide legacy support (albeit limited to 8 total FC ports for legacy attach). Then there's the added overhead of managing 3 separate fabrics: FCoE from host to Nexus, FC to legacy MDS units, and 10GbE to (hopefully) a Catalyst 6500-series chassis for IP.  Ouch!  Couple that with the absolutely abysmal power requirements per CNA (24 W for 1st gen) and the incremental cost: buy now, and you're going to want to move to Gen2 pretty quickly.  (A rough back-of-envelope sketch of the bandwidth, latency, and power math follows after point c.)

c.) Regarding the bandwidth vs. latency argument: sure, IB can work both ways. IPoIB proves that inherently, and companies like Xsigo, Voltaire, and QLogic have proven that you can have your cake and eat it too when it comes to fabric conversion.  SDR/DDR/QDR InfiniBand has its place in HPC, to be sure, but even moving it out from there, you can realize lower CapEx/OpEx by using HCAs and these converged routers than by overhauling to FCoE and CNAs (see the cost sketch below).
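
To put some rough numbers behind points b and c, here's a quick back-of-envelope sketch. The link rates are the nominal 4x IB and 10GbE figures (IB's 8b/10b coding is why 10/20/40 Gb/s of signaling yields 8/16/32 Gb/s of data); the latency column and the 40-host count are assumptions for illustration, not benchmarks.

```python
# Rough numbers behind the bandwidth/latency/power argument in points b and c.
# IB 4x links use 8b/10b coding, so 10/20/40 Gb/s signaling yields 8/16/32 Gb/s of data;
# the latency column is a ballpark assumption for illustration, not a benchmark.

links = {
    # name           usable Gb/s   approx. end-to-end latency in microseconds (assumed)
    "10GbE / FCoE": (10.0, 10.0),
    "IB 4x SDR":    ( 8.0,  2.0),
    "IB 4x DDR":    (16.0,  2.0),
    "IB 4x QDR":    (32.0,  1.5),
}

for name, (gbps, latency_us) in links.items():
    print(f"{name:13s}  ~{gbps:4.0f} Gb/s usable   ~{latency_us:4.1f} us")

# The first-gen CNA power figure cited above (~24 W per adapter) adds up quickly:
cna_watts, hosts = 24, 40   # hypothetical 40-host deployment
print(f"\n{hosts} CNAs draw roughly {cna_watts * hosts / 1000:.1f} kW before cooling overhead")
```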
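
And on the CapEx side of point c, a minimal per-host cost sketch. Every price in it is a placeholder I made up for illustration, not a vendor figure; swap in real quotes before drawing any conclusions.

```python
# Minimal per-host CapEx sketch: "keep IB and add a gateway" vs. "rip and replace with FCoE".
# Every price here is a placeholder for illustration -- plug in real quotes before concluding anything.

def per_host_capex(adapter, switch_port, gateway_share=0.0):
    """Adapter + one switch port + this host's share of any FC gateway/director cost."""
    return adapter + switch_port + gateway_share

hosts = 32
ib_path   = per_host_capex(adapter=600,  switch_port=300, gateway_share=20000 / hosts)  # HCA + IB port + gateway share
fcoe_path = per_host_capex(adapter=1200, switch_port=800)                               # 1st-gen CNA + Nexus-class port

print(f"IB HCAs + converged gateway : ~${ib_path:,.0f} per host")
print(f"FCoE CNAs + switch refresh  : ~${fcoe_path:,.0f} per host")
```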

Anyhow, those are my thoughts. 😉  What do YOU think about this type of positioning?

cheers,

Dave Graham

  • A couple of thoughts:

    Re IB performance: Duh. However, CEE at 10Gb, and especially at 40 & 100Gb, should change that. The economics of that will be interesting to see.

    Re the overhead of 3 separate fabrics: IMHO, if you're already dealing with 1/10GbE & FC, then it isn't particularly onerous to collapse the two into FCoE at the access layer. You're still managing GbE & FC. If FCoE works the way the standards in development say it should, then the added management overhead of 1 (not 3) additional fabric that works in ways similar to the existing ones isn't going to be significantly more. Of course, it won't work the way it should; that's where you'll suffer.

    Re moving IB out from HPC into the rest of the network: now that would be interesting.

  • Mark

    The company which had the most complete IB-based multi-fabric I/O (MFIO) solution was … Cisco! The SFS-3012 was Cisco's InfiniBand based MFIO solution. Cisco's InfiniBand products came about via its acquisition of Topspin, where the SFS-3012 was formerly known as the Topspin 360.

    While the solution worked, Cisco decided FCoE was a better way to unify I/O.

    The problem? An additional network to manage: IB. And new switches to learn how to manage.

    The problem? New I/O protocols to support, such as IP over InfiniBand (IPoIB), SCSI RDMA Protocol (SRP), and iSCSI over RDMA (iSER).

    The problem? New driver stacks to support those protocols. Drivers for Windows, drivers for Red Hat, drivers for SUSE, drivers for Solaris. By far, maintaining drivers is the biggest ongoing engineering effort for any InfiniBand MFIO provider. For this reason Mellanox, the provider of all IB switching silicon and most of the host channel adapter silicon, is exploring FCoIB, which encapsulates FC frames onto IB rather than converting FC to SRP or iSER.

    The problem? New upper-layer drivers to support the new base I/O drivers for things like multipathing, and failover. And making things like IP multicast work over IPoIB.

    The problem? New certifications for drivers, from upper-layer software like clustering to storage systems like EMC's.

    FCoE eliminates this mishmash.

    The connection from the host (CNA) to the access switch (Nexus) is Ethernet! There is no new network to manage.

    And the protocol used for storage transport on FCoE is Fibre Channel! There are no new drivers required.

    It is Emulex or QLogic FC silicon, and the same Emulex and QLogic drivers work just as before.

    It is IP over Ethernet, and multicast just works.

    It is Fibre Channel, and unmodified multipathing software (e.g., EMC PowerPath) just works. From the disk drive to the host driver, there is no change to the underlying Fibre Channel frame.

    The FC and Ethernet interfaces are unmodified to the host operating system. That means clustering, etc., just works.

    There are still some certifications required, but they are just that: certifications, no different than for a new HBA or a new FC switch. Not the kind of work required to support new protocols and new drivers.

    As someone who has configured and set up both InfiniBand based multi-fabric I/O and Nexus based FCoE, I can tell you FCoE is easily 10X easier and faster. Why? Because there are no new protocols to configure. No new drivers to worry about. And no new networks or switches to learn how to manage.
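
To make the "no change to the underlying Fibre Channel frame" point concrete, here's a minimal sketch of FCoE encapsulation. The header and trailer fields are simplified stubs rather than a spec-complete FC-BB-5 encoder, and the MACs and payload are made up for illustration; the point is simply that the FC frame bytes ride through untouched.

```python
# A minimal sketch of the "no change to the underlying Fibre Channel frame" point:
# FCoE wraps the FC frame in an Ethernet frame (Ethertype 0x8906), while an IB gateway
# must translate it into SRP or iSER.  Header/trailer fields are simplified stubs here,
# not a spec-complete FC-BB-5 encoder.

import struct

FCOE_ETHERTYPE = 0x8906

def fcoe_encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap an FC frame for transport over Ethernet; the FC bytes are carried verbatim."""
    eth_header  = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_header = bytes(14)            # stub: version/reserved/SOF fields
    trailer     = bytes(4)             # stub: EOF/reserved fields
    return eth_header + fcoe_header + fc_frame + trailer

fc_frame = bytes.fromhex("22000001" * 6)                 # stand-in payload, not a real FC frame
wire = fcoe_encapsulate(b"\x0e\xfc\x00\x00\x00\x01",     # example MACs for illustration only
                        b"\x00\x1b\x21\x00\x00\x01",
                        fc_frame)

# The original FC frame rides through untouched, which is why existing FC drivers,
# multipathing, and tools keep working end to end.
assert fc_frame in wire
print(f"{len(wire)} bytes on the wire; FC frame untouched inside")
```

An IB gateway, by contrast, has to terminate the FC frame and re-originate it as SRP or iSER, which is exactly where the extra driver and certification work described above comes from.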
