Why wouldn’t the following work? (Future Storage System: Part 1)

by dave on October 7, 2008

So, I’ve been toying around with this idea for some time. Essentially, I’ve tried to understand the basic “Storage Processor” limitation of current storage systems and propose an admittedly simplistic design to get around some of the difficulties. The biggest hurdle, in my mind, is to have cache coherency, low-latency memory access to other nodes in a “cluster,” and a communications “bus” between nodes that is extensible (or at least grows bandwidth as more devices join the signal chain). With that problem in mind, take a look at the image below.

A case for Hypertransport connected nodes...

Reviewing the image, you can see that I’ve essentially “glued” the two nodes together using either HyperTransport 1.x or 3.x spec signaling. With this model, you COULD feasibly scale to n-way nodes but, for reasons of simplicity, I’ve kept it “pure.”  My preference would be the HT3 spec, as it allows for ganging/un-ganging, bit-width changes, and lower system power draw, but HT1.x would be similarly effective (though not as well aligned with future technology).  I’ve obviously not listed processor specs as such, but given that HT is really only available on AMD-based platforms, you can take a guess as to where I’m heading. 😉
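To put some rough numbers on why the link width and ganging options matter, here’s a quick back-of-the-envelope sketch (plain Python; the clock and width figures are the published HT spec maximums, not measurements from any particular board, and the function name is just mine):

```python
# Back-of-the-envelope HyperTransport link bandwidth, per direction.
# HT is double data rate, so transfers/sec = 2 * link clock;
# bytes/sec = transfers/sec * link width in bytes.
def ht_bandwidth_gb_s(clock_ghz, width_bits):
    """Peak one-way bandwidth of an HT link in GB/s."""
    transfers_per_s = 2 * clock_ghz            # GT/s (DDR)
    return transfers_per_s * (width_bits / 8)  # GB/s

print(ht_bandwidth_gb_s(0.8, 16))   # HT1.x, 16-bit link:        3.2 GB/s each way
print(ht_bandwidth_gb_s(2.6, 16))   # HT3.0, 16-bit link:       10.4 GB/s each way
print(ht_bandwidth_gb_s(2.6, 32))   # HT3.0, ganged 32-bit link: 20.8 GB/s each way
```

These are peak per-direction figures, but they show the gap between HT1.x and a ganged HT3 link for the node-to-node path.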

Secondary to the actual computing side of things, I’ve looked at basic functionality for PCIe expansion.  Recognizing that PCIe (PCI Express in long form) is going to be the essential bus for the foreseeable future, I made provision for three x8 slots for I/O.  There are some natural assumptions as part of the design.  First, x8 (physical slot and electrical signaling) is going to handle the majority of bandwidth that can be thrown at it.  High-bandwidth, low-latency interconnects like InfiniBand (SDR/DDR/QDR from Mellanox, QLogic, et al.) and/or FCoE (Fibre Channel over Ethernet; from Brocade, Emulex, QLogic, et al.) should be the natural beneficiaries of this type of technology.  Secondly, since the topology is based on n-way nodes (>= 2), the PCIe lane count can be kept rather simple (currently, only 24 lanes are required for I/O; additional lanes would be required by southbridge/northbridge topologies, etc.).  Keeping this expansion simple reduces the power draw of the components and allows for some level of node portability.  Physically, it also reduces the size of the nodes, as the three slots could be engineered into a smaller footprint.
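The same kind of quick arithmetic works for the I/O side (again plain Python; the per-lane rates are the standard PCIe 1.x/2.0 figures after 8b/10b encoding, and the three-slot, 24-lane layout is simply the one described above):

```python
# Quick check that three x8 slots cover typical Infiniband / FCoE adapter needs.
LANES_PER_SLOT = 8
SLOTS = 3
TOTAL_LANES = LANES_PER_SLOT * SLOTS          # the 24 lanes mentioned above

# Per-lane throughput per direction, after 8b/10b encoding overhead.
GB_S_PER_LANE = {"PCIe 1.x": 0.25, "PCIe 2.0": 0.5}

for gen, per_lane in GB_S_PER_LANE.items():
    per_slot = LANES_PER_SLOT * per_lane
    print(f"{gen}: {per_slot:.1f} GB/s per x8 slot, "
          f"{SLOTS * per_slot:.1f} GB/s across {TOTAL_LANES} lanes (each direction)")
```

For reference, a DDR InfiniBand 4x port moves roughly 2 GB/s of payload and a 10 GbE FCoE port about 1.25 GB/s, so a gen-1 x8 slot covers a single port while a gen-2 x8 slot leaves headroom for dual-port or QDR cards.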

If you see any significant flaws or just want to give feedback on the conceptual design, let me know.  I’m learning as I go along. 😉


