In the previous two articles on the Future Storage System (FSS), I took a general look at a basic storage system architecture (Part 1) and then went a bit deeper into some of the more interesting bits of that system from a platform standpoint (Part 2). In this article, I want to dig into how I envision nodes serving as building blocks for additional capabilities and processing directives. I will be referencing the image below throughout this article.

Hypertransport Node Expansion (detailed)




Thoughts #1

by dave on February 14, 2008

Being a slave to technology really isn’t as bad as everyone thinks. For example, I’m currently sitting in a car dealership, waiting for my oil to be changed and, well, banging away at the tiny keys on this Blackberry 8800. While I wait, I’m able to take some time to review some of the stories that I feel have had some measure of impact in the storage world.

First off, XIV going to IBM (linked here, here, here, and here). I never knew Moshe Yanai, and honestly, the particular markets I work in don’t really benefit from the Symmetrix or its architecture. So, fundamentally, I couldn’t care less about IBM taking on the Symm in the Enterprise (versus some of the other folks out there blowing hot air about it…). That being said, in reviewing the Nextra white papers and the comments/blogs of folks who are more intelligent than I am, I can see where Nextra could have a trickle-down impact on other discrete EMC (or competitor) products, namely nearline archive or A-level Commercial accounts.

The devil is in the details, though. For example, Centera has always operated on the RAIN (Redundant Array of Independent Nodes) principle, and consequently its architecture has a very long product lifecycle. Changes tend to happen at the evolutionary level (i.e. shifting from Netburst P4 processors to Sossaman Xeons and the accompanying board logic) rather than the revolutionary, and it is literally the software/OS changes that have the most impact on performance. At the end of the day, the differentiation is at the software level (and API integration, lest we forget :)), not the hardware. Where Nextra seems to throw itself into the ring is in the fundamental flexibility of its hardware. Don’t need quite the same processing threshold as Company X but want more storage? Use fewer compute nodes and more storage nodes. Need more ingest power? Add connectivity nodes. Etc., etc. This blade- or node-based architecture allows for “hot” config changes when needed and appears to allow for fairly linear performance/storage scaling. Beyond the “hot” expansion (and not really having any clean insight into the software layer running on Nextra), one has to assume that, at a minimum, there is a custom clustering software package floating above the hardware. Contrasting this with Centera, then, what really is the difference?
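Before I answer that, here’s a purely illustrative sketch of the mix-and-match node idea. The node roles, counts, and per-node capacity/ingest figures below are my own assumptions for the sake of the example, not anything pulled from Nextra or Centera documentation.

```python
# Purely illustrative sketch of "mix and match node roles" in a blade/node array.
# All node types and per-node numbers are assumptions made up for this example.

from dataclasses import dataclass

@dataclass(frozen=True)
class NodeType:
    name: str
    raw_tb: float        # raw capacity one node of this type contributes (TB)
    ingest_mbps: float   # front-end ingest bandwidth it contributes (MB/s)

# Hypothetical roles: storage-heavy, connectivity-heavy, compute-only.
STORAGE = NodeType("storage", raw_tb=4.0, ingest_mbps=0.0)
ACCESS  = NodeType("connectivity", raw_tb=0.0, ingest_mbps=200.0)
COMPUTE = NodeType("compute", raw_tb=0.0, ingest_mbps=0.0)

def cluster_totals(mix: dict[NodeType, int]) -> tuple[float, float]:
    """Sum raw capacity and ingest bandwidth for a given mix of nodes."""
    raw = sum(nt.raw_tb * count for nt, count in mix.items())
    ingest = sum(nt.ingest_mbps * count for nt, count in mix.items())
    return raw, ingest

# Same total node count, two different biases: capacity vs. ingest.
capacity_biased = {STORAGE: 10, ACCESS: 2, COMPUTE: 2}
ingest_biased   = {STORAGE: 6, ACCESS: 6, COMPUTE: 2}

for label, mix in (("capacity-biased", capacity_biased),
                   ("ingest-biased", ingest_biased)):
    raw, ingest = cluster_totals(mix)
    print(f"{label}: {raw:.0f} TB raw, {ingest:.0f} MB/s ingest")
```

The point is simply that, in a role-based blade/node model, you re-balance the array by changing the mix of nodes rather than by swapping out the whole platform.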

A.) Drive count vs. processing/connectivity count. With Nextra, you can bias an array towards storage or connectivity. With Centera, each node you add has the potential to be both storage AND connectivity, but it requires a reconfig at the master cluster level.
B.) Capacity. Nextra is designed to scale to multiple PBs by introducing 1TB drives as the baseline storage config. Centera, while currently shipping with 750GB drive configs, will obviously roadmap to 1TB drives for 4TB per node (16TB per 4-node cluster; that’s raw storage prior to the penalties for CentraStar, CPP or CPM, etc.; see the quick math after this list). Again, Centera is designed from a clustering standpoint, so a multi-petabyte cluster is feasible, provided the infrastructure is there.
C.) Power. No contest, really. Centera is designed to function within a very strict “green” envelope and, from the hardware perspective, is very “performance per watt” oriented. (Granted, I believe they could eke more performance out of a low-power Athlon64 processor while staying within the same thermal/power guidelines… but I digress.) Nextra, again by design, fits into an enterprise-data-center-grade power threshold and, consequently, even with SATA drives, will have much higher power consumption and overhead. If they use spin-down on the disks, then perhaps they can achieve better ratios, but if a customer’s usage profile doesn’t fit that behavior, the advantage is gone.
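To put some rough numbers to point B above, here’s the back-of-the-envelope math. The four-drives-per-node layout matches the 4TB-per-node figure I quoted, but the protection overheads (CPM treated as straight mirroring, CPP as an assumed 6+1 parity scheme) are illustrative assumptions on my part, not CentraStar’s actual accounting.

```python
# Back-of-the-envelope raw capacity math for point B above.
# Protection overheads below are assumptions for illustration only.

DRIVES_PER_NODE = 4
NODES_PER_CLUSTER = 4

def raw_capacity_tb(drive_tb: float) -> tuple[float, float]:
    """Raw TB per node and per 4-node cluster for a given drive size."""
    per_node = DRIVES_PER_NODE * drive_tb
    return per_node, per_node * NODES_PER_CLUSTER

for drive_tb in (0.75, 1.0):   # today's 750GB drives vs. roadmap 1TB drives
    node_tb, cluster_tb = raw_capacity_tb(drive_tb)
    cpm_usable = cluster_tb / 2          # assumed mirrored copies -> roughly half usable
    cpp_usable = cluster_tb * (6 / 7)    # assumed 6+1 parity -> roughly 86% usable
    print(f"{drive_tb:.2f} TB drives: {node_tb:.0f} TB/node, "
          f"{cluster_tb:.0f} TB raw per 4-node cluster, "
          f"~{cpm_usable:.0f} TB usable (CPM), ~{cpp_usable:.1f} TB usable (CPP)")
```

With 1TB drives that works out to the 4TB per node and 16TB raw per 4-node cluster mentioned above, before any software or protection penalties.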

Anyhow, I’ll probably revise this list as we move along here, but…I just wanted this to be food for thought.



