Now that SSD technology is at a point where it seems to be a desired component of nearly any array, regardless of its position in the stack, we seem to be hitting a “speeds and feeds” bump in the technology marketing road. Not that this is anything new to the industry as a whole; if you recall, the same sort of “my number is bigger than your number” chatter occurred with processors back in the ’90s. Back then, a processor was typically distilled to a simple comparison of megahertz (and later gigahertz) to figure out which one was better, while most of the advanced features, such as the MMX and SSE extensions, were glossed over. I see benchmarks come out for arrays that utilize SSD (be that NAND-based flash memory or DRAM), promising hundreds of thousands to millions of IOPS. Of course, a lot of questions float around my head as to how those IOPS were calculated, as even my home lab SSDs can put out decent numbers with an 8 KB, 50 percent read, 100 percent random workload. However, that isn’t really the thrust of my discussion.
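As a rough illustration of why the workload spec matters, here’s a back-of-the-envelope sketch (in Python, with purely hypothetical drive numbers) of how the same raw throughput translates into wildly different IOPS figures depending on the I/O size a vendor chooses to benchmark with:

```python
# Back-of-the-envelope: the same sustained throughput yields very different
# "IOPS" numbers depending on the I/O size used in the benchmark.
def iops(throughput_mb_s: float, io_size_kb: float) -> int:
    """IOPS = bytes moved per second / bytes per I/O."""
    return int((throughput_mb_s * 1024) / io_size_kb)

# Hypothetical drive sustaining 400 MB/s of random I/O:
print(iops(400, 8))    # 8 KB I/Os  -> 51200 IOPS
print(iops(400, 0.5))  # 512 B I/Os -> 819200 IOPS from the same drive
```

Same drive, sixteen times the headline number, which is why a quoted IOPS figure means little without the block size and read/write mix behind it.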
I’m a big fan of using SSD technology in a storage array, as there really hasn’t been much in the way of speed improvement on spinning disk over the past few decades. But are we really that focused on just the IOPS? Sure, IOPS matter for workloads, especially when the workloads are virtual and highly varied, but a race car that can go 500 MPH is rather limited if it only goes in straight lines. For an array that was built with SSD in mind from the start, the IOPS should simply be there for me to use. I can check that box and move on to the other features that make an array more valuable from an architecture, design, and operations perspective.
Here are some other areas I tend to focus on when looking at arrays that harness SSD.
Quality of Service
It is important to know that a mixed workload of various virtual machines will be able to coexist on the array. SSD delivers a great number of IOPS, which is certainly part of the equation, but it is equally vital to ensure that each workload is also served with consistently low latency. Additionally, as volumes of traffic ebb and flow, the array has to deliver consistent performance where it is needed. And to cap it off, all of these QoS needs should be met without manually tuning the array or requiring any administrative touch.
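The gap between “lots of IOPS” and “consistent performance” shows up in tail latency. Here is a minimal sketch, using made-up latency samples and a simple nearest-rank percentile, of how an average can look healthy while the tail tells the real QoS story:

```python
import statistics

def p99(latencies_ms):
    """Nearest-rank 99th-percentile latency (a simple sketch, not a full estimator)."""
    ordered = sorted(latencies_ms)
    return ordered[max(0, int(len(ordered) * 0.99) - 1)]

# Hypothetical samples: mostly fast I/Os, plus 2% noisy-neighbor spikes.
samples = [1.0] * 980 + [50.0] * 20
print(statistics.mean(samples))  # 1.98 ms -- the average looks fine
print(p99(samples))              # 50.0 ms -- the tail is what the VMs actually feel
```

An array with real QoS keeps that 99th percentile flat as neighboring workloads ebb and flow, not just the average.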
Scalable Node Architecture
I would venture to wager that a scalable node layout is the one that best complements the virtual data center. The idea of a single, monolithic set of controllers powering all of the storage is on its way out. From a maintenance perspective, it’s a lot easier to migrate mission-critical workloads to another node when you can take no chance of an issue occurring during maintenance (such as upgrading firmware, swapping out hardware, or physically migrating data center-related services). And from a scalability perspective, you never have to worry about headroom (the network and processing power of the array controller) in your total design, as each node of the array adds headroom.
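To make the headroom point concrete, a quick sketch (with hypothetical per-node numbers) of how usable capacity grows with node count when you hold back one node’s worth of headroom for maintenance or failover:

```python
def usable_iops(nodes: int, iops_per_node: int, spare_nodes: int = 1) -> int:
    """Usable IOPS when we reserve spare_nodes' worth of capacity for maintenance/failover."""
    return max(0, nodes - spare_nodes) * iops_per_node

# A monolithic dual-controller pair is a fixed ceiling; scale-out keeps growing
# as nodes are added, even with the maintenance reserve held back.
print(usable_iops(4, 100_000))  # 300000
print(usable_iops(8, 100_000))  # 700000
```

The design choice here is that the maintenance reserve scales with the cluster rather than being a second idle controller you bought up front.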
Integration and Management Control
The other important piece is how well the array integrates with the hypervisor and/or the workloads it is running. I’ve written various articles and threads on the complexity of virtualization as it relates to both monitoring and troubleshooting. If the array isn’t helping you solve problems and gain visibility into your workloads, it’s only adding complexity and operational expense. Additionally, nodes should be easy to manage and control, as virtualized workloads aren’t all that predictable by definition (especially once we throw that fun buzzword “cloud” in there). Designing an array for static workloads doesn’t work; the array should adapt to the workloads for you without requiring a high-touch level of user control.