
Struggling with Budget SAN Speed

by karlkatzke on September 14th, 2009

It’s Monday morning. Your boss strolls into your office. You just finished with the trouble tickets from the weekend, and this is his favorite time to ruin your entire week. He says, “I have a project for you. I need a cluster with a primary and backup SAN that is going to store about 8TB of infrequently accessed images and it will also need to host virtual machines and an Oracle database. You’ll have to fit a budget for two sites in there, but the second site is a cold, hourly-synch backup. And it has to scale. And we’d prefer if you used a vendor solution and didn’t homebrew things.”

Talk about a list of contradictory feature requests! You’ve got a limited budget, and it’s hard to squeak 8 usable TB out of your average entry-level 12-disk array (e.g. the HP MSA60 or Sun J4200; Dell’s 15-disk MD1000 and AC&NC’s 16-disk 516 series are notable exceptions) once you factor in a double-parity stripe and a couple of hot spares.
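To see why 8 usable TB is such a tight fit, here is a rough back-of-the-envelope sketch. It assumes 1TB disks, a single double-parity group, and two hot spares (my reading of "a couple"); the function names are made up for illustration, and it ignores filesystem overhead entirely.

```python
def usable_tb(total_disks, disk_tb=1.0, parity_disks=2, hot_spares=2):
    """Usable capacity (vendor TB) of one parity group plus hot spares.

    Assumes every disk is the same size and all non-spare disks sit in
    a single double-parity stripe. Filesystem overhead is not counted.
    """
    data_disks = total_disks - parity_disks - hot_spares
    if data_disks < 1:
        raise ValueError("not enough disks for this layout")
    return data_disks * disk_tb


def tb_to_tib(tb):
    """Convert vendor terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


if __name__ == "__main__":
    for bays in (12, 15, 16):
        tb = usable_tb(bays)
        print(f"{bays} bays: {tb:.1f} TB usable, about {tb_to_tib(tb):.2f} TiB")
```

Under those assumptions a 12-bay array lands on exactly 8 TB with nothing to spare, and less than that once you count in binary units, which is why the 15- and 16-bay boxes stand out.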

In most cases, you’ll do just fine. But what happens when the load on the infrequently accessed (slow) portion of the array is ‘peaky’? During one of those peaks, you’ll max out a gigabit-per-second link coming in, and depending on what you’ve got driving the array, that might be your entire bandwidth budget. What’s Oracle, which is also running in one of the VMs, going to do then? It doesn’t like slow access to its log files, which means it’ll be consuming RAM and swapping heavily on its VM, which means the VM image will also be trying to write to disk. It’s a triple whammy until something gives: either the load decreases or something fails.

The obvious choice is ZFS and Solaris. And the obvious choice for hardware is also Sun; you get four NICs by default on Sun hardware, with the management and ILOM ports out of band on their own interfaces. (Side note: When you have 7 Cat5e cables, a KVM dongle, and 2 power cords running to a 1U chassis, yes, you really do want the cable management arm.) ZFS support with Sun is excellent. Their storage products are also excellent.

By the time you get done buying storage, you’re through most of your budget: those 1TB disks aren’t cheap, and neither are the arrays themselves. Your maximum speed across the SAS backplane for the J4200 or J4400 series is going to be 3 or 6 Gb/s, and your input is only going to be 1 Gb/s actual even with bonded ports, so raw bandwidth to the array isn’t the bottleneck. But you’d probably rather not skip all over the place on the array as you try to write 3,000 10GB (compressed) images and then try to write to the Oracle logs. The question still remains: how do you squeeze in a budget for some faster storage for the VM images and database storage while still paying for the bulk storage you need and some room to grow?
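For a feel of the numbers, here is a quick sketch of the arithmetic. It uses raw line rates only (no protocol or encoding overhead, so real throughput will be lower), decimal units, and helper names I made up for the example.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert a raw line rate in Gb/s to MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


def transfer_seconds(size_gb, link_gbit_per_s):
    """Seconds to move size_gb gigabytes over a link at its raw line rate."""
    return size_gb * 1000 / gbit_to_mbyte_per_s(link_gbit_per_s)


if __name__ == "__main__":
    # Network ingest versus the two SAS link speeds mentioned above.
    for label, rate in (("1 Gb/s network", 1), ("3 Gb/s SAS", 3), ("6 Gb/s SAS", 6)):
        print(f"{label}: {gbit_to_mbyte_per_s(rate):.0f} MB/s raw, "
              f"{transfer_seconds(10, rate):.1f} s per 10GB image")
```

Even at the slower SAS speed the array side has a few times the headroom of the network side, so the pain comes from mixing bulk sequential writes with latency-sensitive log writes on the same spindles, not from the pipe itself.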

Answer: What are you using to drive that array? Buy a bigger chassis for that server, and put the fast disks inboard. The 2.5″ 10k SAS drives aren’t hideously expensive, and the additional grand for a larger chassis beats the hell out of buying an entire extra J4200. Note that you can’t mix the 10k SAS disks in an array with the 7200 RPM SATA disks… with any vendor that I know of, at least. But inboard on the system’s backplane, you can run SAS and then run SATA on the outside.

Bonus points: This may not last, and it might just be the academic pricing that we get at work, but right now I can buy a half-full J4400 (24 bays) for less than I can buy a fully loaded J4200. Guess which we’re getting? It’ll be half full of blanks, but those are free. As our 8TB grows over the next year, we’re going to just slot additional disks in. ZFS’s ability to add disks to pools relatively painlessly has made this a realistic goal. ZFS on Solaris also comes with a web-based management console (which we’ll restrict to our private network, so people will have to VPN in, but that’s trivial…), which makes management’s acceptance of the technology dead simple.

Also, don’t forget that if you can acquire some SSDs, you’ll be able to drive your storage even faster by moving the ZFS intent log (ZIL) writes onto a separate, much faster SSD log device. SSDs have a limited write lifespan, though, and a log device is a write-intensive application, so consider whether it’s really worth it to you and make sure that you plan for their wear and replacement.
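Here is a minimal sketch of what the pool growth and the SSD log device described above look like as `zpool` commands. It only builds and prints the command lines; it does not run them. The pool name `tank` and every device name are hypothetical placeholders, and the helper functions are mine, not part of any ZFS tooling. One caveat worth stating: growing a pool means adding another whole vdev (here a second raidz2 group) rather than widening the existing one, so buy disks in batches.

```python
def zpool_create(pool, data_disks, spares):
    """Command to create a pool with one raidz2 group plus hot spares."""
    return ["zpool", "create", pool, "raidz2", *data_disks, "spare", *spares]


def zpool_grow(pool, new_disks):
    """Command to grow the pool by adding a second raidz2 group."""
    return ["zpool", "add", pool, "raidz2", *new_disks]


def zpool_add_log(pool, ssd):
    """Command to attach an SSD as a separate intent log device."""
    return ["zpool", "add", pool, "log", ssd]


if __name__ == "__main__":
    # Hypothetical device names; substitute whatever `format` shows on your box.
    first_batch = [f"c1t{i}d0" for i in range(10)]
    spares = ["c1t10d0", "c1t11d0"]
    second_batch = [f"c1t{i}d0" for i in range(12, 22)]

    for cmd in (
        zpool_create("tank", first_batch, spares),
        zpool_grow("tank", second_batch),
        zpool_add_log("tank", "c2t0d0"),
    ):
        print(" ".join(cmd))
```

Review the printed commands by hand before running anything: adding a vdev to a pool is not something you can easily undo.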

Our total server budget for this project (a compute- and storage-intensive academic project where data loss is not acceptable) was only $70k. We managed to squeeze an insanely fast cluster out of a paltry budget.

