Thinking about Blades? Downsides to consider…

by karlkatzke on February 20th, 2012

Stephen Foskett is running a series about server blades — and as usual for someone who gets a lot of trial equipment to review, he’s pretty bullish on them.

After a few years with blades at my current company, I’m not. Unless you need the density that they offer, they’re probably not worth your time and money — and if you can afford them, you can probably afford to lease another rack or a larger cage.

While Stephen does an excellent job of covering the high points of blades, he skips or glosses over the downsides. There are three: you reintroduce several single points of failure in the form of the backplane and the modules plugged into the chassis, you take on extra management overhead for the switches attached to the backplane, and you add heat risk because the components are miniaturized and densely packed.

Think this is doom-and-gloom? We’ve got a bunch of hardware sitting in a pile that says it isn’t. One of our IBM BladeCenter chassis has only one slot that will work in it — the rest of the slots give you strange PCI bus errors, KVM won’t work, or the management module will fail to connect to the hardware that’s installed properly. Since the blade’s backplane and management modules are a part of the chassis, IBM declined to replace it under our parts-and-labor warranty agreement — they said that we’d have to replace the entire chassis at our cost since the chassis is not a Field Replaceable Unit.

Troubleshooting problems with parts or upgrading parts on individual blades is a chore. Again, many of the parts aren’t technically Field Replaceable Units (and this includes parts like on-blade flash disks), so you’ll need to get out your oddball collection of Torx heads. It’s like laptop repair, with fine ribbons and cooling ducts stitching together byzantine layers of circuit boards. And here’s another negative: even if you cool the systems appropriately and your cooling systems are never overloaded, you still face heat death problems once the systems outlive a normal warranty term. Many higher ed institutions are starting to buy on a five-year lifespan instead of the traditional three-year lifespan, so high-density systems like blades or thumpers are not an advisable solution there.

Many blade chassis are limited on expansion module space. Depending on your I/O configuration, you need at least six expansion slots to have some semblance of redundancy: two management modules, two I/O bus (Fibre Channel, InfiniBand, 10GbE, SAS, etc.) modules, and two switch (Ethernet) modules. The IBM BladeCenter S and E chassis options only support four modules. The higher-end newer options, H and HT, support four high-speed and four legacy modules, so keep that in mind when you’re thinking about expanding. Most of the modules only support six ports, which means that you’d need three modules (high-speed slots only, of which you have four!) to support a single full-bandwidth Fibre Channel connection to a server in each of the 14 bays in a BladeCenter H-series chassis, with no redundancy and no way to expand further. For environments that need both Fibre Channel and InfiniBand, you’re pretty much out of luck.
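
If you want to check that module math against your own chassis, here’s a rough sketch in Python. The bay, port, and slot counts are the ones quoted above; the function name and the `links_per_bay` parameter are made up for illustration, and it assumes every bay gets dedicated, unshared ports.

```python
import math

def modules_needed(bays: int, ports_per_module: int, links_per_bay: int = 1) -> int:
    """I/O modules required to give every bay `links_per_bay` dedicated ports."""
    return math.ceil(bays * links_per_bay / ports_per_module)

BAYS = 14              # blade bays in a BladeCenter H, as quoted in the post
PORTS_PER_MODULE = 6   # "most of the modules only support six ports"
HIGH_SPEED_SLOTS = 4   # high-speed module slots in an H or HT chassis

# One dedicated link per blade, no redundancy.
single_path = modules_needed(BAYS, PORTS_PER_MODULE)

# Two dedicated links per blade (assumed here to be what redundancy would take).
redundant = modules_needed(BAYS, PORTS_PER_MODULE, links_per_bay=2)

print(f"single path: {single_path} modules, {HIGH_SPEED_SLOTS - single_path} slot(s) left")
print(f"redundant:   {redundant} modules, fits in chassis: {redundant <= HIGH_SPEED_SLOTS}")
```

With these numbers a single path already eats three of the four high-speed slots, and a redundant pair of links per blade would need five modules, one more than the chassis can hold.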

Let’s not forget that each of the modules usually has a management interface of its own. The Fibre Channel modules have a console that you need to manage separately from any other Fibre Channel interfaces you might have. The switches have a Cisco IOS-like interface, unless you buy actual Cisco modules for your blade center. Why’s that a hassle? Keep in mind that you need to manage VLAN and trunking assignments and limits on both your core switch and your blade center’s switch.

So: High-bandwidth environments need not apply, since shared connections are the rule instead of the exception. Environments where an addition or switch to a new technology might be managed by adding three or four PCI cards to the affected servers need not apply — your chassis won’t have room for it.

For all of those “Features”, you gain the ability to save some floor space … and you pay a lot more.

Let me introduce to you a new technology called the “40 blade server”: you take a 42U rack, set up appropriate power modules on it, and then plug 40 1U servers and a pair of switches in at the top. Sure, there’s a bit more wire, but that’s easily managed. The 1U servers are individually less expensive than server blades and have a host of nice features, such as independent KVM and individual expansion card slots, that you won’t find in any blade server.
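
The rack-unit budget for that layout is simple enough to sketch. This assumes each of the two switches is 1U and that the power modules mount vertically and take no rack units; neither detail is spelled out above, and the function name is just for illustration.

```python
def rack_units_used(servers: int, server_u: int, switches: int, switch_u: int) -> int:
    """Total rack units consumed by the servers plus the top-of-rack switches."""
    return servers * server_u + switches * switch_u

RACK_U = 42  # standard full-height rack, as in the post

# Assumption: 1U switches and zero-U (vertical) power distribution.
used = rack_units_used(servers=40, server_u=1, switches=2, switch_u=1)

print(f"{used}U used of {RACK_U}U, {RACK_U - used}U spare")
```

Under those assumptions the rack comes out exactly full, so anything else you want in there (a patch panel, a KVM drawer) costs you a server.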

Admittedly, one place we have been very happy with “Bladed” components is our Cisco routers. The ability to hot swap modules and fail between modules is nice — but it’s something that we could manage without; it’s simply a better way to do things in the Cisco world since the price differential isn’t that high and the equipment lifespan is closer to ten years than to three.

But for compute? Heck with that. I see very few environments where blade centers are a good solution compared to a rack of 1U servers.

From → sysadmin

11 Comments
  1. Thanks for weighing in on this topic!

    I’m a storage guy by background, so it’s hard for me to say much real-world about blades. The last blade-like system I used was a Cubix ERS-FT (as you’ll hear about shortly on my blog!)

    I certainly didn’t mean to gloss over the downsides, or even to cover them at all! This was merely a post regarding the definition. As you’ll soon see, I’m going somewhere else entirely with this series!

    But now that you mention it, maybe a downsides of blades piece would be worthwhile! Thanks!

  2. Great article – and great points. Even though my blog focuses on blade servers, I’ll be the first to tell you they are NOT for everyone, nor every workload. Until they go to an all-modular, all-rack design (see my article about the future of the datacenter) there will always be points of failure within a blade chassis. Even if a vendor offers passive midplanes, there is a risk, albeit small, that that board could get a bent pin, or something minute that requires replacement. In fact, in many enterprise data centers that are using blade servers, you’ll find redundant blade chassis – simply “in case”. As multi-node servers and blade servers become more similar, I think you’ll see a broader adoption of a blade-like device for large server environments, but until then – I definitely recommend that you review your requirements and ensure you develop the most redundant environment you can with or without blade servers.

  3. Dmitri Kalintsev

    Hi Karl,

    Would you be able to comment on how Cisco UCS would fare against your set of criteria?

    Cheers,

    – Dmitri

  4. Level380

    What a load of FUD…. IBM are the worst blades on the market. Years behind, clunky old design that needed a refresh years ago!

    Most of these points are not valid on other blade makers’ devices!

    Hell, IBM wouldn’t even let you hot swap HDDs in blades for a long time, that’s how crap they are!

    As for your pizza boxes. A rack of 40 pizza box servers on top of each other can get very hot for the servers in the middle!

  5. Brent

    I might know someone who would be interested in taking those old servers off your hands…

  6. Sure, if you want to drive to Florida to pick up a bunch of non-working junk… Actually, we’re moving most of our production stuff off of them, but we’ll keep them around for development and worker nodes until they completely fail. Then they won’t be worth much as even scrap metal…

  7. John

    Are you moving to 4U or 1U servers? What kind of workload – virtual or non-virtual?

    I would love to hear more about your decision to move off. We are looking at 100-140 blades at a purchase and the opportunity to do something else would make me ecstatic.
    I haven’t seen a good cost/benefit map between 1u/4u/Blade/Microservers.

  8. Brent

    We won’t drive there but we can arrange shipping. Better to get a few bucks per unit than just recycle them…

  9. John – We’re moving to a mix of machines. Our DB nodes will be on 4U or larger boxes. They’re running Oracle, non-virtualized, and we’re currently trying to decide between “enterprise-class” (aka mini mainframes) and just really big commodity servers, HP DL900 to be specific. For the rest of our workload, which is virtualized under Xen, we’re moving in the short term to a bunch of 2U SuperMicro servers that were left over from another project, and we’ll purchase 1U hardware as we fall below a certain number of nodes.

  10. Dmitri – I haven’t ever been hands-on with the Cisco UCS stuff, so I can’t really comment on it. I can see several advantages to them as a supplier, and I really enjoy working with Cisco’s extremely competent support teams, but I can’t really comment on anything except the specs. My experience has been with IBM, HP, Dell, and SuperMicro’s equipment.

  11. Cisco UCS solves a lot of your complaints, but it comes at a cost.
    You get converged fabrics and centralized management with virtualized physical hardware. I realize that doesn’t make sense but it’s not something that can be explained easily in a blog comment.

    Combine UCS with Nexus and you have a very robust blade solution with up to 160Gbps aggregate throughput per CHASSIS, or 80Gbps aggregate throughput per fabric. You can get up to 40Gbps per blade, though typically you will see 20Gbps per blade. To get 40Gbps you need the full-width blades and dual mezzanine cards such as the M81KR. Those figures are on the chassis backplane, assuming the 2208 IO modules and 6296 Fabric Interconnects. Blades are treated as a commodity item and are for the most part disposable: with migrating service profiles and FC SAN or iSCSI SAN boot, there isn’t much on them anyway. If one fails, you pull it out and replace it, re-associate the service profile to the new blade, and boot it.
    The UUID, MAC, WWN, etc. are all kept in the service profile, not the hardware.

    My point is, UCS is badass. It’s not terribly complex once you get down to it, but the learning curve is steep since it brings EVERYTHING together:
    -Virtualization of the hardware
    -Software virtualization [VMware, Xen, Hyper-V]
    -Complex SAN fabric management
    -SAN storage management
    -Switching
    --Nexus1Kv
    --VMFex
    -Voice (You can run Unified Communications on it in VMware)

    Please excuse any typos – I’m up late :)
