The P-class blade enclosures offered a variety of interconnect options, including pass-through modules as well as internal switches. Back then, we couldn't figure out how to delegate responsibility for the servers and the switches to the server and network teams, so we opted for the pass-through modules. The pass-through modules simply provided an individual network connection for each blade port, and since each blade had two Ethernet NICs and two HBA ports, you would end up with sixteen Ethernet cables and sixteen fibre channel cables hanging out the back of the chassis.
Keep in mind that we'd install two or three chassis per rack, so now we're talking about as many as ninety-six cables in the rack, not including power cables and a few more for management connections. Rather than bringing all that cable out of the rack, we installed top-of-rack switches and then sent the uplink cables out of the rack.
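To make the cable math concrete, here's the arithmetic, assuming eight blades per P-class enclosure, each with the dual-port NIC and dual-port HBA described above:

```python
# Cable count for pass-through modules: every port on every blade
# gets its own cable out the back of the chassis.
BLADES_PER_ENCLOSURE = 8   # P-class blades per enclosure
ETH_PORTS_PER_BLADE = 2    # dual-port NIC
FC_PORTS_PER_BLADE = 2     # dual-port HBA

cables_per_enclosure = BLADES_PER_ENCLOSURE * (ETH_PORTS_PER_BLADE + FC_PORTS_PER_BLADE)
print(cables_per_enclosure)      # 32 data cables per chassis

# Three chassis in a rack, before power and management cables:
print(3 * cables_per_enclosure)  # 96
```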
When the HP C-class blade enclosures came out, we wised up and revisited the interconnect options, this time opting for the chassis switches. Now, a couple of 10 Gb Ethernet cables and a couple of 8 Gb fibre-channel cables were all you needed to connect your blade chassis to your fabrics. Meanwhile, the blade form factor had gotten even smaller, allowing sixteen half-height blades in a 10U chassis. Very dense, and lots of IO without many cables. Good stuff.
OK, so in our 10U chassis we have sixteen blade servers, each of which may have a dual-port NIC and a dual-port HBA. We've also got a pair of interconnect switches for Ethernet and maybe another pair for the SAN. The chassis itself has a management controller (HP calls theirs the Onboard Administrator, or OA). Now let's say you want to bring all this gear up to the latest firmware versions. We log onto the OA, and we can easily update its firmware through the menus.
Within the OA menus, there are links to the switch module management interface, within which we can find the menu for updating the switch firmware. Then for each blade, we establish a remote console, mount up the ISO image with the latest firmware, and run through the process of updating all of the components within that blade. Then on to the next one. You could spend a whole day getting an enclosure up to date.
Now when you want to install an OS on a blade, you go back into the remote console (which is accessed via a web browser, by the way), mount the ISO image for the OS you're building, and boot it up. Keep in mind that the OA is connected to the network by a 100 Mb NIC. Booting and installing Windows 2008 over 100 Mb is horribly slow. And heaven forbid you need to watch the blade boot and hit a key to get it to boot off the CD, especially if your blade has lots of RAM. Man, it's like watching paint dry.
Automation and Abstraction
Don't get me wrong, blades are awesome. I would never build a rack-mount server unless there was some compelling reason to do so. When the first P-class blades hit the scene, they came with Altiris Rapid Deployment Solution, which was a great tool for setting up unattended OS installation scripts for the blades. Once you had the OS build scripted, you could drag and drop an OS job onto a blade in a GUI interface, wait a few minutes, and your blade would be all built, sitting at the login prompt.
Over the years, various automation solutions have come and gone, and to some extent they have fallen by the wayside due to the advent of virtualization. After all, it's pretty easy to get VMware ESXi installed, and a lot of your work then shifts to automation within VMware. Still, management of the hardware remains painful.
Beyond hardware maintenance, provisioning network and SAN connectivity can be challenging. Typically, especially in a VMware environment, many VLANs are trunked into the blade enclosure, after which VLANs can be configured within the enclosure to be piped into the blades' NICs, along with SAN connections. I'm oversimplifying here; some of that configuration can be quite complex.
HP Virtual Connect allows you to define server profiles, which include network and SAN connectivity settings. Once created, the server profile can be applied to a slot in the enclosure, thus providing those connectivity settings to the blade in that slot. Optionally, virtually assigned MAC addresses and WWNs can be assigned to the server profile, so that storage and network connections can be moved from one slot to another without having to re-zone storage connections, etc.
You can also specify boot-from-SAN options in the server profile, so that the blade can be stateless. When you apply the server profile to a slot, the blade picks up its network and SAN identity and boots from its boot LUN on the SAN. Move that profile to another slot, and that blade assumes the identity of the previous one. The blade isn't entirely stateless, though, since BIOS settings and firmware versions are still set on the blade itself.
Enter Cisco UCS
Cisco takes statelessness to the next level. Their slot profile, called a service profile, includes LAN, SAN, BIOS, and firmware settings: every bit of state that can be applied to a blade. This not only makes the state of the blade completely controllable by applying a profile, it also eases the process of firmware management.
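As a conceptual sketch of what that means, think of a service profile as a complete bundle of blade state captured as data and applied to hardware later. The field names below are illustrative, not actual UCS Manager object names:

```python
# Conceptual sketch only: a service profile as a bundle of blade state.
# Field names are illustrative, not the real UCS Manager schema.
from dataclasses import dataclass

@dataclass
class ServiceProfile:
    name: str
    mac_addresses: list    # virtual MACs for the blade's NICs
    wwpns: list            # virtual WWPNs for the blade's HBAs
    vlans: list            # LAN connectivity
    bios_settings: dict    # e.g. hyper-threading, boot order
    firmware_package: str  # pinned firmware revisions
    boot_lun: str          # boot-from-SAN target

esx_host = ServiceProfile(
    name="esx-host-01",
    mac_addresses=["00:25:B5:00:00:01", "00:25:B5:00:00:02"],
    wwpns=["20:00:00:25:B5:00:00:01", "20:00:00:25:B5:00:00:02"],
    vlans=[10, 20, 30],
    bios_settings={"hyperthreading": True},
    firmware_package="2.2(8f)",
    boot_lun="san-lun-0",
)

# Applying the profile to a different slot moves the blade's entire
# identity: no re-zoning, no BIOS drift, no firmware mismatch.
def apply_to_slot(profile: ServiceProfile, chassis: int, slot: int) -> str:
    return f"{profile.name} -> chassis {chassis}, slot {slot}"

print(apply_to_slot(esx_host, chassis=1, slot=3))
```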
Where HP's management engine is built into the chassis (the OA module installed in each enclosure), Cisco has placed their management engine, UCS Manager, in the top-of-rack switches. These are not your run-of-the-mill switches: the Cisco Fabric Interconnects provide connectivity for Ethernet, FCoE and Fibre Channel fabrics, and, like I said, they manage the UCS blades and enclosures.
A pair of fabric interconnects can control up to twenty enclosures as a single system. Any service profile can be applied to any blade in any enclosure connected to the interconnects, and the LAN and SAN connections, BIOS settings and firmware revisions specified in the profile will be applied to the blade.
Each enclosure can hold eight half-width blades, four full-width blades, or a combination of the two. The enclosure contains two fabric extender modules, each with either four or eight 10 Gb uplinks to the fabric interconnects, for a maximum possible 160 Gb of IO throughput to each enclosure. That's 20 Gb per half-width blade with no oversubscription.
The IO into each blade can be used for QoS-enabled Ethernet and FCoE traffic. Up at the fabric interconnect, the FCoE traffic can be converted to fibre channel and sent off to your SAN. The resulting cabling is very clean. Each enclosure is uplinked to the fabric interconnects by a maximum of sixteen 10 Gb cables; however, it's typical to use only four and build some oversubscription into your design. The fabric interconnects are then uplinked to your core switches and, typically, to your SAN switches as well.
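The bandwidth arithmetic works out like this, reading "only four" cables as two uplinks per fabric extender (an assumption on my part; the ratio changes with the uplink count you choose):

```python
# IO bandwidth per enclosure and per blade, and the oversubscription
# you accept when you cable fewer fabric extender (FEX) uplinks.
BLADES = 8       # half-width blades per enclosure
FEX_MODULES = 2  # fabric extenders per enclosure
UPLINK_GB = 10   # each uplink is 10 Gb

def enclosure_bandwidth(uplinks_per_fex):
    return FEX_MODULES * uplinks_per_fex * UPLINK_GB

full = enclosure_bandwidth(8)  # all sixteen cables in use
print(full)                    # 160 Gb into the enclosure
print(full // BLADES)          # 20 Gb per half-width blade, 1:1

# Four cables total (two per FEX), trading bandwidth for cabling:
typical = enclosure_bandwidth(2)
print(f"oversubscription: {full // typical}:1")
```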
Cisco UCS blades use Intel processors only; no AMD. As of this writing, eight- and ten-core processors are available. HP, on the other hand, does support AMD, and currently you can put 16-core processors in their blades. There's some debate about which is better: hyper-threading-enabled Intel processors or AMD processors with more cores. It really depends on the workload, I think. This difference will dictate the overall workload density you can achieve with each architecture.
From a rack-space density point of view, the HP blades edge out a slight advantage: they fit sixteen blades in a 10U enclosure, while Cisco UCS hosts the same sixteen blades in two 6U enclosures, for a total of 12U.
I think that both the HP and Cisco solutions can be architected to optimize cabling, density and flexibility. The ultimate differentiator may well be the operational efficiency afforded by the BIOS and firmware abstraction, the QoS features, and the multi-enclosure management.
UCS provides role-based access controls designed to allow segregated management of LAN, SAN and server configurations by the appropriate support staff. It goes a step further by providing a hierarchical organizational structure: sub-organizations can be created, granted a pool of servers, and allowed to define their own service profiles, making for a true multi-tenancy solution. Considering that up to 160 servers can be managed by one instance of UCS Manager, this makes it possible to deliver physical as well as virtual servers in a service provider model.
UCS also delivers an open XML API covering the configuration and management of the entire system, allowing automation via scripting languages and management tools.
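Every operation in that API boils down to an XML document posted over HTTPS, typically to `https://<interconnect>/nuova`. The sketch below just builds the request payloads for two of the API's methods, `aaaLogin` and `configResolveClass`; actually sending them requires a live fabric interconnect, so treat this as a starting point rather than a working client:

```python
# Build UCS XML API request bodies. The method and attribute names
# (aaaLogin, configResolveClass, cookie, classId) come from the
# published UCS XML API; the credentials here are placeholders.
from xml.sax.saxutils import quoteattr

def login_request(username, password):
    # The response carries a session cookie in its outCookie attribute.
    return f"<aaaLogin inName={quoteattr(username)} inPassword={quoteattr(password)} />"

def resolve_class_request(cookie, class_id):
    # Query every object of a given class, e.g. all blades in all chassis.
    return (f"<configResolveClass cookie={quoteattr(cookie)} "
            f"classId={quoteattr(class_id)} inHierarchical=\"false\" />")

# Example payloads you would POST to https://<interconnect>/nuova:
print(login_request("admin", "password"))
print(resolve_class_request("<cookie-from-login>", "computeBlade"))
```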
All of these features combine to make Cisco UCS the next step in blade architecture, converged networking, and automated provisioning.