Infrastructure Capability and Value – Great Architectures need Great Plumbing
For a variety of reasons, ranging from being incredibly busy to writing posts involving the Corvus corax, I never got to finish my series on how infrastructure capabilities enable architects to build masterworks. What better time to pick it up than now, when for the first time in my life I have lots of time, at least for a couple of months.
In today's IT world, everyone I talk to wants things to be provisioned in software and wants to drive towards a more automated network. But you can only do in software what the hardware is capable of accomplishing. That is, you can't make a one-lane road into a ten-lane highway without a lot of graders, asphalt, and people in orange vests. But with the miracle of double-sided signs that say 'Slow' on one side and 'Stop' on the other, plus a few orange cones and maybe a red flare or two at night, I can pretty quickly turn a ten-lane superhighway into a one-lane road, or any size in between, albeit still needing a few guys in orange vests.

Manual Traffic Engineering
What we really need, though, is not the cones and Stop/Slow signs -- we need those fancy signs with green arrows and red X's that can automatically make a lane send or receive traffic in a chosen direction. A connected system of these would allow the infrastructure to react dynamically to changing traffic patterns in near-real time: it would measure load and provision capacity based on available resources and current demand. The roads would last longer, the people driving on them would get to their destinations more quickly, and changing the lane layout would take a lot less time than it does with cones and signs.
There is, however, one drawback to this analogy in today's world. Let's assume I get on this awe-inspiring ten-lane marvel of highway engineering and want to make a cell phone call. I pick up my iPhone, make the call (using a Bluetooth headset in California, of course) and start driving. If this proverbial highway had an addressing architecture similar to today's networks, in about six to ten miles I'd lose my cell signal, then I'd get a new phone number and sometimes have to reboot my phone. I could then initiate a call, but it would take about 12-24 hours for my new number to be known by anyone trying to reach me.
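To make the "measure load, then provision capacity" idea concrete, here is a minimal sketch in Python. It is only an illustration of the analogy: the function name `allocate_lanes`, the proportional rule, and the `min_lanes` floor are my assumptions, not a description of any real traffic or network controller.

```python
def allocate_lanes(total_lanes, demand_a, demand_b, min_lanes=1):
    """Split reversible lanes between two directions in proportion to demand.

    Hypothetical rule: each direction gets a share of lanes proportional to
    its measured demand, but never fewer than `min_lanes`.
    """
    if total_lanes < 2 * min_lanes:
        raise ValueError("not enough lanes to serve both directions")

    total_demand = demand_a + demand_b
    if total_demand == 0:
        # No measured load: fall back to an even split.
        lanes_a = total_lanes // 2
    else:
        lanes_a = round(total_lanes * demand_a / total_demand)

    # Clamp so that neither direction is starved completely.
    lanes_a = max(min_lanes, min(total_lanes - min_lanes, lanes_a))
    return lanes_a, total_lanes - lanes_a


# Morning rush: most of the load is heading in direction A.
morning = allocate_lanes(10, demand_a=900, demand_b=100)
# Quiet period with no measured load.
idle = allocate_lanes(10, demand_a=0, demand_b=0)
# All load in one direction still leaves one lane for the other.
one_way = allocate_lanes(10, demand_a=1000, demand_b=0)
```

The point of the sketch is that the decision is driven by measurement and is made in software, but it can only hand out lanes that physically exist: `total_lanes` is fixed by the hardware.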
So let's get a little techie on this for a second:
1) Virtualization drove a new abstraction layer into the x86 server market. Many computer science problems can be solved with another level of indirection/abstraction. In solving this problem the layer of abstraction was so good that workloads started to move.
2) I can only move a workload (think Virtual Machine) to a server that is capable of receiving that workload. If the workload requires block-storage connectivity, the server has to have that; if it requires eight separate Ethernet connections, the server has to have those too.
3) While VMs are portable from server to server, unfortunately the IP address is not. Hence my cell phone example above. Addressing today is constrained by the logical topology.
4) What about using DNS as a level of indirection to mask the VM's new address from the client? It won't work, for a few reasons:
a) some client stacks and applications cache the DNS response for the life of the session
b) some client-side DNS servers ignore the TTL
c) the TTL is fixed at the time the A-record request is answered; there is no pub/sub model that would enable distributed cache invalidation
d) the host application may be keyed to the IP address (not uncommon with older applications)
5) Remember that three-tier network architecture with core, distribution, access? The one where it was advantageous to push L3 as close to the edge as possible so you could take advantage of equal-cost multi-path routing, and design spanning-tree out of the topology? It's gone. The business value of what the virtualization team is doing in IT in many cases outweighs the value we all have traditionally placed on three-tier network architectures with perfectly designed L2/3 logical topologies that enabled secure, scalable, and stable networks.
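The DNS problem in point 4 can be sketched with a toy resolver cache. This is a simulation only, with no real DNS traffic: the class `CachingResolver`, the name `app.example.com`, the addresses, and the 300-second TTL are all made up for illustration.

```python
class CachingResolver:
    """Toy client-side resolver cache.

    `honor_ttl=False` models a cache that ignores the TTL, or an
    application that keeps the first answer for the life of the session.
    """

    def __init__(self, authoritative, honor_ttl=True):
        self.authoritative = authoritative  # dict: name -> (ip, ttl_seconds)
        self.honor_ttl = honor_ttl
        self.cache = {}  # name -> (ip, expires_at)

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None:
            ip, expires_at = entry
            # Serve from cache if the TTL is ignored or has not yet expired.
            if not self.honor_ttl or now < expires_at:
                return ip
        # Cache miss or expired entry: ask the authoritative source.
        ip, ttl = self.authoritative[name]
        self.cache[name] = (ip, now + ttl)
        return ip


zone = {"app.example.com": ("10.1.1.10", 300)}
well_behaved = CachingResolver(zone, honor_ttl=True)
ttl_ignoring = CachingResolver(zone, honor_ttl=False)

# Both clients look up the name before the VM moves.
well_behaved.resolve("app.example.com", now=0)
ttl_ignoring.resolve("app.example.com", now=0)

# The VM moves and the A record is updated. Nothing notifies the caches.
zone["app.example.com"] = ("10.2.2.10", 300)

stale_within_ttl = well_behaved.resolve("app.example.com", now=60)
fresh_after_ttl = well_behaved.resolve("app.example.com", now=301)
stale_forever = ttl_ignoring.resolve("app.example.com", now=86400)
```

Even the well-behaved client keeps using the old address until its cached entry expires, because the record change is never pushed to it. The TTL-ignoring client never finds the new address at all.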
The reality is that while we all want the right infrastructure that can support any workload on any server, at any time, with all the provisioning being done in software, the bulk of the infrastructure we have all deployed is not ready to deliver that capability yet. It's still the proverbial one-lane road with no intelligent signaling. Couple that with the current addressing and routing architectures, and workloads are not as portable as we would all like them to be.
What do we need to build an infrastructure that is capable of meeting the needs of customers? An infrastructure that operates as a system, with a cohesive and scalable control plane, one with enough capacity to enable variable workload support, and one with the Layer-2 and Layer-3 capabilities that enable workloads to become portable and then mobile over an ever-increasing number of host devices without requiring obtuse addressing changes or DNS trickery. It also has to be a system that focuses on reducing complexity rather than increasing it: one that is designed to make the operators' jobs easier first, and then to automate the tasks that are onerous today, while always preserving the fine-grained control necessary to troubleshoot problems if and when they occur. Lastly, network infrastructures must break the bonds of the static CLI and move to more of a registry- or database-driven model that scales across multiple systems in a federation.
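The addressing constraint can be shown with a small check using Python's standard `ipaddress` module. The host names and subnets below are hypothetical; the sketch only assumes that a VM can keep its IP address when the destination server sits on a subnet that contains that address.

```python
import ipaddress


def can_keep_address(vm_ip, destination_subnets):
    """Return True if vm_ip falls inside any subnet present on the destination."""
    addr = ipaddress.ip_address(vm_ip)
    return any(addr in ipaddress.ip_network(net) for net in destination_subnets)


# Hypothetical inventory: which subnets each physical server is attached to.
hosts = {
    "rack1-host-a": ["10.1.1.0/24"],
    "rack1-host-b": ["10.1.1.0/24"],
    "rack7-host-c": ["10.7.0.0/24"],
}

vm_ip = "10.1.1.10"
valid_targets = sorted(
    host for host, subnets in hosts.items() if can_keep_address(vm_ip, subnets)
)
```

Here the VM can move between the two rack1 hosts without renumbering, but a move to `rack7-host-c` means a new address, which is the "new phone number" problem from the highway example.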
Here's a thought: virtualization has changed the way we design and build networks today. Virtualization at an unprecedented scale is one aspect of cloud computing. Cloud computing will require us to rethink the architecture of the Internet.
Let's also not forget to discuss infrastructure value, measured simply by cost, quality, and risk. All the features in the world don't make a bit of difference if you cannot deploy them. To maximize the value of any infrastructure investment, the capabilities need to be expressed in a way that makes them readily deployable by the operators, the engineers, the guys on the ground who have to make this infrastructure work day-in and day-out. Value is a simple math equation: the less you have to invest to acquire and maintain the asset, and the more capability your business realizes while keeping risk low and quality of service high, the more value you extract from the investment.
dg
July 9th, 2009 - 02:06
Thanks for the article. Just a point of clarification: VM IPs are portable between virtualization platforms if they are in the same L2 domain.
July 10th, 2009 - 17:50
Randy, a point of clarification – VM IPs are portable across server platforms provided the virtualization platform is the same on both servers (VMW to VMW, Xen to Xen, etc) and the IP subnet spans the physical server platforms.
The problem I am highlighting is that as you widen the distance to encompass a greater and greater set of resources (servers), you start breaking some network fundamentals. When you go cross-AS, you require a lot of changes to the IP routing design.
July 11th, 2009 - 20:02
Yeah I see now how I mis-typed it. Making the edits tonight. Thanks for the catch! dg