loopback0 – Douglas Gourlay's Blog Data Centers, Virtualization, and Cloud Computing

15 Jul 2009

Power as Ultimate Limiter

power capacity is biggest limiter of data center growth today

I was just reading this article by Stacey Higginbotham at GigaOm on the power limitations in major metropolitan areas and how they are affecting sales in the data center segment.

This is the first time I have heard power constraints on the grid cited as justification for quarterly earnings results. In no way am I disputing it; it's actually becoming a critical problem, and California is a great example.  In 2006 we had this rather warm day - about 114 degrees as I was driving along Interstate 280. Fortunately it was a Sunday, so there were not a lot of commercial enterprises in full-tilt operation.  The next day it was reported that the northern California power grid had come within about 380 megawatts of the point where rolling blackouts start.

To put this in perspective, I was at a data center the other day, at Switch in Las Vegas, that is a 100 megawatt critical load facility.  The new Microsoft data center in Chicago is in the same range, and while I have seen no hard data on the size of the Google or Apple facilities, given the square footage and power densities available today they are most likely in the same power range too.  Short version - Northern California was three to four data centers short of rolling blackouts.
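For anyone who wants to check the "three to four data centers" math, here is a quick back-of-the-envelope sketch in Python. The 380 MW shortfall and the 100 MW facility size are the figures quoted above; the function name is just something made up for illustration, and treating every facility as exactly 100 MW is a simplifying assumption.

```python
import math

def facilities_to_exhaust(headroom_mw: float, facility_mw: float) -> float:
    """How many facilities of a given size fit into the remaining grid headroom."""
    return headroom_mw / facility_mw

# Figures quoted in the post; assumes each facility draws exactly its 100 MW critical load.
headroom_mw = 380.0
facility_mw = 100.0

equivalent = facilities_to_exhaust(headroom_mw, facility_mw)
lower, upper = math.floor(equivalent), math.ceil(equivalent)

print(f"{equivalent:.1f} facilities of headroom, i.e. between {lower} and {upper}")
```

Critical load is only the IT load, so a real facility pulls more from the grid once cooling and distribution losses are counted, which would make the number of facilities even smaller.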

See, here's what happened.  For a long time we didn't care.  When we were designing products like the Catalyst switches in the 2001-2006 timeframe, power simply was not a concern - density was the main concern.  We needed to get more and more switch ports into a smaller form factor because our customers were running out of space.  We saw this in every tech segment - denser servers, denser storage, denser networks.  I often joke we had the densest product management teams out there - I can say this, I was one of them :)

Coupled with this density increase was the trend towards data center consolidation - the gist of it is that as bandwidth costs went down post-2001 and the operating costs of these ever increasingly complex data centers went up, the cost slopes inflected, and it became operationally advantageous to run an IT infrastructure with as few physical facilities as possible while maintaining a viable disaster recovery/avoidance plan.  When people consolidated their data center facilities together all of a sudden what was distributed became centralized, what was centralized was measurable behind the power meter, and the facilities team decided to pass the power bill to the IT department for this now measurable and consolidated data center facility.  Ooops - that wasn't in the budget plan!

We ran out of space - the industry delivered denser equipment.
It got expensive to maintain them, and more cost effective to connect them - data centers consolidated.

Then we ran out of cooling capacity - and this made many cooling vendors very happy.  Enter the Computer Room Air Conditioner, colloquially known as the CRAC unit - and yes, to a facilities manager being pushed to cool these massively dense systems, these are quite the addictive element when you have cooling problems.

Then, lastly, as Stacey indicates, we started running out of power.  You go to the local power provider and say, "Hey! I need another 10 megawatts, please sir."  Then they pretty much laugh at you, almost hysterically.  In some cases you are asked to build your own substation and finance their build-out to support you, in some cases you opt for co-generation, in others you hope a second provider exists.  But generally your option is to find a new location that has the power capacity you require.

Now let's look at this from an economic perspective in California.  California, as a stand-alone entity, is the world's 6th largest economy.  And just FOUR more data centers would break the power infrastructure in this state.  Thus we have witnessed the 'Flight of the Data Centers' (cue Wagner now for a laugh).  Businesses have moved their data center facilities in one safe direction - away.  Las Vegas, Oregon, Washington - places with sustainably generated, renewable, low/no carbon emission power sources that are cheap.

What does this mean for capital equipment sales taxes, import tariffs, infrastructure investment, job creation, and property taxes in California?  I think we can see that it is not positive at a local level, although few would dispute that it is the right choice for the business.

Q: So what can, or should we do about this? A:  There is no silver bullet.

Every company you talk to will have their own angle on what can be done.  There is no one all-encompassing right answer.

Cloud! some will say - but that moves the problem from you to someone else. Granted, it is someone who hopefully has the technical expertise and economy of scale on their side to run the IT infrastructure more efficiently than you can in your own domain.

DC Power!  Of course - run everything DC, that will save power.  The studies vary, just like the mileage on my car; however, since most of the IT infrastructure is AC today that is an awful lot of equipment that would have to be replaced which generally seems to benefit the vendors who are pushing DC.

Higher Voltage AC - possible. Not the be-all/end-all, but it could improve efficiencies, especially in the US where 110 may be the norm; even going from 208 to ~240 would be an improvement.

Solid State Disk - can certainly handle more IOPS than spinning disks at a more efficient power draw.  A good example: Cisco was touting 400,000 IOPS per blade in a UCS system.  If you supported this with rotating media it would take around 2000 hard drives per UCS module to absorb the I/O performance - suffice to say that is a lot of capacity.  At 11-16w of power (idle versus active) per hard drive, and assuming usage averages 50%, that's 27 kilowatts of power per blade in storage requirements. Thus SSD is the more logical choice, with far faster IO rates.
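The 27 kilowatt figure is easy to reproduce. Below is a minimal Python sketch of the arithmetic, using only the numbers quoted above (400,000 IOPS, about 2000 drives, 11 W idle and 16 W active, 50% usage). The function name is made up for illustration, and the straight-line blend between idle and active draw is an assumption about how the average was taken.

```python
def average_drive_watts(idle_w: float, active_w: float, utilization: float) -> float:
    """Blend idle and active draw linearly by the fraction of time the drive is busy."""
    return idle_w + utilization * (active_w - idle_w)

# Figures quoted in the post.
target_iops = 400_000
drive_count = 2_000
idle_w, active_w = 11.0, 16.0
utilization = 0.5

# Implied per-drive I/O rate for the spinning-disk option.
iops_per_drive = target_iops / drive_count

# Average draw per drive, then for the whole set of drives.
avg_w = average_drive_watts(idle_w, active_w, utilization)
total_kw = drive_count * avg_w / 1000

print(f"{iops_per_drive:.0f} IOPS per drive, {avg_w} W per drive, {total_kw} kW total")
```

That is drive power only; enclosures, controllers, and the cooling needed to remove that heat are not counted here.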

Energy Efficient Ethernet- why do you run a port at 10Gb when it may only need 1Gb or 100Mb?  Energy Efficient Ethernet is being developed through and with the IEEE to lower the power draw, rate adapt, and reduce the power when load factors are low or the ports are in a shutdown state.

The list goes on.  What other ways can power be lowered and workloads be processed more efficiently?

dg

Filed under: Tech
Comments (6) Trackbacks (0)
  1. Any insight on what the ratio is between power consumed for cooling, and power consumed by co-located equipment? I’ve heard of data centers cooling through the use of external air when the climate is appropriate, but I have no idea how much energy this really saves.

    • James,
      I have seen some companies, even in what we would think of as ‘hot’ climates like Las Vegas, use outside air economizers. There is a pretty substantial savings IF the outside ambient temperature is lower than the inside temperature for a reasonable amount of time. Turns out Las Vegas is cooler than the 80.6F Omar references about 70% of the HOURS in a year (but maybe not the daily high). You have to couple this with software and intelligently managed HVAC that can recirculate or pull from outside, depending on the temperature for that particular HVAC system and its particular air. (It’s easier to imagine if you think of closed-loop cooling like a nuclear reactor.)

      dg

  2. Doug:

    One interesting approach I have recently seen is to simply run the data center at a higher temperature. The American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) just raised its recommended upper limit for data centers from 77F to 80.6F. With every degree of increase potentially shaving 4% off DC energy costs, it certainly seems like an enticing approach, but I think you would really need to have a handle on your DC cooling before trying this.

    Omar Sultan
    Cisco

  3. Actually “cloud” may not be the answer, but the unfortunate victim of this [power] limitation – i.e. if I cannot “move” I/S/PaaS across Data Centers, when in need to re-balance resources, because I may have plenty of CPU or memory in the servers in the target DC, but not enough power to sustain the new load, then I am really in trouble.

    On the other hand, looking at this type of problem as an opportunity, though, maybe we could come up with facilities controlled management tools which could have as much an impact on resource (automatic?!? = cloud) reallocation decision as the internals of servers and networks have today …

    Stefan

    • To a point @aneel was making loudly on Twitter today: we need to be able to synchronize the moves of Virtual Machines with their associated storage (or abstract that move from the workload) while maintaining network policy, finding the closest available default gateway, ensuring service consistency for in-band transport services like SLB and Firewalls, and coordinating these moves with the availability of sustainable and cost-effective power generation at a facilities and geographical level.

      The further we can move a workload, statefully, the more choices we have about where that workload can most efficiently be processed. Arguably a good problem to have.

      dg

      • One thing I did not put into the main post, and it is almost laughable that it is not implemented in many platforms, is the following:

        If the ports are administratively shut down, why do the chips initialize and draw power? I.e. put a port in down/down state administratively. Why should the PHY and MAC power up? Why not keep as many system components as possible in a low power draw state and have a sublinear power curve based on true demand, rather than having the device run ‘hot’?

        Now I know why this is - it's because it's EASIER on the software developer in the engineering team. But I think we should all start demanding more efficient and effective power management – a variable rotation fan is great, but there is a lot more that can be done.

        Let’s start by measuring the power/throughput, but in an apples/apples way based on the feature set. A switch with tunneling and MPLS and huge routing tables cannot be compared to a switch with just forwarding features. Similarly, NetFlow accounting takes a large number of transistors on the silicon die to support it at rate.

        Evaluate your true requirements for implementation, then baseline the system choices you have. Power isn’t the only cost factor, but it is a significant part of lifecycle cost on these IT assets. Evaluate total cost (power, ops, maintenance, etc), then risk, and then quality and make a decision…

        dg

