Portion Control: Why Curbing Your Cloud Appetite Is So Critical
Chicago is a town of hearty appetites: Our pizza is legendarily deep, and a trip to Gino’s or Giordano’s can easily turn into a belt-loosening experience.
The public cloud is the same way.
It’s easy to let your appetite get away from you. In fact, the very things that make the public cloud so attractively agile (self-service, automatic provisioning and rapid elasticity, for example) can also cause some serious CIO heartburn.
A Recipe for Cloud Chaos
In a public cloud environment, compute resources can be dialed up easily by anyone with spending authorization and, in many cases, simply by using a credit card. Everyone is ordering services and, if business units are not governed in their use of cloud resources, chaos ensues. It’s like an open bar at a wedding. The father of the bride is happily oblivious until the tab comes due.
There’s also the challenge of tracking what’s running. Spinning up new virtual machines and containers is incredibly easy in a cloud environment. Yet, it’s also easy to lose sight of who’s using what, when and how much. Resource management becomes critical as budgets take a hit from every unused or underused resource.
Who Forgot to Turn Off the Lights?
Fundamentally, neglecting to turn off an app or power down a server is a human failing. Somebody has it in their notes to go back and clean up after the project is completed. But the next problem or project pops up, and that note is suddenly buried three pages deep.
It might be a situation where you provision resources for a short-term project or capacity need and then neglect to turn them off. It could also be a production environment that you change without going back to clean up the old system, storage or network segments. For example, maybe you’ve decided to move your platform to a bigger box. Inevitably, there will be a week or two where you run both in parallel. Time passes, and no one goes back and decommissions the old system.
Plan for Lifecycle Management
Every system and every service has a lifecycle, and managing that lifecycle in the public cloud is critical. You’ll need a process in place that says, “OK, the infrastructure team is done and the product is going to production; the test system can be turned off.”
Then, your infrastructure provider can go back and review the documentation and change tickets. They can identify the systems and services that are no longer needed and schedule them for decommissioning.
Remove the Human Factor
One way to remove the potential for human error is to automate the decommissioning process. Administrators can set access to resources to expire once development or test procedures have been completed, and have the servers power down automatically. Resource automation tools can also determine when a project is “complete,” using completion data or other criteria.
Scripting the termination of services is not simple. But if your use of public cloud services is fairly sporadic or periodic, this approach may be worth the investment.
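To make the expiry idea concrete, here is a minimal Python sketch. It assumes each resource in your inventory carries an expiry date set at provisioning time. The `Resource` record, the `expires_on` field and the sample inventory are all hypothetical, and the sketch only builds the shutdown list; it does not call any cloud provider API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Resource:
    """One inventory entry (hypothetical shape, not a provider API object)."""
    resource_id: str
    owner: str
    expires_on: Optional[date]  # hypothetical expiry tag; None means none was set


def find_expired(resources: List[Resource], today: date) -> List[Resource]:
    """Return resources whose expiry date has already passed as of `today`."""
    return [r for r in resources
            if r.expires_on is not None and r.expires_on < today]


def find_untagged(resources: List[Resource]) -> List[Resource]:
    """Return resources with no expiry date; these need a human decision."""
    return [r for r in resources if r.expires_on is None]


# Sample data, invented for illustration only.
inventory = [
    Resource("test-web-01", "team-a", date(2024, 3, 1)),
    Resource("prod-db-01", "team-b", None),
    Resource("poc-batch-07", "team-a", date(2024, 6, 30)),
]

today = date(2024, 4, 15)
expired_ids = [r.resource_id for r in find_expired(inventory, today)]
untagged_ids = [r.resource_id for r in find_untagged(inventory)]

print("Schedule for shutdown:", expired_ids)
print("Needs an expiry date:", untagged_ids)
```

Listing the untagged resources separately is a deliberate choice: anything provisioned without an expiry date is exactly the kind of thing that gets forgotten, so it should go to a person rather than be silently skipped.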
Use the Tools
Amazon’s Trusted Advisor and CloudWatch are technical, fine-grained portion control tools that let you track resource usage levels. Trusted Advisor, for example, runs a number of checks across storage, compute, elastic load balancers and other services to say, “Hey, you’ve got three elastic load balancers, but only one of them is actually seeing traffic. You need to lose the other two.”
But you need to be aware of these tools and make a point of using them. Just as important, you have to do something with the information they are providing you. If you find that CPU levels have been at or near zero every month for the past three months, it’s time to ask why you have that box out there.
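The "near zero for three months" test is easy to sketch once you have the numbers. The Python below assumes you have already exported average CPU utilization per month for each server, for example from a monitoring tool. The 2 percent threshold, the function name and the sample figures are assumptions made for illustration, not values taken from any AWS tool.

```python
from typing import Dict, List


def is_idle(monthly_avg_cpu: List[float],
            threshold_pct: float = 2.0,
            months: int = 3) -> bool:
    """True if each of the last `months` monthly averages is at or below the threshold.

    A server with fewer than `months` of history is never flagged, so a
    newly provisioned box is not mistaken for an abandoned one.
    """
    recent = monthly_avg_cpu[-months:]
    return len(recent) == months and all(v <= threshold_pct for v in recent)


# Sample data, invented for illustration: average CPU percent, oldest month first.
usage: Dict[str, List[float]] = {
    "app-server-01": [41.0, 38.5, 44.2],
    "old-platform-box": [0.4, 0.2, 0.3],
    "new-batch-node": [0.1],  # only one month of history so far
}

idle = sorted(name for name, series in usage.items() if is_idle(series))
print("Ask why these are still running:", idle)
```

A flag here is a prompt for the question, not an automatic shutdown order. Some boxes are legitimately quiet, such as a standby system, so the output should go to whoever owns the resource.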