Too often, the debate about running Kubernetes on bare metal versus virtual machines is overly simplistic. Theres more to it than a trade-off between the relative ease of management you get with VMs and the performance advantage of bare metal. (The latter, in fact, isnt huge nowadays, as Ill explain below.)
Im going to attempt to walk through the considerations at play. As you will see, while I tend to believe that Kubernetes on bare metal is the way to go for most use cases, theres no simple answer.
Off the bat, lets address the performance vs. ease-of-use question.
Andy Holtzmann
Andy is a site reliability engineer at Equinix and has been running Kubernetes on bare metal since v1.9.3 (2018). He has run production environments with up to 55 bare-metal clusters, orchestrated Kubernetes installs on Ubuntu, CentOS and Flatcar Linux, and recently helped accelerate the bring-up of Equinix Metals Kubernetes platform to under one hour per new greenfield facility. Andy joined Equinix after working in senior software engineer roles at Twilio and SendGrid.
Yes, VMs are easier to provision and manage, at least in some ways. You dont need to be concerned with details of the underlying server hardware when you can set up nodes as VMs and orchestrate them using the VM vendors orchestration tooling. You also get to leverage things like golden images to simplify VM provisioning.
On the other hand, if you take a hypervisor out of the picture, you dont spend hardware resources running virtualization software or guest operating systems. All of your physical CPU and memory can be allocated to business workloads.
But its important not to overstate this performance advantage. Modern hypervisors are pretty efficient. VMware reports hypervisor overhead rates of just 2 percent compared to bare metal, for example. You have to add the overhead cost of running guest operating systems on top of that number, but still, the raw performance difference between VMs and bare metal can be negligible, at least when youre not trying to squeeze every last bit of compute power from your infrastructure. (There are cases where that 2% difference is meaningful.)
When its all said and done, virtualization is going to reduce total resource availability for your pods by about 10% to 20%.
Now, lets get into all the other considerations for running Kubernetes on bare metal versus Kubernetes on VMs. First, the orchestration element. When you run your nodes as VMs, you need to orchestrate those VMs in addition to orchestrating your containers. As a result, a VM-based Kubernetes cluster has two independent orchestration layers to manage.
Obviously, each layer is orchestrating a different thing, so, in theory, this shouldnt cause problems. In practice, it often does. For example, imagine you have a failed node and both the VM-level orchestrator and the Kubernetes orchestrator are trying to recover from the failure at the same time. This can lead to your orchestrators working at cross purposes because the VM orchestrator is trying to stand up a server that crashed, while Kubernetes is trying to move pods to different nodes.
Similarly, if Kubernetes reports that a node has failed but that node is a VM, you have to figure out whether the VM actually failed or the VM orchestrator simply removed it for some reason. This adds operational complexity, as you have more variables to work through.
You dont have these issues with Kubernetes on bare metal server nodes. Your nodes are either fully up or theyre not, and there are no orchestrators competing for the nodes attention.
Another key advantage of running Kubernetes on bare metal is that you always know exactly what youre getting in a node. You have full visibility into the physical state of the hardware. For example, you can use diagnostics tools like SMART to assess the health of hard disks.
VMs dont give you much insight about the physical infrastructure upon which your Kubernetes clusters depend. You have no idea how old the disk drives are, or even how much physical memory or CPU cores exist on the physical servers. Youre only aware of VMs virtual resources. This makes it harder to troubleshoot issues, contributing again to operational complexity.
For related reasons, bare metal takes the cake when it comes to capacity planning and rightsizing.
There are a fair number of nuances to consider on this front. Bare metal and virtualized infrastructure support capacity planning differently, and there are various tools and strategies for rightsizing everything.
But at the end of the day, its easier to get things exactly right when planning bare metal capacity. The reason is simple enough: With bare metal, you can manage resource allocation at the pod level using cgroups in a hyper-efficient, hyper-reliable way. Using tools like the Kubernetes vertical autoscaler, you can divvy up resources down to the millicore based on the total available resources of each physical server.
Thats a luxury you dont get with VMs. Instead, you get a much cruder level of capacity planning because the resources that can be allocated to pods are contingent on the resource allocations you make to the VMs. You can still use cgroups, of course, but youll be doing it within a VM that doesnt know what resources exist on the underlying server. It only knows what it has been allocated.
You end up having to oversize your VMs to account for unpredictable changes in workload demand. As a result, your pods dont use resources as efficiently, and a fair amount of the resources on your physical server will likely end up sitting idle much of the time.
Another factor that should influence your decision to run Kubernetes on bare metal versus VMs is network performance. Its a complex topic, but essentially, bare metal means less abstraction of the network, which leads to better network performance.
To dig a level deeper, consider that with virtual nodes you have two separate kernel networking stacks per node: one for the VMs and another for the physical hosts. There are various techniques for negotiating traffic between the two stacks (packet encapsulation, NAT and so on), and some are more efficient than others (hint: NAT is not efficient at all). But at the end of the day, they each require some kind of performance hit. They also add a great deal of complexity to network management and observability.
Running on bare metal, where you have just one networking stack to worry about, you dont waste resources moving packets between physical and virtual machines, and there are fewer variables to sort through when managing or optimizing the network.
Granted, managing the various networks that exist within Kubernetes, and this partially depends on the container network interface (CNI) you use, does add some overhead. But its minor compared to the overhead that comes with full-on virtualization.
As Ive already implied, the decision between Kubernetes on bare metal and Kubernetes on VMs affects the engineers who manage your clusters.
Put simply, bare metal makes operations and hence your engineers lives simpler in most ways. Beyond the fact that there are fewer layers and moving parts to worry about, a bare-metal environment reduces the constraints under which your team works. They dont have to remember that VMs only support X, Y and Z configurations or puzzle over whether a particular version of libvirt supports a feature they need.
Instead, they simply deploy the operating system and packages and get to work. Its easier to set up a cluster, and its much easier to manage operations for it over the long term when youre dealing solely with bare metal.
Let me make clear that I do believe there are situations where running Kubernetes on VMs makes sense.
One scenario is when youre setting up small-scale staging environments, where performance optimization is not super important. Getting the most from every millicore is not usually a priority for this type of use case.
Another situation is when you work in an organization that is already very heavily wedded to virtualized infrastructure or particular virtualization vendors. In this case, running nodes as VMs simply poses less of a bureaucratic headache. Or maybe there are logistical challenges with acquiring and setting up bare metal servers. If you can self-service some VMs in a few minutes, versus taking months to get physical servers, just use the VMs if it suits your timeline better. Your organization may also be wedded to a managed Kubernetes platform offered by a cloud provider that only runs containers on VMs. Anthos, Google Clouds managed hybrid multicloud Kubernetes offering, supports bare-metal deployments, and so does Red Hats OpenShift. AWSs EKS Anywhere bare metal support is coming later this year.
In general, you should never let a dependency on VMs stop you from using Kubernetes. Its better to take advantage of cloud native technology than to be stuck in the past because you cant have the optimal infrastructure.
VMs clearly have a place in many Kubernetes clusters, and that will probably never change. But when it comes to questions like performance optimization, streamlining capacity management or reducing operational complexity, Kubernetes on bare metal comes out ahead.
Feature image via Pixabay
Read the original post:
Kubernetes on Bare Metal vs. VMs: It's Not Just Performance The New Stack - thenewstack.io
- Setting up a Virtual Server on Ninefold - Video [Last Updated On: February 26th, 2012] [Originally Added On: February 26th, 2012]
- ScaleXtreme Automates Cloud-Based Patch Management For Virtual, Physical Servers [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Secure Cloud Computing Software manages IT resources. [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Dell unveils new servers, says not a PC company [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Wyse to Launch Client Infrastructure Management Software as a Service, Enabling Simple and Secure Management of Any ... [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- As the App Culture Builds, Dell Accelerates its Shift to Services with New Line of Servers, Flash Capabilities [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Terraria - Cloud In A Ballon - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- Ethernet Alliance Interoperability Demo Showcases High-Speed Cloud Connections [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- RSA and Zscaler Teaming Up to Deliver Trusted Access for Cloud Computing [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- [NEC Report from MWC2012] NEC-Cloud-Marketplace - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- IBM SmartCloud Virtualized Server Recovery - Video [Last Updated On: February 28th, 2012] [Originally Added On: February 28th, 2012]
- BeyondTrust Launches PowerBroker Servers Windows Edition [Last Updated On: February 29th, 2012] [Originally Added On: February 29th, 2012]
- Ericsson joins OpenStack cloud infrastructure community [Last Updated On: February 29th, 2012] [Originally Added On: February 29th, 2012]
- ScaleXtreme Cloud-Based Patch Management Open for New Customers [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- RootAxcess - Getting Started - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- How to Create a Terraria Server 1.1.2 (All Links Provided) - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- Dell #1 in Hyperscale Servers (Steve Cumings) - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- Managing SAP on Power Systems with Cloud technologies delivers superior IT economics - Video [Last Updated On: March 1st, 2012] [Originally Added On: March 1st, 2012]
- AMD Acquires Cloud Server Maker SeaMicro for $334M USD [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- Web Host 1&1 Provides More Flexibility with Dynamic Cloud Server [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- Leap Day brings down Microsoft's Azure cloud service [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- RightMobileApps White Label Program - Video [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- bzst server ban #2 - Video [Last Updated On: March 3rd, 2012] [Originally Added On: March 3rd, 2012]
- “Cloud storage served from an array would cost $2 a gigabyte” [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- More Flexibility with the 1&1 Dynamic Cloud Server [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Hub’s future jobs may be in cloud [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Cloud computing growing jobs, says Microsoft [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- TurnKey Internet Launches WebMatrix, a New Application in Partnership with Microsoft [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Cebit 2012: SAP Cloud Computing Strategy - Introduction - Video [Last Updated On: March 6th, 2012] [Originally Added On: March 6th, 2012]
- Dome9 Security Launches Industry's First Free Cloud Security for Unlimited Number of Servers [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Servers Are Refreshed With Intel's New E5 Chips [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Samsung's AllShare Play pushes pictures from phone to cloud and TV [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Google drops the price of Cloud Storage service [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- New Intel Server Technology: Powering the Cloud to Handle 15 Billion Connected Devices [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Swisscom IT Services Launches Cloud Storage Services Powered by CTERA Networks [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- KineticD Releases Suite of Cloud Backup Offerings for SMBs [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- First Look: Samsung Allshare Play - Video [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- Bill The Server Guy Introduces the New Intel XEON e5-2600 (Romley) Server CPU's - Video [Last Updated On: March 7th, 2012] [Originally Added On: March 7th, 2012]
- New Cisco servers have Intel Xeon E5 inside [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- Cisco rolls out UCS servers with Intel Xeon E5 chips [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- From scooters to servers: The best of Launch, Day One [Last Updated On: March 8th, 2012] [Originally Added On: March 8th, 2012]
- Computer Basics: What is the Cloud? - Video [Last Updated On: March 9th, 2012] [Originally Added On: March 9th, 2012]
- Could the digital 'cloud' crash? [Last Updated On: March 10th, 2012] [Originally Added On: March 10th, 2012]
- Dome9 Security Launches Free Cloud Security For Unlimited Number Of Servers [Last Updated On: March 10th, 2012] [Originally Added On: March 10th, 2012]
- Cloud computing 'made in Germany' stirs debate at CeBIT [Last Updated On: March 11th, 2012] [Originally Added On: March 11th, 2012]
- New Key Technology Simplifies Data Encryption in the Cloud [Last Updated On: March 11th, 2012] [Originally Added On: March 11th, 2012]
- Can a private cloud drive energy efficiency in datacentres? [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Porticor's new key technology simplifies data encryption in the cloud [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Borders + Gratehouse Adds Three New Clients in Cloud Sector [Last Updated On: March 12th, 2012] [Originally Added On: March 12th, 2012]
- Dell to invest $700 mn in R&D, unveils 12G servers [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Defiant Kaleidescape To Keep Shipping Movie Servers [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Data Centre Transformation Master Class 3: Cloud Architecture - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- DotNetNuke Tutorial - Great hosting tool - PowerDNN Control Suite - part 1/3 - Video #310 - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Cloud Computing - 28/02/12 - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- SYS-CON.tv @ 9th Cloud Expo | Nand Mulchandani, CEO and Co-Founder of ScaleXtreme - Video [Last Updated On: March 13th, 2012] [Originally Added On: March 13th, 2012]
- Oni Launches New Cloud Services for Enterprises Using CA Technologies Cloud Platform [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- SmartStyle Advanced Technology - Video [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- SmartStyle Infrastructure - Video [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- The Hidden Risk of a Meltdown in the Cloud [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- FireHost Launches Secure Cloud Data Center in Phoenix, Arizona [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- Panda Security Launches New Channel Partner Recruitment Campaign: "Security to the Power of the Cloud" [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- NetSTAR, Inc. Announces Safe and Secure Web Browsers for iPhones, iPads, and Android Devices [Last Updated On: March 14th, 2012] [Originally Added On: March 14th, 2012]
- Amazon Cloud Powered by 'Almost 500,000 Servers' [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- NetSTAR Announces Secure Web Browsers For iPhones, iPads, And Android Devices [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Be Prepared For When the Cloud Really Fails [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Dr. Cloud explains dinCloud's hosted virtual server solution - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- New estimate pegs Amazon's cloud at nearly half a million servers [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Amazon’s Web Services Uses 450K Servers [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Saving File On Internet - Cloud Computing - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- DotNetNuke Tutorial - Great hosting tool - PowerDNN Control Suite - part 2/3 - Video #311 - Video [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Linux servers keep growing, Windows & Unix keep shrinking [Last Updated On: March 15th, 2012] [Originally Added On: March 15th, 2012]
- Cloud Desktop from Compute Blocks - Video [Last Updated On: March 16th, 2012] [Originally Added On: March 16th, 2012]
- Amazon EC2 cloud is made up of almost half-a-million Linux servers [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- HP trots out new line of “self-sufficient” servers [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Cloud Web Hosting Reviews - Australian Cloud Hosting Providers - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Using Porticor to protect data in a snapshot scenario in AWS - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- CDW - Charles Barkley - New Office - Video [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Nearly a Half Million Servers May Power Amazon Cloud [Last Updated On: March 17th, 2012] [Originally Added On: March 17th, 2012]
- Morphlabs CEO Winston Damarillo talks about their mCloud Rack - Video [Last Updated On: March 20th, 2012] [Originally Added On: March 20th, 2012]
- AMD reaches for the cloud with new server chips [Last Updated On: March 20th, 2012] [Originally Added On: March 20th, 2012]