K8s/Kubernetes: AWS vs. GCP vs. Azure

Kubernetes (“K8s”) won the battle of container orchestration tools. Now AWS, Azure, and Google Cloud each offer a managed Kubernetes version. How do they compare?

Kubernetes (often stylized “K8s”) won the battle of container orchestration tools years ago. Nevertheless, there are still many ways to implement Kubernetes today and make it work with various infrastructures, and many tools—some better maintained than others. Perhaps the most interesting development on that front, though, is that the top cloud providers have decided to release their own managed Kubernetes versions:

  • Microsoft Azure offers the Azure Kubernetes Service (AKS)
  • AWS offers the Amazon Elastic Kubernetes Service (EKS)
  • Google Cloud offers the Google Kubernetes Engine (GKE)

From a DevOps perspective, what do these platforms offer? Do they live up to their promises? How do their creation time and other benchmarks compare? How well do they integrate with their respective platforms, especially their CLI tools? What’s it like maintaining and working with them? Below, we’ll delve into these questions, and more.

Note: For readers who would like the concepts of a Kubernetes cluster explained before they read on, Dmitriy Kononov offers an excellent introduction.

AKS vs. EKS vs. GKE: Advertised Features

We’ve decided to group the different features available for each managed Kubernetes version into silos:

  • Global Overview
  • Networking
  • Scalability and Performance
  • Security and Monitoring
  • Ecosystem
  • Pricing

Note: These details may change over time as cloud providers regularly update their products.

Global Overview

Aspect | AKS | EKS | GKE
Year Released | 2017 | 2018 | 2014
Latest Version | 1.15.11 (default) - 1.18.2 (preview) | 1.16.8 (default) | 1.14.10 (default) - 1.16.9
Specific Components | oms-agent, tunnelfront | aws-node | fluentd, fluentd-gcp-scaler, event-exporter, l7-default-backend
Kubernetes Control Plane Upgrade | Manual | Manual | Automated (default) or manual
Worker Upgrades | Manual | Yes (easy with managed node groups) | Yes: automated and manual, fine-tuning possible
SLA | 99.95 percent with availability zone, 99.9 percent without | 99.9 percent for EKS (master), 99.99 percent for EC2 (nodes) | 99.95 percent within a region, 99.5 percent within a zone
Native Knative Support | No | No | No (but native Istio install)
Kubernetes Control Plane Price | Free | $0.10/hour | $0.10/hour

Kubernetes itself was Google’s project, so it makes sense that they were the first to propose a hosted version in 2014.

Of the three being compared here, Azure was next with AKS and has had some time to improve: If you remember acs-engine, which had been used to provision Kubernetes on Azure a few years ago, you will appreciate Microsoft’s effort on its replacement, aks-engine.

AWS was the last one to roll out its own version, EKS, so it can sometimes appear to be behind on the feature front, but it is catching up.

In terms of pricing, of course, things are always moving, and Google decided to join AWS at its price point of $0.10/hour, effective June 2020. Azure is the outlier here, offering the AKS service for free, but it’s unclear how long that may last.
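
To put the $0.10/hour control plane fee in monthly terms, here is a small arithmetic sketch. The 730-hour month (8,760 hours divided by 12) and the function name are assumptions made for illustration; the hourly rates are the ones quoted in this article.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8,760 hours / 12


def monthly_control_plane_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the monthly control plane fee for one cluster, in dollars."""
    return round(hourly_rate * hours, 2)


# Hourly control plane rates as quoted in this article (June 2020).
rates = {"AKS": 0.00, "EKS": 0.10, "GKE": 0.10}
monthly = {name: monthly_control_plane_cost(rate) for name, rate in rates.items()}

for name, cost in monthly.items():
    print(f"{name}: ${cost:.2f} per cluster per month")
```

The fee applies per cluster, so it matters most for teams running many small clusters rather than a few large ones.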

Another main difference lies in the upgrade feature of the cluster. The most automated upgrades are in GKE, and they are turned on by default. AKS and EKS, however, are similar to each other here, in the sense that both require manual requests to upgrade the master or worker nodes.
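
The SLA percentages in the table above are easier to compare once converted into allowed downtime. The sketch below does that conversion; the 30-day month is an assumption, and each provider's SLA document defines its own measurement period and exclusions.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Minutes per month a service may be down while still meeting its SLA."""
    return round(MINUTES_PER_MONTH * (1 - sla_percent / 100), 1)


# The SLA levels that appear in the comparison table.
for sla in (99.5, 99.9, 99.95, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla)} minutes per month")
```

Read this way, the gap between a zonal and a regional GKE cluster (99.5 versus 99.95 percent) is a tenfold difference in tolerated downtime.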

Networking

Aspect | AKS | EKS | GKE
Network Policies | Yes: Azure Network Policies or Calico | Need to install Calico | Yes: Native via Calico
Load Balancing | Basic or standard SKU load balancer | Classic and network load balancer | Container-native load balancer
Service Mesh | None out of the box | AWS App Mesh (based on Envoy) | Istio (out of the box, but beta)
DNS Support | CoreDNS customization | CoreDNS + Route53 inside VPC | CoreDNS + Google Cloud DNS

On the network side of things, the three cloud providers are very close to each other. They all let customers implement network policies with Calico, for example. Concerning load balancing, they all implement their integration with their own load balancer resources and give engineers the choice of what to use.

The main difference found here is based on the added value of the service mesh. AKS does not support any service mesh out of the box (although engineers can manually install Istio). AWS has developed its own service mesh called App Mesh. Finally, Google has released its own integration with Istio (though still in beta) that customers can add directly when creating the cluster.

Best bet: GKE

Scalability and Performance

Aspect | AKS | EKS | GKE
Bare Metal Nodes | No | Yes | No
Max Nodes per Cluster | 1,000 | 1,000 | 5,000
High Availability Cluster | No | Yes for control plane, manual across AZ for workers | Yes via regional cluster, master and worker are replicated
Auto Scaling | Yes via cluster autoscaler | Yes via cluster autoscaler | Yes via cluster autoscaler
Vertical Pod Autoscaler | No | Yes | Yes
Node Pools | Yes | Yes | Yes
GPU Nodes | Yes | Yes | Yes
On-prem | Available via Azure Arc (beta) | No | GKE on-prem via Anthos GKE

Concerning GKE vs. AKS vs. EKS performance and scalability, GKE seems to be ahead. It supports the largest number of nodes (5,000) and offers extensive documentation on how to properly scale a cluster. All the features for high availability are available and are easy to fine-tune. What is more, Google recently released Anthos, a project to create an ecosystem around GKE and its functionalities; with Anthos, you can deploy GKE on-prem.

AWS does have a key advantage, though: It is the only one to allow bare-metal nodes to run your Kubernetes cluster.

As of June 2020, AKS lacks high availability for the master, which is an important aspect to consider. But, as always, that could soon change.

Best bet: GKE

Security and Monitoring

Aspect | AKS | EKS | GKE
App Secrets Encryption | No | Yes, possible via AWS KMS | Yes, possible via Cloud KMS
Compliance | HIPAA, SOC, ISO, PCI DSS | HIPAA, SOC, ISO, PCI DSS | HIPAA, SOC, ISO, PCI DSS
RBAC | Yes | Yes, and strong integration with IAM | Yes
Monitoring | Azure Monitor container health feature | Kubernetes control plane monitoring connected to CloudWatch, Container Insights Metrics for nodes | Kubernetes Engine Monitoring and integration with Prometheus

In terms of compliance, all three cloud providers are equivalent. However, in terms of security, EKS and GKE provide another layer of security with their embedded key management services.

As for monitoring, Azure and Google Cloud provide their own monitoring ecosystem around Kubernetes. It’s worth noting that the one from Google has been recently updated to use Kubernetes Engine Monitoring, which is specifically designed for Kubernetes.

Azure provides its own container monitoring system, which was originally made for a basic, non-Kubernetes container ecosystem. They’ve added monitoring for some Kubernetes-specific metrics and resources (cluster health, deployments)—in preview mode, as of June 2020.

AWS offers lightweight monitoring for the control plane directly in CloudWatch. To monitor the workers, you can use Kubernetes Container Insights Metrics, provided via a specific CloudWatch agent you can install in the cluster.

Best bet: GKE

Ecosystem

Aspect | AKS | EKS | GKE
Marketplace | Azure Marketplace (but no clear AKS integration) | AWS Marketplace (250+ apps) | Google Marketplace (90+ apps)
Infrastructure-as-Code (IaC) Support | Terraform module, Ansible module | Terraform module, Ansible module | Terraform module, Ansible module
Documentation | Weak but complete and strong community (2,000+ Stack Overflow posts) | Not very thorough but strong community (1,500+ Stack Overflow posts) | Extensive official documentation and very strong community (4,000+ Stack Overflow posts)
CLI Support | Complete | Complete, plus special separate tool eksctl (covered below) | Complete

In terms of ecosystems, the three providers have different strengths and assets. AKS now has very complete documentation around its platform and is second in terms of posts on Stack Overflow. EKS has the fewest posts on Stack Overflow, but benefits from the strength of the AWS Marketplace. GKE, as the oldest platform, has the most posts on Stack Overflow, the most comprehensive documentation, and a decent number of apps on its marketplace.

Best bets: GKE and EKS

Pricing

Aspect | AKS | EKS | GKE
Free Usage Cap | $170 worth | Not eligible for free tier | $300 worth
Kubernetes Control Plane Cost | Free | $0.10/hour | $0.10/hour (June 2020)
Reduced Price (Spot Instance/Preemptible Nodes) | Yes | Yes | Yes
Example Price for One Month | $342 (3 D2 nodes) | $300 (3 t3.large nodes) | $190 (3 n1-standard-2 nodes)

Concerning the price overall, even with GKE’s move to implement the $0.10/hour price point for any cluster, it remains by far the cheapest cloud. This is thanks to something specific to Google—sustained use discounts, which are applied whenever the monthly usage of on-demand resources meets a certain minimum.
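
As a rough illustration of how sustained use discounts accumulate, the sketch below models them as tiers: each successive quarter of the month is billed at a lower fraction of the base rate. The tier values (100, 80, 60, and 40 percent) are an assumption based on how Google has described the discount for N1 machine types; they are not taken from this article and should be checked against current Google Cloud pricing.

```python
# ASSUMED tiers for N1 machine types: (share of the month, billing multiplier).
# These values are illustrative; verify them against current GCP pricing.
TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]


def effective_rate_multiplier(usage_fraction: float) -> float:
    """Fraction of the undiscounted bill paid for the given usage.

    usage_fraction is the share of the month the instance ran, in (0, 1].
    """
    if not 0 < usage_fraction <= 1:
        raise ValueError("usage_fraction must be in (0, 1]")
    remaining = usage_fraction
    billed = 0.0
    for width, multiplier in TIERS:
        used = min(remaining, width)  # usage that falls into this tier
        billed += used * multiplier
        remaining -= used
        if remaining <= 0:
            break
    return billed / usage_fraction


for usage in (0.25, 0.5, 1.0):
    discount = 1 - effective_rate_multiplier(usage)
    print(f"running {usage:.0%} of the month -> {discount:.0%} discount")
```

Under these assumed tiers, a node that runs all month costs about 30 percent less than its list price, with no commitment required.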

It is important to note that the example price row doesn’t take into account the traffic to the Kubernetes cluster that the cloud provider can charge for.

The reason AWS doesn’t allow the use of their free tier to test an EKS cluster is that EKS requires bigger machines than the tX.micro tier, and EKS hourly pricing is not in the free tier.

Nevertheless, it can still be economical to test any of these managed Kubernetes options with a decent load using the spot/preemptible nodes of each cloud provider—that tactic will easily save 80 to 90 percent on the final price. (Of course, it is not recommended to run stateful production loads on such machines!)

Advertised Features and Google’s Advantage

When looking at the different advertised features online, it seems there is a correlation between how long the managed Kubernetes version has been on the market and the number of features. As mentioned, Google having been the initiator of the Kubernetes project seems to be an undeniable advantage, resulting in better and stronger integration with its own cloud platform.

But AKS and EKS are not to be underestimated as they mature; both can take advantage of their unique features. For example, AWS is the only one to have bare-metal node integration, and also boasts the highest number of applications in its marketplace.

Now that the advertised features for each Kubernetes offering are clear, let’s do a deeper dive with some hands-on tests.

Kubernetes: AWS vs. GCP vs. Azure in Practice

Advertising is one thing, but how do the different platforms compare when it comes to serving production loads? As a cloud engineer, I know the importance of how long it takes to spawn and to take down a cluster when enforcing infrastructure-as-code. But I also wanted to explore the possibilities of each CLI and comment on how easy (or not) each cloud provider makes it to spawn a cluster.

Cluster Creation User Experience

AKS

On AKS, spawning a cluster is similar to creating an instance in AWS. Just find the AKS menu and go through a succession of different menus. Once the config is validated, the cluster can be created, making it a two-step process. It’s very straightforward, and engineers can easily and quickly launch a cluster with the default settings.

EKS

Cluster creation is definitely more complex on EKS than on AKS. First of all, AWS by default requires a trip to IAM to create a new role for the Kubernetes control plane and assign the engineer to it. It is important to note as well that this cluster creation does not include the creation of the nodes, so the 11 minutes I measured on average covers only the master creation. The node group creation is another step for the administrator, again needing a role, this time for the workers, with three necessary policies to be created via the IAM control panel.

GKE

For me, the experience of creating a cluster manually is most pleasant on GKE. After finding the Kubernetes Engine in the Google Cloud Console, click to create a cluster. Different categories of settings appear in a menu on the left. Google will prepopulate the new cluster with an easily modifiable default node pool. Last but not least, GKE has the fastest cluster-spawning time, which brings us to the next table.

Time to Spawn a Cluster

Aspect | AKS | EKS | GKE
Size | 3 nodes (Ds2-v2), each having 2 vCPUs, 7 GB of RAM | 3 nodes t3.large | 3 nodes n1-standard-2
Time (m:ss) | Average 5:45 for a full cluster | 11:06 for master plus 2:40 for the node group (totalling 13:46 for a full cluster) | Average 2:42 for a full cluster

I performed these tests in the same region (Frankfurt, or West Europe in the case of AKS) to remove any possible impact of location on spawning time. I also tried to select the same node size for each cluster: three nodes, each having two vCPUs and seven or eight GB of memory, a standard size to run a small load on Kubernetes and start experimenting. I created each cluster three times to compute an average.

In these tests, GKE remained way ahead with a spawning time always under three minutes.
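
To make the gap concrete, the following sketch converts the measured m:ss averages to seconds, adds the two EKS steps together, and expresses each total relative to the fastest one. The helper names are illustrative; the timings are the averages reported above.

```python
def to_seconds(m_ss: str) -> int:
    """Convert an 'm:ss' string into a number of seconds."""
    minutes, seconds = m_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_m_ss(total_seconds: int) -> str:
    """Convert a number of seconds back into an 'm:ss' string."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


# Averages from the table; EKS is the master plus the node group.
spawn_times = {
    "AKS": to_seconds("5:45"),
    "EKS": to_seconds("11:06") + to_seconds("2:40"),
    "GKE": to_seconds("2:42"),
}

fastest = min(spawn_times.values())
for name, seconds in sorted(spawn_times.items(), key=lambda item: item[1]):
    print(f"{name}: {to_m_ss(seconds)} ({seconds / fastest:.1f}x the fastest)")
```

With only three runs per provider, these ratios are indicative rather than a rigorous benchmark.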

Kubernetes: AWS vs. GCP vs. Azure CLI Overview

Not all CLIs are created equal, but in this case, all three CLIs are actually modules of a larger CLI. What’s it like to get up and running with each cloud provider’s CLI toolchain?

AKS CLI (via az)

After installing the az tooling, then kubectl (via az aks install-cli), engineers need to authorize the CLI to communicate with the project’s Azure account. From there, it is a matter of getting the credentials to update the local kubeconfig file via a simple az aks get-credentials --resource-group myResourceGroup --name myAKSCluster.

Similarly, to create a cluster: az aks create --resource-group myResourceGroup --name myAKSCluster

EKS CLI (via aws or eksctl)

On AWS, we find a different approach—there are two different official CLI tools to manage EKS clusters. As always, aws can connect to AWS resources, particularly clusters. Getting credentials into a local kubeconfig can be done via: aws eks update-kubeconfig --name cluster-test.

However, engineers can also use eksctl, developed by Weaveworks and written in Go, to easily create and manage an EKS cluster. A major boon eksctl provides for cloud engineers is that they can combine it with YAML configuration files to create infrastructure-as-code (IaC), since it works with CloudFormation. It’s definitely an asset to consider when integrating an EKS cluster into larger infrastructure on AWS.

Creating a cluster via eksctl is as easy as eksctl create cluster, no other parameters required.
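
For the YAML-driven approach, a configuration file might look like the output of the sketch below. It is a minimal example under stated assumptions: the field names follow the eksctl ClusterConfig schema (eksctl.io/v1alpha5) as I understand it, the node group name is hypothetical, and the cluster name, region, and node size mirror the test setup described in this article. Verify the schema against the eksctl version in use.

```python
def render_cluster_config(name: str, region: str, instance_type: str, nodes: int) -> str:
    """Render a minimal eksctl ClusterConfig as YAML text."""
    lines = [
        "apiVersion: eksctl.io/v1alpha5",
        "kind: ClusterConfig",
        "metadata:",
        f"  name: {name}",
        f"  region: {region}",
        "nodeGroups:",
        "  - name: workers",  # hypothetical node group name
        f"    instanceType: {instance_type}",
        f"    desiredCapacity: {nodes}",
    ]
    return "\n".join(lines) + "\n"


# Values mirror the test setup: three t3.large nodes in Frankfurt.
config = render_cluster_config("cluster-test", "eu-central-1", "t3.large", 3)
print(config)
```

Saved as cluster.yaml, such a file can be passed to eksctl create cluster -f cluster.yaml and kept under version control alongside the rest of the infrastructure code.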

GKE CLI (via gcloud)

For GKE, the steps are very similar: Install gcloud, then authenticate via gcloud init. The possibilities from there: Engineers can create, delete, describe, get credentials for, resize, update, or upgrade a cluster, or list clusters.

The syntax to create a cluster with gcloud is straightforward: gcloud container clusters create myGCloudCluster --num-nodes=1
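
For a side-by-side view, the sketch below collects the three cluster creation commands quoted in this section and prints them without executing anything. It uses only the arguments shown above; the dictionary and function names are illustrative.

```python
import shlex

# Cluster creation commands exactly as quoted in this article.
CREATE_COMMANDS = {
    "AKS": ["az", "aks", "create",
            "--resource-group", "myResourceGroup", "--name", "myAKSCluster"],
    "EKS": ["eksctl", "create", "cluster"],
    "GKE": ["gcloud", "container", "clusters", "create",
            "myGCloudCluster", "--num-nodes=1"],
}


def dry_run(provider: str) -> str:
    """Return the shell command for a provider without running it."""
    return shlex.join(CREATE_COMMANDS[provider])


for provider in CREATE_COMMANDS:
    print(f"{provider}: {dry_run(provider)}")
```

Keeping the commands as argument lists means they could later be handed to subprocess.run without shell quoting issues, once the relevant CLI is installed and authenticated.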

AKS vs. EKS vs. GKE: Test Drive Results

In practice, we can see that GKE is certainly the fastest to spin up a basic cluster, in terms of both console simplicity and cluster spawn time. UX-wise, the connect button next to each cluster also makes GKE the most straightforward platform for connecting to a cluster.

In terms of CLI tooling, the three cloud providers have implemented similar functionalities; however, the extra tool provided by Weaveworks for EKS deserves a highlight. eksctl is well suited to implementing infrastructure-as-code on top of your preexisting AWS infrastructure, combining other services with EKS.

Managed Kubernetes Offerings Forge Ahead: AWS vs. GCP vs. Azure

For those just starting in the world of Kubernetes, the go-to implementation for me is GKE, since it’s the most straightforward. It’s easy to set up, it has a simple and fast UX for spawning, and it’s well-integrated into the Google Cloud Platform ecosystem.

Even though AWS was the last to join the race, it has a few undeniable advantages, such as bare metal nodes and the simple fact that it’s integrated with the provider with the largest mind-share.

Finally, AKS has made great progress since its creation. Tooling and feature parity likely won’t take long to arrive, and there is room to innovate along the way. And as with any managed Kubernetes offering, for those already on the parent platform, integration will be a selling point.

Once a team has chosen a Kubernetes cloud provider, it could be interesting to look at other teams’ experiences, particularly failures. These post-mortems are a reflection of real-world cases—always a good starting point for developing one’s own cutting-edge best practices. I look forward to your comments below!

Understanding the basics

Container orchestration is the management and abstraction of all the resources revolving around running containers: configuration, resources, scaling, monitoring, networking, and tooling. Kubernetes is one of the most widely adopted container orchestration tools in the industry.

We need container orchestration to be able to efficiently manage and organize a fleet of containers running on servers. With container orchestration, we can build scalable, resilient, and powerful container-centric systems to deploy any application.

The benefit of using container orchestration with Kubernetes is to provide an abstraction layer on top of servers to run your containers. With Kubernetes, you are able to efficiently manage configuration and resources, and easily scale your infrastructure as needed.

Kubernetes is an open-source tool that has been developed based on Borg, a Google project. It is a production-grade container orchestration tool that creates a layer of abstraction on top of servers to allow the easy management of container scaling, monitoring, resource usage, networking, and configuration.

Comments

Matt Fox
What about OpenShift?
Kevin Bloch
Good question. There are many other CSPs offering managed Kubernetes services, not just OpenShift. But this article is meant to focus on the three largest cloud providers, rather than be exhaustive.
Oliver Schoenborn
Nice article, thanks for writing it! AWS monitoring via Prometheus is in beta. About this : "AWS offers lightweight monitoring for the control plane, but monitoring resources inside workers will require a third-party setup." AWS now provides Container Insights as part of Cloudwatch, easy to install and provides monitoring of metrics like CPU and memory etc for nodes, pods, namespaces. There is also container logs and log insights which are a lot like GCP's stackdriver IIRC. Although I find the logs as provided by API (kubectl or dashboard) to be easier for most tasks, but when you need to filter logs across several containers I could see it being useful to learn Insights' query language. Finally, I found CloudPosse's terraform modules super easy to use to provision an EKS cluster using IaC, even support AWS's integration with IAM which means you have powerful control over what pods can and can't do.
Tuncay Demirtepe
Thanks for the comparison, this is very useful. Could you add the Digital Ocean into your comparison. Probably the biggest difference from the "big three", their current version is 1.18.3, to my surprise it is even higher than GKE. Some comparison points: - Free Control Plane - Cost for a month (3 x 8 GB Ram + 4 cpu pod): $120
Jorge Galvis
Thanks for this article. Correct me if I'm wrong, but it looks like the first table got mixed values for the EKS and the GKE columns.
Kevin Bloch
Thanks for the suggestion. As mentioned above, this article is meant to focus on the three largest cloud providers, rather than be exhaustive.
Guillaume Dury
Hi Jorge, you are right, small hiccup for this column it is now fixed :) Thank you for the comment!
Guillaume Dury
Hi Oliver, Thank you for the comment. Indeed AWS provides Container Insights to monitor your workers and its resources. I've updated the table accordingly to mention this solution from AWS. I didn't know about the CloudPosse's terraform modules that sounds interesting, will check it out! Cheers