Using the Kubernetes Engine to manage ML deployments | C2C Community

  • 31 July 2022

DevOps practices run continuously on Kubernetes Engine throughout the development of AI/ML applications. To handle deployment scenarios such as continuous deployment, blue-green deployments, and canary deployments, DevOps workflows frequently require multiple heterogeneous deployments. In this article, I'll go over some fundamental container scaling and management techniques so you can complete these common tasks when working with several heterogeneous deployments.

What I will discuss:

  • Kubernetes Engine kubectl tool commands
  • Creating and managing deployment YAML files
  • How to create, update, and scale deployments
  • Updating deployments and deployment styles

I assume you have some basic knowledge of Kubernetes Engine and DevOps theory.


About Heterogeneous deployments


Heterogeneous deployments usually involve the connection of two or more distinct infrastructure environments or regions to respond to a specific technical or operational need. Heterogeneous deployments are known as "hybrid", "multi-cloud", or "public-private", depending on the specifics of the deployment.

In this context, "heterogeneous deployments" include those that span regions within a single cloud environment, those that span multiple public cloud environments (multi-cloud), and those that combine on-premises and public cloud environments (hybrid or public-private).

A variety of business and technical challenges may arise when deployments are confined to a single environment or region:

  • Maxed out resources: In any single environment, particularly on-premises, you might not have the compute, networking, and storage resources necessary to meet your production needs.
  • Limited geographic reach: A deployment in a single environment forces users who are geographically distant from each other to access the same location, so their traffic may travel around the world to reach it.
  • Limited availability: Web-scale traffic patterns challenge applications to remain fault-tolerant and resilient.
  • Vendor lock-in: Vendor-level platform and infrastructure abstractions can prevent you from porting applications.
  • Inflexible resources: Your resources may be limited to a specific set of compute, storage, or networking offerings.

Heterogeneous deployments can help overcome these challenges, but they need to be structured through programmatic and deterministic processes and procedures.

Ad hoc or one-off deployment procedures can make deployments brittle and intolerant of failure. They can also cause data loss or dropped traffic.

Good deployment processes should be repeatable and use proven approaches for managing provisioning, configuration, and maintenance.

Three common scenarios for heterogeneous deployment are multi-cloud deployments, fronting on-premises data, and continuous integration/continuous delivery (CI/CD) processes.

The following exercises walk through some common use cases for heterogeneous deployments, along with well-architected approaches that use Kubernetes and other infrastructure resources to achieve them.

Configure your project based on your requirements:


In this discussion, the example application name is ctwoc.

The machine type and zone are used for example purposes only.

Set your working Google Cloud zone by running the following command, which uses us-central1-a as the example zone:

gcloud config set compute/zone us-central1-a

Get the sample code for creating and running containers and deployments:

gsutil -m cp -r gs://source-code-location .
cd change-your-working-directory

Create a cluster with five n1-standard-1 nodes (this will take a few minutes to complete):

gcloud container clusters create bootcamp --machine-type n1-standard-1 --num-nodes 5 --scopes "storage-rw"

The deployment object


Let's get started with Deployments. First let's take a look at the Deployment object. The explain command in kubectl can tell us about the Deployment object.

kubectl explain deployment

We can also see all of the fields using the --recursive option.

kubectl explain deployment --recursive

You can use the explain command to help you understand the structure of a Deployment object and what the individual fields do. For example, to describe a single field:

kubectl explain deployment.metadata.name

Create a deployment

You can use whichever editor is installed in your environment. Here I use the vi editor.

Update the deployments/auth.yaml configuration file:

vi deployments/auth.yaml

Enter insert mode:

i

Change the image in the containers section of the Deployment to the following:

- name: auth
  image: "exampleimagename/auth:1.0.0"

Save the auth.yaml file: press <Esc>, then type:

:wq

Press <Enter>. Now let's create a simple deployment. Examine the deployment configuration file:

cat deployments/auth.yaml

Notice how the Deployment is creating one replica and it's using version 1.0.0 of the auth container.

When you run the kubectl create command to create the auth deployment, it will make one pod that conforms to the data in the Deployment manifest. This means we can scale the number of Pods by changing the number specified in the replicas field.
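For reference, a manifest along the following lines would match that description. This is only a sketch: the app: auth label and the container port are assumptions, and the actual deployments/auth.yaml in the sample code may differ.

```yaml
# Hypothetical sketch of deployments/auth.yaml.
# The label "app: auth" and port 80 are assumptions, not taken from the sample code.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 1                 # one Pod; change this number to scale
  selector:
    matchLabels:
      app: auth               # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
        - name: auth
          image: "exampleimagename/auth:1.0.0"
          ports:
            - containerPort: 80   # assumed port
```

The selector and the template labels have to agree; otherwise the API server rejects the Deployment.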

Go ahead and create your deployment object using kubectl create:

kubectl create -f deployments/auth.yaml

Once you have created the Deployment, you can verify that it was created.

kubectl get deployments

Once the deployment is created, Kubernetes will create a ReplicaSet for the Deployment. We can verify that a ReplicaSet was created for our Deployment:

kubectl get replicasets

We should see a ReplicaSet with a name like auth-xxxxxxx

Finally, we can view the Pods that were created as part of our Deployment. Kubernetes creates the single Pod when the ReplicaSet is created.


kubectl get pods

It's time to create a service for our auth deployment. You've already seen service manifest files, so we won't go into the details here. Use the kubectl create command to create the auth service.


kubectl create -f services/auth.yaml
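As a reminder of the general shape, a Service for the auth Deployment could look roughly like this. The selector label and port numbers are assumptions made for illustration; the real services/auth.yaml may use different values.

```yaml
# Hypothetical sketch of services/auth.yaml.
# "app: auth" and port 80 are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: auth
spec:
  selector:
    app: auth          # routes traffic to Pods carrying this label
  ports:
    - protocol: TCP
      port: 80         # port exposed by the Service
      targetPort: 80   # port the container listens on
```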

Now, do the same thing to create and expose the ctwoc Deployment.

kubectl create -f deployments/ctwoc.yaml
kubectl create -f services/ctwoc.yaml

And one more time to create and expose the frontend Deployment.

kubectl create secret generic tls-certs --from-file tls/
kubectl create configmap nginx-frontend-conf --from-file=nginx/frontend.conf
kubectl create -f deployments/frontend.yaml
kubectl create -f services/frontend.yaml

Note: You created a Secret (tls-certs) and a ConfigMap (nginx-frontend-conf) for the frontend.

Interact with the frontend by grabbing its external IP and then curling to it.

kubectl get services frontend

It may take a few seconds before the External-IP field is populated for your service. This is normal. Just re-run the above command every few seconds until the field is populated.
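An external IP is typically provisioned when a Service has type LoadBalancer, which is why the field takes a moment to fill in. Assuming services/frontend.yaml is defined that way, it might look roughly like the sketch below; the label and the HTTPS port are assumptions.

```yaml
# Hypothetical sketch of services/frontend.yaml.
# "app: frontend" and port 443 are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: LoadBalancer   # asks the cloud provider for an external load balancer
  selector:
    app: frontend
  ports:
    - protocol: TCP
      port: 443
      targetPort: 443
```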

curl -ks https://<EXTERNAL-IP>

And you get the ctwoc response back.

You can also use the output templating feature of kubectl to use curl as a one-liner:

curl -ks https://`kubectl get svc frontend -o=jsonpath="{.status.loadBalancer.ingress[0].ip}"`

Scale a Deployment

Now that we have a Deployment created, we can scale it. Do this by updating the spec.replicas field. You can look at an explanation of this field using the kubectl explain command again.

kubectl explain deployment.spec.replicas

The replicas field can be most easily updated using the kubectl scale command:

kubectl scale deployment ctwoc --replicas=5

Note: It may take a minute or so for all the new pods to start up.

After the Deployment is updated, Kubernetes will automatically update the associated ReplicaSet and start new Pods to make the total number of Pods equal 5.
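The same change can also be made declaratively by editing the manifest and re-applying it with kubectl apply -f. The fragment below assumes the Deployment is defined in deployments/ctwoc.yaml and shows only the field that changes.

```yaml
# Fragment of a Deployment manifest (assumed to be deployments/ctwoc.yaml).
spec:
  replicas: 5   # desired number of Pods for this Deployment
```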

Verify that there are now 5 ctwoc Pods running:

kubectl get pods | grep ctwoc- | wc -l

Now scale back the application:

kubectl scale deployment ctwoc --replicas=3

Again, verify that you have the correct number of Pods.

kubectl get pods | grep ctwoc- | wc -l

So far, I have explained Kubernetes Deployments and how to manage and scale a group of Pods.


Rolling update


Deployments support updating images to a new version through a rolling update mechanism. When a Deployment is updated with a new version, it creates a new ReplicaSet and slowly increases the number of replicas in the new ReplicaSet as it decreases the replicas in the old ReplicaSet.
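How quickly the old ReplicaSet shrinks and the new one grows is governed by the Deployment's spec.strategy field. The values in this fragment are illustrative assumptions rather than settings from the sample code; when they are omitted, Kubernetes defaults both maxSurge and maxUnavailable to 25%.

```yaml
# Fragment of a Deployment manifest; the numbers are illustrative assumptions.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # at most one extra Pod above the desired count
      maxUnavailable: 0   # never drop below the desired count during the update
```

With these values the update adds one new Pod at a time and removes an old Pod only after the new one is ready.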

Trigger a rolling update

To update your Deployment, run the following command:

kubectl edit deployment ctwoc

Change the image in the containers section of the Deployment to the following:

image: exampleimagename/ctwoc:2.0.0

Save and exit.

Once you save out of the editor, the updated Deployment will be saved to your cluster and Kubernetes will begin a rolling update.

See the new ReplicaSet that Kubernetes creates:

kubectl get replicaset

You can also see a new entry in the rollout history:

kubectl rollout history deployment/ctwoc

Pause a rolling update

If you detect problems with a running rollout, pause it to stop the update. Give that a try now:

kubectl rollout pause deployment/ctwoc

Verify the current state of the rollout:

kubectl rollout status deployment/ctwoc

You can also verify this on the Pods directly:

kubectl get pods -o jsonpath --template='{range .items[*]}{.metadata.name}{"\t"}{"\t"}{.spec.containers[0].image}{"\n"}{end}'

Resume a rolling update

The rollout is paused which means that some pods are at the new version and some pods are at the older version. We can continue the rollout using the resume command.

kubectl rollout resume deployment/ctwoc

When the rollout is complete, the status command should report that the deployment has been successfully rolled out.

kubectl rollout status deployment/ctwoc

Rollback an update

Assume that a bug was detected in your new version. Since the new version is presumed to have problems, any users connected to the new Pods will experience those issues.

You will want to roll back to the previous version so you can investigate and then release a version that is fixed properly.

Use the rollout command to roll back to the previous version:

kubectl rollout undo deployment/ctwoc

Verify the rollback in the history:

kubectl rollout history deployment/ctwoc

Finally, verify that all the Pods have rolled back to their previous versions:

kubectl get pods -o jsonpath --template='{range .items[*]}{.metadata.name}{"\t"}{"\t"}{.spec.containers[0].image}{"\n"}{end}'

Great! You learned about rolling updates for Kubernetes deployments and how to update applications without downtime.

Canary deployments


When you want to test a new deployment in production with a subset of your users, use a canary deployment. Canary deployments allow you to release a change to a small subset of your users to mitigate risk associated with new releases.

Create a canary deployment

A canary deployment consists of a separate deployment with your new version and a service that targets both your normal, stable deployment as well as your canary deployment.

First, examine the manifest for the new canary deployment:

cat deployments/ctwoc-canary.yaml
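The file's contents are not reproduced in this article. A canary manifest could look roughly like the sketch below, where the track and version labels are assumptions introduced to keep the canary's Pods distinguishable from the stable ones.

```yaml
# Hypothetical sketch of deployments/ctwoc-canary.yaml.
# The "track" and "version" labels and the replica count are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ctwoc-canary
spec:
  replicas: 1                 # keep the canary small relative to the stable deployment
  selector:
    matchLabels:
      app: ctwoc
      track: canary           # assumed label that separates canary Pods from stable Pods
  template:
    metadata:
      labels:
        app: ctwoc            # shared with the stable deployment
        track: canary
        version: "2.0.0"
    spec:
      containers:
        - name: ctwoc
          image: "exampleimagename/ctwoc:2.0.0"
```

For this to work cleanly, the stable deployment's selector should include a different track value (for example track: stable) so the two Deployments do not claim each other's Pods.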

Now create the canary deployment:

kubectl create -f deployments/ctwoc-canary.yaml

After the canary deployment is created, you should have two deployments, ctwoc and ctwoc-canary. Verify it with this kubectl command:

kubectl get deployments

On the ctwoc service, the selector uses app: ctwoc, which matches Pods in both the prod deployment and the canary deployment. However, because the canary deployment has fewer Pods, it is visible to fewer users.
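A Service with that selector could be sketched as follows. The port numbers are assumptions; the important detail is that the selector names only the shared app label.

```yaml
# Hypothetical sketch of services/ctwoc.yaml; ports are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: ctwoc
spec:
  selector:
    app: ctwoc         # no track or version label, so stable and canary Pods both match
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```

Because the Service spreads requests across all matching Pods, the canary's share of traffic roughly follows the Pod ratio. With 3 stable Pods and 1 canary Pod, about 1 request in 4 would reach the canary.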

Verify the canary deployment

You can verify the ctwoc version being served by the request:

curl -ks https://`kubectl get svc frontend -o=jsonpath="{.status.loadBalancer.ingress[0].ip}"`/version

Run this several times and you should see that most of the requests are served by ctwoc 1.0.0 and a small subset (1/4 = 25%) are served by 2.0.0.

Canary deployments in production - session affinity


In this discussion, each request sent to the Nginx service had a chance to be served by the canary deployment. But what if you wanted to ensure that a user didn't get served by the Canary deployment? A use case could be that the UI for an application changed, and you don't want to confuse the user. In a case like this, you want the user to "stick" to one deployment or the other.

You can do this by creating a service with session affinity. This way the same user will always be served from the same version. The service is the same as before, except that a new sessionAffinity field is added and set to ClientIP. All clients with the same IP address will have their requests sent to the same version of the ctwoc application.
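A minimal sketch of such a service is shown here. Only sessionAffinity: ClientIP comes from the description above; the port numbers are assumptions.

```yaml
# Sketch of the ctwoc Service with session affinity; ports are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: ctwoc
spec:
  sessionAffinity: ClientIP   # requests from one client IP keep going to the same Pod
  selector:
    app: ctwoc
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```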

Because it is difficult to set up an environment to test this, you don't need to do so here, but you may want to use sessionAffinity for canary deployments in production.


Blue-green deployments


Rolling updates are ideal because they allow you to deploy an application slowly with minimal overhead, minimal performance impact, and minimal downtime. However, there are instances where it is beneficial to modify the load balancers to point to the new version only after it has been fully deployed. In this case, blue-green deployments are the way to go.

In Kubernetes, you achieve this by creating two separate deployments: one for the old "blue" version and one for the new "green" version. Use your existing ctwoc deployment for the "blue" version. The deployments will be accessed via a Service, which acts as the router. Once the new "green" version is up and running, you'll switch over to using that version by updating the Service.


A major downside of blue-green deployments is that you will need to have at least 2x the resources in your cluster necessary to host your application. Make sure you have enough resources in your cluster before deploying both versions of the application at once.


The service

Use the existing ctwoc service, but update it so that its selector is app: ctwoc, version: 1.0.0. The selector will match the existing "blue" deployment, but it will not match the "green" deployment because that deployment uses a different version label.
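A sketch of what services/ctwoc-blue.yaml might contain is shown below. The selector values come from the description above, while the port numbers are assumptions.

```yaml
# Hypothetical sketch of services/ctwoc-blue.yaml; ports are assumed values.
apiVersion: v1
kind: Service
metadata:
  name: ctwoc
spec:
  selector:
    app: ctwoc
    version: "1.0.0"   # pins the Service to the "blue" Pods only
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```

Under the same assumptions, services/ctwoc-green.yaml would differ only in the version value, which would be "2.0.0".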

First update the service:

kubectl apply -f services/ctwoc-blue.yaml

NOTE: You can ignore the warning that resource service/ctwoc is missing the last-applied-configuration annotation; kubectl apply patches it automatically.

Updating using Blue-Green Deployment

In order to support a blue-green deployment style, we will create a new "green" deployment for our new version. The green deployment updates the version label and the image path.
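A green deployment along these lines would fit that description. The deployment name, replica count, and label layout are assumptions; the actual deployments/ctwoc-green.yaml in the sample code may differ.

```yaml
# Hypothetical sketch of deployments/ctwoc-green.yaml.
# The name "ctwoc-green" and the replica count are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ctwoc-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ctwoc
      version: "2.0.0"
  template:
    metadata:
      labels:
        app: ctwoc
        version: "2.0.0"      # updated version label, not matched by the blue Service
    spec:
      containers:
        - name: ctwoc
          image: "exampleimagename/ctwoc:2.0.0"   # updated image path
```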

Create the green deployment:

kubectl create -f deployments/ctwoc-green.yaml

Once you have a green deployment and it has started up properly, verify that the current version of 1.0.0 is still being used:

curl -ks https://`kubectl get svc frontend -o=jsonpath="{.status.loadBalancer.ingress[0].ip}"`/version

Now, update the service to point to the new version:

kubectl apply -f services/ctwoc-green.yaml

When the service is updated, the "green" deployment will be used immediately. You can now verify that the new version is always being used.

curl -ks https://`kubectl get svc frontend -o=jsonpath="{.status.loadBalancer.ingress[0].ip}"`/version

Blue-Green Rollback

If necessary, you can roll back to the old version in the same way. While the "blue" deployment is still running, just update the service back to the old version.

kubectl apply -f services/ctwoc-blue.yaml

Once you have updated the service, your rollback will have been successful. Again, verify that the right version is now being used:

curl -ks https://`kubectl get svc frontend -o=jsonpath="{.status.loadBalancer.ingress[0].ip}"`/version

We did it. I discussed blue-green deployments and how to deploy updates to applications that need to switch versions all at once.

4 replies

Great post @malamin!

Thanks for sharing it!

You’re welcome @ilias .

Another great post with a great in-depth analysis, @malamin. Thank you for posting! 


@hifce, @tharun, @Balamurugan nagappan, @urbanenomad, @UteHlasek, @royca, @skriesch, I remember you were all working with Kubernetes!

What do you think of @malamin’s article on managing ML deployments using the Kubernetes Engine? :)


Thank you so much, @Dimitris Petrakis, for considering this a worthwhile post.