Tanzu Kubernetes Releases are Kubernetes distributions that are signed and supported by VMware for Tanzu Kubernetes Clusters.
There are two ways to populate the Tanzu Kubernetes Releases (images) in your vSphere with Tanzu environment. These images are OVF templates backed by a Photon OS VM, and each is about 16 GB in size.
- Use a subscribed content library.
The subscribed content library will synchronize with a public VMware content library:
https://wp-content.vmware.com/v2/latest/lib.json
You can choose to download the images immediately or only when they are needed.
- Use a local content library.
With a local content library, you have to download the OVF templates from
https://wp-content.vmware.com/v2/latest/
and upload them to the content library yourself. This is ideal for a secure environment.
In my test environment, I configured a local content library and populated it with images that I downloaded as described in the second option above. When the content library holds the correct OVF templates with the correct names, they become available for consumption as Tanzu Kubernetes Releases when you create Tanzu Kubernetes Clusters.
In my lab, this worked flawlessly. However, I recently ran into an issue when trying to upgrade one of the Tanzu Kubernetes Clusters (TKC): the Tanzu Kubernetes Releases did not list the newly imported images. I also tried mapping a subscribed content library to the Supervisor Cluster namespace, and the synced releases were not populated either.
In the excerpt below, you can see the v1.20.2 image under virtualmachineimages, but there is no v1.20.2 release under Tanzu Kubernetes Releases:
root@debian:~# k get tanzukubernetesreleases
NAME VERSION READY COMPATIBLE CREATED UPDATES AVAILABLE
v1.19.7---vmware.1-tkg.2.f52f85a 1.19.7+vmware.1-tkg.2.f52f85a True True 55d
root@debian:~#
root@debian:~# k get virtualmachineimages
NAME VERSION OSTYPE FORMAT AGE
photon-3-k8s-v1.19.7---vmware.1-tkg.1.fc82c41 v1.19.7+vmware.1-tkg.2.f52f85a vmwarePhoton64Guest ovf 55d
photon-3-k8s-v1.20.2---vmware.1-tkg.2.3e10706 v1.20.2+vmware.1-tkg.2.3e10706 vmwarePhoton64Guest ovf 7h31m
slaxie ubuntu64Guest ovf 55d
slaxy ubuntu64Guest ovf 55d
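To spot this kind of mismatch without comparing the two lists by eye, a small shell helper can do the comparison. This is only a sketch: `missing_releases` is a hypothetical function name, and it assumes the column layout shown above, where the VERSION column of an image equals the VERSION column of its release with a leading `v`.

```shell
# missing_releases: hypothetical helper, not part of kubectl.
#   $1: file holding the output of `kubectl get virtualmachineimages`
#   $2: file holding the output of `kubectl get tanzukubernetesreleases`
# Prints every image VERSION that has no matching release.
missing_releases() {
  awk '
    # Releases file (first awk argument): remember each release version,
    # adding the "v" prefix that the image VERSION column carries.
    FILENAME == ARGV[1] { if (FNR > 1) seen["v" $2] = 1; next }
    # Images file: skip the header and any image whose second column is
    # not a Kubernetes version, then report versions with no release.
    FNR > 1 && $2 ~ /^v[0-9]/ && !($2 in seen) { print $2 }
  ' "$2" "$1"
}
```

Saving the two command outputs to files and passing them to the function should print `v1.20.2+vmware.1-tkg.2.3e10706` for the state shown above, since that image has no release yet.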
So how do the releases get populated? The Tanzu Kubernetes Grid Service takes care of populating the relevant releases when images are imported into the content library.
The following article by Michael West has some great content on troubleshooting the Tanzu Kubernetes Grid Service: https://core.vmware.com/blog/tanzu-kubernetes-grid-service-troubleshooting-deep-dive-part-1
I started troubleshooting the issue by looking at a few logs:
- Begin by opening an SSH session to vCenter Server
- Decrypt the password used to log in to the Supervisor control plane nodes
root@vcenter [ ~ ]# /usr/lib/vmware-wcp/decryptK8Pwd.py
Read key from file
Connected to PSQL
Cluster: domain-c8:dad7e875-3357-449b-809a-bf1783e3430d
IP: 172.16.0.201
PWD: zflRFNeKdd7F7RJJ1sa
------------------------------------------------------------
- Review TKG controller manager logs
root@422f78932163b6dc93e693a87e83d890 [ /var/log ]# k logs vmware-system-tkg-controller-manager-7f44566f46-bfp4k -n vmware-system-tkg -c manager
I1018 22:04:34.352715 1 main.go:72] entrypoint "msg"="Starting guest cluster controller" "buildnumber"="17831490" "buildtype"="dev" "version"="2.1"
I1018 22:04:34.352863 1 main.go:155] entrypoint "msg"="creating manager" "buildNumber"="17831490" "buildType"="dev" "options"={"podNamespace":"vmware-system-tkg","podName":"vmware-system-tkg-controller-manager","metricsAddr":"127.0.0.1:8087","cloudProvider":"external","guestClusterClientTimeout":10000000000,"nodeRemediationTimeout":3600000000000,"maxConcurrentReconciles":3,"syncPeriod":600000000000,"leaderElectionEnabled":true} "version"="2.1"
I1018 22:04:36.167604 1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen" "addr"="127.0.0.1:8087"
- Log in to the control plane node and look at the pods and deployments in the vmware-system-tkg namespace
root@vcenter [ ~ ]# ssh root@172.16.0.201
root@422f78932163b6dc93e693a87e83d890 [ /var/log/containers ]# k get pods -n vmware-system-tkg
NAME READY STATUS RESTARTS AGE
vmware-system-tkg-controller-manager-7f44566f46-748b4 0/2 NodeAffinity 0 80d
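A pod stuck in a state like NodeAffinity is easy to miss in a longer listing. As a rough sketch, the pod list can be filtered for anything that is not running. `unhealthy_pods` is a hypothetical helper name, and it assumes the default `kubectl get pods` column order shown above, with STATUS in the third column.

```shell
# unhealthy_pods: hypothetical helper, not part of kubectl.
# Reads `kubectl get pods` output on stdin and prints the name and status
# of every pod whose STATUS is neither Running nor Completed.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}
```

Piping `kubectl get pods -n vmware-system-tkg` into the function would list the controller manager pod together with its NodeAffinity status.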
Cause: user error. I had shared my lab with a colleague, who had scaled down the vmware-system-tkg-controller-manager deployment. I scaled the deployment back up and, after a little while, the Tanzu Kubernetes Releases listed the release I had imported into the content library.
root@422f78932163b6dc93e693a87e83d890 [ /var/log ]# k get deploy -n vmware-system-tkg
NAME READY UP-TO-DATE AVAILABLE AGE
vmware-system-tkg-controller-manager 0/0 0 0 80d
vmware-system-tkg-webhook 3/3 3 3 80d
root@422f78932163b6dc93e693a87e83d890 [ /var/log ]# k scale deploy vmware-system-tkg-controller-manager --replicas=3 -n vmware-system-tkg
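The scaled-down deployment shows up as 0/0 in the READY column, which means zero replicas are desired. A minimal sketch for catching this in a namespace with many deployments follows. `zero_replica_deployments` is a hypothetical helper name, and it assumes the default `kubectl get deploy` columns shown above, where READY is printed as ready/desired.

```shell
# zero_replica_deployments: hypothetical helper, not part of kubectl.
# Reads `kubectl get deploy` output on stdin and prints the name of every
# deployment whose desired replica count (right side of READY) is 0.
zero_replica_deployments() {
  awk 'NR > 1 {
    split($2, ready, "/")            # "0/0" -> ready[1]="0", ready[2]="0"
    if (ready[2] + 0 == 0) print $1  # desired replicas is zero
  }'
}
```

Piping `kubectl get deploy -n vmware-system-tkg` into the function before the scale-up would print only `vmware-system-tkg-controller-manager`.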
root@debian:~# k get tkr
NAME VERSION READY COMPATIBLE CREATED UPDATES AVAILABLE
v1.19.7---vmware.1-tkg.2.f52f85a 1.19.7+vmware.1-tkg.2.f52f85a True True 8h [1.20.2+vmware.1-tkg.2.3e10706]
v1.20.2---vmware.1-tkg.2.3e10706 1.20.2+vmware.1-tkg.2.3e10706 True True 8h
root@debian:~# k get virtualmachineimages
NAME VERSION OSTYPE FORMAT AGE
photon-3-k8s-v1.19.7---vmware.1-tkg.1.fc82c41 v1.19.7+vmware.1-tkg.2.f52f85a vmwarePhoton64Guest ovf 55d
photon-3-k8s-v1.20.2---vmware.1-tkg.2.3e10706 v1.20.2+vmware.1-tkg.2.3e10706 vmwarePhoton64Guest ovf 7h31m
slaxie ubuntu64Guest ovf 55d
slaxy