On December 8, 2022 (Pacific Time), Kubernetes officially released v1.26, themed `Electrifying`.
As the last release of 2022, it adds many new features and significantly improves stability. This article walks through the changes in v1.26 from the following perspectives.
Update overview:
- Kube APIServer: As the entry point for Kubernetes requests, this release adds 4 new KEP features and optimizes response compression; 2 other features graduate from Alpha to Beta.
- Node: The updates most closely related to the kubelet are grouped here, mainly 4 new KEP features and 4 features that reach GA in this release.
- Storage: Provisioning volumes from snapshots in other namespaces (cross-namespace) has been added; 2 storage-related features enter Beta and 3 features reach GA.
- Network: Mainly updates to kube-proxy, including a performance-optimization KEP; 2 features enter Beta and 4 features reach GA.
- Resource control and coordination: Mainly updates to the resource controllers in kube-controller-manager, with 2 new KEP features, 2 features graduating from Alpha to Beta, and 1 feature reaching GA.
- Scheduler: Adds an important KEP feature, PodSchedulingReadiness, which controls when a Pod can be scheduled by the scheduler; one feature graduates from Alpha to Beta.
- Observability: Adds a new mechanism for exposing component status, Component Health SLIs, and many new metrics in each component.
- kubectl, kubeadm, and client-go also receive some optimizations and bug fixes.
- For features that have reached GA, 11 Feature Gates were removed in 1.26 in line with the Kubernetes release policy. If these Feature Gates are still set on a component's command line, the component will fail to start.
Next, let's take a look at some of the more important API deprecations and changes that affect upgrades.
01
API deprecations and changes
PR#110618 The kubelet no longer supports the v1alpha2 version of CRI; the connected container runtime must implement the v1 version of the Container Runtime Interface.
This means that Kubernetes v1.26 does not support containerd 1.5.x and earlier; upgrade to containerd 1.6.x or later before upgrading a node's kubelet to 1.26.
PR#112306 flowcontrol.apiserver.k8s.io adds the v1beta3 version, while v1beta2 remains the preferred version. In 1.27, v1beta3 will become the preferred version.
PR#113336 The CSIMigrationvSphere feature is now GA and cannot be turned off.
Official tip: Do not upgrade to Kubernetes v1.26 if you need Windows, XFS, or raw block support. You can upgrade once the vSphere CSI Driver adds that support in a release after v2.7.x.
PR#113710 The --pod-eviction-timeout flag of kube-controller-manager is deprecated and will be removed in 1.27 together with the --enable-taint-manager flag.
PR#112643 DynamicKubeletConfig was deprecated in 1.23, and its logic was removed from the kubelet in 1.24. This release removes the DynamicKubeletConfig Feature Gate and the related logic in the APIServer.
PR#112120 Removes some klog-related flags that no longer have any effect from the Kube components.
02
Kube APIServer
KEP-2799 Reduction of Secret-based Service Account Tokens
Added the Alpha Feature Gate LegacyServiceAccountTokenTracking to control whether this feature is enabled.
When LegacyServiceAccountTokenTracking is enabled, Secret-based ServiceAccount tokens record their last usage time in the kubernetes.io/legacy-token-last-used label.
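As a rough sketch of how this label could help during cleanup, the following Python snippet picks out token Secrets whose last recorded use is older than a cutoff. It assumes the label value is a `YYYY-MM-DD` date and works on plain dictionaries rather than a live cluster; the helper name `stale_tokens` and the sample Secrets are hypothetical.

```python
from datetime import date, timedelta

LAST_USED_LABEL = "kubernetes.io/legacy-token-last-used"


def stale_tokens(secrets, today, max_age_days):
    """Return names of token Secrets not used within max_age_days.

    Assumption: the label value is a YYYY-MM-DD date. Secrets without
    the label are skipped because no usage has been recorded for them.
    """
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for secret in secrets:
        labels = secret.get("metadata", {}).get("labels", {})
        last_used = labels.get(LAST_USED_LABEL)
        if last_used is None:
            continue
        if date.fromisoformat(last_used) < cutoff:
            stale.append(secret["metadata"]["name"])
    return stale


# Hypothetical sample data, for illustration only.
secrets = [
    {"metadata": {"name": "sa-old", "labels": {LAST_USED_LABEL: "2022-01-10"}}},
    {"metadata": {"name": "sa-new", "labels": {LAST_USED_LABEL: "2022-12-01"}}},
    {"metadata": {"name": "sa-unknown", "labels": {}}},
]
print(stale_tokens(secrets, date(2022, 12, 8), 90))
```

Secrets that were never used after tracking was enabled carry no label, so a real cleanup would need a separate rule for them.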
KEP-3488 CEL for Admission Control
Related PRs: PR#113314, PR#113349, PR#112994, PR#112792, PR#112926, PR#112858.
Building on KEP-2876 CRD Validation Expression Language [1] from Kubernetes v1.25, this feature adds a new resource, ValidatingAdmissionPolicy, under admissionregistration.k8s.io/v1alpha1, which allows field validation without a validating webhook.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "demo-policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5"
This requires the resource's spec.replicas field to be less than or equal to 5.
Added the Alpha Feature Gate ValidatingAdmissionPolicy to control whether this feature is enabled.
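To make the replicas rule concrete, here is a minimal Python sketch of the same check applied to a Deployment represented as a dictionary. It only illustrates the logic of the expression; the API server evaluates CEL, not Python, and the function name is hypothetical.

```python
def replicas_within_limit(obj, limit=5):
    """Plain-Python mirror of the CEL rule `object.spec.replicas <= 5`."""
    return obj["spec"]["replicas"] <= limit


# Hypothetical Deployment fragments, for illustration only.
allowed = {"kind": "Deployment", "spec": {"replicas": 3}}
rejected = {"kind": "Deployment", "spec": {"replicas": 8}}

print(replicas_within_limit(allowed))
print(replicas_within_limit(rejected))
```

With failurePolicy set to Fail, a request for which the rule does not hold would be rejected at admission time.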
KEP-3352 Aggregated Discovery PR#113171
Previously, clients had to walk through every Group and Version API to collect discovery information. This feature reduces those calls to just two endpoints, /api and /apis.
Added the Alpha Feature Gate AggregatedDiscoveryEndpoint to control whether this feature is enabled.
KEP-3325 Auth API to get self user attributes PR#111333
A new resource, SelfSubjectReview, is added to the authentication.k8s.io/v1alpha1 group so that users can query their own user information as mapped in Kubernetes.
The kubectl alpha auth whoami command has also been added to make this query easier.
$ kubectl alpha auth whoami -o yaml
apiVersion: authentication.k8s.io/v1alpha1
kind: SelfSubjectReview
status:
  userInfo:
    username: jane.doe
    uid: b79dbf30-0c6a-11ed-861d-0242ac120002
    groups:
    - students
    - teachers
    - system:authenticated
    extra:
      skills:
      - reading
      - learning
      subjects:
      - math
      - sports
Added the Alpha Feature Gate APISelfSubjectAttributesReview to control whether this feature is enabled.
PR#112193 The APIServer adds the --aggregator-reject-forwarding-redirect flag. It defaults to true; set it to false to keep forwarding redirect responses from AA (Aggregated API) servers.
PR#113015 Custom resources can now be listed in the --encryption-provider-config file so that they are stored encrypted in etcd.
Response Compression
PR#112299 Based on load testing and production data collected from thousands of production Kubernetes clusters, the community found that gzip compression in the Kubernetes APIServer is currently suboptimal.
Issue: kubernetes/kubernetes#112296
See the report [2] and the meeting minutes [3].
PR#112309 Adds a DisableCompression field to kubeconfig; when set to true, the client asks the server not to compress responses.
PR#112580 Adds the --disable-compression flag to kubectl; when set to true, responses are requested uncompressed.
Functional stability upgrade
Alpha -> Beta
Feature Gate | KEP |
LegacyServiceAccountTokenNoAutoGeneration | KEP-2799 Reduction of Secret-based Service Account Tokens |
APIServerIdentity | KEP-1965 kube-apiserver identity |
03
Node (Kubelet)
KEP-3063 dynamic resource allocation PR#111023
Adds the resource.k8s.io/v1alpha1 group with the resources for dynamic resource allocation: ResourceClaim, ResourceClass, ResourceClaimTemplate, and PodScheduling.
Added the Alpha Feature Gate DynamicResourceAllocation to control whether this feature is enabled.
The new API is more flexible than the existing Device Plugins mechanism because it lets Pods request special types of resources that can be provided at the node level, at the cluster level, or according to any other model the user implements.
The Pod spec also gains corresponding support for dynamic resource allocation.
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: with-resource
    image: busybox
    command: ["sh", "-c", "set && mount && ls -la /dev/"]
    resources:
      claims:
      - name: resource
  resourceClaims:
  - name: resource
    source:
      resourceClaimName: shared-claim
      # resourceClaimTemplateName: test-inline-claim-template
KEP-3545 Improved multi-numa alignment in Topology Manager PR#112914
This feature improves how the TopologyManager handles NUMA (Non-Uniform Memory Access) nodes.
It adds a topologyManagerPolicyOptions field to the kubelet configuration and a --topology-manager-policy-options flag to the kubelet command line for passing additional options to the Topology Manager policy.
Three Alpha Feature Gates are added to control these options:
- TopologyManagerPolicyOptions
- TopologyManagerPolicyAlphaOptions
- TopologyManagerPolicyBetaOptions
TopologyManagerPolicyOptions controls whether policy options are supported at all; the other two control whether Alpha-level and Beta-level options may be set.
The TopologyManager Feature Gate itself must be enabled to use the Topology Manager, but it is already in Beta and does not need to be set explicitly.
KEP-3386 Kubelet Evented PLEG for Better Performance PR#111384
This feature lets the kubelet reduce periodic polling by relying on Container Runtime Interface (CRI) notifications as much as possible when tracking Pod status on a node, which lowers the kubelet's CPU usage.
Added the Alpha Feature Gate EventedPLEG to control whether this feature is enabled.
PR#86139 In previous versions, when httpGet was used in a container's preStop or postStart lifecycle hook, requests were still sent over HTTP even if scheme was set to HTTPS, and user-defined headers were not applied to the request.
lifecycle:
  postStart:
    httpGet:
      port: 443
      scheme: HTTPS
      httpHeaders:
      - name: HEADER
        value: VALUE
This PR fixes both problems. If the HTTPS request fails, the kubelet falls back to HTTP; when that happens, a LifecycleHTTPFallback event is created for the Pod and the kubelet_lifecycle_handler_http_fallbacks_total metric is incremented.
In addition, an Alpha Feature Gate, ConsistentHTTPGetHandlers, has been added. Users can set --feature-gates=ConsistentHTTPGetHandlers=false on the kubelet to turn off the fallback behavior.
KEP-3503 Windows allows specifying whether pods are added to the node's network namespace PR#112961
Added the Alpha Feature Gate WindowsHostNetworking to control whether this feature is enabled.
PR#112414 Multiple options lines in /etc/resolv.conf are now merged into a single line in Pods.
options ndots:1 attempts:3
options ndots:1 attempts:3 ndots:5
->
options ndots:5 attempts:3
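The merge shown above can be sketched in a few lines of Python. This is only an illustration that reproduces the example (later values override earlier ones, and each option keeps the position where it first appeared); the kubelet's actual implementation may differ, and the function name is hypothetical.

```python
def merge_options(lines):
    """Merge several resolv.conf `options` lines into a single line.

    Later values override earlier ones; each option keeps the position
    where it was first seen.
    """
    merged = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "options":
            continue
        for opt in fields[1:]:
            name, _, value = opt.partition(":")
            merged[name] = value  # dicts keep first-insertion order
    parts = [f"{name}:{value}" if value else name
             for name, value in merged.items()]
    return "options " + " ".join(parts)


print(merge_options([
    "options ndots:1 attempts:3",
    "options ndots:1 attempts:3 ndots:5",
]))
```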
BUG FIX
PR#113041 Fixed an issue where, when running kubectl exec, the kubelet picked the wrong container because of duplicate container names caused by lifecycle.preStop.
PR#108832 When a container sets limits.cpu but requests.cpu is "0", the cgroup cpuShares now takes the minimum value of 2 instead of being derived from limits.cpu.
PR#112184 When the kubelet sets only --cloud-provider or only --node-ip, it now makes sure the invalid alpha.kubernetes.io/provided-node-ip annotation on the Node is cleared.
PR#113481 Fixed out-of-order, seemingly random timestamps in the output of kubectl logs --timestamps.
PR#112518 Fixed an issue where pods continued to run on nodes tainted with NoExecute taints when the PodDisruptionConditions feature gate was enabled.
PR#112123 Raised the minimum value of cpuCFSQuotaPeriod from 1us to 1ms; values below 1ms now fail validation.
Related PRs: PR#112077, PR#111520, PR#63437
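For the cpuShares fix in PR#108832, a small Python sketch may help show the difference between an unset request and an explicit "0". The constants (1024 shares per CPU, a minimum of 2, a maximum of 262144) follow the usual cgroup v1 conventions and are assumptions here, as are the function names; this is a simplified model, not the kubelet's code.

```python
MIN_SHARES = 2
MAX_SHARES = 262144
SHARES_PER_CPU = 1024
MILLI_CPU_PER_CPU = 1000


def milli_cpu_to_shares(milli_cpu):
    """Convert a CPU quantity in millicores to cgroup cpu shares."""
    if milli_cpu == 0:
        return MIN_SHARES
    shares = milli_cpu * SHARES_PER_CPU // MILLI_CPU_PER_CPU
    return max(MIN_SHARES, min(shares, MAX_SHARES))


def cpu_shares(request_milli, limit_milli):
    """Model of the fixed behavior.

    request_milli is None when requests.cpu is not set at all; only in
    that case does the limit stand in for the request. An explicit 0
    yields the minimum number of shares.
    """
    if request_milli is None and limit_milli:
        return milli_cpu_to_shares(limit_milli)
    return milli_cpu_to_shares(request_milli or 0)


print(cpu_shares(0, 500))     # explicit requests.cpu of "0"
print(cpu_shares(None, 500))  # requests.cpu not set
```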
Functional stability upgrade
Beta -> GA
Feature Gate | KEP |
CPUManager | KEP-3057 Graduate CPUManager to GA |
DevicePlugins | KEP-3573 Graduate DeviceManager to GA |
KubeletCredentialProviders | KEP-2133 Kubelet Credential Provider |
WindowsHostProcessContainers | KEP-1981 Support for Windows privileged containers |
04
Storage
KEP-3294 Provision volumes from cross-namespace snapshots
Related PRs: PR#113186, kubernetes-csi/external-provisioner#805
Before Kubernetes 1.26, users could provision volumes from snapshots with VolumeSnapshot, but a PersistentVolumeClaim could not reference a VolumeSnapshot in another namespace.
apiVersion: v1
kind: PersistentVolumeClaim
spec:
  storageClassName: csi-hostpath-sc
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-demo
    namespace: prod
  volumeMode: Filesystem
This feature lets users provision volumes from snapshots across namespaces by setting the newly added spec.dataSourceRef.namespace field in the PVC.
Added the Alpha Feature Gate CrossNamespaceVolumeDataSource to control whether this feature is enabled.
Functional stability upgrade
Alpha -> Beta
Feature Gate | KEP |
RetroactiveDefaultStorageClass | KEP-3333 Retroactive default StorageClass assignment |
NodeOutOfServiceVolumeDetach | KEP-2268 Non-graceful node shutdown |
Beta -> GA
Feature Gate | KEP |
CSIMigrationAzureFile | KEP-625 In-tree storage plugin to CSI Driver Migration |
CSIMigrationvSphere | KEP-1491 vSphere in-tree to CSI driver migration |
DelegateFSGroupToCSIDriver | KEP-2317 Allow Kubernetes to supply pod's fsgroup to CSI driver on mount |
05
Network
PR#110268 Optimizes kube-proxy performance: each call to iptables-restore now sends only the rules that changed instead of the whole ruleset.
Added the Alpha Feature Gate MinimizeIPTablesRestore to control whether this feature is enabled.
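The idea behind this optimization can be sketched as a diff between the previous and the current ruleset. The Python below is a conceptual illustration only, not kube-proxy's implementation; the chain names, rules, and the `changed_chains` helper are hypothetical.

```python
def changed_chains(previous, current):
    """Return only the chains whose rules differ from the previous sync.

    Both arguments map a chain name to its list of rules. Chains that
    no longer exist are reported with an empty rule list.
    """
    delta = {}
    for chain, rules in current.items():
        if previous.get(chain) != rules:
            delta[chain] = rules
    for chain in previous:
        if chain not in current:
            delta[chain] = []
    return delta


# Hypothetical chain names and rules, for illustration only.
previous = {
    "KUBE-SVC-A": ["-j KUBE-SEP-1", "-j KUBE-SEP-2"],
    "KUBE-SVC-B": ["-j KUBE-SEP-3"],
}
current = {
    "KUBE-SVC-A": ["-j KUBE-SEP-1", "-j KUBE-SEP-2"],  # unchanged
    "KUBE-SVC-B": ["-j KUBE-SEP-4"],                   # endpoint changed
}
print(changed_chains(previous, current))
```

In a cluster with many Services, most chains are unchanged between syncs, which is why sending only the delta can save a lot of work.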
PR#108250 kube-proxy adds the --iptables-localhost-nodeports flag, which lets users disallow access to NodePort Services through localhost.
PR#111806 If the user requests ipvs mode but the system is not configured correctly, kube-proxy no longer falls back to iptables mode and returns an error instead.
PR#113363 kube-proxy restarts if it detects that the node.Spec.PodCIDRs assigned to its Node have changed.
PR#112133 Removes the deprecated "userspace" proxy mode.
Functional stability upgrade
Alpha -> Beta
Feature Gate | KEP |
ProxyTerminatingEndpoints | KEP-1669 Proxy Terminating Endpoints |
ExpandedDNSConfig | KEP-2595 Expanded DNS configuration |
Beta -> GA
Feature Gate | KEP |
MixedProtocolLBService | KEP-1435 Support of mixed protocols in Services with type=LoadBalancer |
ServiceIPStaticSubrange | KEP-3070 Reserve Service IP Ranges For Dynamic and Static IP Allocation |
ServiceInternalTrafficPolicy | KEP-2086 Service Internal Traffic Policy |
EndpointSliceTerminatingCondition | KEP-1672 Tracking Terminating Endpoints |
06
Resource Control and Coordination
KEP-3017 PodHealthyPolicy for PodDisruptionBudget PR#113375
Adds the spec.unhealthyPodEvictionPolicy field to the PodDisruptionBudget resource to control when unhealthy Pods should be considered for eviction.
The field currently accepts two values: IfHealthyBudget and AlwaysAllow.
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
  unhealthyPodEvictionPolicy: IfHealthyBudget
Added the Alpha Feature Gate PDBUnhealthyPodEvictionPolicy to control whether this feature is enabled.
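The two policy values can be compared with a simplified decision function. The sketch below reflects my reading of the documented behavior (an unhealthy Pod is one that is running but not Ready, and a healthy Pod always needs spare budget); the function and parameter names are hypothetical, and the real eviction API takes more state into account.

```python
def can_evict(pod_ready, policy, current_healthy, desired_healthy):
    """Simplified model of how a PDB treats an eviction request."""
    if pod_ready:
        # Healthy Pods always consume disruption budget.
        return current_healthy > desired_healthy
    if policy == "AlwaysAllow":
        # Unhealthy Pods can be evicted regardless of the budget.
        return True
    if policy == "IfHealthyBudget":
        # Unhealthy Pods can be evicted only while the application
        # still has at least the desired number of healthy Pods.
        return current_healthy >= desired_healthy
    raise ValueError(f"unknown policy: {policy}")


print(can_evict(False, "IfHealthyBudget", 1, 2))
print(can_evict(False, "AlwaysAllow", 1, 2))
```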
KEP-3335 StatefulSet Start Ordinal PR#112744
StatefulSet currently numbers Pods starting from 0.
This function adds a spec.ordinals.start field to StatefulSet to control the starting number of Pods.
apiVersion: apps/v1
kind: StatefulSet
spec:
  ordinals:
    start: 1
Added the Alpha Feature Gate StatefulSetStartOrdinal to control whether this feature is enabled.
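The effect on Pod names is easy to see with a tiny Python sketch. The StatefulSet name `web` and the helper `pod_names` are hypothetical.

```python
def pod_names(statefulset_name, replicas, start=0):
    """Pod names for a StatefulSet whose ordinals begin at `start`."""
    return [f"{statefulset_name}-{ordinal}"
            for ordinal in range(start, start + replicas)]


print(pod_names("web", 3))           # default numbering from 0
print(pod_names("web", 3, start=1))  # with spec.ordinals.start: 1
```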
PR#112011 If multiple HPAs select the same Pod, they stop working: each HPA's ScalingActive condition is set to False with the reason AmbiguousSelector. This also applies to multiple HPAs that point to the same Deployment.
Functional stability upgrade
Alpha -> Beta
Feature Gate | KEP |
JobPodFailurePolicy | KEP-3329 Retriable and non-retriable Pod failures for Jobs |
PodDisruptionConditions | KEP-3329 Retriable and non-retriable Pod failures for Jobs |
Beta -> GA
Feature Gate | KEP |
JobTrackingWithFinalizers | KEP-2307 Job tracking without lingering Pods |
07
Pod scheduling
KEP-3521 Pod Scheduling Readiness
Related PRs: PR#113275, PR#113274, PR#113442
Not every Pod in the Pending state is ready to be scheduled. Some Pods cannot be scheduled successfully because required resources are missing, which only creates extra work for the scheduler.
This feature adds the spec.schedulingGates field to the Pod to control whether the Pod is ready for scheduling.
spec:
  schedulingGates:
  - name: <value>
Scheduling starts only after spec.schedulingGates has been cleared; until then the Pod is reported as SchedulingGated:
$ kubectl get pod example-po
NAME READY STATUS RESTARTS AGE
example-po 0/1 SchedulingGated 0 30s
Added the Alpha Feature Gate PodSchedulingReadiness to control whether this feature is enabled.
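The gating rule itself is simple and can be sketched in Python on a Pod represented as a dictionary. The gate name and helper functions below are hypothetical; in a real cluster a controller would remove gates by updating the Pod through the API.

```python
def is_schedulable(pod):
    """A Pod is handed to the scheduler only when no gates are left."""
    return not pod.get("spec", {}).get("schedulingGates")


def remove_gate(pod, gate_name):
    """Return a copy of the Pod with one scheduling gate removed."""
    spec = dict(pod.get("spec", {}))
    spec["schedulingGates"] = [
        gate for gate in spec.get("schedulingGates", [])
        if gate["name"] != gate_name
    ]
    return {**pod, "spec": spec}


# The gate name below is hypothetical.
pod = {"spec": {"schedulingGates": [{"name": "example.com/quota-check"}]}}
print(is_schedulable(pod))
pod = remove_gate(pod, "example.com/quota-check")
print(is_schedulable(pod))
```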
PR#111726 The scheduler's debug dumper now also outputs information about Pending Pods.
Optimization and BUG FIX
PR#111809 When Patch is used to update Pod status, ServiceUnavailable and InternalError errors are now retried in addition to net.ConnectionRefused.
A ServiceUnavailable error occurs when the APIServer is temporarily unable to process a request.
E0805 20:54:21.624945 123623 scheduler.go:356] Error updating pod foo: the server is currently unable to handle the request (patch pods foo)
InternalError usually occurs due to a temporary failure of the webhook.
E0811 23:32:30.886582 213747 scheduler.go:357] Error updating pod foo: Internal error occurred: failed calling webhook "xyz": Post "xyz": context deadline exceeded
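The retry behavior described above can be sketched as follows. The exception classes, the helper name, and the retry count are stand-ins invented for this example; the scheduler itself is written in Go and uses its own retry utilities.

```python
import time


class ConnectionRefused(Exception):
    """Stand-in for a refused connection to the APIServer."""


class ServiceUnavailable(Exception):
    """Stand-in for a ServiceUnavailable response."""


class InternalError(Exception):
    """Stand-in for an InternalError response, e.g. a webhook timeout."""


RETRIABLE = (ConnectionRefused, ServiceUnavailable, InternalError)


def patch_with_retry(patch, attempts=3, delay_seconds=0.0):
    """Call patch() and retry only on errors considered temporary."""
    for attempt in range(1, attempts + 1):
        try:
            return patch()
        except RETRIABLE:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


calls = []


def flaky_patch():
    """Fails twice with a temporary error, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ServiceUnavailable("the server is currently unable to handle the request")
    return "patched"


result = patch_with_retry(flaky_patch)
print(result, "after", len(calls), "calls")
```

Any other error is raised immediately, since retrying it would not help.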
Functional stability upgrade
Alpha -> Beta
Feature Gate | KEP |
NodeInclusionPolicy | KEP-3094 Take taints/tolerations into consideration when calculating PodTopologySpread skew |
08
Observability
KEP-3466 Kubernetes Component Health SLIs
Previously there was no standard format for exposing the health of Kubernetes components. This feature adds a new /metrics/slis path to each component that exposes Service Level Indicators (SLIs) in Prometheus format.
Each component needs to expose two metrics:
- gauge: the current health check status of the component
# HELP kubernetes_healthcheck [ALPHA] This metric records the result of a single healthcheck.
# TYPE kubernetes_healthcheck gauge
kubernetes_healthcheck{name="etcd",type="healthz"} 1
kubernetes_healthcheck{name="etcd",type="readyz"} 1
- counter: cumulative count of observed results for each health check
# HELP kubernetes_healthchecks_total [ALPHA] This metric records the results of all healthcheck.
# TYPE kubernetes_healthchecks_total counter
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
kubernetes_healthchecks_total{name="etcd",status="success",type="readyz"} 15
- Kube APIServer: PR#112741
- Kube Controller Manager: PR#112978
- Kube Scheduler: PR#113026
- Kubelet: PR#113030
- Kube Proxy: PR#113057
- Cloud Controller Manager: PR#113340
Added the Alpha Feature Gate ComponentSLIs to control whether this feature is enabled.
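As an illustration of how the gauge could be consumed, the Python sketch below scans text in the format shown above and reports health checks whose value is not 1. The sample text is made up for the example (one check is deliberately failing), and a production setup would normally let Prometheus scrape /metrics/slis rather than parse it by hand.

```python
import re

SAMPLE = re.compile(r'^kubernetes_healthcheck\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def failing_checks(metrics_text):
    """Return (name, type) pairs of health checks that are not passing."""
    failing = []
    for line in metrics_text.splitlines():
        match = SAMPLE.match(line.strip())
        if not match:
            continue  # comments, counters, and other metrics
        labels = dict(LABEL.findall(match.group("labels")))
        if float(match.group("value")) != 1:
            failing.append((labels.get("name"), labels.get("type")))
    return failing


# Made-up sample in the format shown above.
metrics_text = """\
# TYPE kubernetes_healthcheck gauge
kubernetes_healthcheck{name="etcd",type="healthz"} 1
kubernetes_healthcheck{name="etcd",type="readyz"} 0
kubernetes_healthchecks_total{name="etcd",status="success",type="healthz"} 15
"""
print(failing_checks(metrics_text))
```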
New metrics have also been added to each component, and several metric calculation problems have been fixed.
09
Kubectl commands
Subcommand enhancements and fixes
PR#109525 kubectl wait now tolerates fields that do not exist yet in --for=jsonpath=, which is useful when some fields are set asynchronously.
PR#111096 kubectl api-resources adds a categories column to the -o wide output and a --categories parameter for filtering by category.
PR#113819 kubectl alpha events has been moved to the top-level command kubectl events.
PR#111093 Fixed kubectl rollout history --revision=<version> -o json|yaml <resource> returning the latest revision instead of the specified one when outputting JSON or YAML.
PR#111571 Improves the message printed by kubectl label --dry-run so that users do not mistakenly believe the label has been set.
before
$ kubectl label pod foo bar=baz --dry-run=server
pod/foo labeled
after
$ kubectl label pod foo bar=baz --dry-run=server
pod/foo labeled (server dry run)
PR#112556 Improves the error message shown when kubectl patch uses StrategicMergePatch to update custom resources.
PR#112700 Fixes kubectl convert choosing the wrong API version.
PR#109505 kubectl annotate no longer throws an error when setting an annotation to the value it already has.
PR#110907 When kubectl apply is run with --namespace but without --prune-allowlist, non-namespaced resources are deleted. This PR only adds a warning. In 1.28, when kubectl apply specifies a namespace, non-namespaced resources such as PersistentVolumes and Namespaces will no longer be deleted.
PR#113116 kubectl apply adds the --prune-allowlist flag, used together with --prune, to replace the deprecated --prune-whitelist flag.
Other
PR#113146 The kubectl explain command can use OpenAPIv3 through the environment variable KUBECTL_EXPLAIN_OPENAPIV3.
PR#112553 Kubectl escapes terminal special characters in output. Fixes CVE-2021-25743.
PR#112150 Optimize kubectl's display of invalid requests returned by APIServer.
PR#112243, PR#112261 Deprecate several flags of the kubectl run command; they are ignored even if set.
Shell Completion
PR#113636 kubectl shell completion supports displaying command descriptions in bash.
bash-5.1$ kubectl a[tab][tab]
alpha (Commands for features in alpha)
annotate (Update the annotations on a resource)
api-resources (Print the supported API resources on the server)
api-versions (Print the supported API versions on the server, in the form of "group/version")
apply (Apply a configuration to a resource by file name or stdin)
argo (The command argo is a plugin installed by the user)
attach (Attach to a running container)
auth (Inspect authorization)
autoscale (Auto-scale a deployment, replica set, stateful set, or replication controller)
bash-5.1$ kubectl --c[tab][tab]
--cache-dir (Default cache directory)
--certificate-authority (Path to a cert file for the certificate authority)
--client-certificate (Path to a client certificate file for TLS)
--client-key (Path to a client key file for TLS)
--cluster (The name of the kubeconfig cluster to use)
--context (The name of the kubeconfig context to use)
PR#105867 Adds shell completion support for kubectl plugins; a plugin can provide completions for its own commands through an executable named kubectl_complete-<pluginName>.
10
Kubeadm
Command fixes and enhancements
PR#113005 kubeadm join phase control-plane-prepare certs supports running with --dry-run.
PR#112945 Sub-phases support dry-run mode, e.g. kubeadm reset phase cleanup-node --dry-run.
PR#111512 A new phase, show-join-command, is added to kubeadm init. Users can pass kubeadm init --skip-phases=show-join-command to skip printing the join information. This phase cannot be executed on its own.
PR#112172 kubeadm reset adds the --cleanup-tmp-dir flag, which cleans up the contents of /etc/kubernetes/tmp; the default is false.
PR#112732 kubeadm adds validation of the image repository format in the configuration.
PR#111783 When the CertificateAuthorityData in the kubeconfig read by kubeadm is empty, kubeadm tries to load the CA certificate from the external CertificateAuthority file.
PR#112508 RSA and ECDSA keys are both allowed in the preflight check.
PR#110972 kubeadm reset now cleans up old data on a best-effort basis. Old data is cleared as each reset phase runs, and the default etcd data directory is deleted during the remove-etcd-member phase.
PR#112751 Fix the bug when validating ClusterConfiguration network related fields (dnsDomain, serviceSubnet, podSubnet).
Other
PR#111277 Optimize the error message when kubeadm runs subcommands.
PR#112008 Since the node-role.kubernetes.io/master taint is no longer set on control plane nodes as of 1.25, kubeadm no longer adds the node-role.kubernetes.io/master toleration to the CoreDNS Deployment.
PR#112000 The kubeadm init|join|upgrade commands remove the --container-runtime flag; since dockershim was removed, the only value this flag could take was --container-runtime=remote.
11
Client-Go
PR#112200 client-go's SharedInformerFactory adds a Shutdown method that waits for all running informers in the factory to stop.
12
Removed features
This release removes the Feature Gates of the following GA features:
- ServiceLoadBalancerClass
- ServiceLBNodePortControl
- CSRDuration
- DefaultPodTopologySpread
- NonPreemptingPriority
- PodAffinityNamespaceSelector
- PreferNominatedNode
- PodOverhead
- UnversionedKubeletConfigMap
- IndexedJob
- SuspendJob
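Because a component fails to start when a removed gate is still set, it can be worth checking existing flags before upgrading. The Python sketch below scans a --feature-gates value for a few of the gates listed above; the set is deliberately partial, and the helper name and sample flag value are hypothetical.

```python
# A partial selection of the gates listed above, not the full list.
REMOVED_IN_1_26 = {
    "CSRDuration",
    "IndexedJob",
    "NonPreemptingPriority",
    "PreferNominatedNode",
    "SuspendJob",
}


def removed_gates(feature_gates_flag):
    """List gates in a --feature-gates value that were removed."""
    found = []
    for item in feature_gates_flag.split(","):
        name = item.split("=", 1)[0].strip()
        if name in REMOVED_IN_1_26:
            found.append(name)
    return found


# Hypothetical flag value taken from an older component manifest.
print(removed_gates("IndexedJob=true,EventedPLEG=true,SuspendJob=true"))
```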
13
Functional downgrade
Beta -> Alpha
LocalStorageCapacityIsolationFSQuotaMonitoring was upgraded to Beta in v1.25, but it was rolled back to Alpha due to the problem that the ConfigMap would not be synchronized to the Pod file system normally after the update.
References:
[1] KEP-2876 CRD Validation Expression Language: https://github.com/kubernetes/enhancements/issues/2876
[2] Report: https://docs.google.com/document/d/1rMlYKOVyujboAEG2epxSYdx7eyevC7dypkD_kUlBxn4/edit
[3] Meeting minutes: https://youtu.be/GKBqyV8y8j0
Author of this article
Cai Wei
Senior Cloud Native R&D Engineer of "DaoCloud"
Founder of the open source project Clusterpedia