cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and force a status update to trigger the pod deletion. However, this
function is called in the periodic clenup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killling all
the containers.
cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and force a status update to trigger the pod deletion. However, this
function is called in the periodic clenup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killling all
the containers.
cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and force a status update to trigger the pod deletion. However, this
function is called in the periodic clenup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killling all
the containers.
When sending out an pod status update, kubelet
GETs the pod from the apiserver
Terminates if the apiserver returns an not found error; otherwise, proceed to
to update.
Even after a pod has been deleted, there might still be queued up updates for
the pod. This leads to expensive, unncessary GET operations. The situation is
worse when there are batch creation/deletion of a significant number of pods
(e.g., E2E tests), leaving many updates in the queue.
This change checks whether a pod exists before GET the pod from the apiserver
to avoid redundant GETs.
The logic doesn't apply to static pods as their corresponding mirror pod may
not have been created yet, or may be in the process of recreation. Deleting the
pod status immediately resets the version of the status for the static pod,
while the apiStatusVersion remains unchanged. This could lead to incorrect
versioning and hence stale pod status in the apiserver.
The formatting function is used often in logging. This improves the readability
by shortening the length of the call. Also change the fomartted string to
include the pod UID.
The mapping of static pod <--> mirror pod UIDs was backwards in a couple
places. Fortunately, they canceled each other out. Fixed, and added a
test case.