Automatic merge from submit-queue
Eviction manager needs to start as runtime dependent module
To support disk eviction, the eviction manager needs to know if there is a dedicated device for the imagefs. In order to know that information, we need to start the eviction manager after cadvisor. This refactors the location eviction manager is started.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @vishh @ronnielai
Automatic merge from submit-queue
Allow PVs to specify supplemental GIDs
Retry of https://github.com/kubernetes/kubernetes/pull/28691 . Adds a Kubelet helper function for getting extra supplemental groups
Automatic merge from submit-queue
Add parsing code in kubelet for eviction-minimum-reclaim
The kubelet parses the eviction-minimum-reclaim flag and validates it for correctness.
The first two commits are from https://github.com/kubernetes/kubernetes/pull/29329 which has already achieved LGTM.
Automatic merge from submit-queue
CRI: add LinuxUser to LinuxContainerConfig
Following discussion in https://github.com/kubernetes/kubernetes/pull/25899#discussion_r70996068
The Container Runtime Interface should provide runtimes with User information to run the container process as (OCI being one of them).
This patch introduces a new field `user` into `LinuxContainerConfig` structure. The `user` field introduces also a new type structure `LinuxUser` which consists of `uid`, `gid` and `additional_gids`.
The `LinuxUser` struct has been embedded into `LinuxContainerConfig` to leave space for future implementations which are not Linux-related (e.g. Windows may have a different representation of _Users_).
If you feel naming can be better we can probably move `LinuxUser` to `UnixUser` also.
/cc @mrunalp @vishh @euank @yujuhong
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Automatic merge from submit-queue
Removing images with multiple tags
If an image has multiple tags, we need to remove all the tags in order to make docker image removing successful.
#28491
Automatic merge from submit-queue
Fix incorrect if conditions
When the current conditions `if inspect == nil && inspect.Config == nil && inspect.Config.Labels == nil` is true, the func containerAndPodFromLabels will return. else will not. Suppose `inspect != nil` but `inspect.Config == nil`, the current conditions will be false and the func won't return, then the below `labels := inspect.Config.Labels` will lead to panic.
Automatic merge from submit-queue
rkt: Don't return if the service file doesn't exist when killing the pod
Remove an unused logic. Also this prevents the KillPod() from failing
when the service file doesn't exist. E.g., it can be removed by garbage
collection in a rare case:
1, There are already more than `gcPolicy.MaxContainers` containers running
on the host.
2, The new pod(A) starts to run but doesn't enter 'RUNNING' state yet.
3, GC is triggered, and it sees the pod(A) is in an inactive state (not running),
and the it needs to remove the pod to force the `gcPolicy.MaxContainers`.
4, GC fails to remove the pod because `rkt rm` fails when the pod is running,
but it removes the service file anyway.
5, Follow up KillPod() call will fail because it cannot find the service file
on disk.
Also this is possible only when the pod has been in prepared state for longer
than 1 min, which sounds like another issue.
cc @kubernetes/sig-rktnetes
Automatic merge from submit-queue
Allow mounts to run in parallel for non-attachable volumes
This PR:
* Fixes https://github.com/kubernetes/kubernetes/issues/28616
* Enables mount volume operations to run in parallel for non-attachable volume plugins.
* Enables unmount volume operations to run in parallel for all volume plugins.
* Renames `GoRoutineMap` to `GoroutineMap`, resolving a long outstanding request from @thockin: `"Goroutine" is a noun`
Automatic merge from submit-queue
ImagePuller refactoring
A plain refactoring
- Moving image pullers to a new pkg/kubelet/images directory
- Hiding image pullers inside the new ImageManager
The next step is to consolidate the logic of the serialized and the parallel image pullers inside ImageManager
xref: #25577
Automatic merge from submit-queue
Kubelet: Set PruneChildren when removing image.
This is a bug introduced during switching to engine-api. https://github.com/kubernetes/kubernetes/issues/23563.
When removing image, there is an option `noprune`:
```
If prune is true, ancestor images will each attempt to be deleted quietly.
```
In go-dockerclient, the default value of the option is ["noprune=false"](https://github.com/fsouza/go-dockerclient/blob/master/image.go#L171), which means that ancestor images should be also removed. This is the expected behaviour.
However in engine-api, the option is changed to `PruneChildren`, and the default value is `PruneChildren=false`, which means that ancestor images won't be removed.
This makes `ImageRemove` only remove the first layer of the image, which causes the image garbage collection not working as expected.
This should be fixed in 1.3.
And thanks to @ronnielai for finding the bug! :)
/cc @kubernetes/sig-node
Automatic merge from submit-queue
docker_manager: Correct determineContainerIP args
This could result in the network plugin not retrieving the pod ip in a
call to SyncPod when using the `exec` network plugin.
The CNI and kubenet network plugins ignore the name/namespace arguments,
so they are not impacted by this bug.
I verified the second included test failed prior to correcting the
argument order.
Fixes#29161
cc @yujuhong
Automatic merge from submit-queue
Move ExtractPodBandwidthResources test into appropriate package
Found during #28511, this test is in the wrong package currently.
cc @kubernetes/sig-network
Allow mount volume operations to run in parallel for non-attachable
volume plugins.
Allow unmount volume operations to run in parallel for all volume
plugins.
Automatic merge from submit-queue
Delete redundant if condition
The case `containerStatus == nil` has already been checked just above. It's redundant here.
Automatic merge from submit-queue
Make kubelet continue cleanup when there is noncritical error.
Fix https://github.com/kubernetes/kubernetes/issues/29078.
Even though there is error when cleaning up pod directory or bandwidth limits, kubelet could continue cleanup the following stuff.
However, when runtime cache or runtime returns error, cleanup should fail, because the following cleanup relies on the `runningPod`.
@yujuhong
/cc @kubernetes/sig-node
This could result in the network plugin not retrieving the pod ip in a
call to SyncPod when using the `exec` network plugin.
The CNI and kubenet network plugins ignore the name/namespace arguments,
so they are not impacted by this bug.
I verified the second included test failed prior to correcting the
argument order.
Fixes#29161
Automatic merge from submit-queue
Return (bool, error) in Authorizer.Authorize()
Before this change, Authorize() method was just returning an error, regardless of whether the user is unauthorized or whether there is some other unrelated error. Returning boolean with information about user authorization and error (which should be unrelated to the authorization) separately will make it easier to debug.
Fixes#27974
Automatic merge from submit-queue
Do not skip check for cgroup creation in the systemd mount
As soon as libcontainer dependency is update in #28410, we can skip check for cgroup creation in the systemd mount. As the latest version of libcontainer should create cgroups in the sytemd mount aswell.
This is tied to the upstream issue: #27204
@vishh PTAL
Remove an unused logic. Also this prevents the KillPod() from failing
when the service file doesn't exist. E.g., it can be removed by garbage
collection in a rare case:
1, There are already more than `gcPolicy.MaxContainers` containers running
on the host.
2, The new pod(A) starts to run but doesn't enter 'RUNNING' state yet.
3, GC is triggered, and it sees the pod(A) is in an inactive state (not running),
and the it needs to remove the pod to force the `gcPolicy.MaxContainers`.
4, GC fails to remove the pod because `rkt rm` fails when the pod is running,
but it removes the service file anyway.
5, Follow up KillPod() call will fail because it cannot find the service file
on disk.
Also this is possible only when the pod has been in prepared state for longer
than 1 min, which sounds like another issue.
Before this change, Authorize() method was just returning an error,
regardless of whether the user is unauthorized or whether there
is some other unrelated error. Returning boolean with information
about user authorization and error (which should be unrelated to
the authorization) separately will make it easier to debug.
Fixes#27974
Automatic merge from submit-queue
rkt: Copy the /etc/hosts /etc/resolv.conf into pod dir before mounting.
rkt: Copy the /etc/hosts /etc/resolv.conf into pod dir before mounting.
This enables the container to modify the /etc/hosts/ /etc/resolv.conf without changing the host's ones.
With this PR, we now match the docker's behavior.
Fix https://github.com/kubernetes/kubernetes/issues/29022
cc @kubernetes/sig-rktnetes @quentin-m
This enables the container to modify the /etc/hosts/ /etc/resolv.conf
without changing the host's ones.
With this PR, we now match the docker's behavior.