Command line flag API remains the same. This allows ComponentConfig
structures (e.g. KubeletConfiguration) to express the map structure
behind feature gates in a natural way when written as JSON or YAML.
For example:
KubeletConfiguration Before:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates: "DynamicKubeletConfig=true,Accelerators=true"
```
KubeletConfiguration After:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates:
DynamicKubeletConfig: true
Accelerators: true
```
Automatic merge from submit-queue (batch tested with PRs 50988, 50509, 52660, 52663, 52250). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added device plugin e2e kubelet failure test
Signed-off-by: Renaud Gaubert <renaud.gaubert@gmail.com>
**What this PR does / why we need it**:
This is part of issue #52859 (fixes#52859)
This PR adds a e2e_node test for the device plugin.
Specifically it implements testing of failure handling by the device plugin components in case Kubelet restart / crashes.
I might try to refactor the GPU tests in a later PR.
**Special notes for your reviewer**:
@jiayingz @vishh
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52960, 52373). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Refactor eviction tests
fixes: #52203
We have a bunch of eviction tests, which each break independently, and take a large amount of time to fix.
This refactors these tests to share the core eviction testing logic. Each tests needs only to set kubelet flags, and specify which pods to run.
I decided to omit the memory eviction tests because they work. Best not to disturb them.
A large portion of the code changes are the renaming of inode_eviction_test.go -> eviction_test.go
This should probably wait until after https://github.com/kubernetes/kubernetes/pull/50392
/assign @mtaufen @Random-Liu
Automatic merge from submit-queue
Node e2e tests for the CPU Manager.
**What this PR does / why we need it**:
- Adds node e2e tests for the CPU Manager implementation in https://github.com/kubernetes/kubernetes/pull/49186.
**Special notes for your reviewer**:
- Previous PR in this series: #51180
- Only `test/e2e_node/cpu_manager_test.go` must be reviewed as a part of this PR (i.e., the last commit). Rest of the comments belong in #51357 and #51180.
- The tests have been on run on `n1-standard-n4` and `n1-standard-n2` instances on GCE.
To run this node e2e test, use the following command:
```sh
make test-e2e-node TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS="CPU Manager" SKIP="" PARALLELISM=1
```
CC @ConnorDoyle @sjenning
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.
Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.
This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.
Related issue: #50217
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
Automatic merge from submit-queue (batch tested with PRs 50087, 39587, 50042, 50241, 49914)
Add node e2e test for Docker's shared PID namespace
Ref: https://github.com/kubernetes/kubernetes/issues/42926
This PR adds a simple test for the shared PID namespace that's enabled when Docker is 1.13.1+.
/sig node
/area node-e2e
/assign @yujuhong
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)
Add support for `no_new_privs` via AllowPrivilegeEscalation
**What this PR does / why we need it**:
Implements kubernetes/community#639
Fixes#38417
Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`.
Adds `AllowPrivilegeEscalation` to container `SecurityContext`.
Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set.
Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity.
**Release note**:
```release-note
Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process
```
Automatic merge from submit-queue
add dockershim checkpoint node e2e test
Add a bunch of disruptive cases to test kubelet/dockershim's checkpoint work flow.
Some steps are quite hacky. Not sure if there is better ways to do things.
This PR adds two features:
1. add support for isolating the emptyDir volume use. If user
sets a size limit for emptyDir volume, kubelet's eviction manager
monitors its usage
and evict the pod if the usage exceeds the limit.
2. add support for isolating the local storage for container overlay. If
the container's overly usage exceeds the limit defined in container
spec, eviction manager will evict the pod.
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
Automatic merge from submit-queue
node e2e: improve the validate OOM score test for infra containers
The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.
This fixes#37580
Automatic merge from submit-queue
test/e2e_node: prepull images with CRI
Part of https://github.com/kubernetes/kubernetes/issues/40739
- This PR builds on top of #40525 (and contains one commit from #40525)
- The second commit contains a tiny change in the `Makefile`.
- Third commit is a patch to be able to prepull images using the CRI (as opposed to run `docker` to pull images which doesn't make sense if you're using CRI most of the times)
Marked WIP till #40525 makes its way into master
@Random-Liu @lucab @yujuhong @mrunalp @rhatdan