Automatic merge from submit-queue
Node Conformance Test: Containerize the node e2e test
For #30122, #30174.
Based on #32427, #32454.
**Please only review the last 3 commits.**
This PR packages the node e2e test into a docker image:
- 1st commit: Add `NodeConformance` flag in the node e2e framework to avoid starting kubelet and collecting system logs. We do this because:
- There are all kinds of ways to manage kubelet and system logs, for different situation we need to mount different things into the container, run different commands. It is hard and unnecessary to handle the complexity inside the test suite.
- 2nd commit: Remove all `sudo` in the test container. We do this because:
- In most container, there is no `sudo` command, and there is no need to use `sudo` inside the container.
- It introduces some complexity to use `sudo` inside the test. (https://github.com/kubernetes/kubernetes/issues/29211, https://github.com/kubernetes/kubernetes/issues/26748) In fact we just need to run the test suite with `sudo`.
- 3rd commit: Package the test into a docker container with corresponding `Makefile` and `Dockerfile`. We also added a `run_test.sh` script to start kubelet and run the test container. The script is only for demonstration purpose and we'll also use the script in our node e2e framework. In the future, we should update the script to start kubelet in production way (maybe with `systemd` or `supervisord`).
@dchen1107 @vishh
/cc @kubernetes/sig-node @kubernetes/sig-testing
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
Release alpha version node test container gcr.io/google_containers/node-test-ARCH:0.1 for users to verify their node setup.
```
Automatic merge from submit-queue
Adding cadcading deletion support for federated secrets
Ref https://github.com/kubernetes/kubernetes/issues/33612
Adding cascading deletion support for federated secrets.
The code is same as that for namespaces. Just ensuring that DeletionHelper functions are called at right places in secret_controller.
Also added e2e tests.
cc @kubernetes/sig-cluster-federation @caesarxuchao
```release-note
federation: Adding support for DeleteOptions.OrphanDependents for federated secrets. Setting it to false while deleting a federated secret also deletes the corresponding secrets from all registered clusters.
```
Automatic merge from submit-queue
Rename experimental-runtime-integration-type to experimental-cri
Also rename the field in the component config to `EnableCRI`
The e2e tests cover cases like cluster size changed, parameters
changed, ConfigMap got deleted, autoscaler pod got deleted, etc.
They are separated into a fast part(could be run parallelly) and
a slow part(put in [serial]). The fast part of the e2e tests cost
around 50 seconds to run.
Automatic merge from submit-queue
Rename ScheduledJobs to CronJobs
I went with @smarterclayton idea of registering named types in schema. This way we can support both the new (CronJobs) and old (ScheduledJobs) resource name. Fixes#32150.
fyi @erictune @caesarxuchao @janetkuo
Not ready yet, but getting close there...
**Release note**:
```release-note
Rename ScheduledJobs to CronJobs.
```
This allows us to interrupt/kill the executed command if it exceeds the
timeout (not implemented by this commit).
Set timeout in Exec probes. HTTPGet and TCPSocket probes respect the
timeout, while Exec probes used to ignore it.
Add e2e test for exec probe with timeout. However, the test is skipped
while the default exec handler doesn't support timeouts.
Automatic merge from submit-queue
Adding more e2e tests for federated namespace cascading deletion and fixing bugs
Ref https://github.com/kubernetes/kubernetes/issues/33612
Adding more e2e tests for testing cascading deletion of federated namespace.
New tests are now verifying that cascading deletion happen when DeletionOptions.OrphanDependents=false and it does not happen when DeleteOptions.OrphanDependents=true.
Also updated deletion helper to always add OrphanFinalizer. generic registry will remove it if DeleteOptions.OrphanDependents=false. Also updated namespace registry to do the same.
We need to add the orphan finalizer to keep the orphan by default behavior. We assume that its dependents are going to be orphaned and hence add that finalizer. If user does not want the orphan behavior, he can do so using DeleteOptions and then the registry will remove that finalizer.
cc @kubernetes/sig-cluster-federation @caesarxuchao @derekwaynecarr
Automatic merge from submit-queue
Node Conformance Test: Add system verification
For #30122 and #29081.
This PR introduces system verification test in node e2e and conformance test. It will run before the real test. Once the system verification fails, the test will just fail. The output of the system verification is like this:
```
I0909 23:33:20.622122 2717 validators.go:45] Validating os...
OS: Linux
I0909 23:33:20.623274 2717 validators.go:45] Validating kernel...
I0909 23:33:20.624037 2717 kernel_validator.go:79] Validating kernel version
KERNEL_VERSION: 3.16.0-4-amd64
I0909 23:33:20.624146 2717 kernel_validator.go:93] Validating kernel config
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
I0909 23:33:20.679328 2717 validators.go:45] Validating cgroups...
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
I0909 23:33:20.679454 2717 validators.go:45] Validating docker...
DOCKER_GRAPH_DRIVER: aufs
```
It verifies the system following a predefined `SysSpec`:
``` go
// DefaultSysSpec is the default SysSpec.
var DefaultSysSpec = SysSpec{
OS: "Linux",
KernelVersion: []string{`3\.[1-9][0-9].*`, `4\..*`}, // Requires 3.10+ or 4+
// TODO(random-liu): Add more config
KernelConfig: KernelConfig{
Required: []string{
"NAMESPACES", "NET_NS", "PID_NS", "IPC_NS", "UTS_NS",
"CGROUPS", "CGROUP_CPUACCT", "CGROUP_DEVICE", "CGROUP_FREEZER",
"CGROUP_SCHED", "CPUSETS", "MEMCG",
},
Forbidden: []string{},
},
Cgroups: []string{"cpu", "cpuacct", "cpuset", "devices", "freezer", "memory"},
RuntimeSpec: RuntimeSpec{
DockerSpec: &DockerSpec{
Version: []string{`1\.(9|\d{2,})\..*`}, // Requires 1.9+
GraphDriver: []string{"aufs", "overlay", "devicemapper"},
},
},
}
```
Currently, it only supports:
- Kernel validation: version validation and kernel configuration validation
- Cgroup validation: validating whether required cgroups subsystems are enabled.
- Runtime Validation: currently, only validates docker graph driver.
The validating framework is ready. The specific validation items could be added over time.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
lister-gen updates
- Remove "zz_generated." prefix from generated lister file names
- Add support for expansion interfaces
- Switch to new generated JobLister
@deads2k @liggitt @sttts @mikedanese @caesarxuchao for the lister-gen changes
@soltysh @deads2k for the informer / job controller changes
Automatic merge from submit-queue
Add cmd support to gcp auth provider plugin
**What this PR does / why we need it**:
Adds ability for gcp auth provider plugin to get access token by shelling out to an external command. We need this because for GKE, kubectl should be using gcloud credentials. It currently uses google application default credentials, which causes confusion if user has configured both with different permissions (previously the two were almost always identical).
**Which issue this PR fixes**:
Addresses #35530 with gcp-only solution, as generic cmd plugin was deemed not useful for other providers.
**Special notes for your reviewer**:
Configuration options are to support whatever future command gcloud provides for printing access token of active user. Also works with existing command (`gcloud auth print-access-token`)
```release-note
```
Automatic merge from submit-queue
Remove GetRootContext method from VolumeHost interface
Remove the `GetRootContext` call from the `VolumeHost` interface, since Kubernetes no longer needs to know the SELinux context of the Kubelet directory.
Per #33951 and #35127.
Depends on #33663; only the last commit is relevant to this PR.
Automatic merge from submit-queue
Per Volume Inode Accounting
Collects volume inode stats using the same find command as cadvisor. The command is "find _path_ -xdev -printf '.' | wc -c". The output is passed to the summary api, and will be consumed by the eviction manager.
This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary api, and the eviction manager consuming this. Expect tests to fail until this happens.
DEPENDS ON #35137
Automatic merge from submit-queue
[AppArmor] Hold bad AppArmor pods in pending rather than rejecting
Fixes https://github.com/kubernetes/kubernetes/issues/32837
Overview of the fix:
If the Kubelet needs to reject a Pod for a reason that the control plane doesn't understand (e.g. which AppArmor profiles are installed on the node), then it might contiinuously try to run the pod on the same rejecting node. This change adds a concept of "soft rejection", in which the Pod is admitted, but not allowed to run (and therefore held in a pending state). This prevents the pod from being retried on other nodes, but also prevents the high churn. This is consistent with how other missing local resources (e.g. volumes) is handled.
A side effect of the change is that Pods which are not initially runnable will be retried. This is desired behavior since it avoids a race condition when a new node is brought up but the AppArmor profiles have not yet been loaded on it.
``` release-note
Pods with invalid AppArmor configurations will be held in a Pending state, rather than rejected (failed). Check the pod status message to find out why it is not running.
```
@kubernetes/sig-node @timothysc @rrati @davidopp
Automatic merge from submit-queue
Update how we detect overlapping deployments
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#24152
**Special notes for your reviewer**: cc @kubernetes/deployment
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
NONE
```
When looking for overlapping deployments, we should also find other deployments that select current deployment's pods,
not just the ones whose pods are selected by current deployment.
Automatic merge from submit-queue
New command: "kubeadm token generate"
As part of #33930, this PR adds a new top-level command to kubeadm to just generate a token for use with the init/join commands. Otherwise, users are left to either figure out how to generate a token on their own, or let `kubeadm init` generate a token, capture and parse the output, and then use that token for `kubeadm join`.
At this point, I was hoping for feedback on the CLI experience, and then I can add tests. I spoke with @mikedanese and he didn't like the original propose of `kubeadm util generate-token`, so here are the runners up:
```
$ kubeadm generate-token # <--- current implementation
$ kubeadm generate token # in case kubeadm might generate other things in the future?
$ kubeadm init --generate-token # possibly as a subcommand of an existing one
```
Currently, the output is simply the token on one line without any padding/formatting:
```
$ kubeadm generate-token
1087fd.722b60cdd39b1a5f
```
CC: @kubernetes/sig-cluster-lifecycle
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
New kubeadm command: generate-token
```
Automatic merge from submit-queue
Add --force to kubectl delete and explain force deletion
--force is required for --grace-period=0. --now is == --grace-period=1.
Improve command help to explain what graceful deletion is and warn about
force deletion.
Part of #34160 & #29033
```release-note
In order to bypass graceful deletion of pods (to immediately remove the pod from the API) the user must now provide the `--force` flag in addition to `--grace-period=0`. This prevents users from accidentally force deleting pods without being aware of the consequences of force deletion. Force deleting pods for resources like StatefulSets can result in multiple pods with the same name having running processes in the cluster, which may lead to data corruption or data inconsistency when using shared storage or common API endpoints.
```
Automatic merge from submit-queue
NPD: Add e2e test for NPD v0.2.
Node problem detector has been updated after v0.1, including:
1. Add lookback support. It will lookback for configured time to search for possible kernel panic before node reboot.
2. Get node name via downward api.
This PR updates the test to test the new NPD behavior.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Made changes to DELETE API to let v1.DeleteOptions be passed in as a queryParameter
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes#34856
```release-note
DELETE requests can now pass in their DeleteOptions as a query parameter or a body parameter, rather than just as a body parameter.
```
Automatic merge from submit-queue
Set the annotation only if the test requires it.
**What this PR does / why we need it**: Fixes StatefulSet flake
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/36107
**Special notes for your reviewer**: We shouldn't be setting the debug annotation in all our tests, only the ones that bring statefulset pods up one after another. In the absence of the annotation, we have the new default behavior governed by https://github.com/kubernetes/kubernetes/pull/35739
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-apps @bprashanth @calebamiles
Automatic merge from submit-queue
Temporarily disable GCI mounter in e2e node tests
This is just so we have an off-switch ready to go if we need it. Don't merge unless we need to disable this functionality in the e2e node tests.
Automatic merge from submit-queue
Controller changes for perma failed deployments
This PR adds support for reporting failed deployments based on a timeout
parameter defined in the spec. If there is no progress for the amount
of time defined as progressDeadlineSeconds then the deployment will be
marked as failed by a Progressing condition with a ProgressDeadlineExceeded
reason.
Follow-up to https://github.com/kubernetes/kubernetes/pull/19343
Docs at kubernetes/kubernetes.github.io#1337
Fixes https://github.com/kubernetes/kubernetes/issues/14519
@kubernetes/deployment @smarterclayton
Automatic merge from submit-queue
add secret type to RBD secrets in examples and e2e test
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
This is a followup to recent changes in secret type matching
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
@kubernetes/sig-storage @liggitt
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
```release-note
```
Automatic merge from submit-queue
Federated ConfigMap controller
Based on the secrets controller. E2e tests will come in the next PR.
**Release note**:
``` release-note
Federated ConfigMap controller. Supports all the API that regular ConfigMap has.
```
cc: @quinton-hoole @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
Remove statefulset e2e test setup for alpha
Depends on #35731, once statefulset is beta, it doesn't need special treatment for alpha version in e2e test
cc @erictune @foxish @kubernetes/sig-apps
Automatic merge from submit-queue
Fixed kibana test problem
After #36103 a problem was introduced. If you change basePath, dynamic bundle optimization is performed upon startup, which can take several minutes.
Following PR fixed this problem by waiting until optimization is completed.
@piosz
Edit: This problem has no known workarounds if basePath is dynamic (like in our case)
Reference: https://discuss.elastic.co/t/how-to-avoid-dynamic-bundle-optimization/37367
Automatic merge from submit-queue
Remove non-generic options from genericapiserver.Config
Remove non-generic options from genericapiserver.Config. Changes the discovery CIDR/IP information to an interface and then demotes several fields.
I haven't pulled from them genericapiserver.Options, but that's a future option we have. Segregation as as a followup at the very least.
Automatic merge from submit-queue
Enable NFS volume test
This PR fixes the dockerfile for NFS server image and enable NFSv4.
After using containeried mounts approach on GCI, this test should pass.
The functionality used to exist entirely in the NC which would
previously clean up pods and nodes together. Now, we simply
wait for the PodGC to see that the node is now deleted and clean up the
pods. This may take a while and hence we set a 1 minute timeout.
Automatic merge from submit-queue
Switch DisruptionBudget api from bool to int allowed disruptions [only v1beta1]
Continuation of #34546. Apparently it there is some bug that prevents us from having 2 different incompatibile version of API in integration tests. So in this PR v1alpha1 is removed until testing infrastructure is fixed.
Base PR comment:
Currently there is a single bool in disruption budget api that denotes whether 1 pod can be deleted or not. Every time a pod is deleted the apiserver filps the bool to false and the disruptionbudget controller sets it to true if more deletions are allowed. This works but it is far from optimal when the user wants to delete multiple pods (for example, by decreasing replicaset size from 10000 to 8000).
This PR adds a new api version v1beta1 and changes bool to int which contains a number of pods that can be deleted at once.
cc: @davidopp @mml @wojtek-t @fgrzadkowski @caesarxuchao