Instead discard metric values for pods that are unready and have never
been ready (they may report misleading values, the original reason for
introducing scale up forbidden window).
Use per pod metric when pod is:
- Ready, or
- Not ready but creation timestamp and last readiness change are more
than 10s apart.
In the latter case we asume the pod was ready but later became unready.
We want to use metrics for such pods because sometimes such pods are
unready because they were getting too much load.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Switch to multi arch test/images with manifests
**What this PR does / why we need it**:
Recently we updated the test container images to use multi-arch fat manifests and pushed the new images to the `gcr.io/kubernetes-e2e-test-images` repository. In this changeset, we are switching to using the new images and cleaning up some of the unused image definitions from manifest.go. We are removing the folders corresponding to the unused images as well.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66626
**Special notes for your reviewer**:
/cc @mkumatag
/cc @luxas
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
need ExpectNoError check
**What this PR does / why we need it**:
err need ExpectNoError check
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
update exit code to 0 if patch not needed
**Release note**:
```release-note
The `kubectl patch` command no longer exits with exit code 1 when a redundant patch results in a no-op
```
The specific logic in the `patch` command that exited with code 1, was only doing so when there was no diff between an existing object and its patched counterpart. (In case of errors, we just return those, which eventually ends up exiting with code 1 anyway). This patch removes this block, as we should not be treating patch no-ops as errors.
Fixes https://github.com/kubernetes/kubernetes/issues/58212
cc @soltysh
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
should get return err and check it
**What this PR does / why we need it**:
should get return err and check it
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66827, 60550). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue (batch tested with PRs 66827, 60550). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix expected pod creations/deletions estimate in load test
Fixing a wrongly computed expectation for first round of scaling in load test. Luckily as a side-effect, this should also speed up load test by:
- 25s on 100-node cluster
- 2m on 500-node cluster (so helps our presubmit - https://github.com/kubernetes/test-infra/issues/8348)
- 20m on 5000-node cluster
That said, I'm not 100% sure if this won't start causing failures in practice.
/cc @gmarek @mborsz
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65570, 65616). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Retry scheduling on StorageClass events
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56163
**Special notes for your reviewer**:
I have taken over #60006.
It's hard to test in e2e, because we cannot know reschedule of pod is triggered by which event (periodically service/node events will move pods to active queue too). ~~I'll add integration tests for this functionality after [this PR](https://github.com/kubernetes/kubernetes/pull/65296) get merged.~~ (already added)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix some style error using gofmt
**What this PR does / why we need it**:
fix some style error using gofmt
**Release note**:
```release-note
NONE
```
Make annotations with newlines display a more consistent left edge, and indent the value
when the annotation is too long to give the value more space. Shorten the width of the
trimmed annotation to a value more consistent with our `-o wide` value.
Instead of putting the key and value flush with a `=` separator, make annotations closer
to fields than to labels by using `: ` as a separator.
Inline scripts may use newlines in these fields, and properly indenting makes the output more readable:
```
Command:
/bin/bash
-c
#!/bin/bash
echo "inline script should be indented"
```
Automatic merge from submit-queue (batch tested with PRs 66445, 66643, 60551). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue (batch tested with PRs 66445, 66643, 60551). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve the output of `kubectl get events`
Events have long shown the most data of the core objects in their output, but that data is of varying use to a user. Following the principle that events are intended for the system to communicate information back to the user, and that Message is the primary human readable field, this commit alters the default columns to ensure event is shown with the most width given to the message, and all other fields organized by their relevance to the message.
1. Events are no longer sorted in the printer (this was a bug and was broken with paging and server side rendering)
2. Only the last seen, type, reason, kind, and message fields are shown by default, which makes the message prominent
3. Source, subobject, count, and first seen are only shown under `-o wide`
4. The duration fields were changed to be the more precise output introduced for job duration (2-3 sig figs)
5. Prioritized the column order for scanning - when, how important, what kind of error, what kind of object, and the message.
6. Trim trailing newlines on the message.
```release-note
Improved the output of `kubectl get events` to prioritize showing the message, and move some fields to `-o wide`.
```
```
$ kubectl get events --sort-by lastTimestamp
LAST SEEN TYPE REASON KIND MESSAGE
16m Normal SawCompletedJob CronJob Saw completed job: image-mirror-origin-v3.11-quay-1532581200
16m Normal SuccessfulDelete CronJob Deleted job image-mirror-origin-v3.11-quay-1532577600
14m Normal Scheduled Pod Successfully assigned 50c42204-9091-11e8-b2a1-0a58ac101869 to origin-ci-ig-n-fqfh
14m Normal Pulling Pod pulling image "docker-registry.default.svc:5000/ci/commenter:latest"
14m Normal Created Pod Created container
14m Normal Pulled Pod Successfully pulled image "docker-registry.default.svc:5000/ci/commenter:latest"
14m Normal Started Pod Started container
14m Normal SandboxChanged Pod Pod sandbox changed, it will be killed and re-created.
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m14s Normal ScaleDown Pod deleting pod for node scale down
4m13s Normal SuccessfulCreate ReplicationController Created pod: tide-30-hmncf
4m13s Normal Scheduled Pod Successfully assigned tide-30-hmncf to origin-ci-ig-n-x64l
4m12s Normal SuccessfulCreate ReplicationController Created pod: console-jenkins-operator-16-dd5k8
4m12s Normal SuccessfulCreate ReplicationController Created pod: sinker-23-scfmt
```
Automatic merge from submit-queue (batch tested with PRs 66445, 66643, 60551). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubeadm: Improve kubeadm init cmd tests
**What this PR does / why we need it**:
This PR improves kubeadm init cmd tests in the following ways:
- Fix a few cases that were always successful (despite completely wrong).
- Add more test cases (for different configs in particular)
- Use dry run, to avoid modifying the system and using kubeadm reset
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes NONE
**Special notes for your reviewer**:
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews
/area kubeadm
/assign @luxas
/assign @timothysc
**Release note**:
```release-note
NONE
```
this PR is the result of splitting https://github.com/kubernetes/kubernetes/pull/65793 into 2 sections
1) This part, addressing the refactor so eligible-test-for-conformance can use get rid of privileged security context.
2) a second part that will address the promotion of the testcases to be in conformance suite
Changes:
a) demoted privileged mode for these tests (not needed)
b) regular tests (the other ones existing in the file) will still be using privileged security context.
c) adding privilegedSecurityContext field to VolInfo, so each volume-flavor can let the test know if the security context has to be privileged or not.
This allows granular changes and updates per volume).
d) fixing formatting issue.
Automatic merge from submit-queue (batch tested with PRs 66270, 60554, 66816). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "Passing `KUBE_TEST_ARGS` variable to make through process environment"
This reverts commit fda0edcd1c.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66782
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66270, 60554, 66816). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
delete unused events
**What this PR does / why we need it**:
events (HostNetworkNotSupported, UndefinedShaper) is unused since #47058
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up podpreset admission controller unused methods
**What this PR does / why we need it**:
As the title.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Default extensions/v1beta1 Deployment's ProgressDeadlineSeconds to MaxInt32
**What this PR does / why we need it**: Default values should be set in all API versions, because defaulting happens whenever a serialized version is read. When we switched to `apps/v1` as the storage version in `1.10` (#58854), `extensions/v1beta1` `DeploymentSpec.ProgressDeadlineSeconds` gets `apps/v1` default value (`600`) instead of being unset.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66135
**Special notes for your reviewer**: We need to cherrypick this fix to 1.10 and 1.11. Note that this fix will only help people who haven't upgraded to 1.10 or 1.11 when the storage version is changed.
@kubernetes/sig-apps-bugs
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
cpumanager: validate topology in static policy
**What this PR does / why we need it**:
This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file.
If the CPU topology has changed in a node, cpumanager static policy might try to assign non-present cores to containers.
For example in my test case, static policy had the default CPU set of `0-1,4-7`. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them:
Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)
This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs.
**Release note**:
```release-note
Added CPU Manager state validation in case of changed CPU topology.
```
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
expose GC graph via debug handler
Many times when debugging GC problems, it's important to understand the state of the GC graph at a given point in time. This pull adds the ability to dump that graph in DOT format for later consumption. It does this by exposing an additional debug handler and allowing any controller init function to produce such a handler that is included under debug.
Sample full output
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph
digraph full {
// Node definitions.
0 [
label="uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Pod.v1/kube-dns-7b479ccbc6-qz468
"
group=""
version="v1"
kind="Pod"
namespace="kube-system"
name="kube-dns-7b479ccbc6-qz468"
uid="8581a030-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
1 [
label="uid=822052fc-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Deployment.v1.apps/kube-dns
"
group="apps"
version="v1"
kind="Deployment"
namespace="kube-system"
name="kube-dns"
uid="822052fc-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
2 [
label="uid=857bd8ac-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
ReplicaSet.v1.apps/kube-dns-7b479ccbc6
"
group="apps"
version="v1"
kind="ReplicaSet"
namespace="kube-system"
name="kube-dns-7b479ccbc6"
uid="857bd8ac-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
// Edge definitions.
0 -> 2;
2 -> 1;
}
```
You can also select via UID and have all transitive dependencies output:
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph?uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
digraph full {
// Node definitions.
0 [
label="uid=822052fc-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Deployment.v1.apps/kube-dns
"
group="apps"
version="v1"
kind="Deployment"
namespace="kube-system"
name="kube-dns"
uid="822052fc-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
1 [
label="uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Pod.v1/kube-dns-7b479ccbc6-qz468
"
group=""
version="v1"
kind="Pod"
namespace="kube-system"
name="kube-dns-7b479ccbc6-qz468"
uid="8581a030-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
2 [
label="uid=857bd8ac-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
ReplicaSet.v1.apps/kube-dns-7b479ccbc6
"
group="apps"
version="v1"
kind="ReplicaSet"
namespace="kube-system"
name="kube-dns-7b479ccbc6"
uid="857bd8ac-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
// Edge definitions.
1 -> 2;
2 -> 0;
}
```
And with some sample rendering:
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph | dot -T svg -o project.svg
```
produces
![gc](https://user-images.githubusercontent.com/8225098/43223895-8e33c126-9022-11e8-8ad9-6b2f986fd974.png)
@kubernetes/sig-api-machinery-pr-reviews
/assign @caesarxuchao @liggitt
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Default some unbound cluster/gce env vars
**What this PR does / why we need it**:
Sets defaults for two env vars used by cluster/gce/* scripts so as to
avoid the following warnings when bringing a cluster up for test
```
METADATA_CONCEALMENT_NO_FIREWALL: unbound variable
CUSTOM_KUBE_DASHBOARD_BANNER: unbound variable
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60850
```release-note
NONE
```
This API return error when underlying 'inspect' command
fails. This makes ImagePullCheck to fail as it doesn't expect
runtime.ImageExists to return an error even if image doesn't exist.
Fixed this by returning error nil even when inspect command fails.
Test the cases where the number of CPUs available in the system is
smaller or larger than the number of CPUs known in the state, which
should lead to a panic. This covers both CPU onlining and offlining. The
case where the number of CPUs matches is already covered by the
"non-corrupted state" test.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
dd status=none does not exist on macOS
**What this PR does / why we need it**:
When running cluster/kubectl.sh on macOS 10.13.6, the use of the
`status=none` operand leads to `dd: unknown operand status` being
printed out as an error message. Redirecting to /dev/null does
the same thing, supressing transfer status.
```release-note
NONE
```