Automatic merge from submit-queue (batch tested with PRs 66270, 60554, 66816). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Adding details to Conformance Tests using RFC 2119 standards.
This PR is part of the conformance documentation. This is to provide more formal specification using RFC 2119 keywords to describe the test so that who ever is running conformance tests do not have to go through the code to understand why and what is tested.
The documentation information added here into each of the tests eventually result into a document which is currently checked in at location https://github.com/cncf/k8s-conformance/blob/master/docs/KubeConformance-1.9.md
I would like to have this PR reviewed for v1.10 as I consider it important to strengthen the conformance documents.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
delete unused events
**What this PR does / why we need it**:
events (HostNetworkNotSupported, UndefinedShaper) is unused since #47058
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up podpreset admission controller unused methods
**What this PR does / why we need it**:
As the title.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Default extensions/v1beta1 Deployment's ProgressDeadlineSeconds to MaxInt32
**What this PR does / why we need it**: Default values should be set in all API versions, because defaulting happens whenever a serialized version is read. When we switched to `apps/v1` as the storage version in `1.10` (#58854), `extensions/v1beta1` `DeploymentSpec.ProgressDeadlineSeconds` gets `apps/v1` default value (`600`) instead of being unset.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66135
**Special notes for your reviewer**: We need to cherrypick this fix to 1.10 and 1.11. Note that this fix will only help people who haven't upgraded to 1.10 or 1.11 when the storage version is changed.
@kubernetes/sig-apps-bugs
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
cpumanager: validate topology in static policy
**What this PR does / why we need it**:
This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file.
If the CPU topology has changed in a node, cpumanager static policy might try to assign non-present cores to containers.
For example in my test case, static policy had the default CPU set of `0-1,4-7`. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them:
Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)
This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs.
**Release note**:
```release-note
Added CPU Manager state validation in case of changed CPU topology.
```
Automatic merge from submit-queue (batch tested with PRs 66623, 66718). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
expose GC graph via debug handler
Many times when debugging GC problems, it's important to understand the state of the GC graph at a given point in time. This pull adds the ability to dump that graph in DOT format for later consumption. It does this by exposing an additional debug handler and allowing any controller init function to produce such a handler that is included under debug.
Sample full output
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph
digraph full {
// Node definitions.
0 [
label="uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Pod.v1/kube-dns-7b479ccbc6-qz468
"
group=""
version="v1"
kind="Pod"
namespace="kube-system"
name="kube-dns-7b479ccbc6-qz468"
uid="8581a030-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
1 [
label="uid=822052fc-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Deployment.v1.apps/kube-dns
"
group="apps"
version="v1"
kind="Deployment"
namespace="kube-system"
name="kube-dns"
uid="822052fc-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
2 [
label="uid=857bd8ac-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
ReplicaSet.v1.apps/kube-dns-7b479ccbc6
"
group="apps"
version="v1"
kind="ReplicaSet"
namespace="kube-system"
name="kube-dns-7b479ccbc6"
uid="857bd8ac-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
// Edge definitions.
0 -> 2;
2 -> 1;
}
```
You can also select via UID and have all transitive dependencies output:
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph?uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
digraph full {
// Node definitions.
0 [
label="uid=822052fc-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Deployment.v1.apps/kube-dns
"
group="apps"
version="v1"
kind="Deployment"
namespace="kube-system"
name="kube-dns"
uid="822052fc-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
1 [
label="uid=8581a030-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
Pod.v1/kube-dns-7b479ccbc6-qz468
"
group=""
version="v1"
kind="Pod"
namespace="kube-system"
name="kube-dns-7b479ccbc6-qz468"
uid="8581a030-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
2 [
label="uid=857bd8ac-9043-11e8-ad4a-54e1ad486dd3
namespace=kube-system
ReplicaSet.v1.apps/kube-dns-7b479ccbc6
"
group="apps"
version="v1"
kind="ReplicaSet"
namespace="kube-system"
name="kube-dns-7b479ccbc6"
uid="857bd8ac-9043-11e8-ad4a-54e1ad486dd3"
missing="false"
beingDeleted="false"
deletingDependents="false"
virtual="false"
];
// Edge definitions.
1 -> 2;
2 -> 0;
}
```
And with some sample rendering:
```
curl http://localhost:10252/debug/controllers/garbagecollector/graph | dot -T svg -o project.svg
```
produces
![gc](https://user-images.githubusercontent.com/8225098/43223895-8e33c126-9022-11e8-8ad9-6b2f986fd974.png)
@kubernetes/sig-api-machinery-pr-reviews
/assign @caesarxuchao @liggitt
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Default some unbound cluster/gce env vars
**What this PR does / why we need it**:
Sets defaults for two env vars used by cluster/gce/* scripts so as to
avoid the following warnings when bringing a cluster up for test
```
METADATA_CONCEALMENT_NO_FIREWALL: unbound variable
CUSTOM_KUBE_DASHBOARD_BANNER: unbound variable
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#60850
```release-note
NONE
```
This API return error when underlying 'inspect' command
fails. This makes ImagePullCheck to fail as it doesn't expect
runtime.ImageExists to return an error even if image doesn't exist.
Fixed this by returning error nil even when inspect command fails.
Test the cases where the number of CPUs available in the system is
smaller or larger than the number of CPUs known in the state, which
should lead to a panic. This covers both CPU onlining and offlining. The
case where the number of CPUs matches is already covered by the
"non-corrupted state" test.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
dd status=none does not exist on macOS
**What this PR does / why we need it**:
When running cluster/kubectl.sh on macOS 10.13.6, the use of the
`status=none` operand leads to `dd: unknown operand status` being
printed out as an error message. Redirecting to /dev/null does
the same thing, supressing transfer status.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66284, 66690). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Exit gce kube-up.sh early if openssl is LibreSSL
**What this PR does / why we need it**:
macOS has an openssl binary, but it's actually LibreSSL, which doesn't play well with the easyrsa script that cluster/gce/util.sh uses to generate certs
Instead of waiting until we generate certs to discover easyrsa doesn't work, consider openssl a prereq for gce, and include a check for the version string starting with OpenSSL
Also, mirror kube-up.sh's "... calling" output in kube-down.sh
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixeskubernetes/community#1954
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move the` k8s.io/kubernetes/pkg/util/pointer` package to` k8s.io/utils/pointer`
**What this PR does / why we need it**:
Move `k8s.io/kubernetes/pkg/util/pointer` to `shared utils` directory, so that we can use it easily.
Close#66010 accidentally, and can't reopen it, so the same as #66010
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66489, 66728, 66739). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove incomplete check of ipvs modules in hack/local-up-cluster.sh
**What this PR does / why we need it**:
Currently `hack/local-up-cluster.sh` executes `sudo modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4` to check whether the ipvs required modules exist, which leaves out the scenario https://github.com/kubernetes/kubernetes/issues/63801 mentioned.
Since `func CanUseIPVSProxier` in `pkg/proxy/ipvs/proxier.go` covers all scenarios, maybe we should just remove this part instead of adding codes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66489, 66728, 66739). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Speed up volume modifications on AWS
Volume becomes reusable after it reached optimizing state.
/sig storage
/sig aws
```release-note
Make EBS volume expansion faster
```
cc @d-nishi @kokhang
Automatic merge from submit-queue (batch tested with PRs 66489, 66728, 66739). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reuse iptablesContainerPortalArgs, remove function iptablesContainerNodePortArgs
**What this PR does / why we need it**:
reuse iptablesContainerPortalArgs, remove function iptablesContainerNodePortArgs
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Simplify device manager: make endpoint stateless
While reviewing devicemanager code, found the caching layer on endpoint is redundant.
Here are the 3 related objects in picture:
**devicemanager <-> endpoint <-> plugin**
plugin is the source of truth for devices and device health status.
devicemanager maintain healthyDevices, unhealthyDevices, allocatedDevices based on updates
from plugin.
So there is no point for endpoint to cache devices, this patch is removing the cache layer,
endpoint becomes stateless, which i believe should be the case (but i do welcome review
if i missed something here).
also removing the Manager.Devices() since i didn't find any caller of this other than test.
if we need to get all devices from manager in future, it just need to return healthyDevices + unhealthyDevices, so don't have to call endpoint after all.
This patch makes code more readable, data model been simplified.
**What this PR does / why we need it**:
this patch simplify the device manager code, make it more maintainable.
**Which issue(s) this PR fixes** *:
this is a refactor of device manager code
**Special notes for your reviewer**:
will need to rebase the code if #58755 get checked-in first.
**Release note**:
```release-note
None
```
/sig node
/cc @jiayingz @RenaudWasTaken @vishh @saad-ali @vikaschoudhary16 @vladimirvivien @anfernee
Automatic merge from submit-queue (batch tested with PRs 66686, 66760). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Error in return value should be tested
**What this PR does / why we need it**:
Error in return value should be tested
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
test image for a release 1.7 based sample-apiserver
**What this PR does / why we need it**:
In our e2e test suite we have use an image "gcr.io/kubernetes-e2e-test-images:k8s-aggregator-sample-apiserver:1.7v2". We need a way to build a fresh image that can we can use instead of that one. Especially we need one that has a multi-arch fat manifest so e2e tests can be run across multiple architectures.
This is especially important since we are in the process of promoting the test in question to the conformance suite - https://github.com/kubernetes/kubernetes/pull/63947
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
/cc @mkumatag
/cc @ixdy
/cc @luxas
**Release note**:
```release-note
NONE
```
This patch adds a check for the static policy state validation. The
check fails if the CPU topology obtained from cadvisor doesn't match
with the current topology in the state file.
If the CPU topology has changed in a node, cpu manager static policy
might try to assign non-present cores to containers.
For example in my test case, static policy had the default CPU set of
0-1,4-7. Then kubelet was shut down and CPU 7 was offlined. After
restarting the kubelet, CPU manager tries to assign the non-existent CPU
7 to containers which don't have exclusive allocations assigned to them:
Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6)
This breaks the exclusivity, since the CPUs from the shared pool don't
get assigned to non-exclusive containers, meaning that they can execute
on the exclusive CPUs.
the caching layer on endpoint is redundant.
Here are the 3 related objects in picture:
devicemanager <-> endpoint <-> plugin
Plugin is the source of truth for devices
and device health status.
devicemanager maintain healthyDevices,
unhealthyDevices, allocatedDevices based on updates
from plugin.
So there is no point for endpoint caching devices,
this patch is removing this caching layer on endpoint,
Also removing the Manager.Devices() since i didn't
find any caller of this other than test, i am adding a
notification channel to facilitate testing,
If we need to get all devices from manager in future,
it just need to return healthyDevices + unhealthyDevices,
we don't have to call endpoint after all.
This patch makes code more readable, data model been simplified.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Switch off leader election for scheduler and kube/cloud controller
**What this PR does / why we need it**:
We now have leader election on by default, for a single node
local-up-cluster, this is not needed. Let's switch it off
This will reduce the flakiness and timeouts we see in the local e2e CI jobs.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66723, 66732). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix a duplicate case in TestUpdateNodeObjects
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
enable etcd logging in local-e2e jobs
**What this PR does / why we need it**:
We are not saving the etcd logs and just redirecting the output
to /dev/null. In this change, we set ETCD_LOGFILE to the same
directory where we log other kube relates processes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
See example run in http://gcsweb.k8s.io/gcs/kubernetes-jenkins/pr-logs/pull/66608/pull-kubernetes-local-e2e/254/artifacts/kubetest-local263115757/
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62444, 66358, 66724, 66726). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move kubelet serving cert rotation to beta
xref https://github.com/kubernetes/features/issues/267
This is exercised in the alpha gke e2es, and can be enabled in the non-alpha gke e2es once it no longer requires an alpha feature gate.
```release-note
Kubelet serving certificate bootstrapping and rotation has been promoted to beta status.
```
Automatic merge from submit-queue (batch tested with PRs 62444, 66358, 66724, 66726). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Grab docker log using a soft link in local-up-cluster
**What this PR does / why we need it**:
Would be useful to debug problems like timeouts and missing images etc
for the local e2e jobs.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 62444, 66358, 66724, 66726). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GetMasterHost should not return port
**What this PR does / why we need it**:
masterUrl.Host may include port, e.g. 10.114.51.201:8443
GetMasterHost() should not return port
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
refactor server.go to simplify the invokes to kubeFlags and KubeConfiguration
**What this PR does / why we need it**:
Since kubeFlags and KubeletConfiguration have been fields of KubeletServer, we just need to pass the reference of KubeletServer to the following functions. This will simplify the migrations of flags such as BootstrapCheckpointPath and others, as we don't have to specify from which object the migrated field comes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
@mtaufen
**Release note**:
```release-note
NONE
```