Automatic merge from submit-queue
Node authorizer
This PR implements the authorization portion of https://github.com/kubernetes/community/blob/master/contributors/design-proposals/kubelet-authorizer.md and kubernetes/features#279:
* Adds a new authorization mode (`Node`) that authorizes requests from nodes based on a graph of related pods,secrets,configmaps,pvcs, and pvs:
* Watches pods, adds edges (secret -> pod, configmap -> pod, pvc -> pod, pod -> node)
* Watches pvs, adds edges (secret -> pv, pv -> pvc)
* When both Node and RBAC authorization modes are enabled, the default RBAC binding that grants the `system:node` role to the `system:nodes` group is not automatically created.
* Tightens the `NodeRestriction` admission plugin to require identifiable nodes for requests from users in the `system:nodes` group.
This authorization mode is intended to be used in combination with the `NodeRestriction` admission plugin, which limits the pods and nodes a node may modify. To enable in combination with RBAC authorization and the NodeRestriction admission plugin:
* start the API server with `--authorization-mode=Node,RBAC --admission-control=...,NodeRestriction,...`
* start kubelets with TLS boostrapping or with client credentials that place them in the `system:nodes` group with a username of `system:node:<nodeName>`
```release-note
kube-apiserver: a new authorization mode (`--authorization-mode=Node`) authorizes nodes to access secrets, configmaps, persistent volume claims and persistent volumes related to their pods.
* Nodes must use client credentials that place them in the `system:nodes` group with a username of `system:node:<nodeName>` in order to be authorized by the node authorizer (the credentials obtained by the kubelet via TLS bootstrapping satisfy these requirements)
* When used in combination with the `RBAC` authorization mode (`--authorization-mode=Node,RBAC`), the `system:node` role is no longer automatically granted to the `system:nodes` group.
```
```release-note
RBAC: the automatic binding of the `system:node` role to the `system:nodes` group is deprecated and will not be created in future releases. It is recommended that nodes be authorized using the new `Node` authorization mode instead. Installations that wish to continue giving all members of the `system:nodes` group the `system:node` role (which grants broad read access, including all secrets and configmaps) must create an installation-specific ClusterRoleBinding.
```
Follow-up:
- [ ] enable e2e CI environment with admission and authorizer enabled (blocked by kubelet TLS bootstrapping enablement in https://github.com/kubernetes/kubernetes/pull/40760)
- [ ] optionally enable this authorizer and admission plugin in kubeadm
- [ ] optionally enable this authorizer and admission plugin in kube-up
Automatic merge from submit-queue
Switch gcloud compute copy-files to scp
gcloud is deprecating `gcloud compute copy-files` and switching to `gcloud compute scp`. Make the change before things start to break.
https://cloud.google.com/sdk/gcloud/reference/compute/copy-files
Warnings we get: `W0529 10:28:59.097] WARNING: `gcloud compute copy-files` is deprecated. Please use `gcloud compute scp` instead. Note that `gcloud compute scp` does not have recursive copy on by default. To turn on recursion, use the `--recurse` flag.`
/cc @jlowdermilk
Automatic merge from submit-queue
Support grabbing test suite metrics
**What this PR does / why we need it**:
Add support for grabbing metrics that cover the entire test suite's execution.
Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.
If you enable `--gather-suite-metrics-at-teardown`, the metrics file is written to a file with a name such as `MetricsForE2ESuite_2017-05-25T20:25:57Z.json` in the `--report-dir`. If you don't specify `--report-dir`, the metrics are written to the test log output.
I'd like to enable this for some of the `pull-*` CI jobs, which will require a separate PR to test-infra.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-testing-pr-reviews @smarterclayton @wojtek-t @gmarek @derekwaynecarr @timothysc
Automatic merge from submit-queue (batch tested with PRs 45534, 37212, 46613, 46350)
[e2e]Fix define redundant parameter
When timeout to reach HTTP service, redundant parameter make the
error is nil.
Automatic merge from submit-queue
no-snat test
This test checks that Pods can communicate with each other in the same cluster without SNAT.
I intend to create a job that runs this in small clusters (\~3 nodes) at a low frequency (\~once per day) so that we have a signal as we work on allowing multiple non-masquerade CIDRs to be configured (see [kubernetes-incubator/ip-masq-agent](https://github.com/kubernetes-incubator/ip-masq-agent), for example).
/cc @dnardo
Automatic merge from submit-queue
Added k82cn as one of scheduler approver.
According to the requirement of Approver at [community-membership.md](https://github.com/kubernetes/community/blob/master/community-membership.md), I meet the requirements as follow; so I'd like to add myself as an approver of scheduler.
* Reviewer of the codebase for at least 3 months
[k82cn]: [~3 months](6cc40678b6 )
* Primary reviewer for at least 10 substantial PRs to the codebase
[k82cn] Reviewed [40 PRs](https://github.com/issues?q=assignee%3Ak82cn+is%3Aclosed)
* Reviewed or merged at least 30 PRs to the codebase
[k82cn]: 71 merged PRs in kubernetes/kubernetes, and ~100 PRs in kuberentes at https://goo.gl/j2D1fR
As an approver,
* I agree to only approve familiar PRs
* I agree to be responsive to review/approve requests as per community expectations
* I agree to continue my reviewer work as per community expectations
* I agree to continue my contribution, e.g. PRs, mentor contributors
Automatic merge from submit-queue (batch tested with PRs 46407, 46457)
GCE - Refactor API for firewall and backend service creation
**What this PR does / why we need it**:
- Currently, firewall creation function actually instantiates the firewall object; this is inconsistent with the rest of GCE api calls. The API normally gets passed in an existing object.
- Necessary information for firewall creation, (`computeHostTags`,`nodeTags`,`networkURL`,`subnetworkURL`,`region`) were private to within the package. These now have public getters.
- Consumers might need to know whether the cluster is running on a cross-project network. A new `OnXPN` func will make that information available.
- Backend services for regions have been added. Global ones have been renamed to specify global.
- NamedPort management of instance groups has been changed from an `AddPortsToInstanceGroup` func (and missing complementary `Remove...`) to a single, simple `SetNamedPortsOfInstanceGroup`
- Addressed nitpick review comments of #45524
ILB needs the regional backend services and firewall refactor. The ingress controller needs the new `OnXPN` func to decide whether to create a firewall.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Move NetworkPolicy to v1
Move NetworkPolicy to v1
@kubernetes/sig-network-misc
**Release note**:
```release-note
NetworkPolicy has been moved from `extensions/v1beta1` to the new
`networking.k8s.io/v1` API group. The structure remains unchanged from
the beta1 API.
The `net.beta.kubernetes.io/network-policy` annotation on Namespaces
to opt in to isolation has been removed. Instead, isolation is now
determined at a per-pod level, with pods being isolated if there is
any NetworkPolicy whose spec.podSelector targets them. Pods that are
targeted by NetworkPolicies accept traffic that is accepted by any of
the NetworkPolicies (and nothing else), and pods that are not targeted
by any NetworkPolicy accept all traffic by default.
Action Required:
When upgrading to Kubernetes 1.7 (and a network plugin that supports
the new NetworkPolicy v1 semantics), to ensure full behavioral
compatibility with v1beta1:
1. In Namespaces that previously had the "DefaultDeny" annotation,
you can create equivalent v1 semantics by creating a
NetworkPolicy that matches all pods but does not allow any
traffic:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
name: default-deny
spec:
podSelector:
This will ensure that pods that aren't match by any other
NetworkPolicy will continue to be fully-isolated, as they were
before.
2. In Namespaces that previously did not have the "DefaultDeny"
annotation, you should delete any existing NetworkPolicy
objects. These would have had no effect before, but with v1
semantics they might cause some traffic to be blocked that you
didn't intend to be blocked.
```
Automatic merge from submit-queue
Apply KubeProxyEndpointLagTimeout to ESIPP tests
Fixes#46533.
The previous construction of ESIPP tests is weird, so I redo it a bit.
A 30 seconds `KubeProxyEndpointLagTimeout` is introduced, as these tests ain't verifying performance, may be better to not make it too tight.
/assign @thockin
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Make GCE load-balancers create health checks for nodes
From #14661. Proposal on kubernetes/community#552. Fixes#46313.
Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew:
- Don't create nodes health check if any nodes has version < 1.7.0.
- Don't backfill nodes health check on existing LBs unless users explicitly trigger it.
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
Remove the reduplicated case judement
This patch remove the reduplicated case judgement
Automatic merge from submit-queue (batch tested with PRs 45809, 46515, 46484, 46516, 45614)
Fix incorrect printf format
**What this PR does / why we need it**: changes `%s` to `%d` for something that is actually an `int` (found via `make vet`).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
Controller history
**What this PR does / why we need it**:
Implements the ControllerRevision API object and clientset to allow for the implementation of StatefulSet update and DaemonSet history
```release-note
ControllerRevision type added for StatefulSet and DaemonSet history.
```
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
Summary Test looks at pods that have containers that restart.
Occasionally, the node can report extra containers that had been restarted through the summary API.
This test change tests a pod that restarts, and hopefully should allow us to reproduce and debug this behavior.
/assign @dchen1107
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 46429, 46308, 46395, 45867, 45492)
Bump Go version to 1.8.3
This PR also removed this patched version of Go 1.8.1 which we used to use to workaround performance problem of Go 1.8.1.
Fix https://github.com/kubernetes/kubernetes/issues/45216
Ref #46391
@timothysc @bradfitz
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)
Copy kubeconfig to kubemark master
This should save the effort of digging through jenkins agent and its container to get the kubeconfig.
Ideally we should have kubectl directly working on the kubemark master, but I'm facing some issues due to older version of kubectl present by default on the node.
cc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)
e2e tests for storage policy support in Kubernetes
This PR covers e2e test cases for vSphere storage policy support in Kubernetes - #46176.
The following test scenario have been implemented.
- Specify only SPBM storage policy name.
- Verify if the disk is provisioned on a compatible datastore with max free space.
- Specify a storage policy name which is not defined on VC.
- Verify if PVC create errors out that no pbm profile with this policy is found.
- Specify both SPBM storage policy name and VSAN capabilities together.
- Verify if PVC create errors out that you can't use both SPBM policy name with VSAN capabilities. You can only specify one.
- Specify SPBM storage policy name with user specified datastore which is non-compatible.
- Verify if PVC create errors out that it can't provision a disk on a non-compatible datastore.
@jeffvance @divyenpatel
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 45949, 46009, 46320, 46423, 46437)
Unregister some metrics
delete some registered metrics since they are not observed
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45269, 46219, 45966)
Add overriding Stackdriver API endpoint
Allow using Stackdriver test endpoint.
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)
GCE - Retrieve subnetwork name/url from gce.conf
**What this PR does / why we need it**:
Features like ILB require specifying the subnetwork if the network is type manual.
**Notes:**
The network URL can be [constructed](68e7e18698/pkg/cloudprovider/providers/gce/gce.go (L211-L217)) by fetching instance metadata; however, the subnetwork is not provided through this feature. Users must specify the subnetwork name/url through the gce.conf.
Although multiple subnets can exist in the same region for a network, the cloud provider will only use one subnet url for creating LBs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)
Subresources are not included in apiserver prometheus metrics
Subresources are very often completely different code paths and errors
generated on those code paths are important to distinguish.
@kubernetes/sig-api-machinery-pr-reviews
```release-note
The Prometheus metrics for the kube-apiserver for tracking incoming API requests and latencies now return the `subresource` label for correctly attributing the type of API call.
```
Automatic merge from submit-queue (batch tested with PRs 45913, 46065, 46352, 46363, 46373)
Fix CheckPodsCondition to print out the correct podName
From a couple CIs (https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-serial/1114, https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-gci-qa-serial-master/2246, https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gke-pre-release/2187), all indicate we print out the wrong pod name in CheckPodsCondition for _"Pod XXX failed to be running and ready, or succeeded."_:
```
I0524 02:09:50.173] May 24 02:09:50.173: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m55.033881993s elapsed)
I0524 02:09:52.178] May 24 02:09:52.178: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m57.03848264s elapsed)
I0524 02:09:54.183] May 24 02:09:54.182: INFO: Waiting for pod heapster-v1.3.0-3806988011-kzkg6 in namespace 'kube-system' status to be 'running and ready, or succeeded'(found phase: "Running", readiness: false) (4m59.043463323s elapsed)
I0524 02:09:56.183] May 24 02:09:56.183: INFO: Pod fluentd-gcp-v2.0-6wf67 failed to be running and ready, or succeeded.
I0524 02:09:56.184] May 24 02:09:56.183: INFO: Wanted all 23 pods to be running and ready, or succeeded. Result: false. Pods: [heapster-v1.3.0-3806988011-kzkg6 kube-proxy-bootstrap-e2e-minion-group-bbwn rescheduler-v0.3.0-bootstrap-e2e-master monitoring-influxdb-grafana-v4-1q59k l7-default-backend-1044750973-zgxsc etcd-server-events-bootstrap-e2e-master kube-apiserver-bootstrap-e2e-master kube-proxy-bootstrap-e2e-minion-group-6nqb kube-proxy-bootstrap-e2e-minion-group-mzbz fluentd-gcp-v2.0-chd2x kube-dns-806549836-f8p46 fluentd-gcp-v2.0-44x97 kube-dns-autoscaler-2528518105-vlg8t fluentd-gcp-v2.0-p1h4b kube-controller-manager-bootstrap-e2e-master l7-lb-controller-v0.9.3-bootstrap-e2e-master kubernetes-dashboard-2917854236-tn3nx kube-dns-806549836-fq2fp kube-scheduler-bootstrap-e2e-master etcd-empty-dir-cleanup-bootstrap-e2e-master kube-addon-manager-bootstrap-e2e-master etcd-server-bootstrap-e2e-master fluentd-gcp-v2.0-6wf67]
I0524 02:09:56.184] May 24 02:09:56.183: INFO: At least one pod wasn't running and ready or succeeded at test start.
I0524 02:09:56.184] [AfterEach] [k8s.io] Restart [Disruptive]
```
Check the codes and found we always print out the last pod name, which is random. Pass the pod name into channel to fix.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45913, 46065, 46352, 46363, 46373)
Detect cohabitating resources in etcd storage test
**What this PR does / why we need it**:
This change updates the etcd storage path test to detect cohabitating resources by looking at their expected location in etcd. This was not detected in the past because the GVK check did not span across groups.
To limit noise from failures caused by multiple objects at the same location in etcd, the test now fails when different GVRs share the same expected path. Thus every object is expected to have a unique path.
@liggitt PTAL
Signed-off-by: Monis Khan <mkhan@redhat.com>
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)
Dont attach a GPU to ubuntu test machines for node e2e serial tests
This should fix flakes in the e2e_node serial suite.
@vishh I think this is what you were asking for...
/assign @vishh
Automatic merge from submit-queue (batch tested with PRs 46299, 46309, 46311, 46303, 46150)
Move docker validation test to separate project.
Docker validation test is leaking VMs because new docker version `DOCKER_VERSION=17.05.0-c` totally breaks the new gci image `GCE_IMAGES=gci-test-60-9579-0-0` with the `gci-docker-version` metadata specified.
The test successfully created the instance, but timed out when checking VM aliveness, and leaked the VM.
I've cleaned up all leaked VMs. This PR moves docker validation node e2e test into a separate project to not influencing other node e2e test.
@kewu1992 We should fix the docker automated validation test.
/cc @dchen1107 @yujuhong @abgworrall
Automatic merge from submit-queue (batch tested with PRs 42042, 46139, 46126, 46258, 46312)
Remove unused test properties
Issue: #42676
A separate serial memcg suite was created for the initial stages of re-enabling memcg notifications. Now that all e2e tests have memcg notifications enabled, this suite is no longer needed.
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
Chaosmonkey - Signal stop to tests and wait for done when disruption fails
**What this PR does / why we need it**:
Prevents tests from leaking resources because their Teardown was never called when test disruption fails.
**Which issue this PR fixes**
First problem of #45842
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46149, 45897, 46293, 46296, 46194)
GC: update required verbs for deletable resources, allow list of ignored resources to be customized
The garbage collector controller currently needs to list, watch, get,
patch, update, and delete resources. Update the criteria for
deletable resources to reflect this.
Also allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
cc @caesarxuchao @deads2k @smarterclayton @mfojtik @liggitt @sttts @kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue
Enable "kick the tires" support for Nvidia GPUs in COS
This PR provides an installation daemonset that will install Nvidia CUDA drivers on Google Container Optimized OS (COS).
User space libraries and debug utilities from the Nvidia driver installation are made available on the host in a special directory on the host -
* `/home/kubernetes/bin/nvidia/lib` for libraries
* `/home/kubernetes/bin/nvidia/bin` for debug utilities
Containers that run CUDA applications on COS are expected to consume the libraries and debug utilities (if necessary) from the host directories using `HostPath` volumes.
Note: This solution requires updating Pod Spec across distros. This is a known issue and will be addressed in the future. Until then CUDA workloads will not be portable.
This PR updates the COS base image version to m59. This is coupled with this PR for the following reasons:
1. Driver installation requires disabling a kernel feature in COS.
2. The kernel API for disabling this interface changed across COS versions
3. If the COS image update is not handled in this PR, then a subsequent COS image update will break GPU integration and will require an update to the installation scripts in this PR.
4. Instead of having to post `3` PRs, one each for adding the basic installer, updating COS to m59, and then updating the installer again, this PR combines all the changes to reduce review overhead and latency, and additional noise that will be created when GPU tests break.
**Try out this PR**
1. Get Quota for GPUs in any region
2. `export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci`
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1" cluster/kube-up.sh`
4. `kubectl create -f cluster/gce/gci/nvidia-gpus/cos-installer-daemonset.yaml`
5. Run your CUDA app in a pod.
**Another option is to run a e2e manually to try out this PR**
1. Get Quota for GPUs in any region
2. export `KUBE_GCE_ZONE=<zone-with-gpus>` KUBE_NODE_OS_DISTRIBUTION=gci
3. `NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"`
4. `go run hack/e2e.go -- --up`
5. `hack/ginkgo-e2e.sh --ginkgo.focus="\[Feature:GPU\]"`
The e2e will install the drivers automatically using the daemonset and then run test workloads to validate driver integration.
TODO:
- [x] Update COS image version to m59 release.
- [x] Remove sleep from the install script and add it to the daemonset
- [x] Add an e2e that will run the daemonset and run a sample CUDA app on COS clusters.
- [x] Setup a test project with necessary quota to run GPU tests against HEAD to start with https://github.com/kubernetes/test-infra/pull/2759
- [x] Update node e2e serial configs to install nvidia drivers on COS by default
Automatic merge from submit-queue (batch tested with PRs 45587, 46286)
PDB Max Unavailable Field
Completes https://github.com/kubernetes/features/issues/285
```release-note
Adds a MaxUnavailable field to PodDisruptionBudget
```
Individual commits are self-contained; Last commit can be ignored because it is autogenerated code.
cc @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews
Allow the list of resources the garbage collector controller should
ignore to be customizable, so downstream integrators can add their own
resources to the list, if necessary.
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)
Scheduler should use a shared informer, and fix broken watch behavior for cached watches
Can be used either from a true shared informer or a local shared
informer created just for the scheduler.
Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event. This means that we broke behavior when introducing the watch cache. This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.
```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect. If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent. However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version. That meant the new object was not matched by the filter. This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)
Use shared informers in gc controller if possible
Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.
cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 38990, 45781, 46225, 44899, 43663)
Support parallel scaling on StatefulSets
Fixes#41255
```release-note
StatefulSets now include an alpha scaling feature accessible by setting the `spec.podManagementPolicy` field to `Parallel`. The controller will not wait for pods to be ready before adding the other pods, and will replace deleted pods as needed. Since parallel scaling creates pods out of order, you cannot depend on predictable membership changes within your set.
```
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)
test/images/network-tester:bump rc/pod image version to 1.9
Current image version is 1.9,update the image version of the associated json file to 1.9
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46133, 46211, 46224, 46205, 45910)
Make CPU request for heapster in kubemark scale with the number of Nodes
This change updates the etcd storage path test to detect cohabitating
resources by looking at their expected location in etcd. This was not
detected in the past because the GVK check did not span across groups.
To limit noise from failures caused by multiple objects at the same
location in etcd, the test now fails when different GVRs share the same
expected path. Thus every object is expected to have a unique path.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue
Add script to dump kubemark master logs
First step towards solving the issue https://github.com/kubernetes/kubernetes/issues/46109.
cc @kubernetes/test-infra-maintainers @wojtek-t @gmarek
Packaged the script as a docker container stored in gcr.io/google-containers
A daemonset deployment is included to make it easy to consume the installer
A cluster e2e has been added to test the installation daemonset along with verifying installation
by using a sample CUDA application.
Node e2e for GPUs updated to avoid running on nodes without GPU devices.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Previously, the scheduler created two separate list watchers. This
changes the scheduler to be able to leverage a shared informer, whether
passed in externally or spawned using the new in place method. This
removes the last use of a "special" informer in the codebase.
Allows someone wrapping the scheduler to use a shared informer if they
have more information avaliable.
Tokens controller previously needed a bit of extra help in order to be
safe for concurrent use. The new MutationCache allows it to keep a local
cache and still use a shared informer. The filtering event handler lets
it only see changes to secrets it cares about.
Automatic merge from submit-queue (batch tested with PRs 46014, 46152)
Updated test/test_owners.csv for federation test cases
To the best of my knowledge have updated the test owners for federation e2e test cases. PTAL and comment if any concern.
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-federation-pr-reviews @fejta
/assign @madhusudancs
Automatic merge from submit-queue (batch tested with PRs 46033, 46122, 46053, 46018, 45981)
Command tree and exported env in kubectl plugins
This is part of `kubectl` plugins V1:
- Adds support to several env vars passing context information to the plugin. Plugins can make use of them to connect to the REST API, access global flags, get the path of the plugin caller (so that `kubectl` can be invoked) and so on. Exported env vars include
- `KUBECTL_PLUGINS_DESCRIPTOR_*`: the plugin descriptor fields
- `KUBECTL_PLUGINS_GLOBAL_FLAG_*`: one for each global flag, useful to access namespace, context, etc
- ~`KUBECTL_PLUGINS_REST_CLIENT_CONFIG_*`: one for most fields in `rest.Config` so that a REST client can be built.~
- `KUBECTL_PLUGINS_CALLER`: path to `kubectl`
- `KUBECTL_PLUGINS_CURRENT_NAMESPACE`: namespace in use
- Adds support for plugins as child of other plugins so that a tree of commands can be built (e.g. `kubectl myplugin list`, `kubectl myplugin add`, etc)
**Release note**:
```release-note
Added support to a hierarchy of kubectl plugins (a tree of plugins as children of other plugins).
Added exported env vars to kubectl plugins so that plugin developers have access to global flags, namespace, the plugin descriptor and the full path to the caller binary.
```
@kubernetes/sig-cli-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 46033, 46122, 46053, 46018, 45981)
Log age of stats used for evictions during eviction tests
I recently added prometheus metrics for the age of the metrics used for evictions #43031. It would be nice to surface these during eviction tests, so I can better assess how old stats are, and whether or not the age of stats causes extra evictions.
This isnt super-high priority, and can be done after code-freeze, since it is a testing improvement. Feel free to take a look whenever either of you has time.
/assign @mtaufen
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 45346, 45903, 45958, 46042, 45975)
ResourceQuota admission control injects registry
**What this PR does / why we need it**:
The `ResourceQuota` admission controller works with a registry that maps a GroupKind to an Evaluator. The registry used in the existing plug-in is not injectable, which makes usage of the ResourceQuota plug-in in other API server contexts difficult. This PR updates the code to support late injection of the registry via a plug-in initializer.
Automatic merge from submit-queue (batch tested with PRs 45996, 46121, 45707, 46011, 45564)
Fix waitForNPods in restart.go
From https://github.com/kubernetes/kubernetes/issues/45991#issuecomment-302292404.
Don't redefine `pods` so we can return real pod names instead of empty array.
/assign @dchen1107 @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add node e2e tests for hostNetwork
**What this PR does / why we need it**:
Add node e2e tests for hostNetwork.
**Which issue this PR fixes**
Part of #44118.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @Random-Liu @yujuhong
Automatic merge from submit-queue (batch tested with PRs 45950, 45968)
[Federation] Remove redundant e2e for secret and daemonset
Federation of daemonset and secret types is now implemented by the sync controller, and e2e testing for each type is provided via crud lifecycle e2e tests. This renders the legacy e2e tests for these types redundant, and this commit removes those tests.
The secret wait and delete functions required by the ingress e2e tests have been retained and moved to ingress.go.
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue
Make real proxier in hollow-proxy optional (default=true)
Ref https://github.com/kubernetes/kubernetes/pull/45622
This allows using real proxier for hollow proxy, but we use the fake one by default.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
Add explicit image tag to cockroachdb example and test
@gyliu513
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
Add integration test for deployment
We have no integration test for Deployment currently. In this PR, add an integration test which covers an e2e test (create a new RollingUpdate deployment), add more replicas to the Deployment, and set minReadySeconds so that we can test maxUnavailable.
Plan to add more integration tests that cover e2e tests after this initial PR is merged.
@kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
[Federation] Add integration testing for cluster addition
This PR adds integration testing of the sync controller for cluster addition. This ensures coverage equivalency between the integration tests and the old controller unit tests, so those tests are removed by this PR.
Resolves#45257
cc: @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue
Remove the deprecated --babysit-daemons kubelet flag
```release-note
Removes the deprecated kubelet flag --babysit-daemons
```
This flag has been deprecated for over a year (git blame says marked deprecated on March 1, 2016).
Relatively easy removal - nothing in the Kubelet relies on it anymore.
There was still some stuff in the provisioning scripts. It was easy to rip out, but in general we probably need to be more disciplined about updating the provisioning scripts at the same time that we initially mark things deprecated.
Automatic merge from submit-queue
Move all API related annotations into annotation_key_constants.go
Separate from #45869. See https://github.com/kubernetes/kubernetes/pull/45869#discussion_r116839411 for details.
This PR does nothing but move constants around :)
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Federation of daemonset and secret types is now implemented by the
sync controller, and e2e testing for each type is provided via crud
lifecycle e2e tests. This renders the legacy e2e tests for these types
redundant, and this commit removes those tests.