Commit Graph

1210 Commits (e92f6eb0fe377f6d8943d3eca82d8ac28c6bb8e6)

Author SHA1 Message Date
Michael Taufen 33d4fe4074 dynamic config test: use a hyphen between the config name and the unique suffix 2017-12-21 16:05:50 -08:00
Michael Taufen 6ee191ab74 Refactor kubelet config controller bootstrap process
This makes the bootstrap feel much more linear and as a result it is
easier to read.

Also simplifies status reporting for local config.
2017-12-21 15:24:56 -08:00
Kubernetes Submit Queue 2a1cdfffaa
Merge pull request #57221 from mtaufen/kc-event
Automatic merge from submit-queue (batch tested with PRs 57434, 57221, 57417, 57474, 57481). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Send an event just before the Kubelet restarts to use a new config

**What this PR does / why we need it**:
This PR makes the Kubelet send events for configuration changes. This makes it much easier to see a recent history of configuration changes for the Kubelet. 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56895

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

/cc @dchen1107 @liggitt @dashpole
2017-12-20 17:42:37 -08:00
Kubernetes Submit Queue 09e5ce3056
Merge pull request #57434 from yguo0905/fix-gke-env-test
Automatic merge from submit-queue (batch tested with PRs 57434, 57221, 57417, 57474, 57481). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Node e2e test: do not return error if Docker's check-config.sh fails

**What this PR does / why we need it**:

This PR fixes the issue in https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2enode-ubuntudev2-k8sbeta-gkespec/1#k8sio-gke-system-requirements-conformance-featuregkeenv-the-docker-configuration-validation-should-pass

https://github.com/moby/moby/pull/28007 changed the Docker's check-config.sh to return a non-zero exit code if the check fails. But we expect certain configs to be missing, so we should ignore the exit code.

This needs to be cherry-picked into 1.9.

**Release note**:

```release-note
None
```

/area node-e2e
/kind bug
/assign @dchen1107
2017-12-20 17:42:34 -08:00
Michael Taufen d5d7d6d684 Send an event just before the Kubelet restarts to use a new config 2017-12-20 13:02:55 -08:00
Kubernetes Submit Queue 666b95743a
Merge pull request #54278 from foxyriver/fix-exit
Automatic merge from submit-queue (batch tested with PRs 54278, 56259, 56762). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

make sure of deleting archive

**What this PR does / why we need it**:

Exit() causes the current program to exit with the given status code, but deferred function does not run.
2017-12-20 12:32:32 -08:00
Yang Guo 7e1d74ead2 node_e2e: do not return error if Docker's check-config.sh fails 2017-12-19 20:50:09 -08:00
Wojciech Tyczynski 4e8526dc6b
Revert "Version bump to etcd v3.2.11, grpc v1.7.5" 2017-12-19 15:25:06 +01:00
Kubernetes Submit Queue a7b404ec7f
Merge pull request #57160 from jpbetz/etcd-client-3.2.11
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Version bump to etcd v3.2.11, grpc v1.7.5

Fix https://github.com/kubernetes/kubernetes/issues/56114: Update to etcd client 3.2.11

Version bumps:

- etcd from 3.1.10 to 3.2.11
- grpc from 1.3.0 to 1.7.5
- grpc-gateway from v1.1.0-25-g84398b9 to v1.3.0

TODO:

- [x] Apply etcd [3.2 client upgrade guide](https://github.com/coreos/etcd/blob/master/Documentation/upgrades/upgrade_3_2.md)
- [x] Apply grpc API changes in 1.6.0 and 1.7.0 [release notes](https://github.com/grpc/grpc-go/releases)
- [x] bbolt was pulled in transitively, why? We have tests that embed etcd, so we must vendor the etcd server and all it's dependencies.
- [x] Upgrade to containerd v1.0.0? Currently kubernetes depends on containerd v1.0.0-beta.2-159-g27d450a0 which depends on grpc v1.3.0, but  containerd v1.0.0 depends on grpc 1.7.2. Not needed. The containerd grpc upgrade required [no code changes](ce3e32680d).
- [x] Fix all failing tests
- [x] Ensure we can safely upgrade grpc to 1.7.5 given that docker and cAdvisor still depend on grpc 1.3.0 (both in the versions we vend and on master for both projects). Should we hold off on this change until we have a docker release that uses gprc 1.7.x?
- [x] Wait for grpc 1.7.5 to be released (it will include https://github.com/grpc/grpc-go/pull/1747). Once released, bump grpc version in this PR and remove workarounds in `hack/godep-save.sh`.

Transitive dependencies on grpc:
- docker depends on grpc, but according to the package dependency graph (`go list -f '{{ .Deps }}'`) there are no dependencies from kubernetes to grpc via docker packages.
- containerd v1.0.0 depends on grpc 1.7.2, we should upgrade to containerd v1.0.0 soon, this can be done in a separate PR
- cadvisor depends on grpc 1.3.0 on master, it should upgrade it to grpc 1.7.5, this can be done in a separate PR

**Release note**:

```release-note
Upgrade to etcd client 3.2.11 and grpc 1.7.5 to improve HA etcd cluster stability.
```
2017-12-19 02:14:36 -08:00
Chris Glass 3e3385821a Do not require the linux headers to be installed.
The linux headers take significant disk space and are not necessary to
run kubernetes on a GKE node. User logging on to a node can trivially
install the kernel headers should they need to by running "apt-get install
linux-headers-$(uname -r)".

Signed-off-by: Chris Glass <chris.glass@canonical.com>
2017-12-19 10:58:13 +01:00
Chris Glass be41d1e2ce Do not require the vim package to be installed
The minimal Ubuntu image used on GKE nodes provides the vim editor as
part of system packages, as "vim.tiny". People logging on the nodes have a vim
environment available despite the "vim" package not being installed.

Signed-off-by: Chris Glass <chris.glass@canonical.com>
2017-12-19 10:58:13 +01:00
Kubernetes Submit Queue c16a336eaa
Merge pull request #55475 from ScorpioCPH/fix-e2e-local-test
Automatic merge from submit-queue (batch tested with PRs 55475, 57155, 57260, 57222). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix e2e local test

**What this PR does / why we need it**:

Fix some issue on local e2e_node test: `Can't start e2e service "kubelet"`

**Which issue(s) this PR fixes**:

Fixes https://github.com/kubernetes/kubernetes/issues/54622

**Special notes for your reviewer**:

**Release note**:

```release-note

```
/sig node
2017-12-18 19:45:38 -08:00
Kubernetes Submit Queue 6eb6955371
Merge pull request #56108 from tpepper/trivial
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

e2e_node: small fixes to setup_host.sh for Ubuntu Trusty

**What this PR does / why we need it**:
Two small fixes for how nsenter is built from source and installed on Ubuntu Trusty:
1. Use mktemp for the creating the build directory instead of a hard coded name
2. Use current (2.31) util-linux instead of 3+ year old version

**Which issue(s) this PR fixes**:

Fixes #56106 

**Special notes for your reviewer**:

See https://github.com/kubernetes/kubernetes/issues/56106 for some thoughts on other ways to address this.  My patch for util-linux 2.31 may just be a band-aid?

**Release note**:
```release-note
NONE
```
2017-12-18 16:36:26 -08:00
Joe Betz 94f2ed6849 Fix build and test errors from etcd 3.2.11 upgrade 2017-12-18 16:17:42 -08:00
Tim Pepper 3f4294cdf9 e2e_node: use newer util-linux
The e2e_node test environment setup script is hard coded to pull down a
quite old version of util-linux in order to build nsenter on trusty,
which sadly is well known to not include an nsenter executable.

While "just a test", it's unfortunate to be building from really old
util-linux sources when newer are available.

Signed-off-by: Tim Pepper <tpepper@vmware.com>
2017-12-18 13:18:36 -08:00
Tim Pepper 5256e26f68 e2e_node: use mktemp when building nsenter on trusty
There's a bit of a hack in place to insure nsenter is present on Ubuntu
trusty, which doesn't otherwise include it.  This downloads util-linux
to a hard coded directory in /tmp which is a bad practice.  Even though
"this is just a test case" it should properly use mktemp.

Signed-off-by: Tim Pepper <tpepper@vmware.com>
2017-12-18 13:18:36 -08:00
Tim Hockin f7be352a67 gcloud docker now auths k8s.gcr.io by default 2017-12-18 09:18:34 -08:00
Tim Hockin eba5b6092a Use k8s.gcr.io vanity domain for container images 2017-12-18 09:18:34 -08:00
Kubernetes Submit Queue 0a940bfd10
Merge pull request #56250 from ingvagabund/extend-node-e2e-test-suite-with-containerized-kubelet
Automatic merge from submit-queue (batch tested with PRs 56250, 56809, 56812, 56792, 56724). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Extend node e2e test suite with containerized Kubelet

How to run it?
```sh
make test-e2e-node \
    TEST_ARGS='--kubelet-containerized=true --hyperkube-image=hyperkube-amd64:1.9 --kubelet-flags="<FLAGS>"' \
    FOCUS="Conformance"
```
2017-12-16 07:46:38 -08:00
Kubernetes Submit Queue 8415e0c608
Merge pull request #56661 from xiangpengzhao/move-kubelet-constants
Automatic merge from submit-queue (batch tested with PRs 56410, 56707, 56661, 54998, 56722). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move some kubelet constants to a common place

**What this PR does / why we need it**:
More context, see: https://github.com/kubernetes/kubernetes/issues/56516
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56516
[thanks @ixdy for verifying this!]

**Special notes for your reviewer**:
@ixdy how can I verify #56516 against this locally?

/cc @ixdy @mtaufen 

**Release note**:

```release-note
NONE
```
2017-12-16 05:46:35 -08:00
Kubernetes Submit Queue 6c5eb50c8d
Merge pull request #55984 from derekwaynecarr/summary-tests
Automatic merge from submit-queue (batch tested with PRs 55954, 56037, 55866, 55984, 54994). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet summary api test updates

**What this PR does / why we need it**
Fixes https://github.com/kubernetes/kubernetes/issues/55985
Improve the accuracy of the test as follows:
- ensure memory bound checks for unconstrained group are limited by actual node capacity
- grow the fs capacity bounds so we can run on larger drives (i.e. my dev laptop)

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-12-13 23:25:59 -08:00
Kubernetes Submit Queue 7b0e6b7f14
Merge pull request #56909 from dashpole/fix_local_storage_test
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[Test Fix] Fix Broken LocalStorageEviction test

**What this PR does / why we need it**:
In #56643, I added a new test case, but did not rename the pod.  The test failed because the new pod can not be created since there is a name collision.
It fixes these new test failures: https://k8s-testgrid.appspot.com/sig-node-kubelet#kubelet-flaky-gce-e2e&include-filter-by-regex=LocalStorageCapacityIsolationEviction

**Special notes for your reviewer**:
I tested it this time

**Release note**:
```release-note
NONE
```

/assign @yujuhong 
/sig node
/kind bug
/priority critical-urgent
2017-12-07 01:30:05 -08:00
Kubernetes Submit Queue f875ee596d
Merge pull request #56519 from jiayingz/e2e-node-readiness
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make sure node is ready before calling getLocalNode to fix test failure.

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/56518

**Special notes for your reviewer**:

**Release note**:

```release-note

```
2017-12-06 18:29:19 -08:00
David Ashpole 239217f84a fix test 2017-12-06 16:55:28 -08:00
David Ashpole 7294a1d8fa make requirements less precise for disk eviction test 2017-12-06 12:58:53 -08:00
Kubernetes Submit Queue a48f11c225
Merge pull request #55448 from yguo0905/spec-change-for-new-kernel
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix GKE system spec for OS images with kernel version 4.10+

Two changes are required for validating images with kernel version 4.10+.

1. Update the spec in gke.yaml to allow 4.10+ kernel.
2. Update the GKE environment docker validation to allow `CONFIG_DEVPTS_MULTIPLE_INSTANCES` to be missing in >= 4.8 kernel because this option has been removed in 4.8.

**Release note**:

```
None
```

/assign @dchen1107
2017-12-06 08:20:24 -08:00
xiangpengzhao 1f2262e6b0 Move some kubelet constants to a common place. 2017-12-01 11:24:04 +08:00
Kubernetes Submit Queue f9c93154fc
Merge pull request #56497 from rphillips/fixes/fix_disk_metrics_coreos
Automatic merge from submit-queue (batch tested with PRs 56497, 56500, 55018, 56544, 56425). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

e2e: eviction test redirect dd stderr

**What this PR does / why we need it**: Redirects `dd` stderr to /dev/null

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56234 

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-29 15:25:58 -08:00
Jiaying Zhang 8d9a2e09c4 Make sure node is ready before calling getLocalNode to fix test failure. 2017-11-28 15:18:17 -08:00
Kubernetes Submit Queue 8226973ae8
Merge pull request #52144 from andyxning/fix_network_value_for_stats_summary
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix network value for stats summary for multiple network interfaces

This PR is part of [Heapster #1788](https://github.com/kubernetes/heapster/pull/1788). 

The original reason is when there are more than one none `lo`, `docker0`, `veth` network interfaces instead of just one `eth0`, the network interface value is only partial and does not correct. For now, summary stats api only gets the eth0 network interface values.

The original issues about this can be find in [Heapster #1058](https://github.com/kubernetes/heapster/issues/1058) and [Cadvisor #1593](https://github.com/google/cadvisor/issues/1593).

```release-note
Fix stats summary network value when multiple network interfaces are available.
```

/cc @DirectXMan12 @piosz @xiangpengzhao @vishh @timstclair
2017-11-28 14:59:08 -08:00
Ryan Phillips 769b9eed35 e2e: eviction test redirect dd stderr
* redirect stderr to /dev/null

Fixes #56234
2017-11-28 08:42:17 -06:00
Jan Chaloupka 607e863f85 WIP: extend node e2e test suite with containerized Kubelet 2017-11-28 15:29:27 +01:00
Aaron Crickenberger 040b80d9a7 Add [sig-node] to some unowned e2e_node tests
Follow the SIGDescribe pattern used in test/e2e/foo tests
2017-11-27 11:35:44 -05:00
Kubernetes Submit Queue 1098a3012a
Merge pull request #56228 from balajismaniam/cpuman-test-feature-tag
Automatic merge from submit-queue (batch tested with PRs 52767, 55065, 55148, 56228, 56221). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Feature tag to CPU Manager node e2e test.

**What this PR does / why we need it**: Adds a `Feature` tag to the CPU manager node e2e tests.

CC @ConnorDoyle
2017-11-22 19:49:39 -08:00
Balaji Subramaniam 7acdf3e15e Add Feature tag to CPU Manager node e2e test. 2017-11-22 11:49:35 -08:00
Connor Doyle 6b05ac10f0 Add balajismaniam, ConnorDoyle node-e2e approvers 2017-11-22 10:01:14 -08:00
Jing Xu a66ee2eb3f Add pod-level metric for CPU and memory stats
This PR adds the pod-level metrics for CPU and memory stats. cAdvisor
can get all pod cgroup information so we can add this pod-level CPU and
memory stats information from the corresponding pod cgroup
2017-11-22 09:25:23 -08:00
Jordan Liggitt ae7dccf2e9
Revert "Kubelet flags take precedence over config from files/ConfigMaps"
This reverts commit cbebb61450.
2017-11-21 23:55:43 -05:00
Kubernetes Submit Queue d7a96d5e88
Merge pull request #56097 from mtaufen/kc-file-e2e-test
Automatic merge from submit-queue (batch tested with PRs 51494, 56097, 56072, 56175). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet flags take precedence over config from files/ConfigMaps

Changes the Kubelet configuration flag precedence order so that flags
take precedence over config from files/ConfigMaps.

See:
https://docs.google.com/document/d/18-MsChpTkrMGCSqAQN9QGgWuuFoK90SznBbwVkfZryo/

Also modifies e2e node test suite to transform all relevant Kubelet flags into
a config file before starting tests when the KubeletConfigFile feature gate is
true, and turns on the KubeletConfigFile gate for all e2e node tests.
This allows the alpha dynamic Kubelet config feature to continue to 
work in tests after the precedence change.

fixes #56171

Related: https://github.com/kubernetes/features/issues/281


```release-note
CLI flags passed to the Kubelet now take precedence over Kubelet config files and dynamic Kubelet config. This helps ensure backwards compatible behavior across Kubelet binary updates.
```
2017-11-21 19:49:27 -08:00
Kubernetes Submit Queue 754017bef4
Merge pull request #56105 from balajismaniam/enable-cpuman-only-when-not-skipped
Automatic merge from submit-queue (batch tested with PRs 55340, 55329, 56168, 56170, 56105). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable cpu manager only if the node e2e test is not skipped.

**What this PR does / why we need it**: This PR enables cpu manager in Kubelet only if the node e2e tests are not skipped. This change fixes the failures seen in https://k8s-testgrid.appspot.com/sig-node-kubelet#kubelet-serial-gce-e2e. 

Fixes #56144
2017-11-21 18:56:39 -08:00
Kubernetes Submit Queue 3bb6eeeb07
Merge pull request #55340 from jiayingz/metrics
Automatic merge from submit-queue (batch tested with PRs 55340, 55329, 56168, 56170, 56105). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adds device plugin allocation latency metric.

For #53497


**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

```
2017-11-21 18:56:29 -08:00
Kubernetes Submit Queue 94a8d81172
Merge pull request #55447 from jingxu97/Nov/podmetric
Automatic merge from submit-queue (batch tested with PRs 55812, 55752, 55447, 55848, 50984). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Pod-level local ephemeral storage metric in Summary API

This PR adds pod-level ephemeral storage metric into Summary API.
Pod-level ephemeral storage usage is the sum of all containers and local
ephemeral volume including EmptyDir (if not backed up by memory or
hugepages), configueMap, and downwardAPI.
Address issue #55978

**Release note**:
```release-note
Add pod-level local ephemeral storage metric in Summary API. Pod-level ephemeral storage reports the total filesystem usage for the containers and emptyDir volumes in the measured Pod.
```
2017-11-21 17:57:34 -08:00
Michael Taufen cbebb61450 Kubelet flags take precedence over config from files/ConfigMaps
Changes the Kubelet configuration flag precedence order so that flags
take precedence over config from files/ConfigMaps.

See issue #56171 for more details.

Also modifies e2e node test suite to transform all relevant Kubelet
flags into a config file before starting tests when the
KubeletConfigFile feature gate is true, and turns on the
KubeletConfigFile gate for all e2e node tests. This allows the alpha
dynamic Kubelet config feature to continue to work in tests after
the precedence change.
2017-11-21 16:02:27 -08:00
Kubernetes Submit Queue 03b7d77be4
Merge pull request #54316 from dashpole/disk_request_eviction
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Take disk requests into account during evictions

fixes #54314

This PR is part of the local storage feature, and it makes the eviction manager take disk requests into account during disk evictions.
This uses the same eviction strategy as we do for memory.
Disk requests are only considered when the LocalStorageCapacityIsolation feature gate is enabled.  This is enforced by adding a check for the feature gate in getRequests().
I have added unit testing to ensure that previous behavior is preserved when the feature gate is disabled.
Most of the changes are testing.  Reviewers should focus on changes in **eviction/helpers.go**

/sig node
/assign @jingxu97  @vishh
2017-11-21 14:31:47 -08:00
Jiaying Zhang 048bafdd0b Adds device plugin registration count metric and allocation latency metric. 2017-11-21 13:44:10 -08:00
Balaji Subramaniam 16e0f12253 Enable cpu manager only if the test is not skipped.
- Also, if KubeReserved is nil, allocate a map.
2017-11-21 10:48:54 -08:00
David Ashpole 8b3bd5ae60 take disk requests into account during evictions 2017-11-21 10:21:30 -08:00
Jiaying Zhang 990113ce60 Extends gpu_device_plugin e2e_node test to verify that scheduled pods
can continue to run even after device plugin deletion and kubelet
restarts.
2017-11-20 23:40:27 -08:00
Kubernetes Submit Queue 9fe2a62b90
Merge pull request #55338 from dashpole/remove_disk_allocatable
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove Ephemeral Storage Allocatable Evictions

Issue #52336

Rationale and docs change: https://github.com/kubernetes/community/pull/1275

cc @kubernetes/sig-node-pr-reviews 
cc @derekwaynecarr @vishh 
/assign @jingxu97 
/assign @dchen1107
2017-11-20 21:43:24 -08:00
Jing Xu 75ef18c4d3 Add Pod-level local ephemeral storage metric in Summary API
This PR adds pod-level ephemeral storage metric into Summary API.
Pod-level ephemeral storage usage is the sum of all containers and local
ephemeral volume including EmptyDir (if not backed up by memory or
hugepages), configueMap, and downwardAPI.
2017-11-20 16:32:38 -08:00
Kubernetes Submit Queue 3679b54b19
Merge pull request #55898 from dashpole/fix_flaky_allocatable
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix Flaky Allocatable Setup Tests

**What this PR does / why we need it**:
Fixes a flaky node e2e serial test.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55830

**Special notes for your reviewer**:
The test was flaking because we were reading the node status before the restarted kubelet had written it.
This fixes this by waiting until we see an updated node status (looking at the condition's heartbeat time).
This also fixes an incorrect error message.

**Release note**:
```release-note
NONE
```
2017-11-18 13:13:24 -08:00
Kubernetes Submit Queue 7d1085e122
Merge pull request #54837 from xiangpengzhao/conf-test
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use framework.ConformanceIt for node e2e conformance tests

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #54726 #53909

**Special notes for your reviewer**:
/cc @mml 

**Release note**:

```release-note
NONE
```
2017-11-18 13:13:17 -08:00
Kubernetes Submit Queue ef3b27cbd4
Merge pull request #55642 from dashpole/disable_cadvisor_disk_for_cri
Automatic merge from submit-queue (batch tested with PRs 55642, 55897, 55835, 55496, 55313). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Disable container disk metrics when using the CRI stats integration

Issue: https://github.com/kubernetes/kubernetes/issues/51798

As explained in the issue, runtimes which make use of the CRI Stats API still have the performance overhead of collecting those same stats through cAdvisor.
The CRI Stats API has metrics for CPU, Memory, and Disk.  This PR significantly reduces the added overhead due to collecting these stats in both cAdvisor and in the runtime.
This PR disables container disk metrics, which are very expensive to collect.

This PR does not disable node-level disk stats, as the "Raw" container handler does not currently respect ignoring DiskUsageMetrics.
This PR factors out the logic for determining whether or not to use the CRI stats provider into a helper function, as cAdvisor is instantiated before it is passed to the kubelet as a dependency.

cc @kubernetes/sig-node-pr-reviews @derekwaynecarr  
/kind feature
/sig node

/assign @Random-Liu @derekwaynecarr
2017-11-18 10:46:30 -08:00
David Ashpole 527611ee41 remove disk allocatable evictions 2017-11-18 10:34:59 -08:00
Derek Carr db89b46ce7 kubelet summary api test updates 2017-11-17 22:30:49 -05:00
Andy Xie 64a8edfbcf fix network value for stats summary 2017-11-18 10:17:59 +08:00
xiangpengzhao 7fdea2b0cf Use framework.ConformanceIt for node e2e conformance tests 2017-11-17 17:28:20 +08:00
Michael Taufen 1085b6f730 Lift embedded structure out of eviction-related KubeletConfiguration fields
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
  - `EvictionHard`
  - `EvictionSoft`
  - `EvictionSoftGracePeriod`
  - `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.

Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.

Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
2017-11-16 18:35:13 -08:00
David Ashpole 8f3e2f315e fix flaky allocatable test 2017-11-16 11:16:58 -08:00
Kubernetes Submit Queue 779105673a
Merge pull request #55188 from mindprince/accelerator-monitoring
Automatic merge from submit-queue (batch tested with PRs 55798, 49579, 54862, 55188, 51990). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add monitoring support for hardware accelerators

Currently only NVIDIA GPU monitoring is implemented.

Feature repo issue: https://github.com/kubernetes/features/issues/369
cAdvisor PR: https://github.com/google/cadvisor/pull/1762

/kind feature
/sig node
/sig instrumentation
/area hw-accelerators

**Release note**:
```release-note
Kubelet now exposes metrics for NVIDIA GPUs attached to the containers.
```
2017-11-16 03:09:21 -08:00
Yang Guo 7eb7cfe3ef Add a cloud-init script to disable live-restore 2017-11-14 21:40:13 -08:00
David Ashpole 220edbc6e3 disable container disk metrics when using the CRI stats integration 2017-11-14 11:43:08 -08:00
Kubernetes Submit Queue 41fe3ed5bc
Merge pull request #54405 from resouer/clean-docker-dep
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[Part 1] Remove docker dep in kubelet startup

**What this PR does / why we need it**:

Remove dependency of docker during kubelet start up.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Part 1 of #54090 

**Special notes for your reviewer**:
Changes include:

1. Move docker client initialization into dockershim pkg.
2. Pass a docker `ClientConfig` from kubelet to dockershim
3. Pass parameters needed by `FakeDockerClient` thru `ClientConfig` to dockershim

(TODO, the second part) Make dockershim tolerate when dockerd is down, otherwise it will still fail kubelet

Please note after this PR, kubelet will still fail if dockerd is down, this will be fixed in the subsequent PR by making dockershim tolerate dockerd failure (initializing docker client in a separate goroutine), and refactoring cgroup and log driver detection. 

**Release note**:

```release-note
Remove docker dependency during kubelet start up 
```
2017-11-13 03:59:53 -08:00
Rohit Agarwal 9c38abd482 Expose accelerator metrics in the summary API. 2017-11-10 14:59:43 -08:00
Yang Guo ed8cd396dd Use whitelisted test image 2017-11-10 14:16:27 -08:00
Yang Guo 8ea9417a37 Adjust GKE spec to validate images with kernel version 4.10+ 2017-11-10 09:47:08 -08:00
Penghao Cen 22b04c828b Append --feature-gates option iff TestContext.FeatureGates is not nil 2017-11-10 19:42:22 +08:00
Dr. Stefan Schimanski bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski 012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Kubernetes Submit Queue f7dc3966a4
Merge pull request #47497 from mikedanese/binary
Automatic merge from submit-queue (batch tested with PRs 54773, 52523, 47497, 55356, 49429). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

don't check in mounter binary

```release-note
GCI mounter is moved from the manifests tarball to the server tarball.
```
2017-11-08 22:11:53 -08:00
Kubernetes Submit Queue fdeeed1001
Merge pull request #54688 from yanxuean/besteffort
Automatic merge from submit-queue (batch tested with PRs 53645, 54734, 54586, 55015, 54688). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

e2e-node:the value of bestEffortCgroup is wrong

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>

**What this PR does / why we need it**:
The value of bestEffortCgroup is wrong in e2e-node. The test case is invalid actually.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-06 15:33:50 -08:00
zouyee 68c5ce19b8 [test/e2e_node]Redirect dl.k8s.io to the kubernetes-release GCS bucket 2017-11-02 12:18:50 +08:00
Harry Zhang de1c305356 Remove docker dep in kubelet startup
Update bazel
2017-11-01 10:03:01 +08:00
Kubernetes Submit Queue 94935721d5
Merge pull request #54160 from mtaufen/runtime-config-to-flags
Automatic merge from submit-queue (batch tested with PRs 54160, 54016). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move runtime-related flags from KubeletConfiguration to KubeletFlags

With respect to https://github.com/kubernetes/kubernetes/pull/53833#issuecomment-336317287, move runtime-related flags out of KubeletConfiguration.

Broader issue: https://github.com/kubernetes/features/issues/281

```release-note
NONE
```
2017-10-31 01:23:15 -07:00
Mike Danese bef68f7dbc cluster: build gci mounter like other go binaries 2017-10-30 13:56:09 -07:00
Kubernetes Submit Queue eff1a84638
Merge pull request #52256 from feiskyer/credential-provider-test
Automatic merge from submit-queue (batch tested with PRs 49762, 52256). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add node e2e tests for pulling images from credential providers

**What this PR does / why we need it**:

Add node e2e tests for pulling images from credential providers.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Refer https://github.com/kubernetes/kubernetes/pull/51870#issuecomment-328234010

**Special notes for your reviewer**:

/assign @yujuhong @Random-Liu 

1. We still need to add ResetDefaultDockerProviderExpiration for facilitating tests
2. Do we need a separate image for pulling private image from credential provider?
3. Any suggestion of also adding this for sandbox images? the pause image is a global config of kubelet, but we only need to set a private one for just one test case. 

**Release note**:

```release-note
NONE
```
2017-10-27 22:48:28 -07:00
yanxuean f849ebdefa e2e-node:the value of bestEffortCgroup is wrong
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-27 17:10:53 +08:00
Kevin 4c8539cece use core client with explicit version globally 2017-10-27 15:48:32 +08:00
Kubernetes Submit Queue 931bc9edf4 Merge pull request #53730 from bsteciuk/kubeadm-windows-e2e_node
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Windows support to the system verification check

**What this PR does / why we need it**:  This PR (in conjunction with https://github.com/kubernetes/kubernetes/pull/53553 ) adds initial support for adding a Windows worker node to a Kubernetes cluster using
 kubeadm.  It was suggested on that PR to open a separate PR for the changes in test/e2e_node for review by sig-node devs.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #364 in conjuction with #53553 

**Special notes for your reviewer**:

**Release note**:

```release-note
Add Windows support to the system verification check
```
2017-10-26 19:23:01 -07:00
Pengfei Ni 2bbba1f662 Add node e2e tests for pulling images from credential providers 2017-10-26 20:55:13 +08:00
Bob Steciuk 94db64fcb3 Add Windows support to the system verification check
Pulled SysSpecs out of types.go and created two os specific implementations with build tags

Similarly created conditionally compiled implementations of KernelValidationHelper to get Kernel version in os specific manner, as well as os specific docker endpoints (socket vs named pipes)
2017-10-25 18:55:47 -04:00
Kubernetes Submit Queue 1336cc0b05 Merge pull request #53051 from tanshanshan/test925
Automatic merge from submit-queue (batch tested with PRs 53051, 52489, 53920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix todo

**What this PR does / why we need it**:
fix todo 
thanks
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-24 21:38:17 -07:00
Michael Taufen f90b46c784 Move runtime-related flags from KubeletConfiguration to KubeletFlags 2017-10-23 11:15:48 -07:00
Sen Lu 0538b65421 Add a notice for node e2e config files 2017-10-20 16:10:13 -07:00
foxyriver 7d71129ff0 delete archive 2017-10-20 11:40:52 +08:00
Di Xu f7f3577035 use multi-arch busybox for e2e 2017-10-19 10:36:31 +08:00
Dr. Stefan Schimanski cad0364e73 Update bazel 2017-10-18 17:24:04 +02:00
Dr. Stefan Schimanski 7773a30f67 pkg/api/legacyscheme: fixup imports 2017-10-18 17:23:55 +02:00
Kubernetes Submit Queue 855551dc80 Merge pull request #51250 from dixudx/bump_cni_v0.6.0
Automatic merge from submit-queue (batch tested with PRs 53106, 52193, 51250, 52449, 53861). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

bump CNI to v0.6.0

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49480

**Special notes for your reviewer**:
/assign @luxas @bboreham @feiskyer 

**Release note**:

```release-note
bump CNI to v0.6.0
```
2017-10-16 14:47:23 -07:00
Jeff Grafton aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Di Xu dba448c2a6 Update all binary download references to v0.6.0 2017-10-14 22:24:49 +08:00
David Ashpole 539fddb49d kubelet evictions take priority into account 2017-10-12 13:15:05 -07:00
Kubernetes Submit Queue 0f5f82fa44 Merge pull request #53416 from krzyzacy/nodeconfig-path
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a flag to customize config relative dir

So while migrating nodee2e configs to test-infra, I found out that I'd need to have a better support for [user-data](https://github.com/kubernetes/test-infra/blob/master/jobs/e2e_node/image-config.yaml#L11). However it's not wise to use an [absolute path](https://github.com/kubernetes/test-infra/blob/master/jobs/config.json#L9309), having the config dir to be configurable will be a better solution here, and as well for later on support run local node tests from test-infra.

Currently the job references to the image configs from test-infra, but read metadata from kubernetes, which is wrong :-\


/assign @yguo0905 @Random-Liu
2017-10-12 00:57:29 -07:00
Kubernetes Submit Queue 8c8709d4de Merge pull request #53581 from Random-Liu/add-containerd-validation-node-e2e
Automatic merge from submit-queue (batch tested with PRs 53668, 53624, 52639, 53581, 51215). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add extra log and node env metadata support.

This PR:
1) Make log collection logic extensible via flags, so that we could collect more daemon logs in this PR. (e.g. `containerd.log` and `cri-containerd.log`)
2) Add extra node metadata from specified environment variable. (e.g. `PULL_REFS` in prow).

@krzyzacy I'll change the test-infra side soon. Let's discuss whether we should move/copy this code to test infra in your refactoring.

/cc @dchen1107 @yujuhong @abhi @mikebrow 

```release-note
NONE
```
2017-10-11 17:00:06 -07:00
Michael Taufen 8180536bed Mulligan: Remove deprecated and experimental fields from KubeletConfiguration
Revert "Merge pull request #51857 from kubernetes/revert-51307-kc-type-refactor"

This reverts commit 9d27d92420, reversing
changes made to 2e69d4e625.

See original: #51307

We punted this from 1.8 so it could go through an API review. The point
of this PR is that we are trying to stabilize the kubeletconfig API so
that we can move it out of alpha, and unblock features like Dynamic
Kubelet Config, Kubelet loading its initial config from a file instead
of flags, kubeadm and other install tools having a versioned API to rely
on, etc.

We shouldn't rev the version without both removing all the deprecated
junk from the KubeletConfiguration struct, and without (at least
temporarily) removing all of the fields that have "Experimental" in
their names. It wouldn't make sense to lock in to deprecated fields.
"Experimental" fields can be audited on a 1-by-1 basis after this PR,
and if found to be stable (or sufficiently alpha-gated), can be restored
to the KubeletConfiguration without the "Experimental" prefix.
2017-10-11 09:52:39 -07:00
Random-Liu 6d132e8e18 Add extra log and node env support.
Signed-off-by: Random-Liu <taotaotheripper@gmail.com>
2017-10-10 18:07:08 -07:00
Michael Taufen 131b419596 Make feature gates loadable from a map[string]bool
Command line flag API remains the same. This allows ComponentConfig
structures (e.g. KubeletConfiguration) to express the map structure
behind feature gates in a natural way when written as JSON or YAML.

For example:

KubeletConfiguration Before:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates: "DynamicKubeletConfig=true,Accelerators=true"
```

KubeletConfiguration After:
```
apiVersion: kubeletconfig/v1alpha1
kind: KubeletConfiguration
featureGates:
  DynamicKubeletConfig: true
  Accelerators: true
```
2017-10-10 09:37:51 -07:00
Kubernetes Submit Queue aaf14d4619 Merge pull request #53525 from sttts/sttts-scheme-copier-romoval
Automatic merge from submit-queue (batch tested with PRs 53525, 53652). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apimachinery: remove ObjectCopier interface(s)

The big commit is a mechanical, transitive removal of the copier interfaces in all structs and function calls.
2017-10-10 08:31:41 -07:00
Dr. Stefan Schimanski ecb65a6a71 Update generated files 2017-10-07 11:28:47 +02:00
Kubernetes Submit Queue d9bc7f0896 Merge pull request #52606 from Random-Liu/local-node-e2e-return-error
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Let local node e2e return error.

Fixes #52665

Let `make test-e2e-node` return error when it fails. Now it always returns exit code 0, whenever it fails or not.

@yguo0905 Could you help me review this?

Signed-off-by: Lantao Liu <lantaol@google.com>
2017-10-06 21:53:03 -07:00