Commit Graph

913 Commits (381729f1fb88e01d35b7429b86476ca6304b639e)

Author SHA1 Message Date
Pengfei Ni a696e86bb0 Add node e2e tests for hostPid 2017-04-07 13:41:58 +08:00
Kubernetes Submit Queue 08fefc9d9a Merge pull request #42769 from timchenxiaoyu/acrosstypo
Automatic merge from submit-queue

fix across typo

fix across typo


NONE
2017-04-05 14:28:26 -07:00
Kubernetes Submit Queue 0a1385178d Merge pull request #43248 from yujuhong/pause_proc
Automatic merge from submit-queue

node e2e: improve the validate OOM score test for infra containers

The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.

This fixes #37580
2017-04-03 20:20:53 -07:00
Slava Semushin be78d03afb test/e2e*: add/update README.md files. 2017-04-03 19:05:50 +02:00
Kubernetes Submit Queue 25a87fa19c Merge pull request #40804 from runcom/prepull-cri
Automatic merge from submit-queue

test/e2e_node: prepull images with CRI

Part of https://github.com/kubernetes/kubernetes/issues/40739

- This PR builds on top of #40525 (and contains one commit from #40525)
- The second commit contains a tiny change in the `Makefile`.
- Third commit is a patch to be able to prepull images using the CRI (as opposed to run `docker` to pull images which doesn't make sense if you're using CRI most of the times)

Marked WIP till #40525 makes its way into master

@Random-Liu @lucab @yujuhong @mrunalp @rhatdan
2017-04-01 03:08:35 -07:00
Antonio Murdaca 2634f57f7f
test/e2e_node: prepull images with CRI
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-04-01 10:18:56 +02:00
Kubernetes Submit Queue 659ea8708f Merge pull request #43407 from sjenning/selinux-npd-refactor
Automatic merge from submit-queue

tests: e2e-node: refactor node-problem-detector test to avoid selinux…

Fixes https://github.com/kubernetes/kubernetes/issues/43401

This test creates a file in /tmp on the host and creates a HostPath volume in the container to so that the host can inject messages into the logfile being read by the node problem detector in the container.  However, selinux prohibits the container from reading files out of /tmp on the host.

This PR modifies the test to create the log file in an EmptyDir volume instead, which will be properly labeled for container access.

@derekwaynecarr
2017-03-31 18:30:46 -07:00
Derek Carr 7ded90eafb Add derekwaynecarr to approvers list for test/e2e_node 2017-03-31 18:00:15 -04:00
Yu-Ju Hong 09cf5c8192 Node e2e: Remove CRI/non-CRI configs 2017-03-30 12:07:05 -07:00
Seth Jennings 8f6f6bf141 tests: e2e-node: refactor node-problem-detector test to avoid selinux issues 2017-03-29 10:26:59 -05:00
deads2k 8e26fa25da wire in aggregation 2017-03-27 09:44:10 -04:00
Kubernetes Submit Queue 0afb06b600 Merge pull request #43083 from alejandroEsc/ae/osx/e2e_node_problem_detector
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)

Darwin won't build: syscall.Sysinfo issue.

**What this PR does / why we need it**: On darwin had problems building and testing because of syscall.Sysinfo_t etc which is a linux specific command. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

**Special notes for your reviewer**: Definitely would like another set of eyes on the bootTime function, it will have to be inaccurate but open to suggestions about improving this for darwin.


**Release note**:
```
NONE
```
2017-03-25 21:22:26 -07:00
Kubernetes Submit Queue fb537762fc Merge pull request #42297 from YuPengZTE/devErrorf
Automatic merge from submit-queue (batch tested with PRs 42237, 42297, 42279, 42436, 42551)

should replace errors.New(fmt.Sprintf(...)) with fmt.Errorf(...)

Signed-off-by: yupengzte <yu.peng36@zte.com.cn>



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-03-24 14:16:23 -07:00
caleb miles f4d9bbc7d8
Bump CNI consumers to latest version
- vendored CNI plugins properly handle `DEL` on missing resources
- [based on v0.5.1](https://github.com/kubernetes/kubernetes/issues/43488#issuecomment-288525151)
2017-03-22 16:03:13 -07:00
Derek Carr 5c8b957779 Fix faulty assumptions in summary API testing 2017-03-20 14:56:11 -04:00
Seth Jennings 5156f582fe e2e-node: refactor lifecycle test to avoid selinux issues 2017-03-20 11:56:10 -05:00
Alejandro Escobar ee741f23af added a space
bazel update

added new files to reflect that only one method has changed between arch types.

forgot to add changes to a commit.

changes made and gfmt run.

changed node_problem_detector to node_problem_detector_linux and made it linux only.

updated bazel
2017-03-20 06:40:27 -07:00
Yu-Ju Hong 056e343e03 node e2e: improve the validate OOM score test for infra containers
The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.
2017-03-16 15:39:03 -07:00
Kubernetes Submit Queue 5e0f0047dd Merge pull request #43242 from timstclair/summary-test
Automatic merge from submit-queue

Relax 'misc' container memory constraints

Fixes https://github.com/kubernetes/kubernetes/issues/40607

/cc @dchen1107
2017-03-16 15:25:23 -07:00
Tim St. Clair 827dd340d4
Relax 'misc' container memory constraints 2017-03-16 12:08:22 -07:00
Kubernetes Submit Queue 6656ffc300 Merge pull request #43165 from Random-Liu/update-npd
Automatic merge from submit-queue

Update npd to the official v0.3.0 release.

Update npd to the official release v0.3.0.

This also fixes a npd bug https://github.com/kubernetes/node-problem-detector/pull/98.

@dchen1107 @kubernetes/node-problem-detector-reviewers
2017-03-16 11:23:43 -07:00
Random-Liu c4b3fd4e63 Update npd to the official v0.3.0 release. 2017-03-15 14:26:12 -07:00
Tim St. Clair 9a1236ae20
Add process debug information to summary test 2017-03-14 17:45:12 -07:00
Vishnu Kannan 8ed9bff073 handle container restarts for GPUs
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Random-Liu f81460e35d Change the junit file name format to `junit_image-name_id.xml`,
and make the gci image name shorter.
2017-03-09 16:47:48 -08:00
Kubernetes Submit Queue 7c08e817a5 Merge pull request #42734 from dashpole/deletion_timeout
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Create DefaultPodDeletionTimeout for e2e tests

In our e2e and e2e_node tests, we had a number of different timeouts for deletion.
Recent changes to the way deletion works (#41644, #41456) have resulted in some timeouts in e2e tests.  #42661 was the most recent fix for this.
Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished.
For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.

cc @vishh @Random-Liu
2017-03-09 15:06:53 -08:00
Kubernetes Submit Queue eefa2ef1bb Merge pull request #42425 from apprenda/kubeadm_189_docker_version
Automatic merge from submit-queue (batch tested with PRs 42762, 42739, 42425, 42778)

kubeadm: update docker version for CE and EE

**What this PR does / why we need it**: Update regex for docker version to also capture new CE and EE versions. 

**Which issue this PR fixes**: fixes #https://github.com/kubernetes/kubeadm/issues/189

**Special notes for your reviewer**: /cc @jbeda @luxas

**Release note**:
```release-note
NONE
```
2017-03-09 02:51:40 -08:00
timchenxiaoyu 767719ea9c fix across typo 2017-03-09 09:07:21 +08:00
David Ashpole 3806d386df use default timeout for deletion 2017-03-08 14:40:19 -08:00
Derek McQuay 35f07095d8
kubeadm: validators pass warnings and errors
This change allows validators to pass warnings as well as errors. This
was needed because of how support for docker 1.13+ and the new EE and CE
versions is currently being handled.
2017-03-08 14:35:26 -08:00
Kubernetes Submit Queue bf7f42d362 Merge pull request #42499 from dashpole/memcg_test_suite
Automatic merge from submit-queue

New e2e node test suite with memcg turned on

The flag --experimental-kernal-memcg-notification was initially added to allow disabling an eviction feature which used memcg notifications to make memory evictions more reactive.
As documented in #37853, memcg notifications increased the likelihood of encountering soft lockups, especially on CVM.

This feature would valuable to turn on, at least for GCI, since soft lockup issues were less prevalent on GCI and appeared (at the time) to be unrelated to memcg notifications.

In the interest of caution, I would like to monitor serial tests on GCI with --experimental-kernal-memcg-notification=true.

cc @vishh @Random-Liu @dchen1107 @kubernetes/sig-node-pr-reviews
2017-03-07 18:47:40 -08:00
Kubernetes Submit Queue 0d60fc4013 Merge pull request #42687 from dashpole/flaky_to_serial
Automatic merge from submit-queue (batch tested with PRs 42664, 42687)

[Fix Flaky Tests] E2e Node Flaky test suite runs serially

The [e2e Node Flaky Test Suite](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e&width=20) has been failing with strange errors.
This is because the tests in that suite are meant to be run serially, but are running in parallel, since that was left out of the config.  This PR fixes this by changing the Flaky test suite to serial

cc @Random-Liu
2017-03-07 17:51:17 -08:00
David Ashpole b0d138692e make the flaky suite run serially. Should prevent all the dynamic config errors 2017-03-07 15:12:17 -08:00
David Ashpole 0e20caf3fb new suite with memcg turned on 2017-03-07 14:14:08 -08:00
David Ashpole 6a0d5506c2 use default timeout 2017-03-07 11:45:59 -08:00
Derek McQuay eeefd2ca87
kubeadm: fail on docker version 1.13+, CE, and EE 2017-03-07 10:20:32 -08:00
Derek Carr 48d822eafe cgroup names created by kubelet should be lowercased 2017-03-06 11:19:21 -05:00
yupengzte 363f321f32 should replace errors.New(fmt.Sprintf(...)) with fmt.Errorf(...)
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
2017-03-06 09:14:48 +08:00
Kubernetes Submit Queue cb0728c50f Merge pull request #42457 from yujuhong/do_not_panic
Automatic merge from submit-queue (batch tested with PRs 42456, 42457, 42414, 42480, 42370)

node e2e: apparmor test should fail instead of panicking

This doesn't fix #42420, but at least stop the test from panicking.
2017-03-04 00:17:42 -08:00
Kubernetes Submit Queue f9ccee7714 Merge pull request #42435 from dashpole/timestamps_for_fsstats
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)

[Bug Fix]: Avoid evicting more pods than necessary by adding Timestamps for fsstats and ignoring stale stats

Continuation of #33121.  Credit for most of this goes to @sjenning.  I added volume fs timestamps.

**why is this a bug** 
This PR attempts to fix part of https://github.com/kubernetes/kubernetes/issues/31362 which results in multiple pods getting evicted unnecessarily whenever the node runs into resource pressure. This PR reduces the chances of such disruptions by avoiding reacting to old/stale metrics.
Without this PR, kubernetes nodes under resource pressure will cause unnecessary disruptions to user workloads. 
This PR will also help deflake a node e2e test suite.

The eviction manager currently avoids evicting pods if metrics are old.  However, timestamp data is not available for filesystem data, and this causes lots of extra evictions.
See the [inode eviction test flakes](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e) for examples.
This should probably be treated as a bugfix, as it should help mitigate extra evictions.

cc: @kubernetes/sig-storage-pr-reviews  @kubernetes/sig-node-pr-reviews @vishh @derekwaynecarr @sjenning
2017-03-03 23:21:48 -08:00
Kubernetes Submit Queue 2d319bd406 Merge pull request #42204 from dashpole/allocatable_eviction
Automatic merge from submit-queue

Eviction Manager Enforces Allocatable Thresholds

This PR modifies the eviction manager to enforce node allocatable thresholds for memory as described in kubernetes/community#348.
This PR should be merged after #41234. 

cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests @vishh 

** Why is this a bug/regression**

Kubelet uses `oom_score_adj` to enforce QoS policies. But the `oom_score_adj` is based on overall memory requested, which means that a Burstable pod that requested a lot of memory can lead to OOM kills for Guaranteed pods, which violates QoS. Even worse, we have observed system daemons like kubelet or kube-proxy being killed by the OOM killer.
Without this PR, v1.6 will have node stability issues and regressions in an existing GA feature `out of Resource` handling.
2017-03-03 20:20:12 -08:00
Kubernetes Submit Queue 67500b3947 Merge pull request #42443 from Random-Liu/fix-node-e2e-npd
Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)

Cast system uptime to time.Duration to fix cross build.

Fixes https://github.com/kubernetes/kubernetes/issues/42441.

Cast system uptime to `time.Duration` to avoid different behavior on different architectures.

@sjenning @ixdy @ncdc
2017-03-03 18:08:38 -08:00
Kubernetes Submit Queue 98eae9b222 Merge pull request #42341 from dashpole/critial_pod_test
Automatic merge from submit-queue

Critial pod test uses allocatable instead of capacity

This solves #42239.

When this test was first introduced, pods could request up to the capacity of the node.
With the addition of allocatable introduced in #41234, this is no longer the case, and pods can only use up to allocatable.

This should be included in 1.6, as it is a bug related to a 1.6 feature.

cc @vish @yujuhong
2017-03-03 14:34:37 -08:00
Yu-Ju Hong 1d907dbf4f node e2e: apparmor test should fail instead of panicking 2017-03-02 16:36:52 -08:00
David Ashpole a90c7951d4 add volume timestamps 2017-03-02 15:01:59 -08:00
Random-Liu d41c2503e7 Cast system uptime to time.Duration to fix cross build. 2017-03-02 14:48:09 -08:00
Kubernetes Submit Queue 1d97472361 Merge pull request #41928 from Random-Liu/move-npd-test-to-node-e2e
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)

Move node problem detector test into node e2e.

Move current NPD e2e test into node e2e.

In fact, current NPD e2e test is only a functionality test for NPD. It creates test NPD pod, sets test configuration, generates test logs and verifies test result.
It doesn't actually test the NPD really deployed in the cluster.

So it doesn't actually need to run in cluster e2e. Running it in node e2e will:
1) Make it easier to run the test.
2) Make it more light weight to introduce this as a pre/post submit test in NPD repo in the future.

Except this, I'm working on a cluster e2e to run some basic functionality test and benchmark test against the real NPD deployed in the cluster. Will send the PR later.

/cc @dchen1107 @kubernetes/node-problem-detector-reviewers
2017-03-02 10:51:18 -08:00
David Ashpole ac612eab8e eviction manager changes for allocatable 2017-03-02 07:36:24 -08:00
David Ashpole 5fa6515509 critial pod test uses allocatable instead of capacity 2017-03-01 09:57:17 -08:00
Kubernetes Submit Queue ed479163fa Merge pull request #42116 from vishh/gpu-experimental-support
Automatic merge from submit-queue

Extend experimental support to multiple Nvidia GPUs

Extended from #28216

```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with  support for multiple Nvidia GPUs. 
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```

1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```

TODO:

- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
2017-03-01 04:52:50 -08:00
Kubernetes Submit Queue cda109d224 Merge pull request #36828 from mtaufen/eviction-test-thresholds
Automatic merge from submit-queue (batch tested with PRs 42216, 42136, 42183, 42149, 36828)

Set custom threshold for memory eviction test

I am hoping this helps with memory eviction flakes, e.g. https://github.com/kubernetes/kubernetes/issues/32433 and https://github.com/kubernetes/kubernetes/issues/31676

/cc @derekwaynecarr @calebamiles @dchen1107
2017-02-28 21:17:05 -08:00
Vishnu kannan 318f4e102a adding an e2e for GPUs
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-28 13:42:08 -08:00
Kubernetes Submit Queue 81d01a84e0 Merge pull request #41944 from jingxu97/Feb/mounter
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)

Use chroot for containerized mounts

This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-28 09:20:21 -08:00
Vishnu kannan 9a65640789 fix go vet issues
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-27 21:24:45 -08:00
Vishnu Kannan cc5f5474d5 add support for node allocatable phase 2 to kubelet
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
timchenxiaoyu fb213582e6 fix typo retries 2017-02-27 11:52:01 +08:00
Kubernetes Submit Queue 16f87fe7d8 Merge pull request #40952 from dashpole/premption
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)

Guaranteed admission for Critical Pods

This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.

cc: @vishh
2017-02-26 12:57:59 -08:00
Jing Xu ac22416835 Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-24 13:46:26 -08:00
David Ashpole b798df8c44 check that innocent pod survives after evictions 2017-02-23 11:52:25 -08:00
David Ashpole c58970e47c critical pods can preempt other pods to be admitted 2017-02-23 10:31:20 -08:00
Kubernetes Submit Queue bfdeaf302c Merge pull request #41652 from ncdc/shared-informers-13-namespace
Automatic merge from submit-queue (batch tested with PRs 39855, 41433, 41567, 41887, 41652)

Switch namespace controller to shared informer

@smarterclayton @derekwaynecarr @gmarek @wojtek-t @deads2k @sttts @liggitt @kubernetes/sig-scalability-pr-reviews
2017-02-23 09:36:38 -08:00
Kubernetes Submit Queue 59f4c5911a Merge pull request #41819 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373)

Bump GCI to gci-stable-56-9000-84-2

Changelogs since gci-beta-56-9000-80-0:

- Fixed google-accounts-daemon breaks on GCI when network is unavailable.
- Fixed iptables-restore performance regression.

cc/ @adityakali @Random-Liu @fabioy
2017-02-22 19:59:33 -08:00
Random-Liu 1c8e127973 Move node problem detector test into node e2e. 2017-02-22 14:35:46 -08:00
Derek Carr 43ae6f49ad Enable per pod cgroups, fix defaulting of cgroup-root when not specified 2017-02-21 16:34:22 -05:00
Dawn Chen 57fe26111e Update node-e2e to gci-stable-56-9000-84-2 2017-02-21 10:05:44 -08:00
Andy Goldstein 99313cc394 Switch namespace controller to shared informer 2017-02-17 12:34:27 -05:00
Yu-Ju Hong 0189da49ce Add non-cri configurations for node e2e tests 2017-02-15 11:02:53 -08:00
Kubernetes Submit Queue 4ac7fd9d19 Merge pull request #40934 from dashpole/density_test_cadvisor
Automatic merge from submit-queue

delete cadvisor pod after test

tracing looks at events for pod deletion and volume teardown.  SInce the cadvisor pod has more than 1 volume, this can make results harder to analyze.
This PR moves the deletion of the cadvisor pod to after the logPodCreateThroughput call, since that marks the "end" of the test.

cc: @dchen1107 @Random-Liu
2017-02-14 17:40:32 -08:00
Random-Liu 1226c5794a Print running containers in infra container oom score test. 2017-02-10 17:45:21 -08:00
Kubernetes Submit Queue 558c37aee3 Merge pull request #41112 from janetkuo/no-watch-until
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)

e2e test flakes: remove some uses of watch.Until in e2e tests

`watch.Until` is somewhat broken and is causing quite a lot of test flakes. See https://github.com/kubernetes/kubernetes/issues/39879#issuecomment-277966375 and https://github.com/kubernetes/kubernetes/issues/31345 for more context.

@wojtek-t @yujuhong @kargakis
2017-02-10 01:40:41 -08:00
Kubernetes Submit Queue f5c07157a8 Merge pull request #41092 from yujuhong/cri-docker1_10
Automatic merge from submit-queue (batch tested with PRs 41037, 40118, 40959, 41084, 41092)

CRI node e2e: add tests for docker 1.10
2017-02-09 16:44:44 -08:00
Kubernetes Submit Queue b7772e4f89 Merge pull request #40048 from mtaufen/remove-deprecated-flags
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)

Remove deprecated kubelet flags that look safe to remove

Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.

```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
2017-02-09 14:27:45 -08:00
David Ashpole ab2ce9cd73 lengthen pod deletion timeout to prevent flakes 2017-02-08 13:12:51 -08:00
Janet Kuo 7c89359cc8 Address comments: remove unused resourceVersion in e2e util wait loop; poll pods every 2 seconds 2017-02-08 13:05:11 -08:00
Yu-Ju Hong 3d78271dd9 CRI node e2e: add tests for docker 1.10
This is part of #38164
2017-02-08 10:21:12 -08:00
Random-Liu 4e231ee3dc Remove angle brackets in the test name. 2017-02-07 16:22:59 -08:00
Michael Taufen 982df56c52 Replace uses of --config with --pod-manifest-path 2017-02-07 14:32:37 -08:00
Kubernetes Submit Queue 5d0377d2e2 Merge pull request #41027 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)

Bump GCI to gci-beta-56-9000-80-0

cc/ @Random-Liu @adityakali 

Changelogs since gci-dev-56-8977-0-0 (currently used in Kubernetes):
 - "net.ipv4.conf.eth0.forwarding" and "net.ipv4.ip_forward" may get reset to 0
 - Track CVE-2016-9962 in Docker in GCI
 - Linux kernel CVE-2016-7097
 - Linux kernel CVE-2015-8964
 - Linux kernel CVE-2016-6828
 - Linux kernel CVE-2016-7917
 - Linux kernel CVE-2016-7042
 - Linux kernel CVE-2016-9793
 - Linux kernel CVE-2016-7039 and CVE-2016-8666
 - Linux kernel CVE-2016-8655
 - Toolbox: allow docker image to be loaded from local tarball
 - Update compute-image-package in GCI 
 - Change the product name on /etc/os-release (to COS)
 - Remove 'dogfood' from HWID_OVERRIDE in /etc/lsb-release
 - Include Google NVME extensions to optimize LocalSSD performance.
 - /proc/<pid>/io missing on GCI (enables process stats accounting)
 - Enable BLK_DEV_THROTTLING

cc/ @roberthbailey @fabioy for GKE cluster update
2017-02-06 20:57:14 -08:00
Dawn Chen 687aa5768b Update node-e2e tests to gci-beta-56-9000-80-0 2017-02-06 09:25:48 -08:00
Michael Taufen 945d223738 Set custom threshold for memory eviction test 2017-02-03 16:48:34 -08:00
Derek Carr 2ab9f0384e Update test e2e nodes to use new flag 2017-02-03 17:21:37 -05:00
Derek Carr 04a909a257 Rename cgroups-per-qos flag to not be experimental 2017-02-03 17:10:53 -05:00
David Ashpole 4cd60e2393 delete cadvisor pod after test 2017-02-03 10:33:43 -08:00
Kubernetes Submit Queue 12a80380bc Merge pull request #40874 from dashpole/density_test_volumes
Automatic merge from submit-queue (batch tested with PRs 40864, 40666, 38382, 40874)

Density Test includes deletion and volumes

Moved the calls to deletePodSync to BEFORE logDensityTimeSeries.  This is because the parser considers a line printed in logDensityTimeSeries to be the "end" of the test.  This change includes deletion in the "test window", but makes no other changes.

I also added volumes to the test, so that we can make sure that mounting and unmounting volumes are also taken into account for performance profiling.
2017-02-02 21:04:52 -08:00
Kubernetes Submit Queue 7201f3b989 Merge pull request #40884 from Random-Liu/update-to-docker-1-12-6
Automatic merge from submit-queue (batch tested with PRs 40884, 40809, 40845, 40866, 40875)

Node E2E: Create new ubuntu image with docker 1.12.6.

We should test the newest docker 1.12 version - 1.12.6.

/cc @dchen1107 @yujuhong @kubernetes/sig-node-pr-reviews
2017-02-02 18:53:47 -08:00
Kubernetes Submit Queue 8a8f6ca849 Merge pull request #40525 from lucab/to-k8s/node-e2e-local-cri
Automatic merge from submit-queue (batch tested with PRs 40812, 39903, 40525, 40729)

test/node_e2e: wire-in cri-enabled local testing

This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.

Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate argument to
the kubelet.
2017-02-02 13:57:48 -08:00
Random-Liu ec7f34a24b Create new ubuntu image with docker 1.12.6. 2017-02-02 11:52:54 -08:00
David Ashpole ad73b325f3 changed density test to use volumes, and include deletion before logging 2017-02-02 08:51:01 -08:00
Luca Bruno 42bdbe5c82
test/node_e2e: wire-in "container-runtime" for local tests
This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.

Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate arguments to
the kubelet.
2017-02-01 20:34:51 +00:00
Kubernetes Submit Queue f272781259 Merge pull request #40529 from lucab/to-k8s/e2e_node-kubelet-busybox-argv0
Automatic merge from submit-queue (batch tested with PRs 40529, 40630)

test/e2e_node: tie together expected string and exec

This commit ties together busybox-sh invocation and test expectation
to avoid subtle mismatches between exec command and output string.
2017-02-01 00:16:37 -08:00
Kubernetes Submit Queue e5d647988e Merge pull request #39049 from ixdy/node-e2e-ssh-key
Automatic merge from submit-queue

Add flag to node e2e test specifying location of ssh privkey

**What this PR does / why we need it**: in CI, the ssh private key is not always located at `$HOME/.ssh`, so it's helpful to be able to override it.

@krzyzacy here's my resurrected change. I'm not sure why I neglected to follow-through on it originally.

**Release note**:

```release-note
NONE
```
2017-01-31 13:40:26 -08:00
deads2k c9a008dff3 move util/intstr to apimachinery 2017-01-30 12:46:59 -05:00
Dr. Stefan Schimanski 44ea6b3f30 Update generated files 2017-01-29 21:41:45 +01:00
Dr. Stefan Schimanski bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
Lucas Käldström 84006601a0
Upgrade go version in Makefiles to 1.7, use qemu 2.7, armel => armhf and goarm=6 => goarm=7 and use go 1.7.4 2017-01-27 20:04:24 +02:00
Luca Bruno 05bff300f3
test/e2e_node: tie together expected string and exec
This commit ties together busybox-sh invocation and test expectation
to avoid subtle mismatches between exec command and output string.
2017-01-26 17:14:06 +00:00
deads2k 2734f8f892 move dynamic and discovery clients 2017-01-26 08:37:06 -05:00
Kubernetes Submit Queue 7bce538d0b Merge pull request #40326 from mtaufen/gci-cos-node-tests
Automatic merge from submit-queue

Prep node_e2e for GCI to COS name change

GCI will soon change name in etc/os-release from "gci" to "cos".

This prepares the node_e2e tests to deal with that change and also updates some comments/log messages/var names in anticipation.
2017-01-25 20:15:32 -08:00
Dr. Stefan Schimanski a0137e9b28 Update generated files 2017-01-25 19:49:45 +01:00
Dr. Stefan Schimanski d7eb3b6870 pkg/util: move uuid and strategicpatch into k8s.io/apimachinery 2017-01-25 19:45:09 +01:00
deads2k b0b156b381 make tools/cache authoritative 2017-01-25 08:29:45 -05:00
Kubernetes Submit Queue df42444742 Merge pull request #40216 from sttts/sttts-more-cutoffs
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)

genericapiserver: more dependency cutoffs

- cut-off pkg/api.Resource and friends - lgtm
- authn plugins -> k8s.io/apiserver - 
- webhook authz plugin -> k8s.io/apiserver - lgtm
- ~~pkg/cert -> k8s.io/apimachinery (will rebase on @deads2k's PR also moving it)~~
- split pkg/config into kubelet config merger and flags - lgtm
- split feature gate between generic apiserver and kube - lgtm
- move pkg/util/flag into k8s.io/apiserver - lgtm
2017-01-24 16:26:00 -08:00
Dr. Stefan Schimanski 2b8e938128 Update generated files 2017-01-24 20:56:03 +01:00
Dr. Stefan Schimanski a6b2ebb50c pkg/flag: make feature gate extensible and split between generic and kube 2017-01-24 20:56:03 +01:00
Dr. Stefan Schimanski 56d60cfae6 pkg/util: move flags from pkg/util/config to pkg/util/flags 2017-01-24 20:56:03 +01:00
Clayton Coleman be6d2933df
refactor: Move *Options references to metav1 2017-01-24 13:41:51 -05:00
Michael Taufen 924ec4711b Prep node_e2e for GCI to COS name change 2017-01-23 15:00:39 -08:00
Clayton Coleman 469df12038
refactor: move ListOptions references to metav1 2017-01-23 17:52:46 -05:00
Clayton Coleman 244734171e
Add conformance tests for terminationMessage(Path|Policy)
Test root, non-root, success and message, failure and message.
2017-01-23 12:26:37 -05:00
Sen Lu cb4ea07229 Add --ssh-user to conformance script as well 2017-01-20 16:07:13 -08:00
deads2k ee6752ef20 find and replace 2017-01-20 08:04:53 -05:00
deads2k c587b8a21e re-run client-gen 2017-01-20 08:02:36 -05:00
Kubernetes Submit Queue 582aa0d793 Merge pull request #40107 from dashpole/flaky_dynamic_kconfig
Automatic merge from submit-queue

turn on dynamic config for flaky tests

Added dynamic config to inode eviction node e2e tests in #39546, but did not enable it for flaky tests.  This PR enables this feature for the flaky test suite
2017-01-18 20:13:00 -08:00
Kubernetes Submit Queue b29d9cdbcf Merge pull request #39898 from ixdy/bazel-release-tars
Automatic merge from submit-queue

Build release tars using bazel

**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.

For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```

**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.

Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.

With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.

My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)

Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.

Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.

**Release note**:

```release-note
NONE
```
2017-01-18 14:24:48 -08:00
David Ashpole caa9cc0b70 turn on dynamic config for flaky tests 2017-01-18 13:48:47 -08:00
Antoine Pelisse 86248b5f0b Update OWNERS approvers and reviewers: test/e2e_node 2017-01-18 10:16:20 -08:00
Clayton Coleman 9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
deads2k 77b4d55982 mechanical 2017-01-16 09:35:12 -05:00
Kubernetes Submit Queue 4744e7ec52 Merge pull request #39889 from Random-Liu/add-docker-1.12-node-e2e
Automatic merge from submit-queue (batch tested with PRs 38427, 39896, 39889, 39871, 39895)

Add docker 1.12 in node e2e.

Add docker 1.12 image in node e2e (including regular node e2e and cri node e2e).

@dchen1107 @yujuhong 
/cc @kubernetes/sig-node-misc
2017-01-13 20:21:38 -08:00
Random-Liu 04e68619ce Add docker 1.12 in node e2e. 2017-01-13 14:58:49 -08:00
Jeff Grafton 14dd0d3bef Add genrule to produce e2e_node.test binary artifact 2017-01-13 14:46:26 -08:00
Kubernetes Submit Queue 6b5d82b512 Merge pull request #37505 from k82cn/use_controller_inf
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)

Made cache.Controller to be interface.

**What this PR does / why we need it**:

#37504
2017-01-13 13:40:41 -08:00
Kubernetes Submit Queue a6fa5c2bfd Merge pull request #39814 from deads2k/api-58-multi-register
Automatic merge from submit-queue

replace global registry in apimachinery with global registry in k8s.io/kubernetes

We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work.  Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.

@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
2017-01-13 12:37:02 -08:00
deads2k f1176d9c5c mechanical repercussions 2017-01-13 08:27:14 -05:00
Klaus Ma 25fe1e0d82 Made cache.Controller to be interface. 2017-01-13 13:33:23 +08:00
NickrenREN a12dea14e0 fix redundant alias clientset 2017-01-12 10:21:05 +08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Kubernetes Submit Queue 3f9f7471af Merge pull request #38989 from sjenning/set-qos-field
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)

Set PodStatus QOSClass field

This PR continues the work for https://github.com/kubernetes/kubernetes/pull/37968

It converts all local usage of the `qos` package class types to the new API level types (first commit) and sets the pod status QOSClass field in the at pod creation time on the API server in `PrepareForCreate` and in the kubelet in the pod status update path (second commit).  This way the pod QOS class is set even if the pod isn't scheduled yet.

Fixes #33255

@ConnorDoyle @derekwaynecarr @vishh
2017-01-10 22:24:13 -08:00
Kubernetes Submit Queue a2da4f0cac Merge pull request #39546 from dashpole/dynamic_config_eviction_hard
Automatic merge from submit-queue (batch tested with PRs 39695, 37054, 39627, 39546, 39615)

Use Dynamic Config in e2e_node inode eviction test

Alternative solution to #39249.  Similar to solution proposed by @vishh in #36828.

@Random-Liu @mtaufen
2017-01-10 18:57:26 -08:00
Jeff Grafton 19aafd291c Always --pull in docker build to ensure recent base images 2017-01-10 16:21:05 -08:00
Seth Jennings e2402b781b set qos class field in pod status 2017-01-10 16:31:52 -06:00
Seth Jennings 4c30459e49 switch from local qos types to api types 2017-01-10 10:54:30 -06:00
David Ashpole c3951a72ab use dynamic config to set eviction hard threshold 2017-01-09 15:27:12 -08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Kubernetes Submit Queue f4a8713088 Merge pull request #36229 from wojtek-t/bump_etcd_version
Automatic merge from submit-queue (batch tested with PRs 36229, 39450)

Bump etcd to 3.0.14 and switch to v3 API in etcd.

Ref #20504

**Release note**:

```release-note
Switch default etcd version to 3.0.14.
Switch default storage backend flag in apiserver to `etcd3` mode.
```
2017-01-04 17:36:06 -08:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
Random-Liu a719a7d7e7 Do not use sudo when untar node e2e tar ball. 2016-12-21 16:28:33 -08:00
Jeff Grafton 30a5efa33b Add flag to node e2e test specifying location of ssh privkey 2016-12-21 11:52:41 -08:00
Kubernetes Submit Queue 1955ed614f Merge pull request #39074 from Random-Liu/node-e2e-set-user
Automatic merge from submit-queue

Node E2E: Set user with `--ssh-user` flag when running remote node e2e.

This PR unblocks https://github.com/kubernetes/test-infra/issues/1348.

In our test environment, we must login test instance as user `jenkins` because of the service account. Node e2e is always using the default user on the host, which works fine till now, because it is always run as `jenkins` in our test environment.

However, now we moved the test runner into a docker container, inside the container user is `root` by default, which will cause error:
```
Permission denied (publickey)
```

This PR added a flag `--ssh-user` to explicitly specify the user used to ssh into test instance. The dockerized test runner can set user to `jenkins` with this flag.

@krzyzacy  @ixdy
2016-12-21 11:21:09 -08:00
Random-Liu 10f72be5af Support set user with `--ssh-user` flag when running remote node e2e. 2016-12-21 01:54:02 -08:00
Wojciech Tyczynski 498a893fa3 Switch to etcd v3 API by default 2016-12-20 11:57:46 +01:00
Kubernetes Submit Queue 5b2823adb9 Merge pull request #38191 from sttts/sttts-move-master-options
Automatic merge from submit-queue

Move non-generic apiserver code out of the generic packages
2016-12-17 01:25:45 -08:00
Matt Liggett 69cd805532 Merge pull request #38804 from Random-Liu/disable-au
Node E2E: Disable AU in node e2e test.
2016-12-16 15:32:23 -08:00
David Ashpole 5d352439d4 test no longer fails when it fails to get the summary 2016-12-16 11:50:43 -08:00
Dr. Stefan Schimanski 7267299c3c genericapiserver: move MasterCount and service options into master 2016-12-16 17:23:43 +01:00
Random-Liu c57f2ec064 Fix report prefix for node conformance test. 2016-12-15 15:27:14 -08:00
Kubernetes Submit Queue d169d59565 Merge pull request #38788 from Random-Liu/fix-node-conformance-test
Automatic merge from submit-queue (batch tested with PRs 38788, 38821, 38829)

Node Conformance Test: Fix node conformance test.

The test suite could build on my desktop. However it is failing on jenkins.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-conformance/1

It turns out that `docker save $IMAGE -o $FILE` only works for docker 1.12. (My desktop is 1.12) For older version docker, we should use `docker save -o $FILE $IMAGE instead`. (Jenkins is using 1.9.1)

@timstclair Could you help me review this short PR? :)
2016-12-15 13:57:16 -08:00
Random-Liu e5efc21de6 Disable AU in node e2e test. 2016-12-15 01:33:09 -08:00
Kubernetes Submit Queue d8efc779ed Merge pull request #38154 from caesarxuchao/rename-release_1_5
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)

Rename "release_1_5" clientset to just "clientset"

We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset", clientset development will be done in this directory.

@kubernetes/sig-api-machinery @deads2k 

```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
2016-12-14 14:21:51 -08:00
Random-Liu 06507e9378 Fix node conformance test. 2016-12-14 14:21:10 -08:00