Commit Graph

1224 Commits (9df102b4e2ab5af2cbc354b028358e983b342fed)

Author SHA1 Message Date
Ryan Hitchman 0c6fd62582 Fix the last deprecated "gcloud docker push args" usage. 2017-04-26 11:20:25 -07:00
NickrenREN d4376599ba Cleanup: replace some hardcoded codes and remove unused functions 2017-04-25 09:38:25 +08:00
Maru Newby 9413071ce8 e2e: Prefer kubeconfig host to default 2017-04-19 14:58:43 -07:00
Kubernetes Submit Queue 4f50a8d7cd Merge pull request #44457 from dnardo/e2e_node_cni
Automatic merge from submit-queue (batch tested with PRs 43000, 44500, 44457, 44553, 44267)

Updates e2e_node test to allow both kubenet and cni to be specified f…

…or the network plugin.

This adds a simple CNI configuration which is added to the node during test setup.
This also modifies the default flags in services/kubelet.go to specify the "cni-bin-dir"
and the "cni-conf-dir" and removes the "network-plugin-dir" flag.  This leaves the default
network plugin to kubenet.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-04-18 13:19:08 -07:00
Chao Xu 4f9591b1de move pkg/api/v1/ref.go and pkg/api/v1/resource.go to subpackages. move some functions in resource.go to pkg/api/v1/node and pkg/api/v1/pod 2017-04-17 11:38:11 -07:00
Mike Danese a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
Daniel Nardo 29b2708046 Add comment to setupCNI. 2017-04-13 15:11:24 -07:00
Daniel Nardo 4e458ce001 Updates e2e_node test to allow both kubenet and cni to be specified for the network plugin.
This adds a simple CNI configuration which is added to the node during test setup.
This also modifies the default flags in services/kubelet.go to specify the "cni-bin-dir"
and the "cni-conf-dir" and removes the "network-plugin-dir" flag.  This leaves the default
network plugin to kubenet.
2017-04-13 09:57:46 -07:00
Timothy St. Clair 442713aaaf Remove leagcy init that no longer works. 2017-04-11 08:48:59 -05:00
Kubernetes Submit Queue f3d2ea5dfd Merge pull request #43990 from php-coder/e2e_readmes
Automatic merge from submit-queue

test/e2e*: add/update README.md files

**What this PR does / why we need it**:

This PR is adding `README.md` files with a link to the documentation to all E2E tests.
2017-04-10 08:05:04 -07:00
Michael Taufen e6321c7440 Cleanup some of the tarball producing code 2017-04-10 07:09:31 -07:00
Pengfei Ni a696e86bb0 Add node e2e tests for hostPid 2017-04-07 13:41:58 +08:00
Kubernetes Submit Queue 08fefc9d9a Merge pull request #42769 from timchenxiaoyu/acrosstypo
Automatic merge from submit-queue

fix across typo

fix across typo


NONE
2017-04-05 14:28:26 -07:00
Kubernetes Submit Queue 0a1385178d Merge pull request #43248 from yujuhong/pause_proc
Automatic merge from submit-queue

node e2e: improve the validate OOM score test for infra containers

The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.

This fixes #37580
2017-04-03 20:20:53 -07:00
Slava Semushin be78d03afb test/e2e*: add/update README.md files. 2017-04-03 19:05:50 +02:00
Kubernetes Submit Queue 25a87fa19c Merge pull request #40804 from runcom/prepull-cri
Automatic merge from submit-queue

test/e2e_node: prepull images with CRI

Part of https://github.com/kubernetes/kubernetes/issues/40739

- This PR builds on top of #40525 (and contains one commit from #40525)
- The second commit contains a tiny change in the `Makefile`.
- Third commit is a patch to be able to prepull images using the CRI (as opposed to run `docker` to pull images which doesn't make sense if you're using CRI most of the times)

Marked WIP till #40525 makes its way into master

@Random-Liu @lucab @yujuhong @mrunalp @rhatdan
2017-04-01 03:08:35 -07:00
Antonio Murdaca 2634f57f7f
test/e2e_node: prepull images with CRI
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-04-01 10:18:56 +02:00
Kubernetes Submit Queue 659ea8708f Merge pull request #43407 from sjenning/selinux-npd-refactor
Automatic merge from submit-queue

tests: e2e-node: refactor node-problem-detector test to avoid selinux…

Fixes https://github.com/kubernetes/kubernetes/issues/43401

This test creates a file in /tmp on the host and creates a HostPath volume in the container to so that the host can inject messages into the logfile being read by the node problem detector in the container.  However, selinux prohibits the container from reading files out of /tmp on the host.

This PR modifies the test to create the log file in an EmptyDir volume instead, which will be properly labeled for container access.

@derekwaynecarr
2017-03-31 18:30:46 -07:00
Derek Carr 7ded90eafb Add derekwaynecarr to approvers list for test/e2e_node 2017-03-31 18:00:15 -04:00
Yu-Ju Hong 09cf5c8192 Node e2e: Remove CRI/non-CRI configs 2017-03-30 12:07:05 -07:00
Seth Jennings 8f6f6bf141 tests: e2e-node: refactor node-problem-detector test to avoid selinux issues 2017-03-29 10:26:59 -05:00
deads2k 8e26fa25da wire in aggregation 2017-03-27 09:44:10 -04:00
Kubernetes Submit Queue 0afb06b600 Merge pull request #43083 from alejandroEsc/ae/osx/e2e_node_problem_detector
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)

Darwin won't build: syscall.Sysinfo issue.

**What this PR does / why we need it**: On darwin had problems building and testing because of syscall.Sysinfo_t etc which is a linux specific command. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

**Special notes for your reviewer**: Definitely would like another set of eyes on the bootTime function, it will have to be inaccurate but open to suggestions about improving this for darwin.


**Release note**:
```
NONE
```
2017-03-25 21:22:26 -07:00
Kubernetes Submit Queue fb537762fc Merge pull request #42297 from YuPengZTE/devErrorf
Automatic merge from submit-queue (batch tested with PRs 42237, 42297, 42279, 42436, 42551)

should replace errors.New(fmt.Sprintf(...)) with fmt.Errorf(...)

Signed-off-by: yupengzte <yu.peng36@zte.com.cn>



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-03-24 14:16:23 -07:00
caleb miles f4d9bbc7d8
Bump CNI consumers to latest version
- vendored CNI plugins properly handle `DEL` on missing resources
- [based on v0.5.1](https://github.com/kubernetes/kubernetes/issues/43488#issuecomment-288525151)
2017-03-22 16:03:13 -07:00
Derek Carr 5c8b957779 Fix faulty assumptions in summary API testing 2017-03-20 14:56:11 -04:00
Seth Jennings 5156f582fe e2e-node: refactor lifecycle test to avoid selinux issues 2017-03-20 11:56:10 -05:00
Alejandro Escobar ee741f23af added a space
bazel update

added new files to reflect that only one method has changed between arch types.

forgot to add changes to a commit.

changes made and gfmt run.

changed node_problem_detector to node_problem_detector_linux and made it linux only.

updated bazel
2017-03-20 06:40:27 -07:00
Yu-Ju Hong 056e343e03 node e2e: improve the validate OOM score test for infra containers
The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.
2017-03-16 15:39:03 -07:00
Kubernetes Submit Queue 5e0f0047dd Merge pull request #43242 from timstclair/summary-test
Automatic merge from submit-queue

Relax 'misc' container memory constraints

Fixes https://github.com/kubernetes/kubernetes/issues/40607

/cc @dchen1107
2017-03-16 15:25:23 -07:00
Tim St. Clair 827dd340d4
Relax 'misc' container memory constraints 2017-03-16 12:08:22 -07:00
Kubernetes Submit Queue 6656ffc300 Merge pull request #43165 from Random-Liu/update-npd
Automatic merge from submit-queue

Update npd to the official v0.3.0 release.

Update npd to the official release v0.3.0.

This also fixes a npd bug https://github.com/kubernetes/node-problem-detector/pull/98.

@dchen1107 @kubernetes/node-problem-detector-reviewers
2017-03-16 11:23:43 -07:00
Random-Liu c4b3fd4e63 Update npd to the official v0.3.0 release. 2017-03-15 14:26:12 -07:00
Tim St. Clair 9a1236ae20
Add process debug information to summary test 2017-03-14 17:45:12 -07:00
Vishnu Kannan 8ed9bff073 handle container restarts for GPUs
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Random-Liu f81460e35d Change the junit file name format to `junit_image-name_id.xml`,
and make the gci image name shorter.
2017-03-09 16:47:48 -08:00
Kubernetes Submit Queue 7c08e817a5 Merge pull request #42734 from dashpole/deletion_timeout
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Create DefaultPodDeletionTimeout for e2e tests

In our e2e and e2e_node tests, we had a number of different timeouts for deletion.
Recent changes to the way deletion works (#41644, #41456) have resulted in some timeouts in e2e tests.  #42661 was the most recent fix for this.
Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished.
For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.

cc @vishh @Random-Liu
2017-03-09 15:06:53 -08:00
Kubernetes Submit Queue eefa2ef1bb Merge pull request #42425 from apprenda/kubeadm_189_docker_version
Automatic merge from submit-queue (batch tested with PRs 42762, 42739, 42425, 42778)

kubeadm: update docker version for CE and EE

**What this PR does / why we need it**: Update regex for docker version to also capture new CE and EE versions. 

**Which issue this PR fixes**: fixes #https://github.com/kubernetes/kubeadm/issues/189

**Special notes for your reviewer**: /cc @jbeda @luxas

**Release note**:
```release-note
NONE
```
2017-03-09 02:51:40 -08:00
timchenxiaoyu 767719ea9c fix across typo 2017-03-09 09:07:21 +08:00
David Ashpole 3806d386df use default timeout for deletion 2017-03-08 14:40:19 -08:00
Derek McQuay 35f07095d8
kubeadm: validators pass warnings and errors
This change allows validators to pass warnings as well as errors. This
was needed because of how support for docker 1.13+ and the new EE and CE
versions is currently being handled.
2017-03-08 14:35:26 -08:00
Kubernetes Submit Queue bf7f42d362 Merge pull request #42499 from dashpole/memcg_test_suite
Automatic merge from submit-queue

New e2e node test suite with memcg turned on

The flag --experimental-kernal-memcg-notification was initially added to allow disabling an eviction feature which used memcg notifications to make memory evictions more reactive.
As documented in #37853, memcg notifications increased the likelihood of encountering soft lockups, especially on CVM.

This feature would valuable to turn on, at least for GCI, since soft lockup issues were less prevalent on GCI and appeared (at the time) to be unrelated to memcg notifications.

In the interest of caution, I would like to monitor serial tests on GCI with --experimental-kernal-memcg-notification=true.

cc @vishh @Random-Liu @dchen1107 @kubernetes/sig-node-pr-reviews
2017-03-07 18:47:40 -08:00
Kubernetes Submit Queue 0d60fc4013 Merge pull request #42687 from dashpole/flaky_to_serial
Automatic merge from submit-queue (batch tested with PRs 42664, 42687)

[Fix Flaky Tests] E2e Node Flaky test suite runs serially

The [e2e Node Flaky Test Suite](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e&width=20) has been failing with strange errors.
This is because the tests in that suite are meant to be run serially, but are running in parallel, since that was left out of the config.  This PR fixes this by changing the Flaky test suite to serial

cc @Random-Liu
2017-03-07 17:51:17 -08:00
David Ashpole b0d138692e make the flaky suite run serially. Should prevent all the dynamic config errors 2017-03-07 15:12:17 -08:00
David Ashpole 0e20caf3fb new suite with memcg turned on 2017-03-07 14:14:08 -08:00
David Ashpole 6a0d5506c2 use default timeout 2017-03-07 11:45:59 -08:00
Derek McQuay eeefd2ca87
kubeadm: fail on docker version 1.13+, CE, and EE 2017-03-07 10:20:32 -08:00
Derek Carr 48d822eafe cgroup names created by kubelet should be lowercased 2017-03-06 11:19:21 -05:00
yupengzte 363f321f32 should replace errors.New(fmt.Sprintf(...)) with fmt.Errorf(...)
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
2017-03-06 09:14:48 +08:00
Kubernetes Submit Queue cb0728c50f Merge pull request #42457 from yujuhong/do_not_panic
Automatic merge from submit-queue (batch tested with PRs 42456, 42457, 42414, 42480, 42370)

node e2e: apparmor test should fail instead of panicking

This doesn't fix #42420, but at least stop the test from panicking.
2017-03-04 00:17:42 -08:00
Kubernetes Submit Queue f9ccee7714 Merge pull request #42435 from dashpole/timestamps_for_fsstats
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)

[Bug Fix]: Avoid evicting more pods than necessary by adding Timestamps for fsstats and ignoring stale stats

Continuation of #33121.  Credit for most of this goes to @sjenning.  I added volume fs timestamps.

**why is this a bug** 
This PR attempts to fix part of https://github.com/kubernetes/kubernetes/issues/31362 which results in multiple pods getting evicted unnecessarily whenever the node runs into resource pressure. This PR reduces the chances of such disruptions by avoiding reacting to old/stale metrics.
Without this PR, kubernetes nodes under resource pressure will cause unnecessary disruptions to user workloads. 
This PR will also help deflake a node e2e test suite.

The eviction manager currently avoids evicting pods if metrics are old.  However, timestamp data is not available for filesystem data, and this causes lots of extra evictions.
See the [inode eviction test flakes](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e) for examples.
This should probably be treated as a bugfix, as it should help mitigate extra evictions.

cc: @kubernetes/sig-storage-pr-reviews  @kubernetes/sig-node-pr-reviews @vishh @derekwaynecarr @sjenning
2017-03-03 23:21:48 -08:00
Kubernetes Submit Queue 2d319bd406 Merge pull request #42204 from dashpole/allocatable_eviction
Automatic merge from submit-queue

Eviction Manager Enforces Allocatable Thresholds

This PR modifies the eviction manager to enforce node allocatable thresholds for memory as described in kubernetes/community#348.
This PR should be merged after #41234. 

cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests @vishh 

** Why is this a bug/regression**

Kubelet uses `oom_score_adj` to enforce QoS policies. But the `oom_score_adj` is based on overall memory requested, which means that a Burstable pod that requested a lot of memory can lead to OOM kills for Guaranteed pods, which violates QoS. Even worse, we have observed system daemons like kubelet or kube-proxy being killed by the OOM killer.
Without this PR, v1.6 will have node stability issues and regressions in an existing GA feature `out of Resource` handling.
2017-03-03 20:20:12 -08:00
Kubernetes Submit Queue 67500b3947 Merge pull request #42443 from Random-Liu/fix-node-e2e-npd
Automatic merge from submit-queue (batch tested with PRs 42443, 38924, 42367, 42391, 42310)

Cast system uptime to time.Duration to fix cross build.

Fixes https://github.com/kubernetes/kubernetes/issues/42441.

Cast system uptime to `time.Duration` to avoid different behavior on different architectures.

@sjenning @ixdy @ncdc
2017-03-03 18:08:38 -08:00
Kubernetes Submit Queue 98eae9b222 Merge pull request #42341 from dashpole/critial_pod_test
Automatic merge from submit-queue

Critial pod test uses allocatable instead of capacity

This solves #42239.

When this test was first introduced, pods could request up to the capacity of the node.
With the addition of allocatable introduced in #41234, this is no longer the case, and pods can only use up to allocatable.

This should be included in 1.6, as it is a bug related to a 1.6 feature.

cc @vish @yujuhong
2017-03-03 14:34:37 -08:00
Yu-Ju Hong 1d907dbf4f node e2e: apparmor test should fail instead of panicking 2017-03-02 16:36:52 -08:00
David Ashpole a90c7951d4 add volume timestamps 2017-03-02 15:01:59 -08:00
Random-Liu d41c2503e7 Cast system uptime to time.Duration to fix cross build. 2017-03-02 14:48:09 -08:00
Kubernetes Submit Queue 1d97472361 Merge pull request #41928 from Random-Liu/move-npd-test-to-node-e2e
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)

Move node problem detector test into node e2e.

Move current NPD e2e test into node e2e.

In fact, current NPD e2e test is only a functionality test for NPD. It creates test NPD pod, sets test configuration, generates test logs and verifies test result.
It doesn't actually test the NPD really deployed in the cluster.

So it doesn't actually need to run in cluster e2e. Running it in node e2e will:
1) Make it easier to run the test.
2) Make it more light weight to introduce this as a pre/post submit test in NPD repo in the future.

Except this, I'm working on a cluster e2e to run some basic functionality test and benchmark test against the real NPD deployed in the cluster. Will send the PR later.

/cc @dchen1107 @kubernetes/node-problem-detector-reviewers
2017-03-02 10:51:18 -08:00
David Ashpole ac612eab8e eviction manager changes for allocatable 2017-03-02 07:36:24 -08:00
David Ashpole 5fa6515509 critial pod test uses allocatable instead of capacity 2017-03-01 09:57:17 -08:00
Kubernetes Submit Queue ed479163fa Merge pull request #42116 from vishh/gpu-experimental-support
Automatic merge from submit-queue

Extend experimental support to multiple Nvidia GPUs

Extended from #28216

```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with  support for multiple Nvidia GPUs. 
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```

1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```

TODO:

- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
2017-03-01 04:52:50 -08:00
Kubernetes Submit Queue cda109d224 Merge pull request #36828 from mtaufen/eviction-test-thresholds
Automatic merge from submit-queue (batch tested with PRs 42216, 42136, 42183, 42149, 36828)

Set custom threshold for memory eviction test

I am hoping this helps with memory eviction flakes, e.g. https://github.com/kubernetes/kubernetes/issues/32433 and https://github.com/kubernetes/kubernetes/issues/31676

/cc @derekwaynecarr @calebamiles @dchen1107
2017-02-28 21:17:05 -08:00
Vishnu kannan 318f4e102a adding an e2e for GPUs
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-28 13:42:08 -08:00
Kubernetes Submit Queue 81d01a84e0 Merge pull request #41944 from jingxu97/Feb/mounter
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)

Use chroot for containerized mounts

This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-28 09:20:21 -08:00
Vishnu kannan 9a65640789 fix go vet issues
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-27 21:24:45 -08:00
Vishnu Kannan cc5f5474d5 add support for node allocatable phase 2 to kubelet
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
timchenxiaoyu fb213582e6 fix typo retries 2017-02-27 11:52:01 +08:00
Kubernetes Submit Queue 16f87fe7d8 Merge pull request #40952 from dashpole/premption
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)

Guaranteed admission for Critical Pods

This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.

cc: @vishh
2017-02-26 12:57:59 -08:00
Jing Xu ac22416835 Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-24 13:46:26 -08:00
David Ashpole b798df8c44 check that innocent pod survives after evictions 2017-02-23 11:52:25 -08:00
David Ashpole c58970e47c critical pods can preempt other pods to be admitted 2017-02-23 10:31:20 -08:00
Kubernetes Submit Queue bfdeaf302c Merge pull request #41652 from ncdc/shared-informers-13-namespace
Automatic merge from submit-queue (batch tested with PRs 39855, 41433, 41567, 41887, 41652)

Switch namespace controller to shared informer

@smarterclayton @derekwaynecarr @gmarek @wojtek-t @deads2k @sttts @liggitt @kubernetes/sig-scalability-pr-reviews
2017-02-23 09:36:38 -08:00
Kubernetes Submit Queue 59f4c5911a Merge pull request #41819 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373)

Bump GCI to gci-stable-56-9000-84-2

Changelogs since gci-beta-56-9000-80-0:

- Fixed google-accounts-daemon breaks on GCI when network is unavailable.
- Fixed iptables-restore performance regression.

cc/ @adityakali @Random-Liu @fabioy
2017-02-22 19:59:33 -08:00
Random-Liu 1c8e127973 Move node problem detector test into node e2e. 2017-02-22 14:35:46 -08:00
Derek Carr 43ae6f49ad Enable per pod cgroups, fix defaulting of cgroup-root when not specified 2017-02-21 16:34:22 -05:00
Dawn Chen 57fe26111e Update node-e2e to gci-stable-56-9000-84-2 2017-02-21 10:05:44 -08:00
Andy Goldstein 99313cc394 Switch namespace controller to shared informer 2017-02-17 12:34:27 -05:00
Yu-Ju Hong 0189da49ce Add non-cri configurations for node e2e tests 2017-02-15 11:02:53 -08:00
Kubernetes Submit Queue 4ac7fd9d19 Merge pull request #40934 from dashpole/density_test_cadvisor
Automatic merge from submit-queue

delete cadvisor pod after test

tracing looks at events for pod deletion and volume teardown.  SInce the cadvisor pod has more than 1 volume, this can make results harder to analyze.
This PR moves the deletion of the cadvisor pod to after the logPodCreateThroughput call, since that marks the "end" of the test.

cc: @dchen1107 @Random-Liu
2017-02-14 17:40:32 -08:00
Random-Liu 1226c5794a Print running containers in infra container oom score test. 2017-02-10 17:45:21 -08:00
Kubernetes Submit Queue 558c37aee3 Merge pull request #41112 from janetkuo/no-watch-until
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)

e2e test flakes: remove some uses of watch.Until in e2e tests

`watch.Until` is somewhat broken and is causing quite a lot of test flakes. See https://github.com/kubernetes/kubernetes/issues/39879#issuecomment-277966375 and https://github.com/kubernetes/kubernetes/issues/31345 for more context.

@wojtek-t @yujuhong @kargakis
2017-02-10 01:40:41 -08:00
Kubernetes Submit Queue f5c07157a8 Merge pull request #41092 from yujuhong/cri-docker1_10
Automatic merge from submit-queue (batch tested with PRs 41037, 40118, 40959, 41084, 41092)

CRI node e2e: add tests for docker 1.10
2017-02-09 16:44:44 -08:00
Kubernetes Submit Queue b7772e4f89 Merge pull request #40048 from mtaufen/remove-deprecated-flags
Automatic merge from submit-queue (batch tested with PRs 41121, 40048, 40502, 41136, 40759)

Remove deprecated kubelet flags that look safe to remove

Removes:
```
--config
--auth-path
--resource-container
--system-container
```
which have all been marked deprecated since at least 1.4 and look safe to remove.

```release-note
The deprecated flags --config, --auth-path, --resource-container, and --system-container were removed.
```
2017-02-09 14:27:45 -08:00
David Ashpole ab2ce9cd73 lengthen pod deletion timeout to prevent flakes 2017-02-08 13:12:51 -08:00
Janet Kuo 7c89359cc8 Address comments: remove unused resourceVersion in e2e util wait loop; poll pods every 2 seconds 2017-02-08 13:05:11 -08:00
Yu-Ju Hong 3d78271dd9 CRI node e2e: add tests for docker 1.10
This is part of #38164
2017-02-08 10:21:12 -08:00
Random-Liu 4e231ee3dc Remove angle brackets in the test name. 2017-02-07 16:22:59 -08:00
Michael Taufen 982df56c52 Replace uses of --config with --pod-manifest-path 2017-02-07 14:32:37 -08:00
Kubernetes Submit Queue 5d0377d2e2 Merge pull request #41027 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)

Bump GCI to gci-beta-56-9000-80-0

cc/ @Random-Liu @adityakali 

Changelogs since gci-dev-56-8977-0-0 (currently used in Kubernetes):
 - "net.ipv4.conf.eth0.forwarding" and "net.ipv4.ip_forward" may get reset to 0
 - Track CVE-2016-9962 in Docker in GCI
 - Linux kernel CVE-2016-7097
 - Linux kernel CVE-2015-8964
 - Linux kernel CVE-2016-6828
 - Linux kernel CVE-2016-7917
 - Linux kernel CVE-2016-7042
 - Linux kernel CVE-2016-9793
 - Linux kernel CVE-2016-7039 and CVE-2016-8666
 - Linux kernel CVE-2016-8655
 - Toolbox: allow docker image to be loaded from local tarball
 - Update compute-image-package in GCI 
 - Change the product name on /etc/os-release (to COS)
 - Remove 'dogfood' from HWID_OVERRIDE in /etc/lsb-release
 - Include Google NVME extensions to optimize LocalSSD performance.
 - /proc/<pid>/io missing on GCI (enables process stats accounting)
 - Enable BLK_DEV_THROTTLING

cc/ @roberthbailey @fabioy for GKE cluster update
2017-02-06 20:57:14 -08:00
Dawn Chen 687aa5768b Update node-e2e tests to gci-beta-56-9000-80-0 2017-02-06 09:25:48 -08:00
Michael Taufen 945d223738 Set custom threshold for memory eviction test 2017-02-03 16:48:34 -08:00
Derek Carr 2ab9f0384e Update test e2e nodes to use new flag 2017-02-03 17:21:37 -05:00
Derek Carr 04a909a257 Rename cgroups-per-qos flag to not be experimental 2017-02-03 17:10:53 -05:00
David Ashpole 4cd60e2393 delete cadvisor pod after test 2017-02-03 10:33:43 -08:00
Kubernetes Submit Queue 12a80380bc Merge pull request #40874 from dashpole/density_test_volumes
Automatic merge from submit-queue (batch tested with PRs 40864, 40666, 38382, 40874)

Density Test includes deletion and volumes

Moved the calls to deletePodSync to BEFORE logDensityTimeSeries.  This is because the parser considers a line printed in logDensityTimeSeries to be the "end" of the test.  This change includes deletion in the "test window", but makes no other changes.

I also added volumes to the test, so that we can make sure that mounting and unmounting volumes are also taken into account for performance profiling.
2017-02-02 21:04:52 -08:00
Kubernetes Submit Queue 7201f3b989 Merge pull request #40884 from Random-Liu/update-to-docker-1-12-6
Automatic merge from submit-queue (batch tested with PRs 40884, 40809, 40845, 40866, 40875)

Node E2E: Create new ubuntu image with docker 1.12.6.

We should test the newest docker 1.12 version - 1.12.6.

/cc @dchen1107 @yujuhong @kubernetes/sig-node-pr-reviews
2017-02-02 18:53:47 -08:00
Kubernetes Submit Queue 8a8f6ca849 Merge pull request #40525 from lucab/to-k8s/node-e2e-local-cri
Automatic merge from submit-queue (batch tested with PRs 40812, 39903, 40525, 40729)

test/node_e2e: wire-in cri-enabled local testing

This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.

Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate argument to
the kubelet.
2017-02-02 13:57:48 -08:00
Random-Liu ec7f34a24b Create new ubuntu image with docker 1.12.6. 2017-02-02 11:52:54 -08:00
David Ashpole ad73b325f3 changed density test to use volumes, and include deletion before logging 2017-02-02 08:51:01 -08:00
Luca Bruno 42bdbe5c82
test/node_e2e: wire-in "container-runtime" for local tests
This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.

Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate arguments to
the kubelet.
2017-02-01 20:34:51 +00:00
Kubernetes Submit Queue f272781259 Merge pull request #40529 from lucab/to-k8s/e2e_node-kubelet-busybox-argv0
Automatic merge from submit-queue (batch tested with PRs 40529, 40630)

test/e2e_node: tie together expected string and exec

This commit ties together busybox-sh invocation and test expectation
to avoid subtle mismatches between exec command and output string.
2017-02-01 00:16:37 -08:00
Kubernetes Submit Queue e5d647988e Merge pull request #39049 from ixdy/node-e2e-ssh-key
Automatic merge from submit-queue

Add flag to node e2e test specifying location of ssh privkey

**What this PR does / why we need it**: in CI, the ssh private key is not always located at `$HOME/.ssh`, so it's helpful to be able to override it.

@krzyzacy here's my resurrected change. I'm not sure why I neglected to follow-through on it originally.

**Release note**:

```release-note
NONE
```
2017-01-31 13:40:26 -08:00
deads2k c9a008dff3 move util/intstr to apimachinery 2017-01-30 12:46:59 -05:00
Dr. Stefan Schimanski 44ea6b3f30 Update generated files 2017-01-29 21:41:45 +01:00
Dr. Stefan Schimanski bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
Lucas Käldström 84006601a0
Upgrade go version in Makefiles to 1.7, use qemu 2.7, armel => armhf and goarm=6 => goarm=7 and use go 1.7.4 2017-01-27 20:04:24 +02:00
Luca Bruno 05bff300f3
test/e2e_node: tie together expected string and exec
This commit ties together busybox-sh invocation and test expectation
to avoid subtle mismatches between exec command and output string.
2017-01-26 17:14:06 +00:00
deads2k 2734f8f892 move dynamic and discovery clients 2017-01-26 08:37:06 -05:00
Kubernetes Submit Queue 7bce538d0b Merge pull request #40326 from mtaufen/gci-cos-node-tests
Automatic merge from submit-queue

Prep node_e2e for GCI to COS name change

GCI will soon change name in etc/os-release from "gci" to "cos".

This prepares the node_e2e tests to deal with that change and also updates some comments/log messages/var names in anticipation.
2017-01-25 20:15:32 -08:00
Dr. Stefan Schimanski a0137e9b28 Update generated files 2017-01-25 19:49:45 +01:00
Dr. Stefan Schimanski d7eb3b6870 pkg/util: move uuid and strategicpatch into k8s.io/apimachinery 2017-01-25 19:45:09 +01:00
deads2k b0b156b381 make tools/cache authoritative 2017-01-25 08:29:45 -05:00
Kubernetes Submit Queue df42444742 Merge pull request #40216 from sttts/sttts-more-cutoffs
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)

genericapiserver: more dependency cutoffs

- cut-off pkg/api.Resource and friends - lgtm
- authn plugins -> k8s.io/apiserver - 
- webhook authz plugin -> k8s.io/apiserver - lgtm
- ~~pkg/cert -> k8s.io/apimachinery (will rebase on @deads2k's PR also moving it)~~
- split pkg/config into kubelet config merger and flags - lgtm
- split feature gate between generic apiserver and kube - lgtm
- move pkg/util/flag into k8s.io/apiserver - lgtm
2017-01-24 16:26:00 -08:00
Dr. Stefan Schimanski 2b8e938128 Update generated files 2017-01-24 20:56:03 +01:00
Dr. Stefan Schimanski a6b2ebb50c pkg/flag: make feature gate extensible and split between generic and kube 2017-01-24 20:56:03 +01:00
Dr. Stefan Schimanski 56d60cfae6 pkg/util: move flags from pkg/util/config to pkg/util/flags 2017-01-24 20:56:03 +01:00
Clayton Coleman be6d2933df
refactor: Move *Options references to metav1 2017-01-24 13:41:51 -05:00
Michael Taufen 924ec4711b Prep node_e2e for GCI to COS name change 2017-01-23 15:00:39 -08:00
Clayton Coleman 469df12038
refactor: move ListOptions references to metav1 2017-01-23 17:52:46 -05:00
Clayton Coleman 244734171e
Add conformance tests for terminationMessage(Path|Policy)
Test root, non-root, success and message, failure and message.
2017-01-23 12:26:37 -05:00
Sen Lu cb4ea07229 Add --ssh-user to conformance script as well 2017-01-20 16:07:13 -08:00
deads2k ee6752ef20 find and replace 2017-01-20 08:04:53 -05:00
deads2k c587b8a21e re-run client-gen 2017-01-20 08:02:36 -05:00
Kubernetes Submit Queue 582aa0d793 Merge pull request #40107 from dashpole/flaky_dynamic_kconfig
Automatic merge from submit-queue

turn on dynamic config for flaky tests

Added dynamic config to inode eviction node e2e tests in #39546, but did not enable it for flaky tests.  This PR enables this feature for the flaky test suite
2017-01-18 20:13:00 -08:00
Kubernetes Submit Queue b29d9cdbcf Merge pull request #39898 from ixdy/bazel-release-tars
Automatic merge from submit-queue

Build release tars using bazel

**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.

For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```

**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.

Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.

With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.

My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)

Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.

Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.

**Release note**:

```release-note
NONE
```
2017-01-18 14:24:48 -08:00
David Ashpole caa9cc0b70 turn on dynamic config for flaky tests 2017-01-18 13:48:47 -08:00
Antoine Pelisse 86248b5f0b Update OWNERS approvers and reviewers: test/e2e_node 2017-01-18 10:16:20 -08:00
Clayton Coleman 9a2a50cda7
refactor: use metav1.ObjectMeta in other types 2017-01-17 16:17:19 -05:00
deads2k 77b4d55982 mechanical 2017-01-16 09:35:12 -05:00
Kubernetes Submit Queue 4744e7ec52 Merge pull request #39889 from Random-Liu/add-docker-1.12-node-e2e
Automatic merge from submit-queue (batch tested with PRs 38427, 39896, 39889, 39871, 39895)

Add docker 1.12 in node e2e.

Add docker 1.12 image in node e2e (including regular node e2e and cri node e2e).

@dchen1107 @yujuhong 
/cc @kubernetes/sig-node-misc
2017-01-13 20:21:38 -08:00
Random-Liu 04e68619ce Add docker 1.12 in node e2e. 2017-01-13 14:58:49 -08:00
Jeff Grafton 14dd0d3bef Add genrule to produce e2e_node.test binary artifact 2017-01-13 14:46:26 -08:00
Kubernetes Submit Queue 6b5d82b512 Merge pull request #37505 from k82cn/use_controller_inf
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)

Made cache.Controller to be interface.

**What this PR does / why we need it**:

#37504
2017-01-13 13:40:41 -08:00
Kubernetes Submit Queue a6fa5c2bfd Merge pull request #39814 from deads2k/api-58-multi-register
Automatic merge from submit-queue

replace global registry in apimachinery with global registry in k8s.io/kubernetes

We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work.  Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.

@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
2017-01-13 12:37:02 -08:00
deads2k f1176d9c5c mechanical repercussions 2017-01-13 08:27:14 -05:00
Klaus Ma 25fe1e0d82 Made cache.Controller to be interface. 2017-01-13 13:33:23 +08:00
NickrenREN a12dea14e0 fix redundant alias clientset 2017-01-12 10:21:05 +08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Kubernetes Submit Queue 3f9f7471af Merge pull request #38989 from sjenning/set-qos-field
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)

Set PodStatus QOSClass field

This PR continues the work for https://github.com/kubernetes/kubernetes/pull/37968

It converts all local usage of the `qos` package class types to the new API level types (first commit) and sets the pod status QOSClass field in the at pod creation time on the API server in `PrepareForCreate` and in the kubelet in the pod status update path (second commit).  This way the pod QOS class is set even if the pod isn't scheduled yet.

Fixes #33255

@ConnorDoyle @derekwaynecarr @vishh
2017-01-10 22:24:13 -08:00
Kubernetes Submit Queue a2da4f0cac Merge pull request #39546 from dashpole/dynamic_config_eviction_hard
Automatic merge from submit-queue (batch tested with PRs 39695, 37054, 39627, 39546, 39615)

Use Dynamic Config in e2e_node inode eviction test

Alternative solution to #39249.  Similar to solution proposed by @vishh in #36828.

@Random-Liu @mtaufen
2017-01-10 18:57:26 -08:00
Jeff Grafton 19aafd291c Always --pull in docker build to ensure recent base images 2017-01-10 16:21:05 -08:00
Seth Jennings e2402b781b set qos class field in pod status 2017-01-10 16:31:52 -06:00
Seth Jennings 4c30459e49 switch from local qos types to api types 2017-01-10 10:54:30 -06:00
David Ashpole c3951a72ab use dynamic config to set eviction hard threshold 2017-01-09 15:27:12 -08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Kubernetes Submit Queue f4a8713088 Merge pull request #36229 from wojtek-t/bump_etcd_version
Automatic merge from submit-queue (batch tested with PRs 36229, 39450)

Bump etcd to 3.0.14 and switch to v3 API in etcd.

Ref #20504

**Release note**:

```release-note
Switch default etcd version to 3.0.14.
Switch default storage backend flag in apiserver to `etcd3` mode.
```
2017-01-04 17:36:06 -08:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
Random-Liu a719a7d7e7 Do not use sudo when untar node e2e tar ball. 2016-12-21 16:28:33 -08:00
Jeff Grafton 30a5efa33b Add flag to node e2e test specifying location of ssh privkey 2016-12-21 11:52:41 -08:00
Kubernetes Submit Queue 1955ed614f Merge pull request #39074 from Random-Liu/node-e2e-set-user
Automatic merge from submit-queue

Node E2E: Set user with `--ssh-user` flag when running remote node e2e.

This PR unblocks https://github.com/kubernetes/test-infra/issues/1348.

In our test environment, we must login test instance as user `jenkins` because of the service account. Node e2e is always using the default user on the host, which works fine till now, because it is always run as `jenkins` in our test environment.

However, now we moved the test runner into a docker container, inside the container user is `root` by default, which will cause error:
```
Permission denied (publickey)
```

This PR added a flag `--ssh-user` to explicitly specify the user used to ssh into test instance. The dockerized test runner can set user to `jenkins` with this flag.

@krzyzacy  @ixdy
2016-12-21 11:21:09 -08:00
Random-Liu 10f72be5af Support set user with `--ssh-user` flag when running remote node e2e. 2016-12-21 01:54:02 -08:00
Wojciech Tyczynski 498a893fa3 Switch to etcd v3 API by default 2016-12-20 11:57:46 +01:00
Kubernetes Submit Queue 5b2823adb9 Merge pull request #38191 from sttts/sttts-move-master-options
Automatic merge from submit-queue

Move non-generic apiserver code out of the generic packages
2016-12-17 01:25:45 -08:00
Matt Liggett 69cd805532 Merge pull request #38804 from Random-Liu/disable-au
Node E2E: Disable AU in node e2e test.
2016-12-16 15:32:23 -08:00
David Ashpole 5d352439d4 test no longer fails when it fails to get the summary 2016-12-16 11:50:43 -08:00
Dr. Stefan Schimanski 7267299c3c genericapiserver: move MasterCount and service options into master 2016-12-16 17:23:43 +01:00
Random-Liu c57f2ec064 Fix report prefix for node conformance test. 2016-12-15 15:27:14 -08:00
Kubernetes Submit Queue d169d59565 Merge pull request #38788 from Random-Liu/fix-node-conformance-test
Automatic merge from submit-queue (batch tested with PRs 38788, 38821, 38829)

Node Conformance Test: Fix node conformance test.

The test suite could build on my desktop. However it is failing on jenkins.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-conformance/1

It turns out that `docker save $IMAGE -o $FILE` only works for docker 1.12. (My desktop is 1.12) For older version docker, we should use `docker save -o $FILE $IMAGE instead`. (Jenkins is using 1.9.1)

@timstclair Could you help me review this short PR? :)
2016-12-15 13:57:16 -08:00
Random-Liu e5efc21de6 Disable AU in node e2e test. 2016-12-15 01:33:09 -08:00
Kubernetes Submit Queue d8efc779ed Merge pull request #38154 from caesarxuchao/rename-release_1_5
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)

Rename "release_1_5" clientset to just "clientset"

We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset", clientset development will be done in this directory.

@kubernetes/sig-api-machinery @deads2k 

```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
2016-12-14 14:21:51 -08:00
Random-Liu 06507e9378 Fix node conformance test. 2016-12-14 14:21:10 -08:00
Chao Xu 03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Random-Liu 02e96df55c Update log level. 2016-12-13 19:08:55 -08:00
Random-Liu 54c874f2c6 Update bazel. 2016-12-13 19:08:55 -08:00
Random-Liu 4cdd1b788a Add node conformance ci test. 2016-12-13 19:08:55 -08:00
Random-Liu b7ec229e2c Add run kubelet mode. 2016-12-13 19:08:55 -08:00
Random-Liu bca5aea5ba Refactor RunRemote to support TestSuite interface. 2016-12-13 19:08:55 -08:00
Random-Liu 99dc80ccc2 Add TestSuite interface and update the CreateTestArchive function. 2016-12-13 19:08:55 -08:00
Kubernetes Submit Queue aca523f586 Merge pull request #38664 from dashpole/flaky_inode
Automatic merge from submit-queue

Inode Eviction Test is Flaky

This Pull Request:
Marks the InodeEviciton test as flaky
Increases the timeout for disk pressure because coreos has nearly 2 million inodes.
Decreases the status polling interval so we can see eviction ordering better.
@Random-Liu
2016-12-13 16:22:40 -08:00
David Ashpole 0ec482308e increase timeout and marked as flaky. decreased polling interval to better monitor eviction ordering 2016-12-12 17:31:24 -08:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Random-Liu b27776a25f Update CVM version to e2e-node-containervm-v20161208-image. 2016-12-12 01:50:37 -08:00
Random-Liu 486ddae35a `make test-e2e-node` runs the same test with pr builder by default. 2016-12-09 16:06:18 -08:00
Wojciech Tyczynski a9ec31209e GetOptions - fix tests 2016-12-09 09:42:01 +01:00
Jun Gong 036899ec98 Add --image-pull-progress-deadline option to kubelet 2016-12-09 09:28:57 +08:00
David Ashpole 93ca4bbf47 adjusted timeouts; fixed but where pods are deleted late; added delay to account for delayed pressure 2016-12-08 14:01:30 -08:00
Kubernetes Submit Queue 6b9a944285 Merge pull request #36637 from resouer/nits-e2e
Automatic merge from submit-queue

Fix useless uuid  in container log path node e2e

@timstclair pointed out there're nits in original PR, ref: https://github.com/kubernetes/kubernetes/pull/34877

So this patch: 
1. removed useless uuid
2. change all those strings to const

Thanks. 🐱
2016-12-08 00:43:57 -08:00
Tim St. Clair f4706464f6
Decrease expected lower bound for misc CPU 2016-12-07 12:39:48 -08:00
Random-Liu 6f92572209 Move ssh related functions into ssh.go. 2016-12-05 23:59:58 -08:00
Random-Liu 7c2b1f4752 Remove setup-node, which is not needed after we run the whole test as
root.
2016-12-05 16:16:08 -08:00
Dr. Stefan Schimanski 24e24fc7bb Add verb support to gc and namespace controllers 2016-12-05 12:36:05 +01:00
Clayton Coleman 3454a8d52c
refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman 5df8cc39c9
refactor: generated 2016-12-03 19:10:46 -05:00
Kubernetes Submit Queue 9f66f7544b Merge pull request #37856 from timstclair/summary-test
Automatic merge from submit-queue (batch tested with PRs 37692, 37785, 37647, 37941, 37856)

Verify misc container in summary test

Should detect issue from https://github.com/kubernetes/kubernetes/issues/35214, https://github.com/kubernetes/kubernetes/issues/37453

/cc @piosz @dchen1107
2016-12-03 11:45:03 -08:00
Kubernetes Submit Queue ba62dafe39 Merge pull request #37663 from Random-Liu/fix-node-e2e-firewall-configure
Automatic merge from submit-queue (batch tested with PRs 37094, 37663, 37442, 37808, 37826)

Node E2E: Fix node e2e firewall configure.

Get rid of the misleading error message:
```
W1129 12:57:16.967] E1129 12:57:16.967130   29815 remote.go:204] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.201.208 -- sudo sh -c 'iptables -L INPUT | grep "Chain INPUT (policy DROP)"&&(iptables -C INPUT -w -p TCP -j ACCEPT || iptables -A INPUT -w -p TCP -j ACCEPT)&&(iptables -C INPUT -w -p UDP -j ACCEPT || iptables -A INPUT -w -p UDP -j ACCEPT)&&(iptables -C INPUT -w -p ICMP -j ACCEPT || iptables -A INPUT -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
W1129 12:57:17.271] E1129 12:57:17.271169   29815 remote.go:213] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.201.208 -- sudo sh -c 'iptables -L FORWARD | grep "Chain FORWARD (policy DROP)" > /dev/null&&(iptables -C FORWARD -w -p TCP -j ACCEPT || iptables -A FORWARD -w -p TCP -j ACCEPT)&&(iptables -C FORWARD -w -p UDP -j ACCEPT || iptables -A FORWARD -w -p UDP -j ACCEPT)&&(iptables -C FORWARD -w -p ICMP -j ACCEPT || iptables -A FORWARD -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
W1129 12:57:17.557] E1129 12:57:17.556683   29815 remote.go:204] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.128.178 -- sudo sh -c 'iptables -L INPUT | grep "Chain INPUT (policy DROP)"&&(iptables -C INPUT -w -p TCP -j ACCEPT || iptables -A INPUT -w -p TCP -j ACCEPT)&&(iptables -C INPUT -w -p UDP -j ACCEPT || iptables -A INPUT -w -p UDP -j ACCEPT)&&(iptables -C INPUT -w -p ICMP -j ACCEPT || iptables -A INPUT -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
W1129 12:57:17.771] I1129 12:57:17.771236   29815 remote.go:231] Killing any existing node processes on tmp-node-e2e-a1212c32-gci-dev-56-8977-0-0
W1129 12:57:17.877] E1129 12:57:17.877123   29815 remote.go:213] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.128.178 -- sudo sh -c 'iptables -L FORWARD | grep "Chain FORWARD (policy DROP)" > /dev/null&&(iptables -C FORWARD -w -p TCP -j ACCEPT || iptables -A FORWARD -w -p TCP -j ACCEPT)&&(iptables -C FORWARD -w -p UDP -j ACCEPT || iptables -A FORWARD -w -p UDP -j ACCEPT)&&(iptables -C FORWARD -w -p ICMP -j ACCEPT || iptables -A FORWARD -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
W1129 12:57:17.898] I1129 12:57:17.898711   29815 remote.go:239] Extracting tar on tmp-node-e2e-a1212c32-gci-dev-56-8977-0-0
W1129 12:57:17.941] E1129 12:57:17.941566   29815 remote.go:204] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.154.237 -- sudo sh -c 'iptables -L INPUT | grep "Chain INPUT (policy DROP)"&&(iptables -C INPUT -w -p TCP -j ACCEPT || iptables -A INPUT -w -p TCP -j ACCEPT)&&(iptables -C INPUT -w -p UDP -j ACCEPT || iptables -A INPUT -w -p UDP -j ACCEPT)&&(iptables -C INPUT -w -p ICMP -j ACCEPT || iptables -A INPUT -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
W1129 12:57:18.020] I1129 12:57:18.019802   29815 remote.go:231] Killing any existing node processes on tmp-node-e2e-a1212c32-coreos-alpha-1122-0-0-v20160727
W1129 12:57:18.024] E1129 12:57:18.024044   29815 remote.go:213] Failed to configured firewall: command [ssh -i /home/jenkins/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR 104.154.154.237 -- sudo sh -c 'iptables -L FORWARD | grep "Chain FORWARD (policy DROP)" > /dev/null&&(iptables -C FORWARD -w -p TCP -j ACCEPT || iptables -A FORWARD -w -p TCP -j ACCEPT)&&(iptables -C FORWARD -w -p UDP -j ACCEPT || iptables -A FORWARD -w -p UDP -j ACCEPT)&&(iptables -C FORWARD -w -p ICMP -j ACCEPT || iptables -A FORWARD -w -p ICMP -j ACCEPT)'] failed with error: exit status 1 output: 
```

The problem is that the command 'iptables -L FORWARD | grep "Chain FORWARD (policy DROP)" returns an error when the rule is not found, which is not expected behaviour.

@freehan
2016-12-03 04:27:48 -08:00
Kubernetes Submit Queue 7ec3be4c8e Merge pull request #36964 from ixdy/gobin-build
Automatic merge from submit-queue

Build vendored copy of go-bindata and use that in go generate step

**What this PR does / why we need it**: as the title says, uses the vendored version of `go-bindata` rather than expecting developers to `go get` it (when building outside docker).

**Which issue this PR fixes**: fixes #34067, partially addresses #36655

**Special notes for your reviewer**: we still call `go generate` far too many times:
```console
~/.../src/k8s.io/kubernetes $ which go-bindata
~/.../src/k8s.io/kubernetes $ make
+++ [1116 17:35:28] Building the toolchain targets:
    k8s.io/kubernetes/hack/cmd/teststale
    k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:29] Generating bindata:
    test/e2e/framework/gobindata_util.go
+++ [1116 17:35:30] Building go targets for linux/amd64:
    cmd/libs/go2idl/deepcopy-gen
+++ [1116 17:35:35] Building the toolchain targets:
    k8s.io/kubernetes/hack/cmd/teststale
    k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:35] Generating bindata:
    test/e2e/framework/gobindata_util.go
+++ [1116 17:35:36] Building go targets for linux/amd64:
    cmd/libs/go2idl/defaulter-gen
+++ [1116 17:35:41] Building the toolchain targets:
    k8s.io/kubernetes/hack/cmd/teststale
    k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:41] Generating bindata:
    test/e2e/framework/gobindata_util.go
+++ [1116 17:35:42] Building go targets for linux/amd64:
    cmd/libs/go2idl/conversion-gen
+++ [1116 17:35:47] Building the toolchain targets:
    k8s.io/kubernetes/hack/cmd/teststale
    k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:47] Generating bindata:
    test/e2e/framework/gobindata_util.go
+++ [1116 17:35:48] Building go targets for linux/amd64:
    cmd/libs/go2idl/openapi-gen
+++ [1116 17:35:56] Building the toolchain targets:
    k8s.io/kubernetes/hack/cmd/teststale
    k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:56] Generating bindata:
    test/e2e/framework/gobindata_util.go
```
Fixing that is a separate effort, though.

cc @sebgoa @ZhangBanger
2016-12-02 07:29:01 -08:00
Kubernetes Submit Queue 3a4b4749e3 Merge pull request #37648 from Random-Liu/collect-serial-output-node-e2e
Automatic merge from submit-queue

Node E2E: Collect serial output

This is a temporary solution to collect serial output from test GCE node in node e2e.

We should come up with a better idea later. Ideally, node e2e should share the same log collection logic with cluster e2e. https://github.com/kubernetes/kubernetes/blob/master/cluster/log-dump.sh

Mark v1.5 because this helps debug https://github.com/kubernetes/kubernetes/issues/37333.

@mtaufen @dchen1107 
/cc @kubernetes/sig-node
2016-12-01 15:06:59 -08:00
Tim St. Clair bcf5e434fa
Verify misc container in summary test 2016-12-01 14:02:39 -08:00
Daniel Smith 5b1d875f27 Revert "Modify GCI mounter to enable NFSv3" 2016-12-01 11:47:24 -08:00
Random-Liu 6d4e457f1f Collect serial output when test fails in node e2e. 2016-12-01 10:41:24 -08:00
Kubernetes Submit Queue 2fab199390 Merge pull request #36334 from luxas/add_preflight
Automatic merge from submit-queue

Add the system verification test to the kubeadm preflight checks

And refactor the system verification test to accept to write to a specific writer in order to customize the output

This PR is targeting v1.5, PTAL
cc @Random-Liu @dchen1107 @kubernetes/sig-cluster-lifecycle
2016-12-01 04:52:07 -08:00
Jeff Grafton 0d9d623f04 Build vendored copy of go-bindata and use that in go generate step
Additionally remove all instances of `go get`ing go-bindata
2016-11-30 22:23:40 -08:00
Kubernetes Submit Queue 6c2c12fafa Merge pull request #37582 from jingxu97/Nov/retrynfsv3
Automatic merge from submit-queue

Modify GCI mounter to enable NFSv3
2016-11-30 21:59:08 -08:00
Kubernetes Submit Queue b0fd700f61 Merge pull request #36604 from deads2k/api-42-add-generic-loopback
Automatic merge from submit-queue

move parts of the mega generic run struct out

This splits the main `ServerRunOptions` into composeable pieces that are bindable separately and adds easy paths for composing servers to run delegating authentication and authorization.

@sttts @ncdc alright, I think this is as far as I need to go to make the composing servers reasonable to write.  I'll try leaving it here
2016-11-30 21:11:05 -08:00
Pengfei Ni f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
Kubernetes Submit Queue 42d5a1a9cd Merge pull request #37392 from Random-Liu/final-cleanup-for-nct
Automatic merge from submit-queue

Node Conformance Test: Final cleanup for node conformance test.

This PR fits node conformance test with recent change.
* Remove `--manifest-path` because the test will get kubelet configuration through `/configz` now. https://github.com/kubernetes/kubernetes/pull/36919
* Add `$TEST_ARGS` so that we can override arguments inside the container.
* Fix a bug in garbage_collector_test.go which will cause the framework tries to connect docker no matter running the test or not. @dashpole 
* Add `${REGISTRY}/node-test:${VERSION}` for convenience. 
* Bump up the image version to `0.2`. (the one released with v1.4 is `v0.1`)

I've run the test both with `run_test.sh` script and directly `docker run`. Both of them passed.

After this gets merged, I'll build and push the new test image.

@dchen1107 
/cc @kubernetes/sig-node
2016-11-29 22:39:52 -08:00
Random-Liu 85afed5dd0 Fix node e2e firewall configure. 2016-11-29 15:40:31 -08:00
Jing Xu 80f2e58ccc Modify GCI mounter to enable NFSv3
This PR is a retry for PR #36610
2016-11-29 10:50:33 -08:00
deads2k 56b7a8b02b remove some options from mega-struct 2016-11-29 10:59:43 -05:00
deads2k a9af8206cb split generic etcdoption out of main struct 2016-11-29 10:59:42 -05:00
David Ashpole 679c503ae3 changed api to api/v1 2016-11-28 14:24:43 -08:00
David Ashpole a232c15a45 InodeEviction test tests that when some pods create many empty files, both in their container and in volumes, they are evicted before pods that act normally. 2016-11-28 13:09:40 -08:00
Harry Zhang 3002f5719d Fix nits in node e2e log path 2016-11-28 15:24:22 +08:00
Clayton Coleman 35a6bfbcee
generated: refactor 2016-11-23 22:30:47 -06:00
Clayton Coleman a43960da3b Move GroupVersion* to pkg/runtime/schema 2016-11-23 21:03:36 -06:00
Chao Xu bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu 29400ac195 test/e2e_node 2016-11-23 15:53:09 -08:00
Random-Liu dfbe7be5b5 Final cleanup for node conformance test. 2016-11-23 13:39:54 -08:00
Kubernetes Submit Queue 4c7febd360 Merge pull request #37338 from Random-Liu/fix-remote-log-fetching
Automatic merge from submit-queue

Node E2E: Fix remote log fetching.

For issue https://github.com/kubernetes/kubernetes/issues/37333.

This will help debug https://github.com/kubernetes/kubernetes/issues/37333.

Mark v1.5 because this helps debug an issue https://github.com/kubernetes/kubernetes/issues/37333, which was originally https://github.com/kubernetes/kubernetes/issues/35935. /cc @saad-ali 

@yujuhong @dchen1107 @jingxu97 
/cc @kubernetes/sig-node
2016-11-23 10:58:32 -08:00
Piotr Szczesniak a3e6ad4b9a Revert "Modify GCI mounter to enable NFSv3" 2016-11-23 13:15:37 +01:00
Kubernetes Submit Queue e801fcfc4a Merge pull request #36610 from jingxu97/Nov/nfsv3
Automatic merge from submit-queue

Modify GCI mounter to enable NFSv3

In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.

After this change, need to make push the image and update the sha number in Changelog.
2016-11-22 23:38:51 -08:00
Random-Liu e000ff0872 Fix remote log fetching. 2016-11-22 18:40:40 -08:00
Kubernetes Submit Queue e4724e8ab0 Merge pull request #37109 from Random-Liu/fix-lifecycle-hook-test
Automatic merge from submit-queue

Use netexec container in http lifecycle hook test.

Fixes https://github.com/kubernetes/kubernetes/issues/33636.

The original test is using `"echo -e \"HTTP/1.1 200 OK\n\" | nc -l -p 1234` as a simple http server.

However, it seems that this is not very reliable, which may response before golang thinks it should.
So we get the error:
```
I1106 06:14:13.325397    2096 logs.go:41] Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 200 OK\n\n"; err=<nil>
```

This PR changes the test to use the `netexec` container which is a simple http server written by golang and used in many of our networking e2e test. It should be more reliable.
Mark 1.5 since this is fixing a 1.5 release blocking issue. Mark P0 to match the original issue.

@dchen1107
2016-11-22 12:41:37 -08:00
Random-Liu e0cdeb4c2a Use netexec container in http lifecycle hook test. 2016-11-22 10:09:19 -08:00
Random-Liu ab99cc0eba Update ubuntu image to e2e-node-ubuntu-trusty-docker10-v2-image. 2016-11-22 01:22:20 -08:00
Jing Xu 2a8d89e5d1 Modify GCI mounter to enable NFSv3
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.

After this change, need to make push the image and update the sha number in Changelog.
2016-11-21 16:42:40 -08:00
Brendan Burns ef6529bf2f make groupVersionResource listing dynamic when third party resources are
enabled.
2016-11-20 20:48:57 -08:00
David Ashpole 10f73bde27 added eviction minimum reclaim flags to test flags, and changed gce default config for eviction-hard to match what tests are using 2016-11-18 08:48:40 -08:00
Kubernetes Submit Queue eca9e989a3 Merge pull request #36779 from sjenning/fix-memory-leak-via-terminated-pods
Automatic merge from submit-queue

fix leaking memory backed volumes of terminated pods

Currently, we allow volumes to remain mounted on the node, even though the pod is terminated.  This creates a vector for a malicious user to exhaust memory on the node by creating memory backed volumes containing large files.

This PR removes memory backed volumes (emptyDir w/ medium Memory, secrets, configmaps) of terminated pods from the node.

@saad-ali @derekwaynecarr
2016-11-17 21:29:51 -08:00
Random-Liu 87a9d94f24 Update bazel. 2016-11-17 10:18:00 -08:00
Random-Liu edf7608c51 Remove kubelet related flags from node e2e. Add a single flag `kubelet-flags` to pass kubelet flags all together. 2016-11-17 10:17:32 -08:00
Random-Liu 090809d8ad Remove dependency on kubelet related flags. 2016-11-17 10:17:32 -08:00
Yu-Ju Hong 6ba2c7b857 Bump gci image version for cri builds 2016-11-16 14:09:51 -08:00
Seth Jennings b80bea4a62 fix leaking memory backed volumes of terminated pods 2016-11-16 10:17:22 -06:00
Kubernetes Submit Queue 587cbaf988 Merge pull request #36410 from Random-Liu/avoid-printing-test-result-twice
Automatic merge from submit-queue

Node E2E: Avoid printing test result twice.

This is a problem since long time ago.

`RunSshCommand` includes the command output to the error. If the command running the test fails, the test output will also be included in the error. [The runner prints both the test output and the error](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/runner/remote/run_remote.go#L270), which leads the test result to be printed twice. (See the [test result](https://storage.googleapis.com/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10968/build-log.txt) on node tmp-node-e2e-af900a4d-e2e-node-ubuntu-trusty-docker9-v1-image)

This PR changes `RunSshCommand` not to put command output into the error, and leave the caller to decide how to deal with command output when the command fails.
2016-11-16 02:23:12 -08:00
Random-Liu 09bc5e23a6 Avoid printing test result twice. 2016-11-15 18:10:27 -08:00
Kubernetes Submit Queue f0aba3d6fe Merge pull request #35811 from dashpole/garbage_collect_testing
Automatic merge from submit-queue

Garbage collection tests the MaxPerPodContainers and MaxContainers constraints

This is the first version of this test.  It tests that containers are garbage collected according to the default configuration.
2016-11-15 11:22:52 -08:00
David Ashpole f6224590f7 Test Container Garbage Collection 2016-11-15 09:15:31 -08:00
Kubernetes Submit Queue d8fa4f4d56 Merge pull request #35897 from k82cn/fix_mac_build
Automatic merge from submit-queue

Fixed failed build on Mac.

fixed build error on Mac.
2016-11-14 20:42:50 -08:00
Vishnu kannan 9066253491 [kubelet] rename --cgroups-per-qos to --experimental-cgroups-per-qos to reflect the true nature of that feature
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-11-14 14:06:39 -08:00
Lucas Käldström dacec687a4 Add a reporter to the system verification check 2016-11-12 16:36:40 +02:00
Michael Taufen a38c61395e Bump GCI version to gci-dev-56-8977-0-0 2016-11-11 16:00:18 -08:00
Kubernetes Submit Queue 1bc5b822cd Merge pull request #36479 from Random-Liu/node-e2e-node-name
Automatic merge from submit-queue

Node Conformance & E2E: Get node name from node object.

This PR changes the node e2e test framework to get node name from apiserver instead of test flags.

When a user tried out the node conformance test, he found that node conformance test will not work properly if kubelet is started with `hostname-override`.

The reason is that node conformance test is using [the default node name - `os.Hostname`](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L124), which may be different from `hostname-override`. This will cause test pods not scheduled, and eventually test timeout.

We can expose a flag from node conformance test, and let user set node name themselves if they are using `hostname-override` on kubelet. However, let the framework automatically detect it from apiserver is more user friendly.

/cc @kubernetes/sig-node 
This PR 1) only changes node e2e test framework; 2) fixes a problem in node conformance test which is a 1.5 feature. @saad-ali Can we have this in 1.5?
2016-11-10 18:56:53 -08:00
Kubernetes Submit Queue e7754e89df Merge pull request #36594 from mtaufen/fixup-density_test
Automatic merge from submit-queue

Fix wrong comparison var in e2e_node density test
2016-11-10 15:27:47 -08:00
Random-Liu 1c70c899f7 Get node name from node object. 2016-11-10 14:25:50 -08:00
Michael Taufen 90f8bffc33 Fix wrong comparison var in e2e_node density test 2016-11-10 10:26:00 -08:00
Kubernetes Submit Queue 44f672e5e2 Merge pull request #34877 from resouer/e2e-log-path
Automatic merge from submit-queue

Add e2e node test for log path

fixes #34661

A node e2e test to check if container logs files are properly created with right content.

Since the log files under `/var/log/containers` are actually symbolic of docker containers log files, we can not use a pod to mount them in and do check (symbolic doesn't supported by docker volume).

cc @Random-Liu
2016-11-10 08:35:59 -08:00
Kubernetes Submit Queue 467a1cd23b Merge pull request #35868 from Random-Liu/cleanup-node-e2e-output-dir
Automatic merge from submit-queue

Node E2E: Reorganize node e2e output directories.

Fixes https://github.com/kubernetes/kubernetes/issues/35074.

This PR cleans up the result directory and workspace directory of node e2e test.

Local result directory:

```
/tmp/_artifacts/
        |----- build-log.txt  (build log)
        |----- *.xml  (junit xml file)
        |----- local/  (local run *.log)
        |----- hostname1/  (remote run *.log)
        |----- hostname2/
```

Workspace directory on test node:

```
/tmp/node-e2e-yyyy-mm-ddThh-mm-ss/
        |----- cluster/  (gci mounter)
        |----- cni/  (cni binary)
        |----- e2e_node.test  (test binary)
        |----- e2e_node_test.tar.gz  (test tar)
        |----- etcd060429031/  (etcd data directory)
        |----- ginkgo  (ginkgo binary)
        |----- kubelet (kubelet binary)
        |----- pod-manifest365096781/  (mirror pod directory)
        |----- results/  (test result directory)
```

@mtaufen 
/cc @kubernetes/sig-node
2016-11-10 01:58:58 -08:00
Yu-Ju Hong cbe2358940 Remove mounter flags from cri test configs 2016-11-08 22:14:28 -08:00
Kubernetes Submit Queue 6983262914 Merge pull request #36267 from vishh/gci-mounter-scope
Automatic merge from submit-queue

Make GCI nodes mount non tmpfs, ext* & bind mounts using an external mounter 

This PR downloads the stage1 & gci-mounter ACIs as part of cluster bring up instead of downloading them dynamically from gcr.io, which was the cause for #36206.

I have also optimized the containerized mounter to pre-load the mounter image once to avoid fetch latency while using it.

Original PR which got reverted: https://github.com/kubernetes/kubernetes/pull/35821

```release-note
GCI nodes use an external mounter script to mount NFS & GlusterFS storage volumes
```

@mtaufen Node e2e is not re-enabled in this PR.

cc @jingxu97
2016-11-08 19:46:32 -08:00
Vishnu kannan 0562386385 re-enable node e2e for GCI mounter
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-11-08 11:09:13 -08:00
Vishnu kannan dd8ec911f3 Revert "Revert "Merge pull request #35821 from vishh/gci-mounter-scope""
This reverts commit 402116aed4.
2016-11-08 11:09:10 -08:00
Kubernetes Submit Queue 2e4e932391 Merge pull request #36413 from Random-Liu/extend-node-e2e-timeout
Automatic merge from submit-queue

Node E2E: Extend the default ci node e2e test timeout to 1h.

With more and more test added into node e2e, 45m seems to be not enough now.
I saw sometimes the test takes 40+m:
* https://storage.googleapis.com/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10942/build-log.txt (44m4.917870119s)
* https://storage.googleapis.com/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10965/build-log.txt (40m2.37254827s)

And sometimes even timeout:
* https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10968
* https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10941

Although it's quite likely that the timeout happened because the limit is too tight, we are not 100% sure.
This PR extends the test timeout of regular ci node e2e test to 1h. Let's see whether the timeout will happen again.

@yujuhong
2016-11-08 11:08:12 -08:00
Harry Zhang fad1990eaa Fixe verify bazel
Remove rootfs and chroot in scripts
2016-11-08 13:01:28 -05:00
Harry Zhang 64c8d3ad3d Add e2e node test for log path
Update to use pod to check log file
2016-11-08 13:01:25 -05:00
Random-Liu d9ddd64c9c Reorganize node e2e output directories. 2016-11-08 00:12:14 -08:00
Kubernetes Submit Queue 0df6384770 Merge pull request #31093 from Random-Liu/containerize-node-e2e-test
Automatic merge from submit-queue

Node Conformance Test: Containerize the node e2e test

For #30122, #30174.
Based on #32427, #32454.

**Please only review the last 3 commits.**

This PR packages the node e2e test into a docker image:
- 1st commit: Add `NodeConformance` flag in the node e2e framework to avoid starting kubelet and collecting system logs. We do this because:
  - There are all kinds of ways to manage kubelet and system logs, for different situation we need to mount different things into the container, run different commands. It is hard and unnecessary to handle the complexity inside the test suite.
- 2nd commit: Remove all `sudo` in the test container. We do this because:
  - In most container, there is no `sudo` command, and there is no need to use `sudo` inside the container.
  - It introduces some complexity to use `sudo` inside the test. (https://github.com/kubernetes/kubernetes/issues/29211, https://github.com/kubernetes/kubernetes/issues/26748) In fact we just need to run the test suite with `sudo`.
- 3rd commit: Package the test into a docker container with corresponding `Makefile` and `Dockerfile`. We also added a `run_test.sh` script to start kubelet and run the test container. The script is only for demonstration purpose and we'll also use the script in our node e2e framework. In the future, we should update the script to start kubelet in production way (maybe with `systemd` or `supervisord`).

@dchen1107 @vishh 
/cc @kubernetes/sig-node @kubernetes/sig-testing



**Release note**:

<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access) 
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`. 
-->

``` release-note
Release alpha version node test container gcr.io/google_containers/node-test-ARCH:0.1 for users to verify their node setup.
```
2016-11-07 23:41:25 -08:00
Random-Liu 0247626ed2 Extend the default ci node e2e test timeout to 1h. 2016-11-07 19:02:53 -08:00
Kubernetes Submit Queue 18cdbadb96 Merge pull request #36319 from yujuhong/cri_flag
Automatic merge from submit-queue

Rename experimental-runtime-integration-type to experimental-cri

Also rename the field in the component config to `EnableCRI`
2016-11-07 17:07:14 -08:00
Random-Liu 9345e12bc9 Add Dockerfile and Makefile to containerize node conformance test. 2016-11-07 15:27:53 -08:00
Random-Liu 919935beec Remove sudo in test suite and run test with sudo. 2016-11-07 15:27:53 -08:00
Kubernetes Submit Queue 356230f8a1 Merge pull request #36299 from Random-Liu/mark-more-conformance-test
Automatic merge from submit-queue

Node Conformance Test: Mark more conformance test

For https://github.com/kubernetes/kubernetes/issues/30122.

This PR:
1) Removes unused image test.
2) Marks more conformance tests based on https://docs.google.com/spreadsheets/d/1yib6ypfdWuq8Ikyo-rTcBGHe76Xur7tGqCKD9dkzx0Y/edit?usp=sharing.

Notice that 2 tests are not marked conformance for now:
1. **OOM score test:** The test is serial and is verifying host PID directly. The test should start a pod with PID=host and verify inside the pod. @vishh 
2. **Summary api test:** The assumption made in the test doesn't always make sense for arbitrary image, for example: The fs capacity bounds is only [(100mb, 100gb)](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/summary_test.go#L62). @timstclair 
3. We should consider mark **[cgroup manager test](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/cgroup_manager_test.go)** as conformance test. 

@dchen1107 @vishh @timstclair 
/cc @kubernetes/sig-node
2016-11-07 12:45:40 -08:00
Yu-Ju Hong dcce768a3e Rename experimental-runtime-integration-type to experimental-cri 2016-11-07 11:29:24 -08:00
Tim St. Clair 3977a14463
Enable StreamingProxyRedirects for CRI e2e tests 2016-11-07 09:42:44 -08:00
Random-Liu 13a50e3b97 Add containerize flag to avoid starting kubelet and collecting logs. 2016-11-06 20:18:23 -08:00
Kubernetes Submit Queue 9534c4f563 Merge pull request #32427 from Random-Liu/system-verification
Automatic merge from submit-queue

Node Conformance Test: Add system verification

For #30122 and #29081.

This PR introduces system verification test in node e2e and conformance test. It will run before the real test. Once the system verification fails, the test will just fail. The output of the system verification is like this:

```
I0909 23:33:20.622122    2717 validators.go:45] Validating os...
OS: Linux
I0909 23:33:20.623274    2717 validators.go:45] Validating kernel...
I0909 23:33:20.624037    2717 kernel_validator.go:79] Validating kernel version
KERNEL_VERSION: 3.16.0-4-amd64
I0909 23:33:20.624146    2717 kernel_validator.go:93] Validating kernel config
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
I0909 23:33:20.679328    2717 validators.go:45] Validating cgroups...
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
I0909 23:33:20.679454    2717 validators.go:45] Validating docker...
DOCKER_GRAPH_DRIVER: aufs
```

It verifies the system following a predefined `SysSpec`:

``` go
// DefaultSysSpec is the default SysSpec.
 var DefaultSysSpec = SysSpec{
    OS:            "Linux",
    KernelVersion: []string{`3\.[1-9][0-9].*`, `4\..*`}, // Requires 3.10+ or 4+
    // TODO(random-liu): Add more config
    KernelConfig: KernelConfig{
        Required: []string{
            "NAMESPACES", "NET_NS", "PID_NS", "IPC_NS", "UTS_NS",
            "CGROUPS", "CGROUP_CPUACCT", "CGROUP_DEVICE", "CGROUP_FREEZER",
            "CGROUP_SCHED", "CPUSETS", "MEMCG",
        },
        Forbidden: []string{},
    },
    Cgroups: []string{"cpu", "cpuacct", "cpuset", "devices", "freezer", "memory"},
    RuntimeSpec: RuntimeSpec{
        DockerSpec: &DockerSpec{
            Version: []string{`1\.(9|\d{2,})\..*`}, // Requires 1.9+
            GraphDriver: []string{"aufs", "overlay", "devicemapper"},
        },
    },
 }
```

Currently, it only supports:
- Kernel validation: version validation and kernel configuration validation
- Cgroup validation: validating whether required cgroups subsystems are enabled.
- Runtime Validation: currently, only validates docker graph driver.

The validating framework is ready. The specific validation items could be added over time.

@dchen1107 
/cc @kubernetes/sig-node
2016-11-06 17:12:39 -08:00
Kubernetes Submit Queue f650ddf800 Merge pull request #35132 from dashpole/per_volume_inode
Automatic merge from submit-queue

Per Volume Inode Accounting

Collects volume inode stats using the same find command as cadvisor.  The command is "find _path_ -xdev -printf '.' | wc -c".  The output is passed to the summary api, and will be consumed by the eviction manager.

This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary api, and the eviction manager consuming this.  Expect tests to fail until this happens.
DEPENDS ON #35137
2016-11-05 23:45:44 -07:00
Kubernetes Submit Queue 649c0ddd0e Merge pull request #35342 from timstclair/rejected
Automatic merge from submit-queue

[AppArmor] Hold bad AppArmor pods in pending rather than rejecting

Fixes https://github.com/kubernetes/kubernetes/issues/32837

Overview of the fix:

If the Kubelet needs to reject a Pod for a reason that the control plane doesn't understand (e.g. which AppArmor profiles are installed on the node), then it might contiinuously try to run the pod on the same rejecting node. This change adds a concept of "soft rejection", in which the Pod is admitted, but not allowed to run (and therefore held in a pending state). This prevents the pod from being retried on other nodes, but also prevents the high churn. This is consistent with how other missing local resources (e.g. volumes) is handled.

A side effect of the change is that Pods which are not initially runnable will be retried. This is desired behavior since it avoids a race condition when a new node is brought up but the AppArmor profiles have not yet been loaded on it.

``` release-note
Pods with invalid AppArmor configurations will be held in a Pending state, rather than rejected (failed). Check the pod status message to find out why it is not running.
```

@kubernetes/sig-node @timothysc @rrati @davidopp
2016-11-05 22:52:26 -07:00
Random-Liu 150a04d2fc Remove unused image test. 2016-11-05 22:19:43 -07:00
Random-Liu f4aee8664d Mark more conformance tests. 2016-11-05 21:11:51 -07:00
Kubernetes Submit Queue 56526043d5 Merge pull request #32530 from mtaufen/dynamic-settings-tests
Automatic merge from submit-queue

Utility functions for using dynamic Kubelet configuration from a test

/cc @vishh @dchen1107
2016-11-04 20:24:03 -07:00
Kubernetes Submit Queue fbe29f43ea Merge pull request #35724 from mtaufen/disable-cmount-for-e2e-node
Automatic merge from submit-queue

Temporarily disable GCI mounter in e2e node tests

This is just so we have an off-switch ready to go if we need it. Don't merge unless we need to disable this functionality in the e2e node tests.
2016-11-04 14:49:52 -07:00
Michael Taufen c76c9c5330 Temporarily disable GCI mounter in e2e node tests 2016-11-04 12:42:47 -07:00
Yu-Ju Hong 0918a5d5f3 Revert "cr2 e2e: remove experimental-mounter-rootfs flag" 2016-11-04 08:25:03 -07:00
Random-Liu f9b50f0949 Update bazel. 2016-11-03 20:38:29 -07:00
Random-Liu b76b2f218b Add unit test for system verification 2016-11-03 20:38:28 -07:00
Random-Liu a5fdf3850c Add system verification. 2016-11-03 20:37:18 -07:00
saadali 402116aed4 Revert "Merge pull request #35821 from vishh/gci-mounter-scope"
This reverts commit 973fa6b334, reversing
changes made to 41b5fe86b6.
2016-11-03 20:23:25 -07:00
Kubernetes Submit Queue 32bc46a202 Merge pull request #36181 from yujuhong/get_logs
Automatic merge from submit-queue

Node e2e: collect logs if the test fails unexpectedly
2016-11-03 14:40:52 -07:00
Yu-Ju Hong 97a348063c Node e2e: collect logs if the test fails unexpectedly
This only works for nodes with journald.
2016-11-03 11:54:02 -07:00
Yu-Ju Hong 722ecfb21c cr2 e2e: remove experimental-mounter-rootfs flag
The commit was reverted and the flag no longer exists.
2016-11-03 08:21:07 -07:00
Kubernetes Submit Queue 973fa6b334 Merge pull request #35821 from vishh/gci-mounter-scope
Automatic merge from submit-queue

[Kubelet] Use the custom mounter script for Nfs and Glusterfs only

This patch reduces the scope for the containerized mounter to NFS and GlusterFS on GCE + GCI clusters

This patch also enabled the containerized mounter on GCI nodes

Shepherding multiple PRs through the submit queue is painful. Hence I combined them into this PR. Please review each commit individually.

cc @jingxu97 @saad-ali

https://github.com/kubernetes/kubernetes/pull/35652 has also been reverted as part of this PR
2016-11-03 04:32:19 -07:00
Vishnu Kannan 414e4ae549 Revert "Adding a root filesystem override for kubelet mounter"
This reverts commit e861a5761d.
2016-11-02 15:18:09 -07:00
Tim St. Clair ec9111d942
Hold bad AppArmor pods in pending rather than rejecting 2016-11-02 11:05:16 -07:00
Michael Taufen 5190a7d72d Add dynamic kubelet configuration utilities to node e2e tests
Also modify dynamic kubelet configuration test to rely on new utility functions
2016-11-02 10:02:21 -07:00
derekwaynecarr 42289c2758 pod and qos level cgroup support 2016-11-02 08:07:04 -04:00
Yu-Ju Hong d22f4045d5 Disable gci-mounter in cri node e2e tests
gci-mounter is still being validated and there are known issues. Do not enable it
for cri tests for now.
2016-11-01 18:00:43 -07:00
David Ashpole d494ef66f0 Collects volume inode stats using the same find command that cadvisor uses these are included in the summary 2016-11-01 10:51:11 -07:00
Kubernetes Submit Queue 2244bfed81 Merge pull request #35137 from dashpole/per_container_inode_eviction
Automatic merge from submit-queue

Eviction manager evicts based on inode consumption

Fixes: #32526 Integrate Cadvisor per-container inode stats into the summary api.  Make the eviction manager act based on inode consumption to evict pods using the most inodes.

This PR is pending on a cadvisor godeps update which will be included in PR #35136
2016-11-01 10:32:09 -07:00
Kubernetes Submit Queue 4bae0f3a96 Merge pull request #35927 from timstclair/summary-test
Automatic merge from submit-queue

Bump Kubelet workingset upper bound

For https://github.com/kubernetes/kubernetes/issues/34990

Follow up to https://github.com/kubernetes/kubernetes/pull/35828, because working memory is too high now too.
2016-10-31 15:34:18 -07:00
Tim St. Clair 8330b081bc
Bump Kubelet workingset upper bound 2016-10-31 13:51:07 -07:00
David Ashpole b8fc546d60 eviction manager ecivts pod using the most inodes. 2016-10-31 11:32:49 -07:00
Klaus Ma 0e6dbaad67 Fixed failed build on Mac. 2016-10-31 21:14:21 +08:00
Dr. Stefan Schimanski ab3ce27f01 Make master+federation ServerRunOptions embeddings explicit 2016-10-31 11:04:58 +01:00
Dr. Stefan Schimanski b798527793 Rename master/options/{APIServer -> ServerRunOptions} 2016-10-31 10:55:19 +01:00
Tim St. Clair 304dbd0e2e
Increase sys container usageBytes upper bound 2016-10-28 14:50:08 -07:00
Kubernetes Submit Queue cf7178d7c3 Merge pull request #35572 from bprashanth/ip_gc
Automatic merge from submit-queue

GC pod ips

Finally managed to write a *failing* test. 
Supersedes https://github.com/kubernetes/kubernetes/pull/34373

```release-note
GC pod ips
```
2016-10-28 14:44:28 -07:00
Kubernetes Submit Queue 14495fed7c Merge pull request #35717 from vishh/rkt-v1.18.0
Automatic merge from submit-queue

Update rkt version on GCI nodes to v1.18.0

v1.18.0 avoids outputting debug information by default which happens to
pollute events and kubelet logs.
2016-10-28 03:10:30 -07:00
bprashanth 37bc34c567 periodically GC pod ips 2016-10-27 22:15:35 -07:00
Kubernetes Submit Queue a266f72b34 Merge pull request #35730 from yujuhong/expand_benchmarks
Automatic merge from submit-queue

Add coreos and gci images to the node benchmark job
2016-10-27 16:47:19 -07:00
Yu-Ju Hong bf2fd238cc Add coreos and gci images to the node benchmark job 2016-10-27 14:52:58 -07:00
Kubernetes Submit Queue 90f4ceefc4 Merge pull request #35349 from vishh/gci-cmount
Automatic merge from submit-queue

Update GCI mounter script to run in a rkt container

Depends on #35652
2016-10-27 13:49:37 -07:00
Vishnu kannan c556b33bd6 update rkt to v1.18.0 which avoids outputting debug information by default
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-27 12:24:29 -07:00
Vishnu kannan 19c19c2e0f Updating GCI mounter to be containerized
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-27 09:37:08 -07:00
David Ashpole eb19713486 kubelet calls GetDirFsInfo(root directory) instead of using GetFsInfo(root label). Reverted #33520, and changed e2e test context to use nodefs 2016-10-27 08:04:59 -07:00
Vishnu kannan e861a5761d Adding a root filesystem override for kubelet mounter
This is useful for supporting hostPath volumes via containerized
mounters in kubelet.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-26 21:42:59 -07:00
Kubernetes Submit Queue f300d7ed69 Merge pull request #35646 from vishh/klet-relative-mount
Automatic merge from submit-queue

rename kubelet flag mounter-path to experimental-mounter-path

```release-note
* Kubelet flag '--mounter-path' renamed to '--experimental-mounter-path'
```

The feature the flag controls is an experimental feature and this renaming ensures that users do not depend on this feature just yet.
2016-10-26 16:57:33 -07:00
Brian Grant 2ae2339d6a Merge pull request #35546 from thockin/kill-head-scary-warning-on-master
Remove obsolete munger on docs
2016-10-26 16:44:53 -07:00
Vishnu kannan adef4675a0 rename kubelet flag mounter-path to experimental-mounter-path
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-26 14:50:33 -07:00
Kubernetes Submit Queue b1d8961fe4 Merge pull request #35463 from timstclair/summary-test
Automatic merge from submit-queue

Log failed containers after summary e2e test

To help debug https://github.com/kubernetes/kubernetes/issues/34990
2016-10-25 23:55:05 -07:00
Kubernetes Submit Queue 67d947996c Merge pull request #33988 from Random-Liu/add-remote-docker-shim
Automatic merge from submit-queue

CRI: Add dockershim grpc server.

This PR adds a in-process grpc server for dockershim.

Flags change:
1. `container-runtime` will not be automatically set to remote when `container-runtime-endpoint` is set. @feiskyer 
2. set kubelet flag `--experimental-runtime-integration-type=remote --container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to enable the in-process dockershim grpc server.
3. set node e2e test flag `--runtime-integration-type=remote -container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to run node e2e test against in-process dockershim grpc server.

I've run node e2e test against the remote cri integration, tests which don't rely on stream and log functions can pass.

This unblocks the following work:
1) CRI conformance test.
2) Performance comparison between in-process integration and in-process grpc integration.

@yujuhong @feiskyer 
/cc @kubernetes/sig-node
2016-10-25 15:36:29 -07:00
Tim Hockin b0fa2056a6 Remove 'this is HEAD' warning on docs 2016-10-26 00:06:59 +02:00
Kubernetes Submit Queue ffeb01fd17 Merge pull request #35321 from vishh/gci-rkt
Automatic merge from submit-queue

Adding rkt binary to GCI 

rkt is being used to support containerized storage plugins on GCI.
2016-10-25 14:56:14 -07:00
Vishnu kannan 968e7ebe1d add rkt to GCI node e2e test images
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-25 12:38:16 -07:00
Ryan Hitchman 78eeb76386 Make hack/update_owners.py get list from local repo, add --check option. 2016-10-25 12:26:21 -07:00
Random-Liu 3d549b9e25 Add dockershim grpc server. 2016-10-25 10:31:16 -07:00
Mike Danese 763c4987f2 autogenerated 2016-10-24 14:47:27 -07:00
Mike Danese ea632fa813 Revert "disable bazel build"
This reverts commit ee15c80de2.
2016-10-24 14:47:26 -07:00
Mike Danese 0ea5904c23 rename test/e2e_node/build/ to builder/ 2016-10-24 14:47:26 -07:00
Kubernetes Submit Queue 4fbbc746a0 Merge pull request #35161 from mtaufen/mike-klet-cmount-node-e2e
Automatic merge from submit-queue

e2e node plumbing and bundling for GCI mounter

**Note:** The code in this PR only bundles the mounter and modifies `--mounter-path` if it can find `cluster/gce/gci/mounter` in the K8s source dir when building the test bundle.

This bundles the mounter script for GCI with the node e2e tests and allows the `--mounter-path` to be passed to the Kubelet via the node test framework. The node test runner will detect when we are running on a remote GCI node and add the appropriate `--mounter-path` to the `testArgs`. 

It also includes a simple node test that mounts a tmpfs volume. This will exercise the Kubelet's mounter code path. 

**ITEM OF NOTE:** To get the k8s root dir (in order to copy the mount script into the tarball), I changed `getK8sRootDir` -> `GetK8sRootDir` in `test/e2e_node/build/build.go`. Based on the comment above that function (and the fact that it was private to begin with), I'm not sure this is the best way to do things:
```
// TODO: Dedup / merge this with comparable utilities in e2e/util.go
```
On the other hand, the `e2e/util.go` file mentioned in that comment doesn't exist anymore. This should be resolved before this PR is merged.
2016-10-24 14:22:57 -07:00
Tim St. Clair 7656ffeec6
Log failed containers after summary e2e test 2016-10-24 13:47:16 -07:00
Kubernetes Submit Queue 4f072f7a06 Merge pull request #35401 from Random-Liu/add-containervm-cri-test
Automatic merge from submit-queue

CRI: Add cri test on containervm.

As is discussed with @yujuhong, we need to validate cri on containervm.

@yujuhong @feiskyer 
/cc @kubernetes/sig-node
2016-10-24 13:38:48 -07:00
Michael Taufen c339c97583 Simple mount test 2016-10-24 05:50:24 -07:00
Michael Taufen 6fdb20480f Bundle GCI mounter w/ node tests and plumb --mounter-path through test args to the Kubelet for node tests 2016-10-24 05:50:24 -07:00
Random-Liu 3922dd6667 Add cri test on containervm. 2016-10-23 19:43:57 -07:00
Jan Chaloupka 4fde09d308 Replace client with clientset in code 2016-10-23 22:00:35 +02:00
Mike Danese ee15c80de2 disable bazel build 2016-10-22 15:50:06 -07:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Maisem Ali d3163c93f4 Updating the GCI image to gci-dev-55-8872-18-0. 2016-10-20 15:59:08 -07:00
Kubernetes Submit Queue 333c045429 Merge pull request #34998 from timstclair/sysdisk
Automatic merge from submit-queue

Don't report FS stats for system containers in the Kubelet Summary API

Fixes https://github.com/kubernetes/kubernetes/issues/31999
2016-10-20 00:07:56 -07:00
gmarek f08f751831 Use clientset in GetReadySchedulableNodesOrDie 2016-10-19 15:55:39 +02:00
Kubernetes Submit Queue 743e0689cc Merge pull request #35079 from dchen1107/test1
Automatic merge from submit-queue

Collect resource usage test with 0 pod for node benchmark.
2016-10-18 22:36:31 -07:00
Clayton Coleman 957c0955aa
Run defaulting on the scheduler startup 2016-10-18 21:07:35 -04:00
Dawn Chen 6339b32a93 Collect resource usage test with 0 pod for node benchmark. 2016-10-18 17:38:09 -07:00
Kubernetes Submit Queue 88c11d4b79 Merge pull request #34995 from timstclair/summary-test
Automatic merge from submit-queue

Delete old summary test

Revert https://github.com/kubernetes/kubernetes/pull/33779 now that the new test is moved out of [Flaky] (https://github.com/kubernetes/kubernetes/pull/34631)
2016-10-18 02:00:45 -07:00
Tim St. Clair bd80da5822
Don't report FS stats for system containers 2016-10-17 16:57:17 -07:00
Tim St. Clair fa455126fc
Delete old summary test 2016-10-17 16:43:13 -07:00
Kubernetes Submit Queue bf132c82a6 Merge pull request #34631 from timstclair/summary-test
Automatic merge from submit-queue

Move Summary test out of [Flaky]

Test has been stable for a week.
2016-10-14 03:08:37 -07:00
Random-Liu bab971d002 Fix the wait for pod success in test framework. 2016-10-12 11:33:40 -07:00
Tim St. Clair f4fc84f337 Move Summary test out of [Flaky] 2016-10-12 11:17:27 -07:00
Kubernetes Submit Queue c50af358e8 Merge pull request #34473 from DirectXMan12/feature/set-image-id-manifest-digest
Automatic merge from submit-queue

Kubelet: Use RepoDigest for ImageID when available

```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```

Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead.  Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.

This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)

Related to #32159
2016-10-11 00:33:25 -07:00
Kubernetes Submit Queue a1f1e88f44 Merge pull request #34344 from timstclair/summary-test
Automatic merge from submit-queue

Run flaky tests in parallel

We should try to emulate the main CI environment in the flaky test suite so that it is clear when a test can be moved out of the flaky suite. Since a common source of flakes is unintended interactions between tests running in parallel, we should run the flaky suite in parallel to better detect such flakes.
2016-10-10 21:12:39 -07:00
Solly Ross 135f87dc15 Kubelet: Use RepoDigest for ImageID when available
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead.  Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.

This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
2016-10-10 15:16:58 -04:00
Wojciech Tyczynski 77371c3bf4 Revert "Kubelet: Use RepoDigest for ImageID when available" 2016-10-08 10:19:22 +02:00
Kubernetes Submit Queue c02db86f2e Merge pull request #34306 from Random-Liu/fix-cri-presubmit-test
Automatic merge from submit-queue

CRI: Fix cri pre-submit test.

Fixes https://github.com/kubernetes/kubernetes/pull/33988#issuecomment-252134276.
We should use `k8s-jkns-pr-node-e2e` instead of `k8s-jkns-ci-node-e2e` for presubmit test.

@yujuhong @feiskyer
2016-10-07 22:29:43 -07:00
Kubernetes Submit Queue 8bcb85685e Merge pull request #34156 from adityakali/gci
Automatic merge from submit-queue

Update GCI_VERSION to gci-dev-55-8866-0-0

Update GCI base image:

Change log:
* Built-in kubernetes updated to v1.4.0
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools
* OpenSSL CVE fixes

```release-note
Update GCI base image:
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools (ebtools)
* OpenSSL CVE fixes
```

cc/ @kubernetes/goog-image cc/ @dchen1107
2016-10-07 16:35:20 -07:00
Kubernetes Submit Queue 0623f5aab5 Merge pull request #34350 from kubernetes/revert-26501-scheduler
Automatic merge from submit-queue

Revert "Add kubelet awareness to taint tolerant match caculator."

Reverts kubernetes/kubernetes#26501

Original PR was not fully reviewed by @kubernetes/sig-node 

cc/ @timothysc @resouer
2016-10-07 14:42:12 -07:00
Kubernetes Submit Queue c23346f391 Merge pull request #33014 from DirectXMan12/feature/set-image-id-manifest-digest
Automatic merge from submit-queue

Kubelet: Use RepoDigest for ImageID when available

**Release note**:
```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```

Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead.  Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.

This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)

Related to #32159
2016-10-07 12:48:32 -07:00
David Oppenheimer cd4e08e7ec Revert "Add kubelet awareness to taint tolerant match caculator." 2016-10-07 12:10:55 -07:00
Kubernetes Submit Queue 21188cadeb Merge pull request #26501 from resouer/scheduler
Automatic merge from submit-queue

Add kubelet awareness to taint tolerant match caculator.

Add kubelet awareness to taint tolerant match caculator.

Ref: #25320

This is required by `TaintEffectNoScheduleNoAdmit` & `TaintEffectNoScheduleNoAdmitNoExecute `, so that node will know if it should expect the taint&tolerant
2016-10-07 12:05:35 -07:00
Tim St. Clair e859e9bbb7
Run flaky tests in parallel 2016-10-07 10:31:32 -07:00
Random-Liu a477879e58 Fix cri pre-submit test. 2016-10-07 00:48:22 -07:00
Aditya Kali 11397e0f6d Update GCI_VERSION to gci-dev-55-8866-0-0
Changelog:
* Built-in kubernetes updated to v1.4.0
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools
* OpenSSL CVE fixes
2016-10-06 15:43:29 -07:00
Kubernetes Submit Queue a98850bf30 Merge pull request #34055 from freehan/revert-add-network-node-e2e
Automatic merge from submit-queue

Revert "Revert "move pod networking tests common""

Reverts #34011

And fix the problem causing `Granular Checks: Services [Slow] should update nodePort` tests to fail
2016-10-05 23:25:09 -07:00
Kubernetes Submit Queue 0554f58489 Merge pull request #34141 from Random-Liu/add-cri-node-serial-and-benchmark
Automatic merge from submit-queue

CRI: Add serial and benchmark test suite.

For https://github.com/kubernetes/kubernetes/issues/31459.

The serial test result will be shown on test-grid.
The benchmark test result will be shown [node-perf-dash](http://node-perf-dash.k8s.io/#/builds)

This PR also changes the cri validation test to use the same gci image with node e2e instead of the canary image. The docker version is still 1.11.2.

@yujuhong @feiskyer @yifan-gu 
/cc @kubernetes/sig-node
2016-10-05 18:35:24 -07:00
Kubernetes Submit Queue 9aac6cfdb6 Merge pull request #34068 from Random-Liu/add-cri-presubmit-test
Automatic merge from submit-queue

CRI: Add presubmit CRI validation test.

For #31459.

This PR adds a new suite for CRI presubmit validation which runs non-flaky, non-serial, non-slow test per-pr.

Except this PR, I'll also change the test-infra side. Ideally, after this is done, we should be be able to trigger CRI validation test per-pr with something like `@k8s-bot cri node e2e test this` and `@k8s-bot cri e2e test this`.

@yujuhong @feiskyer @yifan-gu @freehan 
/cc @kubernetes/sig-node
2016-10-05 15:38:25 -07:00
Random-Liu 6a4a6e5b6c Add presubmit CRI validation test. 2016-10-05 14:05:05 -07:00
Random-Liu b7b95ccb19 Add serial and benchmark test suite. 2016-10-05 12:03:32 -07:00
Minhan Xia df92825c33 Revert "Revert "move pod networking tests common"" 2016-10-05 10:53:22 -07:00
Kubernetes Submit Queue 0ae65b2ff6 Merge pull request #34026 from timstclair/summary-test
Automatic merge from submit-queue

Tweak summary test memory expectations

To handle recent flakes of the summary test (https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e)
2016-10-05 00:43:37 -07:00
Kubernetes Submit Queue 49e9a90762 Merge pull request #33779 from timstclair/summary-test-matchers
Automatic merge from submit-queue

Add the original summary test back
2016-10-04 19:07:59 -07:00
Solly Ross 01b0b5ed70 Kubelet: Use RepoDigest for ImageID when available
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead.  Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.

This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
2016-10-04 20:41:53 -04:00
bprashanth 99957d2ae1 Add netexec 1.7 to whitelists 2016-10-04 14:47:33 -07:00
Tim St. Clair 7b9b0ae297
Tweak summary test memory expectations 2016-10-04 10:17:19 -07:00
Marek Grabowski b7d76023c9 Revert "move pod networking tests common" 2016-10-04 14:22:55 +02:00
Kubernetes Submit Queue b74f0fc480 Merge pull request #33795 from freehan/add-network-node-e2e
Automatic merge from submit-queue

move pod networking tests common

This allows pod networking tests to run in both e2e and node e2e
2016-10-04 02:09:24 -07:00
Kubernetes Submit Queue 4929880a21 Merge pull request #33788 from timstclair/summary-test
Automatic merge from submit-queue

Fix summary test

Issue was comparing an `unversioned.Time` rather than `time.Time`. I temporarily removed the `[Flaky]` tag so the PR builder will run the test. I will revert that change before submitting.
2016-10-03 13:30:23 -07:00
Minhan Xia 5b8e16d255 move pod networking tests common 2016-10-03 10:00:36 -07:00
Harry Zhang c2cf5bbaf6 Setup e2e test for no admit 2016-10-01 01:07:18 -04:00
Kubernetes Submit Queue 186a4a06c6 Merge pull request #33778 from timstclair/summary-arm
Automatic merge from submit-queue

Fix summary_test.go ARM build

Fixes https://github.com/kubernetes/kubernetes/issues/33761

/cc @ixdy @luxas
2016-09-29 22:22:03 -07:00
Tim St. Clair efd21ea982
Fix time matcher in summary test 2016-09-29 13:57:31 -07:00
Tim St. Clair 50c8545a37
Add the original summary test back 2016-09-29 13:14:08 -07:00
Tim St. Clair e2b7424ee0
Fix summary_test.go ARM build 2016-09-29 11:46:23 -07:00
Kubernetes Submit Queue a771063928 Merge pull request #33695 from Random-Liu/add-flaky-node-e2e-suite
Automatic merge from submit-queue

Node E2E: Add node e2e flaky suite.

Addresses #33692.

I've tested locally, this should work.
2016-09-28 21:22:40 -07:00
Kubernetes Submit Queue 7dcae5edd8 Merge pull request #25260 from duglin/minion
Automatic merge from submit-queue

Change minion to node

Continuation of #1111

I tried to keep this PR down to just a simple search-n-replace to keep
things simple.  I may have gone too far in some spots but its easy to
roll those back if needed - just let me know.

I avoided renaming `contrib/mesos/pkg/minion` because there's already
a `contrib/mesos/pkg/node` dir and fixing that will require a bit of work
due to a circular import chain that pops up. So I'm saving that for a
follow-on PR.

Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-09-28 20:08:59 -07:00
Random-Liu 55dc9311ed Add node e2e flaky suite. 2016-09-28 15:23:43 -07:00
Tim St. Clair d4aeaedba0
Rewrite summary stats test to validate metrics 2016-09-28 13:44:37 -07:00
Doug Davis 9d5bac6330 Change minion to node
Contination of #1111

I tried to keep this PR down to just a simple search-n-replace to keep
things simple.  I may have gone too far in some spots but its easy to
roll those back if needed.

I avoided renaming `contrib/mesos/pkg/minion` because there's already
a `contrib/mesos/pkg/node` dir and fixing that will require a bit of work
due to a circular import chain that pops up. So I'm saving that for a
follow-on PR.

I rolled back some of this from a previous commit because it just got
to big/messy. Will follow up with additional PRs

Signed-off-by: Doug Davis <dug@us.ibm.com>
2016-09-28 10:53:30 -07:00
Kubernetes Submit Queue 19a2a10354 Merge pull request #33389 from Random-Liu/lifecycle-hook
Automatic merge from submit-queue

CRI: Fix lifecycle hook and add container lifecycle node e2e test

This PR:
1) Adds pod spec missing handling in kuberuntime. (1st commit)
2) Adds container lifecycle hook node e2e test. (2nd commit)

@yujuhong @feiskyer
2016-09-26 10:48:35 -07:00
Kubernetes Submit Queue 66d67ee41d Merge pull request #33178 from k82cn/remove_unused_var
Automatic merge from submit-queue

Removed unused var.
2016-09-25 21:30:59 -07:00
Kubernetes Submit Queue 1654fd4041 Merge pull request #33105 from k82cn/k8s_33091
Automatic merge from submit-queue

Fixed e2e_node build error on Mac.

fixes #33091 

cc @vishh , any suggestion to avoid duplicated codes?
2016-09-24 22:02:13 -07:00
Klaus Ma 849400abf9 Fix build error on Mac. 2016-09-25 11:15:42 +08:00
Random-Liu f501acebab Add the monitorParent option when starting services, and set
monitorParent to false when stop-services=false.
2016-09-24 19:45:19 -07:00
Random-Liu 0d3befd7ea Split services.go into services.go, internal_services.go and server.go. 2016-09-24 19:45:19 -07:00
Random-Liu 5eb41e9acb Add container lifecycle hook test. 2016-09-23 17:13:19 -07:00
Kubernetes Submit Queue 071927a59d Merge pull request #32549 from smarterclayton/gc_non_kube_legacy
Automatic merge from submit-queue

Allow garbage collection to work against different API prefixes

The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.

Allows OpenShift to use the GC
2016-09-23 14:06:35 -07:00
Kubernetes Submit Queue 76d15d193d Merge pull request #33236 from dchen1107/test1
Automatic merge from submit-queue

Fix node performance benchmark by using latest containervm image (docker 1.11.2)

Also add two more tests for resource tracking. 

cc/ @Random-Liu @coufon
2016-09-23 04:50:36 -07:00
Kubernetes Submit Queue 1f7e79afbf Merge pull request #33066 from Random-Liu/set-docker-client-version
Automatic merge from submit-queue

Add docker client version.

Addressed https://github.com/kubernetes/kubernetes/issues/29478#issuecomment-248197665.

This partially reverted #31540, because currently we are really trying to connect to docker daemon when creating the client.

This PR updated docker client with real docker apiversion with `UpdateClientVersion`, so that the version related logic of engine-api can work properly, such as https://github.com/docker/engine-api/pull/174/files.

@yujuhong @feiskyer
2016-09-22 19:09:14 -07:00
Clayton Coleman 97c35fcc67
Allow garbage collection to work against different API prefixes
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
2016-09-22 15:00:58 -04:00
Kubernetes Submit Queue 34c61bdba6 Merge pull request #33201 from Random-Liu/disk-eviction-recover-images
Automatic merge from submit-queue

Node E2E: Change the disk eviction test to pull images again after the test.

Fixes https://github.com/kubernetes/kubernetes/issues/32022#issuecomment-248677706.

This PR changes the disk eviction test to pull test images again in `AfterEach`, because images may be evicted during the test.

@yujuhong 
/cc @kubernetes/sig-node
2016-09-22 10:20:42 -07:00
Dawn Chen 3a5ce7f3cd Add resource tracking with 0 pods and 35 pods to node performance benchmark. 2016-09-22 09:22:56 -07:00
Dawn Chen 33343dc4e2 Node performance benchmark test using the latest containervm image. 2016-09-22 09:22:56 -07:00
Kubernetes Submit Queue db07433782 Merge pull request #33063 from pmorie/node-e2e
Automatic merge from submit-queue

Make node E2E tests more transparent

Add some logging and minor code reorg to make the node E2E tests a little more transparent and understandable.
2016-09-22 08:22:11 -07:00
Kubernetes Submit Queue 03c698ce44 Merge pull request #33194 from dchen1107/master
Automatic merge from submit-queue

Update the containervm image to the latest one (container-v1-3-v20160…

Node e2e is running with old containervm image which only has docker 1.9.1. This pr fixed such issue.
2016-09-21 20:40:02 -07:00
Random-Liu fcfe4264fe Change the disk eviction test to pull images again after the test. 2016-09-21 15:54:03 -07:00
Dawn Chen f1f16fe03a Update the containervm image to the latest one (container-v1-3-v20160604). 2016-09-21 10:24:22 -07:00
Klaus Ma 10e880684f Removed unused var. 2016-09-21 23:28:15 +08:00
Paul Morie 3539993ee0 Make node E2E tests more transparent 2016-09-20 21:55:41 -04:00
Kubernetes Submit Queue 0986a01f4f Merge pull request #33131 from Random-Liu/fix-node-e2e-for-cri
Automatic merge from submit-queue

Fix the properties file for node e2e cri validation.

I fixed this locally before, but accidentally missed in the PR. Sorry about that.

This time, I've tried myself, it should work.

@yujuhong
2016-09-20 17:09:30 -07:00
Kubernetes Submit Queue 6fd94968e1 Merge pull request #32738 from Amey-D/gci-version-v1.4
Automatic merge from submit-queue

Bump up GCI version.

```release-note
   Upgrading Container-VM base image for k8s on GCE. Brief changelog as follows:
    - Fixed performance regression in veth device driver
    - Docker and related binaries are statically linked
    - Fixed the issue of systemd being oom-killable
```

Fixes #32596

This needs a cherrypick into v1.4 release branch because it is fixing v1.4 release blocking issues. This patch is easy and safe to rollback in case of emergencies.

@vishh can you please review?

Fixes #32596 and many other issues.
cc/ @kubernetes/goog-image  FYI
2016-09-20 16:30:01 -07:00
Random-Liu 87d62d50ee Fix the properties file for node e2e cri validation. 2016-09-20 15:04:55 -07:00
Amey Deshpande 5da8486758 Bump up GCI version.
Brief changelog compared to gci-dev-54-8743-3-0:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
- Updated built-in kubelet version to 1.3.7
- add ethtool and ebtables binaries expected by kubelet

Fixes #32596
2016-09-20 13:59:31 -07:00
Random-Liu ae031634e4 Add CRI Validation test. The test run non-flaky, non-serial test against
Kubernetes HEAD and docker v1.11.2 with CRI enabled.
2016-09-20 12:18:07 -07:00
Kubernetes Submit Queue c21fdc71a3 Merge pull request #32986 from Random-Liu/add-image-white-list
Automatic merge from submit-queue

Node E2E: Add image white list

This is part of #29081. Fixes #29155.

As is discussed with @yujuhong in #29155, it is difficult to maintain the prepull image list if it is not enforced. 

This PR added an image white list in the test framework, only images in the white list could be used in the test. If the image is not in the white list, the test will fail with reason:
```
Image "XXX" is not in the white list, consider adding it to CommonImageWhiteList in test/e2e/common/util.go or NodeImageWhiteList in test/e2e_node/image_list.go
```

Notice that if image pull policy is `PullAlways`, the image is not necessary to be in the white list or prepulled, because the test expects the image to be pulled during the test.

Currently, the image white list is only enabled in node e2e, because the image puller in e2e test is not integrated with the image white list yet.

/cc @kubernetes/sig-node
2016-09-20 07:28:58 -07:00
Random-Liu 08d74f33f6 Add client version. 2016-09-19 21:27:00 -07:00
Random-Liu ed411c9042 Add image white list, images in white list will be prepulled, and
only images in white list could be used in the test. Currently only
enabled in node e2e test.
2016-09-19 14:39:23 -07:00
Random-Liu dfcbdae178 Add image pull retry in image pulling test. 2016-09-19 14:18:37 -07:00
Kubernetes Submit Queue 3aa72fa480 Merge pull request #32926 from kubernetes/revert-32841-revert-32251-fix-oom-policy
Automatic merge from submit-queue

[kubelet] Fix oom-score-adj policy in kubelet

Fixes #32238 

We have been having this regression since v1.3. It is critical for GKE/GCE deployments of k8s because docker daemon has a high likelihood of being OOM killed which will end up nuking all containers. 
The reason for moving from mnt to pid is that docker daemon moves itself into a new mnt namespace with systemd based deployments.
2016-09-17 13:00:20 -07:00
Paul Morie 88acffcda1 Fix error message around gcloud calls in node e2e and gubernator 2016-09-17 01:05:20 -04:00
Vish Kannan a1fe3adbc7 Revert "Revert "[kubelet] Fix oom-score-adj policy in kubelet"" 2016-09-16 16:32:58 -07:00
Kubernetes Submit Queue d69cdce704 Merge pull request #32820 from coufon/change_collector_log
Automatic merge from submit-queue

change the error log for empty resource usage

This PR changes the error log for empty resource usage buffer for a container to be more clear. It happens when the container name is wrong, or cAdvisor somehow does not response.
2016-09-15 23:54:34 -07:00
Vish Kannan 492ca3bc9c Revert "[kubelet] Fix oom-score-adj policy in kubelet" 2016-09-15 19:28:59 -07:00
Kubernetes Submit Queue fcc97f37ee Merge pull request #32718 from mikedanese/mv-informer
Automatic merge from submit-queue

move informer and controller to pkg/client/cache

@kubernetes/sig-api-machinery
2016-09-15 16:44:30 -07:00
Zhou Fang 3e16eb5082 change the error log for empty resource usage 2016-09-15 14:13:25 -07:00
Mike Danese a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Vishnu kannan e4acad7afb Fix oom-score-adj policy in kubelet.
Docker daemon and kubelet needs to be protected by setting oom-score-adj to -999.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-14 11:56:10 -07:00
Kubernetes Submit Queue 312acd9e30 Merge pull request #32342 from coufon/get_image_machine_info_from_apiserver
Automatic merge from submit-queue

Get image and machine info from apiserver in node e2e test

This PR changes node e2e test to get image and machine information from API server instead of pass them from Jenkins test framework. The original format to pass image and machine info is naming the test node as "machine-image-uuid", which is hard to parse because "-" occurs a lot in both machine and image names.

Now we add two labels "image" and "machine" into performance data. The machine type has the format "cpu:1core,memory:3.6GB".

This PR is based on #32250.
2016-09-14 03:34:45 -07:00
Zhou Fang b47e22d013 endable DynamicKubeletConfig in benchmark test properties 2016-09-13 13:59:19 -07:00
Zhou Fang a683eb0418 get image and machine info from api server instead of passing from test
# Please enter the commit message for your changes. Lines starting
2016-09-13 08:41:29 -07:00
Kubernetes Submit Queue 0ca6506850 Merge pull request #32250 from coufon/increase_qps
Automatic merge from submit-queue

Add node e2e density test using 60 QPS for benchmark

This PR adds a new benchmark node e2e density test which sets Kubelet API QPS limit from default 5 to 60, through ConfigMap. 

The latency caused by API QPS limit is as large as ~30% when creating a large batch of pods (e.g. 105). It makes the pod startup latency, as well creation throughput underestimated. This test helps us to know the real performance of Kubelet core.
2016-09-12 20:27:11 -07:00
Michael Taufen 28db03869b Fix memory eviction test parameters. Those parameters should NOT have come through in b9f0bd95 2016-09-12 12:01:53 -07:00
Zhou Fang a6500cc74a change benchmark configration file to add QPS60 tests 2016-09-12 11:46:38 -07:00
Kubernetes Submit Queue af325ee7bf Merge pull request #31797 from aveshagarwal/master-dapi-volume-tests-image-update
Automatic merge from submit-queue

Update container image version for downward api volume tests

Some tests were using 0.7, and some were using 0.6, so updating all to 0.7.
@kubernetes/rh-cluster-infra
2016-09-12 01:22:27 -07:00
Kubernetes Submit Queue 469698a803 Merge pull request #32169 from ixdy/node-e2e-flake
Automatic merge from submit-queue

Make error more useful when failing to list node e2e images

To help investigate https://github.com/kubernetes/kubernetes/issues/31694 if it happens again.
2016-09-11 05:07:00 -07:00
Kubernetes Submit Queue 51d996e5d7 Merge pull request #32003 from Random-Liu/change-docker-validation-config-file
Automatic merge from submit-queue

Automated Docker Validation: Change wrong name in perf config.

The config key `containervm-density*` is improper, remove it.

/cc @coufon
2016-09-10 17:58:23 -07:00
Kubernetes Submit Queue 09efe0457d Merge pull request #32163 from mtaufen/more-eviction-logging
Automatic merge from submit-queue

Log pressure condition, memory usage, events in memory eviction test

I want to log this to help us debug some of the latest memory eviction test flakes, where we are seeing burstable "fail" before the besteffort. I saw (in the logs) attempts by the eviction manager to evict besteffort a while before burstable phase changed to "Failed", but the besteffort's phase appeared to remain "Running". I want to see the pressure condition interleaved with the pod phases to get a sense of the eviction manager's knowledge vs. pod phase.
2016-09-09 18:37:55 -07:00
Michael Taufen b9f0bd959e Log the following items in memory eviction test:
- memory working set
- pressure condition
- events for the default and test namespaces, after the test completes
2016-09-09 13:42:26 -07:00
Kubernetes Submit Queue e317af87cc Merge pull request #31819 from mtaufen/plumb-feature-gates
Automatic merge from submit-queue

Plumb --feature-gates from TEST_ARGS to components in node e2e tests

This means you can set `TEST_ARGS` on the command line, in a `.properties` config for a Jenkins job, etc, to toggle gated features. For example:

`TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'`

/cc @vishh @jlowdermilk
2016-09-09 12:31:00 -07:00
Zhou Fang a1d2f43e0e add benchmark test using 60 QPS 2016-09-07 17:51:51 -07:00
Kubernetes Submit Queue 0bd0d5571a Merge pull request #31540 from mtaufen/DockerOrDieRename
Automatic merge from submit-queue

Rename ConnectToDockerOrDie to CreateDockerClientOrDie

This function does not actually attempt to connect to the docker daemon, it just creates a client object that can be used to do so later. The old name was confusing, as it implied that a failure to touch the docker daemon could cause program termination (rather than just a failure to create the client).
2016-09-07 15:27:41 -07:00
Jeff Grafton 1e0cbbf451 Make error more useful when failing to list node e2e images 2016-09-06 17:16:53 -07:00
Minhan Xia 1e88c99e3e bump cni 2016-09-06 10:48:36 -07:00
Kubernetes Submit Queue 2a7d0df30d Merge pull request #30727 from asalkeld/iptables-caps
Automatic merge from submit-queue

Clean up IPTables caps i.e.: sed -i "s/Iptables/IPTables/g"

Fixes #30651
2016-09-06 09:01:27 -07:00
Kubernetes Submit Queue 631fda19a1 Merge pull request #31769 from Random-Liu/wrong-log-permission-bit
Automatic merge from submit-queue

Node E2E: Fix wrong permission bit for log file.

When creating log for logs from journald, we use `0755` which is weird to me.
This PR changes it to `0666`.
2016-09-06 01:24:12 -07:00
Random-Liu 35cabc370e Fix wrong permission bit for log file. 2016-09-02 14:05:18 -07:00
Vishnu kannan 59e14cf55b Increase logging level for e2e node services
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-02 14:05:09 -07:00
Random-Liu d36f7220da Change wrong name in perf config. 2016-09-02 13:48:10 -07:00
Kubernetes Submit Queue 0cdaf1028e Merge pull request #31912 from mtaufen/eviction-aftereach
Automatic merge from submit-queue

Move wait for pressure to subside to AfterEach 

so we still wait if the test part of the test for eviction order fails.
2016-09-02 13:15:36 -07:00
Zhou Fang bb0a0c0fe9 increase resource usage limit in node e2e test 2016-09-02 09:12:40 -07:00
Avesh Agarwal 4ba39b4722 Update mounttest container image version to 0.7 everywhere. 2016-09-02 09:03:11 -04:00
Kubernetes Submit Queue ca98482da9 Merge pull request #31945 from Random-Liu/prepull-common-test-image
Automatic merge from submit-queue

Add images used in common test into prepull image list.

Addresses https://github.com/kubernetes/kubernetes/issues/31774#issuecomment-243800830.

Fixes #31774, #31579.

This PR added all images in common test into the node e2e prepull list.
Mark P2 because this help get rid of image pulling flake.

@yujuhong
2016-09-02 01:36:55 -07:00
Random-Liu 76b35cf387 Add images used in common test into prepull image list. 2016-09-01 21:20:49 -07:00
Random-Liu 5420138378 Add automated docker performance validation. 2016-09-01 19:18:20 -07:00
Michael Taufen 3ad9a1e423 Move wait for pressure to subside to AfterEach so we still wait if the test for eviction order fails 2016-09-01 14:39:43 -07:00
Kubernetes Submit Queue d3270bce71 Merge pull request #31566 from ronnielai/container-gc
Automatic merge from submit-queue

Avoid disk eviction node e2e test using up all the disk space
2016-09-01 14:19:25 -07:00
Amey Deshpande 6a2201f410 Pick a specific GCI version by default on GCE.
Prior to this change, a K8s branch (master as well as release) was
pinned to a GCI milestone.  It would pick up the latest GCI release on
that milestone at the time of cluster creation.  The rationale was the
K8s users would automatically get the bug fixes in newer versions of
GCI.  However in practice, it makes the runtime environment
non-deterministic, and lack of continuous e2e tests mean we would run
into breakages sooner or later.

With this change, each K8s release will pick a specific version
of GCI by default (similar to how the Debian-based container-vm gets used).
Users can override the default version through KUBE_GCE_MASTER_IMAGE and
KUBE_GCE_NODE_IMAGE environment variables.

We expect the default GCI version will be updated relatively frequently stay
updated with newer GCI releases.  We can also automate the process to
automatically bump the hard-coded GCI version in future.
2016-08-31 17:26:00 -07:00
Michael Taufen af0a0c6367 Enable dynamic kubelet configuration for node e2e Jenkins serial tests
This commit enables the dynamic kubelet configuration feature for the
node e2e Jenkins serial tests, which is where the test for dynamic kubelet
configuration currently runs.
2016-08-31 13:54:57 -07:00
Michael Taufen a40b2cbe10 feature-gates flag plumbing for node e2e tests
This gives the node e2e test binary a --feature-gates flag that populates a
FeatureGates field on the test context. The value of this field is forwarded
to the kubelet's --feature-gates flag and is also used to populate the global
DefaultFeatureGate object so that statically-linked components see the same
feature gate settings as provided via the flag.

This means that you can set feature gates via the TEST_ARGS environment
variable when running node e2e tests. For example:

TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'
2016-08-31 13:52:08 -07:00
Michael Taufen b3e9875fcc Wait before trying to start a new pod after the eviction test
This should stop the test from flaking while we figure out why there is
a mismatch between the reported pressure condition and the eviction
manager's decision to evict due to memory pressure.
2016-08-31 10:42:20 -07:00
Kubernetes Submit Queue 3b404bd213 Merge pull request #31651 from Random-Liu/move-host-info-around-test-result
Automatic merge from submit-queue

Node E2E: Move host info around test result.

Discussed offline with @yujuhong and @dchen1107. Currently, the node e2e result is organized as:
```
================================================================
Success Finished Host tmp-node-e2e-b6c375c7-e2e-node-containervm-v20160321-image Test Suite
{ginkgo-output}
{framework-error}
================================================================
```
This makes it painful to find which image the test is failing on. The `{ginkgo-output}` is usually quite long, so we have to scroll mouse up and down to find the host name.
This PR changes the test result to:
```
================================================================
Start Host tmp-node-e2e-b6c375c7-e2e-node-containervm-v20160321-image Test Suite
{ginkgo-output}
Success Finished Host tmp-node-e2e-b6c375c7-e2e-node-containervm-v20160321-image Test Suite
{framework-error}
================================================================
```
This is not perfect, but much better than before. We can easily find the host name under the ginkgo test result, like this:
```
================================================================
Start Host test-gci-dev-54-8743-3-0 Test Suite
Running Suite: E2eNode Suite
============================
Random Seed: 1472511489 - Will randomize all specs
Will run 0 of 131 specs

Running in parallel across 8 nodes

I0829 22:58:13.727764    1143 e2e_node_suite_test.go:98] Pre-pulling images so that they are cached for the tests.
I0829 22:58:28.562459    1143 e2e_node_suite_test.go:111] Node services started.  Running tests...
I0829 22:58:28.562477    1143 e2e_node_suite_test.go:116] Wait for the node to be ready

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
I0829 22:58:29.742596    1143 e2e_node_suite_test.go:136] Stopping node services...
I0829 22:58:29.742650    1143 services.go:673] Killing process 1423 (services) with -TERM
I0829 22:58:29.860893    1143 e2e_node_suite_test.go:141] Tests Finished


Ran 0 of 131 Specs in 16.185 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 131 Skipped 

Ginkgo ran 1 suite in 19.939034297s
Test Suite Passed

Success Finished Host test-gci-dev-54-8743-3-0 Test Suite
================================================================
```

In a following PR, I'll print the test result from different images into different files to make it more clear for debugging. Mark v1.4 because this helps us de-flake test.

/cc @kubernetes/sig-node
2016-08-30 23:08:41 -07:00
Random-Liu 2907d4019d Move host info around test result. 2016-08-30 21:31:04 -07:00
Random-Liu e7a1b4e16f Do not set stop-services=false for node e2e and add more logs. 2016-08-30 17:52:23 -07:00
Kubernetes Submit Queue 2b755dc480 Merge pull request #31716 from coufon/explicitly_delete_pods_in_node_perf_test
Automatic merge from submit-queue

Explicitly delete pods in node performance tests

This PR explicitly deletes all created pods at the end in node e2e performance related tests.

The large number of pods may cause namespace cleanup times out (in #30878), therefore we explicitly delete all pods for cleaning up.
2016-08-30 16:52:03 -07:00
Kubernetes Submit Queue 09e3fb355b Merge pull request #31629 from rmmh/fix-gubernator-line
Automatic merge from submit-queue

Only print "running gubernator.sh" when actually running it.
2016-08-30 16:10:18 -07:00
Zhou Fang 0167f74c6c explicitly delete pods in node perf tests 2016-08-30 15:18:37 -07:00
Kubernetes Submit Queue 12429e1690 Merge pull request #31664 from coufon/fix_perf_test_limit
Automatic merge from submit-queue

increase latency and resource limit accroding to test results

This PR increases the latency limit of node e2e density test according to previous test results.

Fixed #30878
2016-08-30 09:57:19 -07:00
Kubernetes Submit Queue cca6d7ddd9 Merge pull request #31634 from timstclair/gubernator
Automatic merge from submit-queue

Cleanup node failure message

Fix missing newline
2016-08-30 00:53:15 -07:00
Kubernetes Submit Queue 4ae5770162 Merge pull request #31370 from mtaufen/dynkubetest-update
Automatic merge from submit-queue

Capitalize feature name in test for dynamic kubelet configuration
2016-08-30 00:53:10 -07:00
Kubernetes Submit Queue 22fe385c13 Merge pull request #31653 from euank/kill-the-updates-for-the-nth-time
Automatic merge from submit-queue

test/node-e2e: Update CoreOS update disabling

Previously in this saga... #25004

This disables update-engine and locksmithd with ignition instead of
cloud-init so that they're really totally 100% disabled. Our ignition guy promises.

Pretty much every way of disabling them with cloud-init is mildly racy.

Fixes #31633 

I think @vishh can say "I told you so" after the comment on https://github.com/kubernetes/kubernetes/pull/30023#discussion-diff-73431324 .. he was right, but it turns out "stop" there doesn't really work either because of the mess that is cloud-init. Fortunately, converting our cloud-init to json and calling it "ignition" works quite well 😄 

Testing done: I ssh'd in and verified that yes, they're disabled. I didn't wait on the e2e tests to pass, so we'll let this PR check that.
2016-08-30 00:08:45 -07:00
Zhou Fang 483655a312 increase latency and resource limit accroding to test results 2016-08-29 19:05:52 -07:00