Automatic merge from submit-queue
Inode Eviction Test is Flaky
This Pull Request:
Marks the InodeEviciton test as flaky
Increases the timeout for disk pressure because coreos has nearly 2 million inodes.
Decreases the status polling interval so we can see eviction ordering better.
@Random-Liu
Automatic merge from submit-queue
Fix useless uuid in container log path node e2e
@timstclair pointed out there're nits in original PR, ref: https://github.com/kubernetes/kubernetes/pull/34877
So this patch:
1. removed useless uuid
2. change all those strings to const
Thanks. 🐱
Automatic merge from submit-queue
Build vendored copy of go-bindata and use that in go generate step
**What this PR does / why we need it**: as the title says, uses the vendored version of `go-bindata` rather than expecting developers to `go get` it (when building outside docker).
**Which issue this PR fixes**: fixes#34067, partially addresses #36655
**Special notes for your reviewer**: we still call `go generate` far too many times:
```console
~/.../src/k8s.io/kubernetes $ which go-bindata
~/.../src/k8s.io/kubernetes $ make
+++ [1116 17:35:28] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:29] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:30] Building go targets for linux/amd64:
cmd/libs/go2idl/deepcopy-gen
+++ [1116 17:35:35] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:35] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:36] Building go targets for linux/amd64:
cmd/libs/go2idl/defaulter-gen
+++ [1116 17:35:41] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:41] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:42] Building go targets for linux/amd64:
cmd/libs/go2idl/conversion-gen
+++ [1116 17:35:47] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:47] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:48] Building go targets for linux/amd64:
cmd/libs/go2idl/openapi-gen
+++ [1116 17:35:56] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:56] Generating bindata:
test/e2e/framework/gobindata_util.go
```
Fixing that is a separate effort, though.
cc @sebgoa @ZhangBanger
Automatic merge from submit-queue
Add the system verification test to the kubeadm preflight checks
And refactor the system verification test to accept to write to a specific writer in order to customize the output
This PR is targeting v1.5, PTAL
cc @Random-Liu @dchen1107 @kubernetes/sig-cluster-lifecycle
Automatic merge from submit-queue
move parts of the mega generic run struct out
This splits the main `ServerRunOptions` into composeable pieces that are bindable separately and adds easy paths for composing servers to run delegating authentication and authorization.
@sttts @ncdc alright, I think this is as far as I need to go to make the composing servers reasonable to write. I'll try leaving it here
Automatic merge from submit-queue
Node Conformance Test: Final cleanup for node conformance test.
This PR fits node conformance test with recent change.
* Remove `--manifest-path` because the test will get kubelet configuration through `/configz` now. https://github.com/kubernetes/kubernetes/pull/36919
* Add `$TEST_ARGS` so that we can override arguments inside the container.
* Fix a bug in garbage_collector_test.go which will cause the framework tries to connect docker no matter running the test or not. @dashpole
* Add `${REGISTRY}/node-test:${VERSION}` for convenience.
* Bump up the image version to `0.2`. (the one released with v1.4 is `v0.1`)
I've run the test both with `run_test.sh` script and directly `docker run`. Both of them passed.
After this gets merged, I'll build and push the new test image.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Modify GCI mounter to enable NFSv3
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.
After this change, need to make push the image and update the sha number in Changelog.
Automatic merge from submit-queue
Use netexec container in http lifecycle hook test.
Fixes https://github.com/kubernetes/kubernetes/issues/33636.
The original test is using `"echo -e \"HTTP/1.1 200 OK\n\" | nc -l -p 1234` as a simple http server.
However, it seems that this is not very reliable, which may response before golang thinks it should.
So we get the error:
```
I1106 06:14:13.325397 2096 logs.go:41] Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 200 OK\n\n"; err=<nil>
```
This PR changes the test to use the `netexec` container which is a simple http server written by golang and used in many of our networking e2e test. It should be more reliable.
Mark 1.5 since this is fixing a 1.5 release blocking issue. Mark P0 to match the original issue.
@dchen1107
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.
After this change, need to make push the image and update the sha number in Changelog.
Automatic merge from submit-queue
fix leaking memory backed volumes of terminated pods
Currently, we allow volumes to remain mounted on the node, even though the pod is terminated. This creates a vector for a malicious user to exhaust memory on the node by creating memory backed volumes containing large files.
This PR removes memory backed volumes (emptyDir w/ medium Memory, secrets, configmaps) of terminated pods from the node.
@saad-ali @derekwaynecarr
Automatic merge from submit-queue
Node E2E: Avoid printing test result twice.
This is a problem since long time ago.
`RunSshCommand` includes the command output to the error. If the command running the test fails, the test output will also be included in the error. [The runner prints both the test output and the error](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/runner/remote/run_remote.go#L270), which leads the test result to be printed twice. (See the [test result](https://storage.googleapis.com/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10968/build-log.txt) on node tmp-node-e2e-af900a4d-e2e-node-ubuntu-trusty-docker9-v1-image)
This PR changes `RunSshCommand` not to put command output into the error, and leave the caller to decide how to deal with command output when the command fails.
Automatic merge from submit-queue
Garbage collection tests the MaxPerPodContainers and MaxContainers constraints
This is the first version of this test. It tests that containers are garbage collected according to the default configuration.
Automatic merge from submit-queue
Node Conformance & E2E: Get node name from node object.
This PR changes the node e2e test framework to get node name from apiserver instead of test flags.
When a user tried out the node conformance test, he found that node conformance test will not work properly if kubelet is started with `hostname-override`.
The reason is that node conformance test is using [the default node name - `os.Hostname`](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L124), which may be different from `hostname-override`. This will cause test pods not scheduled, and eventually test timeout.
We can expose a flag from node conformance test, and let user set node name themselves if they are using `hostname-override` on kubelet. However, let the framework automatically detect it from apiserver is more user friendly.
/cc @kubernetes/sig-node
This PR 1) only changes node e2e test framework; 2) fixes a problem in node conformance test which is a 1.5 feature. @saad-ali Can we have this in 1.5?
Automatic merge from submit-queue
Add e2e node test for log path
fixes#34661
A node e2e test to check if container logs files are properly created with right content.
Since the log files under `/var/log/containers` are actually symbolic of docker containers log files, we can not use a pod to mount them in and do check (symbolic doesn't supported by docker volume).
cc @Random-Liu
Automatic merge from submit-queue
Make GCI nodes mount non tmpfs, ext* & bind mounts using an external mounter
This PR downloads the stage1 & gci-mounter ACIs as part of cluster bring up instead of downloading them dynamically from gcr.io, which was the cause for #36206.
I have also optimized the containerized mounter to pre-load the mounter image once to avoid fetch latency while using it.
Original PR which got reverted: https://github.com/kubernetes/kubernetes/pull/35821
```release-note
GCI nodes use an external mounter script to mount NFS & GlusterFS storage volumes
```
@mtaufen Node e2e is not re-enabled in this PR.
cc @jingxu97
Automatic merge from submit-queue
Node Conformance Test: Containerize the node e2e test
For #30122, #30174.
Based on #32427, #32454.
**Please only review the last 3 commits.**
This PR packages the node e2e test into a docker image:
- 1st commit: Add `NodeConformance` flag in the node e2e framework to avoid starting kubelet and collecting system logs. We do this because:
- There are all kinds of ways to manage kubelet and system logs, for different situation we need to mount different things into the container, run different commands. It is hard and unnecessary to handle the complexity inside the test suite.
- 2nd commit: Remove all `sudo` in the test container. We do this because:
- In most container, there is no `sudo` command, and there is no need to use `sudo` inside the container.
- It introduces some complexity to use `sudo` inside the test. (https://github.com/kubernetes/kubernetes/issues/29211, https://github.com/kubernetes/kubernetes/issues/26748) In fact we just need to run the test suite with `sudo`.
- 3rd commit: Package the test into a docker container with corresponding `Makefile` and `Dockerfile`. We also added a `run_test.sh` script to start kubelet and run the test container. The script is only for demonstration purpose and we'll also use the script in our node e2e framework. In the future, we should update the script to start kubelet in production way (maybe with `systemd` or `supervisord`).
@dchen1107 @vishh
/cc @kubernetes/sig-node @kubernetes/sig-testing
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
Release alpha version node test container gcr.io/google_containers/node-test-ARCH:0.1 for users to verify their node setup.
```
Automatic merge from submit-queue
Rename experimental-runtime-integration-type to experimental-cri
Also rename the field in the component config to `EnableCRI`
Automatic merge from submit-queue
Node Conformance Test: Add system verification
For #30122 and #29081.
This PR introduces system verification test in node e2e and conformance test. It will run before the real test. Once the system verification fails, the test will just fail. The output of the system verification is like this:
```
I0909 23:33:20.622122 2717 validators.go:45] Validating os...
OS: Linux
I0909 23:33:20.623274 2717 validators.go:45] Validating kernel...
I0909 23:33:20.624037 2717 kernel_validator.go:79] Validating kernel version
KERNEL_VERSION: 3.16.0-4-amd64
I0909 23:33:20.624146 2717 kernel_validator.go:93] Validating kernel config
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
I0909 23:33:20.679328 2717 validators.go:45] Validating cgroups...
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
I0909 23:33:20.679454 2717 validators.go:45] Validating docker...
DOCKER_GRAPH_DRIVER: aufs
```
It verifies the system following a predefined `SysSpec`:
``` go
// DefaultSysSpec is the default SysSpec.
var DefaultSysSpec = SysSpec{
OS: "Linux",
KernelVersion: []string{`3\.[1-9][0-9].*`, `4\..*`}, // Requires 3.10+ or 4+
// TODO(random-liu): Add more config
KernelConfig: KernelConfig{
Required: []string{
"NAMESPACES", "NET_NS", "PID_NS", "IPC_NS", "UTS_NS",
"CGROUPS", "CGROUP_CPUACCT", "CGROUP_DEVICE", "CGROUP_FREEZER",
"CGROUP_SCHED", "CPUSETS", "MEMCG",
},
Forbidden: []string{},
},
Cgroups: []string{"cpu", "cpuacct", "cpuset", "devices", "freezer", "memory"},
RuntimeSpec: RuntimeSpec{
DockerSpec: &DockerSpec{
Version: []string{`1\.(9|\d{2,})\..*`}, // Requires 1.9+
GraphDriver: []string{"aufs", "overlay", "devicemapper"},
},
},
}
```
Currently, it only supports:
- Kernel validation: version validation and kernel configuration validation
- Cgroup validation: validating whether required cgroups subsystems are enabled.
- Runtime Validation: currently, only validates docker graph driver.
The validating framework is ready. The specific validation items could be added over time.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Per Volume Inode Accounting
Collects volume inode stats using the same find command as cadvisor. The command is "find _path_ -xdev -printf '.' | wc -c". The output is passed to the summary api, and will be consumed by the eviction manager.
This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary api, and the eviction manager consuming this. Expect tests to fail until this happens.
DEPENDS ON #35137
Automatic merge from submit-queue
[AppArmor] Hold bad AppArmor pods in pending rather than rejecting
Fixes https://github.com/kubernetes/kubernetes/issues/32837
Overview of the fix:
If the Kubelet needs to reject a Pod for a reason that the control plane doesn't understand (e.g. which AppArmor profiles are installed on the node), then it might contiinuously try to run the pod on the same rejecting node. This change adds a concept of "soft rejection", in which the Pod is admitted, but not allowed to run (and therefore held in a pending state). This prevents the pod from being retried on other nodes, but also prevents the high churn. This is consistent with how other missing local resources (e.g. volumes) is handled.
A side effect of the change is that Pods which are not initially runnable will be retried. This is desired behavior since it avoids a race condition when a new node is brought up but the AppArmor profiles have not yet been loaded on it.
``` release-note
Pods with invalid AppArmor configurations will be held in a Pending state, rather than rejected (failed). Check the pod status message to find out why it is not running.
```
@kubernetes/sig-node @timothysc @rrati @davidopp
Automatic merge from submit-queue
Temporarily disable GCI mounter in e2e node tests
This is just so we have an off-switch ready to go if we need it. Don't merge unless we need to disable this functionality in the e2e node tests.
Automatic merge from submit-queue
[Kubelet] Use the custom mounter script for Nfs and Glusterfs only
This patch reduces the scope for the containerized mounter to NFS and GlusterFS on GCE + GCI clusters
This patch also enabled the containerized mounter on GCI nodes
Shepherding multiple PRs through the submit queue is painful. Hence I combined them into this PR. Please review each commit individually.
cc @jingxu97 @saad-ali
https://github.com/kubernetes/kubernetes/pull/35652 has also been reverted as part of this PR
Automatic merge from submit-queue
Eviction manager evicts based on inode consumption
Fixes: #32526 Integrate Cadvisor per-container inode stats into the summary api. Make the eviction manager act based on inode consumption to evict pods using the most inodes.
This PR is pending on a cadvisor godeps update which will be included in PR #35136
Automatic merge from submit-queue
Update rkt version on GCI nodes to v1.18.0
v1.18.0 avoids outputting debug information by default which happens to
pollute events and kubelet logs.
Automatic merge from submit-queue
rename kubelet flag mounter-path to experimental-mounter-path
```release-note
* Kubelet flag '--mounter-path' renamed to '--experimental-mounter-path'
```
The feature the flag controls is an experimental feature and this renaming ensures that users do not depend on this feature just yet.
Automatic merge from submit-queue
CRI: Add dockershim grpc server.
This PR adds a in-process grpc server for dockershim.
Flags change:
1. `container-runtime` will not be automatically set to remote when `container-runtime-endpoint` is set. @feiskyer
2. set kubelet flag `--experimental-runtime-integration-type=remote --container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to enable the in-process dockershim grpc server.
3. set node e2e test flag `--runtime-integration-type=remote -container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to run node e2e test against in-process dockershim grpc server.
I've run node e2e test against the remote cri integration, tests which don't rely on stream and log functions can pass.
This unblocks the following work:
1) CRI conformance test.
2) Performance comparison between in-process integration and in-process grpc integration.
@yujuhong @feiskyer
/cc @kubernetes/sig-node
Automatic merge from submit-queue
e2e node plumbing and bundling for GCI mounter
**Note:** The code in this PR only bundles the mounter and modifies `--mounter-path` if it can find `cluster/gce/gci/mounter` in the K8s source dir when building the test bundle.
This bundles the mounter script for GCI with the node e2e tests and allows the `--mounter-path` to be passed to the Kubelet via the node test framework. The node test runner will detect when we are running on a remote GCI node and add the appropriate `--mounter-path` to the `testArgs`.
It also includes a simple node test that mounts a tmpfs volume. This will exercise the Kubelet's mounter code path.
**ITEM OF NOTE:** To get the k8s root dir (in order to copy the mount script into the tarball), I changed `getK8sRootDir` -> `GetK8sRootDir` in `test/e2e_node/build/build.go`. Based on the comment above that function (and the fact that it was private to begin with), I'm not sure this is the best way to do things:
```
// TODO: Dedup / merge this with comparable utilities in e2e/util.go
```
On the other hand, the `e2e/util.go` file mentioned in that comment doesn't exist anymore. This should be resolved before this PR is merged.
Automatic merge from submit-queue
CRI: Add cri test on containervm.
As is discussed with @yujuhong, we need to validate cri on containervm.
@yujuhong @feiskyer
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Kubelet: Use RepoDigest for ImageID when available
```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Related to #32159
Automatic merge from submit-queue
Run flaky tests in parallel
We should try to emulate the main CI environment in the flaky test suite so that it is clear when a test can be moved out of the flaky suite. Since a common source of flakes is unintended interactions between tests running in parallel, we should run the flaky suite in parallel to better detect such flakes.
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Automatic merge from submit-queue
Update GCI_VERSION to gci-dev-55-8866-0-0
Update GCI base image:
Change log:
* Built-in kubernetes updated to v1.4.0
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools
* OpenSSL CVE fixes
```release-note
Update GCI base image:
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools (ebtools)
* OpenSSL CVE fixes
```
cc/ @kubernetes/goog-image cc/ @dchen1107
Automatic merge from submit-queue
Revert "Add kubelet awareness to taint tolerant match caculator."
Reverts kubernetes/kubernetes#26501
Original PR was not fully reviewed by @kubernetes/sig-node
cc/ @timothysc @resouer
Automatic merge from submit-queue
Kubelet: Use RepoDigest for ImageID when available
**Release note**:
```release-note
Use manifest digest (as `docker-pullable://`) as ImageID when available (exposes a canonical, pullable image ID for containers).
```
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Related to #32159
Automatic merge from submit-queue
Add kubelet awareness to taint tolerant match caculator.
Add kubelet awareness to taint tolerant match caculator.
Ref: #25320
This is required by `TaintEffectNoScheduleNoAdmit` & `TaintEffectNoScheduleNoAdmitNoExecute `, so that node will know if it should expect the taint&tolerant
Changelog:
* Built-in kubernetes updated to v1.4.0
* Enabled VXLAN and IP_SET config options in kernel to support some networking tools
* OpenSSL CVE fixes
Automatic merge from submit-queue
Revert "Revert "move pod networking tests common""
Reverts #34011
And fix the problem causing `Granular Checks: Services [Slow] should update nodePort` tests to fail
Automatic merge from submit-queue
CRI: Add serial and benchmark test suite.
For https://github.com/kubernetes/kubernetes/issues/31459.
The serial test result will be shown on test-grid.
The benchmark test result will be shown [node-perf-dash](http://node-perf-dash.k8s.io/#/builds)
This PR also changes the cri validation test to use the same gci image with node e2e instead of the canary image. The docker version is still 1.11.2.
@yujuhong @feiskyer @yifan-gu
/cc @kubernetes/sig-node
Automatic merge from submit-queue
CRI: Add presubmit CRI validation test.
For #31459.
This PR adds a new suite for CRI presubmit validation which runs non-flaky, non-serial, non-slow test per-pr.
Except this PR, I'll also change the test-infra side. Ideally, after this is done, we should be be able to trigger CRI validation test per-pr with something like `@k8s-bot cri node e2e test this` and `@k8s-bot cri e2e test this`.
@yujuhong @feiskyer @yifan-gu @freehan
/cc @kubernetes/sig-node
Previously, we used the docker config digest (also called "image ID"
by Docker) for the value of the `ImageID` field in the container status.
This was not particularly useful, since the config manifest is not
what's used to identify the image in a registry, which uses the manifest
digest instead. Docker 1.12+ always populates the RepoDigests field
with the manifest digests, and Docker 1.10 and 1.11 populate it when
images are pulled by digest.
This commit changes `ImageID` to point to the the manifest digest when
available, using the prefix `docker-pullable://` (instead of
`docker://`)
Automatic merge from submit-queue
Fix summary test
Issue was comparing an `unversioned.Time` rather than `time.Time`. I temporarily removed the `[Flaky]` tag so the PR builder will run the test. I will revert that change before submitting.