Commit Graph

75417 Commits (1133f24b7af06b03eac7fef1ba6db9d5baab0c5a)

Author SHA1 Message Date
Kubernetes Prow Robot ff61314dc3
Merge pull request #74478 from smarterclayton/mount_tmpfs
Ignore the sticky setgid bit when a test is running on memory EmptyDir
2019-02-25 15:15:41 -08:00
Kubernetes Prow Robot b22da83307
Merge pull request #74473 from vanduc95/cleanup-kubeadm-cont.4-20190223
kubeadm cleanup: master -> control-plane (cont.4)
2019-02-25 15:15:30 -08:00
Kubernetes Prow Robot 3814176d42
Merge pull request #74455 from SataQiu/fix-shell-2019022302
Fix shellcheck lint errors in cluster and hack scripts
2019-02-25 15:15:19 -08:00
Kubernetes Prow Robot 77cf7c7b86
Merge pull request #73703 from rphillips/fixes/kubelet_file_fsnotify
kubelet: upgrade sourceFile to use fsnotify
2019-02-25 15:15:08 -08:00
Vy Ta 2869c67076 Windows-linux connectivity 2019-02-25 14:42:02 -08:00
Yu-Ju Hong b863655faa GCE: switch to using e2eteam/pause:3.1 for pause containers
Stop building pause images on node startup.
2019-02-25 14:36:49 -08:00
Kubernetes Prow Robot 2aacb77374
Merge pull request #74444 from pjh/gce-windows-no-defender
Disable Windows Defender on Windows nodes.
2019-02-25 13:54:42 -08:00
Kubernetes Prow Robot a778f409ba
Merge pull request #74385 from SataQiu/fix-shell-20190222
Fix some shellcheck failures in hack
2019-02-25 13:54:32 -08:00
Kubernetes Prow Robot fb92681882
Merge pull request #74370 from oomichi/issue/74326
Register openstack provider for e2e test
2019-02-25 13:54:21 -08:00
Kubernetes Prow Robot 0813567660
Merge pull request #74349 from mattjmcnaughton/mattjmcnaughton/fix-shellcheck-in-hack
Fix shellcheck for hack/verify-generated-*
2019-02-25 13:54:10 -08:00
Kubernetes Prow Robot 86a3caee35
Merge pull request #74085 from vyta/e2e-tests/win-volumes
Add readonly volume tests for windows
2019-02-25 13:54:00 -08:00
Aaron Crickenberger e563402701 Fix test-cmd kubectl_run flake
It is unrealistic to expect a cascading delete to immediately take
effect. Somehow this test got away with it for a while, but we
have finally reached a point where apiserver performance has changed
just enough to expose this flaky expectation.
2019-02-25 13:51:48 -08:00
Hemant Kumar 7a46b30a7a Allow cinder volume limits to be configurable 2019-02-25 16:09:24 -05:00
Kenichi Omichi 215dee7dd2 Fix golint under test/e2e/framework/ingress 2019-02-25 20:55:03 +00:00
Bob Killen e137f4702a
Fix shellcheck lint errors in test/kubemark/stop-kubemark.sh 2019-02-25 15:21:32 -05:00
Bob Killen b8aae458a1
Fix shellcheck lint errors in test/kubemark/start-kubemark.sh 2019-02-25 15:21:31 -05:00
Bob Killen 46333a01b4
Fix shellcheck lint errors in test/kubemark/run-e2e-tests.sh 2019-02-25 15:21:31 -05:00
Bob Killen adf4bf1741
Fix shellcheck lint errors in test/kubemark/resources/start-kubemark-master.sh 2019-02-25 15:21:31 -05:00
Bob Killen f72ac1f5b7
Fix shellcheck lint errors in test/kubemark/master-log-dump.sh 2019-02-25 15:21:31 -05:00
Bob Killen cb59cb33ff
Fix shellcheck lint errors in test/kubemark/iks/util.sh 2019-02-25 15:21:30 -05:00
Bob Killen e3e2a96521
Fix shellcheck lint errors in test/kubemark/iks/startup.sh 2019-02-25 15:21:30 -05:00
Bob Killen 6310305d3b
Fix shellcheck lint errors in test/kubemark/iks/shutdown.sh 2019-02-25 15:21:30 -05:00
Bob Killen 186b83fe5b
Fix shellcheck lint errors in test/kubemark/gce/util.sh 2019-02-25 15:21:30 -05:00
Bob Killen 5f4b919887
Fix shellcheck lint errors in test/kubemark/common/util.sh 2019-02-25 15:21:29 -05:00
Bob Killen 9a4f4878f5
Fix shellcheck lint errors in cluster/kubemark/util.sh 2019-02-25 15:21:29 -05:00
Bob Killen 9a58913e8f
Fix shellcheck lint errors in cluster/kubemark/iks/config-default.sh 2019-02-25 15:21:25 -05:00
Bob Killen ce4c85e3fd
Fix shellcheck lint errors in cluster/kubemark/gce/config-default.sh 2019-02-25 14:55:01 -05:00
Bob Killen b538f18c0e
Add color-coding to kubemark scripts. 2019-02-25 14:53:40 -05:00
Bobby (Babak) Salamat 66cf8c8982 generated files 2019-02-25 11:45:38 -08:00
Bobby (Babak) Salamat 304244e4ae Mark scheduling/v1alpha1 and scheduling/v1beta1 API deprecated by scheduling/v1 2019-02-25 11:45:38 -08:00
Kubernetes Prow Robot 1eb2acca99
Merge pull request #74248 from danielqsj/pdep
Update vendor prometheus/common/... to latest release
2019-02-25 11:33:43 -08:00
Kubernetes Prow Robot a826e80ca7
Merge pull request #74096 from oomichi/cleanup-e2e-framework-ingress
Remove unused GetDefaultBackendNodePort()
2019-02-25 11:33:32 -08:00
Kubernetes Prow Robot 35a258d640
Merge pull request #73272 from danielqsj/juju
fix shellcheck in cluster/juju
2019-02-25 11:33:21 -08:00
Kubernetes Prow Robot f288678cfa
Merge pull request #73261 from danielqsj/local
fix shellcheck in cluster/local
2019-02-25 11:33:11 -08:00
Kubernetes Prow Robot d0f79fcf73
Merge pull request #72440 from ajatprabha/issue_34059
annotate errors in apps/job e2e tests
2019-02-25 11:33:00 -08:00
Jean Rouge f1bdfa93f9 Review comments
Signed-off-by: Jean Rouge <rougej+github@gmail.com>
2019-02-25 10:59:23 -08:00
Vy Ta 59987e7410 update bazel 2019-02-25 10:22:03 -08:00
Kubernetes Prow Robot 81e7858ece
Merge pull request #74501 from RA489/fixptrtofunction
Refactor etcd client functions that have the same signature in etcd.go
2019-02-25 09:56:37 -08:00
Vy Ta 585426f85f External connectivity test 2019-02-25 09:49:40 -08:00
Kubernetes Prow Robot 3b11f95810
Merge pull request #72827 from errordeveloper/drain-pkg
Refactor most of `kubectl drain` as a library
2019-02-25 06:06:36 -08:00
Kevin Wiesmüller a7d414817f fix bazel 2019-02-25 14:53:24 +01:00
Davanum Srinivas 5d13f6f776
Remove support for containerized-kubelet in local-up-cluster.sh
Change-Id: I3435b02fbe052a88f6b88d5517de2d68ff636a66
2019-02-25 08:53:14 -05:00
Florent Delannoy e627474e8f Fix fluentd-gcp addon liveness probe
Fix three issues with the fluentd-gcp liveness probe:

h1. STUCK_THRESHOLD_SECONDS was overridden by LIVENESS_THRESHOLD_SECONDS
if defined

Probably a copy/paste issue introduced in edf1ffc074
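
As a minimal sketch of the override (the variable names `buggy` and `fixed` are illustrative only and do not appear in the probe script), the default of 900 is never used once `LIVENESS_THRESHOLD_SECONDS` is set, because the expansion reads the wrong variable:

```sh
#!/bin/sh
# Illustrative demo only; this is not the actual liveness probe script.
LIVENESS_THRESHOLD_SECONDS=300
unset STUCK_THRESHOLD_SECONDS

# Buggy form: the default is taken from LIVENESS_THRESHOLD_SECONDS,
# so 900 only applies when that variable is unset or empty.
buggy=${LIVENESS_THRESHOLD_SECONDS:-900}

# Fixed form: the default is taken from STUCK_THRESHOLD_SECONDS itself.
fixed=${STUCK_THRESHOLD_SECONDS:-900}

echo "buggy=$buggy fixed=$fixed"
```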

h1. `[[` is [a bashism](https://stackoverflow.com/a/47576482), and will always fail when called with `/bin/sh`

Introduced by a844523c20

Given that we call the liveness probe with `/bin/sh`, we cannot use the
double-bracketed `[[` syntax for test, as it is not POSIX-compliant and
will throw an error.

Annoyingly, even though it prints an error, `sh` returns with exit code 0
in this case:

```bash
root@fluentd-7mprs:/# sh liveness.sh
liveness.sh: 8: liveness.sh: [[: not found
liveness.sh: 15: liveness.sh: [[: not found
root@fluentd-7mprs:/# echo $?
0
```

Which means the liveness probe is considered successful by Kubernetes,
despite failing to test things as it was intended. This is also
probably the reason why this bug wasn't reported sooner :)

Thankfully, the test in this case can just as easily be written as
POSIX-compliant as it doesn't use any bash-specific features within the
`[[` block.
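
A minimal sketch of such a POSIX-compliant test, assuming the probe only needs to know whether the `find` output is empty (the helper name `probe_status` is hypothetical and not taken from the real `liveness.sh`):

```sh
#!/bin/sh
# Hypothetical helper; not part of the real liveness.sh.
# Prints "stuck" when the string passed as $1 is empty and "alive"
# otherwise, using only the POSIX `[` test instead of `[[`.
probe_status() {
  if [ -z "$1" ]; then
    echo "stuck"
  else
    echo "alive"
  fi
}

probe_status ""
probe_status "/var/log/fluentd-buffers/kubernetes.system.buffer"
```

Quoting `"$1"` matters here: unlike `[[`, the single-bracket test is subject to word splitting, so an unquoted empty value would change the number of arguments passed to `[`.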

h1. Buffers are transient and cannot be relied upon for monitoring

Finally, after fixing the above issue, we started seeing the fluentd
containers being restarted very often, and found an issue with the
underlying logic of the liveness probe.

The probe checks that the pod is still alive by running the following
command:

`find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit`

This checks if any _regular_ file exists under `/var/log/fluentd-buffers`
that is more recent than a predetermined time, and will return an empty
string otherwise.

The issue is that these buffers are temporary and volatile: they get created and
deleted constantly. Here is an example of running that check every second on a
running fluentd:

```
root@fluentd-eks-playground-jdc8m:/# LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300};
root@fluentd-eks-playground-jdc8m:/# STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900};
root@fluentd-eks-playground-jdc8m:/# touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck;
root@fluentd-eks-playground-jdc8m:/# touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness;
root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done
Fri Feb 22 10:52:57 UTC 2019
Fri Feb 22 10:52:58 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log
Fri Feb 22 10:52:59 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log
Fri Feb 22 10:53:00 UTC 2019
Fri Feb 22 10:53:01 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log
Fri Feb 22 10:53:02 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log
Fri Feb 22 10:53:03 UTC 2019
Fri Feb 22 10:53:04 UTC 2019
Fri Feb 22 10:53:05 UTC 2019
Fri Feb 22 10:53:06 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:07 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:08 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log
Fri Feb 22 10:53:09 UTC 2019
Fri Feb 22 10:53:10 UTC 2019
Fri Feb 22 10:53:11 UTC 2019
Fri Feb 22 10:53:12 UTC 2019
Fri Feb 22 10:53:13 UTC 2019
Fri Feb 22 10:53:14 UTC 2019
Fri Feb 22 10:53:15 UTC 2019
Fri Feb 22 10:53:16 UTC 2019
```

We can see buffers being created, then disappearing. The LivenessProbe running
under these conditions has a ~50% chance of failing, despite fluentd being
perfectly happy.

I believe that check is probably ok for fluentd installs using large
amounts of buffers, in which case the liveness probe will be correct more
often than not, but fluentd installs that use buffering less intensively
will be negatively impacted by this.

My solution to fix this is to check the last updated time of buffering
_folders_ within `/var/log/fluentd-buffers`. These _do_ get updated when
buffers are created, and do not get deleted as buffers are emptied,
making them the perfect candidate for our use.
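
A rough sketch of that directory-based check, under the assumption that a single `find` call is enough (the function name `check_buffers` and the temporary paths are illustrative, and `-quit` is a `find` extension rather than POSIX):

```sh
#!/bin/sh
# Hypothetical helper; not the actual probe implementation.
# check_buffers DIR MARKER succeeds when at least one directory under
# DIR has a modification time newer than the MARKER file.
check_buffers() {
  [ -n "$(find "$1" -type d -newer "$2" -print -quit)" ]
}

# Demo in a throwaway directory: the marker is dated in the past,
# so the freshly created buffer directory counts as newer.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/buffers/kubernetes.system.buffer"
touch -t 200001010000 "$demo_dir/marker-stuck"

if check_buffers "$demo_dir/buffers" "$demo_dir/marker-stuck"; then
  echo "alive"
else
  echo "stuck"
fi
rm -rf "$demo_dir"
```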

Here's an example with `-type d` to match directories:
```
root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type d -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done
Fri Feb 22 10:57:51 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:52 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:53 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:54 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:55 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:56 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:57 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:58 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:57:59 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:00 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:01 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:02 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
Fri Feb 22 10:58:03 UTC 2019
/var/log/fluentd-buffers/kubernetes.system.buffer
```

An example of the directory being updated as new buffers come in:
```
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 0
drwxr-xr-x 2 root root  6 Feb 22 11:17 .
drwxr-xr-x 3 root root 38 Feb 22 11:14 ..
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 16K
drwxr-xr-x 2 root root  224 Feb 22 11:18 .
drwxr-xr-x 3 root root   38 Feb 22 11:14 ..
-rw-r--r-- 1 root root 1.8K Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log
-rw-r--r-- 1 root root  215 Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log.meta
-rw-r--r-- 1 root root  429 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log
-rw-r--r-- 1 root root  195 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log.meta
root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer
total 0
drwxr-xr-x 2 root root  6 Feb 22 11:18 .
drwxr-xr-x 3 root root 38 Feb 22 11:14 ..
```
2019-02-25 11:48:31 +00:00
Adam Harrison c9dd2a2a45 kubectl run --quiet suppresses deletion messages
The `--quiet` option should prevent kubectl run from polluting the
output from an attached container - make it apply to the resource
deletion messages caused by `--rm`.
2019-02-25 11:10:07 +00:00
SataQiu 09ba08f8f4 fix some golint failures for pkg/apis/... 2019-02-25 18:06:08 +08:00
André Bauer 2bd6d3dc12 use image version 6.6.1
Signed-off-by: André Bauer <monotek23@gmail.com>
2019-02-25 11:05:52 +01:00
André Bauer 2d15ffc9cc updated to 6.5.2
Signed-off-by: André Bauer <monotek23@gmail.com>
2019-02-25 10:56:50 +01:00
André Bauer 0c29ea1a2e Update es-statefulset.yaml 2019-02-25 10:55:23 +01:00
André Bauer 53a936c359 Update Makefile 2019-02-25 10:55:23 +01:00
André Bauer 0e44fa6359 updated elasticsearch to 6.5.0 2019-02-25 10:55:23 +01:00