github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Kubernetes Prow Robot	2d20b57594	Merge pull request #77832 from anfernee/release-1.14 Bump ip-masq-agent version to v2.3.0	2019-06-20 17:40:36 -07:00
Yongkun Gui	d55560f29a	Bump ip-masq-agent version to v2.3.0	2019-05-13 12:04:31 -07:00
Steve Coffman	68cff866d1	Update k8s-dns-node-cache image version This revised image resolves kubernetes dns#292 by updating the image from `k8s-dns-node-cache:1.15.2` to `k8s-dns-node-cache:1.15.2`	2019-05-13 11:55:57 -07:00
Marek Siarkowicz	65b0557d31	Pick up security patches for fluentd-gcp-scaler by upgrading to version 0.5.2	2019-05-02 16:08:23 +02:00
Marek Siarkowicz	a98d10ba0b	Restore metrics-server use of IP addresses This preference list is used to pick the preferred field from the k8s node object. It was introduced in metrics-server 0.3 and changed the default behaviour to use DNS instead of IP addresses. It was merged into k8s 1.12 and caused a breaking change by introducing a dependency on DNS configuration.	2019-04-19 16:55:38 +02:00
Zhen Wang	412a892706	Use Node-Problem-Detector v0.6.3 on GCI	2019-04-08 13:16:28 -07:00
Marek Siarkowicz	93388c9fc2	Update gcp images with security patches [stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes. [fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes. [fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes. [fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes. [metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.	2019-03-29 13:31:44 +01:00
Kubernetes Prow Robot	d778b9308a	Merge pull request #75063 from wangzhen127/npd-test-fix Fix NPD e2e test on Ubuntu node and update NPD container version	2019-03-08 14:19:09 -08:00
Tim Allclair	63f61a6714	Migrate RuntimeClass to internal API	2019-03-07 11:07:54 -08:00
Zhen Wang	f4d9e7d992	Fix NPD e2e test on Ubuntu node and update NPD container version	2019-03-06 22:42:47 -08:00
Kubernetes Prow Robot	45e5f6053b	Merge pull request #74424 from liggitt/drop-k8s-io-node-labels Clean up self-set node labels	2019-03-06 08:24:26 -08:00
Kubernetes Prow Robot	95cd1d59e4	Merge pull request #74209 from monotek/fluentd-helm-readme added production note about EFK stack to the readme	2019-03-04 17:55:12 -08:00
Kubernetes Prow Robot	ccf33be0cc	Merge pull request #73940 from jiayingz/nvidia-dp-update Update nvidia-gpu-device-plugin addon.	2019-02-27 17:13:01 -08:00
Kubernetes Prow Robot	1942c1ccb0	Merge pull request #71251 from monotek/kibana updated kibana to 6.6.1	2019-02-26 23:40:33 -08:00
Kubernetes Prow Robot	7a4496429d	Merge pull request #71252 from monotek/elasticsearch updated elasticsearch to 6.6.1	2019-02-26 09:33:44 -08:00
Jordan Liggitt	0174e043c5	Prepare switch from beta.kubernetes.io/masq-agent-ds-ready to node.kubernetes.io/masq-agent-ds-ready	2019-02-26 11:43:10 -05:00
Jordan Liggitt	943b32a289	Prepare switch from beta.kubernetes.io/kube-proxy-ds-ready to node.kubernetes.io/kube-proxy-ds-ready	2019-02-26 11:42:23 -05:00
Jordan Liggitt	d6664a2365	Prepare switch from beta.kubernetes.io/metadata-proxy-ready to cloud.google.com/metadata-proxy-ready	2019-02-26 11:42:23 -05:00
Jordan Liggitt	8975233788	Finish migration of fluentd to daemonset	2019-02-26 11:42:23 -05:00
André Bauer	9e2d9cfbb0	changed es image repo Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-26 09:09:21 +01:00
Florent Delannoy	e627474e8f	Fix fluentd-gcp addon liveness probe Fix three issues with the fluentd-gcp liveness probe: h1. STUCK_THRESHOLD_SECONDS was overridden by LIVENESS_THRESHOLD_SECONDS if defined Probably a copy/paste issue introduced in `edf1ffc074` h1. `[[` is [a bashism](https://stackoverflow.com/a/47576482), and will always fail when called with `/bin/sh` Introduced by `a844523c20` Given that we call the liveness probe with `/bin/sh`, we cannot use the double-bracketed `[[` syntax for test, as it is not POSIX-compliant and will throw an error. Annoyingly, even though it prints an error, `sh` returns with exit code 0 in this case: ```bash root@fluentd-7mprs:/# sh liveness.sh liveness.sh: 8: liveness.sh: [[: not found liveness.sh: 15: liveness.sh: [[: not found root@fluentd-7mprs:/# echo $? 0 ``` Which means the liveness probe is considered successful by Kubernetes, despite failing to test things as it was intended. This is also probably the reason why this bug wasn't reported sooner :) Thankfully, the test in this case can just as easily be written as POSIX-compliant as it doesn't use any bash-specific features within the `[[` block. h1. Buffers are transient and cannot be relied upon for monitoring Finally, after fixing the above issue, we started seeing the fluentd containers being restarted very often, and found an issue with the underlying logic of the liveness probe. The probe checks that the pod is still alive by running the following command: `find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit` This checks if any _regular_ file exists under `/var/log/fluentd-buffers` that is more recent than a predetermined time, and will return an empty string otherwise. The issue is that these buffers are temporary and volatile, they get created and deleted constantly. 
Here is an example of running that check every second on a running fluentd: ``` root@fluentd-eks-playground-jdc8m:/# LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300}; root@fluentd-eks-playground-jdc8m:/# STUCK_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-900}; root@fluentd-eks-playground-jdc8m:/# touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck; root@fluentd-eks-playground-jdc8m:/# touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness; root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type f -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done Fri Feb 22 10:52:57 UTC 2019 Fri Feb 22 10:52:58 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log Fri Feb 22 10:52:59 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964ccf4c7004103c3fa7c8533f85.log Fri Feb 22 10:53:00 UTC 2019 Fri Feb 22 10:53:01 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log Fri Feb 22 10:53:02 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827964fb8b2eedcccd2763ea7775cc2.log Fri Feb 22 10:53:03 UTC 2019 Fri Feb 22 10:53:04 UTC 2019 Fri Feb 22 10:53:05 UTC 2019 Fri Feb 22 10:53:06 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log Fri Feb 22 10:53:07 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log Fri Feb 22 10:53:08 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer/buffer.b5827965564883997b673d703af54848b.log Fri Feb 22 10:53:09 UTC 2019 Fri Feb 22 10:53:10 UTC 2019 Fri Feb 22 10:53:11 UTC 2019 Fri Feb 22 10:53:12 UTC 2019 Fri Feb 22 10:53:13 UTC 2019 Fri Feb 22 10:53:14 UTC 2019 Fri Feb 22 10:53:15 UTC 2019 Fri Feb 22 10:53:16 UTC 2019 ``` We can see buffers being created, then disappearing. 
The LivenessProbe running under these conditions has a ~50% chance of failing, despite fluentd being perfectly happy. I believe that check is probably ok for fluentd installs using large amounts of buffers, in which case the liveness probe will be correct more often than not, but fluentd installs that use buffering less intensively will be negatively impacted by this. My solution to fix this is to check the last updated time of buffering _folders_ within `/var/log/fluentd-buffers`. These _do_ get updated when buffers are created, and do not get deleted as buffers are emptied, making them the perfect candidate for our use. Here's an example with the `-type d` test for directories: ``` root@fluentd-eks-playground-jdc8m:/# while true; do date ; find /var/log/fluentd-buffers -type d -newer /tmp/marker-stuck -print -quit ; sleep 1 ; done Fri Feb 22 10:57:51 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:52 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:53 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:54 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:55 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:56 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:57 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:58 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:57:59 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:58:00 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:58:01 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:58:02 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer Fri Feb 22 10:58:03 UTC 2019 /var/log/fluentd-buffers/kubernetes.system.buffer ``` An example of the directory being updated as new buffers come in: ``` root@fluentd-eks-playground-jdc8m:/# ls -lah 
/var/log/fluentd-buffers/kubernetes.system.buffer total 0 drwxr-xr-x 2 root root 6 Feb 22 11:17 . drwxr-xr-x 3 root root 38 Feb 22 11:14 .. root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer total 16K drwxr-xr-x 2 root root 224 Feb 22 11:18 . drwxr-xr-x 3 root root 38 Feb 22 11:14 .. -rw-r--r-- 1 root root 1.8K Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log -rw-r--r-- 1 root root 215 Feb 22 11:18 buffer.b58279be6e21e8b29fc333a7d50096ed0.log.meta -rw-r--r-- 1 root root 429 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log -rw-r--r-- 1 root root 195 Feb 22 11:18 buffer.b58279be6f09bdfe047a96486a525ece2.log.meta root@fluentd-eks-playground-jdc8m:/# ls -lah /var/log/fluentd-buffers/kubernetes.system.buffer total 0 drwxr-xr-x 2 root root 6 Feb 22 11:18 . drwxr-xr-x 3 root root 38 Feb 22 11:14 .. ```	2019-02-25 11:48:31 +00:00
André Bauer	2bd6d3dc12	use image version 6.6.1 Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-25 11:05:52 +01:00
André Bauer	2d15ffc9cc	updated to 6.5.2 Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-25 10:56:50 +01:00
André Bauer	0c29ea1a2e	Update es-statefulset.yaml	2019-02-25 10:55:23 +01:00
André Bauer	53a936c359	Update Makefile	2019-02-25 10:55:23 +01:00
André Bauer	0e44fa6359	updated elasticsearch to 6.5.0	2019-02-25 10:55:23 +01:00
André Bauer	fc850b5ecd	fixed wording Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-25 10:49:43 +01:00
André Bauer	421fcd8262	added production note to readme Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-25 10:47:26 +01:00
Xiang Dai	36065c6dd7	delete all duplicate empty blanks Signed-off-by: Xiang Dai <764524258@qq.com>	2019-02-23 10:28:04 +08:00
Kubernetes Prow Robot	743f864310	Merge pull request #73819 from coffeepac/move-fluentd-es-images Move fluentd es images	2019-02-22 17:58:12 -08:00
Patrick Christopher	1bd45ba6eb	review updates	2019-02-22 10:00:10 -08:00
Kubernetes Prow Robot	125dc6c8ea	Merge pull request #74187 from xichengliudui/fixgolint0218 Fix shellcheck lint errors in cluster/addons/fluentd-elasticsearch/fluentd-es-image/run.sh	2019-02-21 20:51:13 -08:00
Kubernetes Prow Robot	042f9ed3af	Merge pull request #74093 from blakebarnett/lower-neg-cache-ttl Lowers the default nodelocaldns denial cache TTL	2019-02-21 17:47:16 -08:00
Blake	46c299c1b1	Match default cache size of 10000 https://github.com/coredns/coredns/blob/master/plugin/cache/cache.go#L236 This gets rounded down to the nearest multiple of 256: 9984	2019-02-21 15:03:30 -08:00
xichengliudui	053332ad46	Fix shellcheck lint errors in cluster/addons/fluentd-elasticsearch/fluentd-es-image/run.sh update pull request update pull request update pull request update pull request update pull request	2019-02-21 02:00:48 -05:00
Kubernetes Prow Robot	7b203c6809	Merge pull request #74137 from rajansandeep/readinessprobe Add readinessProbe to CoreDNS	2019-02-19 16:24:04 -08:00
Kubernetes Prow Robot	cbf45eea13	Merge pull request #74138 from rramkumar1/ingress-docs-fix Update docs for Ingress-GCE related cluster addon	2019-02-19 15:05:50 -08:00
Sandeep Rajan	37c3d68a91	Add readinessProbe	2019-02-19 10:14:12 -05:00
Rohit Ramkumar	a50752ceb7	Update docs for Ingress-GCE related cluster addon	2019-02-18 13:17:01 -08:00
André Bauer	d82d5fda35	updated kibana to 6.6.0 Signed-off-by: André Bauer <monotek23@gmail.com>	2019-02-18 11:00:02 +01:00
André Bauer	fa859e4644	Merge branch 'master' into kibana	2019-02-18 10:58:49 +01:00
Kubernetes Prow Robot	a22763b24e	Merge pull request #74063 from huynq0911/fix_wrong_format_yaml_influxdb Fix incorrect influxdb yaml file	2019-02-15 16:46:18 -08:00
Ben Moss	34ac4d9ee9	Update deprecated links	2019-02-15 09:13:07 -05:00
Nguyen Quang Huy	ac8466444c	Fix incorrect influxdb yaml file Remove redundant attribute in container declaration	2019-02-14 14:26:05 +07:00
Blake	e51c9025ac	Lowers the default nodelocaldns denial cache TTL Similar to `--no-negcache` on dnsmasq, this prevents issues with tools that poll DNS for orchestration, such as operators with StatefulSets. It can also be very confusing for users when negative caching makes a change they just made seem broken until the cache expires. This assumes that 5 seconds is reasonable and will still catch repeated AAAA negative responses. We could also set the denial cache size to zero, which should effectively disable it entirely like dnsmasq in kube-dns, but testing shows this approach seems to work well in our (albeit small) test clusters.	2019-02-13 13:23:53 -08:00
Jeff Grafton	e216995ef1	Update repo-infra, bazel-skylib, rules_docker, and rules_go dependencies Also require bazel 0.18.0+	2019-02-12 17:55:10 -08:00
Kubernetes Prow Robot	aa00afe231	Merge pull request #73649 from ojmhetar/coredns-priorityclass Add priority class to CoreDNS pods	2019-02-11 22:55:45 -08:00
Jiaying Zhang	52e92ab4b9	Update nvidia-gpu-device-plugin addon. This includes changes from GoogleCloudPlatform/container-engine-accelerators#102	2019-02-11 15:52:33 -08:00
Kubernetes Prow Robot	b50c643be0	Merge pull request #73540 from rlenferink/patch-5 Updated OWNERS files to include link to docs	2019-02-08 09:05:56 -08:00
patc	0e219f4caa	boilerplate fix	2019-02-07 21:12:46 -08:00
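
Note on e627474e8f: the liveness-probe fix described in that commit message can be sketched in POSIX shell roughly as follows. This is a minimal illustration under stated assumptions, not the addon's actual probe: the function name `buffer_dir_updated_since` is hypothetical, and the 300 and 900 second defaults are taken from the shell session quoted in the commit message.

```shell
#!/bin/sh
# Sketch only: names below are illustrative and not taken from the addon manifest.

# Each threshold reads its own variable, which avoids the copy/paste
# override described as the first issue in the commit message.
LIVENESS_THRESHOLD_SECONDS="${LIVENESS_THRESHOLD_SECONDS:-300}"
STUCK_THRESHOLD_SECONDS="${STUCK_THRESHOLD_SECONDS:-900}"

# Succeeds if any directory under $1 has been modified more recently
# than the marker file $2. Uses the POSIX `[` test rather than the
# bash-only `[[`, and checks directories (-type d) instead of the
# transient buffer files. Note that -quit is a GNU/BSD find extension.
buffer_dir_updated_since() {
  [ -n "$(find "$1" -type d -newer "$2" -print -quit)" ]
}
```

A probe built on this would create the marker with a timestamp `STUCK_THRESHOLD_SECONDS` in the past (the commit message uses GNU `touch -d` for that) and fail when the function returns non-zero.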


1668 Commits (1a01154162341dfff89f9526324b30f1cfccec64)
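
Note on 46c299c1b1: the commit message states that a cache size of 10000 is rounded down to the nearest multiple of 256, giving 9984. The arithmetic can be checked with a small shell helper. The function name `round_down_to_multiple` is hypothetical, and this only reproduces the calculation; it does not claim to mirror how CoreDNS implements it.

```shell
#!/bin/sh
# Integer division discards the remainder, so dividing by the step and
# multiplying back yields the largest multiple of the step that does
# not exceed the input.
round_down_to_multiple() {
  echo $(( $1 / $2 * $2 ))
}

round_down_to_multiple 10000 256   # prints 9984
```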