65 Commits (7fc21ccf92152537134d3f8a7d6eec03c7d6ac3b)

Author SHA1 Message Date
danielqsj 79a3eb816c rename latency to duration in metrics 2019-02-18 17:40:04 +08:00
danielqsj 9fd99a48f5 Change kubelet metrics to conform to guideline 2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot 289a60ad71
Merge pull request #72709 from changyaowei/pleg_relist
When pleg channel is full, discard events and record its count
2019-02-13 01:44:48 -08:00
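The "discard when full and count" behaviour named in this merge can be illustrated with a non-blocking channel send. This is a minimal sketch, not the kubelet's code: `podLifecycleEvent`, `eventSender`, and `newEventSender` are hypothetical names, and the real change records the count in a metric rather than in a plain field.

```go
package main

import "fmt"

// podLifecycleEvent is a hypothetical stand-in for the PLEG event type.
type podLifecycleEvent struct {
	PodID string
	Type  string
}

// eventSender delivers events without blocking and counts the ones it drops.
type eventSender struct {
	ch        chan podLifecycleEvent
	discarded int
}

// newEventSender creates a sender backed by a buffered channel.
func newEventSender(capacity int) *eventSender {
	return &eventSender{ch: make(chan podLifecycleEvent, capacity)}
}

// send enqueues the event if there is room. When the buffer is full it
// discards the event and increments the counter instead of blocking.
func (s *eventSender) send(e podLifecycleEvent) bool {
	select {
	case s.ch <- e:
		return true
	default:
		s.discarded++
		return false
	}
}

func main() {
	s := newEventSender(2)
	for i := 0; i < 5; i++ {
		s.send(podLifecycleEvent{PodID: fmt.Sprintf("pod-%d", i), Type: "ContainerStarted"})
	}
	fmt.Printf("queued=%d discarded=%d\n", len(s.ch), s.discarded) // queued=2 discarded=3
}
```

The `default` branch is what keeps a slow consumer from stalling the producer; the trade-off is that dropped events are lost and only the count survives.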
changyaowei 19f73899fc modify test case 2019-02-13 16:27:15 +08:00
xichengliudui 5dd26ecab5 Fix function comment to be consistent with its name
update pull request
2019-02-12 01:37:20 -05:00
changyaowei c70ee4272b delete prometheus in unit testing 2019-01-31 12:18:02 +08:00
changyaowei b52afc350f when pleg channel is full, discard events and record how many events were discarded 2019-01-30 20:43:54 +08:00
Robert Krawitz 3373fcf0fc Reduce logspam for crash looping containers 2018-11-28 10:48:52 -05:00
Davanum Srinivas 954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has an explicit InitFlags(), so we add calls to it as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
k8s-ci-robot 45f6845a59
Merge pull request #69008 from sjenning/better-pleg-msg
improve pleg error msg when it has never been successful
2018-10-30 16:15:43 -07:00
Seth Jennings 5eab76934b improve pleg error msg when it has never been successful 2018-10-01 16:41:01 -05:00
Pingan2017 158552ff35 fix golint failures - /pkg/kubelet/images 2018-09-17 10:52:25 +08:00
Jeff Grafton 23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jeff Grafton ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Lee Verberne e10042d22f Increment CRI version from v1alpha1 to v1alpha2
This also incorporates the version string into the package name so
that incompatible versions will fail to connect.

Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
  normally translates this to a go package of "runtime_v1alpha2", but
  I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
  public "runtimeapi" in kubelet/apis/cri/runtime.
2018-02-07 09:06:26 +01:00
Jeff Grafton efee0704c6 Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
Marcin Owsiany 36dc1c4515 Fix typo in function name.
Also remove a superfluous comment.
2017-10-17 11:31:46 +02:00
Jeff Grafton aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Kubernetes Submit Queue 28df7a1cae Merge pull request #47806 from dcbw/fix-pod-ip-race
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

kubelet: fix inconsistent display of terminated pod IPs

PLEG and kubelet race when reading and sending pod status to the apiserver.  PLEG
inserts status into a cache, and then signals kubelet.  Kubelet then eventually
reads the status out of that cache, but in the meantime the status could have
been changed by PLEG.

When a pod exits, pod status will no longer include the pod's IP address because
the network plugin/runtime will report "" for terminated pod IPs.  If this status
gets inserted into the PLEG cache before kubelet gets the status out of the cache,
kubelet will see a blank pod IP address.  This happens in about 1/5 of cases when
pods are short-lived, and somewhat less frequently for longer running pods.

To ensure consistency for properties of dead pods, copy an old status update's
IP address over to the new status update if (a) the new status update's IP is
missing and (b) all sandboxes of the pod are dead/not-ready (e.g., no possibility
for a valid IP from the sandbox).

Fixes: https://github.com/kubernetes/kubernetes/issues/47265
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449373

@eparis @freehan @kubernetes/rh-networking @kubernetes/sig-network-misc
2017-09-22 21:01:50 -07:00
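The two conditions in the message above, (a) the new status has no IP and (b) every sandbox is dead or not ready, can be sketched as follows. The types `sandboxStatus` and `podStatus` and the function `preserveIP` are simplified, hypothetical stand-ins and not the kubelet's actual types or function names.

```go
package main

import "fmt"

// sandboxStatus and podStatus are simplified, hypothetical stand-ins for the
// kubelet's real status types.
type sandboxStatus struct {
	Ready bool
}

type podStatus struct {
	IP        string
	Sandboxes []sandboxStatus
}

// allSandboxesNotReady reports whether no sandbox could still supply a valid IP.
func allSandboxesNotReady(s podStatus) bool {
	for _, sb := range s.Sandboxes {
		if sb.Ready {
			return false
		}
	}
	return true
}

// preserveIP copies the old IP into the updated status when (a) the updated
// status has no IP and (b) every sandbox is dead or not ready.
func preserveIP(old *podStatus, updated podStatus) podStatus {
	if old == nil || updated.IP != "" || !allSandboxesNotReady(updated) {
		return updated
	}
	updated.IP = old.IP
	return updated
}

func main() {
	old := &podStatus{IP: "10.0.0.5", Sandboxes: []sandboxStatus{{Ready: true}}}
	exited := podStatus{Sandboxes: []sandboxStatus{{Ready: false}}}
	fmt.Println(preserveIP(old, exited).IP) // prints 10.0.0.5
}
```

Condition (b) matters because a ready sandbox with an empty IP may simply not have been assigned one yet, in which case reusing a stale address would be wrong.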
Casey Davenport be5cd7fed2 Recreate pod sandbox when the sandbox does not have an IP address. 2017-09-15 09:23:52 -07:00
Jeff Grafton a7f49c906d Use buildozer to delete licenses() rules except under third_party/ 2017-08-11 09:32:39 -07:00
Jeff Grafton 33276f06be Use buildozer to remove deprecated automanaged tags 2017-08-11 09:31:50 -07:00
Dan Williams 8c16260160 kubelet: fix inconsistent display of terminated pod IPs by using events instead
PLEG and kubelet race when reading and sending pod status to the apiserver.  PLEG
inserts status into a cache, and then signals kubelet.  Kubelet then eventually
reads the status out of that cache, but in the meantime the status could have
been changed by PLEG.

When a pod exits, pod status will no longer include the pod's IP address because
the network plugin/runtime will report "" for terminated pod IPs.  If this status
gets inserted into the PLEG cache before kubelet gets the status out of the cache,
kubelet will see a blank pod IP address.  This happens in about 1/5 of cases when
pods are short-lived, and somewhat less frequently for longer running pods.

To ensure consistency for properties of dead pods, copy an old status update's
IP address over to the new status update if (a) the new status update's IP is
missing and (b) all sandboxes of the pod are dead/not-ready (e.g., no possibility
for a valid IP from the sandbox).

Fixes: https://github.com/kubernetes/kubernetes/issues/47265
2017-07-21 09:52:10 -05:00
Kubernetes Submit Queue c1f8fcd9fe Merge pull request #45496 from andyxning/fix_pleg_relist_time
Automatic merge from submit-queue

fix pleg relist time

This PR fixes the PLEG relist time. In the current implementation, a `Healthy` method periodically checks the relist time. If the time elapsed since the latest relist exceeds `relistThreshold` (3 minutes by default), it returns an error to indicate a problem with the runtime.

The `relist` method is also called periodically. If the runtime (docker) hangs, `relist` should return immediately without updating the latest relist time. If the latest relist time were updated even when the runtime (docker) hangs (the default timeout is 2 minutes), the `Healthy` method would never return an error.

```release-note
Kubelet PLEG updates the relist timestamp only after successfully relisting.
```

/cc @yujuhong @Random-Liu @dchen1107
2017-05-21 04:17:14 -07:00
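The interaction between `relist` and `Healthy` described in this PR can be sketched as below. It is an illustration under assumptions: the `pleg` struct, its `listPods` field, and the `simulate` helper are hypothetical, and only the 3-minute `relistThreshold` comes from the message.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// relistThreshold mirrors the 3-minute default mentioned in the PR text.
const relistThreshold = 3 * time.Minute

// pleg is a hypothetical, stripped-down event generator.
type pleg struct {
	lastRelist time.Time
	listPods   func() error // stands in for the call into the container runtime
}

// relist records the timestamp only after the runtime call succeeds.
func (p *pleg) relist(now time.Time) error {
	if err := p.listPods(); err != nil {
		return err // timestamp left untouched on failure
	}
	p.lastRelist = now
	return nil
}

// healthy returns an error when the last successful relist is too old.
func (p *pleg) healthy(now time.Time) error {
	if elapsed := now.Sub(p.lastRelist); elapsed > relistThreshold {
		return fmt.Errorf("last successful relist was %v ago; threshold is %v", elapsed, relistThreshold)
	}
	return nil
}

// simulate runs one successful relist, then one against a hung runtime, and
// returns the health check results observed before and after the hang.
func simulate() (beforeHang, afterHang error) {
	start := time.Date(2017, 5, 18, 0, 0, 0, 0, time.UTC)
	hung := false
	p := &pleg{lastRelist: start, listPods: func() error {
		if hung {
			return errors.New("runtime timed out")
		}
		return nil
	}}
	_ = p.relist(start.Add(1 * time.Minute))
	beforeHang = p.healthy(start.Add(2 * time.Minute))
	hung = true
	_ = p.relist(start.Add(3 * time.Minute))
	afterHang = p.healthy(start.Add(5 * time.Minute))
	return beforeHang, afterHang
}

func main() {
	before, after := simulate()
	fmt.Println("before hang:", before)
	fmt.Println("after hang:", after)
}
```

If `relist` updated `lastRelist` unconditionally, the second health check would see an elapsed time of 2 minutes and report no error, which is the bug the PR describes.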
Clayton Coleman 3e095d12b4
Refactor move of client-go/util/clock to apimachinery 2017-05-20 14:19:48 -04:00
Andy Xie af6c040630 fix pleg relist time 2017-05-18 11:40:04 +08:00
Mike Danese a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
deads2k 5a8f075197 move authoritative client-go utils out of pkg 2017-01-24 08:59:18 -05:00
deads2k c47717134b move utils used in restclient to client-go 2017-01-19 07:55:14 -05:00
Kubernetes Submit Queue 9a88687e24 Merge pull request #37865 from yujuhong/decouple_lifecycle
Automatic merge from submit-queue

kubelet: remove the pleg health check from healthz

This prevents kubelet from being killed when docker hangs.

Also, kubelet will report node not ready if PLEG hangs (`docker ps` + `docker inspect`).
2017-01-12 19:10:14 -08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Yu-Ju Hong ec0e99c2ed Check the health of PLEG when updating the node status 2017-01-10 16:34:00 -08:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Kubernetes Submit Queue b2d02bd1ab Merge pull request #31395 from yujuhong/getpods
Automatic merge from submit-queue

Instruct PLEG to detect pod sandbox state changes

This PR adds a Sandboxes list in `kubecontainer.Pod`, so that PLEG can check
sandbox changes using `GetPods()`. The sandboxes are treated as regular
containers (type `kubecontainer.Container`) for now to avoid additional
changes in PLEG.

/cc @feiskyer @yifan-gu @euank
2016-09-08 05:41:16 -07:00
Yu-Ju Hong a49d28710a Extend PLEG to handle pod sandboxes
PLEG will treat them as if they are regular containers and detect changes in the
same manner. Note that this makes an assumption that container IDs will not
collide with the podsandbox IDs.
2016-08-30 09:54:24 -07:00
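Treating sandboxes as regular containers amounts to putting both into one ID-keyed state map before comparing relist results. The sketch below uses hypothetical, simplified types (`container`, `pod`) and helper names; it is not the PLEG implementation, but it shows why the non-collision assumption is needed.

```go
package main

import (
	"fmt"
	"sort"
)

// container is a simplified stand-in used for both containers and sandboxes.
type container struct {
	ID    string
	State string
}

// pod carries regular containers plus a separate list of sandboxes.
type pod struct {
	Containers []container
	Sandboxes  []container
}

// statesByID flattens containers and sandboxes into a single map. If a sandbox
// ID equalled a container ID, one entry would silently overwrite the other,
// hence the assumption that the two ID spaces do not collide.
func statesByID(p pod) map[string]string {
	states := make(map[string]string)
	for _, c := range p.Containers {
		states[c.ID] = c.State
	}
	for _, s := range p.Sandboxes {
		states[s.ID] = s.State
	}
	return states
}

// changedIDs returns, in sorted order, every ID whose state differs between
// the old and the current view of the pod.
func changedIDs(old, cur pod) []string {
	before, after := statesByID(old), statesByID(cur)
	seen := make(map[string]bool)
	var changed []string
	for _, m := range []map[string]string{before, after} {
		for id := range m {
			if !seen[id] && before[id] != after[id] {
				changed = append(changed, id)
			}
			seen[id] = true
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	old := pod{
		Containers: []container{{ID: "c1", State: "running"}},
		Sandboxes:  []container{{ID: "sb1", State: "running"}},
	}
	cur := pod{
		Containers: []container{{ID: "c1", State: "running"}},
		Sandboxes:  []container{{ID: "sb1", State: "exited"}},
	}
	fmt.Println(changedIDs(old, cur)) // prints [sb1]
}
```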
Pengfei Ni 1c62d2c368 Kubelet: implement PodStatus for new runtime API 2016-08-25 09:36:00 +08:00
Andrey Kurilin 9f1c3a4c56 Fix various typos in kubelet 2016-08-03 01:14:44 +03:00
Michal Rostecki 59ca5986dd Print/log pointers of structs with %#v instead of %+v
There are many places in k8s where %+v is used to format a pointer
to struct, which isn't working as expected.

Fixes #26591
2016-08-01 22:27:56 +02:00
Harry Zhang cb14b35bde Refactor util clock into its own pkg 2016-07-28 02:29:04 -04:00
Davanum Srinivas 2b0ed014b7 Use Go canonical import paths
Add canonical imports only in existing doc.go files.
https://golang.org/doc/go1.4#canonicalimports

Fixes #29014
2016-07-16 13:48:21 -04:00
Ron Lai a58c774c08 Including ContainerRemoved in PLEG event reporting 2016-07-14 16:39:03 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Dan Williams 9865ac325c kubelet/cni: make cni plugin runtime agnostic
Use the generic runtime method to get the netns path.  Also
move reading the container IP address into cni (based off kubenet)
instead of having it in the Docker manager code.  Both old and new
methods use nsenter and /sbin/ip and should be functionally
equivalent.
2016-06-22 11:36:10 -05:00
Tim Hockin 817abc3213 Kill our atomic pkg, now that 1.6 is req'd 2016-05-08 20:30:37 -07:00
Andy Goldstein 3a87bfb6f7 PLEG: reinspect pods that failed prior inspections
Fix the following sequence of events:

1. relist call 1 successfully inspects a pod (just has infra container)
2. relist call 2 gets an error inspecting the same pod (has infra container and a transient
   container that failed to create) and doesn't update the old/new pod records
3. relist calls 3+ don't inspect the pod any more (just has infra container so it doesn't look like
   anything changed)

This change adds a new list that keeps track of pods that failed inspection and retries them the
next time relist is called. Without this change, a pod in this state would never be inspected again,
its entry in the status cache would never be updated, and the pod worker would never call syncPod
again because the most recent entry in the status cache has an error associated with it. Without
this change, pods in this state would be stuck Terminating forever, unless the user issued a
deletion with a grace period value of 0.
2016-05-03 11:06:35 -04:00
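The retry list described in this commit can be sketched as a set of pod IDs that is rebuilt on every relist. The names `relister`, `needsReinspection`, and `failFirstCall` are hypothetical, and the sketch omits the status cache and event generation that the real PLEG performs.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// relister is a hypothetical, reduced model of the relist loop.
type relister struct {
	inspect           func(podID string) error
	needsReinspection map[string]bool // pods whose last inspection failed
}

// relist inspects the pods that changed plus any pod that failed inspection
// during the previous call, and returns the sorted IDs it inspected.
func (r *relister) relist(changed []string) []string {
	toInspect := make(map[string]bool)
	for _, id := range changed {
		toInspect[id] = true
	}
	for id := range r.needsReinspection {
		toInspect[id] = true
	}
	failed := make(map[string]bool)
	inspected := make([]string, 0, len(toInspect))
	for id := range toInspect {
		inspected = append(inspected, id)
		if err := r.inspect(id); err != nil {
			failed[id] = true // retry on the next relist
		}
	}
	r.needsReinspection = failed
	sort.Strings(inspected)
	return inspected
}

// failFirstCall returns an inspect function that fails once for podID and
// succeeds on every later call.
func failFirstCall(podID string) func(string) error {
	calls := 0
	return func(id string) error {
		if id == podID {
			calls++
			if calls == 1 {
				return errors.New("transient inspection error")
			}
		}
		return nil
	}
}

func main() {
	r := &relister{inspect: failFirstCall("pod-a")}
	fmt.Println(r.relist([]string{"pod-a"})) // inspected, and the inspection fails
	fmt.Println(r.relist(nil))               // retried although nothing changed
	fmt.Println(r.relist(nil))               // nothing left to retry
}
```

Without the `needsReinspection` set, the second call would inspect nothing, which corresponds to the stuck state in step 3 of the sequence above.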
goltermann 34d4eaea08 Fixing several (but not all) go vet errors. Most are around string formatting, or unreachable code. 2016-03-22 17:26:50 -07:00
harry b0900bf0d4 Refactor diff into sub pkg 2016-03-21 20:21:39 +08:00