Commit Graph

17 Commits (4fdde68f78c952edeae183816d4726b39a4f359b)

Author SHA1 Message Date
Ron Lai a58c774c08 Including ContainerRemoved in PLEG event reporting 2016-07-14 16:39:03 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Tim Hockin 817abc3213 Kill our atomic pkg, now that 1.6 is req'd 2016-05-08 20:30:37 -07:00
Andy Goldstein 3a87bfb6f7 PLEG: reinspect pods that failed prior inspections
Fix the following sequence of events:

1. relist call 1 successfully inspects a pod (just has infra container)
1. relist call 2 gets an error inspecting the same pod (has infra container and a transient
container that failed to create) and doesn't update the old/new pod records
1. relist calls 3+ don't inspect the pod any more (just has infra container so it doesn't look like
anything changed)

This change adds a new list that keeps track of pods that failed inspection and retries them the
next time relist is called. Without this change, a pod in this state would never be inspected again,
its entry in the status cache would never be updated, and the pod worker would never call syncPod
again because the most recent entry in the status cache has an error associated with it. Without
this change, pods in this state would be stuck Terminating forever, unless the user issued a
deletion with a grace period value of 0.
2016-05-03 11:06:35 -04:00
goltermann 34d4eaea08 Fixing several (but not all) go vet errors. Most are around string formatting, or unreachable code. 2016-03-22 17:26:50 -07:00
Yu-Ju Hong 4846c1e1b2 pleg: add an internal clock for testability
Also add tests for the health check.
2016-03-01 17:53:03 -08:00
Yu-Ju Hong 94368df91a kubelet: monitor the health of pleg
PLEG is reponsible for listing the pods running on the node. If it's hung
due to non-responsive container runtime or internal bugs, we should restart
kubelet.
2016-03-01 17:24:27 -08:00
Random-Liu 96eeb812ff kubelet: clear current pod records before relist 2016-02-28 13:19:47 -08:00
Yu-Ju Hong 388689238b pleg: ensure the cache is updated whenever container are removed
Even though we don't rely on the cache for garbage collection yet, we should
keep it up-to-date.
2016-02-28 13:16:34 -08:00
Yu-Ju Hong f9880d4a3a kubelet: lower the verbosity level of some logging messages 2016-02-24 18:42:26 -08:00
laushinka 7ef585be22 Spelling fixes inspired by github.com/client9/misspell 2016-02-18 06:58:05 +07:00
Jan Chaloupka 4389b3f0d6 Rewritte util.* -> wait.* wherever reasonable 2016-02-07 12:02:20 +01:00
Yu-Ju Hong b56ed1a8c2 Support populating the runtime cache in PLEG
This changes does not turn on this feature (cache) for kubelet.
2016-01-13 10:19:47 -08:00
Yu-Ju Hong 73a4f8225c PLEG should report events if a container is removed
Currently, pleg would report a event if a container transitions from running to
exited between relisting. However, if would not report any event if a container
gets stopped and removed between relisting. This event will eventually be
handled when the pod syncs periodically, but this is undesirable. This change
ensures that we detect all such events.
2016-01-12 16:25:19 -08:00
Yu-Ju Hong 7d180b337b Record pleg pod relist interval and latency
Relisting latency/interval affects how quick kubelet discovers changes. Record
the metrics in Prometheus to surface such information.
2016-01-04 10:56:38 -08:00
Random-Liu 3cbdf79f8c Change original PodStatus to APIPodStatus, and start using kubelet internal PodStatus in dockertools 2015-12-04 17:37:39 -08:00
Yu-Ju Hong bc6414a873 kubelet: add a generic pod lifecycle event generator
This change introduces pod lifecycle event generator (PLEG), and adds a generic
PLEG. The generic PLEG relies on relisting to discover container events, and is
container-runtime-agnostic. Both docker and rkt are changed to use generic
PLEG.
2015-11-13 09:55:36 -08:00