Commit Graph

37 Commits (30e811c97c6820132f6b9128bc472878ab679800)

Author SHA1 Message Date
Tony Han b49e07846f reset resultRun to 0 on pod restart 2018-04-19 22:58:19 +08:00
vikaschoudhary16 cedbd93255 Make 'pod' package to use unified checkpointManager
Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
2018-04-16 01:30:20 -04:00
Di Xu 48388fec7e fix all the typos across the project 2018-02-11 11:04:14 +08:00
ymqytw 3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Jacob Simpson 29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
Chao Xu 60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Kubernetes Submit Queue b641aedcac Merge pull request #46371 from sjenning/fix-liveness-probe-reset
Automatic merge from submit-queue

reset resultRun on pod restart

xref https://bugzilla.redhat.com/show_bug.cgi?id=1455056

There is currently an issue where, if the pod is restarted due to liveness probe failures exceeding failureThreshold, the failure count is not reset on the probe worker.  When the pod restarts, if the liveness probe fails even once, the pod is restarted again, not honoring failureThreshold on the restart.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 3
      timeoutSeconds: 1
      periodSeconds: 3
      successThreshold: 1
      failureThreshold: 5
  terminationGracePeriodSeconds: 0
```

Before this PR:
```
$ kubectl create -f busybox-probe-fail.yaml 
pod "busybox" created
$ kubectl get pod -w
NAME      READY     STATUS    RESTARTS   AGE
busybox   1/1       Running   0          4s
busybox   1/1       Running   1         24s
busybox   1/1       Running   2         33s
busybox   0/1       CrashLoopBackOff   2         39s
```

After this PR:
```
$ kubectl create -f busybox-probe-fail.yaml
$ kubectl get pod -w
NAME      READY     STATUS              RESTARTS   AGE
busybox   0/1       ContainerCreating   0          2s
busybox   1/1       Running   0         4s
busybox   1/1       Running   1         27s
busybox   1/1       Running   2         45s
```

```release-note
Fix kubelet reset liveness probe failure count across pod restart boundaries
```

Restarts are now happen at even intervals.

@derekwaynecarr
2017-06-03 15:15:49 -07:00
Shyam Jeedigunta 4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Seth Jennings 2c866a7aaa reset resultRun on pod restart 2017-05-24 14:55:53 -05:00
David Ashpole 1d38818326 Revert "Merge pull request #41202 from dashpole/revert-41095-deletion_pod_lifecycle"
This reverts commit ff87d13b2c, reversing
changes made to 46becf2c81.
2017-02-15 08:44:03 -08:00
David Ashpole b224f83c37 Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" 2017-02-09 08:45:18 -08:00
David Ashpole 67cb2704c5 delete volumes before pod deletion 2017-02-08 07:34:49 -08:00
deads2k 8a12000402 move client/record 2017-01-31 19:14:13 -05:00
Aleksandra Malinowska 74e1d8078e Revert "Delay deletion of pod from the API server until volumes are deleted" 2017-01-27 13:31:02 +01:00
David Ashpole 9094b57570 cleanup volumes before deleting from the api server 2017-01-25 10:21:15 -08:00
Wojciech Tyczynski 09e4de385c Enable nontrivial secret manager 2017-01-19 19:47:33 +01:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Chao Xu 03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Clayton Coleman 3454a8d52c
refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman 5df8cc39c9
refactor: generated 2016-12-03 19:10:46 -05:00
Chao Xu 5e1adf91df cmd/kubelet 2016-11-23 15:53:09 -08:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Yu-Ju Hong 1a3d205faf kubelet: stop probing if liveness check fails
This change puts the worker on hold when the liveness check fails. It will
resume probing when observing a new container ID.
2016-02-26 17:12:27 -08:00
Chao Xu ad46715f51 generate fake client for release_1_2 2016-02-17 16:10:02 -08:00
Tim St. Clair da0d37f1e0 Fix panic from multiple probe cleanup calls. 2016-02-08 11:23:07 -08:00
Jan Chaloupka 4389b3f0d6 Rewritte util.* -> wait.* wherever reasonable 2016-02-07 12:02:20 +01:00
Chao Xu cddd7b56a4 replace client with clientset in kubelet and other places 2016-02-02 20:28:45 -08:00
harry 1032067ff9 Replace runtime reference by pkg 2016-02-01 21:06:44 +08:00
Tim St. Clair 67cfed5bf3 Don't wait for sync to update readiness
Push status updates as soon as readiness state changes for containers,
rather than waiting for the sync loop to update the status. In
particular, this should help new containers to come online faster.

Additionally, consolidates prober test helpers into a single file.
2015-11-10 14:00:12 -08:00
Tim St. Clair 1e88a682da Add liveness/readiness probe parameters
- PeriodSeconds - How often to probe
- SuccessThreshold - Number of successful probes to go from failure to success state
- FailureThreshold - Number of failing probes to go from success to failure state

This commit includes to changes in behavior:

1. InitialDelaySeconds now defaults to 10 seconds, rather than the
kubelet sync interval (although that also defaults to 10 seconds).
2. Prober only retries on probe error, not failure. To compensate, the
default FailureThreshold is set to the maxRetries, 3.
2015-11-06 10:46:40 -08:00
Tim St. Clair 858126b42a Clean up static/mirror pod status logic
- status.Manager always deals with the local (static) pod, but gets the
  mirror pod when syncing
  - This lets components like the probe workers ignore mirror pods
2015-11-04 11:42:25 -08:00
Tim St. Clair 9a2089adc8 Concurrency fixes in status.Manager
- Fix deadlock when syncing deleted pods with full update channel
- Prevent sending stale updates to API server
- Don't delete cached status when sync fails (causes problems for prober)
2015-10-28 17:40:55 -07:00
Tim St. Clair a263c77b65 Refactor liveness probing
This commit builds on previous work and creates an independent
worker for every liveness probe. Liveness probes behave largely the same
as readiness probes, so much of the code is shared by introducing a
probeType paramater to distinguish the type when it matters. The
circular dependency between the runtime and the prober is broken by
exposing a shared liveness ResultsManager, owned by the
kubelet. Finally, an Updates channel is introduced to the ResultsManager
so the kubelet can react to unhealthy containers immediately.
2015-10-19 15:15:59 -07:00
Tim St. Clair 9b8bc50357 Generalize readinessManager to handle liveness 2015-10-07 16:40:30 -07:00
Tim St. Clair 551eff63b8 Use strong type for container ID
Change all references to the container ID in pkg/kubelet/... to the
strong type defined in pkg/kubelet/container: ContainerID

The motivation for this change is to make the format of the ID
unambiguous, specifically whether or not it includes the runtime
prefix (e.g. "docker://").
2015-10-07 10:58:05 -07:00
Tim St. Clair 52ece0c34e Refactor readiness probing
Each container with a readiness has an individual go-routine which
handles periodic probing for that container. The results are cached, and
written to the status.Manager in the pod sync path.
2015-10-02 15:37:10 -07:00