Commit Graph

164 Commits (1a6a01ee79b7cb3654b8c7efd3e185e2f5b15d91)

Author SHA1 Message Date
Kubernetes Submit Queue 270ed995f4
Merge pull request #59841 from dashpole/metrics_after_reclaim
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reevaluate eviction thresholds after reclaim functions

**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection.  However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required.  Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.

This PR changes this strategy to call the garbage collection functions and ignore the results.  Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.

**Which issue(s) this PR fixes**:
Fixes #46789
Fixes #56573
Related PR #56575

**Special notes for your reviewer**:
This will look cleaner after #57802  removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
2018-02-16 16:31:33 -08:00
Kubernetes Submit Queue eac5bc0035
Merge pull request #57136 from k82cn/k8s_54313
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updated PID pressure node condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:

```release-note
Updated PID pressure node condition
```
2018-02-16 10:35:33 -08:00
David Ashpole e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
Michael Taufen 21dbbe14f2 Ignore 0% and 100% eviction thresholds
Primarily, this gives a way to explicitly disable eviction, which is
necessary to use omitempty on EvictionHard.
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166672137

As justification for this approach, neither 0% nor 100% make sense as
eviction thresholds; in the "less-than" case, you can't have less than
0% of a resource and 100% perpetually evicts; in the
"greater-than" case (assuming we ever add a resource with this
semantic), the reasoning is the reverse (not more than 100%, 0%
perpetually evicts).
2018-02-12 14:13:00 -08:00
Lantao Liu 91cbf93b90 Remove unnecessary summary api call.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-08 22:57:30 +00:00
Yassine TIJANI ed8e75a15c fixing array out of bound by checking initContainers instead of containers 2018-01-25 09:58:51 +01:00
linweibin fa8afc1d39 Remove unused code in UT files in pkg/ 2018-01-15 16:02:35 +08:00
Da K. Ma 9a78753144 Updated PID pressure node condition.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-01-14 18:26:00 +08:00
Kubernetes Submit Queue 5636634879
Merge pull request #56112 from dashpole/on_demand_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable on-demand collection of node metrics

**What this PR does / why we need it**:
This PR enables collecting node-level metrics on-demand.  This is useful because it allows the kubelet to respond to resource pressure more quickly.

**Which issue(s) this PR fixes**:
Ref: #51745

**Release note**:
```release-note
NONE
```

/sig node
/priority important-soon
/kind bug

/assign @vishh @derekwaynecarr 
cc @tallclair
2018-01-12 15:38:42 -08:00
Kubernetes Submit Queue 1824684c7d
Merge pull request #57036 from lcfang/fixevictfunc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fixed the some typo in eviction_manager

**What this PR does / why we need it**:

fixed some wrong typo in `eviction_manager.go`
2018-01-12 10:12:29 -08:00
Kubernetes Submit Queue 656cb30bb5
Merge pull request #57733 from stewart-yu/fixtypeErrorInEviction
Automatic merge from submit-queue (batch tested with PRs 57733, 57613, 57953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[eviction manager]fix type error

**What this PR does / why we need it**:
It should not  wrong hint messages when create memory threshold notifier failed

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-01-09 21:51:34 -08:00
David Ashpole f6721480f4 enable on-demand metrics for eviction 2018-01-08 10:20:02 -08:00
Kubernetes Submit Queue cc22b10278
Merge pull request #52638 from wackxu/fixbadcom
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the wrong code comment

**What this PR does / why we need it**:

Fix the wrong code comment


**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #55608


**Release note**:

```release-note
NONE
```
2018-01-07 10:22:02 -08:00
Jonathan Basseri 85c5862552 Fix scheduler refs in BUILD files.
Update references to moved scheduler code.
2018-01-05 15:05:01 -08:00
Jonathan Basseri 30b89d830b Move scheduler code out of plugin directory.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.

Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
2018-01-05 15:05:01 -08:00
lcfang 62f29fcb39 fixed the some typo in eviction_manager 2018-01-04 12:23:11 +08:00
stewart-yu cccd18333b fix type error in cteate Memory Threshold Notifier 2018-01-02 15:08:21 +08:00
Jeff Grafton efee0704c6 Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
xiangpengzhao 8048823d0e Auto generated BUILD files. 2017-12-01 11:24:41 +08:00
xiangpengzhao 1f2262e6b0 Move some kubelet constants to a common place. 2017-12-01 11:24:04 +08:00
David Ashpole 8b3bd5ae60 take disk requests into account during evictions 2017-11-21 10:21:30 -08:00
David Ashpole 527611ee41 remove disk allocatable evictions 2017-11-18 10:34:59 -08:00
Michael Taufen 1085b6f730 Lift embedded structure out of eviction-related KubeletConfiguration fields
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
  - `EvictionHard`
  - `EvictionSoft`
  - `EvictionSoftGracePeriod`
  - `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.

Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.

Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
2017-11-16 18:35:13 -08:00
Dr. Stefan Schimanski bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski 012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Jeff Grafton aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
David Ashpole 539fddb49d kubelet evictions take priority into account 2017-10-12 13:15:05 -07:00
David Ashpole 8659676408 feature gate local storage allocatable eviction 2017-10-11 09:53:56 -07:00
Kubernetes Submit Queue 6fcf841d69 Merge pull request #52692 from wackxu/fbc
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the bad code comment and make the format unify

**What this PR does / why we need it**:

Fix the bad code comment and make the format unify

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```release-note
NONE
```
2017-09-28 21:15:43 -07:00
Kubernetes Submit Queue 0bd2ed16a0 Merge pull request #47080 from jingxu97/May/allocatable
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Map a resource to multiple signals in eviction manager

It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52661

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-19 17:31:07 -07:00
wackxu d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
wackxu 2aeb234c80 fix the bad code comment 2017-09-18 15:13:24 +08:00
Kubernetes Submit Queue 20a4112e88 Merge pull request #46542 from derekwaynecarr/quota-ignore-pod-whose-node-lost
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)

Ignore pods for quota marked for deletion whose node is unreachable

**What this PR does / why we need it**:
Traditionally, we charge to quota all pods that are in a non-terminal phase.  We have a user report that noted the behavior change in kube 1.5 for the node controller to no longer force delete pods whose nodes have been lost.  Instead, the pod is marked for deletion, and the reason is updated to state that the node is unreachable.  The user expected the quota to be released.  If the user was at their quota limit, their application may not be able to create a new replica given the current behavior.  As a result, this PR ignores pods marked for deletion that have exceeded their grace period.

**Which issue this PR fixes**
xref https://bugzilla.redhat.com/show_bug.cgi?id=1455743
fixes https://github.com/kubernetes/kubernetes/issues/52436

**Release note**:
```release-note
Ignore pods marked for deletion that exceed their grace period in ResourceQuota
```
2017-09-15 00:11:10 -07:00
Derek Carr da01c6d3a2 Ignore pods for quota that exceed deletion grace period 2017-09-11 13:31:52 -04:00
David Ashpole d60d4a4420 soft eviction timer works 2017-09-06 13:01:49 -07:00
Kubernetes Submit Queue 0554520495 Merge pull request #50938 from cblecker/threshold-crossbuild
Automatic merge from submit-queue (batch tested with PRs 51666, 49829, 51058, 51004, 50938)

Fix threshold notifier build tags

**What this PR does / why we need it**:
Cross building from darwin is currently broken on the following error:
```
# k8s.io/kubernetes/pkg/kubelet/eviction
pkg/kubelet/eviction/threshold_notifier_unsupported.go:25: NewMemCGThresholdNotifier redeclared in this block
        previous declaration at pkg/kubelet/eviction/threshold_notifier_linux.go:38
```
It looks like #49300 broke the build tags introduced in #38630 and #37384. This fixes the build tag on `threshold_notifier_unsupported.go` as the cgo requirement was removed from `threshold_notifier_linux.go`.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50935

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-09-02 22:52:11 -07:00
Jing Xu 8f98230f20 Map a resource to multiple signals in eviction manager
It is possible to have multiple signals that point to the same type of
resource, e.g., both SignalNodeFsAvailable and
SignalAllocatableNodeFsAvailable refer to the same resource NodeFs.
Change the map from map[v1.ResourceName]evictionapi.Signal to
map[v1.ResourceName][]evictionapi.Signal
2017-09-01 12:54:37 -07:00
Kubernetes Submit Queue aa50c0f54c Merge pull request #51490 from NickrenREN/eviction-podLocalEphemeralStorageUsage
Automatic merge from submit-queue (batch tested with PRs 51628, 51637, 51490, 51279, 51302)

Fix pod local ephemeral storage usage calculation

We use podDiskUsage to calculate pod local ephemeral storage which is not correct, because podDiskUsage also contains HostPath volume  which is considered as persistent storage
This pr fixes it
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51489

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

/assign @jingxu97  @vishh 
cc @ddysher
2017-09-01 00:11:17 -07:00
Jing Xu 4d6da1fd9a Change SizeLimit to a pointer
This PR fixes issue #50121
2017-08-30 11:50:35 -07:00
NickrenREN 9fadd3bd9a Fix pod local ephemeral storage usage 2017-08-30 13:53:54 +08:00
NickrenREN 27901ad5df Change eviction policy to manage one single local storage resource 2017-08-26 05:14:49 +08:00
Christoph Blecker 8163afddd2
Fix threshold notifier build tags
This was preventing cross builds from darwin
2017-08-18 15:54:31 -07:00
Jeff Grafton a7f49c906d Use buildozer to delete licenses() rules except under third_party/ 2017-08-11 09:32:39 -07:00
Jeff Grafton 33276f06be Use buildozer to remove deprecated automanaged tags 2017-08-11 09:31:50 -07:00
Jeff Grafton cf55f9ed45 Autogenerate BUILD files 2017-08-11 09:30:23 -07:00
Malepati Bala Siva Sai Akhil ee82de565a Fixed typo in comment in eviction_manager
Fixed typo in comment in eviction_manager
2017-08-06 01:04:41 +05:30
Kubernetes Submit Queue 5d24a2c199 Merge pull request #49300 from tklauser/syscall-to-x-sys-unix
Automatic merge from submit-queue

Switch from package syscall to golang.org/x/sys/unix

**What this PR does / why we need it**:

The syscall package is locked down and the comment in https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24 advises to switch code to use the corresponding package from golang.org/x/sys. This PR does so and replaces usage of package syscall with package golang.org/x/sys/unix where applicable. This will also allow to get updates and fixes
without having to use a new go version.

In order to get the latest functionality, golang.org/x/sys/ is re-vendored. This also allows to use Eventfd() from this package instead of calling the eventfd() C function.

**Special notes for your reviewer**:

This follows previous works in other Go projects, see e.g. moby/moby#33399, cilium/cilium#588

**Release note**:

```release-note
NONE
```
2017-08-03 04:02:12 -07:00
Kubernetes Submit Queue 9350afd772 Merge pull request #48976 from supereagle/cleanup-api-package
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)

Remove duplicated import and wrong alias name of api package

**What this PR does / why we need it**:

**Which issue this PR fixes**: fixes #48975

**Special notes for your reviewer**:
/assign @caesarxuchao

**Release note**:
```release-note
NONE
```
2017-07-25 12:14:38 -07:00
Kubernetes Submit Queue 7f1d9382ec Merge pull request #48846 from dashpole/remove_ood
Automatic merge from submit-queue

Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency

issue: #48843

This removes two flags replaced by the eviction manager.  These have been depreciated for two releases, which I believe correctly follows the kubernetes depreciation guidelines.

```release-note
Remove depreciated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```

cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr 
/sig node
2017-07-24 23:05:50 -07:00
supereagle adc0eef43e remove duplicated import and wrong alias name of api package 2017-07-25 10:04:25 +08:00