Commit Graph

192 Commits (980242c2092df3594ed38540a80c4bf172c5a68d)

Author SHA1 Message Date
k8s-ci-robot 30a06af453
Merge pull request #69671 from mooncak/fix_kubelet
Delete duplicated words in logs
2018-10-17 11:57:12 -07:00
tanshanshan b7c7966b9f Move pkg/scheduler/algorithm/well_known_labels.go out 2018-10-13 09:10:00 +08:00
mooncake 1e6602d6d8 Fixup log
Signed-off-by: mooncake <xcoder@tenxcloud.com>
2018-10-11 19:14:36 +08:00
Christoph Blecker 97b2992dc1
Update gofmt for go1.11 2018-10-05 12:59:38 -07:00
Krzysztof Jastrzebski 138a3c7172 Add "only_cpu_and_memory" GET parameter to /stats/summary http handler in kubelet. If parameter is true then only cpu and memory will be present in response. The parameter will be used by Metric Server to avoid sending/decoding unneeded data. 2018-09-06 21:49:00 +02:00
Seth Jennings f2a7654978 move feature gate checks inside IsCriticalPod 2018-07-11 16:10:05 -05:00
Jeff Grafton 23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jeff Grafton a725660640 Update to gazelle 0.12.0 and run hack/update-bazel.sh 2018-06-22 16:22:18 -07:00
David Ashpole b7deb6d9e0 fix eviction event formatting 2018-06-11 11:38:00 -07:00
David Ashpole 93b6d026d9 fix memcg fd leak 2018-06-11 11:37:50 -07:00
David Ashpole fd1f19fc42 add metadata to kubelet eviction event annotations 2018-05-23 16:12:54 -07:00
Kubernetes Submit Queue 2accf11f1a
Merge pull request #57849 from dashpole/eviction_test_event
Automatic merge from submit-queue (batch tested with PRs 63865, 57849, 63932, 63930, 63936). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eviction Node e2e test checks for eviction reason

**What this PR does / why we need it**:
Currently, the eviction test simply ensures that pods are marked `Failed`.  However, this could occur because of an OOM, rather than an eviction.
To ensure that pods are actually being evicted, check for the Reason in the pod status to ensure it is evicted.

**Release note**:
```release-note
NONE
```

cc @kubernetes/sig-node-pr-reviews
2018-05-17 00:28:19 -07:00
Kubernetes Submit Queue c7bfc2a14e
Merge pull request #63220 from dashpole/fix_memcg_format
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix formatting for kubelet memcg notification threshold

/kind bug
**What this PR does / why we need it**:
This fixes the following errors (found in [this node_e2e serial test log](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-serial/4118/artifacts/tmp-node-e2e-49baaf8a-cos-stable-63-10032-71-0/kubelet.log)):
`eviction_manager.go:256] eviction manager attempting to integrate with kernel memcg notification api`
`threshold_notifier_linux.go:70] eviction: setting notification threshold to 4828488Ki`
`eviction_manager.go:272] eviction manager: failed to create hard memory threshold notifier: invalid argument`

**Special notes for your reviewer**:
This needs to be cherrypicked back to 1.10.
This regression was added in https://github.com/kubernetes/kubernetes/pull/60531, because the `quantity` being used was changed from a DecimalSI to BinarySI, which changes how it is printed out in the String() method.  To make it more explicit that we want the value, just convert Value() to a string.

**Release note**:
```release-note
Fix memory cgroup notifications, and reduce associated log spam.
```
2018-05-16 15:25:06 -07:00
Kubernetes Submit Queue 6934c4f599
Merge pull request #63521 from dashpole/allocatable_memcg
Automatic merge from submit-queue (batch tested with PRs 63314, 63884, 63799, 63521, 62242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add memcg notifications for allocatable cgroup

**What this PR does / why we need it**:
Use memory cgroup notifications to trigger the eviction manager when the allocatable eviction threshold is crossed.  This allows the eviction manager to respond more quickly when the allocatable cgroup's available memory becomes low.  Evictions are preferable to OOMs in the cgroup since the kubelet can enforce its priorities on which pod is killed.

**Which issue(s) this PR fixes**:
Fixes https://github.com/kubernetes/kubernetes/issues/57901

**Special notes for your reviewer**:
This adds the alloctable cgroup from the container manager to the eviction config.

**Release note**:
```release-note
NONE
```
/sig node
/priority important-soon
/kind feature

I would like this to be included in the 1.11 release.
2018-05-15 19:55:15 -07:00
Kubernetes Submit Queue d42df4561a
Merge pull request #61976 from atlassian/ticker-with-stop
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Stop() for Ticker to enable leak-free code

**What this PR does / why we need it**:
I wanted to use the clock package but the `Ticker` without a `Stop()` method is a deal breaker for me.

**Release note**:
```release-note
NONE
```
/kind enhancement
/sig api-machinery
2018-05-09 19:06:56 -07:00
Kubernetes Submit Queue ba3176d94c
Merge pull request #58580 from k82cn/k8s_58505
Automatic merge from submit-queue (batch tested with PRs 58580, 63120). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Admit BestEffort if it tolerates memory pressure.

Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58505 

**Release note**:
```release-note
None
```
2018-05-08 21:45:10 -07:00
David Ashpole a5df208866 eviction test ensures failed pods are evicted 2018-05-08 16:08:35 -07:00
David Ashpole 2294f09e4e add memcg notifications for allocatable cgroup 2018-05-07 17:15:23 -07:00
David Ashpole db99c20a9a cleanup eviction events 2018-05-04 11:02:25 -07:00
David Ashpole b294173f5d fix formatting for memcg threshold 2018-04-19 14:48:41 -07:00
Mikhail Mazurskiy 1f393cdef9
Stop() for Ticker to enable leak-free code 2018-03-31 19:41:43 +11:00
Da K. Ma b367177f3f Admit BestEffort if it tolerates memory pressure.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-03-08 09:25:06 +08:00
David Ashpole 39d9fa60e8 refresh eviction interval periodically 2018-03-06 15:14:05 -08:00
David Ashpole 54cf14ffcc subtract inactive_file from usage when setting memcg threshold 2018-03-06 09:09:44 -08:00
David Ashpole a55119820e fix running with no eviction thresholds 2018-02-20 13:49:14 -08:00
Kubernetes Submit Queue 96ec318718
Merge pull request #59842 from ixdy/update-rules_go-02-2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

 Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies

**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.

**Release note**:

```release-note
NONE
```
2018-02-19 22:23:05 -08:00
David Ashpole 960856f4e8 collect metrics on the /kubepods cgroup on-demand 2018-02-17 12:32:40 -08:00
Kubernetes Submit Queue 270ed995f4
Merge pull request #59841 from dashpole/metrics_after_reclaim
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reevaluate eviction thresholds after reclaim functions

**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection.  However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required.  Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.

This PR changes this strategy to call the garbage collection functions and ignore the results.  Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.

**Which issue(s) this PR fixes**:
Fixes #46789
Fixes #56573
Related PR #56575

**Special notes for your reviewer**:
This will look cleaner after #57802  removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
2018-02-16 16:31:33 -08:00
Jeff Grafton ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Kubernetes Submit Queue eac5bc0035
Merge pull request #57136 from k82cn/k8s_54313
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updated PID pressure node condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:

```release-note
Updated PID pressure node condition
```
2018-02-16 10:35:33 -08:00
David Ashpole e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
Michael Taufen 21dbbe14f2 Ignore 0% and 100% eviction thresholds
Primarily, this gives a way to explicitly disable eviction, which is
necessary to use omitempty on EvictionHard.
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166672137

As justification for this approach, neither 0% nor 100% make sense as
eviction thresholds; in the "less-than" case, you can't have less than
0% of a resource and 100% perpetually evicts; in the
"greater-than" case (assuming we ever add a resource with this
semantic), the reasoning is the reverse (not more than 100%, 0%
perpetually evicts).
2018-02-12 14:13:00 -08:00
Lantao Liu 91cbf93b90 Remove unnecessary summary api call.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-08 22:57:30 +00:00
Yassine TIJANI ed8e75a15c fixing array out of bound by checking initContainers instead of containers 2018-01-25 09:58:51 +01:00
linweibin fa8afc1d39 Remove unused code in UT files in pkg/ 2018-01-15 16:02:35 +08:00
Da K. Ma 9a78753144 Updated PID pressure node condition.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-01-14 18:26:00 +08:00
Kubernetes Submit Queue 5636634879
Merge pull request #56112 from dashpole/on_demand_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable on-demand collection of node metrics

**What this PR does / why we need it**:
This PR enables collecting node-level metrics on-demand.  This is useful because it allows the kubelet to respond to resource pressure more quickly.

**Which issue(s) this PR fixes**:
Ref: #51745

**Release note**:
```release-note
NONE
```

/sig node
/priority important-soon
/kind bug

/assign @vishh @derekwaynecarr 
cc @tallclair
2018-01-12 15:38:42 -08:00
Kubernetes Submit Queue 1824684c7d
Merge pull request #57036 from lcfang/fixevictfunc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fixed the some typo in eviction_manager

**What this PR does / why we need it**:

fixed some wrong typo in `eviction_manager.go`
2018-01-12 10:12:29 -08:00
Kubernetes Submit Queue 656cb30bb5
Merge pull request #57733 from stewart-yu/fixtypeErrorInEviction
Automatic merge from submit-queue (batch tested with PRs 57733, 57613, 57953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[eviction manager]fix type error

**What this PR does / why we need it**:
It should not  wrong hint messages when create memory threshold notifier failed

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-01-09 21:51:34 -08:00
David Ashpole f6721480f4 enable on-demand metrics for eviction 2018-01-08 10:20:02 -08:00
Kubernetes Submit Queue cc22b10278
Merge pull request #52638 from wackxu/fixbadcom
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the wrong code comment

**What this PR does / why we need it**:

Fix the wrong code comment


**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #55608


**Release note**:

```release-note
NONE
```
2018-01-07 10:22:02 -08:00
Jonathan Basseri 85c5862552 Fix scheduler refs in BUILD files.
Update references to moved scheduler code.
2018-01-05 15:05:01 -08:00
Jonathan Basseri 30b89d830b Move scheduler code out of plugin directory.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.

Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
2018-01-05 15:05:01 -08:00
lcfang 62f29fcb39 fixed the some typo in eviction_manager 2018-01-04 12:23:11 +08:00
stewart-yu cccd18333b fix type error in cteate Memory Threshold Notifier 2018-01-02 15:08:21 +08:00
Jeff Grafton efee0704c6 Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
xiangpengzhao 8048823d0e Auto generated BUILD files. 2017-12-01 11:24:41 +08:00
xiangpengzhao 1f2262e6b0 Move some kubelet constants to a common place. 2017-12-01 11:24:04 +08:00
David Ashpole 8b3bd5ae60 take disk requests into account during evictions 2017-11-21 10:21:30 -08:00
David Ashpole 527611ee41 remove disk allocatable evictions 2017-11-18 10:34:59 -08:00