Commit Graph

435 Commits (65de86e72fea7e7bb2967521c2001f6c151ddebd)

Author SHA1 Message Date
Kubernetes Prow Robot 4b1282d925
Merge pull request #74016 from ahadas/topology_cleanup
Cleanup in topology.go
2019-02-27 22:49:24 -08:00
danielqsj 79a3eb816c rename latency to duration in metrics 2019-02-18 17:40:04 +08:00
danielqsj 9fd99a48f5 Change kubelet metrics to conform guideline 2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot c88dcee3e9
Merge pull request #73824 from jiayingz/reallocate
Checks whether we have cached runtime state before starting a container
2019-02-15 20:35:30 -08:00
Arik Hadas c3a533e5b2 Cleanup in topology.go
1. Find the minimal thread number within a core using a
single loop rather than by sorting the thread numbers.

2. Inline getUniqueCoreID#err and Discover#numCPUs variables.

3. Narrow the scope of Discover#coreID and Discover#err variables.

Signed-off-by: Arik Hadas <ahadas@redhat.com>
2019-02-14 16:55:37 +02:00
Kubernetes Prow Robot 888ff4097a
Merge pull request #73651 from RobertKrawitz/node_pids_limit
Support total process ID limiting for nodes
2019-02-13 17:31:18 -08:00
Robert Krawitz 2597a1d97e Implement SupportNodePidsLimit, hand-tested 2019-02-13 14:56:17 -05:00
Kubernetes Prow Robot b50c643be0
Merge pull request #73540 from rlenferink/patch-5
Updated OWNERS files to include link to docs
2019-02-08 09:05:56 -08:00
Jiaying Zhang 00b88c14b0 Checks whether we have cached runtime state before starting a container
that requests any device plugin resource. If not, re-issue Allocate
grpc calls. This allows us to handle the edge case that a pod got
assigned to a node even before it populates its extended resource
capacity.
2019-02-07 11:12:36 -08:00
Kubernetes Prow Robot dc1244c6cd
Merge pull request #72785 from derekwaynecarr/hugepages-ga
Graduate HugePages feature to GA
2019-02-05 13:56:51 -08:00
Roy Lenferink b43c04452f Updated OWNERS files to include link to docs 2019-02-04 22:33:12 +01:00
Kubernetes Prow Robot 03b434c9d4
Merge pull request #58122 from tianshapjq/nit-int-is-enough
Len() is already int
2019-02-03 12:02:24 -08:00
Derek Carr deae071d78 Graduate HugePages feature to GA 2019-02-02 00:21:10 -05:00
Andrew Kim 84191eb99b replace pkg/util/file with k8s.io/utils/path 2019-01-29 15:20:13 -05:00
Bernhard Altendorfer 736f35ec29 Fix golint failures 2019-01-24 00:14:25 +01:00
David Ashpole 2b8bc85f75 fix panic in NodeAllocatable node e2e test 2019-01-17 10:57:09 -08:00
ailusazh 10995f661d clean containers in reconcileState of cpuManager 2019-01-15 16:09:28 +08:00
Kubernetes Prow Robot 0dbc99719a
Merge pull request #72076 from derekwaynecarr/pid-limiting
SupportPodPidsLimit feature beta with tests
2019-01-10 01:18:30 -08:00
Kubernetes Prow Robot d88994cf9f
Merge pull request #71306 from ping035627/k8s-181121
fix some typos
2019-01-09 09:06:31 -08:00
Derek Carr bce9d5f204 SupportPodPidsLimit feature beta with tests 2019-01-09 10:50:59 -05:00
Kubernetes Prow Robot 4e8bea4bb7
Merge pull request #71194 from yanghaichao12/dev1119-1
Fix comment error of 'cpuManagerStateFileName'
2018-12-17 20:28:19 -08:00
yuexiao-wang 7b6f60f085 modify BUILD
Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>
2018-12-11 11:22:06 +08:00
yuexiao-wang f3353c358d [scheduler cleanup phase 2]: Rename to
Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>
2018-12-11 11:21:12 +08:00
k8s-ci-robot 79e5cb2cb7
Merge pull request #71302 from liggitt/verify-unit-test-feature-gates
Split mutable and read-only access to feature gates, limit tests to readonly access
2018-11-29 21:45:12 -08:00
saad-ali a7c5582bba Permit use of deprecated dir in device plugin. 2018-11-21 18:37:31 -08:00
saad-ali 8f666d9e41 Modify kubelet watcher to support old versions
Modify kubelet plugin watcher to support older CSI drivers that use an
the old plugins directory for socket registration.
Also modify CSI plugin registration to support multiple versions of CSI
registering with the same name.
2018-11-21 18:37:31 -08:00
PingWang 9d541911bb fix some typos
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

fix typo

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2018-11-22 08:27:14 +08:00
Jordan Liggitt 70ad4dff48 Fix unit tests calling SetFeatureGateDuringTest incorrectly 2018-11-21 11:51:33 -05:00
yanghaichao12 982d1778f8 Fix comment error of 'cpuManagerStateFileName' 2018-11-19 08:07:04 -05:00
Vladimir Vivien b195396154 Kubelet Plugin Registration v1 update fix 2018-11-15 17:40:35 -05:00
David Ashpole 630cb53f82 add kubelet grpc server for pod-resources service 2018-11-15 09:43:20 -08:00
Davanum Srinivas 954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
David Ashpole d4f6ae3615 fix slice sharing bug in cgroup manager 2018-11-05 17:42:42 -08:00
Pengfei Ni 856c83e637 Enable allocatable support for Windows nodes 2018-10-30 11:17:23 +08:00
Christoph Blecker 97b2992dc1
Update gofmt for go1.11 2018-10-05 12:59:38 -07:00
k8s-ci-robot 3fe21e5433
Merge pull request #68922 from BenTheElder/version-staging
move pkg/util/version to staging
2018-09-26 22:59:42 -07:00
k8s-ci-robot 0ca25b8db7
Merge pull request #68816 from FengyunPan2/cgroup-info
Add helpful log for checking cgrop path
2018-09-26 18:10:46 -07:00
FengyunPan2 34a8b1fd9f Add helpful log for checking cgrop path
Currently I just get 'xxx cgroup does not exist', but I don't know
which path has missed. Let's add log for it.
2018-09-25 10:10:12 +08:00
k8s-ci-robot 8346631860
Merge pull request #68053 from Pingan2017/rmifblock
clean up unneeded else block
2018-09-24 17:17:29 -07:00
Benjamin Elder 8b56eb8588 hack/update-gofmt.sh 2018-09-24 12:21:29 -07:00
Benjamin Elder f828c6f662 hack/update-bazel.sh 2018-09-24 12:03:24 -07:00
Benjamin Elder 088cf3c37b find & replace version import 2018-09-24 12:03:24 -07:00
Renaud Gaubert 8dd1d27c03 Updated the device manager pluginwatcher handler 2018-09-06 15:34:46 +02:00
Sandor Szücs 588d2808b7
fix #51135 make CFS quota period configurable, adds a cli flag and config option to kubelet to be able to set cpu.cfs_period and defaults to 100ms as before.
It requires to enable feature gate CustomCPUCFSQuotaPeriod.

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
2018-09-01 20:19:59 +02:00
Pingan2017 2f1284bc34 cleanup unneeded if block 2018-08-30 17:18:56 +08:00
Kubernetes Submit Queue c491d48cde
Merge pull request #67430 from choury/cpumanager
Automatic merge from submit-queue (batch tested with PRs 67430, 67550). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

cpumanager: rollback state if updateContainerCPUSet failed

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63018

If `updateContainerCPUSet`  failed, the container will start failed. We should rollback the state to avoid CPU leak.
**Special notes for your reviewer**:

**Release note**:

```release-note
cpumanager: rollback state if updateContainerCPUSet failed
```
2018-08-21 23:20:58 -07:00
Ismo Puustinen dd3eeb3f46 device manager: don't do operations on nil pointer.
If grpc.DialContext() fails, a nil connection is returned. Check the
error before calling conn.Close().
2018-08-21 15:20:36 +03:00
Kubernetes Submit Queue d017bebf6b
Merge pull request #67145 from jiayingz/reboot-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fail container start if its requested device plugin resource is unknown.

With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container.



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/67107

**Special notes for your reviewer**:

**Release note**:

```release-note
Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.
```
2018-08-21 01:48:54 -07:00
choury 36b92b9b29 cpumanager: rollback state if updateContainerCPUSet failed 2018-08-17 18:08:58 +08:00
tianshapjq 81081dc9e7 nits in manager.go 2018-08-15 08:16:04 +08:00