Commit Graph

200 Commits (f34a24e98e7c837b567b78be3af958ac1156cd80)

Author SHA1 Message Date
Kubernetes Submit Queue 94e77bd4ca Merge pull request #54408 from intelsdi-x/cpu-state-file
Automatic merge from submit-queue (batch tested with PRs 54656, 54552, 54389, 53634, 54408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add file backed state to cpu manager

**What this PR does / why we need it**:
Adds file backed `State` implementation to cpu manger with tests.
Reads from `State` are done from memory, while each write triggers state save to a file.

Any failure in reading the state file results in empty state

Next PR: #54409
2017-10-26 21:08:38 -07:00
Rohit Agarwal 092429be1c Better error messages and logging while registering device plugins. 2017-10-26 15:17:38 -07:00
Michał Stachowski 97e3f7bf86 State file test fixes 2017-10-26 20:03:35 +02:00
Szymon Scharmach 4ee0adc77a Added Cpu Manager file state 2017-10-26 20:03:17 +02:00
Jiaying Zhang e501f01d85 Move podDevices code into a separate file. 2017-10-24 17:48:59 -07:00
Jiaying Zhang ff4e8d429e Device plugin code refactoring to cope with file move.
While moving device_plugin_handler_test.go from pkg/kubelet/cm/ to
pkg/kubelet/cm/deviceplugin/, we can no longer uses cm in its tests
because that would cause a cycle dependency. To solve this problem,
I moved the main cm GetResources functionality as well as part of the
current device plugin handler Allocate functionality into a new device
plugin handler function, GetDeviceRunContainerOptions(). This
refactoring is also needed by another PR 51895 that moves device
allocation into admission phase. Now device plugin handler Allocate()
first checks whether there is cached device runtime state and only
issues Allocate grpc call if there is no cached state available.
The new GetDeviceRunContainerOptions() function simply returns device
runtime config from the cached state. To support this change, extended the
podDevices struct and checkpoint data structure with device runtime state.
2017-10-24 14:38:15 -07:00
Jiaying Zhang 796f488789 Move device plugin related files under pkg/kubelet/cm/deviceplugin/. 2017-10-24 14:17:20 -07:00
lichuqiang fd8b04649e unnecessary functions cleanup for deviceplugin 2017-10-20 09:37:59 +08:00
Vishnu kannan 16b0363b95 Disabling k8s.io/kubernetes/pkg/kubelet/cm TestPodContainerDeviceAllocation due to #54100
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan e0032af916 bump device plugin version to v1alpha2 to reflect the change to AllocateResponce API
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan 18eee1eaa0 Make AllocateResponse artifacts global across all devices per container in device plugin API
There is no use case known for passing artifacts per device as it currently exists. The current API is also
complex to use for simple clients. Hence this PR creates a flat namespace where artifacts like environment variables
and mount points apply globally to all devices returned as part of AllocateResponse proto.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:34:00 -07:00
Kubernetes Submit Queue 1d8f1e268f Merge pull request #47699 from supereagle/fix-typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typos: remove duplicated word in comments

**What this PR does / why we need it**: Remove the duplicated word `the` in comments

**Which issue this PR fixes** : fixes #

**Special notes for your reviewer**:

```release-note
NONE
```
2017-10-17 02:35:52 -07:00
Jeff Grafton aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Kubernetes Submit Queue 3deab69d3b Merge pull request #53790 from yanxuean/cgroupredundancy
Automatic merge from submit-queue (batch tested with PRs 52959, 53790). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove redundancy code in setCPUCgroupConfig

fix #53925

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>



**What this PR does / why we need it**:

The check of burstableCPUShares is redundancy. We have done it in MilliCPUToShares. It is responsibility of MilliCPUToShares.
```
func (m *qosContainerManagerImpl) setCPUCgroupConfig(configs map[v1.PodQOSClass]*CgroupConfig) error {
        ........
	// set burstable shares based on current observe state
	burstableCPUShares := MilliCPUToShares(burstablePodCPURequest)
	if burstableCPUShares < uint64(MinShares) {
		burstableCPUShares = uint64(MinShares)
	}
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Improveing code.

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-13 19:19:32 -07:00
Kubernetes Submit Queue 03adf92aa9 Merge pull request #53753 from derekwaynecarr/log-spam
Automatic merge from submit-queue (batch tested with PRs 53119, 53753, 53795, 52981). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce log spam in qos container manager

**What this PR does / why we need it**:
excessive log stmts make it hard to debug actual problems.

**Release note**:
```release-note
NONE
```
2017-10-12 08:28:36 -07:00
yanxuean 8adb2181eb remove redundancy code in setCPUCgroupConfig
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-12 18:42:18 +08:00
Derek Carr 328a12d160 Reduce log spam in qos container manager 2017-10-11 19:47:40 -04:00
Euan Kemp 7aa88b5103 kubelet/cm: remove unneeded fork of 'cat'
Reading a file in Go is perfectly possible without invoking cat.

I also removed an outdated comment.
2017-10-10 21:53:35 -07:00
Kubernetes Submit Queue ec116fdc73 Merge pull request #53328 from intelsdi-x/lscpu_fix
Automatic merge from submit-queue (batch tested with PRs 53297, 53328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cpu Manager - make CoreID's platform unique

**What this PR does / why we need it**:
Cpu Manager uses topology from cAdvisor(`/proc/cpuinfo`) where coreID's are socket unique - not platform unique - this causes problems on multi-socket platforms.

All code assumes unique coreID's (on platform) -  `Discovery` function has been changed to assign CoreID as the lowest cpuID from all cpus belonging to the same core. This can be expressed as:
`CoreID=min(cpuID's on the same core)`

Since cpuID's are platform unique - above gives us guarantee that CoreID's will also be platform unique.



**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53323
2017-10-10 11:20:37 -07:00
Kubernetes Submit Queue aaf14d4619 Merge pull request #53525 from sttts/sttts-scheme-copier-romoval
Automatic merge from submit-queue (batch tested with PRs 53525, 53652). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apimachinery: remove ObjectCopier interface(s)

The big commit is a mechanical, transitive removal of the copier interfaces in all structs and function calls.
2017-10-10 08:31:41 -07:00
Szymon Scharmach b86dc9c054 Make CoreID's platform unique 2017-10-10 10:45:44 +02:00
Kubernetes Submit Queue c12dab37e7 Merge pull request #53547 from jiayingz/deviceplugin-fix
Automatic merge from submit-queue (batch tested with PRs 52662, 53547, 53588, 53573, 53599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

In DevicePluginHandlerImpl.Allocate(), skips untracked extended resou…

…rces.

Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53548

**Special notes for your reviewer**:

**Release note**:

```release-note
Ignore extended resources that are not registered with kubelet
```
2017-10-09 12:51:17 -07:00
Kubernetes Submit Queue 85b252d47e Merge pull request #51771 from dixudx/refactor_nsenter
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor nsenter

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51273

**Special notes for your reviewer**:
/assign @jsafrane 

**Release note**:

```release-note
None
```
2017-10-08 23:27:32 -07:00
Dr. Stefan Schimanski ecb65a6a71 Update generated files 2017-10-07 11:28:47 +02:00
Jiaying Zhang ee1ffa619b In DevicePluginHandlerImpl.Allocate(), skips untracked extended resources.
Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.
2017-10-06 13:57:53 -07:00
Dr. Stefan Schimanski ed586da147 apimachinery: remove Scheme.DeepCopy 2017-10-06 14:59:17 +02:00
Kubernetes Submit Queue 5e2ce3aaf2 Merge pull request #53122 from resouer/fix-cpu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eliminate extra CRI call during processing cpu set

**What this PR does / why we need it**:

Encountered this during `kubernetes/frakti` node e2e test.

When cpuset is not set, there's still plenty of `runtime.UpdateContainerResources` been called, which seems unnecessary.

cc @ConnorDoyle Make sense? Fixes: #53304

**Special notes for your reviewer**:

**Release note**:

```release-note
Only do UpdateContainerResources when cpuset is set 
```
2017-10-01 15:30:56 -07:00
Harry Zhang 282973d87d Elimenate extra CRI call 2017-09-30 16:51:32 +08:00
Kubernetes Submit Queue 6fcf841d69 Merge pull request #52692 from wackxu/fbc
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the bad code comment and make the format unify

**What this PR does / why we need it**:

Fix the bad code comment and make the format unify

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```release-note
NONE
```
2017-09-28 21:15:43 -07:00
Di Xu 57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
NickrenREN 7f9696201e Fix --kube-reserved storage key name and add test cases for node allocatable reservation 2017-09-26 09:32:21 +08:00
wackxu d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
supereagle 87c29a08e1 fix typos: remove duplicated word in comments 2017-09-16 14:38:10 +08:00
Balaji Subramaniam e2cb80db4a Added large topology tests for static policy in CPU Manager.
- Added comments for tests cases.
2017-09-06 13:15:22 -07:00
Kubernetes Submit Queue dcc1aa0628 Merge pull request #51928 from mindprince/pr-45724-fix-build
Automatic merge from submit-queue

Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.

This was broken in #45724

**Release note**:
```release-note
NONE
```
/sig storage
/sig node

/cc @jsafrane, @vishh
2017-09-05 19:44:54 -07:00
Kubernetes Submit Queue 99aa992ce8 Merge pull request #51751 from dashpole/update_cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Update Cadvisor Dependency

Fixes: https://github.com/kubernetes/kubernetes/issues/51832
This is the worst dependency update ever... 
The root of the problem is the [name change of Sirupsen -> sirupsen](https://github.com/sirupsen/logrus/issues/570#issuecomment-313933276).  This means that in order to update cadvisor, which venders the lowercase, we need to update all dependencies to use the lower-cased version.  With that being said, this PR updates the following packages:

`github.com/docker/docker`
- `github.com/docker/distribution`
  - `github.com/opencontainers/go-digest`
  - `github.com/opencontainers/image-spec`
  - `github.com/opencontainers/runtime-spec`
  - `github.com/opencontainers/selinux`
  - `github.com/opencontainers/runc`
    - `github.com/mrunalp/fileutils`
  - `golang.org/x/crypto`
    - `golang.org/x/sys`
- `github.com/docker/go-connections`
- `github.com/docker/go-units`
- `github.com/docker/libnetwork`
- `github.com/docker/libtrust`
- `github.com/sirupsen/logrus`
- `github.com/vishvananda/netlink`

`github.com/google/cadvisor`
- `github.com/euank/go-kmsg-parser`

`github.com/json-iterator/go`

Fixed https://github.com/kubernetes/kubernetes/issues/51832

```release-note
Fix journalctl leak on kubelet restart
Fix container memory rss
Add hugepages monitoring support
Fix incorrect CPU usage metrics with 4.7 kernel
Add tmpfs monitoring support
```
2017-09-05 17:30:06 -07:00
Kubernetes Submit Queue 8b9e8cf80a Merge pull request #51744 from jiayingz/deviceplugin-checkpoint
Automatic merge from submit-queue (batch tested with PRs 50072, 51744)

Deviceplugin checkpoint

**What this PR does / why we need it**:
Extends on top of PR 51209 to checkpoint device to pod allocation information on Kubelet to recover from Kubelet restarts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-05 13:33:01 -07:00
David Ashpole e5a6a79fd7 update cadvisor, docker, and runc godeps 2017-09-05 12:38:57 -07:00
Jiaying Zhang 3b2bc58c11 Extends device_plugin_handler to checkpoint device to container allocation information. 2017-09-05 09:52:14 -07:00
Derek Carr 38d5dee677 Node validation restricts pre-allocated hugepages to single page size 2017-09-05 10:34:30 -04:00
Derek Carr 1ec2a69d9a Kubelet changes to support hugepages 2017-09-05 09:46:08 -04:00
Rohit Agarwal 08ea02b9a5 Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.
This was broken in #45724
2017-09-04 21:48:55 -07:00
Balaji Subramaniam 5b5958ecec Add tests for the static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle d0bcbbb437 Added static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle e03a6435bb Added cpu assignment helpers. 2017-09-04 07:24:59 -07:00
Szymon Scharmach 242439c9d7 Add topology helper and tests to cpumanager. 2017-09-04 07:24:59 -07:00
Connor Doyle e4d5565228 Fix Start signature in container_manager_windows. 2017-09-04 07:24:59 -07:00
Connor Doyle 81ccd396d7 Fixed nil InternalContainerLifecycle in cm stubs. 2017-09-04 07:24:59 -07:00
Connor Doyle ec706216e6 Un-revert "CPU manager wiring and `none` policy"
This reverts commit 8d2832021a.
2017-09-04 07:24:59 -07:00
Jiaying Zhang 29d178fbc3 Fixes a cross-build failure introduced in PR 51209. FYI, issue 51863. 2017-09-02 21:56:39 -07:00