Commit Graph

435 Commits (65de86e72fea7e7bb2967521c2001f6c151ddebd)

Author SHA1 Message Date
xiangpengzhao 1f2262e6b0 Move some kubelet constants to a common place. 2017-12-01 11:24:04 +08:00
tianshapjq 0cc6a4d937 new testcase to cgroup_manager_linux.go 2017-11-30 14:14:59 +08:00
Szymon Scharmach 552e4d3a9d Cpu manager reconclie loop can restore state 2017-11-27 11:22:21 +01:00
vikaschoudhary16 de358fb21f Use file store utility for device plugin check-pointing 2017-11-24 08:41:11 -05:00
Rohit Agarwal 4b216f7cd9 Remove redundant code in container manager.
- Reuse stub implementations from unsupported implementations.
- Delete test file that didn't contain any tests.
2017-11-24 03:15:55 -08:00
Connor Doyle 4f185e6b7f CPU Manager panics on state initialization error.
- Update unit tests accordingly.
- Minor related cleanup in state_file.go
2017-11-22 10:25:38 -08:00
Jing Xu a66ee2eb3f Add pod-level metric for CPU and memory stats
This PR adds the pod-level metrics for CPU and memory stats. cAdvisor
can get all pod cgroup information so we can add this pod-level CPU and
memory stats information from the corresponding pod cgroup
2017-11-22 09:25:23 -08:00
Jiaying Zhang 048bafdd0b Adds device plugin registration count metric and allocation latency metric. 2017-11-21 13:44:10 -08:00
Jiaying Zhang 1eb4e79453 Extends deviceplugin to gracefully handle full device plugin lifecycle.
- Instead of using cm.capacity field to communicate device plugin resource
capacity, this PR changes to use an explicit cm.GetDevicePluginResourceCapacity()
function that returns device plugin resource capacity as well as any inactive
device plugin resource. Kubelet syncNodeStatus call this function during its
periodic run to update node status capacity and allocatable. After this call,
device plugin can remove the inactive device plugin resource from its allDevices
field as the update is already pushed to API server.
- Extends device plugin checkpoint data to record registered resources
so that we can finish resource removing even upon kubelet restarts.
- Passes sourcesReady from kubelet to device plugin to avoid removing
inactive pods during grace period of kubelet restart.
2017-11-20 23:40:14 -08:00
Niklas Q. Nielsen b16bfc768d Merging handler into manager API 2017-11-20 21:37:46 +00:00
Kubernetes Submit Queue 0b1d023aa7
Merge pull request #55884 from mpolednik/dpi-race-fix
Automatic merge from submit-queue (batch tested with PRs 55839, 54495, 55884, 55983, 56069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

deviceplugin: fix race when multiple plugins are registered

**What this PR does / why we need it**:
When registering multiple device plugins to Kubelet concurrently, there exists a race that crashes the Kubelet.

Consider two plugins: D1 and D2. The call order method is roughly

D1 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D1).callback
D2 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D2).callback

The callback function accesses HandlerImpl's allDevices map that maps (resourceName -> DeviceID). If both plugins reach these accesses at the same time, Kubelet crashes with "fatal error: concurrent map read and map write".

This can be solved by making sure handler is locked when allDevices are being updated. The functionality is needed to avoid Kubelet crashes when multiple device plugins are trying to register with Kubelet at the same moment. Occurs frequently when single binary tries to register itself as multiple plugins.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-20 13:08:09 -08:00
Kubernetes Submit Queue 869b5ab191
Merge pull request #55841 from ConnorDoyle/cpuman-file-state-for-none-policy
Automatic merge from submit-queue (batch tested with PRs 55841, 55948, 55945). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

CPU Manager: file state for all policies

**What this PR does / why we need it**:

Before this change, the new file-backed state was only enabled for the static CPU manager policy. This patch enables persistent state for all policies.

This PR fixes #55736 and the potential CPU resource leak described in that issue.

**Release note**:

```release-note
NONE
```

/kind bug
/sig node
/assign @balajismaniam
2017-11-18 14:10:12 -08:00
Kubernetes Submit Queue c60b35bcd3
Merge pull request #52977 from yanxuean/improvecgroup
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve kubelet cgroup

**What this PR does / why we need it**:
1.Use arg cgroupRoot,not nodeConfig.CgroupRoot
    Using both arg cgroupRoot and nodeConfig.CgroupRoot is confused in function NewQOSContainerManager
2.improve cgroupmanager in qosContainerManager
3. improve arg "cgroupRoot" type in NewQOSContainerManager

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-11-18 13:13:28 -08:00
Michael Taufen 1085b6f730 Lift embedded structure out of eviction-related KubeletConfiguration fields
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
  - `EvictionHard`
  - `EvictionSoft`
  - `EvictionSoftGracePeriod`
  - `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.

Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.

Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
2017-11-16 18:35:13 -08:00
Martin Polednik 6e3f8f3890 deviceplugin: fix race when multiple plugins are registered
Signed-off-by: Martin Polednik <mpolednik@redhat.com>
2017-11-16 15:20:00 +01:00
Connor Doyle c95ee34234 Use file-backed state for all cpumanager policies
- Add unit test to verify policy name mismatch behavior.
2017-11-15 22:38:11 -08:00
Kubernetes Submit Queue e99544d018
Merge pull request #54409 from intelsdi-x/cpu-enable-state-file
Automatic merge from submit-queue (batch tested with PRs 55764, 55683, 55468, 54409, 55546). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable file back state in static policy

**What this PR does / why we need it**:
Enables file back `State` in `static policy` and cpu manager + tests.
Upon policy start, state read from file is validated whether it meets the policy assumption. In case of any error, state is cleared.

Previous PR: #54408
Next PR: #54409
2017-11-15 22:16:05 -08:00
Kubernetes Submit Queue 6f35d49079
Merge pull request #52149 from lichuqiang/combineListwatch
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Deviceplugin refactoring: merge func list and listwatch in endpoint into one

**What this PR does / why we need it**:
merge func list and listwatch in endpoint into one, since we won't call list func individually

**Which issue this PR fixes**
fixes #51993
Part2

**Special notes for your reviewer**:
/cc @jiayingz @RenaudWasTaken @vishh

**Release note**:

```release-note
NONE
```
2017-11-15 16:56:51 -08:00
Jiaying Zhang 93916242f7 Adds jiayingz@ and vish@ as approvers for pkg/kubelet/cm/deviceplugin/. 2017-11-14 15:27:02 -08:00
Michał Stachowski 809ac834a0 Cpu manager file state tests 2017-11-14 18:26:41 +01:00
Szymon Scharmach 7e7301ffaf Enable file state in static policy 2017-11-14 18:25:58 +01:00
lichuqiang 4fa0fa5ad1 pass devices of previous endpoint into re-registered one to avoid potential orphaned devices upon re-registration 2017-11-14 16:43:19 +08:00
Kubernetes Submit Queue e2c02f425a
Merge pull request #53970 from ScorpioCPH/add-more-comments
Automatic merge from submit-queue (batch tested with PRs 55283, 55461, 55288, 53970, 55487). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more comments for DevicePluginHandlerImpl struct

**What this PR does / why we need it**:

Add more comments

**Special notes for your reviewer**:

@jiayingz PTAL.

**Release note**:

```
NONE
```
2017-11-13 12:32:27 -08:00
Dr. Stefan Schimanski bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski 012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Clayton Coleman 66590d6f83
Container manager has a bad fake interface 2017-11-03 22:21:29 -04:00
Penghao Cen 1d4e1942d8 Add more comments for HandlerImpl struct 2017-11-03 18:24:32 +08:00
Kubernetes Submit Queue 2084f7f4f3
Merge pull request #54488 from lichuqiang/plugin_base
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add admission handler for device resources allocation

**What this PR does / why we need it**:
Add admission handler for device resources allocation to fail fast during pod creation

**Which issue this PR fixes** 
fixes #51592

**Special notes for your reviewer**:
@jiayingz Sorry, there is something wrong with my branch in #51895. And I think the existing comments in the PR might be too long for others to view. So I closed it and opened the new one, as we have basically reach an agreement on the implement :)
I have covered the functionality and unit test part here, and would set about the e2e part ASAP

/cc @jiayingz @vishh @RenaudWasTaken 

**Release note**:

```release-note
NONE
```
2017-11-02 17:24:06 -07:00
lichuqiang 0630896383 update unit test for plugin resources allocation reinforcement 2017-11-02 09:18:24 +08:00
lichuqiang ebd445eb8c add admission handler for device resources allocation 2017-11-02 09:17:48 +08:00
Shawn Hsiao f7a15cb751 set leveled logging (v=4) for 'updating container' message 2017-11-01 16:54:23 -04:00
Kubernetes Submit Queue 94e77bd4ca Merge pull request #54408 from intelsdi-x/cpu-state-file
Automatic merge from submit-queue (batch tested with PRs 54656, 54552, 54389, 53634, 54408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add file backed state to cpu manager

**What this PR does / why we need it**:
Adds file backed `State` implementation to cpu manger with tests.
Reads from `State` are done from memory, while each write triggers state save to a file.

Any failure in reading the state file results in empty state

Next PR: #54409
2017-10-26 21:08:38 -07:00
Rohit Agarwal 092429be1c Better error messages and logging while registering device plugins. 2017-10-26 15:17:38 -07:00
Michał Stachowski 97e3f7bf86 State file test fixes 2017-10-26 20:03:35 +02:00
Szymon Scharmach 4ee0adc77a Added Cpu Manager file state 2017-10-26 20:03:17 +02:00
lichuqiang 6a39ac3874 merge func list and listwatch into one 2017-10-26 16:36:16 +08:00
Jiaying Zhang e501f01d85 Move podDevices code into a separate file. 2017-10-24 17:48:59 -07:00
Jiaying Zhang ff4e8d429e Device plugin code refactoring to cope with file move.
While moving device_plugin_handler_test.go from pkg/kubelet/cm/ to
pkg/kubelet/cm/deviceplugin/, we can no longer uses cm in its tests
because that would cause a cycle dependency. To solve this problem,
I moved the main cm GetResources functionality as well as part of the
current device plugin handler Allocate functionality into a new device
plugin handler function, GetDeviceRunContainerOptions(). This
refactoring is also needed by another PR 51895 that moves device
allocation into admission phase. Now device plugin handler Allocate()
first checks whether there is cached device runtime state and only
issues Allocate grpc call if there is no cached state available.
The new GetDeviceRunContainerOptions() function simply returns device
runtime config from the cached state. To support this change, extended the
podDevices struct and checkpoint data structure with device runtime state.
2017-10-24 14:38:15 -07:00
Jiaying Zhang 796f488789 Move device plugin related files under pkg/kubelet/cm/deviceplugin/. 2017-10-24 14:17:20 -07:00
lichuqiang fd8b04649e unnecessary functions cleanup for deviceplugin 2017-10-20 09:37:59 +08:00
Vishnu kannan 16b0363b95 Disabling k8s.io/kubernetes/pkg/kubelet/cm TestPodContainerDeviceAllocation due to #54100
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan e0032af916 bump device plugin version to v1alpha2 to reflect the change to AllocateResponce API
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan 18eee1eaa0 Make AllocateResponse artifacts global across all devices per container in device plugin API
There is no use case known for passing artifacts per device as it currently exists. The current API is also
complex to use for simple clients. Hence this PR creates a flat namespace where artifacts like environment variables
and mount points apply globally to all devices returned as part of AllocateResponse proto.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:34:00 -07:00
Kubernetes Submit Queue 1d8f1e268f Merge pull request #47699 from supereagle/fix-typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typos: remove duplicated word in comments

**What this PR does / why we need it**: Remove the duplicated word `the` in comments

**Which issue this PR fixes** : fixes #

**Special notes for your reviewer**:

```release-note
NONE
```
2017-10-17 02:35:52 -07:00
Jeff Grafton aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Kubernetes Submit Queue 3deab69d3b Merge pull request #53790 from yanxuean/cgroupredundancy
Automatic merge from submit-queue (batch tested with PRs 52959, 53790). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove redundancy code in setCPUCgroupConfig

fix #53925

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>



**What this PR does / why we need it**:

The check of burstableCPUShares is redundancy. We have done it in MilliCPUToShares. It is responsibility of MilliCPUToShares.
```
func (m *qosContainerManagerImpl) setCPUCgroupConfig(configs map[v1.PodQOSClass]*CgroupConfig) error {
        ........
	// set burstable shares based on current observe state
	burstableCPUShares := MilliCPUToShares(burstablePodCPURequest)
	if burstableCPUShares < uint64(MinShares) {
		burstableCPUShares = uint64(MinShares)
	}
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Improveing code.

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-13 19:19:32 -07:00
yanxuean 5d5fee8cab capitalize the first letter
capitalize the first letter for the field comment of containerManagerImpl

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-13 14:54:06 +08:00
Kubernetes Submit Queue 03adf92aa9 Merge pull request #53753 from derekwaynecarr/log-spam
Automatic merge from submit-queue (batch tested with PRs 53119, 53753, 53795, 52981). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce log spam in qos container manager

**What this PR does / why we need it**:
excessive log stmts make it hard to debug actual problems.

**Release note**:
```release-note
NONE
```
2017-10-12 08:28:36 -07:00
yanxuean 8adb2181eb remove redundancy code in setCPUCgroupConfig
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-12 18:42:18 +08:00
Derek Carr 328a12d160 Reduce log spam in qos container manager 2017-10-11 19:47:40 -04:00
Euan Kemp 7aa88b5103 kubelet/cm: remove unneeded fork of 'cat'
Reading a file in Go is perfectly possible without invoking cat.

I also removed an outdated comment.
2017-10-10 21:53:35 -07:00
Kubernetes Submit Queue ec116fdc73 Merge pull request #53328 from intelsdi-x/lscpu_fix
Automatic merge from submit-queue (batch tested with PRs 53297, 53328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cpu Manager - make CoreID's platform unique

**What this PR does / why we need it**:
Cpu Manager uses topology from cAdvisor(`/proc/cpuinfo`) where coreID's are socket unique - not platform unique - this causes problems on multi-socket platforms.

All code assumes unique coreID's (on platform) -  `Discovery` function has been changed to assign CoreID as the lowest cpuID from all cpus belonging to the same core. This can be expressed as:
`CoreID=min(cpuID's on the same core)`

Since cpuID's are platform unique - above gives us guarantee that CoreID's will also be platform unique.



**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53323
2017-10-10 11:20:37 -07:00
Kubernetes Submit Queue aaf14d4619 Merge pull request #53525 from sttts/sttts-scheme-copier-romoval
Automatic merge from submit-queue (batch tested with PRs 53525, 53652). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apimachinery: remove ObjectCopier interface(s)

The big commit is a mechanical, transitive removal of the copier interfaces in all structs and function calls.
2017-10-10 08:31:41 -07:00
Szymon Scharmach b86dc9c054 Make CoreID's platform unique 2017-10-10 10:45:44 +02:00
Kubernetes Submit Queue c12dab37e7 Merge pull request #53547 from jiayingz/deviceplugin-fix
Automatic merge from submit-queue (batch tested with PRs 52662, 53547, 53588, 53573, 53599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

In DevicePluginHandlerImpl.Allocate(), skips untracked extended resou…

…rces.

Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53548

**Special notes for your reviewer**:

**Release note**:

```release-note
Ignore extended resources that are not registered with kubelet
```
2017-10-09 12:51:17 -07:00
Kubernetes Submit Queue 85b252d47e Merge pull request #51771 from dixudx/refactor_nsenter
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor nsenter

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51273

**Special notes for your reviewer**:
/assign @jsafrane 

**Release note**:

```release-note
None
```
2017-10-08 23:27:32 -07:00
Dr. Stefan Schimanski ecb65a6a71 Update generated files 2017-10-07 11:28:47 +02:00
Jiaying Zhang ee1ffa619b In DevicePluginHandlerImpl.Allocate(), skips untracked extended resources.
Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.
2017-10-06 13:57:53 -07:00
Dr. Stefan Schimanski ed586da147 apimachinery: remove Scheme.DeepCopy 2017-10-06 14:59:17 +02:00
Kubernetes Submit Queue 5e2ce3aaf2 Merge pull request #53122 from resouer/fix-cpu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eliminate extra CRI call during processing cpu set

**What this PR does / why we need it**:

Encountered this during `kubernetes/frakti` node e2e test.

When cpuset is not set, there's still plenty of `runtime.UpdateContainerResources` been called, which seems unnecessary.

cc @ConnorDoyle Make sense? Fixes: #53304

**Special notes for your reviewer**:

**Release note**:

```release-note
Only do UpdateContainerResources when cpuset is set 
```
2017-10-01 15:30:56 -07:00
Harry Zhang 282973d87d Elimenate extra CRI call 2017-09-30 16:51:32 +08:00
Kubernetes Submit Queue 6fcf841d69 Merge pull request #52692 from wackxu/fbc
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the bad code comment and make the format unify

**What this PR does / why we need it**:

Fix the bad code comment and make the format unify

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```release-note
NONE
```
2017-09-28 21:15:43 -07:00
Di Xu 57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
NickrenREN 7f9696201e Fix --kube-reserved storage key name and add test cases for node allocatable reservation 2017-09-26 09:32:21 +08:00
yanxuean f011c044d4 improve cgroupmanager in qosContainerManager
improve arg "cgroupRoot" type in NewQOSContainerManager

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-09-25 16:59:15 +08:00
yanxuean 45146cff4e Use arg cgroupRoot,not nodeConfig.CgroupRoot
Using both arg cgroupRoot and nodeConfig.CgroupRoot is confused in function NewQOSContainerManager

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-09-25 15:19:20 +08:00
wackxu d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
supereagle 87c29a08e1 fix typos: remove duplicated word in comments 2017-09-16 14:38:10 +08:00
Balaji Subramaniam e2cb80db4a Added large topology tests for static policy in CPU Manager.
- Added comments for tests cases.
2017-09-06 13:15:22 -07:00
Kubernetes Submit Queue dcc1aa0628 Merge pull request #51928 from mindprince/pr-45724-fix-build
Automatic merge from submit-queue

Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.

This was broken in #45724

**Release note**:
```release-note
NONE
```
/sig storage
/sig node

/cc @jsafrane, @vishh
2017-09-05 19:44:54 -07:00
Kubernetes Submit Queue 99aa992ce8 Merge pull request #51751 from dashpole/update_cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Update Cadvisor Dependency

Fixes: https://github.com/kubernetes/kubernetes/issues/51832
This is the worst dependency update ever... 
The root of the problem is the [name change of Sirupsen -> sirupsen](https://github.com/sirupsen/logrus/issues/570#issuecomment-313933276).  This means that in order to update cadvisor, which venders the lowercase, we need to update all dependencies to use the lower-cased version.  With that being said, this PR updates the following packages:

`github.com/docker/docker`
- `github.com/docker/distribution`
  - `github.com/opencontainers/go-digest`
  - `github.com/opencontainers/image-spec`
  - `github.com/opencontainers/runtime-spec`
  - `github.com/opencontainers/selinux`
  - `github.com/opencontainers/runc`
    - `github.com/mrunalp/fileutils`
  - `golang.org/x/crypto`
    - `golang.org/x/sys`
- `github.com/docker/go-connections`
- `github.com/docker/go-units`
- `github.com/docker/libnetwork`
- `github.com/docker/libtrust`
- `github.com/sirupsen/logrus`
- `github.com/vishvananda/netlink`

`github.com/google/cadvisor`
- `github.com/euank/go-kmsg-parser`

`github.com/json-iterator/go`

Fixed https://github.com/kubernetes/kubernetes/issues/51832

```release-note
Fix journalctl leak on kubelet restart
Fix container memory rss
Add hugepages monitoring support
Fix incorrect CPU usage metrics with 4.7 kernel
Add tmpfs monitoring support
```
2017-09-05 17:30:06 -07:00
Kubernetes Submit Queue 8b9e8cf80a Merge pull request #51744 from jiayingz/deviceplugin-checkpoint
Automatic merge from submit-queue (batch tested with PRs 50072, 51744)

Deviceplugin checkpoint

**What this PR does / why we need it**:
Extends on top of PR 51209 to checkpoint device to pod allocation information on Kubelet to recover from Kubelet restarts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-05 13:33:01 -07:00
David Ashpole e5a6a79fd7 update cadvisor, docker, and runc godeps 2017-09-05 12:38:57 -07:00
Jiaying Zhang 3b2bc58c11 Extends device_plugin_handler to checkpoint device to container allocation information. 2017-09-05 09:52:14 -07:00
Derek Carr 38d5dee677 Node validation restricts pre-allocated hugepages to single page size 2017-09-05 10:34:30 -04:00
Derek Carr 1ec2a69d9a Kubelet changes to support hugepages 2017-09-05 09:46:08 -04:00
Rohit Agarwal 08ea02b9a5 Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.
This was broken in #45724
2017-09-04 21:48:55 -07:00
Balaji Subramaniam 5b5958ecec Add tests for the static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle d0bcbbb437 Added static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle e03a6435bb Added cpu assignment helpers. 2017-09-04 07:24:59 -07:00
Szymon Scharmach 242439c9d7 Add topology helper and tests to cpumanager. 2017-09-04 07:24:59 -07:00
Connor Doyle e4d5565228 Fix Start signature in container_manager_windows. 2017-09-04 07:24:59 -07:00
Connor Doyle 81ccd396d7 Fixed nil InternalContainerLifecycle in cm stubs. 2017-09-04 07:24:59 -07:00
Connor Doyle ec706216e6 Un-revert "CPU manager wiring and `none` policy"
This reverts commit 8d2832021a.
2017-09-04 07:24:59 -07:00
Jiaying Zhang 29d178fbc3 Fixes a cross-build failure introduced in PR 51209. FYI, issue 51863. 2017-09-02 21:56:39 -07:00
Kubernetes Submit Queue 917f9f02ef Merge pull request #45724 from jsafrane/mount-propagation2
Automatic merge from submit-queue

Make /var/lib/kubelet as shared during startup

This is part of ~~https://github.com/kubernetes/community/pull/589~~ https://github.com/kubernetes/community/pull/659

We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that `/var/lib/kubelet` has shared mount propagation to be able to see these mounts. 

On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that `/var/lib/kubelet` is a directory on `/` which has private mount propagation and kubelet bind-mounts `/var/lib/kubelet` as rshared.

Both "regular" linux mounter and `NsenterMounter` are updated here.

@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews 
@vishh 

Release note:
```release-note
Kubelet re-binds /var/lib/kubelet directory with rshared mount propagation during startup if it is not shared yet.
```
2017-09-02 12:00:30 -07:00
Jiaying Zhang 02001af752 Kubelet side extension to support device allocation 2017-09-01 11:56:35 -07:00
Renaud Gaubert c4a1c97329 Device Plugin Kubelet integration 2017-09-01 11:47:09 -07:00
Shyam JVS 8d2832021a Revert "CPU manager wiring and `none` policy" 2017-09-01 18:17:36 +02:00
Connor Doyle 50674ec614 Added cpu-manager-reconcile-period config.
- Defaults to sync-frequency.
2017-08-30 23:42:32 -07:00
Connor Doyle 7c6e31617d CPU Manager initialization and lifecycle calls. 2017-08-30 08:50:41 -07:00
Connor Doyle 5dee682796 CPU manager config and feature gate. 2017-08-30 08:27:23 -07:00
Balaji Subramaniam 7567f1765f Added CPU manager unit tests (none policy) 2017-08-30 08:26:22 -07:00
Seth Jennings ff471913f9 Added none policy for CPU manager. 2017-08-30 08:26:21 -07:00
Connor Doyle 01d1d8f23f Added in-memory CPU manager state. 2017-08-30 08:26:21 -07:00
Jan Safranek d9500105d8 Share /var/lib/kubernetes on startup
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
2017-08-30 16:45:04 +02:00
Connor Doyle 726bd8e27b Add CPU manager interfaces. 2017-08-29 03:42:17 -07:00
Kubernetes Submit Queue 98fb8cacf9 Merge pull request #50773 from huzhengchuan/bug/50770
Automatic merge from submit-queue (batch tested with PRs 51391, 51338, 51340, 50773, 49599)

Delete "hugetlb" from whitelistControllers

**What this PR does / why we need it**:
Delete "hugetlb" from whitelistControllers

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50770

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
2017-08-26 08:49:26 -07:00
NickrenREN 27901ad5df Change eviction policy to manage one single local storage resource 2017-08-26 05:14:49 +08:00
Connor Doyle 515d86faa0 Add CPUSetBuilder, make CPUSet immutable. 2017-08-22 22:33:04 -07:00
Connor Doyle e686ecb6ea Renamed CPUSet.AsSlice() => CPUSet.ToSlice() 2017-08-22 21:21:26 -07:00
Connor Doyle 8f38abb350 Add cpuset helper library. 2017-08-22 11:42:01 -07:00
Kubernetes Submit Queue d2cf96d6ef Merge pull request #48057 from NickrenREN/fix-validateNodeAllocatable
Automatic merge from submit-queue (batch tested with PRs 50758, 48057)

Fix node allocatable resource validation

GetNodeAllocatableReservation gets all the reserved resource value
Allocatable resource = capacity - reservation


**Release note**:

```release-note
NONE
```
2017-08-16 07:57:24 -07:00
zhengchuan hu 938bffcb04 Delete "hugetlb" from whitelistControllers 2017-08-16 22:52:56 +08:00
Michael Taufen 24bab4c20f move KubeletConfiguration out of componentconfig API group 2017-08-15 08:12:42 -07:00
NickrenREN eadb7ca8c0 Fix node allocatable resource validation
GetNodeAllocatableReservation gets all the reserved resource, and we need to compare it with capacity
2017-08-14 10:20:40 +08:00
Jeff Grafton a7f49c906d Use buildozer to delete licenses() rules except under third_party/ 2017-08-11 09:32:39 -07:00
Jeff Grafton 33276f06be Use buildozer to remove deprecated automanaged tags 2017-08-11 09:31:50 -07:00
Jeff Grafton cf55f9ed45 Autogenerate BUILD files 2017-08-11 09:30:23 -07:00
Kubernetes Submit Queue fae79dd4b4 Merge pull request #47181 from dims/fail-on-swap-enabled
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)

Fail on swap enabled and deprecate experimental-fail-swap-on flag

**What this PR does / why we need it**:

    * Deprecate the old experimental-fail-swap-on
    * Add a new flag fail-swap-on and set it to true

    Before this change, we would not fail when swap is on. With this
    change we fail for everyone when swap is on, unless they explicitly
    set --fail-swap-on to false.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

Fixes #34726

**Special notes for your reviewer**:

**Release note**:

```release-note
Kubelet will by default fail with swap enabled from now on. The experimental flag "--experimental-fail-swap-on" has been deprecated, please set the new "--fail-swap-on" flag to false if you wish to run with /proc/swaps on.
```
2017-08-04 14:29:36 -07:00
Davanum Srinivas 71e8c8eba4 Fail on swap enabled and deprecate experimental-fail-swap-on flag
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true

Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
2017-08-02 16:20:01 -04:00
zhengchuan hu 1e2ac80c75 Fix some typos 2017-07-27 21:31:31 +08:00
Kubernetes Submit Queue 9350afd772 Merge pull request #48976 from supereagle/cleanup-api-package
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)

Remove duplicated import and wrong alias name of api package

**What this PR does / why we need it**:

**Which issue this PR fixes**: fixes #48975

**Special notes for your reviewer**:
/assign @caesarxuchao

**Release note**:
```release-note
NONE
```
2017-07-25 12:14:38 -07:00
Kubernetes Submit Queue e623fed778 Merge pull request #48636 from jingxu97/July/allocatable
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)

Fix issues for local storage allocatable feature

This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.

This PR fixes issue #47809
2017-07-24 19:30:33 -07:00
supereagle adc0eef43e remove duplicated import and wrong alias name of api package 2017-07-25 10:04:25 +08:00
xiangpengzhao 01daf707c5 Refactor: pkg/util into sub-pkgs 2017-07-18 14:34:08 +08:00
Jing Xu bb1920edcc Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
2017-07-13 12:06:19 -07:00
Kubernetes Submit Queue dbb42838db Merge pull request #48567 from jingxu97/July/getcapacity
Automatic merge from submit-queue (batch tested with PRs 47232, 48625, 48613, 48567, 39173)

Fix issue when setting fileysystem capacity in container manager

In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
This PR fixes issue #48452
2017-07-12 00:10:18 -07:00
Kubernetes Submit Queue 03360d7b65 Merge pull request #48402 from ianchakeres/local-storage-teardown-fix
Automatic merge from submit-queue

Local storage teardown fix

**What this PR does / why we need it**: Local storage uses bindmounts and the method IsLikelyNotMountPoint does not detect these as mountpoints. Therefore, local PVs are not properly unmounted when they are deleted.

**Which issue this PR fixes**: fixes #48331

**Special notes for your reviewer**:

You can use these e2e tests to reproduce the issue and validate the fix works appropriately https://github.com/kubernetes/kubernetes/pull/47999

The existing method IsLikelyNotMountPoint purposely does not check mountpoints reliability (4c5b22d4c6/pkg/util/mount/mount_linux.go (L161)), since the number of mountpoints can be large. 4c5b22d4c6/pkg/util/mount/mount.go (L46)

This implementation changes the behavior for local storage to detect mountpoints reliably, and avoids changing the behavior for any other callers to a UnmountPath.

**Release note**:

```
Fixes bind-mount teardown failure with non-mount point Local volumes (issue https://github.com/kubernetes/kubernetes/issues/48331).
```
2017-07-11 20:35:29 -07:00
Ian Chakeres 2b18d3b6f7 Fixes bind-mount teardown failure with non-mount point Local volumes
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
2017-07-11 17:19:58 -04:00
Jing Xu 9606a54049 Fix issue when setting fileysystem capacity in container manager
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
2017-07-10 16:43:18 -07:00
Xing Zhou 37f9e13025 Remove useless error 2017-07-03 14:59:54 +08:00
xiangpengzhao 53c536b59c
Implement GetCapacity in container_manager_unsupported 2017-06-29 10:22:57 +08:00
Vishnu kannan 82f7820066 Kubelet:
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.

This patch fixes a bug in container manager where it was writing to a map without synchronization.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-06-27 18:45:02 -07:00
Chao Xu 60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu f2d3220a11 run root-rewrite-import-client-go-api-types 2017-06-22 11:30:59 -07:00
Chao Xu f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Kubernetes Submit Queue 69342bd1df Merge pull request #43005 from cmluciano/cml/consolidatesysctl
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)

Consolidate sysctl commands for kubelet

**What this PR does / why we need it**:
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #26005 

**Special notes for your reviewer**:
I had a difficult time writing tests for this. It is trivial to create a fake sysctl for testing, but the Kubelet does not have any tests for the prior settings.
**Release note**:

```release-note
```
2017-06-07 13:30:54 -07:00
Jing Xu 943fc53bf7 Add predicates check for local storage request
This PR adds the check for local storage request when admitting pods. If
the local storage request exceeds the available resource, pod will be
rejected.
2017-06-01 15:57:50 -07:00
Jing Xu dd67e96c01 Add local storage (scratch space) allocatable support
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
2017-06-01 15:57:50 -07:00
zhengjiajin 9d68ae5963 Fix naming and comments in Container Manage 2017-05-26 10:53:09 +08:00
Klaus Ma 83b7f77ee2 Moved qos to api.helpers. 2017-05-20 07:17:57 -04:00
Xing Zhou 22ab45b575 While calculating pod's cpu limits, need to count in init-container.
Need to count in init-container when calculating a pod's cpu limits.
Otherwise, may cause pod start failure due to "invalid argument"
error while trying to write "cpu.cfs_quota_us" file.
2017-05-19 12:31:27 +08:00
Kubernetes Submit Queue 873ce9ca4a Merge pull request #45515 from derekwaynecarr/ignore-openrc
Automatic merge from submit-queue (batch tested with PRs 45515, 45579)

Ignore openrc cgroup

**What this PR does / why we need it**:
It is a work-around for the following: https://github.com/opencontainers/runc/issues/1440

**Special notes for your reviewer**:
I am open to a cleaner way to do this, but we have many developer users on Macs that ran containerized kubelets that are not able to run them right now due to the inclusion of openrc tripping up our existence checks.  Ideally, runc can give us a call to say "does this exist according to what runc knows about".  Or we could add a whitelist check.  Right now, this was the smallest hack pending more discussion.
2017-05-10 23:20:40 -07:00
Yu-Ju Hong daa329c9ae Remove the deprecated `--enable-cri` flag
Except for rkt, CRI is the default and only integration point for
container runtimes.
2017-05-10 13:03:41 -07:00
Derek Carr 4e002eacb1 Do not fail cgroup exists checks for unknown controllers 2017-05-10 14:52:09 -04:00
Kubernetes Submit Queue b5831357dc Merge pull request #45305 from jwforres/fix-error-msg-spelling
Automatic merge from submit-queue (batch tested with PRs 43006, 45305, 45390, 45412, 45392)

Fix spelling in container manager error message
2017-05-05 16:39:06 -07:00
Jessica Forrester bd64b3b15c
Fix spelling in container manager error message 2017-05-03 16:08:16 -04:00
Christopher M. Luciano bafabcbb44
Consolidate sysctl commands for kubelet
These commands are important enough to be in the Kubelet itself.
By default, Ubuntu 14.04 and Debian Jessie have these set to 200 and
20000. Without this setting, nodes are limited in the number of
containers that they can start.
2017-05-02 12:15:01 -07:00
Manjunath A Kumatagi f8063879a0 Use Docker API Version instead of docker version 2017-04-27 10:05:22 -04:00
Klaus Ma 6d29cfc0cc Registered node before other initialization. 2017-04-18 10:43:56 +08:00
Chao Xu 4f9591b1de move pkg/api/v1/ref.go and pkg/api/v1/resource.go to subpackages. move some functions in resource.go to pkg/api/v1/node and pkg/api/v1/pod 2017-04-17 11:38:11 -07:00
Mike Danese a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
Seth Jennings ebb1243aba refactor getPidsForProcess and change error handling 2017-03-28 11:34:49 -05:00
Random-Liu e6341cc3c7 Fix kubelet panic in cgroup manager. 2017-03-13 12:06:08 -07:00
Kubernetes Submit Queue 31db570a00 Merge pull request #42497 from derekwaynecarr/lower_cgroup_names
Automatic merge from submit-queue

cgroup names created by kubelet should be lowercased

**What this PR does / why we need it**:
This PR modifies the kubelet to create cgroupfs names that are lowercased.  This better aligns us with the naming convention for cgroups v2 and other cgroup managers in ecosystem (docker, systemd, etc.)

See: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
"2-6-2. Avoid Name Collisions"

**Special notes for your reviewer**:
none

**Release note**:
```release-note
kubelet created cgroups follow lowercase naming conventions
```
2017-03-06 20:43:03 -08:00
Derek Carr 48d822eafe cgroup names created by kubelet should be lowercased 2017-03-06 11:19:21 -05:00
Seth Jennings ccd87fca3f kubelet: add cgroup manager metrics 2017-03-06 08:53:47 -06:00
Seth Jennings cc50aa9dfb kubelet: enable qos-level memory request reservation 2017-03-02 15:04:13 -06:00
Derek Carr 1947e76e91 Set Burstable QOS Cgroup cpu.shares 2017-03-01 14:51:34 -05:00
Seth Jennings b9adb66426 kubelet: cm: refactor QoS logic into seperate interface 2017-02-28 09:19:29 -06:00
Derek Carr a7684569fb Fix get all pods from cgroups logic 2017-02-27 21:24:45 -08:00
Vishnu Kannan cc5f5474d5 add support for node allocatable phase 2 to kubelet
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
Vishnu Kannan 70e340b045 adding kubelet flags for node allocatable phase 2
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
Kubernetes Submit Queue 28a8d783e6 Merge pull request #41621 from derekwaynecarr/best-effort-qos-shares
Automatic merge from submit-queue

BestEffort QoS class has min cpu shares

**What this PR does / why we need it**:
BestEffort QoS class is given the minimum amount of CPU shares per the QoS design.
2017-02-26 06:32:43 -08:00
Derek Carr 43ae6f49ad Enable per pod cgroups, fix defaulting of cgroup-root when not specified 2017-02-21 16:34:22 -05:00
Derek Carr 7fe105ebc7 stop double encoding systemd style cgroup names 2017-02-21 16:34:21 -05:00
Derek Carr 9a1e30f776 BestEffort QoS class has min cpu shares 2017-02-20 12:28:00 -05:00
Derek Carr 04a909a257 Rename cgroups-per-qos flag to not be experimental 2017-02-03 17:10:53 -05:00
Dr. Stefan Schimanski 44ea6b3f30 Update generated files 2017-01-29 21:41:45 +01:00
Dr. Stefan Schimanski bc6fdd925d pkg/api/resource: move to apimachinery 2017-01-29 21:41:44 +01:00
Kubernetes Submit Queue b5929bfb2b Merge pull request #38789 from jessfraz/cleanup-temp-dirs
Automatic merge from submit-queue (batch tested with PRs 37228, 40146, 40075, 38789, 40189)

Cleanup temp dirs

So funny story my /tmp ran out of space running the unit tests so I am cleaning up all the temp dirs we create.
2017-01-20 12:34:58 -08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Seth Jennings 4c30459e49 switch from local qos types to api types 2017-01-10 10:54:30 -06:00
Jeff Grafton 20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Jess Frazelle db75904b42
fix when os.Remove should be os.RemoveAll
Signed-off-by: Jess Frazelle <acidburn@google.com>
2017-01-04 10:34:06 -08:00
Mike Danese 161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
Dan Winship c788793868 Port remaining code to pkg/util/version 2016-12-13 08:53:24 -05:00
Mike Danese c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
Chao Xu bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu 5e1adf91df cmd/kubelet 2016-11-23 15:53:09 -08:00
mdshuai 2189acdd4f [kubelet]update --cgroups-per-qos to --experimental-cgroups-per-qos 2016-11-15 15:55:47 +08:00
Tim St. Clair 3aaa6fca88
BUILD changes for cgroup pids 2016-11-10 13:08:39 -08:00
Tim St. Clair cb588e823c
Fix getting cgroup pids 2016-11-10 13:08:17 -08:00
Davanum Srinivas cf9e9505f3 Fix build break
Problem introduced in #31996

Fixes #36454
2016-11-08 14:23:33 -05:00
Michael Taufen 0c6c622434 Fail kubelet creation if swap enabled
Provides an opt-in flag, --experimental-fail-swap-on (and corresponding
KubeletConfiguration value, ExperimentalFailSwapOn), which is false by default.
2016-11-08 08:39:31 -08:00
Yu-Ju Hong dcce768a3e Rename experimental-runtime-integration-type to experimental-cri 2016-11-07 11:29:24 -08:00
Kubernetes Submit Queue 486a1ad3e4 Merge pull request #31707 from apprenda/windows_infra_container
Automatic merge from submit-queue

Initial work on running windows containers on Kubernetes

<!--  Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->

This is the first stab at getting the Kubelet running on Windows (fixes #30279), and getting it to deploy network-accessible pods that consist of Windows containers. Thanks @csrwng, @jbhurat for helping out.

The main challenge with Windows containers at this point is that container networking is not supported. In other words, each container in the pod will get it's own IP address. For this reason, we had to make a couple of changes to the kubelet when it comes to setting the pod's IP in the Pod Status. Instead of using the infra-container's IP, we use the IP address of the first container.

Other approaches we investigated involved "disabling" the infra container, either conditionally on `runtime.GOOS` or having a separate windows-docker container runtime that re-implemented some of the methods (would require some refactoring to avoid maintainability nightmare). 

Other changes:
- The default docker endpoint was removed. This results in the docker client using the default for the specific underlying OS.

More detailed documentation on how to setup the Windows kubelet can be found at https://docs.google.com/document/d/1IjwqpwuRdwcuWXuPSxP-uIz0eoJNfAJ9MWwfY20uH3Q. 

cc: @ikester @brendandburns @jstarks
2016-11-06 01:30:11 -07:00
Seth Jennings 05bb27023b fix cross build for kubelet/cm 2016-11-03 10:54:22 -05:00
derekwaynecarr 42289c2758 pod and qos level cgroup support 2016-11-02 08:07:04 -04:00
Paulo Pires 9e6815e7c7
Fixed kubelet build. 2016-11-01 16:34:47 -04:00
Yu-Ju Hong 87aaf4c0ac dockershim: move docker to the given cgruop
This change add a container manager inside the dockershim to move docker daemon
and associated processes to a specified cgroup. The original kubelet container
manager will continue checking the name of the cgroup, so that kubelet know how
to report runtime stats.
2016-11-01 11:39:20 -07:00
Alexander Brand 244152544c
Changes to kubelet to support win containers 2016-10-31 14:20:49 -04:00
Cesar Wong 09285864db
Initial windows container runtime 2016-10-31 14:20:49 -04:00
Mike Danese 3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
derekwaynecarr 62e1759ac0 update kubelet to look at all cgroup mounts 2016-10-10 14:24:18 -04:00
Vish Kannan a1fe3adbc7 Revert "Revert "[kubelet] Fix oom-score-adj policy in kubelet"" 2016-09-16 16:32:58 -07:00
Vish Kannan 492ca3bc9c Revert "[kubelet] Fix oom-score-adj policy in kubelet" 2016-09-15 19:28:59 -07:00
Vishnu kannan ba6feb2771 fix kubelet ignoring docker daemon in container feature
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-14 12:43:59 -07:00
Vishnu kannan e4acad7afb Fix oom-score-adj policy in kubelet.
Docker daemon and kubelet needs to be protected by setting oom-score-adj to -999.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-14 11:56:10 -07:00
Kubernetes Submit Queue c49d8360ec Merge pull request #31958 from ZTE-PaaS/zhangke-patch-034
Automatic merge from submit-queue

Redundant code process for container_mananger start

1. need not sum the total numEnsureStateFuncs
2. numEnsureStateFuncs should > 0, otherwise, calculate numEnsureStateFuncs would be not neccessary
2016-09-11 17:48:55 -07:00
Ke Zhang eca14886ac Redundant code process for container_mananger start 2016-09-06 12:56:54 +08:00
Kubernetes Submit Queue 06b6fb5729 Merge pull request #31489 from ZTE-PaaS/zhangke-patch-030
Automatic merge from submit-queue

optimize if-else of setupNode of container_manager_linix

make the code more readable
2016-09-05 17:35:09 -07:00
Ke Zhang 86163979f4 optimize if-else of setupNode of container_manager_linix 2016-08-26 10:30:39 +08:00
Justin Santa Barbara 2c103af2b6 Create testable implementation of sysctl
This is so we can test kubenet Init, which calls sysctl
2016-08-23 01:42:37 -04:00
dubstack 4ddfe172ce Add support for pod container management 2016-08-19 11:07:33 -04:00
Kubernetes Submit Queue 79ed7064ca Merge pull request #27970 from jingxu97/restartKubelet-6-22
Automatic merge from submit-queue

Add volume reconstruct/cleanup logic in kubelet volume manager

Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.

Fixes https://github.com/kubernetes/kubernetes/issues/27653
2016-08-15 13:48:43 -07:00
Jing Xu f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Kubernetes Submit Queue 96655d7578 Merge pull request #30087 from dims/remove-pkill-dependency
Automatic merge from submit-queue

Remove kubelet pkill dependency

Issue #26093 identified pkill as one of the dependencies of kublet
which could be worked around.  Build on the code introduced for pidof
and regexp for the process(es) we need to send a signal to.

Related to #26093
2016-08-12 18:38:38 -07:00
Jan Chaloupka eb967ad143 kubelet: introduce --protect-kernel-defaults to make the KernelTunableBehavior configurable 2016-08-11 13:08:27 +02:00
Davanum Srinivas ce93cb9d9c Remove kubelet dependency on pkill
Issue #26093 identified pkill as one of the dependencies of kublet
which could be worked around.  Build on the code introduced for pidof
and regexp for the process(es) we need to send a signal to.

Related to #26093
2016-08-10 17:14:49 -04:00
Davanum Srinivas 1fdcea28e5 Remove kubelet dependency on pidof
Issue #26093 identified pidof as one of the dependencies of kublet
which could be worked around. In this PR, we just look at /proc
to construct the list of pids we need for a specified process
instead of running "pidof" executable

Related to #26093
2016-08-09 19:55:24 -04:00
Buddha Prakash 49201f6923 Update Libcontainer's Cgroup Config: AllowAllDevices to be Nil 2016-08-04 10:05:30 -07:00
Andrey Kurilin 9f1c3a4c56 Fix various typos in kubelet 2016-08-03 01:14:44 +03:00
Cindy Wang e13c678e3b Make volume unmount more robust using exclusive mount w/ O_EXCL 2016-07-18 16:20:08 -07:00
k8s-merge-robot 1d8c15ba14 Merge pull request #28755 from dubstack/remove-systemd-check
Automatic merge from submit-queue

Do not skip check for cgroup creation in the systemd mount

As soon as libcontainer dependency is update in #28410, we can skip check for cgroup creation in the systemd mount. As the latest version of libcontainer should create cgroups in the sytemd mount aswell.

This is tied to the upstream issue: #27204

@vishh PTAL
2016-07-18 15:05:51 -07:00
Buddha Prakash 5000e74664 Inject top level QoS cgroup creation in the Kubelet 2016-07-15 10:02:22 -07:00
Buddha Prakash 238f833f65 Do not skip check for cgroup creation in the systemd mount 2016-07-12 16:03:41 -07:00
Buddha Prakash dcfff45ab7 Add checks in Create and Update Cgroup methods 2016-07-07 14:17:14 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Jordan Liggitt c202a405cd Fix reference to linux-only struct 2016-06-27 11:13:49 -04:00
Buddha Prakash a5ead79d43 Add support for basic cgroup management 2016-06-26 15:41:34 -07:00
derekwaynecarr 08cdc0ef4f Fix system container detection 2016-06-10 16:49:16 -04:00
Wojciech Tyczynski fcfaf1a3bd Revert "Fix system container detection in kubelet on systemd" 2016-05-28 16:11:53 +02:00
k8s-merge-robot c730198aad Merge pull request #25982 from derekwaynecarr/fix_stats
Automatic merge from submit-queue

Fix system container detection in kubelet on systemd

```release-note
Fix system container detection in kubelet on systemd.

This fixed environments where CPU and Memory Accounting were not enabled on the unit 
that launched the kubelet or docker from reporting the root cgroup when 
monitoring usage stats for those components.
```

Fixes https://github.com/kubernetes/kubernetes/issues/25909

/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @vishh @dchen1107
2016-05-28 05:38:15 -07:00
Tim St. Clair e4d8dea0d7 Move containerd process into docker cgroup for versions >= v1.11 2016-05-26 17:27:00 -07:00
derekwaynecarr 5a8851d436 Fix container detection on systemd in kubelet 2016-05-23 14:22:32 -04:00
Andy Goldstein 6744a7417a Fix detection of docker cgroup on RHEL
Check docker's pid file, then fallback to pidof when trying to determine the pid for docker. The
latest docker RPM for RHEL changes /usr/bin/docker from an executable to a shell script (to support
/usr/bin/docker-current and /usr/bin/docker-latest). The pidof check for docker fails in this case,
so we check /var/run/docker.pid first (the default location), and fallback to pidof if that fails.
2016-05-19 16:42:52 -04:00
goltermann 34d4eaea08 Fixing several (but not all) go vet errors. Most are around string formatting, or unreachable code. 2016-03-22 17:26:50 -07:00
k8s-merge-robot 663f7b8a4c Merge pull request #22487 from vishh/node-status-cpu-hardcap
Auto commit by PR queue bot
2016-03-05 02:32:33 -08:00
Vishnu kannan c54ba12faa Update node status to include the absense of cpu hardcapping.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-03-04 10:33:56 -08:00
Vishnu kannan f9129b02a5 Start for real background tasks in container manager.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-03-02 14:55:26 -08:00
k8s-merge-robot 2bca7c5287 Merge pull request #21337 from vishh/ensure-runtime-cgroups
Auto commit by PR queue bot
2016-02-26 16:52:14 -08:00
Yu-Ju Hong 7061ba20bb Fix finding pid of a process 2016-02-17 12:43:16 -08:00
Vishnu kannan 7de6a25383 Identify runtime's cgroups periodically to avoid race with runtime
uptime.
The runtime could also move between cgroups.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-02-16 16:39:48 -08:00
Vishnu kannan 575812787d Replace `--resource-container` and `--system-container` with
`--kubelet-cgroups` and `--system-cgroups` respectively.
Updated `--runtime-container` to `--runtime-cgroups`.
Cleaned up most of the kubelet code that consumes these flags to match
the flag name changes.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-02-10 17:33:28 -08:00
Vishnu kannan 38efc837b9 Make container runtime's cgroup configurable.
Use the real cgroups for metrics generation.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-02-10 16:02:34 -08:00
Jan Chaloupka 4389b3f0d6 Rewritte util.* -> wait.* wherever reasonable 2016-02-07 12:02:20 +01:00
k8s-merge-robot c8e5e89491 Merge pull request #20395 from jimmidyson/system-container-fix
Auto commit by PR queue bot
2016-02-06 04:06:42 -08:00
Vishnu kannan 62fe566e68 Kubelet will not move docker daemons running in containers.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-02-04 13:34:56 -08:00
Jimmi Dyson e9c1d1ebd6 Do not move pid 1 to system container 2016-01-31 23:27:56 +00:00
Jimmi Dyson 1c289943f5 Ensure kubelet pid is not moved to system container 2016-01-29 09:30:20 +00:00
Jimmi Dyson 041ab17a67 Bump cadvisor to fix interface stats bugs & improve performance
Includes necessary godep upgrades for docker & systemd packages as well as
migrating from docker/libcontainer to opencontainers/runc/libcontainer.
2015-12-21 17:07:21 +00:00
Brendan Burns fb576f30c8 Refactor an interface for style 2015-11-13 15:56:27 -08:00
Vishnu kannan 4ad3d6f5fe Move container manager into a separate package.
Inject container manager into Kubelet. This lets us stub out container
manager during integration testing.
2015-11-11 15:00:37 -08:00