github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Sandor Szücs	588d2808b7	Fix #51135: make the CFS quota period configurable. Adds a CLI flag and config option to the kubelet for setting cpu.cfs_period, which defaults to 100ms as before. Requires enabling the feature gate CustomCPUCFSQuotaPeriod. Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>	2018-09-01 20:19:59 +02:00
Kubernetes Submit Queue	c491d48cde	Merge pull request #67430 from choury/cpumanager Automatic merge from submit-queue (batch tested with PRs 67430, 67550). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. cpumanager: rollback state if updateContainerCPUSet failed. Fixes #63018. If `updateContainerCPUSet` fails, the container fails to start. The state should be rolled back to avoid a CPU leak. Release note: cpumanager: rollback state if updateContainerCPUSet failed	2018-08-21 23:20:58 -07:00
Ismo Puustinen	dd3eeb3f46	device manager: don't do operations on nil pointer. If grpc.DialContext() fails, a nil connection is returned. Check the error before calling conn.Close().	2018-08-21 15:20:36 +03:00
Kubernetes Submit Queue	d017bebf6b	Merge pull request #67145 from jiayingz/reboot-fix Automatic merge from submit-queue. Fail container start if its requested device plugin resource is unknown. With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container. Fixes https://github.com/kubernetes/kubernetes/issues/67107 Release note: Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.	2018-08-21 01:48:54 -07:00
choury	36b92b9b29	cpumanager: rollback state if updateContainerCPUSet failed	2018-08-17 18:08:58 +08:00
tianshapjq	81081dc9e7	nits in manager.go	2018-08-15 08:16:04 +08:00
Jiaying Zhang	7b1ae66432	Fail container start if its requested device plugin resource doesn't have cached option state to make sure the device plugin resource is in ready state when we start the container.	2018-08-08 13:11:36 -07:00
Kubernetes Submit Queue	60ac433922	Merge pull request #66946 from LinEricYang/unused-variable Automatic merge from submit-queue (batch tested with PRs 66512, 66946, 66083). kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError". The variable "skipIfPermissionsError" is not needed even when a permission error occurs.	2018-08-06 19:44:04 -07:00
Kubernetes Submit Queue	d114692a58	Merge pull request #58058 from tianshapjq/cleanup-useless-var-deviceplugin/types.go Automatic merge from submit-queue. clean up useless variables in deviceplugin/types.go What this PR does / why we need it: Some variables are useless, so a cleanup is needed. Release note: NONE	2018-08-06 16:33:54 -07:00
Lin Yang	b7e1f0bf17	kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError". The variable "skipIfPermissionsError" is not needed even when a permission error occurs.	2018-08-02 17:24:33 -07:00
Kubernetes Submit Queue	266cf70ac0	Merge pull request #66617 from pravisankar/fix-pod-cgroup-parent Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). Do not set cgroup parent when --cgroups-per-qos is disabled. When --cgroups-per-qos=false (default is true), kubelet sets pod container management to the podContainerManagerNoop implementation and GetPodContainerName() returns '/' as cgroup parent (default cgroup root). (1) In case of the 'systemd' cgroup driver, '/' is an invalid parent, as the docker daemon expects a '.slice' suffix and throws this error: 'cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice"' (`5fc12449d8/daemon/daemon_unix.go (L618)`). '/' corresponds to '-.slice' (root slice) in systemd, but I don't think we want to assign the root slice instead of the runtime-specific default value. In case of the docker runtime, this will be 'system.slice' (`e2593239d9/daemon/oci_linux.go (L698)`). (2) In case of the 'cgroupfs' cgroup driver, '/' is a valid parent, but I don't think we want to assign root instead of the runtime-specific default value. In case of the docker runtime, this will be '/docker' (`e2593239d9/daemon/oci_linux.go (L695)`). The current fix does not set the cgroup parent when --cgroups-per-qos is disabled. Release note: Fix pod launch by kubelet when --cgroups-per-qos=false and --cgroup-driver="systemd"	2018-08-02 15:42:16 -07:00
Kubernetes Submit Queue	2f21394859	Merge pull request #66190 from linyouchong/issue-66189 Automatic merge from submit-queue (batch tested with PRs 66190, 66871, 66617, 66293, 66891). fix nil pointer dereference in node_container_manager#enforceExisting. Fixes #66189. Special notes for your reviewer: NONE Release note: kubelet: fix nil pointer dereference when the enforce-node-allocatable flag is not configured properly	2018-08-02 15:42:09 -07:00
Kubernetes Submit Queue	c2536e2b0d	Merge pull request #61159 from linyouchong/linyouchong-20180314 Automatic merge from submit-queue. Skip checking when failSwapOn=false. Special notes for your reviewer: NONE Release note: NONE	2018-08-02 14:09:39 -07:00
Kubernetes Submit Queue	f2c6473e25	Merge pull request #66718 from ipuustin/cpu-manager-validate-offline Automatic merge from submit-queue (batch tested with PRs 66623, 66718). cpumanager: validate topology in static policy What this PR does / why we need it: This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file. If the CPU topology has changed in a node, cpumanager static policy might try to assign non-present cores to containers. For example in my test case, static policy had the default CPU set of `0-1,4-7`. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them: Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6) This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs. Release note: Added CPU Manager state validation in case of changed CPU topology.	2018-07-31 08:05:06 -07:00
Ismo Puustinen	3bb5ca9257	cpumanager: add test for available CPUs in static policy. Test the cases where the number of CPUs available in the system is smaller or larger than the number of CPUs known in the state, which should lead to a panic. This covers both CPU onlining and offlining. The case where the number of CPUs matches is already covered by the "non-corrupted state" test.	2018-07-31 10:20:37 +03:00
Ismo Puustinen	4f604eb73c	cpumanager: validate topology in static policy. This patch adds a check for the static policy state validation. The check fails if the CPU topology obtained from cadvisor doesn't match with the current topology in the state file. If the CPU topology has changed in a node, cpu manager static policy might try to assign non-present cores to containers. For example in my test case, static policy had the default CPU set of 0-1,4-7. Then kubelet was shut down and CPU 7 was offlined. After restarting the kubelet, CPU manager tries to assign the non-existent CPU 7 to containers which don't have exclusive allocations assigned to them: Error response from daemon: Requested CPUs are not available - requested 0-1,4-7, available: 0-6) This breaks the exclusivity, since the CPUs from the shared pool don't get assigned to non-exclusive containers, meaning that they can execute on the exclusive CPUs.	2018-07-30 08:49:13 +03:00
hui luo	7101c17498	While reviewing the devicemanager code, I found that the caching layer on endpoint is redundant. Here are the 3 related objects in the picture: devicemanager <-> endpoint <-> plugin. Plugin is the source of truth for devices and device health status. devicemanager maintains healthyDevices, unhealthyDevices, and allocatedDevices based on updates from the plugin, so there is no point in endpoint caching devices. This patch removes this caching layer on endpoint. It also removes Manager.Devices(), since I didn't find any caller of it other than tests, and adds a notification channel to facilitate testing. If we need to get all devices from the manager in the future, it just needs to return healthyDevices + unhealthyDevices, so we don't have to call endpoint after all. This patch makes the code more readable and simplifies the data model.	2018-07-29 21:07:14 -07:00
Kubernetes Submit Queue	32e38b6659	Merge pull request #58755 from vikaschoudhary16/probing-mode Automatic merge from submit-queue (batch tested with PRs 58755, 66414). Use probe based plugin watcher mechanism in Device Manager What this PR does / why we need it: Uses this probe based utility in the device plugin manager. Fixes #56944 Notes For Reviewers: Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that have the required support for the probing model (Identity service implementation) will also work. Release note: Add support for kubelet plugin watcher in device manager. /sig node /area hw-accelerators /cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle	2018-07-27 15:20:06 -07:00
bingshen.wbs	b1bdd043c4	fix kubelet NPE when a device plugin returns zero containers Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>	2018-07-25 10:15:30 +08:00
Ravi Sankar Penta	0282720e29	Do not set cgroup parent when --cgroups-per-qos is disabled. When --cgroups-per-qos=false (default is true), kubelet sets pod container management to the podContainerManagerNoop implementation and GetPodContainerName() returns '/' as cgroup parent (default cgroup root). (1) In case of the 'systemd' cgroup driver, '/' is an invalid parent, as the docker daemon expects a '.slice' suffix and throws this error: 'cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice"' (`5fc12449d8/daemon/daemon_unix.go (L618)`). '/' corresponds to '-.slice' (root slice) in systemd, but I don't think we want to assign the root slice instead of the runtime-specific default value. In case of the docker runtime, this will be 'system.slice' (`e2593239d9/daemon/oci_linux.go (L698)`). (2) In case of the 'cgroupfs' cgroup driver, '/' is a valid parent, but I don't think we want to assign root instead of the runtime-specific default value. In case of the docker runtime, this will be '/docker' (`e2593239d9/daemon/oci_linux.go (L695)`). The current fix does not set the cgroup parent when --cgroups-per-qos is disabled.	2018-07-20 10:25:50 -07:00
vikaschoudhary16	a5842503eb	Use probe based plugin discovery mechanism in device manager	2018-07-17 04:02:31 -04:00
linyouchong	6ff285bce3	fix nil pointer dereference in node_container_manager#enforceExistingCgroup	2018-07-14 10:42:42 +08:00
choury	8e4b62a74b	Remove duplicate check line. The same [line](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cpumanager/policy_static.go#L81) already exists.	2018-07-05 11:07:56 +08:00
Seth Jennings	3234b0fa5b	feature gate LSI capacity calculation	2018-06-28 14:01:08 -05:00
Kubernetes Submit Queue	991a84758f	Merge pull request #59214 from kdembler/cpumanager-checkpointing Automatic merge from submit-queue (batch tested with PRs 59214, 65330). Migrate cpumanager to use checkpointing manager What this PR does / why we need it: This PR migrates `cpumanager` to use new kubelet level node checkpointing feature (#56040) to decrease code redundancy and improve consistency. Which issue(s) this PR fixes: Fixes #58339 Notes: At point of submitting PR the most straightforward approach was used - `state_checkpoint` implementation of `State` interface was added. However, with checkpointing implementation there might be no point to keep `State` interface and just use single implementation with checkpoint backend and in case of different backend than filestore needed just supply `cpumanager` with custom `CheckpointManager` implementation. /kind feature /sig node cc @flyingcougar @ConnorDoyle	2018-06-25 18:19:00 -07:00
Jeff Grafton	23ceebac22	Run hack/update-bazel.sh	2018-06-22 16:22:57 -07:00
Jeff Grafton	a725660640	Update to gazelle 0.12.0 and run hack/update-bazel.sh	2018-06-22 16:22:18 -07:00
Kubernetes Submit Queue	148350d3c4	Merge pull request #64426 from cofyc/remove_unnecessary_fakemounters Automatic merge from submit-queue (batch tested with PRs 64142, 64426, 62910, 63942, 64548). Clean up fake mounters. What this PR does / why we need it: Fixes https://github.com/kubernetes/kubernetes/issues/61502 Special notes for your reviewer: list of fake mounters: - (keep) pkg/util/mount.FakeMounter - (removed) pkg/kubelet/cm.fakeMountInterface: - (inherit from mount.FakeMounter) pkg/util/mount.fakeMounter - (inherit from mount.FakeMounter) pkg/util/removeall.fakeMounter - (removed) pkg/volume/host_path.fakeFileTypeChecker Release note: NONE	2018-06-20 00:05:10 -07:00
Kubernetes Submit Queue	c399c306e2	Merge pull request #59174 from tianshapjq/todo-already-done Automatic merge from submit-queue (batch tested with PRs 65230, 57355, 59174, 63698, 63659). TODO has already been implemented What this PR does / why we need it: The TODO has already been implemented, so remove the TODO tag. Release note: NONE	2018-06-19 20:19:17 -07:00
Klaudiusz Dembler	a9df2acc4b	Typo fix	2018-06-07 12:08:48 +02:00
Yecheng Fu	40c3937320	Clean up fake mounters.	2018-06-02 15:55:19 +08:00
Kubernetes Submit Queue	d2495b8329	Merge pull request #63143 from jsafrane/containerized-subpath Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). Containerized subpath What this PR does / why we need it: Containerized kubelet needs a different implementation of `PrepareSafeSubpath` than kubelet running directly on the host. On the host we safely open the subpath and then bind-mount `/proc/<pidof kubelet>/fd/<descriptor of opened subpath>`. With kubelet running in a container, `/proc/xxx/fd/yy` on the host contains a path that works only inside the container, i.e. `/rootfs/path/to/subpath`, and thus any bind-mount on the host fails. Solution: - safely open the subpath and get its device ID and inode number - blindly bind-mount the subpath to `/var/lib/kubelet/pods/<uid>/volume-subpaths/<name of container>/<id of mount>`. This is potentially unsafe, because a user can change the subpath source to a link to a bad place (say `/run/docker.sock`) just before the bind-mount. - get the device ID and inode number of the destination. Typical users can't modify this file, as it lies on /var/lib/kubelet on the host. - compare these device IDs and inode numbers. Which issue(s) this PR fixes: Fixes #61456 Special notes for your reviewer: The PR contains some refactoring of `doBindSubPath` to extract the common code. New `doNsEnterBindSubPath` is added for the nsenter related parts. Release note: NONE	2018-06-01 12:12:19 -07:00
Guoliang Wang	761cf41427	Move pkg/scheduler/schedulercache -> pkg/scheduler/cache	2018-05-31 22:55:34 +08:00
Jan Safranek	74ba0878a1	Enhance ExistsPath check. It should return an error when the check fails (e.g. no permissions, symlink loop, etc.)	2018-05-23 10:21:20 +02:00
Jan Safranek	97b5299cd7	Add GetMode to mounter interface. Kubelet must not call os.Lstat on raw volume paths when it runs in a container. Mounter knows where the file really is.	2018-05-23 10:17:59 +02:00
Klaudiusz Dembler	9384937f2f	Update bazel	2018-05-21 17:39:51 +02:00
Klaudiusz Dembler	de1063bc7d	Add compatibility tests	2018-05-21 14:50:31 +02:00
Klaudiusz Dembler	3d09101b6f	Add docstrings	2018-05-21 11:40:04 +02:00
Jan Safranek	598ca5accc	Add GetSELinuxSupport to mounter.	2018-05-17 13:36:37 +02:00
Klaudiusz Dembler	aa325ec2d9	Change JSON letter case in tests	2018-05-15 18:43:48 +02:00
Klaudiusz Dembler	7bb047ec75	Rebase and backward compatibility	2018-05-15 18:34:53 +02:00
Klaudiusz Dembler	ba8d82c96a	Update error indicating nonexistent checkpoint	2018-05-14 09:51:27 +02:00
Klaudiusz Dembler	0b1a73e94b	Make cpuManagerCheckpoint exported	2018-05-14 09:51:27 +02:00
Klaudiusz Dembler	cc3fa67bda	Add comments to MockCheckpoint functions and gofmt	2018-05-14 09:51:27 +02:00
Klaudiusz Dembler	0fbd19bc06	Tweaks	2018-05-14 09:51:26 +02:00
Klaudiusz Dembler	3991ed5d2f	Add tests	2018-05-14 09:51:26 +02:00
Klaudiusz Dembler	6bfceed4ab	Migrate cpumanager to use checkpointing manager	2018-05-14 09:45:58 +02:00
Kubernetes Submit Queue	204520b029	Merge pull request #63344 from RobertKrawitz/fix-process-kill-algorithm Automatic merge from submit-queue. Correct kill logic for pod processes. Correct the kill logic for processes in the pod's cgroup. os.FindProcess() does not check whether the process exists on POSIX systems.	2018-05-11 11:41:19 -07:00
Kubernetes Submit Queue	321201f672	Merge pull request #63406 from derekwaynecarr/label-pod-cgroups Automatic merge from submit-queue (batch tested with PRs 60200, 63623, 63406). Apply pod name and namespace labels for pod cgroup for cadvisor metrics What this PR does / why we need it: 1. Enables Prometheus users to determine usage by pod name and namespace for the pod cgroup sandbox. 2. Labels cAdvisor metrics for pod cgroups by pod name and namespace. 3. Aligns with the kubelet stats summary endpoint pod cpu and memory stats. Special notes for your reviewer: This provides parity with the summary API enhancements done here: https://github.com/kubernetes/kubernetes/pull/55969 Release note: Apply pod name and namespace labels to pod cgroup in cAdvisor metrics	2018-05-10 08:33:11 -07:00
Derek Carr	a09990cd43	Apply pod name and namespace labels for pod cgroup for cadvisor metrics	2018-05-07 14:51:12 -04:00
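
Commits 4f604eb73c and 3bb5ca9257 in the table above describe validating the CPU manager's static policy state against the node's current CPU topology. The idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the kubelet's Go implementation: the names `parse_cpuset` and `validate_state` are hypothetical, and the exclusive assignment `2-3` is an invented example value. The default set `0-1,4-7` and the available set `0-6` come from the commit message.

```python
def parse_cpuset(spec):
    """Parse a Linux-style CPU list such as "0-1,4-7" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def validate_state(default_spec, assignments, available_spec):
    """Raise ValueError if the CPUs known to the state differ from the available CPUs.

    default_spec:   shared-pool CPU list stored in the state (hypothetical shape)
    assignments:    mapping of container id -> exclusively assigned CPU list
    available_spec: CPU list currently reported for the node
    """
    known = parse_cpuset(default_spec)
    for spec in assignments.values():
        known |= parse_cpuset(spec)
    available = parse_cpuset(available_spec)
    if known != available:
        missing = sorted(known - available)      # in the state, but offline now
        unexpected = sorted(available - known)   # online now, unknown to the state
        raise ValueError(
            f"CPU topology changed: missing={missing}, unexpected={unexpected}"
        )


if __name__ == "__main__":
    try:
        # CPU 7 was offlined while the kubelet was down.
        validate_state("0-1,4-7", {"container-a": "2-3"}, "0-6")
    except ValueError as err:
        print(err)
```

Comparing the two sets in both directions covers both cases named in commit 3bb5ca9257: fewer CPUs than the state knows (offlining) and more CPUs than the state knows (onlining).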

390 Commits (2c933695fa61d57d1c6fa5defb89caed7d49f773)
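
Commits 36b92b9b29 and c491d48cde describe rolling back CPU manager state when `updateContainerCPUSet` fails, so that CPUs reserved for a container that never started are not leaked. A minimal Python sketch of that rollback pattern follows. The class `CpuState`, the function `add_container`, and the callback `update_cpuset` are hypothetical stand-ins chosen for illustration and do not mirror the kubelet's actual Go types.

```python
class CpuState:
    """Toy in-memory state: a shared pool plus per-container exclusive assignments."""

    def __init__(self, cpus):
        self.shared = set(cpus)
        self.assignments = {}

    def allocate(self, container_id, count):
        """Move `count` CPUs from the shared pool to the container."""
        if count > len(self.shared):
            raise RuntimeError("not enough free CPUs")
        chosen = set(sorted(self.shared)[:count])
        self.shared -= chosen
        self.assignments[container_id] = chosen
        return chosen

    def release(self, container_id):
        """Return a container's CPUs to the shared pool (no-op if unknown)."""
        self.shared |= self.assignments.pop(container_id, set())


def add_container(state, container_id, count, update_cpuset):
    """Allocate CPUs, apply them to the container, and roll back on failure."""
    cpus = state.allocate(container_id, count)
    try:
        # Stand-in for the runtime call that applies the CPU set to the container.
        update_cpuset(container_id, cpus)
    except Exception:
        # Without this rollback the CPUs would stay assigned to a container
        # that failed to start, which is the leak the commits describe.
        state.release(container_id)
        raise
    return cpus
```

Re-raising after the rollback keeps the failure visible to the caller while leaving the state as it was before the attempt.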