Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)
[CRI] Remove all containers in the sandbox
Remove all containers in the sandbox when we remove the sandbox.
/cc @feiskyer @Random-Liu
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
Make the location of dockershim.sock configurable, so downstream
projects (such as OpenShift) can place it in a location that does not
require root access (e.g. for integration tests).
Make the kubelet respect and use the values of
--container-runtime-endpoint and --image-service-endpoint, if set. If
unset, the default value of /var/run/dockershim.sock is used.
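As a rough sketch of the fallback behaviour described above (resolveEndpoints is a hypothetical helper, not the kubelet's actual flag handling):
```go
package main

import "fmt"

// defaultDockershimSocket mirrors the default mentioned above.
const defaultDockershimSocket = "/var/run/dockershim.sock"

// resolveEndpoints falls back to the default dockershim socket when the
// runtime endpoint flag is unset, and lets the image service endpoint follow
// the runtime endpoint when it is unset.
func resolveEndpoints(runtimeEndpoint, imageEndpoint string) (runtime, image string) {
	if runtimeEndpoint == "" {
		runtimeEndpoint = defaultDockershimSocket
	}
	if imageEndpoint == "" {
		imageEndpoint = runtimeEndpoint
	}
	return runtimeEndpoint, imageEndpoint
}

func main() {
	r, i := resolveEndpoints("", "")
	fmt.Println(r, i) // /var/run/dockershim.sock /var/run/dockershim.sock
}
```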
Automatic merge from submit-queue
Fix container hostPid settings
**What this PR does / why we need it**:
HostPid is not set correctly for containers.
**Which issue this PR fixes**
Fixes #44041.
**Special notes for your reviewer**:
Should be cherry-picked into v1.6 branch.
**Release note**:
```release-note
Fix container hostPid settings.
```
cc @yujuhong @derekwaynecarr @unclejack @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue
Clearer ImageGC failure errors. Fewer events.
Addresses #26000. Kubelet often "fails" image garbage collection if cAdvisor has not completed the first round of stats collection. Don't create events for a single failure, and make log messages more specific.
@kubernetes/sig-node-bugs
Automatic merge from submit-queue
Support status.hostIP in downward API
**What this PR does / why we need it**:
Exposes pod's hostIP (node IP) via downward API.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes https://github.com/kubernetes/kubernetes/issues/24657
**Special notes for your reviewer**:
Not sure if there's more documentation that's needed, please point me in the right direction and I will add some :)
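As a rough illustration of the kubelet-side lookup this enables (resolveFieldRef and podInfo are hypothetical names, not the actual implementation):
```go
package main

import "fmt"

// podInfo carries values the kubelet already knows about a running pod.
type podInfo struct {
	name      string
	namespace string
	podIP     string
	hostIP    string // the node IP this PR exposes
}

// resolveFieldRef maps a downward-API field path to its value.
func resolveFieldRef(fieldPath string, p podInfo) (string, error) {
	switch fieldPath {
	case "metadata.name":
		return p.name, nil
	case "metadata.namespace":
		return p.namespace, nil
	case "status.podIP":
		return p.podIP, nil
	case "status.hostIP": // newly supported field
		return p.hostIP, nil
	default:
		return "", fmt.Errorf("unsupported fieldPath: %q", fieldPath)
	}
}

func main() {
	p := podInfo{name: "web-0", namespace: "default", podIP: "10.0.0.12", hostIP: "192.168.1.5"}
	v, _ := resolveFieldRef("status.hostIP", p)
	fmt.Println(v) // 192.168.1.5
}
```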
Automatic merge from submit-queue
Add separate KubeletFlags struct and remove HostnameOverride and NodeIP from config type
Add a separate flags struct for Kubelet flags
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preliminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
/cc @ncdc @kubernetes/sig-node-misc @kubernetes/sig-cluster-lifecycle-misc
Automatic merge from submit-queue (batch tested with PRs 42973, 41582)
Improve status manager unit testing
This is designed to simplify testing logic in the status manager, and decrease reliance on syncBatch. This is a smaller portion of #37119, and should be easier to review than that change.
It makes the following changes:
- creates convenience functions for get, update, and delete core.Action
- prefers using syncPod on elements in the podStatusChannel to using syncBatch to reduce unintended reliance on syncBatch
- combines consuming, validating, and clearing actions into a single verifyActions function. This replaces calls to testSyncBatch(), verifyActions(), and ClearActions.
- changes comments in testing functions into log statements for easier debugging
@Random-Liu
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preliminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
Automatic merge from submit-queue
[CRI] Use DNSOptions passed by CRI in dockershim.
When @xlgao-zju is working on the CRI validation test, he found that dockershim is not using the DNSOptions passed in CRI. https://github.com/kubernetes-incubator/cri-tools/pull/30#issuecomment-290644357
This PR fixes the issue. I've verified it manually; for the `ClusterFirst` DNSPolicy, the resolv.conf will be:
```
nameserver 8.8.8.8
search corp.google.com prod.google.com prodz.google.com google.com
options ndots:5
```
For `Default` DNSPolicy, the resolv.conf will be:
```
nameserver 127.0.1.1
search corp.google.com prod.google.com prodz.google.com google.com
```
@xlgao-zju You should be able to test after this PR is merged.
/cc @yujuhong @feiskyer
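A minimal sketch of rendering such a resolv.conf from the DNS servers, search domains, and options carried in the CRI DNS config (formatResolvConf is an illustrative helper, not dockershim's actual code):
```go
package main

import (
	"fmt"
	"strings"
)

// formatResolvConf renders resolv.conf content from DNS servers, search
// domains, and resolver options, matching the examples above.
func formatResolvConf(servers, searches, options []string) string {
	var b strings.Builder
	for _, s := range servers {
		fmt.Fprintf(&b, "nameserver %s\n", s)
	}
	if len(searches) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(searches, " "))
	}
	if len(options) > 0 {
		fmt.Fprintf(&b, "options %s\n", strings.Join(options, " "))
	}
	return b.String()
}

func main() {
	fmt.Print(formatResolvConf(
		[]string{"8.8.8.8"},
		[]string{"corp.google.com", "prod.google.com"},
		[]string{"ndots:5"},
	))
}
```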
Automatic merge from submit-queue
test/e2e_node: prepull images with CRI
Part of https://github.com/kubernetes/kubernetes/issues/40739
- This PR builds on top of #40525 (and contains one commit from #40525)
- The second commit contains a tiny change in the `Makefile`.
- Third commit is a patch to be able to prepull images using the CRI (as opposed to running `docker` to pull images, which doesn't make sense if you're using the CRI most of the time)
Marked WIP till #40525 makes its way into master
@Random-Liu @lucab @yujuhong @mrunalp @rhatdan
Automatic merge from submit-queue
refactor getPidsForProcess and change error handling
xref https://github.com/openshift/origin/issues/13262
Right now, failure to read the docker pid from the pid file results in some premature nasty logging. There is still a chance we can get the docker pid from `procfs.PidOf()`. If that fails we should just log at `V(4)` rather than `runtime.HandleError()`.
This PR refactors `getPidsForProcess()` to wait until both methods for determining the pid fail before logging anything.
@smarterclayton @ncdc @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 42379, 42668, 42876, 41473, 43260)
Make error hints more accurate
The same error hint ("Error adding network") was used in several places within one method, making it impossible to pinpoint which call failed.
Automatic merge from submit-queue
Print dereferenced pod status fields when logging status update
Before: "Terminated:0xc421932af0"
After:"Terminated:&ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:0001-01-01 00:00:00 +0000 UTC,FinishedAt:2017-03-07 14:50:48 -0500 EST,ContainerID:docker://bd453bb969264b3ace2b3934a568af7679a0d51fee543a5f8a82429ff654970e,}"
"Ignoring same status for pod" messages already print status fully, these "Status for pod updated" messages should too IMO
```release-note
NONE
```
Automatic merge from submit-queue
Create subPaths and set their permissions like we do mountPaths
fixes https://github.com/kubernetes/kubernetes/issues/41638
If a subPath does not exist at the time MountVolume.SetUp happens, SetVolumeOwnership will not have walked to the subPath and set appropriate permissions on it, leading to the above issue.
So later, in makeMounts when we parse subPaths, create all subPaths and set their permissions according to how the parent mountPath looks.
```release-note
NONE
```
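A minimal sketch of that idea, assuming Linux and using only the standard library (mkSubPathLike is a hypothetical name, not the helper added by this PR): create the subPath if it is missing and copy the parent mountPath's mode and ownership onto it.
```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mkSubPathLike creates subPath (if missing) and applies the mode and
// ownership of the parent mountPath to it.
func mkSubPathLike(mountPath, subPath string) error {
	info, err := os.Stat(mountPath)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(subPath, info.Mode().Perm()); err != nil {
		return err
	}
	// Copy ownership from the parent (Linux-specific Stat_t).
	if st, ok := info.Sys().(*syscall.Stat_t); ok {
		if err := os.Chown(subPath, int(st.Uid), int(st.Gid)); err != nil {
			return err
		}
	}
	// MkdirAll honours umask, so re-apply the parent's permission bits.
	return os.Chmod(subPath, info.Mode().Perm())
}

func main() {
	if err := mkSubPathLike("/var/lib/data", "/var/lib/data/logs/app"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```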
Automatic merge from submit-queue
kubelet: check and enforce minimum docker api version
**What this PR does / why we need it**:
This PR adds enforcement of a minimum docker API version (same as what we already do for dockertools).
**Which issue this PR fixes**
Fixes #42696.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Fix tiny typo
**What this PR does / why we need it**:
**Which issue this PR fixes**
Fix a tiny typo introduced by PR #43368.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Kubelet:rkt Create any missing hostPath Volumes
When using a `hostPath` inside `Pod.spec.volumes`, this PR creates any missing directories on the node.
**What this PR does / why we need it**:
With rkt as the container runtime we cannot use `hostPath` volumes if the directory is missing.
**Special notes for your reviewer**:
This PR follows [#39965](https://github.com/kubernetes/kubernetes/pull/39965)
The labels should be
> area/rkt
> area/kubelet
Automatic merge from submit-queue (batch tested with PRs 42998, 42902, 42959, 43020, 42948)
Add Host field to TCPSocketAction
Currently, TCPSocketAction always uses Pod's IP in connection. But when a pod uses the host network, sometimes firewall rules may prevent kubelet from connecting through the Pod's IP.
This PR introduces the 'Host' field for TCPSocketAction, and if it is set to non-empty string, the probe will be performed on the configured host rather than the Pod's IP. This gives users an opportunity to explicitly specify 'localhost' as the target for the above situations.
```release-note
Add Host field to TCPSocketAction
```
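A minimal sketch of the resulting probe behaviour (tcpProbe is an illustrative helper, not the prober's actual code): the Pod IP is used only when no Host is set on the action.
```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tcpProbe dials host:port; when host is empty it falls back to the pod IP,
// which was the only behaviour before the Host field existed.
func tcpProbe(host, podIP string, port int, timeout time.Duration) error {
	if host == "" {
		host = podIP
	}
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(host, fmt.Sprintf("%d", port)), timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// With Host set to "localhost", the probe ignores the pod IP entirely.
	err := tcpProbe("localhost", "10.0.0.12", 8080, time.Second)
	fmt.Println("probe error:", err)
}
```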
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)
Return early from eviction debug helpers if !glog.V(3)
Should keep us from running a bunch of loops needlessly.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43653, 43654, 43652)
CRI: Check nil pointer to avoid kubelet panic.
When working on the containerd kubernetes integration, I casually returned an empty `sandboxStatus.Linux{}`, and it caused kubelet to panic.
This won't happen when the runtime returns valid data, but we should not make that assumption here.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)
Use pod sandbox id in checkpoint
**What this PR does / why we need it**: we should log the sandbox id when a checkpoint error occurs
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41139, 41186, 38882, 37698, 42034)
Make kubelet never delete files on mounted filesystems
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed; however, our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint; however, this way
it's platform independent, and the directory being removed by kubelet
should be almost empty.
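A minimal sketch of the one-filesystem idea, assuming Linux device numbers from syscall.Stat_t (this is an illustration, not the utility added to kubelet): refuse to descend into anything that sits on a different device than its parent.
```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// removeAllOneFilesystem removes path recursively but refuses to descend into
// entries that live on a different device, i.e. across a mount boundary.
func removeAllOneFilesystem(path string, parentDev uint64) error {
	info, err := os.Lstat(path)
	if err != nil {
		if os.IsNotExist(err) {
			return nil
		}
		return err
	}
	st := info.Sys().(*syscall.Stat_t)
	if st.Dev != parentDev {
		return fmt.Errorf("refusing to remove %q: it is on a different filesystem", path)
	}
	if info.IsDir() {
		entries, err := os.ReadDir(path)
		if err != nil {
			return err
		}
		for _, e := range entries {
			if err := removeAllOneFilesystem(filepath.Join(path, e.Name()), st.Dev); err != nil {
				return err
			}
		}
	}
	return os.Remove(path)
}

func main() {
	root := "/var/lib/kubelet/pods/some-pod" // example path
	info, err := os.Lstat(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	if err := removeAllOneFilesystem(root, info.Sys().(*syscall.Stat_t).Dev); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```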
Automatic merge from submit-queue (batch tested with PRs 43533, 43539)
kuberuntime: don't override the pod IP for pods using host network
This fixes the issue of not passing pod IP via downward API for host network pods.
Automatic merge from submit-queue (batch tested with PRs 43465, 43529, 43474, 43521)
kubelet/cni: hook network plugin Status() up to CNI network discovery
Ensure that the plugin returns NotReady status until there is a
CNI network available which can be used to set up pods.
Fixes: https://github.com/kubernetes/kubernetes/issues/43014
I think the only reason it wasn't done like this in the first place was that the dynamic "reread /etc/cni/net.d every 10s forever" was added long after the Status() hook was. What do you think?
@freehan @caseydavenport @luxas @jbeda
Automatic merge from submit-queue (batch tested with PRs 43398, 43368)
CRI: add support for dns cluster first policy
**What this PR does / why we need it**:
PR #29378 introduced the ClusterFirstWithHostNet policy, but only dockertools was updated to support the feature.
This PR updates kuberuntime to support it for all runtimes.
**Which issue this PR fixes**
fixes #43352
**Special notes for your reviewer**:
Candidate for v1.6.
**Release note**:
```release-note
NONE
```
cc @thockin @luxas @vefimova @Random-Liu
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting DNS options together with host network. This commit rewrites
resolv.conf the same way dockertools does.
PR #29378 introduced the ClusterFirstWithHostNet policy, but only dockertools
was updated to support the feature. This PR updates kuberuntime to
support it for all runtimes.
Also fixes #43352.
Automatic merge from submit-queue (batch tested with PRs 42828, 43116)
Apply taint tolerations for NoExecute for all static pods.
Fixed https://github.com/kubernetes/kubernetes/issues/42753
**Release note**:
```
Apply taint tolerations for NoExecute for all static pods.
```
cc/ @davidopp
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)
Improve code coverage for pkg/kubelet/status/generate.go
**What this PR does / why we need it**:
Improve code coverage for pkg/kubelet/status/generate.go from #39559
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 42747, 43030)
dockershim: remove corrupted sandbox checkpoints
This is a workaround to ensure that kubelet doesn't block forever when
the checkpoint is corrupted.
This is a workaround for #43021
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs
Fixes #42412
**Background**
Support for multiple GPUs is an experimental feature in v1.6.
Container restarts were handled incorrectly, which resulted in stranding of GPUs.
Kubelet is incorrectly using the runtime cache to track running pods, which can result in race conditions (as it did in other parts of kubelet). This can result in the same GPU being assigned to multiple pods.
**What does this PR do**
This PR tracks assignment of GPUs to containers and returns pre-allocated GPUs instead of (incorrectly) allocating new GPUs.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.
**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.
**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.
cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews
Replaces #42560
Currently, TCPSocketAction always uses Pod's IP in connection. But when a
pod uses the host network, sometimes firewall rules may prevent kubelet
from connecting through the Pod's IP. This PR introduces the 'Host' field
for TCPSocketAction, and if it is set to non-empty string, the probe will
be performed on the configured host rather than the Pod's IP. This gives
users an opportunity to explicitly specify 'localhost' as the target for
the above situations.
Automatic merge from submit-queue
Invalid environment var names are reported and pod starts
When processing EnvFrom items, all invalid keys are collected and
reported as a single event.
The Pod is allowed to start.
fixes #42583
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)
Dropped docker 1.9.x support. Changed the minimumDockerAPIVersion to
1.22
cc/ @Random-Liu @yujuhong
We talked about dropping docker 1.9.x support for a while. I just realized that we haven't really done it yet.
```release-note
Dropped support for docker 1.9.x and below.
```
Automatic merge from submit-queue
dockershim: Fix the race condition in ListPodSandbox
In ListPodSandbox(), we
1. List all sandbox docker containers
2. List all sandbox checkpoints. If the checkpoint does not have a
corresponding container in (1), we return partial result based on
the checkpoint.
The problem is that new PodSandboxes can be created between step (1) and
(2). In those cases, we will see the checkpoints, but not the sandbox
containers. This leads to strange behavior because the partial result
from the checkpoint does not include some critical information. For
example, the creation timestamp would be zero, and that would cause kubelet's
garbage collector to immediately remove the sandbox.
This change fixes that by getting the list of checkpoints before listing
all the containers (since in RunPodSandbox we create them in the reverse
order).
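A minimal sketch of the reordering (the interfaces below are simplified stand-ins for dockershim's real dependencies): because RunPodSandbox creates the sandbox container before it writes the checkpoint, reading checkpoints first guarantees that every checkpoint we see already has a container, which the later container listing will include.
```go
package sandboxlisting

// checkpointStore and dockerClient are simplified stand-ins for the real
// dockershim dependencies.
type checkpointStore interface {
	ListCheckpoints() ([]string, error) // sandbox IDs that have a checkpoint
}

type dockerClient interface {
	ListSandboxContainers() (map[string]bool, error) // sandbox ID -> container exists
}

// listPodSandboxIDs lists checkpoints FIRST and containers second. Any
// checkpoint observed here already has its container created, so the
// container listing taken afterwards will include it, and we no longer emit
// checkpoint-only partial results for sandboxes that are merely in the
// middle of being created.
func listPodSandboxIDs(cs checkpointStore, dc dockerClient) ([]string, error) {
	checkpoints, err := cs.ListCheckpoints()
	if err != nil {
		return nil, err
	}
	containers, err := dc.ListSandboxContainers()
	if err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(containers)+len(checkpoints))
	for id := range containers {
		ids = append(ids, id)
	}
	for _, id := range checkpoints {
		if !containers[id] {
			// A genuinely orphaned checkpoint: its container is gone.
			ids = append(ids, id)
		}
	}
	return ids, nil
}
```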
Automatic merge from submit-queue (batch tested with PRs 42211, 38691, 42737, 42757, 42754)
Only create the symlink when container log path exists
When using the `syslog` logging driver instead of `json-file`, there will be no container log files such as `<containerID>-json.log`. We should not create the symlink in this case.
In ListPodSandbox(), we
1. List all sandbox docker containers
2. List all sandbox checkpoints. If the checkpoint does not have a
corresponding container in (1), we return partial result based on
the checkpoint.
The problem is that new PodSandboxes can be created between step (1) and
(2). In those cases, we will see the checkpoints, but not the sandbox
containers. This leads to strange behavior because the partial result
from the checkpoint does not include some critical information. For
example, the creation timestamp would be zero, and that would cause kubelet's
garbage collector to immediately remove the sandbox.
This change fixes that by getting the list of checkpoints before listing
all the containers (since in RunPodSandbox we create them in the reverse
order).
Automatic merge from submit-queue (batch tested with PRs 42692, 42169, 42173)
Add pprof trace support
Add support for `/debug/pprof/trace`
Can wait for master to reopen for 1.7.
cc @smarterclayton @wojtek-t @gmarek @timothysc @jeremyeder @kubernetes/sig-scalability-pr-reviews
* Properly return ImageNotFoundError
* Support inject "Images" or "ImageInspects" and keep both in sync.
* Remove the FakeDockerPuller and let FakeDockerClient subsumes its
functinality. This reduces the overhead to maintain both objects.
* Various small fixes and refactoring of the testing utils.
Automatic merge from submit-queue (batch tested with PRs 41890, 42593, 42633, 42626, 42609)
Pods pending due to insufficient OIR should get scheduled once sufficient OIR becomes available (e2e disabled).
#41870 was reverted because it introduced an e2e test flake. This is the same code with the e2e for OIR disabled again.
We can attempt to enable the e2e test cases one-by-one in follow-up PRs, but it would be preferable to get the main fix merged in time for 1.6 since OIR is broken on master (see #41861).
cc @timothysc
Automatic merge from submit-queue (batch tested with PRs 42506, 42585, 42596, 42584)
provide active pods to cgroup cleanup
**What this PR does / why we need it**:
This PR provides more information for when a pod cgroup is considered orphaned. The running pods cache is based on the runtime's view of the world. We create pod cgroups before containers, so we should just be looking at activePods.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/kubernetes/issues/42431
Automatic merge from submit-queue
cgroup names created by kubelet should be lowercased
**What this PR does / why we need it**:
This PR modifies the kubelet to create cgroupfs names that are lowercased. This better aligns us with the naming convention for cgroups v2 and other cgroup managers in the ecosystem (docker, systemd, etc.).
See: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
"2-6-2. Avoid Name Collisions"
**Special notes for your reviewer**:
none
**Release note**:
```release-note
kubelet-created cgroups follow lowercase naming conventions
```
Automatic merge from submit-queue (batch tested with PRs 31783, 41988, 42535, 42572, 41870)
Pods pending due to insufficient OIR should get scheduled once sufficient OIR becomes available.
This appears to be a regression since v1.5.0 in scheduler behavior for opaque integer resources, reported in https://github.com/kubernetes/kubernetes/issues/41861.
- [X] Add failing e2e test to trigger the regression
- [x] Restore previous behavior (pods pending due to insufficient OIR get scheduled once sufficient OIR becomes available.)
Automatic merge from submit-queue
[Bug] Fix gpu initialization in Kubelet
Kubelet incorrectly fails if the `AllAlpha=true` feature gate is enabled with container runtimes that are not `docker`.
Replaces #42407
- Added schedulercache.Resource.SetOpaque helper.
- Amend kubelet allocatable sync so that when OIRs are removed from capacity
they are also removed from allocatable.
- Fixes #41861.
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)
[Bug Fix]: Avoid evicting more pods than necessary by adding Timestamps for fsstats and ignoring stale stats
Continuation of #33121. Credit for most of this goes to @sjenning. I added volume fs timestamps.
**why is this a bug**
This PR attempts to fix part of https://github.com/kubernetes/kubernetes/issues/31362 which results in multiple pods getting evicted unnecessarily whenever the node runs into resource pressure. This PR reduces the chances of such disruptions by avoiding reacting to old/stale metrics.
Without this PR, kubernetes nodes under resource pressure will cause unnecessary disruptions to user workloads.
This PR will also help deflake a node e2e test suite.
The eviction manager currently avoids evicting pods if metrics are old. However, timestamp data is not available for filesystem data, and this causes lots of extra evictions.
See the [inode eviction test flakes](https://k8s-testgrid.appspot.com/google-node#kubelet-flaky-gce-e2e) for examples.
This should probably be treated as a bugfix, as it should help mitigate extra evictions.
cc: @kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews @vishh @derekwaynecarr @sjenning
Automatic merge from submit-queue (batch tested with PRs 42369, 42375, 42397, 42435, 42455)
Kubelet: return container runtime's version instead of CRI's one
**What this PR does / why we need it**:
With CRI enabled by default, kubelet reports the version of CRI instead of the container runtime version. This PR fixes this problem.
**Which issue this PR fixes**
Fixes #42396.
**Special notes for your reviewer**:
Should also cherry-pick to 1.6 branch.
**Release note**:
```release-note
NONE
```
cc @yujuhong @kubernetes/sig-node-bugs
Automatic merge from submit-queue
Eviction Manager Enforces Allocatable Thresholds
This PR modifies the eviction manager to enforce node allocatable thresholds for memory as described in kubernetes/community#348.
This PR should be merged after #41234.
cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests @vishh
** Why is this a bug/regression**
Kubelet uses `oom_score_adj` to enforce QoS policies. But the `oom_score_adj` is based on overall memory requested, which means that a Burstable pod that requested a lot of memory can lead to OOM kills for Guaranteed pods, which violates QoS. Even worse, we have observed system daemons like kubelet or kube-proxy being killed by the OOM killer.
Without this PR, v1.6 will have node stability issues and regressions in `out of resource` handling, an existing GA feature.
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)
kubelet: enable qos-level memory limits
```release-note
Experimental support to reserve a pod's memory request from being utilized by pods in lower QoS tiers.
```
Enables the QoS-level memory cgroup limits described in https://github.com/kubernetes/community/pull/314
**Note: QoS level cgroups have to be enabled for any of this to take effect.**
Adds a new `--experimental-qos-reserved` flag that can be used to set the percentage of a resource to be reserved at the QoS level for pod resource requests.
For example, `--experimental-qos-reserved="memory=50%"` means that if a Guaranteed pod sets a memory request of 2Gi, the Burstable and BestEffort QoS memory cgroups will have their `memory.limit_in_bytes` set to `NodeAllocatable - (2Gi*50%)` to reserve 50% of the guaranteed pod's request from being used by the lower QoS tiers.
If a Burstable pod sets a request, its reserve will be deducted from the BestEffort memory limit.
The result is that:
- Guaranteed limit matches the root cgroup and is not set by this code
- Burstable limit is `NodeAllocatable - Guaranteed reserve`
- BestEffort limit is `NodeAllocatable - Guaranteed reserve - Burstable reserve`
The only resource currently supported is `memory`; however, the code is generic enough that other resources can be added in the future.
@derekwaynecarr @vishh
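A worked sketch of the arithmetic described above (plain Go with illustrative names; the real logic lives in the kubelet's QoS container manager):
```go
package main

import "fmt"

const gi = int64(1024 * 1024 * 1024)

// qosMemoryLimits computes the Burstable and BestEffort memory.limit_in_bytes
// values from node allocatable, the summed requests of each tier, and the
// reserve percentage, following the description above.
func qosMemoryLimits(nodeAllocatable, guaranteedRequests, burstableRequests, reservePercent int64) (burstableLimit, bestEffortLimit int64) {
	guaranteedReserve := guaranteedRequests * reservePercent / 100
	burstableReserve := burstableRequests * reservePercent / 100
	burstableLimit = nodeAllocatable - guaranteedReserve
	bestEffortLimit = nodeAllocatable - guaranteedReserve - burstableReserve
	return burstableLimit, bestEffortLimit
}

func main() {
	// Example from the description: 2Gi of Guaranteed requests at a 50% reserve,
	// plus 4Gi of Burstable requests, on a node with 16Gi allocatable.
	b, be := qosMemoryLimits(16*gi, 2*gi, 4*gi, 50)
	fmt.Printf("burstable limit: %dGi, besteffort limit: %dGi\n", b/gi, be/gi)
	// burstable limit: 15Gi, besteffort limit: 13Gi
}
```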
Automatic merge from submit-queue (batch tested with PRs 41672, 42084, 42233, 42165, 42273)
ExecProbes should be able to do simple env var substitution
For containers that don't have bash, we should support env substitution
like we do on command and args. However, without major refactoring
valueFrom is not supportable from inside the prober. For now, implement
substitution based on hardcoded env and leave TODOs for future work.
Improves the state of #40846, will spawn a follow up issue for future refactoring after CRI settles down
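A minimal sketch of $(VAR)-style substitution against a pre-resolved environment map (expandCommand is an illustrative helper, not the prober's actual code), reflecting the limitation above that valueFrom is not handled:
```go
package main

import (
	"fmt"
	"regexp"
)

var varRef = regexp.MustCompile(`\$\(([A-Za-z_][A-Za-z0-9_]*)\)`)

// expandCommand substitutes $(VAR) references in each command argument using
// only the pre-resolved env map; unknown references are left untouched, and
// valueFrom sources are (deliberately) not handled here.
func expandCommand(cmd []string, env map[string]string) []string {
	out := make([]string, len(cmd))
	for i, arg := range cmd {
		out[i] = varRef.ReplaceAllStringFunc(arg, func(ref string) string {
			name := varRef.FindStringSubmatch(ref)[1]
			if v, ok := env[name]; ok {
				return v
			}
			return ref // leave unresolved references as-is
		})
	}
	return out
}

func main() {
	env := map[string]string{"POD_NAME": "web-0"}
	fmt.Println(expandCommand([]string{"/bin/healthcheck", "--pod=$(POD_NAME)"}, env))
	// [/bin/healthcheck --pod=web-0]
}
```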
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
CRI: Make dockershim better implements CRI.
When thinking about CRI Validation test, I found that `PodSandboxStatus.Linux.Namespaces.Options.HostPid` and `PodSandboxStatus.Linux.Namespaces.Options.HostIpc` are not populated. Although they are not used by kuberuntime now, we should populate them to conform to CRI.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Use `docker logs` directly if the docker logging driver is not `json-file`
Fixes https://github.com/kubernetes/kubernetes/issues/41996.
Post the PR first; I still need to manually test this, because we don't have test coverage for the journald logging plugin.
@yujuhong @dchen1107
/cc @kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Burstable QoS cgroup has cpu shares assigned
**What this PR does / why we need it**:
This PR sets the Burstable QoS cgroup cpu shares value to the sum of the pods' CPU requests in that tier. We need it for proper evaluation of CPU shares in the new QoS hierarchy.
**Special notes for your reviewer**:
It builds against the framework proposed for https://github.com/kubernetes/kubernetes/pull/41833
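A back-of-the-envelope sketch of the conversion from summed millicore requests to cgroup cpu.shares, assuming the conventional 1024-shares-per-CPU mapping with a floor of 2 (names here are illustrative, not the kubelet's):
```go
package main

import "fmt"

const (
	sharesPerCPU = 1024
	milliCPU     = 1000
	minShares    = 2 // kernel minimum for cpu.shares
)

// milliCPUToShares converts a CPU request in millicores to cgroup cpu.shares.
func milliCPUToShares(m int64) int64 {
	if m == 0 {
		return minShares
	}
	shares := m * sharesPerCPU / milliCPU
	if shares < minShares {
		return minShares
	}
	return shares
}

// burstableShares sums the CPU requests (in millicores) of all Burstable pods
// and converts the total to shares, as described above.
func burstableShares(podRequestsMilli []int64) int64 {
	var total int64
	for _, m := range podRequestsMilli {
		total += m
	}
	return milliCPUToShares(total)
}

func main() {
	// Two Burstable pods requesting 250m and 750m CPU -> 1024 shares.
	fmt.Println(burstableShares([]int64{250, 750}))
}
```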
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Ensure pod cgroup is deleted prior to deletion of pod
**What this PR does / why we need it**:
This PR ensures that the kubelet removes the pod cgroup sandbox prior to deletion of a pod from the apiserver. We need this to ensure that the default behavior in the kubelet is to not leak resources.
Automatic merge from submit-queue
Re-writing of the resolv.conf file generated by docker
Fixes #17406
Docker 1.12 will contain feature "The option --dns and --net=host should not be mutually exclusive" (docker/docker#22408)
This patch adds optional support for this ability in the kubelet (for now, if "hostNetwork: true" is set, all DNS settings are ignored).
To enable the feature, use the newly added kubelet flag: --allow-dns-for-hostnet=true
Introduced changes:
1. Re-writing of the resolv.conf file generated by docker.
Cluster DNS settings are no longer passed to the docker API in any case (not only for pods with host network):
the resolver conf will be overwritten after infra-container creation to override docker's behaviour.
2. Added a new dnsPolicy, 'ClusterFirstWithHostNet', so now there are:
- ClusterFirstWithHostNet - use dns settings in all cases, i.e. with hostNet=true as well
- ClusterFirst - use dns settings unless hostNetwork is true
- Default
Fixes #17406
Automatic merge from submit-queue
Extend experimental support to multiple Nvidia GPUs
Extended from #28216
```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with support for multiple Nvidia GPUs.
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```
1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```
TODO:
- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
Automatic merge from submit-queue (batch tested with PRs 41921, 41695, 42139, 42090, 41949)
Unify fake runtime helper in kuberuntime, rkt and dockertools.
Addresses https://github.com/kubernetes/kubernetes/pull/42081#issuecomment-282429775.
Add `pkg/kubelet/container/testing/fake_runtime_helper.go`, and change `kuberuntime`, `rkt` and `dockertools` to use it.
@yujuhong This is a small unit test refactoring PR. Could you help me review it?
Automatic merge from submit-queue (batch tested with PRs 38676, 41765, 42103, 41833, 41702)
kubelet: cm: refactor QoS logic into separate interface
This commit has no functional change. It refactors the QoS cgroup logic into a new `QOSContainerManager` interface to allow for better isolation for QoS cgroup features coming down the pike.
This is a breakout of the refactoring component of my QoS memory limits PR https://github.com/kubernetes/kubernetes/pull/41149 which will need to be rebased on top of this.
@vishh @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 42162, 41973, 42015, 42115, 41923)
Increase Min Timeout for kill pod
Should mitigate #41347, which describes flakes in the inode eviction test due to "GracePeriodExceeded" errors.
When we use gracePeriod == 0, as we do in eviction, the pod worker currently sets a timeout of 2 seconds to kill a pod.
We are hitting this timeout fairly often during eviction tests, causing extra pods to be evicted (since the eviction manager "fails" to evict that pod, and kills the next one).
This PR increases the timeout from 2 seconds to 4, although we could increase it even more if we think that would be appropriate.
cc @yujuhong @vishh @derekwaynecarr
For containers that don't have bash, we should support env substitution
like we do on command and args. However, without major refactoring
valueFrom is not supportable from inside the prober. For now, implement
substitution based on hardcoded env and leave TODOs for future work.
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed; however, our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint; however, this way
it's platform independent, and the directory being removed by kubelet
should be almost empty.
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Remove SandboxReceived event
This PR removes SandboxReceived event in sync pod.
> This event seems somewhat meaningless, and clouds the event records for a pod. Do we actually need it? Pulling and pod received on the node are very relevant, this seems much less so. Would suggest we either remove it, or turn it into a message that clearly indicates why it has value.
Refer d65309399a (commitcomment-21052453).
cc @smarterclayton @yujuhong
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
Add support for attacher/detacher interface in Flex volume
Add support for attacher/detacher interface in Flex volume
This change breaks backward compatibility and needs to be release-noted.
```release-note
Flex volume plugin is updated to support attach/detach interfaces. It broke backward compatibility. Please update your drivers and implement the new callouts.
```
Automatic merge from submit-queue (batch tested with PRs 41962, 42055, 42062, 42019, 42054)
dockershim puts pause container in pod cgroup
**What this PR does / why we need it**:
The CRI was not launching the pause container in the pod level cgroup. The non-CRI code path was.
Automatic merge from submit-queue
Admit critical pods under resource pressure
And evict critical pods that are not static.
Depends on #40952.
For #40573
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Guaranteed admission for Critical Pods
This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical and the only failed predicates are InsufficientResourceErrors; in that case it preempts (not yet implemented) other pods to allow admission of the critical pod.
cc: @vishh
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)
Use consistent helper for getting secret names from pod
Kubelet secret-manager and mirror-pod admission both need to know what secrets a pod spec references. Eventually, a node authorizer will also need to know the list of secrets.
This creates a single (well, double, because api versions) helper that can be used to traverse the secret names referenced from a pod, optionally short-circuiting (for places that are just looking to see if any secrets are referenced, like admission, or are looking for a particular secret ref, like authorization)
Fixes:
* secret manager not handling secrets used by env/envFrom in initcontainers
* admission allowing mirror pods with secret references
@smarterclayton @wojtek-t
Automatic merge from submit-queue (batch tested with PRs 41621, 41946, 41941, 41250, 41729)
bug fix for hostport-syncer
fix a bug introduced by the previous refactoring of hostport-syncer. https://github.com/kubernetes/kubernetes/pull/39443
and fix some nits
Automatic merge from submit-queue
BestEffort QoS class has min cpu shares
**What this PR does / why we need it**:
BestEffort QoS class is given the minimum amount of CPU shares per the QoS design.
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
I think this can wait until master is open for 1.7 pulls, given that we're close to the 1.6 freeze.
After this and #41482 go in, the only code left that references legacylisters will be federation, and 1 bit in a stateful set unit test (which I'll clean up in a follow-up).
@resouer I imagine this will conflict with your equivalence class work, so one of us will be doing some rebasing 😄
cc @wojtek-t @gmarek @timothysc @jayunit100 @smarterclayton @deads2k @liggitt @sttts @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scalability-pr-reviews
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
Automatic merge from submit-queue (batch tested with PRs 41812, 41665, 40007, 41281, 41771)
Kubelet-rkt: Add useful information for Ops on the Kubelet Host
Create a Systemd SyslogIdentifier inside the [Service]
Create a Systemd Description inside the [Unit]
**What this PR does / why we need it**:
#### Overview
Logged in on the host, it's difficult to identify who's who.
This PR adds useful information to quickly get straight to the point with the **DESCRIPTION** field:
```
systemctl list-units "k8s*"
UNIT LOAD ACTIVE SUB DESCRIPTION
k8s_b5a9bdf7-e396-4989-8df0-30a5fda7f94c.service loaded active running kube-controller-manager-172.20.0.206
k8s_bec0d8a1-dc15-4b47-a850-e09cf098646a.service loaded active running nginx-daemonset-gxm4s
k8s_d2981e9c-2845-4aa2-a0de-46e828f0c91b.service loaded active running kube-apiserver-172.20.0.206
k8s_fde4b0ab-87f8-4fd1-b5d2-3154918f6c89.service loaded active running kube-scheduler-172.20.0.206
```
#### Overview and Journal
Still on the host, to easily retrieve a pod's logs, this PR adds a SyslogIdentifier named after the pod base name.
```
# A DaemonSet prometheus-node-exporter is running on the Kubernetes Cluster
systemctl list-units "k8s*" | grep prometheus-node-exporter
k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service loaded active running prometheus-node-exporter-85cpp
# Get the logs from the prometheus-node-exporter DaemonSet
journalctl -t prometheus-node-exporter | wc -l
278
```
Sadly the `journalctl` flag `-t` / `--identifier` doesn't allow a pattern to catch the logs.
This field also improves any queries made by tools that export the Journal (e.g. ES, Kibana):
```
{
"__CURSOR" : "s=86fd390d123b47af89bb15f41feb9863;i=164b2c27;b=7709deb3400841009e0acc2fec1ebe0e;m=1fe822ca4;t=54635e6a62285;x=b2d321019d70f36f",
"__REALTIME_TIMESTAMP" : "1484572200411781",
"__MONOTONIC_TIMESTAMP" : "8564911268",
"_BOOT_ID" : "7709deb3400841009e0acc2fec1ebe0e",
"PRIORITY" : "6",
"_UID" : "0",
"_GID" : "0",
"_SYSTEMD_SLICE" : "system.slice",
"_SELINUX_CONTEXT" : "system_u:system_r:kernel_t:s0",
"_MACHINE_ID" : "7bbb4401667243da81671e23fd8a2246",
"_HOSTNAME" : "Kubelet-Host",
"_TRANSPORT" : "stdout",
"SYSLOG_FACILITY" : "3",
"_COMM" : "ld-linux-x86-64",
"_CAP_EFFECTIVE" : "3fffffffff",
"SYSLOG_IDENTIFIER" : "prometheus-node-exporter",
"_PID" : "88827",
"_EXE" : "/var/lib/rkt/pods/run/c60a4b1a-387d-4fce-afa1-642d6f5716c1/stage1/rootfs/usr/lib64/ld-2.21.so",
"_CMDLINE" : "stage1/rootfs/usr/lib/ld-linux-x86-64.so.2 stage1/rootfs/usr/bin/systemd-nspawn [....]",
"_SYSTEMD_CGROUP" : "/system.slice/k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service",
"_SYSTEMD_UNIT" : "k8s_c60a4b1a-387d-4fce-afa1-642d6f5716c1.service",
"MESSAGE" : "[ 8564.909237] prometheus-node-exporter[115]: time=\"2017-01-16T13:10:00Z\" level=info msg=\" - time\" source=\"node_exporter.go:157\""
}
```
Automatic merge from submit-queue (batch tested with PRs 41146, 41486, 41482, 41538, 41784)
fix issue #41746
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes #41746
**Special notes for your reviewer**:
cc @feiskyer
Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373)
Change taints/tolerations to api fields
This PR changes current implementation of taints and tolerations from annotations to API fields. Taint and toleration are now part of `NodeSpec` and `PodSpec`, respectively. The annotation keys: `scheduler.alpha.kubernetes.io/tolerations` and `scheduler.alpha.kubernetes.io/taints` have been removed.
**Release note**:
Pod tolerations and node taints have moved from annotations to API fields in the PodSpec and NodeSpec, respectively. Pod tolerations and node taints that are defined in the annotations will be ignored. The annotation keys: `scheduler.alpha.kubernetes.io/tolerations` and `scheduler.alpha.kubernetes.io/taints` have been removed.
Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657)
Enable pod level cgroups by default
**What this PR does / why we need it**:
It enables pod level cgroups by default.
**Special notes for your reviewer**:
This is intended to be enabled by default on 2/14/2017 per the plan outlined here:
https://github.com/kubernetes/community/pull/314
**Release note**:
```release-note
Each pod has its own associated cgroup by default.
```
Automatic merge from submit-queue
Log that debug handlers have been turned on.
**What this PR does / why we need it**: This PR logs a message when the debug handlers are turned on. It allows the operator to know, and to automate a check for, the case where debug has been left on.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 39373, 41585, 41617, 41707, 39958)
Fix ConfigMaps for Windows
**What this PR does / why we need it**: ConfigMaps were broken for Windows because the existing code used Linux-specific file paths. Updated the code in `kubelet_getters.go` to use `path/filepath` to get the directories. Also reverted the code in `secret.go`, as updating `kubelet_getters.go` to use `path/filepath` also fixes `secrets`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/39372
```release-note
Fix ConfigMap for Windows Containers.
```
cc: @pires
Changes the kubelet so it doesn't use the cert/key files directly for
starting the TLS server. Instead the TLS server reads the cert/key from
the new CertificateManager component, which is responsible for
requesting new certificates from the Certificate Signing Request API on
the API Server.
Automatic merge from submit-queue (batch tested with PRs 41649, 41658, 41266, 41371, 41626)
Understand why kubelet cannot clean up orphaned pod dirs
**What this PR does / why we need it**:
Distinguish whether we are unable to clean up orphaned pod directories because of a failure to read the directory or because paths still exist, to improve our ability to debug error situations.
Automatic merge from submit-queue (batch tested with PRs 40505, 34664, 37036, 40726, 41595)
dockertools: call TearDownPod when GC-ing infra pods
The docker runtime doesn't tear down networking when GC-ing pods.
rkt already does, so make docker do it too. To ensure this happens,
infra pods are now always GC-ed rather than gated by
containersToKeep.
This prevents IPAM from leaking when the pod gets killed for
some reason outside kubelet (like docker restart) or when pods
are killed while kubelet isn't running.
Fixes: https://github.com/kubernetes/kubernetes/issues/14940
Related: https://github.com/kubernetes/kubernetes/pull/35572
Automatic merge from submit-queue (batch tested with PRs 38101, 41431, 39606, 41569, 41509)
Report node not ready on failed PLEG health check
Report node not ready if PLEG health check fails.
Automatic merge from submit-queue (batch tested with PRs 38101, 41431, 39606, 41569, 41509)
optimize killPod() and syncPod() functions
Make sure that one of the two arguments (runningPod, status) is non-nil, just as the function comment says,
and check the return value in syncPod() before setting podKilled.
Automatic merge from submit-queue (batch tested with PRs 38101, 41431, 39606, 41569, 41509)
[hairpin] fix argument of nsenter
**Release note**:
```release-note
None
```
We should use:
nsenter --net=netnsPath -- -F some_command
instead of:
nsenter -n netnsPath -- -F some_command
Because "nsenter -n netnsPath" gets an error output:
# nsenter -n /proc/67197/ns/net ip addr
nsenter: neither filename nor target pid supplied for ns/net
If we really want to use -n, we need to use it in this format:
# sudo nsenter -n/proc/67197/ns/net ip addr
Dead infra containers may still have network resources allocated to
them and may not be GC-ed for a long time. But allowing SyncPod()
to restart an infra container before the old one is destroyed
prevents network plugins from carrying the old network details
(eg IPAM) over to the new infra container.
The docker runtime doesn't tear down networking when GC-ing pods.
rkt already does, so make docker do it too. To ensure this happens,
networking is always torn down for the container even if the
container itself is not deleted.
This prevents IPAM from leaking when the pod gets killed for
some reason outside kubelet (like docker restart) or when pods
are killed while kubelet isn't running.
Fixes: https://github.com/kubernetes/kubernetes/issues/14940
Related: https://github.com/kubernetes/kubernetes/pull/35572
We need to tear down networking when garbage collecting containers too,
and GC is run from a different goroutine in kubelet. We don't want
container network operations running for the same pod concurrently.
The PluginManager almost duplicates the network plugin interface, but
not quite since the Init() function should be called by whatever
actually finds and creates the network plugin instance. Only then
does it get passed off to the PluginManager.
The Manager synchronizes pod-specific network operations like setup,
teardown, and pod network status. It passes through all other
operations so that runtimes don't have to cache the network plugin
directly, but can use the PluginManager as a wrapper.
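A minimal sketch of the per-pod serialization idea (the types below are simplified stand-ins, not the kubelet's real network plugin interfaces):
```go
package netpluginmgr

import "sync"

// podLocks hands out one mutex per pod so that setup, teardown, and status
// calls for the same pod never run concurrently, while different pods can
// still proceed in parallel.
type podLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex // keyed by pod namespace/name
}

func newPodLocks() *podLocks {
	return &podLocks{locks: make(map[string]*sync.Mutex)}
}

func (p *podLocks) lock(podFullName string) *sync.Mutex {
	p.mu.Lock()
	defer p.mu.Unlock()
	l, ok := p.locks[podFullName]
	if !ok {
		l = &sync.Mutex{}
		p.locks[podFullName] = l
	}
	return l
}

// networkPlugin is a simplified stand-in for the real plugin interface.
type networkPlugin interface {
	SetUpPod(namespace, name, sandboxID string) error
	TearDownPod(namespace, name, sandboxID string) error
}

// pluginManager wraps a plugin and serializes per-pod operations.
type pluginManager struct {
	plugin networkPlugin
	pods   *podLocks
}

func (pm *pluginManager) SetUpPod(namespace, name, sandboxID string) error {
	l := pm.pods.lock(namespace + "/" + name)
	l.Lock()
	defer l.Unlock()
	return pm.plugin.SetUpPod(namespace, name, sandboxID)
}

func (pm *pluginManager) TearDownPod(namespace, name, sandboxID string) error {
	l := pm.pods.lock(namespace + "/" + name)
	l.Lock()
	defer l.Unlock()
	return pm.plugin.TearDownPod(namespace, name, sandboxID)
}
```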
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)
Delay Deletion of a Pod until volumes are cleaned up
#41436 fixed the bug that caused #41095 and #40239 to have to be reverted. Now that the bug is fixed, this shouldn't cause problems.
@vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc
Automatic merge from submit-queue (batch tested with PRs 41531, 40417, 41434)
Always detach volumes in operation executor
**What this PR does / why we need it**:
Instead of marking a volume as detached immediately in Kubelet's
reconciler, delegate the marking asynchronously to the operation
executor. This is necessary to prevent race conditions with other
operations mutating the same volume state.
An example of one such problem:
1. pod is created, volume is added to desired state of the world
2. reconciler process starts
3. reconciler starts MountVolume, which is kicked off asynchronously via
operation_executor.go
4. MountVolume mounts the volume, but hasn't yet marked it as mounted
5. pod is deleted, volume is removed from desired state of the world
6. reconciler reaches detach volume section, detects volume is no longer in desired state of world,
removes it from volumes in use
7. MountVolume tries to mark mount, throws an error because
volume is no longer in actual state of world list. After this, kubelet isn't aware of the mount
so doesn't try to unmount again.
8. controller-manager tries to detach the volume, this fails because it
is still mounted to the OS.
9. EBS gets stuck indefinitely in busy state trying to detach.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #32881, fixes #37854 (maybe)
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Make sure that one of the two arguments (runningPod, status) is non-nil, just as the function comment says,
and check the return value in syncPod() before setting podKilled.
Automatic merge from submit-queue
kubeadm: Migrate to client-go
**What this PR does / why we need it**: Finish the migration for kubeadm to use client-go wherever possible
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubeadm/issues/52
**Special notes for your reviewer**: /cc @luxas @pires
**Release note**:
```release-note
NONE
```
Some imports don't exist yet (or so it seems) in client-go, examples
being:
- "k8s.io/kubernetes/pkg/api/validation"
- "k8s.io/kubernetes/pkg/util/initsystem"
- "k8s.io/kubernetes/pkg/util/node"
One change in the kubelet to import client-go.
Automatic merge from submit-queue
Allow multiple DNS servers as a comma-separated argument for kubelet --dns
This PR explores how the kubelet's "--dns" flag could be extended to specify multiple DNS servers for in-cluster pods. Testing on the local libvirt-coreos cluster shows that multiple DNS servers are injected without issues.
Specifying multiple DNS servers increases resilience against
- Packet drops
- Single server failure
I am debugging services that do 50+ DNS requests for a single incoming interactive request, which greatly increases the chance of a slowdown (+5s) due to a single packet drop. Switching to two DNS servers will reduce the impact of the issue (roughly +1s on glibc, 0s on musl; the error rate goes down to error-rate^2).
Note that there is no need to change any runtime-related code as far as I know. In the case of "default" DNS, /etc/resolv.conf is parsed and multiple DNS servers are sent to the backend anyway. This only adds the same capability for the clusterFirst case.
I've heard from @thockin that multiple DNS entries have been considered somehow; I have no idea what was considered. This is what I would like to see for our production use, though.
```release-note
NONE
```
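A minimal sketch of parsing a comma-separated --dns value into a validated resolver list (parseDNSFlag is an illustrative helper, not the kubelet's actual flag parsing):
```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseDNSFlag splits a comma-separated --dns value and rejects anything that
// is not a valid IP address.
func parseDNSFlag(value string) ([]net.IP, error) {
	var servers []net.IP
	for _, s := range strings.Split(value, ",") {
		s = strings.TrimSpace(s)
		if s == "" {
			continue
		}
		ip := net.ParseIP(s)
		if ip == nil {
			return nil, fmt.Errorf("invalid DNS server address: %q", s)
		}
		servers = append(servers, ip)
	}
	return servers, nil
}

func main() {
	servers, err := parseDNSFlag("10.0.0.10, 10.0.0.11")
	fmt.Println(servers, err) // [10.0.0.10 10.0.0.11] <nil>
}
```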
Automatic merge from submit-queue (batch tested with PRs 41360, 41423, 41430, 40647, 41352)
kubelet: reduce extraneous logging for pods using host network
For pods using the host network, kubelet/shim should not log
error/warning messages when determining the pod IP address.