Automatic merge from submit-queue
Fix mount collision timeout issue
Short- or medium-term workaround for #29555. The root issue being fixed here is that the recent attach/detach work in the kubelet uses a unique volume name as a key that tracks the work that has to be done for each volume in a pod to attach/mount/umount/detach. However, the non-attachable volume plugins do not report unique names for themselves, which causes collisions when a single secret or configmap is mounted multiple times in a pod.
This is still a WIP -- I need to add a couple E2E tests that ensure that tests break in the future if there is a regression -- but posting for early review.
cc @kubernetes/sig-storage
Ultimately, I would like to refine this a bit further. A couple things I would like to change:
1. `GetUniqueVolumeName` should be a property ONLY of attachable volumes
2. I would like to see the kubelet apparatus for attach/mount/umount/detach handle non-attachable volumes specifically to avoid things like the `WaitForControllerAttach` call that has to be done for those volume types now
Add a new docker integration with kubelet using the new runtime API.
This change adds the package with some skeletons, and implements some
of the basic operations.
Automatic merge from submit-queue
Quota was not counting services with multiple nodeports properly
```release-note
If a service of type node port declares multiple ports, quota on "services.nodeports" will charge for each port in the service.
```
Fixes https://github.com/kubernetes/kubernetes/issues/29456
/cc @kubernetes/rh-cluster-infra @sdminonne
Automatic merge from submit-queue
volume/flocker: plug time.Ticker resource leak
This commit ensures that `flockerMounter.updateDatasetPrimary` does not leak
running `time.Ticker` instances. Upon termination of the consuming routine, we
stop the tickers.
```release-note
* flockerMounter.updateDatasetPrimary no longer leaks running time.Ticker instances.
Upon termination of the consuming routine, we stop the tickers.
```
Automatic merge from submit-queue
LimitRanger and PodSecurityPolicy need to check more on init containers
Container limits not applied to init containers. HostPorts not checked on podsecuritypolicy
@pweil- @derekwaynecarr
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
Automatic merge from submit-queue
Fix wrapped volume race
**EDIT:** now covers configmap, secret, downwardapi & git_repo volume plugins.
Fixes#29297.
wrappedVolumeSpec used by configMapVolumeMounter and
configMapVolumeUnmounter contained a pointer to api.Volume which was
being patched by NewWrapperMounter/NewWrapperUnmounter, causing race
condition during configmap volume mounts.
See https://github.com/kubernetes/kubernetes/issues/29297#issuecomment-235403806 for complete explanation.
The subtle bug was introduced by #18445, it also can affect other volume plugins utilizing `wrappedVolumeSpec` technique, if this PR is correct/accepted will make more PRs for secrets etc. Although tmpfs variety of inner `emptyDir` volume appears to be less susceptible to this race, there's chance it can fail too.
The errors produced by this race look like this:
```Jul 19 17:05:21 ubuntu1604 kubelet[17097]: I0719 17:05:21.854303 17097 reconciler.go:253] MountVolume operation started for volume "kubernetes.io/configmap/foo-files"
(spec.Name: "files") to pod "11786582-4dbf-11e6-9fc9-64cca009c636" (UID: "11786582-4dbf-11e6-9fc9-64cca009c636").
Jul 19 17:05:21 ubuntu1604 kubelet[17097]: I0719 17:05:21.854842 17097 reconciler.go:253] MountVolume operation started for volume "kubernetes.io/configmap/bar-file
s" (spec.Name: "files") to pod "117d2c22-4dbf-11e6-9fc9-64cca009c636" (UID: "117d2c22-4dbf-11e6-9fc9-64cca009c636").
Jul 19 17:05:21 ubuntu1604 kubelet[17097]: E0719 17:05:21.860796 17097 configmap.go:171] Error creating atomic writer: stat /var/lib/kubelet/pods/117d2c22-4dbf-11e6-9fc9-64cca009c636/volumes/kubernetes.io~configmap/files: no such file or directory
Jul 19 17:05:21 ubuntu1604 kubelet[17097]: E0719 17:05:21.861070 17097 goroutinemap.go:155] Operation for "kubernetes.io/configmap/bar-files" failed. No retries permitted until 2016-07-19 17:07:21.861036886 +0200 CEST (durationBeforeRetry 2m0s). error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/bar-files" (spec.Name: "files") pod "117d2c22-4dbf-11e6-9fc9-64cca009c636" (UID: "117d2c22-4dbf-11e6-9fc9-64cca009c636") with: stat /var/lib/kubelet/pods/117d2c22-4dbf-11e6-9fc9-64cca009c636/volumes/kubernetes.io~configmap/files: no such file or directory
Jul 19 17:05:21 ubuntu1604 kubelet[17097]: E0719 17:05:21.861271 17097 configmap.go:171] Error creating atomic writer: stat /var/lib/kubelet/pods/11786582-4dbf-11e6-9fc9-64cca009c636/volumes/kubernetes.io~configmap/files: no such file or directory
Jul 19 17:05:21 ubuntu1604 kubelet[17097]: E0719 17:05:21.862284 17097 goroutinemap.go:155] Operation for "kubernetes.io/configmap/foo-files" failed. No retries permitted until 2016-07-19 17:07:21.862275753 +0200 CEST (durationBeforeRetry 2m0s). error: MountVolume.SetUp failed for volume "kubernetes.io/configmap/foo-files" (spec.Name: "files") pod "11786582-4dbf-11e6-9fc9-64cca009c636" (UID: "11786582-4dbf-11e6-9fc9-64cca009c636") with: stat /var/lib/kubelet/pods/11786582-4dbf-11e6-9fc9-64cca009c636/volumes/kubernetes.io~configmap/files: no such file or directory```
Note "Error creating atomic writer" errors.
This problem can be reproduced by making kubelet mount multiple config map volumes in parallel.
This fixes race conditions in configmap, secret, downwardapi & git_repo
volume plugins.
wrappedVolumeSpec vars used by volume mounters and unmounters contained
a pointer to api.Volume structs which were being patched by
NewWrapperMounter/NewWrapperUnmounter, causing race condition during
volume mounts.
Automatic merge from submit-queue
Use response content-type on restclient errors
Also allow a new AcceptContentTypes field to allow the client to ask for
a fallback serialization when getting responses from the server. This
allows a new client to ask for protobuf and JSON, falling back to JSON
when necessary.
The changes to request.go allow error responses from non-JSON servers to
be properly decoded.
@wojtek-t - also alters #28910 slightly (this is better output)
Automatic merge from submit-queue
Add an Azure CloudProvider Implementation
This PR adds `Azure` as a cloudprovider provider for Kubernetes. It specifically adds support for native pod networking (via Azure User Defined Routes) and L4 Load Balancing (via Azure Load Balancers).
I did have to add `clusterName` as a parameter to the `LoadBalancers` methods. This is because Azure only allows one "LoadBalancer" object per set of backend machines. This means a single "LoadBalancer" object must be shared across the cluster. The "LoadBalancer" is named via the `cluster-name` parameter passed to `kube-controller-manager` so as to enable multiple clusters per resource group if the user desires such a configuration.
There are few things that I'm a bit unsure about:
1. The implementation of the `Instances` interface. It's not extensively documented, it's not really clear what the different functions are used for, and my questions on the ML didn't get an answer.
2. Counter to the comments on the `LoadBalancers` Interface, I modify the `api.Service` object in `EnsureLoadBalancerDeleted`, but not with the intention of affecting Kube's view of the Service. I simply do it so that I can remove the `Port`s on the `Service` object and then re-use my reconciliation logic that can handle removing stale/deleted Ports.
3. The logging is a bit verbose. I'm looking for guidance on the appropriate log level to use for the chattier bits.
Due to the (current) lack of Instance Metadata Service and lack of Virtual Machine Identity in Azure, the user is required to do a few things to opt-in to this provider. These things are called-out as they are in contrast to AWS/GCE:
1. The user must provision an Azure Active Directory ServicePrincipal with `Contributor` level access to the resource group that the cluster is deployed in. This creation process is documented [by Hashicorp](https://www.packer.io/docs/builders/azure-setup.html) or [on the MSDN Blog](https://blogs.msdn.microsoft.com/arsen/2016/05/11/how-to-create-and-test-azure-service-principal-using-azure-cli/).
2. The user must place a JSON file somewhere on each Node that conforms to the `AzureConfig` struct defined in `azure.go`. (This is automatically done in the Azure flavor of [Kubernetes-Anywhere](https://github.com/kubernetes/kubernetes-anywhere).)
3. The user must specify `--cloud-config=/path/to/azure.json` as an option to `kube-apiserver` and `kube-controller-manager` similarly to how the user would need to pass `--cloud-provider=azure`.
I've been running approximately this code for a month and a half. I only encountered one bug which has since been fixed and covered by a unit test. I've just deployed a new cluster (and a Type=LoadBalancer nginx Service) using this code (via `kubernetes-anywhere`) and have posted [the `kube-controller-manager` logs](https://gist.github.com/colemickens/1bf6a26e7ef9484a72a30b1fcf9fc3cb) for anyone who is interested in seeing the logs of the logic.
If you're interested in this PR, you can use the instructions in my [`azure-kubernetes-demo` repository](https://github.com/colemickens/azure-kubernetes-demo) to deploy a cluster with minimal effort via [`kubernetes-anywhere`](https://github.com/kubernetes/kubernetes-anywhere). (There is currently [a pending PR in `kubernetes-anywhere` that is needed](https://github.com/kubernetes/kubernetes-anywhere/pull/172) in conjuncture with this PR). I also have a pre-built `hyperkube` image: `docker.io/colemickens/hyperkube-amd64:v1.4.0-alpha.0-azure`, which will be kept in sync with the branch this PR stems from.
I'm hoping this can land in the Kubernetes 1.4 timeframe.
CC (potential code reviewers from Azure): @ahmetalpbalkan @brendandixon @paulmey
CC (other interested Azure folk): @brendandburns @johngossman @anandramakrishna @jmspring @jimzim
CC (others who've expressed interest): @codefx9 @edevil @thockin @rootfs
Automatic merge from submit-queue
Add support for kubectl create quota command
Follow-up of https://github.com/kubernetes/kubernetes/pull/19625
```
Create a resourcequota with the specified name, hard limits and optional scopes
Usage:
kubectl create quota NAME [--hard=key1=value1,key2=value2] [--scopes=Scope1,Scope2] [--dry-run=bool] [flags]
Aliases:
quota, q
Examples:
// Create a new resourcequota named my-quota
$ kubectl create quota my-quota --hard=cpu=1,memory=1G,pods=2,services=3,replicationcontrollers=2,resourcequotas=1,secrets=5,persistentvolumeclaims=10
// Create a new resourcequota named best-effort
$ kubectl create quota best-effort --hard=pods=100 --scopes=BestEffort
```
Automatic merge from submit-queue
Fix panic in schema test
If the swagger files for testing are lost, the func `loadSchemaForTest` or `NewSwaggerSchemaFromBytes` will return a non-nil error and a nil schema. In this case, the calling for `ValidateBytes` will result in panic. So, call Fatalf instead of Errorf.
Also fix minor typos.
Test logs:
```
--- FAIL: TestLoad (0.01s)
schema_test.go:131: Failed to load: open ../../../api/swagger-spec/v1.json: no such file or directory
--- FAIL: TestValidateOk (0.00s)
schema_test.go:138: Failed to load: open ../../../api/swagger-spec/v1.json: no such file or directory
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x4d52df]
goroutine 10 [running]:
panic(0x15fffa0, 0xc8200100a0)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
testing.tRunner.func1(0xc820085a70)
/usr/local/go/src/testing/testing.go:467 +0x192
panic(0x15fffa0, 0xc8200100a0)
/usr/local/go/src/runtime/panic.go:443 +0x4e9
k8s.io/kubernetes/pkg/api/validation.TestValidateOk(0xc820085a70)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/validation/schema_test.go:159 +0x79f
testing.tRunner(0xc820085a70, 0x22aad68)
/usr/local/go/src/testing/testing.go:473 +0x98
created by testing.RunTests
/usr/local/go/src/testing/testing.go:582 +0x892
FAIL k8s.io/kubernetes/pkg/api/validation 0.048s
```
Package goroutinemap can be structurally simplified to be more
idiomatic, concise, and free of error potential. No structural changes
are made.
It is unconventional declare `sync.Mutex` directly as a pointerized
field in a parent structure. The `sync.Mutex` operates on pointer
receivers of itself; and by relying on that, the types that contain
those fields can be safely constructed using
https://golang.org/ref/spec#The_zero_value.
The duration constants are already of type `time.Duration`, so
re-declaring that is redundant.
According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers. They
leak until otherwise stopped. This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.
Automatic merge from submit-queue
Kubelet: Fail kubelet if cadvisor is not started.
Fixes https://github.com/kubernetes/kubernetes/issues/28997.
We started cadvisor in `sync.Do()`, which only run once no matter cadvisor successfully starts or not.
Once it fails, kubelet will be stuck in a bad state. Kubelet could never start sync loop because there is an internal error, but kubelet would never retry starting cadvisor again.
This PR just fails kubelet when cadvisor start fails, and then relies on the babysitter to restart kubelet.
In the future, we may want to add backoff logic in the babysitter to protect the system.
On the other hand, https://github.com/kubernetes/kubernetes/pull/29492 will fix cadvisor side to prevent cadvisor failing because of these kind of transient error.
Mark P1 to match the original issue.
@dchen1107 @vishh
Automatic merge from submit-queue
network/cni: Unconditionally bring up `lo` interface
This is already done in kubenet. This specifically fixes an issue where a kubelet-managed network for the rkt runtime does not have an "UP" lo interface.
Fixes#28561
If this fix doesn't seem right, it could also be implemented by rkt effectively managing two "cni" network plugins, one for the user requested network, one for lo.
Followup CRs can improve unit testing further and then possibly remove the vendor directory logic (which seems like dead code)
cc @kubernetes/sig-rktnetes @kubernetes/sig-network @dcbw
Automatic merge from submit-queue
TestLoadBalancer() test v1 not v2
TestLoadBalancer() should test v1 and TestLoadBalancerV2() test v2, but In TestLoadBalancerV() there are codes:
cfg.LoadBalancer.LBVersion = "v2"
Automatic merge from submit-queue
Extract kubelet node status into separate file
Extract kubelet node status management into a separate file as a continuation of the kubelet code simplification effort.
Automatic merge from submit-queue
Remove duplicate prometheus metrics
This was a relic from before Kubernetes set Docker labels properly. Cadvisor now properly exposes the Docker labels (e.g. `io.kubernetes.pod.name` as `io_kubernetes_pod_name`, etc) so this is no longer required & actually results in unnecessary duplicate Prometheus labels.
Automatic merge from submit-queue
cleanup wrong naming: limitrange -> hpa
The code is in `horizontalpodautoscaler/strategy.go`, but the parameter is "limitrange". This is legacy copy-paste issue...
Automatic merge from submit-queue
Syncing imaging pulling backoff logic
- Syncing the backoff logic in the parallel image puller and the sequential image puller to prepare for merging the two pullers into one.
- Moving image error definitions under kubelet/images
Automatic merge from submit-queue
make addition group RESTStorage registration easier
Starts factoring out `RESTStorage` creation to eventually allow for decoupled API group `RESTStorage` configuration.
Right now you can't add additional groups without modifying the main API Group registration in master.go. Allows the `master.Config` to hold a function that can build a `RESTStorage` based on the `Master` struct.
@lavalamp @caesarxuchao @kubernetes/sig-api-machinery
@liggitt @smarterclayton
Automatic merge from submit-queue
Validation logic applied to edited file
The file that is submitted via ``edit`` is now subject to validation
logic as any other file. The validation flags were added to the ``edit``
command.
Fixes: #17542
Also allow a new AcceptContentTypes field to allow the client to ask for
a fallback serialization when getting responses from the server. This
allows a new client to ask for protobuf and JSON, falling back to JSON
when necessary.
The changes to request.go allow error responses from non-JSON servers to
be properly decoded.
Automatic merge from submit-queue
rkt: Fix /etc/hosts /etc/resolv.conf permissions
#29024 introduced copying /etc/hosts and /etc/resolv.conf before mounting them into rkt containers. However, the new files' permissions are set to 0640, which make these files unusable by any other users than root in the container as shown below. This small patch changes the permissions to 0644, as typically set.
```
# host rabbitmq
rabbitmq.default.svc.cluster.local has address 10.3.0.211
# ls -la /etc/resolv.conf
-rw-r-----. 1 root root 102 Jul 23 13:20 /etc/resolv.conf
# sudo -E -u foo bash
$ cat /etc/resolv.conf
cat: /etc/resolv.conf: Permission denied
$ host rabbitmq
;; connection timed out; no servers could be reached
# exit
# chmod 0644 /etc/resolv.conf /etc/hosts
# sudo -E -u foo host rabbitmq
rabbitmq.default.svc.cluster.local has address 10.3.0.211
```
cc @kubernetes/sig-rktnetes @yifan-gu @euank
Automatic merge from submit-queue
To break the loop when object found in removeOrphanFinalizer()
To break the loop when object found in removeOrphanFinalizer()
Automatic merge from submit-queue
Eviction manager needs to start as runtime dependent module
To support disk eviction, the eviction manager needs to know if there is a dedicated device for the imagefs. In order to know that information, we need to start the eviction manager after cadvisor. This refactors the location eviction manager is started.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @vishh @ronnielai
Automatic merge from submit-queue
Allow PVs to specify supplemental GIDs
Retry of https://github.com/kubernetes/kubernetes/pull/28691 . Adds a Kubelet helper function for getting extra supplemental groups
Automatic merge from submit-queue
Add parsing code in kubelet for eviction-minimum-reclaim
The kubelet parses the eviction-minimum-reclaim flag and validates it for correctness.
The first two commits are from https://github.com/kubernetes/kubernetes/pull/29329 which has already achieved LGTM.
The requirement that ExternalID returns InstanceNotFound when the
instance not found was incorrectly documented on InstanceID and
InstanceType. This requirement arises from the node controller, which
is the only place that checks for the InstanceNotFound error.
Automatic merge from submit-queue
Fix httpclient setup for gcp credential provider to have timeout
The default http client has no timeout.
This could cause problems when not on GCP environments.
This PR changes to use a 10s timeout, and ensures the transport has our normal defaults applied.
/cc @ncdc @liggitt
Automatic merge from submit-queue
Allow shareable resources for admission control plugins.
Changes allow admission control plugins to share resources. This is done via new PluginInitialization structure. The structure can be extended for other resources, for now it is an shared informer for namespace plugins (NamespiceLifecycle, NamespaceAutoProvisioning, NamespaceExists).
If a plugins needs some kind of shared resource e.g. client, the client shall be added to PluginInitializer and Wants methods implemented to every plugin which will use it.
Automatic merge from submit-queue
Add kubelet flag for eviction-minimum-reclaim
This is taken from #27199 as its the most burdensome to rebase and should have little disagreement.
/cc @vishh @ronnielai PTAL
Automatic merge from submit-queue
CRI: add LinuxUser to LinuxContainerConfig
Following discussion in https://github.com/kubernetes/kubernetes/pull/25899#discussion_r70996068
The Container Runtime Interface should provide runtimes with User information to run the container process as (OCI being one of them).
This patch introduces a new field `user` into `LinuxContainerConfig` structure. The `user` field introduces also a new type structure `LinuxUser` which consists of `uid`, `gid` and `additional_gids`.
The `LinuxUser` struct has been embedded into `LinuxContainerConfig` to leave space for future implementations which are not Linux-related (e.g. Windows may have a different representation of _Users_).
If you feel naming can be better we can probably move `LinuxUser` to `UnixUser` also.
/cc @mrunalp @vishh @euank @yujuhong
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Automatic merge from submit-queue
Removing images with multiple tags
If an image has multiple tags, we need to remove all the tags in order to make docker image removing successful.
#28491
Automatic merge from submit-queue
add enhanced volume and mount logging for block devices
Fixes#24568
Adding better logging and debugging for block device volumes and the shared SafeFormatAndMount (aws, gce, flex, rbd, cinder, etc...)