Automatic merge from submit-queue
Fix killing child sudo process in e2e_node tests
Fixes#29211.
The context is we are trying to kill a process started as `sudo kube-apiserver`, but `sudo` ignores signals from the same process group. Applying `Setpgid` means the `sudo kill` process won't be in the same process group, so will not fall foul of this nifty feature.
I also took the liberty of removing some code setting `Pdeathsig` because it claims to be doing something in the same area, but actually it doesn't do that at all. The setting is applied to the forked process, i.e. `sudo`, and it means the `sudo` will get killed if we (`e2e_node.test`) die. This (a) isn't what the comment says and (b) doesn't help because sending SIGKILL to the sudo process leaves sudo's child alive.
I didn't use the "hack for linux-only" approach because I think `Setpgid` is available on all platforms that `e2e_node` builds on.
This fixes race conditions in configmap, secret, downwardapi & git_repo
volume plugins.
wrappedVolumeSpec vars used by volume mounters and unmounters contained
a pointer to api.Volume structs which were being patched by
NewWrapperMounter/NewWrapperUnmounter, causing race condition during
volume mounts.
Automatic merge from submit-queue
Judge the cloud isn't nil before use it in server.go
The PR add a judgement for the cloud before use it, because cloudprovider.InitCloudProvider maybe return nil for the cloud.
Automatic merge from submit-queue
Fix ConfigMap related node e2e tests on selinux enabled systems
One selinux enabled systems, it might require to relabel
/var/lib/kubelet, otherwise following tests fail:
Summarizing 7 Failures:
```
[Fail] [k8s.io] ConfigMap [It] updates should be reflected in volume [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/configmap.go:131
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volume as non-root with FSGroup [Feature:FSGroup]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volume with mappings as non-root [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volumpe [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volume with mappings [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volume with mappings as non-root with FSGroup [Feature:FSGroup]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
[Fail] [k8s.io] ConfigMap [It] should be consumable from pods in volume as non-root [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/util.go:2115
```
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Faster test
<!--
Checklist for submitting a Pull Request
Please remove this comment block before submitting.
1. Please read our [contributor guidelines](https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md).
2. See our [developer guide](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md).
3. If you want this PR to automatically close an issue when it is merged,
add `fixes #<issue number>` or `fixes #<issue number>, fixes #<issue number>`
to close multiple issues (see: https://github.com/blog/1506-closing-issues-via-pull-requests).
4. Follow the instructions for [labeling and writing a release note for this PR](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes) in the block below.
-->
In attempting to troubleshoot flakes with this test case I actually wanted to understand how it worked.
There's some poor comments that need work.
I added some additional output which may or may not help in debugging the flakes.
I doubt this fixes the flake.
My major concern is the 'refactor' I did of the test case to batch up runs by sub-test-case. As it stood there was a 200ms pause between each sub, so they should not have interfered with each other. Now they are just started as fast as possible, but only 20 run at a time before moving on to the next 20. I am not sure if I am violating the ethos of the original test case.
Runs on my computer are down from 2m40s -> 40s.
Getting rid of the arbitrary client limiting brings it down to ~12 seconds. 11 to fetch the image and <1 to actually run the tests against the proxies. I can add a zero to the number of loops if you want to hit it harder. It would result in 10x as much text output though.
[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/.github/PULL_REQUEST_TEMPLATE.md?pixel)]()
Automatic merge from submit-queue
Use response content-type on restclient errors
Also allow a new AcceptContentTypes field to allow the client to ask for
a fallback serialization when getting responses from the server. This
allows a new client to ask for protobuf and JSON, falling back to JSON
when necessary.
The changes to request.go allow error responses from non-JSON servers to
be properly decoded.
@wojtek-t - also alters #28910 slightly (this is better output)
Automatic merge from submit-queue
Add an Azure CloudProvider Implementation
This PR adds `Azure` as a cloudprovider provider for Kubernetes. It specifically adds support for native pod networking (via Azure User Defined Routes) and L4 Load Balancing (via Azure Load Balancers).
I did have to add `clusterName` as a parameter to the `LoadBalancers` methods. This is because Azure only allows one "LoadBalancer" object per set of backend machines. This means a single "LoadBalancer" object must be shared across the cluster. The "LoadBalancer" is named via the `cluster-name` parameter passed to `kube-controller-manager` so as to enable multiple clusters per resource group if the user desires such a configuration.
There are few things that I'm a bit unsure about:
1. The implementation of the `Instances` interface. It's not extensively documented, it's not really clear what the different functions are used for, and my questions on the ML didn't get an answer.
2. Counter to the comments on the `LoadBalancers` Interface, I modify the `api.Service` object in `EnsureLoadBalancerDeleted`, but not with the intention of affecting Kube's view of the Service. I simply do it so that I can remove the `Port`s on the `Service` object and then re-use my reconciliation logic that can handle removing stale/deleted Ports.
3. The logging is a bit verbose. I'm looking for guidance on the appropriate log level to use for the chattier bits.
Due to the (current) lack of Instance Metadata Service and lack of Virtual Machine Identity in Azure, the user is required to do a few things to opt-in to this provider. These things are called-out as they are in contrast to AWS/GCE:
1. The user must provision an Azure Active Directory ServicePrincipal with `Contributor` level access to the resource group that the cluster is deployed in. This creation process is documented [by Hashicorp](https://www.packer.io/docs/builders/azure-setup.html) or [on the MSDN Blog](https://blogs.msdn.microsoft.com/arsen/2016/05/11/how-to-create-and-test-azure-service-principal-using-azure-cli/).
2. The user must place a JSON file somewhere on each Node that conforms to the `AzureConfig` struct defined in `azure.go`. (This is automatically done in the Azure flavor of [Kubernetes-Anywhere](https://github.com/kubernetes/kubernetes-anywhere).)
3. The user must specify `--cloud-config=/path/to/azure.json` as an option to `kube-apiserver` and `kube-controller-manager` similarly to how the user would need to pass `--cloud-provider=azure`.
I've been running approximately this code for a month and a half. I only encountered one bug which has since been fixed and covered by a unit test. I've just deployed a new cluster (and a Type=LoadBalancer nginx Service) using this code (via `kubernetes-anywhere`) and have posted [the `kube-controller-manager` logs](https://gist.github.com/colemickens/1bf6a26e7ef9484a72a30b1fcf9fc3cb) for anyone who is interested in seeing the logs of the logic.
If you're interested in this PR, you can use the instructions in my [`azure-kubernetes-demo` repository](https://github.com/colemickens/azure-kubernetes-demo) to deploy a cluster with minimal effort via [`kubernetes-anywhere`](https://github.com/kubernetes/kubernetes-anywhere). (There is currently [a pending PR in `kubernetes-anywhere` that is needed](https://github.com/kubernetes/kubernetes-anywhere/pull/172) in conjuncture with this PR). I also have a pre-built `hyperkube` image: `docker.io/colemickens/hyperkube-amd64:v1.4.0-alpha.0-azure`, which will be kept in sync with the branch this PR stems from.
I'm hoping this can land in the Kubernetes 1.4 timeframe.
CC (potential code reviewers from Azure): @ahmetalpbalkan @brendandixon @paulmey
CC (other interested Azure folk): @brendandburns @johngossman @anandramakrishna @jmspring @jimzim
CC (others who've expressed interest): @codefx9 @edevil @thockin @rootfs
Automatic merge from submit-queue
Add support for kubectl create quota command
Follow-up of https://github.com/kubernetes/kubernetes/pull/19625
```
Create a resourcequota with the specified name, hard limits and optional scopes
Usage:
kubectl create quota NAME [--hard=key1=value1,key2=value2] [--scopes=Scope1,Scope2] [--dry-run=bool] [flags]
Aliases:
quota, q
Examples:
// Create a new resourcequota named my-quota
$ kubectl create quota my-quota --hard=cpu=1,memory=1G,pods=2,services=3,replicationcontrollers=2,resourcequotas=1,secrets=5,persistentvolumeclaims=10
// Create a new resourcequota named best-effort
$ kubectl create quota best-effort --hard=pods=100 --scopes=BestEffort
```
Automatic merge from submit-queue
Fix panic in schema test
If the swagger files for testing are lost, the func `loadSchemaForTest` or `NewSwaggerSchemaFromBytes` will return a non-nil error and a nil schema. In this case, the calling for `ValidateBytes` will result in panic. So, call Fatalf instead of Errorf.
Also fix minor typos.
Test logs:
```
--- FAIL: TestLoad (0.01s)
schema_test.go:131: Failed to load: open ../../../api/swagger-spec/v1.json: no such file or directory
--- FAIL: TestValidateOk (0.00s)
schema_test.go:138: Failed to load: open ../../../api/swagger-spec/v1.json: no such file or directory
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x4d52df]
goroutine 10 [running]:
panic(0x15fffa0, 0xc8200100a0)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
testing.tRunner.func1(0xc820085a70)
/usr/local/go/src/testing/testing.go:467 +0x192
panic(0x15fffa0, 0xc8200100a0)
/usr/local/go/src/runtime/panic.go:443 +0x4e9
k8s.io/kubernetes/pkg/api/validation.TestValidateOk(0xc820085a70)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/validation/schema_test.go:159 +0x79f
testing.tRunner(0xc820085a70, 0x22aad68)
/usr/local/go/src/testing/testing.go:473 +0x98
created by testing.RunTests
/usr/local/go/src/testing/testing.go:582 +0x892
FAIL k8s.io/kubernetes/pkg/api/validation 0.048s
```
Automatic merge from submit-queue
Redirect the website to new location in gpu-support.md
The website has been changed, should be redirected to new one.
Automatic merge from submit-queue
Add rules for all directories in federation/cmd/
federation related target is not included in Makefile. Add it.
/cc @thockin
BTW, `make help` is still WIP.
Automatic merge from submit-queue
Rework pod waiting mechanism in e2e tests to accept pod and watch based
This PR re-applies #28212 which was reverted in #29223. The only difference is that the initial PR contained also `PodStartTimeout` shortening (see [here](4b0c0bd924)) which might caused the problems. Let's give it a 2nd try. I've tested all the flakes and they were passing on my machine.
@smarterclayton @apelisse ptal
GCI QA jobs will run tests using GCI daily builds, and the kubernetes built into
the images. All QA jobs will set the `JENKINS_USE_GCI_VERSION` env var.
1. Use --client since -c is deprecated now
2. The command (./kubectl version --client | grep -o 'GitVersion:"[^"]*"')
now returns:
GitVersion:"v1.4.0-alpha.1.784+ed3a29bd6aeb98-dirty"
so parse out the version better using sed
Related to #23708
Package goroutinemap can be structurally simplified to be more
idiomatic, concise, and free of error potential. No structural changes
are made.
It is unconventional declare `sync.Mutex` directly as a pointerized
field in a parent structure. The `sync.Mutex` operates on pointer
receivers of itself; and by relying on that, the types that contain
those fields can be safely constructed using
https://golang.org/ref/spec#The_zero_value.
The duration constants are already of type `time.Duration`, so
re-declaring that is redundant.
According to the documentation for Go package time, `time.Ticker` and
`time.Timer` are uncollectable by garbage collector finalizers. They
leak until otherwise stopped. This commit ensures that all remaining
instances are stopped upon departure from their relative scopes.