Commit Graph

64850 Commits (85099ba4c2eeb125da03bdce6217b28aaf6a0abb)

Author SHA1 Message Date
Maciej Szulik 8237e1a034
Generated changes 2018-04-25 13:09:12 +02:00
Shyam Jeedigunta 2d6b4d0fa0 Revert "gce: move etcd dir cleanup to manifests"
This reverts commit ae73bed1d0.
2018-04-25 12:54:12 +02:00
Krzysztof Siedlecki e6f14191ce version typo fix 2018-04-25 12:46:33 +02:00
Davanum Srinivas b44c68eb2e Hack for testing until test-infra/pull/7846 merges 2018-04-25 06:12:20 -04:00
Kubernetes Submit Queue c1a6d5544c
Merge pull request #61882 from xiangpengzhao/fix-changelog
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add note on upgrading cluster by kubeadm.

**What this PR does / why we need it**:
Upgrading cluster from 1.9.x to 1.10.0 by kubeadm fails due to type change of `featureGates` in `KubeProxyConfiguration` done in #57962. This PR add a note on what should be done before upgrading.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #61764

**Special notes for your reviewer**:
cc @kubernetes/sig-cluster-lifecycle-bugs @liggitt @fgbreel
Thanks @berlinsaint for the workaround!

We may need a patch (if possible) for kubeadm to do this automatically as suggested by @danderson .

**Release note**:

```release-note
NONE
```
2018-04-25 02:15:22 -07:00
Kubernetes Submit Queue 046baee847
Merge pull request #63118 from vikaschoudhary16/start-stop-race
Automatic merge from submit-queue (batch tested with PRs 62951, 57460, 63118). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix device plugin re-registration

**What this PR does / why we need it**:
While registering a new endpoint, device manager copies all the devices from the old endpoint for the same resource and then it stops the old endpoint and starts the new endpoint.

There is no sync between stopping the old and starting the new. While stopping the old, manager marks devices(which are copied to new endpoint as well) as "Unhealthy".

In the endpoint.go, when after restart, plugin reports devices healthy, same health state (healthy) is found  in the endpoint database and endpoint module does not update manager database.

Solution in the PR is to mark devices as unhealthy before copying to new endpoint. 


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62773

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
/cc @jiayingz @vishh @RenaudWasTaken @derekwaynecarr
2018-04-25 02:01:56 -07:00
Kubernetes Submit Queue 4f233eb92a
Merge pull request #57460 from dixudx/validate_initcontainer_hostport
Automatic merge from submit-queue (batch tested with PRs 62951, 57460, 63118). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix hostport checking for initContainers since they run in sequential order

**What this PR does / why we need it**:
Fix hostport checking for initContainers since they run in sequential order

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
partial Fixes #57443

**Special notes for your reviewer**:
/assign @CaoShuFeng @dims 

**Release note**:

```release-note
None
```
2018-04-25 02:01:53 -07:00
xuzhonghu e1bcca681d remove useless alwaysAdmit in apiserver test 2018-04-25 16:37:08 +08:00
Kubernetes Submit Queue aa1ec693c3
Merge pull request #62951 from dims/support-nsenter-better-in-non-systemd-envs
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Support nsenter in non-systemd environments

**What this PR does / why we need it**:

In our CI, we run kubekins image for most of the jobs. This is a
debian image with upstart and does not enable systemd. So we should

* Bailout if any binary is missing other than systemd-run.
* SupportsSystemd should check the binary path to correctly
  identify if the systemd-run is present or not
* Pass the errors back to the callers so kubelet is forced to
  fail early when there is a problem. We currently assume
  that all binaries are in the root directory by default which
  is wrong.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-04-25 01:08:10 -07:00
Kubernetes Submit Queue fad6279eee
Merge pull request #62745 from cblecker/github-template-owners
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add contribex to github template owners

**What this PR does / why we need it**:
The `.github/` folder doesn't have an OWNERS file. These templates fall under contribex, so this would add an OWNERS file to this folder.

**Release note**:

```release-note
NONE
```
2018-04-25 00:17:53 -07:00
Kubernetes Submit Queue 4dfd4b219c
Merge pull request #63111 from ixdy/rpm-rules-manual
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Tag pkg_rpm rules as manual

**What this PR does / why we need it**: Mitigation step for #63108. Nothing depends on these rules yet, and tagging as manual prevents `make bazel-build` from building them, too.

**Release note**:

```release-note
NONE
```

/assign @BenTheElder 
/cc @sigma
2018-04-25 00:17:32 -07:00
Martin Vladev 40cf788013 Register Prometheus etcdmetrics only for apiserver
Removed automatic registration with `init` funciton and use `Register` function
to register metrics for etcd storage only when requested.
2018-04-25 10:06:37 +03:00
Christoph Blecker 4644761625
Clean up and remove unused deps 2018-04-24 23:55:29 -07:00
Di Xu b47ab8b2d3 add warnings for docker-only flags 2018-04-25 12:56:53 +08:00
Kubernetes Submit Queue 61892abc94
Merge pull request #62874 from dcbw/dockershim-SetUpPod-cleanup-on-failure
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/sandbox: clean up pod network even if SetUpPod() failed

If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.

The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.

In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.

Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965

```release-note
NONE
```
2018-04-24 21:48:01 -07:00
Da K. Ma 793ed98715 Added more UT for invalid case.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-04-25 11:14:24 +08:00
vikaschoudhary16 c846d5fe63 Fix race between stopping old and starting new endpoint 2018-04-24 22:22:39 -04:00
Kubernetes Submit Queue 50dd920837
Merge pull request #62284 from DirectXMan12/bug/fix-use-rest-clients-help-line
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix hpa-use-rest-clients help text

The help text erroneously says "WARNING: alpha feature" when it
shouldn't have.  When we moved to beta, this should have been removed.

**Release note**:
```release-note
NONE
```
2018-04-24 19:01:23 -07:00
Kubernetes Submit Queue a4271c03cb
Merge pull request #63090 from mtaufen/fix-qosreserved-json-tag
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix qosReserved json tag (lowercase qos, instead of uppercase QOS)

The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.

API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md

>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".

Follow up to: https://github.com/kubernetes/kubernetes/pull/62925

```release-note
NONE
```

@sjenning @derekwaynecarr
2018-04-24 19:01:20 -07:00
Kubernetes Submit Queue 9baf337cf3
Merge pull request #63084 from mikedanese/ctx
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

replace request.Context with context.Context

Followup on https://github.com/kubernetes/kubernetes/pull/62810

cc @liggitt @kubernetes/sig-api-machinery-pr-reviews 

```release-note
NONE
```
2018-04-24 19:01:17 -07:00
Kubernetes Submit Queue be20a8d1d0
Merge pull request #62927 from hzxuzhonghu/fix-typo
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typo: mutating validating admission should be distinguished

1. fix typo: mutating validating admission should be distinguished

1. avoid calling admit.Handles twice in delete

**Release note**:

```release-note
NONE
```
2018-04-24 19:01:14 -07:00
Kubernetes Submit Queue 47ece3a2ca
Merge pull request #59220 from neolit123/test-token
Automatic merge from submit-queue (batch tested with PRs 59220, 62927, 63084, 63090, 62284). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubeadm: add better test coverage to token.go

**What this PR does / why we need it**:
a PR for adding some more tests in `kubeadm/cmd` for `token.go`.

some areas of the `token.go` like listing, creating and deleting tokens can present challenges.
coverage was increased to around 87%.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

please, link issue # if you know of such.

**Special notes for your reviewer**:
none

**Release note**:

```release-note
NONE
```
2018-04-24 19:01:10 -07:00
Kubernetes Submit Queue 6fbca94fae
Merge pull request #63010 from deads2k/api-04-metadataaccessor
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove confusing flexibility for metadata interpretation

Metadata accessors are coded in.  This means that we don't need to inject flexibility, the flexibility is already present based on what your code relies up.  This removes the per-individual resource injection which simplifies all calling code.

intersection of @kubernetes/sig-api-machinery-pr-reviews @kubernetes/sig-cli-maintainers 

```release-note
NONE
```
2018-04-24 17:59:12 -07:00
Jeff Grafton 6ddf6c12c5 Tag pkg_rpm rules as manual 2018-04-24 16:48:22 -07:00
Kubernetes Submit Queue e5274b6376
Merge pull request #61246 from serathius/remove-examples
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove examples directory

Removes `examples` directory, which was migrated https://github.com/kubernetes/examples/pull/210
Moves manifests used in tests to `test/e2e/testing-manifests`
I will submit PR to https://github.com/kubernetes/k8s.io to fix links to https://releases.k8s.io/*/examples
Fixes #60887

**Special notes for your reviewer**:
```release-note
NONE
```
cc @ahmetb @ixdy
2018-04-24 16:17:31 -07:00
Kubernetes Submit Queue f646ece977
Merge pull request #63074 from shyamjvs/fix-ip-alias-bug
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix IP_ALIAS_SUBNETWORK env var assignment in GCE setup

/cc @wojtek-t 

```release-note
NONE
```
2018-04-24 15:16:19 -07:00
Maciej Szulik af2353f8a3
Fix discovery default timeout test 2018-04-24 23:36:30 +02:00
Kubernetes Submit Queue 5b0df3656e
Merge pull request #63000 from kawych/versions
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove METADATA_AGENT_VERSION config option

**What this PR does / why we need it**:
Remove METADATA_AGENT_VERSION configuration option. To keep Metadata Agent version consistent across Kubernetes deployments.

**Release note**:
```release-note
Remove METADATA_AGENT_VERSION configuration option.
```
2018-04-24 14:22:23 -07:00
Kubernetes Submit Queue a399d9201b
Merge pull request #62922 from krousey/node-upgrade
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use BootID instead of ExternalID to check for new instance

PR #60692 changed the way that ExternalID is reported on GCE. Its value
is no longer the GCE instance ID. It is the instance name. So it
cannot be used to determine VM uniqueness across time. Instead,
upgrade will check that the boot ID changed.

**What this PR does / why we need it**:
Node upgrades stall out because the external ID remains the same across upgrades now.

**Which issue(s) this PR fixes**:
Fixes #62713 

**Release note**:
```release-note
NONE
```
2018-04-24 14:22:20 -07:00
Kubernetes Submit Queue 7105964f62
Merge pull request #63015 from mikedanese/etcd-empty-dir
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

gce: move etcd dir cleanup to manifests

we deploy it as a manifest, not an addon so locate it with the other
master manifests.

This is the last "bare pod addon", which needs to be remove to improve the situation in https://github.com/kubernetes/kubernetes/issues/62808.
 
```release-note

```
2018-04-24 14:22:16 -07:00
Kubernetes Submit Queue 15b61bc006
Merge pull request #62818 from mikedanese/selfdelete
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

authz: nodes should not be able to delete themselves

@kubernetes/sig-auth-pr-reviews 

```release-note
kubelets are no longer allowed to delete their own Node API object. Prior to 1.11, in rare circumstances related to cloudprovider node ID changes, kubelets would attempt to delete/recreate their Node object at startup. If a legacy kubelet encounters this situation, a cluster admin can remove the Node object:
* `kubectl delete node/<nodeName>`
or grant self-deletion permission explicitly:
* `kubectl create clusterrole self-deleting-nodes --verb=delete --resource=nodes`
* `kubectl create clusterrolebinding self-deleting-nodes --clusterrole=self-deleting-nodes --group=system:nodes`
```
2018-04-24 14:22:13 -07:00
Kubernetes Submit Queue b692b7159a
Merge pull request #62590 from mlmhl/csi_test
Automatic merge from submit-queue (batch tested with PRs 62590, 62818, 63015, 62922, 63000). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix some bugs inside CSI volume plugin unit test TestAttacherMountDevice

**What this PR does / why we need it**:

Fix two bugs inside CSI volume plugin unit test `TestAttacherMountDevice`.

**Release note**:

```release-note
None
```
2018-04-24 14:22:10 -07:00
Filipe Brandenburger 54d93820ee Update libcontainer to include PRs with fixes to systemd cgroup driver
PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that
makes Kubelet startup hang when using systemd cgroup driver (by adding a
timeout) and further PR opencontainers/runc#1772 fixes that bug by
checking the proper error status before waiting on the channel.

PR opencontainers/runc#1776 checks whether Delegate works in slices,
which keeps libcontainer systemd cgroup driver working on systemd v237+.

PR opencontainers/runc#1781 makes the channel buffered, so if we time
out waiting on the channel, the updater will not block trying to it
since there are no longer any consumers.
2018-04-24 13:29:13 -07:00
Kubernetes Submit Queue b2ab901230
Merge pull request #62390 from discordianfish/kube-proxy-tolerate-all
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubeadm: Make kube-proxy tolerate all taints

**What this PR does / why we need it**:
As a essential core component, kube-proxy should generally run on all
nodes even if the cluster operator taints nodes for special purposes.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#699

**Release note**:

```release-note
kubeadm creates kube-proxy with a toleration to run on all nodes, no matter the taint.
```
2018-04-24 13:28:31 -07:00
Kubernetes Submit Queue f68d10cfe4
Merge pull request #62853 from tony612/fix-resultRun-reset
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

reset resultRun to 0 on pod restart

**What this PR does / why we need it**:

The resultRun should be reset to 0 on pod restart, so that resultRun on the first failure of the new container will be 1, which is correct. Otherwise, the actual FailureThreshold after restarting will be `FailureThreshold - 1`.

**Which issue(s) this PR fixes**:

This PR is related to https://github.com/kubernetes/kubernetes/issues/53530. https://github.com/kubernetes/kubernetes/pull/46371 fixed that issue but there's still a little problem like what I said above.

**Special notes for your reviewer**:

**Release note**:
```release-note
fix resultRun by resetting it to 0 on pod restart
```
2018-04-24 13:28:25 -07:00
Kubernetes Submit Queue f388fcb229
Merge pull request #59122 from klausenbusk/kubeadm-ca
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubeadm: Mount additional paths inside apiserver/controller-manager for working CA root

This is required for a working CA root, as /etc/ssl/certs on a few
Linux distributions just contains a bunch of symlinks.
Container Linux and Debian have symlinks pointing to
/usr/share/ca-certificates, ArchLinux has symlinks pointing
to /etc/ca-certificates.
On Debian /etc/ssl/certs can also include symlinks pointing
to /usr/local/share/ca-certificates for local CA certificates.

Fix: kubeadm/#671

---

**What this PR does / why we need it**:

Without this PR, `controller-manager` and `apiserver` would lack a CA root on some Linux distro (ex: Container Linux) which for example break flexplugins which require a CA root [[1]](https://github.com/kubernetes-incubator/external-storage/issues/571#issuecomment-360155462).

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubeadm/issues/671

**Special notes for your reviewer**:

**Release note**:
```release-note
Mount additional paths required for a working CA root, for setups where /etc/ssl/certs doesn't contains certificates but just symlink.
```

/sig sig-kubeadm
2018-04-24 13:28:21 -07:00
Kubernetes Submit Queue bf1974c83f
Merge pull request #61711 from crassirostris/audit-size-limiting
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Implemented truncating audit backend

Fixes https://github.com/kubernetes/kubernetes/issues/60432

Introduces an optional truncating backend, disabled by default, that estimates the size of audit events and truncates events/split batches based on the configuration.

/cc @sttts @tallclair @CaoShuFeng @ericchiang 

```release-note
Introduce truncating audit backend that can be enabled for existing backend to limit the size of individual audit events and batches of events.
```
2018-04-24 13:28:17 -07:00
Kubernetes Submit Queue 67870dac16
Merge pull request #62655 from stealthybox/TLSUpgrade_+_detiber-kubeadm_hash
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Modify the kubeadm upgrade DAG for the TLS Upgrade

**What this PR does / why we need it**:
This adds the necessary utilities to detect Etcd TLS on static pods from the file system and query Etcd.
It modifies the upgrade logic to make it support the APIServer downtime.
Tests are included and should be passing.

```bash 
bazel test //cmd/kubeadm/... \
  && bazel build //cmd/kubeadm --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 \
  && issue=TLSUpgrade ~/Repos/vagrant-kubeadm-testing/copy_kubeadm_bin.sh
```
These cases are working consistently for me
```bash
kubeadm-1.9.6 reset \
  && kubeadm-1.9.6 init --kubernetes-version 1.9.1 \
  && kubectl apply -f https://git.io/weave-kube-1.6
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.6  # non-TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.0 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.1 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.1  # TLS to TLS /w major version downgrade
```

This branch is based on top of #61942, as resolving the hash race condition is necessary for consistent behavior.
It looks to fit in pretty well with @craigtracey's PR: #62141
The interfaces are pretty similar

/assign @detiber @timothysc

**Which issue(s) this PR fixes**
Helps with https://github.com/kubernetes/kubeadm/issues/740

**Special notes for your reviewer**:

278b322a1c
   [kubeadm] Implement ReadStaticPodFromDisk

c74b56372d
   Implement etcdutils with Cluster.HasTLS()

   - Test HasTLS()
   - Instrument throughout upgrade plan and apply
   - Update plan_test and apply_test to use new fake Cluster interfaces
   - Add descriptions to upgrade range test
   - Support KubernetesDir and EtcdDataDir in upgrade tests
   - Cover etcdUpgrade in upgrade tests
   - Cover upcoming TLSUpgrade in upgrade tests

8d8e5fe33b
   Update test-case, fix nil-pointer bug, and improve error message

97117fa873
   Modify the kubeadm upgrade DAG for the TLS Upgrade

   - Calculate `beforePodHashMap` before the etcd upgrade in anticipation of
   KubeAPIServer downtime
   - Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch
   on the Etcd TLS Upgrade if TLS Upgrade:
      - Skip L7 Etcd check (could implement a waiter for this)
      - Skip data rollback on etcd upgrade failure due to lack of L7 check
    (APIServer is already down unable to serve new requests)
      - On APIServer upgrade failure, also rollback the etcd manifest to
    maintain protocol compatibility

   - Add logging

**Release note**:
```release-note
kubeadm upgrade no longer races leading to unexpected upgrade behavior on pod restarts
kubeadm upgrade now successfully upgrades etcd and the controlplane to use TLS
kubeadm upgrade now supports external etcd setups
kubeadm upgrade can now rollback and restore etcd after an upgrade failure
```
2018-04-24 13:28:13 -07:00
Kubernetes Submit Queue 44b57338d5
Merge pull request #59692 from mtaufen/dkcfg-unpack-configmaps
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

unpack dynamic kubelet config payloads to files

This PR unpacks the downloaded ConfigMap to a set of files on the node.

This enables other config files to ride alongside the
KubeletConfiguration, and the KubeletConfiguration to refer to these
cohabitants with relative paths.

This PR also stops storing dynamic config metadata (e.g. current,
last-known-good config records) in the same directory as config
checkpoints. Instead, it splits the storage into `meta` and
`checkpoints` dirs.

The current store dir structure is as follows:
```
- dir named by --dynamic-config-dir (root for managing dynamic config)
| - meta (dir for metadata, e.g. which config source is currently assigned, last-known-good)
  | - current (a serialized v1 NodeConfigSource object, indicating the assigned config)
  | - last-known-good (a serialized v1 NodeConfigSource object, indicating the last-known-good config)
| - checkpoints (dir for config checkpoints)
  | - uid1 (dir for unpacked config, identified by uid1)
    | - file1
    | - file2
    | - ...
  | - uid2
  | - ...
```

There are some likely changes to the above structure before dynamic config goes beta, such as renaming "current" to "assigned" for clarity, and extending the checkpoint identifier to include a resource version, as part of resolving #61643.

```release-note
NONE
```

/cc @luxas @smarterclayton
2018-04-24 12:01:37 -07:00
Marek Siarkowicz f0b5e2d7c5 Remove examples directory 2018-04-24 19:45:43 +01:00
Kubernetes Submit Queue c0d1ab8e99
Merge pull request #62083 from rramkumar1/ipvs-exclude-cidrs-flag
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add --ipvs-exclude-cidrs flag to kube-proxy. 

**What this PR does / why we need it**:
Add a flag to kube-proxy called --ipvs-exclude-cidrs. This flag allows a user to specify a list of CIDR ranges that should not be included in the cleanup of IPVS rules. 

Fixes: #59507

**Release note**:
```
Use --ipvs-exclude-cidrs to specify a list of CIDR's which the IPVS proxier should not touch when cleaning up IPVS rules.
```
/assign @m1093782566
2018-04-24 11:13:14 -07:00
Solly Ross a6c653d87f Fix hpa-use-rest-clients help text
The help text erroneously says "WARNING: alpha feature" when it
shouldn't have.  When we moved to beta, this should have been removed.
2018-04-24 13:29:08 -04:00
Kubernetes Submit Queue 817aac84b6
Merge pull request #63017 from liggitt/webhook_test
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make integration test etcd store unique

Ensures each test gets a distinct spot in etcd to avoid cross-test pollution

Fixes flakes seen in the etcd_storage_path_test


```release-note
NONE
```
2018-04-24 10:09:59 -07:00
Dan Williams 91321ef85b dockershim/sandbox: clean up pod network even if SetUpPod() failed
If the CNI network plugin completes successfully, but something fails
between that success and dockerhsim's sandbox setup code, plugin resources
may not be cleaned up. A non-trivial amount of code runs after the
plugin itself exits and the CNI driver's SetUpPod() returns, and any error
condition recognized by that code would cause this leakage.

The Kubernetes CRI RunPodSandbox() request does not attempt to clean
up on errors, since it cannot know how much (if any) networking
was actually set up. It depends on the CRI implementation to do
that cleanup for it.

In the dockershim case, a SetUpPod() failure means networkReady is
FALSE for the sandbox, and TearDownPod() will not be called later by
garbage collection even though networking was configured, because
dockershim can't know how far SetUpPod() got.

Concrete examples include if the sandbox's container is somehow
removed during during that time, or another OS error is encountered,
or the plugin returns a malformed result to the CNI driver.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1532965
2018-04-24 11:17:49 -05:00
Michael Taufen 23c21b055c Fix qosReserved json tag (lowercase qos, instead of uppercase QOS)
The API conventions specify that json keys should start with a lowercase
character, and if the key starts with an initialism, all characters in
the initialism should be lowercase. See `tlsCipherSuites` as an example.

API Conventions:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md

>All letters in the acronym should have the same case, using the
>appropriate case for the situation. For example, at the beginning
>of a field name, the acronym should be all lowercase, such as "httpGet".
2018-04-24 09:12:35 -07:00
Mike Danese 54fd2aaefd replace request.Context with context.Context 2018-04-24 08:59:00 -07:00
leigh schrandt dac4fe84bb [kubeadm] Fix Etcd Rollback
Fix `rollbackEtcdData()` to return error=nil on success
`rollbackEtcdData()` used to always return an error making the rest of the
upgrade code completely unreachable.

Ignore errors from `rollbackOldManifests()` during the rollback since it
always returns an error.
Success of the rollback is gated with etcd L7 healthchecks.

Remove logic implying the etcd manifest should be rolled back when
`upgradeComponent()` fails
2018-04-24 09:56:42 -06:00
Jason DeTiberus 4c768bb2ca [kubeadm] Add etcd L7 check on upgrade
- Adds L7 check for kubeadm etcd static pod upgrade
2018-04-24 09:56:35 -06:00
leigh schrandt 8129480d44 [kubeadm] Modify the kubeadm upgrade DAG for the TLS Upgrade
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch on the Etcd TLS Upgrade
if TLS Upgrade:
  - Skip L7 Etcd check (could implement a waiter for this)
  - Skip data rollback on etcd upgrade failure due to lack of L7 check (APIServer is already down unable to serve new requests)
  - On APIServer upgrade failure, also rollback the etcd manifest to maintain protocol compatibility

- Add logging
2018-04-24 09:55:56 -06:00
leigh schrandt 4a37e05665 [kubeadm] Update test-case, fix nil-pointer bug, and improve error message 2018-04-24 09:55:56 -06:00