Update k8s upgrade docs (#8789)

* Update k8s upgrade docs
pull/8805/head
Luke Kysow 2020-10-01 14:36:15 -07:00 committed by GitHub
parent d0c160130b
commit 6abc6a293c
1 changed files with 279 additions and 26 deletions


@ -7,28 +7,252 @@ description: Upgrade Consul on Kubernetes
# Upgrade Consul on Kubernetes
To upgrade Consul on Kubernetes, we follow the same pattern as
[generally upgrading Consul](/docs/upgrading), except we can use
the Helm chart to step through a rolling deploy. It is important to understand
how to [generally upgrade Consul](/docs/upgrading) before reading this
section.
## Upgrade Types
Upgrading Consul on Kubernetes will follow the same pattern: each server
will be updated one-by-one. After that is successful, the clients will
be updated in batches.
Consul on Kubernetes will need to be upgraded/updated if you change your Helm configuration,
if a new Helm chart is released, or if you wish to upgrade your Consul version.
### Helm Configuration Changes
If you make a change to your Helm values file, you will need to perform a `helm upgrade`
for those changes to take effect.
For example, if you've installed Consul with the following:
```yaml
global:
  name: consul
connectInject:
  enabled: false
```
And you wish to set `connectInject.enabled` to `true`:
```diff
global:
  name: consul
connectInject:
-  enabled: false
+  enabled: true
```
Perform the following steps:
1. Determine your current installed chart version.
```bash
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
```
In this example, version `0.24.0` (from `consul-0.24.0`) is being used.
1. Perform a `helm upgrade`:
```bash
helm upgrade consul hashicorp/consul --version 0.24.0 -f /path/to/my/values.yaml
```
**Before performing the upgrade, be sure you've read the other sections on this page,
continuing at [Determining What Will Change](#determining-what-will-change).**
~> NOTE: It's important to always set the `--version` flag, because otherwise Helm
will use the most up-to-date version in its local cache, which may result in an
unintended upgrade.
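As a minimal sketch of how the installed chart version could be read programmatically, the helper below parses `helm list` output from stdin. The function name `installed_chart_version` is hypothetical, and it assumes the release and chart are both named `consul`, as in the example output above.

```bash
# Hypothetical helper: print the chart version of the "consul" release.
# Reads `helm list` output on stdin. The CHART column is the second-to-last
# whitespace-separated field, e.g. "consul-0.24.0".
installed_chart_version() (
  awk 'NR > 1 && $1 == "consul" {
    chart = $(NF - 1)
    sub(/^consul-/, "", chart)   # strip the chart name prefix
    print chart
  }'
)
```

It could be used as `helm list -f consul | installed_chart_version`, with the result passed to the `--version` flag so the chart version stays pinned.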
### Helm Chart Version Upgrade
You may wish to upgrade your Helm chart version to take advantage of new features,
or bug fixes, or because the Consul version you want to run requires a
certain Helm chart version.
1. Update your local Helm repository cache:
```bash
helm repo update
```
1. List all available versions:
```bash
helm search repo hashicorp/consul -l
NAME CHART VERSION APP VERSION DESCRIPTION
hashicorp/consul 0.24.1 1.8.2 Official HashiCorp Consul Chart
hashicorp/consul 0.24.0 1.8.1 Official HashiCorp Consul Chart
...
```
Here we can see that the latest version is `0.24.1`.
1. To determine which version you have installed, issue the following command:
```bash
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
```
In this example, version `0.24.0` (from `consul-0.24.0`) is being used.
If you want to upgrade to the latest `0.24.1` version, use the following procedure:
1. Check the changelog for any breaking changes from that version and any versions in between: https://github.com/hashicorp/consul-helm/blob/master/CHANGELOG.md.
1. Upgrade by performing a `helm upgrade` with the `--version` flag:
```bash
helm upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/my/values.yaml
```
**Before performing the upgrade, be sure you've read the other sections on this page,
continuing at [Determining What Will Change](#determining-what-will-change).**
### Consul Version Upgrade
If a new version of Consul is released, you will need to perform a Helm upgrade
to update to the new version.
1. Ensure you've read the [Upgrading Consul](/docs/upgrading) documentation.
1. Ensure you've read any [specific instructions](/docs/upgrading/upgrade-specific) for the version you're upgrading
to and the Consul [changelog](https://github.com/hashicorp/consul/blob/master/CHANGELOG.md) for that version.
1. Read our [Compatibility Matrix](/docs/k8s/upgrade/compatibility) to ensure
your current Helm chart version supports this Consul version. If it does not,
you may need to also upgrade your Helm chart version at the same time.
1. Set `global.image` in your `values.yaml` to the desired version:
```yaml
global:
  image: consul:1.8.3
```
1. Determine your current installed chart version:
```bash
helm list -f consul
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
consul default 2 2020-09-30 ... deployed consul-0.24.0 1.8.2
```
In this example, version `0.24.0` (from `consul-0.24.0`) is being used.
1. Perform a `helm upgrade`:
```bash
helm upgrade consul hashicorp/consul --version 0.24.0 -f /path/to/my/values.yaml
```
**Before performing the upgrade, be sure you've read the other sections on this page,
continuing at [Determining What Will Change](#determining-what-will-change).**
~> NOTE: It's important to always set the `--version` flag, because otherwise Helm
will use the most up-to-date version in its local cache, which may result in an
unintended upgrade.
## Determining What Will Change
Before upgrading, it's important to understand what changes will be made to your
cluster. For example, you will need to take more care if your upgrade will result
in the Consul server statefulset being redeployed.
There is no built-in functionality in Helm that shows what a helm upgrade will
change. There is, however, a Helm plugin [helm-diff](https://github.com/databus23/helm-diff)
that can be used.
1. Install `helm-diff` with:
```bash
helm plugin install https://github.com/databus23/helm-diff
```
1. If you are updating your `values.yaml` file, do so now.
1. Take the same `helm upgrade` command you were planning to issue but perform `helm diff upgrade` instead of `helm upgrade`:
```bash
helm diff upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/your/values.yaml
```
This will print out the manifests that will be updated and their diffs.
1. To see only the objects that will be updated, add `| grep "has changed"`:
```bash
helm diff upgrade consul hashicorp/consul --version 0.24.1 -f /path/to/your/values.yaml |
grep "has changed"
```
1. Take specific note if `consul, DaemonSet` or `consul-server, StatefulSet` are listed.
This means that your Consul client daemonset or Consul server statefulset (or both) will be redeployed.
If either is being redeployed, we will follow the same pattern for upgrades as
on other platforms: the servers will be redeployed one-by-one, and then the
clients will be redeployed in batches. Read [Upgrading Consul](/docs/upgrading) and then continue
reading below.
If neither the client daemonset nor the server statefulset is being redeployed,
then you can continue with the helm upgrade without any specific sequence to follow.
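The check for the client daemonset and server statefulset can be sketched as a small filter over `helm diff upgrade` output. The function name `redeployed_workloads` is hypothetical, and the exact layout of the `has changed` lines is an assumption based on the object names quoted above.

```bash
# Hypothetical filter: reads `helm diff upgrade` output on stdin and prints
# only the "has changed" lines for the Consul client daemonset or the Consul
# server statefulset. Exits non-zero when neither is listed.
redeployed_workloads() (
  grep "has changed" | grep -E "consul, DaemonSet|consul-server, StatefulSet"
)
```

Piping the `helm diff upgrade` command from the steps above into `redeployed_workloads` prints nothing when neither workload is affected, in which case no specific upgrade sequence is needed.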
## Service Mesh
If you are using Consul's service mesh features, as opposed to the [service sync](/docs/k8s/service-sync)
functionality, you must be aware of the behavior of the service mesh during upgrades.
Consul clients operate as a daemonset across all Kubernetes nodes. During an upgrade,
if the Consul client daemonset has changed, the client pods will need to be restarted
because their spec has changed.
When a Consul client pod is restarted, it will deregister itself from Consul when it stops.
When the pod restarts, it will re-register itself with Consul.
Thus, during the period between the Consul client on a node stopping and restarting,
the following will occur:
1. The node will be deregistered from Consul. It will not show up in the Consul UI
nor in API requests.
1. Because the node is deregistered, all service pods that were on that node will
also be deregistered. This means they will not receive service mesh traffic
until the Consul client pod restarts.
1. Service pods on that node can continue to make requests through the service
mesh because each Envoy proxy maintains a cache of the locations of upstream
services. However, if the upstream services change IPs, Envoy will not be able
to refresh its cluster information until its local Consul client is restarted.
Services can therefore continue to make requests without downtime for a short
period of time, but it's important for the local Consul client to be restarted
as quickly as possible.
Once the local Consul client pod restarts, each service pod needs to re-register
itself with the client. This is done automatically by the `consul-connect-lifecycle-sidecar`
sidecar container that is injected alongside each service.
Because service mesh pods are briefly deregistered during a Consul client restart,
it's **important that you do not restart all Consul clients at once**. Otherwise
you may experience downtime because no replicas of a specific service will be in the mesh.
In addition, it's **important that you have multiple replicas** for each service.
If you only have one replica, then during restart of the Consul client on the
node hosting that replica, it will be briefly deregistered from the mesh. Since
it's the only replica, other services will not be able to make calls to that
service. (NOTE: This can also be avoided by stopping that replica so it is rescheduled to
a node whose Consul client has already been updated.)
Given the above, we recommend that after Consul servers are upgraded, the Consul
client daemonset is set to use the `OnDelete` update strategy and Consul clients
are deleted one by one or in batches. See [Upgrading Consul Servers](#upgrading-consul-servers)
and [Upgrading Consul Clients](#upgrading-consul-clients) for more details.
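One way to spot services that would drop out of the mesh during a client restart is to look for deployments with fewer than two desired replicas. The sketch below is an assumption-laden illustration: the function name `single_replica_deployments` is hypothetical, it expects the default `kubectl get deployments` table (with a `READY` column such as `1/1`), and it does not check whether a deployment is actually part of the service mesh.

```bash
# Hypothetical helper: reads `kubectl get deployments` output on stdin and
# prints the names of deployments whose desired replica count is below 2.
single_replica_deployments() (
  awk 'NR > 1 {
    split($2, ready, "/")          # READY column looks like "ready/desired"
    if (ready[2] + 0 < 2) print $1
  }'
)
```

For example, `kubectl get deployments | single_replica_deployments` lists candidates to scale up, or to reschedule manually, before restarting Consul clients.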
## Upgrading Consul Servers
To initiate the upgrade:
1. Change the `global.image` value to the desired Consul version. For illustrative
purposes, the example below uses `consul:123.456`.
1. Set the `server.updatePartition` value _equal to the number of server replicas_.
By default there are 3 servers, so you would set this value to `3`.
1. Set the `updateStrategy` for clients to `OnDelete`.
```yaml
global:
  image: 'consul:123.456'
server:
  replicas: 3
  updatePartition: 3
client:
  updateStrategy: |
    type: OnDelete
```
The `updatePartition` value controls how many instances of the server
@ -36,34 +260,63 @@ cluster are updated. Only instances with an index _greater than or equal to_ the
`updatePartition` value are updated (zero-indexed). Therefore, by setting
it equal to replicas, none should update yet.
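To make the partition behavior concrete, the sketch below lists which server pods would be replaced for a given `updatePartition`. It follows Kubernetes StatefulSet semantics, where pods whose ordinal is at or above the partition are updated. The function name is hypothetical, and the `consul-server-N` pod names assume the `consul-server` statefulset name used elsewhere on this page.

```bash
# Hypothetical helper: print the server pods that a given updatePartition
# allows Kubernetes to update. Pod ordinals are zero-indexed.
pods_updated_for_partition() (
  replicas="$1"
  partition="$2"
  ordinal=0
  while [ "$ordinal" -lt "$replicas" ]; do
    # A pod is updated only when its ordinal is >= the partition.
    if [ "$ordinal" -ge "$partition" ]; then
      echo "consul-server-${ordinal}"
    fi
    ordinal=$((ordinal + 1))
  done
)
```

With three replicas, a partition of `3` selects no pods, a partition of `2` selects only the highest-ordinal pod, and a partition of `0` selects all three.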
The `updateStrategy` controls how Kubernetes rolls out changes to the client daemonset.
By setting it to `OnDelete`, no clients will be restarted until their pods are deleted.
Without this, they would be redeployed alongside the servers because their Docker
image versions have changed. This is not desirable because we want the Consul
servers to be upgraded _before_ the clients.
1. Next, perform the upgrade. You should run this with `--dry-run` first to verify
the changes that will be sent to the Kubernetes cluster:
```bash
helm upgrade consul hashicorp/consul --version <your-version> -f /path/to/your/values.yaml
```
This will not cause the servers to redeploy (although the resource will be updated). If
everything is stable, begin by decreasing the `updatePartition` value by one,
and performing `helm upgrade` again. This will cause the first Consul server
to be stopped and restarted with the new image.
1. Wait until the Consul server cluster is healthy again (30s to a few minutes).
This can be confirmed by issuing `consul members` on one of the previous servers,
and ensuring that all servers are listed and are `alive`.
Decrease `updatePartition` by one and upgrade again. Continue until
`updatePartition` is `0`. At this point, you may remove the
`updatePartition` configuration. Your server upgrade is complete.
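The server rollout above can be sketched as two small helpers. Both function names are hypothetical. The first only prints the sequence of `helm upgrade` commands (it does not run them) and, as an assumption, overrides `server.updatePartition` with Helm's `--set` flag instead of editing `values.yaml` each time. The second counts `alive` servers in `consul members` output read from stdin.

```bash
# Hypothetical helper: print one `helm upgrade` command per step, counting
# updatePartition down from the replica count (no servers updated) to 0.
print_server_upgrade_plan() (
  replicas="$1"
  chart_version="$2"
  values_file="$3"
  partition="$replicas"
  while [ "$partition" -ge 0 ]; do
    echo "helm upgrade consul hashicorp/consul --version ${chart_version} -f ${values_file} --set server.updatePartition=${partition}"
    partition=$((partition - 1))
  done
)

# Hypothetical helper: count members whose Status is "alive" and Type is
# "server" in `consul members` output (columns: Node Address Status Type ...).
alive_server_count() (
  awk '$3 == "alive" && $4 == "server" { n++ } END { print n + 0 }'
)
```

Between steps, the count reported by `alive_server_count` should equal the number of server replicas before the next command in the plan is run.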
## Upgrading Consul Clients
With the servers upgraded, it is time to upgrade the clients.
If you are using Consul's service mesh features, you will want to be careful
restarting the clients as outlined in [Service Mesh](#service-mesh).
You can either:
1. Manually issue `kubectl delete pod <id>` for each Consul daemonset pod.
2. Set the `updateStrategy` to `RollingUpdate` with a small `maxUnavailable` value:
```yaml
client:
  updateStrategy: |
    rollingUpdate:
      maxUnavailable: 2
    type: RollingUpdate
```
Then, run `helm upgrade`. This will upgrade the clients in batches, waiting
until the clients come up healthy before continuing.
3. Cordon and drain each node to ensure there are no connect pods active on it, and then delete the
consul client pod on that node.
-> NOTE: If you are using only the Service Sync functionality, you can perform an upgrade without
following a specific sequence since that component is more resilient to brief restarts of
Consul clients.
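If you delete client pods manually, grouping them into small batches keeps most clients running at any time. The helper below is a minimal sketch with a hypothetical name: it reads pod names from stdin and prints them a few per line, so each line can be handled as one batch.

```bash
# Hypothetical helper: group pod names read from stdin into batches of the
# given size, printing one batch per line.
batch_pod_names() (
  batch_size="$1"
  # With no command given, xargs echoes its arguments, batch_size at a time.
  xargs -n "$batch_size"
)
```

Each printed line can then be passed to `kubectl delete pod`, waiting for the replacement client pods to become ready before moving on to the next batch.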
## Configuring TLS on an Existing Cluster
If you already have a Consul cluster deployed on Kubernetes and
would like to turn on TLS for internal Consul communication,
please see
[Configuring TLS on an Existing Cluster](/docs/k8s/operations/tls-on-existing-cluster).
[Configuring TLS on an Existing Cluster](/docs/k8s/tls-on-existing-cluster).