Using the node external IP address for all CNI traffic is a breaking change from previous versions; we should make it an opt-in for distributed clusters instead of default behavior.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If CCM and ServiceLB are both disabled, don't run the cloud-controller-manager at all;
this should provide the same CLI flag behavior as previous releases, and not create
problems when users disable the CCM but still want ServiceLB.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Consolidate data dir flag
* Group cluster flags together
* Reorder and group agent flags
* Add additional info around vmodule flag
* Hide deprecated flags, and add warning about their removal
Signed-off-by: Derek Nola <derek.nola@suse.com>
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Increase the default snapshot timeout. The timeout is not currently
configurable from Rancher, and larger clusters are frequently seeing
uploads fail at 30 seconds.
* Enable compression for scheduled snapshots if enabled on the
command-line. The CLI flag was not being passed into the etcd config.
* Only set the S3 content-type to application/zip if the file is zipped.
* Don't run more than one snapshot at once, to prevent misconfigured
etcd snapshot cron schedules from stacking up.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.
Signed-off-by: Darren Shepherd <darren@acorn.io>
Allow the flannel backend to be specified as
backend=option=val,option2=val2 to select a given backend with extra options.
In particular this adds the following options to wireguard-native
backend:
* Mode - flannel wireguard tunnel mode
* PersistentKeepaliveInterval- wireguard persistent keepalive interval
Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
Also update all use of 'go get' => 'go install', update CI tooling for
1.18 compatibility, and gofmt everything so lint passes.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Bump etcd to v3.5.4-k3s1
* Fix issue with datastore corruption on cluster-reset
* Disable unnecessary components during cluster reset
Disable control-plane components and the tunnel setup during
cluster-reset, even when not doing a restore. This reduces the amount of
log clutter during cluster reset/restore, making any errors encountered
more obvious.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This requires a further set of gofmt -s improvements to the
code, but nothing major. golangci-lint 1.45.2 brings golang 1.18
support which might be needed in the future.
Signed-off-by: Dirk Müller <dirk@dmllr.de>
This change allows to define two cluster CIDRs for compatibility with
Kubernetes dual-stuck, with an assumption that two CIDRs are usually
IPv4 and IPv6.
It does that by levearaging changes in out kube-router fork, with the
following downstream release:
https://github.com/k3s-io/kube-router/releases/tag/v1.3.2%2Bk3s
Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Reuse the existing etcd library code to start up the temporary etcd
server for bootstrap reconcile. This allows us to do proper
health-checking of the datastore on startup, including handling of
alarms.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Add etcd extra args support for K3s
Signed-off-by: Chris Kim <oats87g@gmail.com>
* Add etcd custom argument integration test
Signed-off-by: Chris Kim <oats87g@gmail.com>
* go generate
Signed-off-by: Chris Kim <oats87g@gmail.com>
Using MAINPID breaks systemd's exit detection, as it stops watching the
original pid, but is unable to watch the new pid as it is not a child
of systemd itself. The best we can do is just notify when execing the child
process.
We also need to consolidate forking into a sigle place so that we don't
end up with multiple levels of child processes if both redirecting log
output and reaping child processes.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Make sure there are no duplicates in etcd member list
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix node names with hyphens
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use full server name for etcd node name
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Also honor node-ip when adding the node address to the SAN list, instead
of hardcoding the autodetected IP address.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Reset load balancer state during restoraion
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Reset load balancer state during restoraion
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* switch image names to the ones with the prefix mirrored
* bump rancher/mirrored-coredns-coredns to 1.8.4
Signed-off-by: Jiaqi Luo <6218999+jiaqiluo@users.noreply.github.com>
Sync DisableKubeProxy from cfg into control before sending control to clients,
as it may have been modified by a startup hook.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Commit of new etcd snapshot integration tests.
* Updated integration github action to not run on doc changes.
* Update Drone runner to only run unit tests
Signed-off-by: dereknola <derek.nola@suse.com>
the server's built-in helm controller.
Problem:
Testing installation and uninstallation of the Helm Controller on k3s is
not possible if the Helm Controller is baked into the k3s server.
Solution:
The Helm Controller can optionally be disabled, which will allow users
to manage its installation manually.
Signed-off-by: Joe Kralicky <joe.kralicky@suse.com>
* Move registries.yaml handling out to rancher/wharfie
* Add system-default-registry support
* Add CLI support for kubelet image credential providers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Always use static ports for the load-balancers
This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.
This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.
Additional fixes below were required to successfully test and use this change
on etcd-only nodes.
* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
* Add functionality for etcd snapshot/restore to and from S3 compatible backends.
* Update etcd restore functionality to extract and write certificates and configs from snapshot.
Adds support for retagging images to appear to have been sourced from
one or more additional registries as they are imported from the tarball.
This is intended to support RKE2 use cases with system-default-registry
where the images need to appear to have been pulled from a registry
other than docker.io.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
The --disable/--no-deploy flags actually turn off some built-in
controllers, in addition to preventing manifests from getting loaded.
Make it clear which controllers can still be disabled even when the
packaged components are ommited by the no_stage build tag.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This attempts to update logging statements to make them consistent
through out the code base. It also adds additional context to messages
where possible, simplifies messages, and updates level where necessary.