This is less effort than passing the tag across steps 🤷‍♂️
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 378edb939d)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Allow pprof to run on server with `--disable-agent`
* Allow supervisor metrics to run on server with `--disable-agent`
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Use pagination when retrieving etcd snapshot list
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c2216a62ad)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Update secretsencrypt pagination
Make secretsencrypt page size and iteration consistent with other paginators
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 891e72f90f)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Cap length of generated name used for servicelb daemonset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 21611c5665)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Fix ipv6 sysctl required by non-ipv6 LoadBalancer service
This is a partial revert of 095ecdb034,
with the workaround moved into klipper-lb.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit d4c3422a85)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Remove deprecated use of wait functions
Signed-off-by: Will <will7989@hotmail.com>
(cherry picked from commit e4f3cc7b54)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Update pkg/secretsencrypt/config.go
Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: Will Andrews <will7989@hotmail.com>
(cherry picked from commit 3ec086f6f7)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Update pkg/cluster/managed.go
Co-authored-by: Derek Nola <derek.nola@suse.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
(cherry picked from commit e2179aa957)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Wire lasso metrics up to common gatherer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit e168438d44)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Fix cloudprovider controller name
Looking at metrics revealed the cloudprovider controller name was an empty string.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit bffdf463e1)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
---------
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: Will <will7989@hotmail.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
Co-authored-by: Will <will7989@hotmail.com>
Co-authored-by: Derek Nola <derek.nola@suse.com>
FindString would silently skip parsing dropins if the main config file
didn't exist. If a custom config file path was passed it would raise an
error, but if we were parsing the default config file and it didn't
exist it would just silently fail to load the dropins.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This was only used to pass the bundled strongswan path through to the flannel ipsec backend, and is no longer needed. Ref: #719
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move test-compat to GHA (#10414)
Signed-off-by: Derek Nola <derek.nola@suse.com>
* For E2E upgrade test, automatically determine the channel to use (#10461)
Signed-off-by: Derek Nola <derek.nola@suse.com>
Fixes an issue where running etcd-snapshot commands on a node that has a server address set in the config will manage snapshots on that server, instead of on the local node as intended.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This should give us more detail on how long dials take before failing, so that we can perhaps better tune the retry loop in the future.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
I should have caught `[]string{cfg.NodeIP}[0]` and `[]string{envInfo.NodeIP.String()}[0]` in code review...
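For reference, a minimal sketch of why that pattern is redundant: building a one-element slice literal and immediately indexing element 0 yields the same value as using the field directly. The `config` type and both helper functions are hypothetical stand-ins, not the actual code that was changed.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the real config struct.
type config struct {
	NodeIP string
}

// viaSlice mirrors the redundant pattern: wrap the value in a
// one-element slice literal, then index the only element.
func viaSlice(cfg config) string {
	return []string{cfg.NodeIP}[0]
}

// direct is the simplified equivalent.
func direct(cfg config) string {
	return cfg.NodeIP
}

func main() {
	cfg := config{NodeIP: "10.0.0.5"}
	fmt.Println(viaSlice(cfg) == direct(cfg)) // prints "true"
}
```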
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
We shouldn't be replacing the configured server address on agents. Doing
so breaks the agent's ability to fall back to the fixed registration
endpoint when all servers are down, since we replaced it with the first
discovered apiserver address. The fixed registration endpoint will be
restored as default when the service is restarted, but this is not the
correct behavior. This should have only been done on etcd-only nodes
that start up using their local supervisor, but need to switch to a
control-plane node as soon as one is available.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move snapshot structs and functions into pkg/etcd/snapshot
* Move s3 client code and functions into pkg/etcd/s3
* Refactor pkg/etcd to track snapshot and s3 moves
* Add support for reading s3 client config from secret
* Add minio client cache, since S3 client configuration can now be
changed at runtime by modifying the secret, and we don't want to have to
create a new minio client every time we read config.
* Add tests for pkg/etcd/s3
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c36db53e54)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* chore: Bump Local Path Provisioner version
Made with ❤️️ by updatecli
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit a0b374508e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Also remove the wg-add script that has been unused since v1.26 dropped the legacy wireguard backend
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 047664b610)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes an issue where the semaphore wasn't permanently initialized
until a scheduled snapshot was taken, allowing multiple on-demand
snapshots to be taken until the first scheduled snapshot was triggered.
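A minimal sketch of the intended behavior, assuming a single-permit semaphore built from a buffered channel and created once at construction time. The `snapshotter` type, `newSnapshotter`, and `errInProgress` are hypothetical names for illustration and do not reflect the actual pkg/etcd code.

```go
package main

import (
	"errors"
	"fmt"
)

// errInProgress is a hypothetical error for a rejected concurrent snapshot.
var errInProgress = errors.New("snapshot already in progress")

// snapshotter is a hypothetical type; slots acts as a semaphore with one permit.
type snapshotter struct {
	slots chan struct{}
}

// newSnapshotter creates the semaphore up front, so scheduled and
// on-demand snapshots contend for the same permit from the start.
func newSnapshotter() *snapshotter {
	return &snapshotter{slots: make(chan struct{}, 1)}
}

// snapshot runs take while holding the permit, and fails fast if
// another snapshot currently holds it.
func (s *snapshotter) snapshot(take func() error) error {
	select {
	case s.slots <- struct{}{}:
		defer func() { <-s.slots }()
		return take()
	default:
		return errInProgress
	}
}

func main() {
	s := newSnapshotter()
	err := s.snapshot(func() error {
		// A second snapshot attempted while the first holds the permit is rejected.
		return s.snapshot(func() error { return nil })
	})
	fmt.Println(err) // prints "snapshot already in progress"
}
```

Because the channel is allocated in the constructor rather than lazily on the first scheduled run, there is no window in which on-demand callers see an uninitialized semaphore.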
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If proxy.SetAPIServerPort was called multiple times, all calls after the
first one would cause the apiserver address to be set to the default
server address, bypassing the local load-balancer. This was most likely
to occur on RKE2, where the supervisor may be up for a period of time
before it is ready to manage node password secrets, causing the agent
to retry.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 1661f1024a)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* chore: Bump Local Path Provisioner version
Made with ❤️️ by updatecli
(cherry picked from commit 1268779ea0)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Add write-kubeconfig-group flag to server
* Update kubectl "unable to read config" message for kubeconfig mode/group
Signed-off-by: Katherine Pata <me@kitty.sh>
(cherry picked from commit 7a0ea3c953)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If health checks are failing for all servers, make a second pass through the server list with health-checks ignored before returning failure
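A rough sketch of the two-pass idea, under stated assumptions: the `server` type, `dialFunc`, `errRefused`, and `pick` are hypothetical and are not the load-balancer's real API. This version falls through to the second pass whenever the first pass yields no successful dial, which covers the case where every server is marked unhealthy.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused is a hypothetical dial error used for illustration.
var errRefused = errors.New("connection refused")

// server is a hypothetical load-balancer backend entry.
type server struct {
	address string
	healthy bool
}

// dialFunc is a hypothetical dialer; it returns nil on a successful connection.
type dialFunc func(address string) error

// pick tries servers that pass their health check first. If no dial
// succeeds, it makes a second pass that ignores health-check state
// before returning failure.
func pick(servers []server, dial dialFunc) (string, error) {
	for _, ignoreHealth := range []bool{false, true} {
		for _, s := range servers {
			if !ignoreHealth && !s.healthy {
				continue
			}
			if err := dial(s.address); err == nil {
				return s.address, nil
			}
		}
	}
	return "", errors.New("all servers failed")
}

func main() {
	// Both servers are failing health checks, but the second one accepts connections.
	servers := []server{
		{address: "10.0.0.1:6443", healthy: false},
		{address: "10.0.0.2:6443", healthy: false},
	}
	dial := func(address string) error {
		if address == "10.0.0.2:6443" {
			return nil
		}
		return errRefused
	}
	addr, err := pick(servers, dial)
	fmt.Println(addr, err)
}
```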
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ca39614d4e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
It is conceivable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.
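As an illustration only, a wait loop with no deadline that logs what it is waiting on might look like the following. `waitFor`, its parameters, and the log text are hypothetical; the real change uses the project's own wait helpers.

```go
package main

import (
	"log"
	"time"
)

// waitFor polls ready at the given interval with no deadline. After each
// failed check it logs what is being waited on, so an operator can see
// why startup has not progressed.
func waitFor(what string, interval time.Duration, ready func() bool) {
	for !ready() {
		log.Printf("Waiting for %s", what)
		time.Sleep(interval)
	}
}

func main() {
	checks := 0
	// Hypothetical readiness check that succeeds on the third poll.
	ready := func() bool {
		checks++
		return checks >= 3
	}
	waitFor("cloud-provider to initialize", 10*time.Millisecond, ready)
	log.Printf("ready after %d checks", checks)
}
```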
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ed23a2bb48)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>