Commit Graph

1393 Commits (ebc4e505ea02a9c08ad33bc9ae563b2a35c6bcde)

Author SHA1 Message Date
Alireza Eskandari d416975b02 Add tolerations support for DaemonSet pods
Signed-off-by: Alireza Eskandari <alireza.eskandari@wsd.com>
(cherry picked from commit 22fb7049bd)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-12 17:09:56 -07:00
Derek Nola 551616038d Allow Pprof and Superisor metrics in standalone mode (#10576)
* Allow pprof to run on server with `--disable-agent`
* Allow supervisor metrics to run on server with `--disable-agent`

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-06 08:51:20 -07:00
Derek Nola 1f238a0155 Use higher QPS for secrets reencryption (#10571)
* Use higher QPS for secrets reencryption

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-06 08:51:20 -07:00
Brad Davidson 429d00a93f
[release-1.29] Backports for 2024-08 release cycle (#10665)
* Use pagination when retrieving etcd snapshot list

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c2216a62ad)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Update secretsencrypt pagination

Make secretsencrypt page size and iteration consistent with other paginators

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 891e72f90f)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Cap length of generated name used for servicelb daemonset

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 21611c5665)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Fix ipv6 sysctl required by non-ipv6 LoadBalancer service

This is a partial revert of 095ecdb034,
with the workaround moved into klipper-lb.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit d4c3422a85)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* remove deprecated use of wait functions

Signed-off-by: Will <will7989@hotmail.com>
(cherry picked from commit e4f3cc7b54)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Update pkg/secretsencrypt/config.go

Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: Will Andrews <will7989@hotmail.com>
(cherry picked from commit 3ec086f6f7)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Update pkg/cluster/managed.go

Co-authored-by: Derek Nola <derek.nola@suse.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
(cherry picked from commit e2179aa957)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Wire lasso metrics up to common gatherer

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit e168438d44)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Fix cloudprovider controller name

Looking at metrics revealed the cloudprovider controller name was anempty string.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit bffdf463e1)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

---------

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: Will <will7989@hotmail.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
Co-authored-by: Will <will7989@hotmail.com>
Co-authored-by: Derek Nola <derek.nola@suse.com>
2024-08-05 09:35:07 -07:00
galal-hussein b59dfc404a Fixing setproctitle function
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
(cherry picked from commit bf6e874241)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-02 11:19:19 -07:00
Brad Davidson f246bbc390 Fix inconsistent loading of config dropins when config file does not exist
FindString would silently skip parsing dropins if the main config file
didn't exist. If a custom config file path was passed it would raise an
error, but if we were parsing the default config file and it didn't
exist it would just silently fail to load the dropins.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 17:00:06 -07:00
Brad Davidson 0293118796 Add K3S_DATA_DIR as env var for --data-dir flag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 17:00:06 -07:00
Brad Davidson 3f2e9e2cb9 Don't use server and token values from config file for etcd-snapshot commands
Fixes an issue where running etcd-snapshot commands on a node that has a server address set in the config will manage snapshots on that server, instead of on the local node as intended.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson 875c61d684 Add dial duration to debug error message
This should give us more detail on how long dials take before failing, so that we can perhaps better tune the retry loop in the future.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson 066ff3c10a Fix IPv6 primary node-ip handling
I should have caught `[]string{cfg.NodeIP}[0]` and `[]string{envInfo.NodeIP.String()}[0]` in code review...

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson a5099da4e2 Fix agents removing configured supervisor address
We shouldn't be replacing the configured server address on agents. Doing
so breaks the agent's ability to fall back to the fixed registration
endpoint when all servers are down, since we replaced it with the first
discovered apiserver address. The fixed registration endpoint will be
restored as default when the service is restarted, but this is not the
correct behavior. This should have only been done on etcd-only nodes
that start up using their local supervisor, but need to switch to a
control-plane node as soon as one is available.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson ea59d1f42c Fix reentrant rlock in loadbalancer.dialContext
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson 389ebc7a5b Add etcd s3 config secret implementation
* Move snapshot structs and functions into pkg/etcd/snapshot
* Move s3 client code and functions into pkg/etcd/s3
* Refactor pkg/etcd to track snapshot and s3 moves
* Add support for reading s3 client config from secret
* Add minio client cache, since S3 client configuration can now be
  changed at runtime by modifying the secret, and don't want to have to
  create a new minio client every time we read config.
* Add tests for pkg/etcd/s3

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c36db53e54)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Brad Davidson 92f8dc0a15 Ensure remotedialer kubelet connections use kubelet bind address
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit eb8bd15889)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
github-actions[bot] 2bdaaed7bb Bump Local Path Provisioner version (#10394)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit a0b374508e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:14:33 -07:00
Roberto Bonafiglia b4b156d9d1 Update flannel to v0.25.4 and fixed issue with IPv6 mask
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2024-07-01 18:58:20 +02:00
Brad Davidson 83ae095ab9 Replace 1-weight semaphore on snapshots with simple mutex
Fixes an issue where the semaphore wasn't permanently initialized
until a scheduled snapshot was taken, allowing multiple on-demand
snapshots to be taken until the first scheduled snapshot was triggered.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-19 09:48:09 -07:00
Brad Davidson 4a5f69fae1 Fix agent supervisor port using apiserver port instead
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-13 15:13:34 -07:00
Harrison Affel 125f5bf501 fix typo, use rancher/permissions
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-06-07 08:31:26 -07:00
Brad Davidson 3ef137a8d2 Fix race condition panic in loadbalancer.nextServer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-07 07:40:10 -07:00
fmoral2 12864fb665
Add test for `isValidResolvConf` (#10302)
Signed-off-by: Francisco <francisco.moral@suse.com>
2024-06-07 11:07:27 -03:00
Vitor Savian 013ec43b02 Add snapshot retention etcd-s3-folder fix
* Add snapshot retention folder fix

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Add snapshot retention E2E test

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

---------

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-06-06 20:17:39 -03:00
Brad Davidson 485eaf31b4 Fix bug that caused agents to bypass local loadbalancer
If proxy.SetAPIServerPort was called multiple times, all calls after the
first one would cause the apiserver address to be set to the default
server address, bypassing the local load-balancer. This was most likely
to occur on RKE2, where the supervisor may be up for a period of time
before it is ready to manage node password secrets, causing the agent
to retry.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 1661f1024a)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-04 12:48:16 -07:00
Koen de Laat 9fedcc5220 fix: Use actual warningPeriod in certmonitor
Signed-off-by: Koen de Laat <koen.de.laat@philips.com>
(cherry picked from commit 79ba10f5ec)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-04 12:48:16 -07:00
github-actions[bot] c5efab64d0 Bump Local Path Provisioner version (#10268)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

(cherry picked from commit 1268779ea0)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-04 12:48:16 -07:00
Katherine Door da2625d1a9 Add write-kubeconfig-group flag to server (#9233)
* Add write-kubeconfig-group flag to server
* update kubectl unable to read config message for kubeconfig mode/group

Signed-off-by: Katherine Pata <me@kitty.sh>
(cherry picked from commit 7a0ea3c953)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 2c50f4aa5b Fix embedded mirror blocked by SAR RBAC and re-enable test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 8262c02cdd Fix issue caused by sole server marked as failed under load
If health checks are failing for all servers, make a second pass through the server list with health-checks ignored before returning failure

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ca39614d4e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 2e7b394713 Fix netpol crash when node remains tained unintialized
It is concievable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ed23a2bb48)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 0a728b8ff9 Convert remaining http handlers over to use util.SendError
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit f8e0648304)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 7ef30a2c60 Refactor supervisor listener startup and add metrics
* Refactor agent supervisor listener startup and authn/authz to use upstream
  auth delegators to perform for SubjectAccessReview for access to
  metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
  address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
  to the pprof endpoint now requires client cert auth, similar to the
  spegel registry api endpoint.
* Add prometheus metrics handler.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ff679fb3ab)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
galal-hussein c9f3efbe11 Add proctitle package with linux and windows constraints
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
(cherry picked from commit 48ff3bcddb)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 2b63eb4a27 Fix issue with k3s-etcd informers not starting
Start shared informer caches when k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 3d14092f76)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
huangzy d40fc0878f allow helm controller set owner reference
Signed-off-by: huangzy <huangzynn@outlook.com>
(cherry picked from commit 6fcaad553d)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Robert Rose 8a7b0b75fe Follow directory symlinks in auto deploying manifests (#9288)
Signed-off-by: Robert Rose <robert.rose@mailbox.org>
(cherry picked from commit 6886c0977f)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
linxin d386eaf904 Validate resolv.conf for presence of nameserver entries
Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: linxin <linxin@geedgenetworks.com>
(cherry picked from commit f24ba9d3a9)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson d1b3a02af2 Add support for svclb pod PriorityClassName
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 37f97b33c9)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 6452a5ea1b Update local-path-provisioner helper script
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit b453630478)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson 2f3d3aa05b Fix issue with local traffic policy for single-stack services on dual-stack nodes.
Just enable IP forwarding for all address families regardless of service address families.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 095ecdb034)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson ef8bd94480 Bump spegel version
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 5cf4d75749)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson c7d8e98b37 Switch stargz over to cri registry config_path
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 30999f9a07)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson bfc17af8bb Use fixed stream server bind address for cri-dockerd
Will now use 127.0.0.1:10010, same as containerd's CRI

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7374010c0c)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Brad Davidson c4226adc8f Add WithSkipMissing to not fail import on missing blobs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 5f6b813cc8)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:16:55 -07:00
Thomas Ferrandiz 7ebc6903fa Use TrafficManager interface when calling flannel
Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2024-05-28 07:58:19 +00:00
Thomas Ferrandiz 1ec25d8f64 Bump flannel version to v0.25.2
Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2024-05-28 07:58:19 +00:00
Manuel Buil 86ad488227 Fix bug when using tailscale config by file
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-05-24 07:56:20 +02:00
Harrison Affel 1689846299 windows changes
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-05-16 14:55:16 -07:00
Brad Davidson 94e29e2ef5 Make /db/info available anonymously from localhost
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-22 19:34:43 -07:00
Brad Davidson d3b60543e7 Fix 10 second etcd-snapshot request timeout
The default clientaccess request timeout is too short. Wait longer by default, and add the s3 timeout if s3 is enabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-19 23:26:51 -07:00
Brad Davidson 5b431ca531 Fix on-demand snapshots not honoring folder
Also fix etcd s3 tests to actually check that the files are saved to s3 🙃

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-19 23:26:51 -07:00