Commit Graph

1413 Commits (a809749edc51467860d05b633c40c64f5aafcc61)

Author SHA1 Message Date
Brad Davidson d6c20b7452 Fix hosts.toml header var
Resolves issue from 270f85e468 that prevented old hosts.toml files from being cleaned up.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-09-10 14:59:41 -07:00
Arne Winter c4c11e51f1
add node-internal-dns/node-external-dns address pass-through support (#10852)
* add --node-internal-dns and --node-external-dns

Signed-off-by: Arne Winter <github@arnewinter.dev>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2024-09-06 14:15:19 -07:00
Brad Davidson 270f85e468 Only clean up containerd hosts dirs managed by k3s
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-09-05 17:21:55 -07:00
Harsimran Singh Maan 0b4d2497e5 Update coredns to 1.11.3 and metrics-server to 0.7.2
Used https://github.com/coredns/corefile-migration to
migrate the corefile. There are no changes for the
default file from 1.10.1 to 1.11.3.

Notable plugin changes include the k8s_external with fallthrough option
and rewrite with cname_target option.

These changes are not part of the default config that ships
with k3s. Customers using these two plugins can start using the new options.

Metrics-server does not have any new features other than build tooling updates.

Requires https://github.com/rancher/image-mirror/pull/704

Signed-off-by: Harsimran Singh Maan <maan.harry@gmail.com>
2024-08-29 15:00:45 -07:00
Brad Davidson bd45aa5c45 Bump traefik to v2.11.8
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-29 14:02:58 -07:00
Derek Nola 85e02e10d7
Remove secrets encryption controller (#10612)
* Remove secrets encryption controller

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-26 08:31:49 -07:00
Brad Davidson fe3324cb84 Fix rotateca validation failures when not touching default self-signed CAs
Also silences warnings about bootstrap fields that are not intended to be handled by CA rotation

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-22 14:47:40 -07:00
Derek Nola d358a89171 Fix secrets-encrypt metrics
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-22 14:23:34 -07:00
galal-hussein 5087240e32 Downgrade Microsoft/hcsshim to v0.8.26
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
galal-hussein 8cbcbcd044 go generate
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
galal-hussein 20b50426ab Update to v1.31.0
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
Alireza Eskandari 22fb7049bd Add tolerations support for DaemonSet pods
Signed-off-by: Alireza Eskandari <alireza.eskandari@wsd.com>
2024-08-12 13:01:27 -07:00
Brad Davidson bffdf463e1 Fix cloudprovider controller name
Looking at metrics revealed the cloudprovider controller name was an empty string.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 21:54:20 -07:00
Brad Davidson e168438d44 Wire lasso metrics up to common gatherer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 21:54:20 -07:00
Will Andrews e2179aa957 Update pkg/cluster/managed.go
Co-authored-by: Derek Nola <derek.nola@suse.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Will Andrews 3ec086f6f7 Update pkg/secretsencrypt/config.go
Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: Will Andrews <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Will e4f3cc7b54 remove deprecated use of wait functions
Signed-off-by: Will <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Brad Davidson e514940020 Fix inconsistent loading of config dropins when config file does not exist
FindString would silently skip parsing dropins if the main config file
didn't exist. If a custom config file path was passed it would raise an
error, but if we were parsing the default config file and it didn't
exist it would just silently fail to load the dropins.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 15:23:52 -07:00
Brad Davidson 9111b1f77e Add K3S_DATA_DIR as env var for --data-dir flag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 15:23:52 -07:00
Derek Nola 59e0761043
Use higher QPS for secrets reencryption (#10571)
* Use higher QPS for secrets reencryption

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-07-26 12:07:26 -07:00
Derek Nola a70157c12e
Allow Pprof and Supervisor metrics in standalone mode (#10576)
* Allow pprof to run on server with `--disable-agent`
* Allow supervisor metrics to run on server with `--disable-agent`

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-07-26 11:23:57 -07:00
Brad Davidson d4c3422a85 Fix ipv6 sysctl required by non-ipv6 LoadBalancer service
This is a partial revert of 095ecdb034,
with the workaround moved into klipper-lb.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 13:40:33 -07:00
Brad Davidson 21611c5665 Cap length of generated name used for servicelb daemonset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 13:40:33 -07:00
Brad Davidson 891e72f90f Update secretsencrypt pagination
Make secretsencrypt page size and iteration consistent with other paginators

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 12:44:29 -07:00
Brad Davidson c2216a62ad Use pagination when retrieving etcd snapshot list
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 12:44:29 -07:00
Brad Davidson 37830fe170 Don't use server and token values from config file for etcd-snapshot commands
Fixes an issue where running etcd-snapshot commands on a node that has a server address set in the config will manage snapshots on that server, instead of on the local node as intended.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:12:50 -07:00
Brad Davidson cb6bf74bc4 Add dial duration to debug error message
This should give us more detail on how long dials take before failing, so that we can perhaps better tune the retry loop in the future.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson 118acabec2 Fix IPv6 primary node-ip handling
I should have caught `[]string{cfg.NodeIP}[0]` and `[]string{envInfo.NodeIP.String()}[0]` in code review...

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson 9841517457 Fix agents removing configured supervisor address
We shouldn't be replacing the configured server address on agents. Doing
so breaks the agent's ability to fall back to the fixed registration
endpoint when all servers are down, since we replaced it with the first
discovered apiserver address. The fixed registration endpoint will be
restored as default when the service is restarted, but this is not the
correct behavior. This should have only been done on etcd-only nodes
that start up using their local supervisor, but need to switch to a
control-plane node as soon as one is available.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson 9d0c2e0000 Fix reentrant rlock in loadbalancer.dialContext
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson c36db53e54 Add etcd s3 config secret implementation
* Move snapshot structs and functions into pkg/etcd/snapshot
* Move s3 client code and functions into pkg/etcd/s3
* Refactor pkg/etcd to track snapshot and s3 moves
* Add support for reading s3 client config from secret
* Add minio client cache, since S3 client configuration can now be
  changed at runtime by modifying the secret, and we don't want to have to
  create a new minio client every time we read config.
* Add tests for pkg/etcd/s3

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-10 13:13:55 -07:00
Brad Davidson eb8bd15889 Ensure remotedialer kubelet connections use kubelet bind address
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-10 13:00:25 -07:00
github-actions[bot] a0b374508e
Bump Local Path Provisioner version (#10394)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-07-10 12:53:46 -07:00
Roberto Bonafiglia faeaf1b01b Update flannel to v0.25.4 and fixed issue with IPv6 mask
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2024-07-01 18:57:34 +02:00
Brad Davidson aa4794b372 Replace 1-weight semaphore on snapshots with simple mutex
Fixes an issue where the semaphore wasn't permanently initialized
until a scheduled snapshot was taken, allowing multiple on-demand
snapshots to be taken until the first scheduled snapshot was triggered.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-19 09:47:58 -07:00
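
The appeal of a plain mutex in aa4794b372 is that its zero value is usable, so there is no initialization step that can run too late. A minimal sketch follows, with a hypothetical `snapshotter` type. Whether the real code blocks or rejects a concurrent snapshot is not stated in the commit; this sketch rejects it.

```go
package main

import (
	"errors"
	"sync"
)

// errInProgress is returned when another snapshot is already running.
var errInProgress = errors.New("snapshot already in progress")

// snapshotter is a hypothetical stand-in for the type that owns snapshots.
type snapshotter struct {
	mu sync.Mutex // zero value is ready to use; no initialization step to miss
}

// snapshot runs take while holding the mutex, and fails fast instead of
// queueing if another snapshot is in progress. TryLock requires Go 1.18+.
func (s *snapshotter) snapshot(take func() error) error {
	if !s.mu.TryLock() {
		return errInProgress
	}
	defer s.mu.Unlock()
	return take()
}
```

Because the guard lives in the struct itself, on-demand and scheduled snapshots go through the same lock from the first call onward.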
Brad Davidson b4d4ed8f01 Fix agent supervisor port using apiserver port instead
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-13 15:13:21 -07:00
Harrison Affel f10cb29534 fix typo, use rancher/permissions
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-06-07 08:00:44 -07:00
Brad Davidson c0450a2cb4 Fix race condition panic in loadbalancer.nextServer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-07 07:39:48 -07:00
Vitor Savian d9b8ba8d71
Add snapshot retention etcd-s3-folder fix
* Add snapshot retention folder fix

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Add snapshot retention E2E test

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

---------

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-06-06 17:31:01 -03:00
fmoral2 043b1eac5d
Add test for `isValidResolvConf` (#10302)
Signed-off-by: Francisco <francisco.moral@suse.com>
2024-06-06 17:02:31 -03:00
Brad Davidson 1661f1024a Fix bug that caused agents to bypass local loadbalancer
If proxy.SetAPIServerPort was called multiple times, all calls after the
first one would cause the apiserver address to be set to the default
server address, bypassing the local load-balancer. This was most likely
to occur on RKE2, where the supervisor may be up for a period of time
before it is ready to manage node password secrets, causing the agent
to retry.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-04 11:18:45 -07:00
Koen de Laat 79ba10f5ec fix: Use actual warningPeriod in certmonitor
Signed-off-by: Koen de Laat <koen.de.laat@philips.com>
2024-06-03 11:20:15 -07:00
github-actions[bot] 1268779ea0
Bump Local Path Provisioner version (#10268)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli
2024-06-03 11:19:23 -07:00
Brad Davidson f9130d537d Fix embedded mirror blocked by SAR RBAC and re-enable test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 08:33:18 -07:00
Katherine Door 7a0ea3c953
Add write-kubeconfig-group flag to server (#9233)
* Add write-kubeconfig-group flag to server
* update kubectl unable to read config message for kubeconfig mode/group

Signed-off-by: Katherine Pata <me@kitty.sh>
2024-05-30 23:45:34 -07:00
Brad Davidson 307f07bd61 Fix issue caused by sole server marked as failed under load
If health checks are failing for all servers, make a second pass through the server list with health-checks ignored before returning failure

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-30 11:47:23 -07:00
Brad Davidson ed23a2bb48 Fix netpol crash when node remains tainted uninitialized
It is conceivable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 23:34:44 -07:00
Brad Davidson f8e0648304 Convert remaining http handlers over to use util.SendError
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 16:24:57 -07:00
Brad Davidson ff679fb3ab Refactor supervisor listener startup and add metrics
* Refactor agent supervisor listener startup and authn/authz to use upstream
  auth delegators to perform SubjectAccessReview for access to
  metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
  address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
  to the pprof endpoint now requires client cert auth, similar to the
  spegel registry api endpoint.
* Add prometheus metrics handler.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 16:24:57 -07:00
Brad Davidson 3d14092f76 Fix issue with k3s-etcd informers not starting
Start shared informer caches when k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 15:48:15 -07:00