github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Brad Davidson	095e34d816	Fix issues with defragment and alarm clear on etcd startup * Use clientv3.NewCtxClient instead of New to avoid automatic retry of all RPCs * Only timeout status requests; allow defrag and alarm clear requests to run to completion. * Only clear alarms on the local cluster member, not ALL cluster members Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-30 12:18:48 -07:00
Brad Davidson	0942e6a0c5	Fix sqlite endpoint when migrating from sqlite to etcd Support for 'sqlite' as the endpoint was removed in https://github.com/k3s-io/kine/pull/320 and the constant removed in https://github.com/k3s-io/kine/pull/325 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-03 10:54:03 -07:00
Will	e4f3cc7b54	remove deprecated use of wait functions Signed-off-by: Will <will7989@hotmail.com>	2024-07-29 16:23:17 -07:00
Brad Davidson	c36db53e54	Add etcd s3 config secret implementation * Move snapshot structs and functions into pkg/etcd/snapshot * Move s3 client code and functions into pkg/etcd/s3 * Refactor pkg/etcd to track snapshot and s3 moves * Add support for reading s3 client config from secret * Add minio client cache, since S3 client configuration can now be changed at runtime by modifying the secret, and don't want to have to create a new minio client every time we read config. * Add tests for pkg/etcd/s3 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-07-10 13:13:55 -07:00
Brad Davidson	aa4794b372	Replace 1-weight semaphore on snapshots with simple mutex Fixes an issue where the semaphore wasn't permanently initialized until a scheduled snapshot was taken, allowing multiple on-demand snapshots to be taken until the first scheduled snapshot was triggered. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-06-19 09:47:58 -07:00
Brad Davidson	f8e0648304	Convert remaining http handlers over to use util.SendError Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-28 16:24:57 -07:00
Brad Davidson	3d14092f76	Fix issue with k3s-etcd informers not starting Start shared informer caches when k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-28 15:48:15 -07:00
Hussein Galal	144f5ad333	Kubernetes V1.30.0-k3s1 (#10063) * kubernetes 1.30.0-k3s1 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Update go version to v1.22.2 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update dynamiclistener and helm-controller Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update go in go.mod to 1.22.2 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update go in Dockerfiles Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update cri-dockerd Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Add proctitle package with linux and windows constraints Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * go mod tidy Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Fixing setproctitle function Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update dynamiclistener to v0.6.0-rc1 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> --------- Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>	2024-05-06 19:42:27 +03:00
Brad Davidson	94e29e2ef5	Make /db/info available anonymously from localhost Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-22 19:34:43 -07:00
Brad Davidson	7d9abc9f07	Improve etcd load-balancer startup behavior Prefer the address of the etcd member being joined, and seed the full address list immediately on startup. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-09 15:36:33 -07:00
Brad Davidson	fe465cc832	Move etcd snapshot management CLI to request/response Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-09 15:21:26 -07:00
Vitor Savian	5d69d6e782	Add tls for kine Signed-off-by: Vitor Savian <vitor.savian@suse.com> Bump kine Signed-off-by: Vitor Savian <vitor.savian@suse.com> Add integration tests for kine with tls Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-03-28 11:12:07 -03:00
Brad Davidson	d7cdbb7d4d	Send error response if member list cannot be retrieved Prevents joining nodes from being stuck with bad initial member list if there is a transient failure, or if they try to join themselves Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-26 15:17:15 -07:00
Brad Davidson	82432a2df7	Fix issue with etcd node name missing hostname * Set ServerNodeName in snapshot CLI setup * Raise error if ServerNodeName ends up empty some other way * Fix status controller to use etcd node name annotation instead of prefix checking Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-01 13:52:53 -08:00
Derek Nola	fae41a8b2a	Rename AgentReady to ContainerRuntimeReady for better clarity Signed-off-by: Derek Nola <derek.nola@suse.com>	2024-02-21 12:21:19 -08:00
Brad Davidson	de825845b2	Bump kine and set NotifyInterval to what the apiserver expects Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-09 14:22:38 -08:00
Vitor Savian	f9ee66f4d8	Changed how lastHeartBeatTime works in the etcd condition Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-02-07 15:05:33 -03:00
Brad Davidson	8224a3a7f6	Fix ipv6 endpoint address selection for on-demand snapshots Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-06 18:02:36 -08:00
Vitor Savian	9a70021a9e	Error getting node in setEtcdStatusCondition Signed-off-by: Vitor Savian <vitor.savian@suse.com> Added retry and changed nodes for Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-01-11 22:06:36 -03:00
Vitor Savian	4a92ced8ee	Handle etcd status condition when cluster reset and disable etcd Signed-off-by: Vitor Savian <vitor.savian@suse.com> Set condition if node is unhealthy Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-01-09 11:20:41 -03:00
Vitor Savian	e53c189587	Handle nil pointer when runtime core is not ready in etcd Signed-off-by: Vitor <vitor.savian@suse.com>	2023-11-16 15:58:42 -08:00
Vitor Savian	c5cd7b3d65	Added etcd status condition Signed-off-by: Vitor <vitor.savian@suse.com>	2023-11-13 06:39:24 -08:00
Brad Davidson	49411e7084	Don't try to read token hash and cluster id during cluster-reset These fields are only necessary when saving snapshots to S3, and will block restoration if attempted Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-27 15:06:29 -07:00
Brad Davidson	3db1d33282	Re-enable etcd endpoint auto-sync Removing this in `002e6c43ee` regressed control-plane-only nodes, as we rely on the etcd client to update its endpoint list internally so that we can use it to sync the load-balancer address list. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-18 08:33:03 -07:00
Brad Davidson	9597ea1183	Start etcd client before ensuring self removal Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-13 23:24:16 -07:00
Brad Davidson	550ab36ab7	Switch to managing ETCDSnapshotFile resources Reconcile snapshot CRs instead of ConfigMap; manage ConfigMap downstream from CR list Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	676b00aa0e	Move etcd snapshot code into separate file Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	002e6c43ee	Reorganize Driver interface and etcd driver to avoid passing context and config into most calls Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-09-25 11:54:23 -07:00
Brad Davidson	890645924f	Don't export functions not needed outside the etcd package Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-09-25 11:54:23 -07:00
Brad Davidson	8c73fd670b	Disable HTTP on main etcd client port Fixes performance issue under load, ref: https://github.com/etcd-io/etcd/issues/15402 and https://github.com/kubernetes/kubernetes/pull/118460 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-09-25 08:29:57 -07:00
Vitor Savian	e83b1ba4aa	Fixed the etcd retention to delete orphaned snapshots based on the date (#8177) * Fix retention using name instead of date Signed-off-by: Vitor <vitor.savian@suse.com>	2023-08-14 18:48:59 -03:00
Ian Cardoso	e551308db8	fix for etcd-snapshot delete with --etcd-s3 flag (#8110) `k3s etcd-snapshot save --etcd-s3 ...` creates a local snapshot and uploads it to S3, while `k3s etcd-snapshot delete --etcd-s3 ...` was deleting the snapshot only from the S3 bucket. This commit changes the behavior of delete so that it removes the snapshot both locally and from S3. Signed-off-by: Ian Cardoso <osodracnai@gmail.com>	2023-08-04 14:26:32 -03:00
Vitor Savian	ca7aeed090	Etcd snapshots retention when node name changes (#8099) Fixed the etcd retention to delete orphaned snapshots Signed-off-by: Vitor <vitor.savian@suse.com>	2023-08-03 10:54:40 -03:00
Brad Davidson	aa76942d0f	Add FilterCN function to prevent SAN Stuffing Wire up a node watch to collect addresses of server nodes, to prevent adding unauthorized SANs to the dynamiclistener cert. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-08-02 11:15:39 -07:00
Brad Davidson	e61fde93c1	Fix MemberList error handling and incorrect etcd-arg passthrough Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-04-28 22:04:30 -07:00
Brad Davidson	91afb38799	Retry cluster join on "too many learners" error Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-04-28 11:28:33 -07:00
Brad Davidson	d95980bba3	Lock bootstrap data with empty key to prevent conflicts Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-04-05 10:56:57 -07:00
Brad Davidson	b010db0cff	Ensure that loopback is used for the advertised address when resetting Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-04-03 17:01:43 -07:00
Brad Davidson	0c302f4341	Fix etcd member deletion Turns out etcd-only nodes were never running any of the controllers, so allowing multiple controllers didn't really fix things. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-02-14 09:39:41 -08:00
Brad Davidson	3d146d2f1b	Allow for multiple sets of leader-elected controllers Addresses an issue where etcd controllers did not run on etcd-only nodes Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-02-10 10:46:48 -08:00
Brad Davidson	a298bfdb18	Add jitter to scheduled snapshots and retry harder on conflicts Also ensure that the snapshot job does not attempt to trigger multiple concurrent runs, as this is not supported. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-01-11 14:32:03 -08:00
Derek Nola	06d81cb936	Replace deprecated ioutil package (#6230) * Replace ioutil package * check integration test null pointer * Remove rotate retries Signed-off-by: Derek Nola <derek.nola@suse.com>	2022-10-07 17:36:57 -07:00
Derek Nola	4c0bc8c046	Update etcd error to match correct url (#5909) Signed-off-by: Derek Nola <derek.nola@suse.com>	2022-07-29 09:40:53 -07:00
Brad Davidson	5eaa0a9422	Replace getLocalhostIP with Loopback helper method Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be returned URL-safe. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-07-21 16:51:57 -07:00
Brad Davidson	1674b9d640	Raise etcd connection test timeout to 30 seconds Addresses issue where the compact may take more than 10 seconds on slower disks. These disks probably aren't really suitable for etcd, but apparently run fine otherwise. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-07-21 13:23:19 -07:00
Brad Davidson	ffe72eecc4	Address issues with etcd snapshots * Increase the default snapshot timeout. The timeout is not currently configurable from Rancher, and larger clusters are frequently seeing uploads fail at 30 seconds. * Enable compression for scheduled snapshots if enabled on the command-line. The CLI flag was not being passed into the etcd config. * Only set the S3 content-type to application/zip if the file is zipped. * Don't run more than one snapshot at once, to prevent misconfigured etcd snapshot cron schedules from stacking up. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-07-12 14:41:38 -07:00
Brad Davidson	6fad63583b	Only listen on loopback when resetting Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-06-15 11:25:54 -07:00
Brad Davidson	fb0a342a20	Sanitize filenames for use in configmap keys If the user points S3 backups at a bucket containing other files, those file names may not be valid configmap keys. For example, RKE1 generates backup files with names like `s3-c-zrjnb-rs-6hxpk_2022-05-05T12:05:15Z.zip`; the colons in the timestamp portion of the name are not allowed for use in configmap keys. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-06-15 10:54:26 -07:00
Brad Davidson	ce5b9347c9	Replace DefaultProxyDialerFn dialer injection with EgressSelector support Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-04-29 17:54:36 -07:00
Brad Davidson	418c3fa858	Fix issue with datastore corruption on cluster-reset (#5515) * Bump etcd to v3.5.4-k3s1 * Fix issue with datastore corruption on cluster-reset * Disable unnecessary components during cluster reset Disable control-plane components and the tunnel setup during cluster-reset, even when not doing a restore. This reduces the amount of log clutter during cluster reset/restore, making any errors encountered more obvious. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-04-27 13:44:15 -07:00
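
The fb0a342a20 entry above describes sanitizing S3 file names before using them as ConfigMap keys. Below is a minimal Go sketch of that idea, not the code from the commit. It assumes the Kubernetes rule that ConfigMap keys may contain only alphanumeric characters, '-', '_' and '.'. The helper name sanitizeKey and the choice of '_' as the replacement character are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidKeyChars matches every character outside the set assumed to be
// valid in a ConfigMap key: alphanumerics, '-', '_' and '.'.
var invalidKeyChars = regexp.MustCompile(`[^-._a-zA-Z0-9]`)

// sanitizeKey is a hypothetical helper that replaces each invalid
// character in a file name with an underscore.
func sanitizeKey(name string) string {
	return invalidKeyChars.ReplaceAllString(name, "_")
}

func main() {
	// The RKE1-style backup name quoted in the commit message.
	fmt.Println(sanitizeKey("s3-c-zrjnb-rs-6hxpk_2022-05-05T12:05:15Z.zip"))
}
```

One consequence of this kind of replacement is that two different file names can map to the same key, so a caller that needs the original name would have to store it separately.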


131 Commits (13e911378764cafb98030ebe80832739ae5ce87e)