github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Brad Davidson	a39e191906	Add tests for ETCD.Test() Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-30 12:18:48 -07:00
Brad Davidson	095e34d816	Fix issues with defragment and alarm clear on etcd startup * Use clientv3.NewCtxClient instead of New to avoid automatic retry of all RPCs * Only timeout status requests; allow defrag and alarm clear requests to run to completion. * Only clear alarms on the local cluster member, not ALL cluster members Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-30 12:18:48 -07:00
Brad Davidson	0826ebc142	Fix race condition when multiple nodes reconcile S3 snapshots Don't delete s3 etcdsnapshotfiles if they are missing from s3 but less than a minute old; it's possible the other node just finished uploading it and the object key has not yet become visible. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-07 11:11:58 -07:00
Brad Davidson	0942e6a0c5	Fix sqlite endpoint when migrating from sqlite to etcd Support for 'sqlite' as the endpoint was removed in https://github.com/k3s-io/kine/pull/320 and the constant removed in https://github.com/k3s-io/kine/pull/325 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-10-03 10:54:03 -07:00
Will	e4f3cc7b54	remove deprecated use of wait functions Signed-off-by: Will <will7989@hotmail.com>	2024-07-29 16:23:17 -07:00
Brad Davidson	c2216a62ad	Use pagination when retrieving etcd snapshot list Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-07-24 12:44:29 -07:00
Brad Davidson	c36db53e54	Add etcd s3 config secret implementation * Move snapshot structs and functions into pkg/etcd/snapshot * Move s3 client code and functions into pkg/etcd/s3 * Refactor pkg/etcd to track snapshot and s3 moves * Add support for reading s3 client config from secret * Add minio client cache, since S3 client configuration can now be changed at runtime by modifying the secret, and don't want to have to create a new minio client every time we read config. * Add tests for pkg/etcd/s3 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-07-10 13:13:55 -07:00
Brad Davidson	aa4794b372	Replace 1-weight semaphore on snapshots with simple mutex Fixes an issue where the semaphore wasn't permanently initialized until a scheduled snapshot was taken, allowing multiple on-demand snapshots to be taken until the first scheduled snapshot was triggered. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-06-19 09:47:58 -07:00
Vitor Savian	d9b8ba8d71	Add snapshot retention etcd-s3-folder fix * Add snapshot retention folder fix Signed-off-by: Vitor Savian <vitor.savian@suse.com> * Add snapshot retention E2E test Signed-off-by: Vitor Savian <vitor.savian@suse.com> --------- Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-06-06 17:31:01 -03:00
Brad Davidson	307f07bd61	Fix issue caused by sole server marked as failed under load If health checks are failing for all servers, make a second pass through the server list with health-checks ignored before returning failure Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-30 11:47:23 -07:00
Brad Davidson	ed23a2bb48	Fix netpol crash when node remains tainted uninitialized It is conceivable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-28 23:34:44 -07:00
Brad Davidson	f8e0648304	Convert remaining http handlers over to use util.SendError Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-28 16:24:57 -07:00
Brad Davidson	3d14092f76	Fix issue with k3s-etcd informers not starting Start shared informer caches when k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-05-28 15:48:15 -07:00
Hussein Galal	144f5ad333	Kubernetes V1.30.0-k3s1 (#10063) * kubernetes 1.30.0-k3s1 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Update go version to v1.22.2 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update dynamiclistener and helm-controller Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update go in go.mod to 1.22.2 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update go in Dockerfiles Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update cri-dockerd Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Add proctitle package with linux and windows constraints Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * go mod tidy Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Fixing setproctitle function Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * update dynamiclistener to v0.6.0-rc1 Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> --------- Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>	2024-05-06 19:42:27 +03:00
Brad Davidson	94e29e2ef5	Make /db/info available anonymously from localhost Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-22 19:34:43 -07:00
Brad Davidson	5b431ca531	Fix on-demand snapshots not honoring folder Also fix etcd s3 tests to actually check that the files are saved to s3 🙃 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-19 23:26:51 -07:00
Brad Davidson	7d9abc9f07	Improve etcd load-balancer startup behavior Prefer the address of the etcd member being joined, and seed the full address list immediately on startup. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-09 15:36:33 -07:00
Brad Davidson	fe465cc832	Move etcd snapshot management CLI to request/response Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-04-09 15:21:26 -07:00
Derek Nola	14f54d0b26	Transition from deprecated pointer library to ptr (#9801) Signed-off-by: Derek Nola <derek.nola@suse.com>	2024-03-28 10:07:02 -07:00
Vitor Savian	5d69d6e782	Add tls for kine Signed-off-by: Vitor Savian <vitor.savian@suse.com> Bump kine Signed-off-by: Vitor Savian <vitor.savian@suse.com> Add integration tests for kine with tls Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-03-28 11:12:07 -03:00
Brad Davidson	c51d7bfbd1	Add health-check support to loadbalancer * Adds support for health-checking loadbalancer servers. If a health-check fails when dialing, all existing connections to the server will be closed. * Wires up a remotedialer tunnel connectivity check as the health check for supervisor/apiserver connections. * Wires up a simple ping request to the supervisor port as the health check for etcd connections. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-27 16:50:27 -07:00
Brad Davidson	edb0440017	Fix etcd snapshot reconcile for agentless nodes Disable cleanup of orphaned snapshots and patching of node annotations if running agentless Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-27 16:44:36 -07:00
Brad Davidson	d7cdbb7d4d	Send error response if member list cannot be retrieved Prevents joining nodes from being stuck with bad initial member list if there is a transient failure, or if they try to join themselves Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-26 15:17:15 -07:00
Brad Davidson	3576ed4327	Clean up snapshotDir create/exists logic Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-04 12:09:29 -08:00
Brad Davidson	82432a2df7	Fix issue with etcd node name missing hostname * Set ServerNodeName in snapshot CLI setup * Raise error if ServerNodeName ends up empty some other way * Fix status controller to use etcd node name annotation instead of prefix checking Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-03-01 13:52:53 -08:00
Derek Nola	fae41a8b2a	Rename AgentReady to ContainerRuntimeReady for better clarity Signed-off-by: Derek Nola <derek.nola@suse.com>	2024-02-21 12:21:19 -08:00
Brad Davidson	de825845b2	Bump kine and set NotifyInterval to what the apiserver expects Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-09 14:22:38 -08:00
Vitor Savian	f9ee66f4d8	Changed how lastHeartBeatTime works in the etcd condition Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-02-07 15:05:33 -03:00
Brad Davidson	8224a3a7f6	Fix ipv6 endpoint address selection for on-demand snapshots Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-06 18:02:36 -08:00
Brad Davidson	6ec1926f88	Add check for etcd-snapshot-dir and fix panic in Walk Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-06 17:47:33 -08:00
Brad Davidson	4005600d4e	Fix excessive retry on snapshot reconcile Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-02-06 17:46:24 -08:00
Vitor Savian	9a70021a9e	Error getting node in setEtcdStatusCondition Signed-off-by: Vitor Savian <vitor.savian@suse.com> Added retry and changed nodes for Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-01-11 22:06:36 -03:00
Vitor Savian	4a92ced8ee	Handle etcd status condition when cluster reset and disable etcd Signed-off-by: Vitor Savian <vitor.savian@suse.com> Set condition if node is unhealthy Signed-off-by: Vitor Savian <vitor.savian@suse.com>	2024-01-09 11:20:41 -03:00
Brad Davidson	319dca3e82	Fix nil map in full snapshot configmap reconcile If a full reconcile wins the race against sync of an individual snapshot resource, or someone intentionally deletes the configmap, the data map could be nil and cause a crash. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-01-04 16:49:58 -08:00
Brad Davidson	1e663622d2	Fix the OTHER log message that prints the wrong variable Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2024-01-04 15:23:39 -08:00
Brad Davidson	6d3a92a658	Print key instead of file path in snapshot metadata log message Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-21 14:03:27 -08:00
Brad Davidson	b23e70d519	Don't apply s3 retention if S3 client failed to initialize Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-21 14:03:27 -08:00
Brad Davidson	a92c4a0f17	Don't request metadata when listing objects While some implementations may support it, it appears that most don't, and some may in fact return an error if it is requested. We already stat the object to get the metadata anyway, so this was unnecessary if harmless on most implementations. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-21 14:03:27 -08:00
Brad Davidson	1e0a7044cf	Reorder snapshot configmap reconcile to reduce log spew during initial startup Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-17 10:09:01 -08:00
Vitor Savian	e53c189587	Handle nil pointer when runtime core is not ready in etcd Signed-off-by: Vitor <vitor.savian@suse.com>	2023-11-16 15:58:42 -08:00
Brad Davidson	2088218c5f	Fix issue with snapshot metadata configmap Omit snapshot list configmap entries for snapshots without extra metadata; reduce log level of warnings about missing s3 metadata files. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-15 14:25:28 -08:00
Vitor Savian	c5cd7b3d65	Added etcd status condition Signed-off-by: Vitor <vitor.savian@suse.com>	2023-11-13 06:39:24 -08:00
Brad Davidson	49411e7084	Don't try to read token hash and cluster id during cluster-reset These fields are only necessary when saving snapshots to S3, and will block restoration if attempted Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-27 15:06:29 -07:00
Brad Davidson	5b6b9685e9	Manually requeue configmap reconcile when no nodes have reconciled snapshots Silences error message from lasso - this is a normal startup condition when no snapshots exist so we shouldn't log nasty looking errors. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-18 15:09:25 -07:00
Brad Davidson	3db1d33282	Re-enable etcd endpoint auto-sync Removing this in `002e6c43ee` regressed control-plane-only nodes, as we rely on the etcd client to update its endpoint list internally so that we can use it to sync the load-balancer address list. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-18 08:33:03 -07:00
Brad Davidson	9597ea1183	Start etcd client before ensuring self removal Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-13 23:24:16 -07:00
Brad Davidson	d885162967	Add server token hash to CR and S3 This required pulling the token hash stuff out of the cluster package, into util. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	550ab36ab7	Switch to managing ETCDSnapshotFile resources Reconcile snapshot CRs instead of ConfigMap; manage ConfigMap downstream from CR list Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	5cd4f69bfa	Move snapshot delete into local/s3 functions Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	7464007037	Store extra metadata and cluster ID for snapshots Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00


182 Commits (55cda2200e0f3e670970b044871f9ea09134cff6)