github/k3s - k3s - https://git.xinac.net

Commit Graph

Author	SHA1	Message	Date
Brad Davidson	7ecd5874d2	Skip initial datastore reconcile during cluster-reset Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-11-15 14:31:44 -08:00
Brad Davidson	d885162967	Add server token hash to CR and S3 This required pulling the token hash stuff out of the cluster package, into util. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	7464007037	Store extra metadata and cluster ID for snapshots Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-10-12 15:04:45 -07:00
Brad Davidson	002e6c43ee	Reorganize Driver interface and etcd driver to avoid passing context and config into most calls Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-09-25 11:54:23 -07:00
Brad Davidson	d95980bba3	Lock bootstrap data with empty key to prevent conflicts Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-04-05 10:56:57 -07:00
Brad Davidson	992e64993d	Add support for kubeadm token and client certificate auth Allow bootstrapping with kubeadm bootstrap token strings or existing Kubelet certs. This allows agents to join the cluster using kubeadm bootstrap tokens, as created with the `k3s token create` command. When the token expires or is deleted, agents can successfully restart by authenticating with their kubelet certificate via node authentication. If the token is gone and the node is deleted from the cluster, node auth will fail and they will be prevented from rejoining the cluster until provided with a valid token. Servers still must be bootstrapped with the static cluster token, as they will need to know it to decrypt the bootstrap data. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2023-02-07 14:55:04 -08:00
Derek Nola	13c633da12	Add Secrets Encryption to CriticalArgs (#6409) * Add EncryptSecrets to Critical Control Args * use deep comparison to extract differences Signed-off-by: Derek Nola <derek.nola@suse.com> Signed-off-by: Derek Nola <derek.nola@suse.com>	2022-11-04 10:35:29 -07:00
iyear	3aae7b8783	Fix incorrect defer usage Problem: Using defer inside a loop can lead to resource leaks Solution: Judge newer file in the separate function Signed-off-by: iyear <ljyngup@gmail.com>	2022-11-01 16:23:25 -07:00
Derek Nola	06d81cb936	Replace deprecated ioutil package (#6230) * Replace ioutil package * check integration test null pointer * Remove rotate retries Signed-off-by: Derek Nola <derek.nola@suse.com>	2022-10-07 17:36:57 -07:00
Brad Davidson	fc1c100ffd	Remove legacy bidirectional datastore sync code Since #4438 removed 2-way sync and treats any changed+newer files on disk as an error, we no longer need to determine if files are newer on disk/db or if there is a conflicting mix of both. Any changed+newer file is an error, unless we're doing a cluster reset in which case everything is unconditionally replaced. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-07-12 12:10:30 -07:00
Brad Davidson	83420ef78e	Fix fatal error when reconciling bootstrap data Properly skip restoring bootstrap data for files that don't have a path set because the feature that would set it isn't enabled. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-07-12 12:10:30 -07:00
Brad Davidson	96162c07c5	Handle egress-selector-mode change during upgrade Properly handle unset egress-selector-mode from existing servers during cluster upgrade. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-06-30 11:57:41 -07:00
Brad Davidson	1339626a5b	Defragment etcd datastore before clearing alarms Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-03-28 09:27:59 -07:00
Brad Davidson	3cebde924b	Handle empty entries in bootstrap path map Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-03-17 13:42:27 -07:00
Luther Monson	9a849b1bb7	[master] changing package to k3s-io (#4846) * changing package to k3s-io Signed-off-by: Luther Monson <luther.monson@gmail.com> Co-authored-by: Derek Nola <derek.nola@suse.com>	2022-03-02 15:47:27 -08:00
Brad Davidson	9a48086524	Ignore cluster membership errors when reconciling from temp etcd Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-03-01 20:25:20 -08:00
Brad Davidson	e4846c92b4	Move temporary etcd startup into etcd module Reuse the existing etcd library code to start up the temporary etcd server for bootstrap reconcile. This allows us to do proper health-checking of the datastore on startup, including handling of alarms. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-03-01 20:25:20 -08:00
Brad Davidson	5014c9e0e8	Fix adding etcd-only node to existing cluster Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-02-28 19:56:08 -08:00
Brad Davidson	a1b800f0bf	Remove unnecessary copies of etcdconfig struct Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-02-28 12:05:16 -08:00
Brad Davidson	2989b8b2c5	Remove unnecessary copies of runtime struct Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-02-28 12:05:16 -08:00
Brad Davidson	5ca206ad3b	Fix handling of agent-token fallback to token Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-01-07 09:56:37 -08:00
Brad Davidson	e7464a17f7	Fix use of agent creds for secrets-encrypt and config validate Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2022-01-06 12:55:18 -08:00
Brian Downs	3ae550ae51	Update bootstrap logic to output all changed files on disk (#4800)	2021-12-21 14:28:32 -07:00
Brad Davidson	8ad7d141e8	Close etcd clients to avoid leaking GRPC connections If you don't explicitly close the etcd client when you're done with it, the GRPC connection hangs around in the background. Normally this is harmless, but in the case of the temporary etcd we start up on 2399 to reconcile bootstrap data, the client will start logging errors afterwards when the server goes away. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2021-12-17 23:55:17 -08:00
Derek Nola	17eebe0563	Fix cold boot and reconciliation on secondary servers (#4747) * Enable reconciliation on secondary servers Signed-off-by: Derek Nola <derek.nola@suse.com> * Remove unused code Signed-off-by: Derek Nola <derek.nola@suse.com> * Attempt to reconcile with datastore first Signed-off-by: Derek Nola <derek.nola@suse.com> * Added warning on failure Signed-off-by: Derek Nola <derek.nola@suse.com> * Update warning Signed-off-by: Derek Nola <derek.nola@suse.com> * golangci-lint fix Signed-off-by: Derek Nola <derek.nola@suse.com>	2021-12-15 15:38:50 -08:00
Hussein Galal	d71b335871	Fix snapshot restoration on fresh nodes (#4737) Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>	2021-12-14 02:04:39 +02:00
Brian Downs	bf4e037fcf	Resolve Bootstrap Migration Edge Case (#4730)	2021-12-13 13:02:30 -07:00
Brian Downs	a6fe2c0bc5	Resolve restore bootstrap (#4704)	2021-12-09 14:54:27 -07:00
Manuel Buil	1e0696628e	Merge pull request #4581 from manuelbuil/checking-HA-parameters Verify new control plane nodes joining the cluster share the same config as cluster members	2021-12-08 10:49:28 +01:00
Derek Nola	bcb662926d	Secrets-encryption rotation (#4372) * Regular CLI framework for encrypt commands * New secrets-encryption feature * New integration test * fixes for flaky integration test CI * Fix to bootstrap on restart of existing nodes * Consolidate event recorder Signed-off-by: Derek Nola <derek.nola@suse.com>	2021-12-07 14:31:32 -08:00
Manuel Buil	1b3187ea07	Check HA network parameters Signed-off-by: Manuel Buil <mbuil@suse.com>	2021-12-07 23:09:05 +01:00
Chris Kim	f18b3252c0	[master] Add etcd extra args support for K3s (#4463) * Add etcd extra args support for K3s Signed-off-by: Chris Kim <oats87g@gmail.com> * Add etcd custom argument integration test Signed-off-by: Chris Kim <oats87g@gmail.com> * go generate Signed-off-by: Chris Kim <oats87g@gmail.com>	2021-11-11 21:03:15 -08:00
Brian Downs	adaeae351c	update bootstrap logic (#4438) * update bootstrap logic resolving a startup bug and account for etcd	2021-11-10 05:33:42 -07:00
Brian Downs	0a0b915921	reset buffer after use (#4279)	2021-10-22 15:56:01 -07:00
Brian Downs	34080b23b1	Copy old bootstrap buffer data for use during migration (#4215)	2021-10-15 10:17:29 -07:00
Brian Downs	ac7a8d89c6	Add ability to reconcile bootstrap data between datastore and disk (#3398)	2021-10-07 12:47:00 -07:00
Hussein Galal	136dddca11	Fix storing bootstrap data with empty token string (#3422) * Fix storing bootstrap data with empty token string Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * delete node password secret after restoration fixes to bootstrap key vendor update Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * fix comment Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * fix typo Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * more fixes Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * fixes Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * fixes Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * typos Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * Removing dynamic listener file after restoration Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com> * go mod tidy Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>	2021-06-22 22:42:34 +02:00
Brian Downs	7c99f8645d	Have Bootstrap Data Stored in etcd at Completed Start (#3038) * have state stored in etcd at completed start and remove unneeded code	2021-03-11 13:07:40 -07:00
Brad Davidson	7cdfaad6ce	Always use static ports for client load-balancers (#3026) * Always use static ports for the load-balancers This fixes an issue where RKE2 kube-proxy daemonset pods were failing to communicate with the apiserver when RKE2 was restarted because the load-balancer used a different port every time it started up. This also changes the apiserver load-balancer port to be 1 below the supervisor port instead of 1 above it. This makes the apiserver port consistent at 6443 across servers and agents on RKE2. Additional fixes below were required to successfully test and use this change on etcd-only nodes. * Actually add lb-server-port flag to CLI * Fix nil pointer when starting server with --disable-etcd but no --server * Don't try to use full URI as initial load-balancer endpoint * Fix etcd load-balancer pool updates * Update dynamiclistener to fix cert updates on etcd-only nodes * Handle recursive initial server URL in load balancer * Don't run the deploy controller on etcd-only nodes	2021-03-06 02:29:57 -08:00
Brian Downs	13229019f8	Add ability to perform an etcd on-demand snapshot via cli (#2819) * add ability to perform an etcd on-demand snapshot via cli	2021-01-21 14:09:15 -07:00
JenTing Hsiao	57041f0239	Add codespell CI test and fix codespell error (#2740) * Add codespell CI test * Fix codespell error	2020-12-22 12:35:58 -08:00
Brad Davidson	c3c983198f	Add temporary fix for issue with interrupted etcd promote This is a minimal fix for https://github.com/rancher/rke2/issues/392 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2020-09-30 11:45:58 -07:00
Brad Davidson	45dd4afe50	Simplify token parsing Improves readability, reduces round-trips to the join server to validate certs. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2020-09-27 03:26:24 -07:00
Brad Davidson	9074da7405	Fix misc nits and missing/unused imports Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2020-09-27 03:10:00 -07:00
Brad Davidson	703ba5cde7	Add a bunch of doc comments Also change identical error messages to clarify where problems are occurring. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2020-09-27 03:10:00 -07:00
Brad Davidson	a3bbd58f37	Fix managed etcd cold startup deadlock issue #2249 We should ignore --token and --server if the managed database is initialized, just like we ignore --cluster-init. If the user wants to join a new cluster, or rejoin a cluster after --cluster-reset, they need to delete the database. This is a cleaner way to prevent deadlocking on quorum loss, and removes the requirement that the target of the --server argument must be online before already joined nodes can start. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>	2020-09-27 02:44:49 -07:00
Darren Shepherd	a18d387390	Refactor clustered DB framework	2020-06-06 16:39:41 -07:00

47 Commits (1fe0371e95ba296f5f0b925bc6c5648c2205c9df)