Start shared informer caches when the k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started.
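For illustration, a minimal client-go sketch of the intended behavior: start the informer caches from this controller's leader-election callback too, not only the apiserver controller's. The real k3s code uses wrangler factories and its own lease wiring, so all names below are illustrative.

```go
package leaderelect

import (
	"context"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// startInformersWhenLeader starts the shared informer caches as soon as this
// process wins the "k3s-etcd" lease, mirroring what already happened for the
// apiserver controller lease. Illustrative names, not the actual k3s wiring.
func startInformersWhenLeader(ctx context.Context, client kubernetes.Interface, identity string) error {
	factory := informers.NewSharedInformerFactory(client, 5*time.Minute)

	lock, err := resourcelock.New(
		resourcelock.LeasesResourceLock,
		"kube-system", "k3s-etcd",
		client.CoreV1(), client.CoordinationV1(),
		resourcelock.ResourceLockConfig{Identity: identity},
	)
	if err != nil {
		return err
	}

	leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 45 * time.Second,
		RenewDeadline: 30 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			// Start the informer caches after winning the election, and do so
			// for this controller too, not just the apiserver controller.
			OnStartedLeading: func(ctx context.Context) {
				factory.Start(ctx.Done())
				factory.WaitForCacheSync(ctx.Done())
			},
			OnStoppedLeading: func() {},
		},
	})
	return nil
}
```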
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 3d14092f76)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Prefer the address of the etcd member being joined, and seed the full address list immediately on startup.
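A tiny sketch of the intended endpoint seeding, with a hypothetical helper name:

```go
package etcd

// seedEndpoints builds the initial etcd client URL list, putting the address
// of the member being joined first so it is preferred, while still seeding
// the rest of the cluster's addresses immediately on startup.
func seedEndpoints(joining string, members []string) []string {
	endpoints := []string{joining}
	for _, member := range members {
		if member != joining {
			endpoints = append(endpoints, member)
		}
	}
	return endpoints
}
```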
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Adds support for health-checking loadbalancer servers. If a
health-check fails when dialing, all existing connections to the
server will be closed.
* Wires up a remotedialer tunnel connectivity check as the health check
for supervisor/apiserver connections.
* Wires up a simple ping request to the supervisor port as the health
check for etcd connections.
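Roughly the behavior described above, sketched with hypothetical types (the real loadbalancer package differs in detail):

```go
package loadbalancer

import (
	"errors"
	"net"
	"sync"
	"time"
)

// server tracks the connections made to one backend so they can all be closed
// when its health check starts failing.
type server struct {
	mu          sync.Mutex
	address     string
	healthCheck func() bool // e.g. remotedialer tunnel check, or a supervisor-port ping for etcd
	conns       map[net.Conn]struct{}
}

func (s *server) dial(timeout time.Duration) (net.Conn, error) {
	// A failing health check closes every existing connection and refuses new ones.
	if !s.healthCheck() {
		s.closeAll()
		return nil, errors.New("server failed health check")
	}
	conn, err := net.DialTimeout("tcp", s.address, timeout)
	if err != nil {
		return nil, err
	}
	s.mu.Lock()
	s.conns[conn] = struct{}{}
	s.mu.Unlock()
	return conn, nil
}

// closeAll drops every tracked connection, forcing clients back through the
// load balancer so they land on a healthy backend.
func (s *server) closeAll() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for conn := range s.conns {
		conn.Close()
	}
	s.conns = map[net.Conn]struct{}{}
}
```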
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.
Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.
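A sketch of the new pattern, with hypothetical types standing in for the actual cluster wiring:

```go
package cluster

import "context"

// startupHook runs after Runtime.Core is set but strictly before the wrangler
// shared informers are started, so any OnChange handlers registered inside it
// are guaranteed to have their informers started.
type startupHook func(ctx context.Context) error

type cluster struct {
	startupHooks []startupHook
}

func (c *cluster) start(ctx context.Context, startInformers func(context.Context) error) error {
	// Old, racy pattern: fire off a goroutine that polls every 5 seconds for
	// Runtime.Core and registers handlers whenever it happens to wake up.
	// New pattern: run every hook synchronously, then start the informers.
	for _, hook := range c.startupHooks {
		if err := hook(ctx); err != nil {
			return err
		}
	}
	return startInformers(ctx)
}
```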
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Bump etcd to v3.5.4-k3s1
* Fix issue with datastore corruption on cluster-reset
* Disable unnecessary components during cluster reset
Disable control-plane components and the tunnel setup during
cluster-reset, even when not doing a restore. This reduces the amount of
log clutter during cluster reset/restore, making any errors encountered
more obvious.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead.
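A minimal before/after sketch with hypothetical, pared-down types:

```go
package config

// Hypothetical, pared-down versions of the real structs.
type ControlRuntime struct{ ServerCA string }
type Control struct{ Runtime *ControlRuntime }

// Before: a helper kept its own pointer to the runtime alongside the config
// that already carried it.
type oldHandler struct {
	runtime *ControlRuntime // redundant reference
	control *Control
}

// After: always reach the runtime through the config, one source of truth.
type newHandler struct {
	control *Control
}

func (h *newHandler) serverCA() string {
	return h.control.Runtime.ServerCA
}
```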
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Always use static ports for the load-balancers
This fixes an issue where RKE2 kube-proxy daemonset pods failed to communicate
with the apiserver after RKE2 was restarted, because the load-balancer used a
different port every time it started up.
This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it, making the apiserver port consistent
at 6443 across servers and agents on RKE2. A sketch of the port selection
follows the fix list below.
Additional fixes were required to successfully test and use this change
on etcd-only nodes:
* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
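A sketch of the port selection described above, with illustrative names and assuming the default supervisor load-balancer port of 6444:

```go
package loadbalancer

// loadBalancerPorts shows the static port derivation: the local load-balancer
// ports are now fixed relative to the supervisor load-balancer port rather
// than picked dynamically at startup. With the default supervisor LB port of
// 6444, the apiserver LB lands on 6443 on both servers and agents.
func loadBalancerPorts(supervisorLBPort int) (supervisor, apiserver int) {
	// The apiserver LB previously sat one above the supervisor port and was
	// not stable across restarts; pinning it one below keeps it at 6443.
	return supervisorLBPort, supervisorLBPort - 1
}
```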
* Add functionality for etcd snapshot/restore to and from S3-compatible backends (sketches of the S3 upload and etcd client configuration follow this list).
* Update etcd restore functionality to extract and write certificates and configs from snapshot.
* Set etcd timeouts using values from k8s instead of etcdctl
Fix for one of the warnings from #2303
* Use etcd zap logger instead of deprecated capnslog
Fix for one of the warnings from #2303
* Remove member self-promotion code paths
* Add learner promotion tracking code
* Fix RaftAppliedIndex progress check
* Remove ErrGRPCKeyNotFound check
This error is not used by the v3 API, which just returns a response with 0 KVs.
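For the S3 support above, a minimal upload sketch using the minio-go client; bucket, key, and credential handling in k3s is more involved, and the helper name is illustrative:

```go
package snapshot

import (
	"context"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

// uploadSnapshot pushes a local etcd snapshot file to an S3-compatible backend.
func uploadSnapshot(ctx context.Context, endpoint, accessKey, secretKey, bucket, snapshotPath, objectName string) error {
	client, err := minio.New(endpoint, &minio.Options{
		Creds:  credentials.NewStaticV4(accessKey, secretKey, ""),
		Secure: true,
	})
	if err != nil {
		return err
	}
	_, err = client.FPutObject(ctx, bucket, objectName, snapshotPath, minio.PutObjectOptions{
		ContentType: "application/octet-stream",
	})
	return err
}
```

And for the timeout and logger changes, a sketch of an etcd client configured with a zap logger and k8s-style dial/keepalive values; the concrete durations here are illustrative:

```go
package etcd

import (
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.uber.org/zap"
)

// newClient builds an etcd client that logs through zap (capnslog is
// deprecated) and uses dial/keepalive timeouts in line with what Kubernetes
// uses for its own etcd client rather than the etcdctl defaults.
func newClient(endpoints []string) (*clientv3.Client, error) {
	logger, err := zap.NewProduction()
	if err != nil {
		return nil, err
	}
	return clientv3.New(clientv3.Config{
		Endpoints:            endpoints,
		Logger:               logger,
		DialTimeout:          20 * time.Second,
		DialKeepAliveTime:    30 * time.Second,
		DialKeepAliveTimeout: 10 * time.Second,
	})
}
```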
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
According to @galal-hussein this is dead code that was probably brought
over from Kine. I certainly couldn't figure out what it is supposed to
be doing.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This attempts to update logging statements to make them consistent
throughout the code base. It also adds additional context to messages
where possible, simplifies messages, and updates levels where necessary.
* Add etcd members as learners
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Ignore errors in promote member
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
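A sketch of the learner add/promote flow these commits describe, using the etcd clientv3 API; helper names are illustrative, and the promote error is deliberately swallowed so the caller can retry on the next pass:

```go
package etcd

import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// joinAsLearner adds a new member as a learner so it can catch up on the raft
// log before it starts voting.
func joinAsLearner(ctx context.Context, cli *clientv3.Client, peerURL string) (uint64, error) {
	resp, err := cli.MemberAddAsLearner(ctx, []string{peerURL})
	if err != nil {
		return 0, err
	}
	return resp.Member.ID, nil
}

// tryPromote attempts to promote a learner to a voting member. MemberPromote
// fails while the learner is still behind the leader's RaftAppliedIndex, so
// the error is ignored and the caller simply tries again later.
func tryPromote(ctx context.Context, cli *clientv3.Client, id uint64) {
	_, _ = cli.MemberPromote(ctx, id)
}
```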