Commit Graph

70 Commits (f9f1cabe9c1b55f3d7387bd9f1a0b7d7bdadfa78)

Author SHA1 Message Date
Derek Nola 7c3f21e581
K3s Integration test fixes (#4341)
* Move tests into sub folders
* Updated documentation
* Prevent infinite loop is user has not made k3s

Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-28 12:35:28 -07:00
galal-hussein ab3d25a2c5 Update peer address when running cluster-reset
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-10-25 15:43:27 -07:00
Brian Downs 0452f017c1
Add etcd s3 timeout (#4207) 2021-10-15 10:24:14 -07:00
Brad Davidson 5a923ab8dc Add containerd ready channel to delay etcd node join
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-14 14:03:52 -07:00
Brian Downs ac7a8d89c6
Add ability to reconcile bootstrap data between datastore and disk (#3398) 2021-10-07 12:47:00 -07:00
Brian Downs f4cea90cb9
set transport to skip verify if se skip flag passed (#4102) 2021-09-28 10:13:50 -07:00
Hussein Galal 7826407a2e
Make sure there are no duplicates in etcd member list (#4025)
* Make sure there are no duplicates in etcd member list

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix node names with hyphens

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use full server name for etcd node name

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-09-18 00:51:18 +02:00
Brad Davidson 086ca8ba6a Fix premature etcd shutdown when joining an existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-15 10:35:07 -07:00
Chris Kim acf9036b63
No-op when etcd member was already removed and use existing name for etcd controller (#4014)
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-15 08:41:30 -07:00
Chris Kim 928b8531c3
[master] Add `etcd-member-management` controller to K3s (#4001)
* Initial leader elected etcd member management controller
* Bump etcd to v3.5.0-k3s2

Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-14 08:20:38 -07:00
Brad Davidson b4d8c641c6 Add exposed metrics listener instead of replacing loopback listener
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 09:39:39 -07:00
Brad Davidson 29c8b238e5 Replace klog with non-exiting fork
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 09:36:16 -07:00
Darren Shepherd 741ba95b04 Migrate sqlite data to etcd when initializing the cluster
Signed-off-by: Darren Shepherd <darren@rancher.com>
2021-09-09 10:24:02 -07:00
Devin Buhl a1ec43e0b7
feat: add option to disable s3 over https
Signed-off-by: Devin Buhl <devin.kray@gmail.com>
2021-09-05 12:03:49 -04:00
Brad Davidson b8add39b07 Bump kine for metrics/tls changes
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-01 01:51:30 -07:00
Derek Nola 60297a1bbe
Creation of K3s integration test Sonobuoy plugin (#3931)
* Added test runner and build files
* Changes to int test to output junit results.
* Updated documentation, removed comments

Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-30 08:27:59 -07:00
Derek Nola 114b30277f
Redux: Enable K3s integration test to run on existing cluster (#3905)
* Made it possible to run int tests on existing cluster

Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-26 16:26:19 -07:00
Derek Nola 66dacc6ee0
Revert "Enable K3s integration test to run on existing cluster (#3892)" (#3899)
This reverts commit 703b5af950.
2021-08-24 07:26:14 -07:00
Derek Nola 703b5af950
Enable K3s integration test to run on existing cluster (#3892)
* Made it possible to run int tests on existing cluster

Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-23 12:12:03 -07:00
Brad Davidson e95b75409a Fix lint failures
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson dc14f370c4 Update wrangler to v0.8.5
Required to support apiextensions.v1 as v1beta1 has been deleted. Also
update helm-controller and dynamiclistener to track wrangler versions.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson 872855015c Update etcd to v3.5.0
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Malte Starostik b23955e835
Fix URL pruning when joining an etcd member (#3832)
* Fix URL pruning when joining an etcd member

Problem:
Existing member clientURLs were checked if they contain the joining
node's IP. In some edge cases this would prune valid URLs when the
joining IP is a substring match of the only existing member's IP.
Because of this, it was impossible to e.g. join 10.0.0.2 to an existing
node that has an IP of 10.0.0.2X or 10.0.0.2XX:

level=fatal msg="starting kubernetes: preparing server: start managed database:
joining etcd cluster: etcdclient: no available endpoints"

Solution:
Fixed by properly parsing the URLs and comparing the IPs for equality
instead of substring match.

Signed-off-by: Malte Starostik <info@stellaware.de>
2021-08-12 15:59:04 -07:00
Derek Nola a1e36153f9
Added locking system for integration tests (#3820)
* Added locking system for integration tests
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-10 16:22:12 -07:00
Derek Nola 4cc781b5e3
Moved testing utils into tests directory. Improved gotests template. (#3805)
* Moved testing utils into tests directory. Improved gotests template.
* Updated cgroups2 with util folder rename

Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-10 11:13:26 -07:00
Brian Downs dcf0657b20
account for an s3 folder when listing objects (#3807)
* account for an s3 folder when listing objects
2021-08-09 16:14:41 -07:00
Derek Nola b4eca61aeb
Prevent snapshot commands from creating empty snapshot directory (#3783)
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-09 09:04:18 -07:00
Hussein Galal bc96ffb5f3
Fix Node stuck at deletion (#3771)
* fix Node stuck at deletion

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix Node stuck at deletion

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-08-05 22:32:01 +02:00
Derek Nola 21c8a33647
Introduction of Integration Tests (#3695)
* Commit of new etcd snapshot integration tests.
* Updated integration github action to not run on doc changes.
* Update Drone runner to only run unit tests

Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-26 09:59:33 -07:00
Derek Nola bba49ea447
Fix to allow prune to correctly cleanup custom named snapshots (#3649)
Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-19 14:30:57 -07:00
Derek Nola c833183517
Add unit tests for pkg/etcd (#3549)
* Created new etcd unit tests and testing support file

Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-01 16:08:35 -07:00
Hussein Galal f5fbb9a9a8
Export cli server flags and etcd restoration functions (#3527)
* Export cli server flags and etfd restoration functions

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* export S3

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-06-30 22:29:03 +02:00
Brian Downs ecbf17e2ed move object channel defer close to goroutine
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-18 19:58:30 -07:00
Brian Downs 254b52077e add retention default and wire in s3 prune
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-18 13:57:40 -07:00
Brian Downs 6ee28214fa
Add the ability to prune etcd snapshots (#3310)
* add prune subcommand to force rentention policy enforcement
2021-05-13 13:36:33 -07:00
Brian Downs bcd8b67db4
Add the ability to list etcd snapshots (#3303)
* add ability to list local and s3 etcd snapshots
2021-05-11 16:59:33 -07:00
Brian Downs e998cd110d
Add the ability to delete an etcd snapshot locally or from S3 (#3277)
* Add the ability to delete a given set of etcd snapshots from the CLI for locally stored and S3 store snapshots.
2021-05-07 16:10:04 -07:00
Brian Downs beb0d8397a reference node name when needed
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-04 10:03:28 -07:00
Brian Downs c5ad71ce0b
Collect and Store etcd Snapshots and Metadata (#3239)
* Add the ability to store local etcd snapshots and etcd snapshots stored in an S3 compatible object store in a ConfigMap.
2021-04-30 18:26:39 -07:00
Brian Downs 66ed6efd57 Resolve local retention issue when S3 in use.
Remove early return preventing local retention policy to be enforced
resulting in N number of snapshots being stored.

Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-04-14 10:40:08 -07:00
Hussein Galal 73df65d93a
remove etcd data dir when etcd is disabled (#3059)
* remove etcd data dir when etcd is disabled

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix comment

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use debug instead of info logs

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-16 18:14:43 +02:00
Brian Downs 7c99f8645d
Have Bootstrap Data Stored in etcd at Completed Start (#3038)
* have state stored in etcd at completed start and remove unneeded code
2021-03-11 13:07:40 -07:00
Brad Davidson c0d129003b Handle loadbalancer port in TIME_WAIT
If the port wanted by the client load balancer is in TIME_WAIT, startup
will fail. Set SO_REUSEPORT so that it can be listened on again
immediately.

The configurable Listen call wants a context, so plumb that through as
well.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-03-08 17:05:25 -08:00
Brad Davidson 7cdfaad6ce
Always use static ports for client load-balancers (#3026)
* Always use static ports for the load-balancers

This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.

This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.

Additional fixes below were required to successfully test and use this change
on etcd-only nodes.

* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
2021-03-06 02:29:57 -08:00
Brian Downs 4d1f9eda9d
Etcd Snapshot/Restore to/from S3 Compatible Backends (#2902)
* Add functionality for etcd snapshot/restore to and from S3 compatible backends.
* Update etcd restore functionality to extract and write certificates and configs from snapshot.
2021-03-03 11:14:12 -07:00
galal-hussein d6124981d5 remove etcd member if disable etcd is passed
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-01 23:50:50 +02:00
Hussein Galal 5749f66aa3
Add disable flags for control components (#2900)
* Add disable flags to control components

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes to disable flags

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add comments to functions

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fix joining problem

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix ticker

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix role labels

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-02-12 17:35:57 +02:00
Brad Davidson 071de833ae Fix typo in field tag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-01-22 19:38:37 -08:00
Yuriy 06fda7accf
Add functionality to bind custom IP address for Etcd metrics endpoint (#2750)
* Add functionality to bind custom IP address for Etcd metrics endpoint

Signed-off-by: yuriydzobak <yurii.dzobak@lotusflare.com>
2021-01-22 17:40:48 -08:00
Brian Downs 13229019f8
Add ability to perform an etcd on-demand snapshot via cli (#2819)
* add ability to perform an etcd on-demand snapshot via cli
2021-01-21 14:09:15 -07:00