Brad Davidson
b12cd62935
Move IPv4/v6 selection into helpers
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Roberto Bonafiglia
9c9adda61b
Added default endpoint for IPv6
...
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-04-14 09:58:40 +02:00
Brad Davidson
f37e7565b8
Move the apiserver addresses controller into the etcd package
...
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.
Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-07 11:28:15 -07:00
Brad Davidson
2a429aac65
Fix crash on early snapshot
...
Don't attempt to retrieve snapshot metadata configmap if the apiserver
isn't available. This could be triggered if the cron expression caused a
snapshot to be triggered before the apiserver is up.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-07 09:23:34 -07:00
Roberto Bonafiglia
4afeb9c5c7
Merge pull request #5325 from rbrtbnfgl/fix-etcd-ipv6-url
...
Fixed etcd URL in case of IPv6 address
2022-04-05 09:55:42 +02:00
Roberto Bonafiglia
0746dde758
Fixed http URL on etcd
...
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-31 14:24:59 +02:00
Roberto Bonafiglia
06c779c57d
Fixed loadbalancer in case of IPv6 addresses
...
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-31 11:49:30 +02:00
Brad Davidson
62cc1ed24f
Skip setting up client tls when etcd server does not have tls enabled
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-30 01:03:41 -07:00
Roberto Bonafiglia
dda409b041
Updated localhost address on IPv6 only setup
...
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-29 09:35:54 +02:00
Brad Davidson
1339626a5b
Defragment etcd datastore before clearing alarms
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-28 09:27:59 -07:00
Roberto Bonafiglia
2285aa699b
Fixed etcd URL in case of IPv6 address
...
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-23 15:35:51 +01:00
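The commit message above does not say how the URL was fixed. As a rough sketch of the usual approach in Go, `net.JoinHostPort` wraps IPv6 literals in square brackets when building a host:port pair. The `endpointURL` helper and the addresses below are hypothetical and are not taken from the k3s source.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// endpointURL is a hypothetical helper, not the k3s implementation.
// net.JoinHostPort adds the square brackets that an IPv6 literal needs
// and leaves IPv4 addresses and hostnames unchanged.
func endpointURL(scheme, host string, port int) string {
	return scheme + "://" + net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(endpointURL("https", "10.0.0.1", 2379)) // https://10.0.0.1:2379
	fmt.Println(endpointURL("https", "fd00::1", 2379))  // https://[fd00::1]:2379
}
```

Building the URL with plain string concatenation would yield `https://fd00::1:2379`, where the port cannot be told apart from the address.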
Brad Davidson
078da46532
Close additional leaked gRPC clients
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-15 18:07:55 -07:00
Derek Nola
1f7abe5dbb
Testing directory and documentation rework. ( #5256 )
...
* Removed vagrant folder
* Fix comments around E2E ENVs
* Eliminate testutil folder
* Convert flock integration test to unit test
* Point to other READMEs
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-03-15 10:29:56 -07:00
Luther Monson
9a849b1bb7
[master] changing package to k3s-io ( #4846 )
...
* changing package to k3s-io
Signed-off-by: Luther Monson <luther.monson@gmail.com>
Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Brad Davidson
9a48086524
Ignore cluster membership errors when reconciling from temp etcd
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-01 20:25:20 -08:00
Brad Davidson
e4846c92b4
Move temporary etcd startup into etcd module
...
Reuse the existing etcd library code to start up the temporary etcd
server for bootstrap reconcile. This allows us to do proper
health-checking of the datastore on startup, including handling of
alarms.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-01 20:25:20 -08:00
Brad Davidson
555087b9b8
Add function to clear local alarms on etcd startup
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-01 11:56:52 -08:00
Brad Davidson
5014c9e0e8
Fix adding etcd-only node to existing cluster
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 19:56:08 -08:00
Brad Davidson
2989b8b2c5
Remove unnecessary copies of runtime struct
...
Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Derek Nola
e28be2912c
Migrate Ginkgo testing framework to V2, consolidate integration tests ( #5097 )
...
* Upgrade and convert ginkgo from v1 to v2
* Move all integration tests into integration folder
* Update TESTING.md
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-02-09 08:22:53 -08:00
Hussein Galal
13728058a4
Add k3s etcd restoration integration test ( #5014 )
...
* Add k3s etcd restoration test
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Fix tests and rebase
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Reorganizing the tests
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Fixing comments
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Fix etcd restore
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* dont check for errors when restoring
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use eventually to test for restoration
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix tests
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix golint
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2022-02-08 21:24:34 +02:00
Brian Downs
effcb15adb
Adds the ability to compress etcd snapshots ( #4866 )
2022-01-14 10:31:22 -07:00
Derek Nola
2ac8df3602
Integration tests utilities improvements ( #4832 )
...
* Remove sudo commands from integration tests
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Added cleanup function
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Implement better int cleanup
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Rename test utils
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Enable K3sCmd to be a single string
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Removed parsePod function
Signed-off-by: Derek Nola <derek.nola@suse.com>
* codespell
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Revert startup timeout
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Reorder sonobuoy tests, drop concurrent tests to 3
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Disable etcd
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Skip parallel testing for etcd
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-01-06 08:05:56 -08:00
Brad Davidson
a5c6e6a68a
Fix panic checking name of uninitialized etcd member
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-21 23:38:20 -08:00
Hussein Galal
d71b335871
Fix snapshot restoration on fresh nodes ( #4737 )
...
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-12-14 02:04:39 +02:00
Brian Downs
a6fe2c0bc5
Resolve restore bootstrap ( #4704 )
2021-12-09 14:54:27 -07:00
Manuel Buil
1b3187ea07
Check HA network parameters
...
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-12-07 23:09:05 +01:00
Derek Nola
d05c334a78
Improved cleanup for etcd unit test ( #4537 )
...
* Improved cleanup for etcd unit test
Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-29 14:46:58 -08:00
Chris Kim
ae4a1a144a
etcd snapshot functionality enhancements ( #4453 )
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-11-29 10:30:04 -08:00
Chris Kim
f18b3252c0
[master] Add etcd extra args support for K3s ( #4463 )
...
* Add etcd extra args support for K3s
Signed-off-by: Chris Kim <oats87g@gmail.com>
* Add etcd custom argument integration test
Signed-off-by: Chris Kim <oats87g@gmail.com>
* go generate
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-11-11 21:03:15 -08:00
Brian Downs
adaeae351c
update bootstrap logic ( #4438 )
...
* update bootstrap logic resolving a startup bug and account for etcd
2021-11-10 05:33:42 -07:00
Derek Nola
7c3f21e581
K3s Integration test fixes ( #4341 )
...
* Move tests into sub folders
* Updated documentation
* Prevent infinite loop if user has not made k3s
Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-28 12:35:28 -07:00
galal-hussein
ab3d25a2c5
Update peer address when running cluster-reset
...
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-10-25 15:43:27 -07:00
Brian Downs
0452f017c1
Add etcd s3 timeout ( #4207 )
2021-10-15 10:24:14 -07:00
Brad Davidson
5a923ab8dc
Add containerd ready channel to delay etcd node join
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-14 14:03:52 -07:00
Brian Downs
ac7a8d89c6
Add ability to reconcile bootstrap data between datastore and disk ( #3398 )
2021-10-07 12:47:00 -07:00
Brian Downs
f4cea90cb9
set transport to skip verify if se skip flag passed ( #4102 )
2021-09-28 10:13:50 -07:00
Hussein Galal
7826407a2e
Make sure there are no duplicates in etcd member list ( #4025 )
...
* Make sure there are no duplicates in etcd member list
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix node names with hyphens
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use full server name for etcd node name
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-09-18 00:51:18 +02:00
Brad Davidson
086ca8ba6a
Fix premature etcd shutdown when joining an existing cluster
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-15 10:35:07 -07:00
Chris Kim
acf9036b63
No-op when etcd member was already removed and use existing name for etcd controller ( #4014 )
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-15 08:41:30 -07:00
Chris Kim
928b8531c3
[master] Add `etcd-member-management` controller to K3s ( #4001 )
...
* Initial leader elected etcd member management controller
* Bump etcd to v3.5.0-k3s2
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-14 08:20:38 -07:00
Brad Davidson
b4d8c641c6
Add exposed metrics listener instead of replacing loopback listener
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 09:39:39 -07:00
Brad Davidson
29c8b238e5
Replace klog with non-exiting fork
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 09:36:16 -07:00
Darren Shepherd
741ba95b04
Migrate sqlite data to etcd when initializing the cluster
...
Signed-off-by: Darren Shepherd <darren@rancher.com>
2021-09-09 10:24:02 -07:00
Devin Buhl
a1ec43e0b7
feat: add option to disable s3 over https
...
Signed-off-by: Devin Buhl <devin.kray@gmail.com>
2021-09-05 12:03:49 -04:00
Brad Davidson
b8add39b07
Bump kine for metrics/tls changes
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-01 01:51:30 -07:00
Derek Nola
60297a1bbe
Creation of K3s integration test Sonobuoy plugin ( #3931 )
...
* Added test runner and build files
* Changes to int test to output junit results.
* Updated documentation, removed comments
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-30 08:27:59 -07:00
Derek Nola
114b30277f
Redux: Enable K3s integration test to run on existing cluster ( #3905 )
...
* Made it possible to run int tests on existing cluster
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-26 16:26:19 -07:00
Derek Nola
66dacc6ee0
Revert "Enable K3s integration test to run on existing cluster ( #3892 )" ( #3899 )
...
This reverts commit 703b5af950.
2021-08-24 07:26:14 -07:00
Derek Nola
703b5af950
Enable K3s integration test to run on existing cluster ( #3892 )
...
* Made it possible to run int tests on existing cluster
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-23 12:12:03 -07:00
Brad Davidson
e95b75409a
Fix lint failures
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson
dc14f370c4
Update wrangler to v0.8.5
...
Required to support apiextensions.v1 as v1beta1 has been deleted. Also
update helm-controller and dynamiclistener to track wrangler versions.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson
872855015c
Update etcd to v3.5.0
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Malte Starostik
b23955e835
Fix URL pruning when joining an etcd member ( #3832 )
...
* Fix URL pruning when joining an etcd member
Problem:
Existing member clientURLs were checked if they contain the joining
node's IP. In some edge cases this would prune valid URLs when the
joining IP is a substring match of the only existing member's IP.
Because of this, it was impossible to e.g. join 10.0.0.2 to an existing
node that has an IP of 10.0.0.2X or 10.0.0.2XX:
level=fatal msg="starting kubernetes: preparing server: start managed database:
joining etcd cluster: etcdclient: no available endpoints"
Solution:
Fixed by properly parsing the URLs and comparing the IPs for equality
instead of substring match.
Signed-off-by: Malte Starostik <info@stellaware.de>
2021-08-12 15:59:04 -07:00
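The difference between substring matching and equality comparison described in the commit above can be illustrated with a short sketch. It assumes the member URLs are ordinary `https://host:port` strings. The `sameIP` helper is hypothetical and is not the function used in k3s.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// sameIP is a hypothetical helper. It parses rawURL and reports whether
// its host is exactly the given IP, rather than merely containing it.
func sameIP(rawURL, ip string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	a, b := net.ParseIP(u.Hostname()), net.ParseIP(ip)
	return a != nil && b != nil && a.Equal(b)
}

func main() {
	existing := "https://10.0.0.21:2379" // the only existing member
	joining := "10.0.0.2"                // the node that wants to join

	// Substring check: wrongly treats the existing member as the joining node.
	fmt.Println(strings.Contains(existing, joining)) // true

	// Parsed comparison: the two addresses are different hosts.
	fmt.Println(sameIP(existing, joining)) // false
}
```

Comparing parsed `net.IP` values also treats different textual spellings of the same IPv6 address as equal, which a string comparison would not.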
Derek Nola
a1e36153f9
Added locking system for integration tests ( #3820 )
...
* Added locking system for integration tests
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-10 16:22:12 -07:00
Derek Nola
4cc781b5e3
Moved testing utils into tests directory. Improved gotests template. ( #3805 )
...
* Moved testing utils into tests directory. Improved gotests template.
* Updated cgroups2 with util folder rename
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-10 11:13:26 -07:00
Brian Downs
dcf0657b20
account for an s3 folder when listing objects ( #3807 )
...
* account for an s3 folder when listing objects
2021-08-09 16:14:41 -07:00
Derek Nola
b4eca61aeb
Prevent snapshot commands from creating empty snapshot directory ( #3783 )
...
Signed-off-by: dereknola <derek.nola@suse.com>
2021-08-09 09:04:18 -07:00
Hussein Galal
bc96ffb5f3
Fix Node stuck at deletion ( #3771 )
...
* fix Node stuck at deletion
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix Node stuck at deletion
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-08-05 22:32:01 +02:00
Derek Nola
21c8a33647
Introduction of Integration Tests ( #3695 )
...
* Commit of new etcd snapshot integration tests.
* Updated integration github action to not run on doc changes.
* Update Drone runner to only run unit tests
Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-26 09:59:33 -07:00
Derek Nola
bba49ea447
Fix to allow prune to correctly cleanup custom named snapshots ( #3649 )
...
Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-19 14:30:57 -07:00
Derek Nola
c833183517
Add unit tests for pkg/etcd ( #3549 )
...
* Created new etcd unit tests and testing support file
Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-01 16:08:35 -07:00
Hussein Galal
f5fbb9a9a8
Export cli server flags and etcd restoration functions ( #3527 )
...
* Export cli server flags and etcd restoration functions
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* export S3
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-06-30 22:29:03 +02:00
Brian Downs
ecbf17e2ed
move object channel defer close to goroutine
...
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-18 19:58:30 -07:00
Brian Downs
254b52077e
add retention default and wire in s3 prune
...
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-18 13:57:40 -07:00
Brian Downs
6ee28214fa
Add the ability to prune etcd snapshots ( #3310 )
...
* add prune subcommand to force retention policy enforcement
2021-05-13 13:36:33 -07:00
Brian Downs
bcd8b67db4
Add the ability to list etcd snapshots ( #3303 )
...
* add ability to list local and s3 etcd snapshots
2021-05-11 16:59:33 -07:00
Brian Downs
e998cd110d
Add the ability to delete an etcd snapshot locally or from S3 ( #3277 )
...
* Add the ability to delete a given set of etcd snapshots from the CLI for locally stored and S3 store snapshots.
2021-05-07 16:10:04 -07:00
Brian Downs
beb0d8397a
reference node name when needed
...
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-05-04 10:03:28 -07:00
Brian Downs
c5ad71ce0b
Collect and Store etcd Snapshots and Metadata ( #3239 )
...
* Add the ability to store local etcd snapshots and etcd snapshots stored in an S3 compatible object store in a ConfigMap.
2021-04-30 18:26:39 -07:00
Brian Downs
66ed6efd57
Resolve local retention issue when S3 in use.
...
Remove the early return that prevented the local retention policy from
being enforced, which resulted in N snapshots being stored.
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-04-14 10:40:08 -07:00
Hussein Galal
73df65d93a
remove etcd data dir when etcd is disabled ( #3059 )
...
* remove etcd data dir when etcd is disabled
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix comment
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use debug instead of info logs
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-16 18:14:43 +02:00
Brian Downs
7c99f8645d
Have Bootstrap Data Stored in etcd at Completed Start ( #3038 )
...
* have state stored in etcd at completed start and remove unneeded code
2021-03-11 13:07:40 -07:00
Brad Davidson
c0d129003b
Handle loadbalancer port in TIME_WAIT
...
If the port wanted by the client load balancer is in TIME_WAIT, startup
will fail. Set SO_REUSEPORT so that it can be listened on again
immediately.
The configurable Listen call wants a context, so plumb that through as
well.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-03-08 17:05:25 -08:00
Brad Davidson
7cdfaad6ce
Always use static ports for client load-balancers ( #3026 )
...
* Always use static ports for the load-balancers
This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.
This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.
Additional fixes below were required to successfully test and use this change
on etcd-only nodes.
* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
2021-03-06 02:29:57 -08:00
Brian Downs
4d1f9eda9d
Etcd Snapshot/Restore to/from S3 Compatible Backends ( #2902 )
...
* Add functionality for etcd snapshot/restore to and from S3 compatible backends.
* Update etcd restore functionality to extract and write certificates and configs from snapshot.
2021-03-03 11:14:12 -07:00
galal-hussein
d6124981d5
remove etcd member if disable etcd is passed
...
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-01 23:50:50 +02:00
Hussein Galal
5749f66aa3
Add disable flags for control components ( #2900 )
...
* Add disable flags to control components
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* golint
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fixes to disable flags
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Add comments to functions
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Fix joining problem
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* golint
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix ticker
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix role labels
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-02-12 17:35:57 +02:00
Brad Davidson
071de833ae
Fix typo in field tag
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-01-22 19:38:37 -08:00
Yuriy
06fda7accf
Add functionality to bind custom IP address for Etcd metrics endpoint ( #2750 )
...
* Add functionality to bind custom IP address for Etcd metrics endpoint
Signed-off-by: yuriydzobak <yurii.dzobak@lotusflare.com>
2021-01-22 17:40:48 -08:00
Brian Downs
13229019f8
Add ability to perform an etcd on-demand snapshot via cli ( #2819 )
...
* add ability to perform an etcd on-demand snapshot via cli
2021-01-21 14:09:15 -07:00
MonzElmasry
86f68d5d62
change etcd dir permission if it exists
...
Signed-off-by: MonzElmasry <menna.elmasry@rancher.com>
2021-01-08 23:47:36 +02:00
Brad Davidson
8e4d3e645b
Restore legacy master role for etcd nodes
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-12-15 15:15:46 -08:00
Brad Davidson
63f2211b31
deprecate the "node-role.kubernetes.io/master" label / taint
...
Related to https://github.com/kubernetes/kubernetes/pull/95382
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-12-08 22:51:34 -08:00
Hussein Galal
fadc5a8057
Add tombstone file to etcd and catch errc etcd channel ( #2592 )
...
* Add tombstone file to embedded etcd
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go mod update
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* more changes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* gofmt and goimports
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go mod update
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go lint
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go lint
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go mod tidy
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-12-07 22:30:44 +02:00
Menna Elmasry
523ccaf3f2
Merge pull request #2448 from MonzElmasry/new_b
...
Make etcd use node private ip
2020-10-29 00:23:56 +02:00
MonzElmasry
e8436cc76b
Make etcd use node private ip
...
Signed-off-by: MonzElmasry <menna.elmasry@rancher.com>
2020-10-28 23:45:24 +02:00
Hussein Galal
fcd18d1b6e
skip node delete from removed member ( #2413 )
...
* skip node delete from removed member
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use grpc errors
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go imports
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* exit if node is the etcd member that is being removed
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-10-28 18:32:51 +02:00
Brad Davidson
de18528412
Make etcd voting members responsible for managing learners ( #2399 )
...
* Set etcd timeouts using values from k8s instead of etcdctl
Fix for one of the warnings from #2303
* Use etcd zap logger instead of deprecated capsnlog
Fix for one of the warnings from #2303
* Remove member self-promotion code paths
* Add learner promotion tracking code
* Fix RaftAppliedIndex progress check
* Remove ErrGRPCKeyNotFound check
This is not used by v3 API - it just returns a response with 0 KVs.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-27 11:06:26 -07:00
Hussein Galal
373449ec0a
Allow for multiple etcd snapshot restoration ( #2307 )
...
* add reset tmp file
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* go imports
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix multiple lines string
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix typo
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use resetFile function
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-09-30 02:53:31 +02:00
Brad Davidson
703ba5cde7
Add a bunch of doc comments
...
Also change identical error messages to clarify where problems are
occurring.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-27 03:10:00 -07:00
Brad Davidson
f59e8fc21b
Fix etcd directory permissions
...
Silences warning on startup about insecure directory permissions
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-27 03:10:00 -07:00
Brad Davidson
ee99660a96
Rename etcd directory helpers to reduce confusion about which datadir we're talking about
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-27 03:10:00 -07:00
Brad Davidson
97eb28a01a
Remove unnecessary listener arg from managed DB setup
...
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-27 03:09:45 -07:00
Brad Davidson
42bba04651
Skip etcd snapshots if the local endpoint is still a learner ( #2295 )
...
* Don't take snapshots if the local endpoint is still a learner
* Configure timeouts for etcd client dialer
2020-09-21 20:23:18 -07:00
Brian Downs
ba70c41cce
Initial Logging Output Update ( #2246 )
...
This attempts to update logging statements to make them consistent
throughout the code base. It also adds additional context to messages
where possible, simplifies messages, and updates levels where necessary.
2020-09-21 09:56:03 -07:00
Hussein Galal
46fe57d7e9
reset etcd name on cluster reset ( #2284 )
...
* reset etcd name on cluster reset
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* gofmt
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-09-19 03:09:36 +02:00
Brian Downs
866dc94cea
Galal hussein etcd backup restore ( #2154 )
...
* Add etcd snapshot and restore
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix error logs
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* goimports
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* fix flag description
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Add disable snapshot and retention
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* use creation time for snapshot retention
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* unexport method, update var name
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* adjust snapshot flags
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update var name, string concat
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* revert previous change, create constants
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* updates
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* type assertion error checking
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* pr remediation
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* pr remediation
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* pr remediation
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* pr remediation
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* pr remediation
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* updates
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* updates
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* simplify logic, remove unneeded function
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update flags
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update flags
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* add comment
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* exit on restore completion, update flag names, move retention check
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* exit on restore completion, update flag names, move retention check
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* exit on restore completion, update flag names, move retention check
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update disable snapshots flag and field names
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* move function
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update field names
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update var and field names
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update var and field names
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update defaultSnapshotIntervalMinutes to 12 like rke
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update directory perms
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update etc-snapshot-dir usage
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update interval to 12 hours
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* fix usage typo
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* add cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* add cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* add cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* wire in cron
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update deps target to work, add build/data target for creation, and generate
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* remove dead make targets
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* error handling, cluster reset functionality
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* error handling, cluster reset functionality
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* update
Signed-off-by: Brian Downs <brian.downs@gmail.com>
* remove intermediate dapper file
Signed-off-by: Brian Downs <brian.downs@gmail.com>
Co-authored-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-08-28 16:57:40 -07:00
Hussein Galal
169ee63907
Add etcd members as learners ( #2066 )
...
* Add etcd members as learners
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
* Ignore errors in promote member
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-07-29 22:52:49 +02:00
galal-hussein
c580a8b528
Add heartbeat interval and election timeout
2020-06-06 16:39:42 -07:00