Commit Graph

25 Commits (da16869555775cf17d4d97ffaf8a13b70bc738c2)

Author SHA1 Message Date
galal-hussein 973084ce47 Update peer address when running cluster-reset
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-10-25 15:49:42 -07:00
Malte Starostik 389cd740c8 Fix URL pruning when joining an etcd member
* Fix URL pruning when joining an etcd member

Problem:
Existing member clientURLs were checked if they contain the joining
node's IP. In some edge cases this would prune valid URLs when the
joining IP is a substring match of the only existing member's IP.
Because of this, it was impossible to e.g. join 10.0.0.2 to an existing
node that has an IP of 10.0.0.2X or 10.0.0.2XX:

level=fatal msg="starting kubernetes: preparing server: start managed database:
joining etcd cluster: etcdclient: no available endpoints"

Solution:
Fixed by properly parsing the URLs and comparing the IPs for equality
instead of substring match.

Signed-off-by: Malte Starostik <info@stellaware.de>
(cherry picked from commit b23955e835)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-13 11:50:05 -07:00
Hussein Galal 439e32b042
fix Node stuck at deletion (#3777)
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-08-06 19:57:38 +02:00
Brian Downs 0f3fe02eff Resolve local retention issue when S3 in use.
Remove early return preventing local retention policy to be enforced
resulting in N number of snapshots being stored.

Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-04-14 12:09:45 -07:00
Brian Downs cbdad9090b update imports
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-03-19 13:25:29 -07:00
Brian Downs 7b56aea0c0 Have Bootstrap Data Stored in etcd at Completed Start (#3038)
* have state stored in etcd at completed start and remove unneeded code

(cherry picked from commit 7c99f8645d)
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-03-15 17:11:31 -07:00
Brian Downs 92d1ecfbbe Etcd Snapshot/Restore to/from S3 Compatible Backends (#2902)
* Add functionality for etcd snapshot/restore to and from S3 compatible backends.
* Update etcd restore functionality to extract and write certificates and configs from snapshot.

(cherry picked from commit 4d1f9eda9d)
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-03-15 17:02:53 -07:00
Brian Downs ca55efaa8e Add ability to perform an etcd on-demand snapshot via cli (#2819)
* add ability to perform an etcd on-demand snapshot via cli

(cherry picked from commit 13229019f8)
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-03-15 16:54:41 -07:00
Hussein Galal f621760825
[release-1.19] Add disable components flags (#3023)
* Add disable flags for control components (#2900)

* Add disable flags to control components

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes to disable flags

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add comments to functions

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fix joining problem

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix ticker

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix role labels

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update dynamiclistener

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* remove etcd member if disable etcd is passed

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Mark disable components flags as experimental

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* change error to warn when removing self from etcd members

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add hidden to disable flags

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go mod

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-05 00:28:56 +02:00
MonzElmasry 7d8b09c4f8
change etcd dir permission if it exists
Signed-off-by: MonzElmasry <menna.elmasry@rancher.com>
2021-01-14 23:18:19 +02:00
Menna Elmasry f8a4547bec Merge pull request #2448 from MonzElmasry/new_b
Make etcd use node private ip
2020-10-28 16:40:15 -07:00
Hussein Galal 701e45f42b skip node delete from removed member (#2413)
* skip node delete from removed member

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use grpc errors

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go imports

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* exit if node is the etcd that being removed

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-10-28 16:40:15 -07:00
Brad Davidson 085a3b2920 Make etcd voting members responsible for managing learners (#2399)
* Set etcd timeouts using values from k8s instead of etcdctl
  Fix for one of the warnings from #2303
* Use etcd zap logger instead of deprecated capsnlog
  Fix for one of the warnings from #2303
* Remove member self-promotion code paths
* Add learner promotion tracking code
* Fix RaftAppliedIndex progress check
* Remove ErrGRPCKeyNotFound check
  This is not used by v3 API - it just returns a response with 0 KVs.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-28 16:40:15 -07:00
Hussein Galal 64bfc7c8bc Allow for multiple etcd snapshot restoration (#2307)
* add reset tmp file

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go imports

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix multiple lines string

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix typo

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use resetFile function

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-10-28 16:40:15 -07:00
Brad Davidson dfe88df824 Add a bunch of doc comments
Also change identical error messages to clarify where problems are
occurring.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-28 16:40:15 -07:00
Brad Davidson 5e4edcb524 Fix etcd directory permissions
Silences warning on startup about insecure directory permissions

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-28 16:40:15 -07:00
Brad Davidson 61dd185422 Rename etcd directory helpers to reduce confusion about which datadir we're talking about
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-28 16:40:15 -07:00
Brad Davidson 6998709610 Remove unnecessary listener arg from managed DB setup
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-10-28 16:40:15 -07:00
Brad Davidson fb527e91ab Skip etcd snapshots if the local endpoint is still a learner (#2295)
* Don't take snapshots if the local endpoint is still a learner
* Configure timeouts for etcd client dialer
2020-10-28 16:40:15 -07:00
Brian Downs b6c64761ab Initial Logging Output Update (#2246)
This attempts to update logging statements to make them consistent
through out the code base. It also adds additional context to messages
where possible, simplifies messages, and updates level where necessary.
2020-10-28 16:40:15 -07:00
Hussein Galal 379defa2c2 reset etcd name on cluster reset (#2284)
* reset etcd name on cluster reset

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* gofmt

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-10-28 16:40:15 -07:00
Brian Downs 866dc94cea
Galal hussein etcd backup restore (#2154)
* Add etcd snapshot and restore

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix error logs

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* goimports

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix flag describtion

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add disable snapshot and retention

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use creation time for snapshot retention

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* unexport method, update var name

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* adjust snapshot flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var name, string concat

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* revert previous change, create constants

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* type assertion error checking

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* simplify logic, remove unneeded function

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add comment

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update disable snapshots flag and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* move function

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update defaultSnapshotIntervalMinutes to 12 like rke

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update directory perms

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update etc-snapshot-dir usage

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update interval to 12 hours

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* fix usage typo

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update deps target to work, add build/data target for creation, and generate

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* remove dead make targets

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* error handling, cluster reset functionality

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* error handling, cluster reset functionality

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* remove intermediate dapper file

Signed-off-by: Brian Downs <brian.downs@gmail.com>

Co-authored-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-08-28 16:57:40 -07:00
Hussein Galal 169ee63907
Add etcd members as learners (#2066)
* Add etcd members as learners

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Ignore errors in promote member

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-07-29 22:52:49 +02:00
galal-hussein c580a8b528 Add heartbeat interval and election timeout 2020-06-06 16:39:42 -07:00
Darren Shepherd 6b5b69378f Add embedded etcd support
This is replaces dqlite with etcd.  The each same UX of dqlite is
followed so there is no change to the CLI args for this.
2020-06-06 16:39:41 -07:00