Commit Graph

162 Commits (3b28ac0a1ff34a8a537a96bb733a4c3713fde9c7)

Author SHA1 Message Date
Brad Davidson 61bbad7d9e Store extra metadata and cluster ID for snapshots
Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7464007037)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-13 12:28:56 -07:00
Derek Nola 0816812c99
[Release-1.25] Clear remove annotations on cluster reset (#8589)
* Use admin kubeconfig instead of supervisor for etcd snapshot CLI

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Skip creating CRDs and setting up event recorder for CLI controller context

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Don't export functions not needed outside the etcd package

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Reorganize Driver interface and etcd driver to avoid passing context and config into most calls

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Clear remove annotations on cluster reset; refuse to delete last member from cluster

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

---------

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Co-authored-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-12 08:11:34 -07:00
Derek Nola 6afee00eaf
Server Token Rotation (#8578)
* Consolidate NewCertCommands
* Add support for user defined new token
* Add E2E testlets
* Ensure agent token also changes

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-10-10 09:45:27 -07:00
Manuel Buil 6f550cd9a1 ipFamilyPolicy:PreferDualStack for coredns and metrics-server
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-10-02 11:35:15 +02:00
Brad Davidson ddbe499d9a Add FilterCN function to prevent SAN Stuffing
Wire up a node watch to collect addresses of server nodes, to prevent adding unauthorized SANs to the dynamiclistener cert.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit aa76942d0f)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-08-04 16:08:16 -07:00
Vitor Savian e8a4961732 Adding cli to custom klipper helm image (#7682)
Adding cli to custom klipper helm image

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
(cherry picked from commit 0809187cff)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-07-07 16:28:16 -07:00
Derek Nola c850132b5f
Fix rootless node password (#7900)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-07-07 11:03:14 -07:00
Derek Nola e1a315189b
Allow k3s to customize apiServerPort on helm-controller (#7873)
Signed-off-by: Daishan Peng <daishan@acorn.io>
Co-authored-by: Daishan Peng <daishan@acorn.io>
2023-07-05 11:56:58 -07:00
Brad Davidson a645d3caf2 Improve error response logging
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 5170bc5a04)
2023-06-12 10:53:26 -07:00
Brad Davidson 3596d1891b Soft-fail on node password verification if the secret cannot be created
Allows nodes to join the cluster during a webhook outage. This also
enhances auditability by creating Kubernetes events for the deferred
verification.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 45d8c1a1a2)
2023-06-12 10:53:26 -07:00
Brad Davidson 29bc03305a Create new kubeconfig for supervisor use
Only actual admin actions should use the admin kubeconfig; everything done by the supervisor/deploy/helm controllers will now use a distinct account for audit purposes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 64a5f58f1e)
2023-06-12 10:53:26 -07:00
Brad Davidson ac6966145c Use distinct clients for supervisor, deploy, and helm controllers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 8748813a61)
2023-06-12 10:53:26 -07:00
Brad Davidson eff951b567 Bump helm-controller version for repo auth/ca support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-10 15:18:54 -07:00
Brad Davidson a0891cab16 Consistently use constant-time comparison of password hashes
As per https://github.com/golang/go/issues/47001 even subtle.ConstantTimeCompare should never be used with variable-length inputs, as it will return 0 if the lengths do not match. Switch to consistently using constant-time comparisons of hashes for password checks to avoid any possible side-channel leaks that could be combined with other vectors to discover password lengths.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 239021e759)
2023-05-10 15:18:54 -07:00
Brad Davidson 95f5069514 Fix stack log on panic
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit f1b6a3549c)
2023-05-10 15:18:54 -07:00
Brad Davidson 6d28abd1f4 Fix race condition in tunnel server startup
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.

Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c44d33d29b)
2023-05-10 15:18:54 -07:00
Brad Davidson 746ada89d2 Create CRDs with schema
Fixes an issue where CRDs were being created without schema, allowing
resources with invalid content to be created, later stalling the
controller ListWatch event channel when the invalid resources could not
be deserialized.

This also requires moving Addon GVK tracking from a status field to
an annotation, as the GroupVersionKind type has special handling
internal to Kubernetes that prevents it from being serialized to the CRD
when schema validation is enabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ad41fb8c96)
2023-05-10 15:18:54 -07:00
Brad Davidson 37a26379d5 Add support for cross-signing new certs during ca rotation
We need to send the full chain in order for cross-signing to work
properly during switchover to a new root.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-03-13 20:04:11 -07:00
Brad Davidson 4e03608119 Fix etcd member deletion
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-14 13:18:12 -08:00
Brad Davidson 14f2226b67 Allow for multiple sets of leader-elected controllers
Addresses an issue where etcd controllers did not run on etcd-only nodes

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-10 11:35:29 -08:00
Brad Davidson 33c6488bbc Ensure that node exists when using node auth
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 87f9c4ab11)
2023-02-10 09:33:55 -08:00
Brad Davidson 97c506cc65 Add support for `k3s token` command
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 373df1c8b0)
2023-02-10 09:33:55 -08:00
Brad Davidson 5eac6f977c Add `certificate rotate-ca` to write updated CA certs to datastore
This command must be run on a server while the service is running. After this command completes, all the servers in the cluster should be restarted to load the new CA files.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 215fb157ff)
2023-02-10 09:33:55 -08:00
Manuel Buil 557fcd28d5 Change the priority of address types depending on flannel-external-ip
Signed-off-by: Manuel Buil <mbuil@suse.com>
2022-11-04 09:02:39 +01:00
Brad Davidson 8d28a38a18 Update helm-controller to pull in refactor
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-10-26 15:03:13 -07:00
Derek Nola 06d81cb936
Replace deprecated ioutil package (#6230)
* Replace ioutil package
* check integration test null pointer
* Remove rotate retries

Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-10-07 17:36:57 -07:00
Brad Davidson 0b96ca92bc Move servicelb into cloudprovider LoadBalancer interface
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-09-30 08:17:20 -07:00
Brad Davidson a15e7e8b68 Move DisableServiceLB/Rootless/ServiceLBNamespace into config.Control
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-09-30 08:17:20 -07:00
Vladimir Kochnev 13af0b1d88 Save agent token to /var/lib/rancher/k3s/server/agent-token
Having separate tokens for server and agent nodes is a nice feature.

However, passing server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without CA hash is insecure when CA is
self-signed, and k3s warns about it in the logs:

```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```

Okay so I need CA hash but where should I get it?

This commit attempts to fix this issue by saving agent token value to
`agent-token` file with CA hash appended.

Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
2022-08-01 14:11:50 -07:00
Brad Davidson 5eaa0a9422 Replace getLocalhostIP with Loopback helper method
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-07-21 16:51:57 -07:00
Derek Nola a9b5a1933f
Delay service readiness until after startuphooks have finished (#5649)
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-06-15 09:00:52 -07:00
Darren Shepherd e6009b1edf Introduce servicelb-namespace parameter
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.

Signed-off-by: Darren Shepherd <darren@acorn.io>
2022-06-14 15:48:58 -07:00
Brad Davidson ce5b9347c9 Replace DefaultProxyDialerFn dialer injection with EgressSelector support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 17:54:36 -07:00
Brad Davidson 3d01ca1309 Make supervisor errors parsable by Kubernetes client libs
This gives nicer errors from Kubernetes components during startup, and
reduces LOC a bit by using the upstream responsewriters module instead
of writing the headers and body by hand.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 09:23:37 -07:00
Brad Davidson b12cd62935 Move IPv4/v6 selection into helpers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson 5b2c14b123 Print a helpful error when trying to join additional servers but etcd is not in use
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson 99851b0f84 Use core constants for cert user/group values
Also update cert gen to ensure leaf certs are regenerated if other key fields change.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson f37e7565b8 Move the apiserver addresses controller into the etcd package
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.

Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-07 11:28:15 -07:00
Brad Davidson 49544e0d49 Allow agents to query non-apiserver supervisors for apiserver endpoints
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-06 13:03:14 -07:00
Roberto Bonafiglia 4afeb9c5c7
Merge pull request #5325 from rbrtbnfgl/fix-etcd-ipv6-url
Fixed etcd URL in case of IPv6 address
2022-04-05 09:55:42 +02:00
Roberto Bonafiglia dda409b041 Updated localhost address on IPv6 only setup
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-29 09:35:54 +02:00
Brad Davidson e811689df9 Fix etcd-only secrets encryption rotation
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-25 10:40:58 -07:00
Brad Davidson 38706eeec0 Defer ensuring node passwords on etcd-only nodes during initial cluster bootstrap
This allows secondary etcd nodes to bootstrap the kubelet before an
apiserver joins the cluster. Rancher waits for all the etcd nodes to
come up before adding the control-plane nodes, so this needs to be
handled properly.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-18 10:58:37 -07:00
Brad Davidson a93b9b6d53 Update helm-controller
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-16 23:49:14 -07:00
Luther Monson 9a849b1bb7
[master] changing package to k3s-io (#4846)
* changing package to k3s-io

Signed-off-by: Luther Monson <luther.monson@gmail.com>

Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Brad Davidson 5014c9e0e8 Fix adding etcd-only node to existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 19:56:08 -08:00
Brad Davidson a1b800f0bf Remove unnecessary copies of etcdconfig struct
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Brad Davidson e7464a17f7 Fix use of agent creds for secrets-encrypt and config validate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-06 12:55:18 -08:00
Derek Nola bcb662926d
Secrets-encryption rotation (#4372)
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-07 14:31:32 -08:00
Brad Davidson 3da1bb3af2 Fix other uses of NewForConfigOrDie in contexts where we could return err
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00