Commit Graph

161 Commits (76e8e9789a78706bb04738ba82c1d38533b88231)

Author SHA1 Message Date
Derek Nola 0b18a65d4f
Revert "Warn that v1.28 will deprecate reencrypt/prepare (#7848)"
This reverts commit 4ab01f3941.

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-07-14 12:38:33 -07:00
Derek Nola 8405813c12
Fix rootless node password (#7887)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-07-07 09:14:49 -07:00
Derek Nola 4ab01f3941
Warn that v1.28 will deprecate reencrypt/prepare (#7848)
* Warn that v1.28 will deprecate reencrypt/prepare

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-07-06 12:34:51 -07:00
Daishan Peng ce3443ddf6 Allow k3s to customize apiServerPort on helm-controller
Signed-off-by: Daishan Peng <daishan@acorn.io>
2023-07-03 11:09:49 -07:00
Vitor Savian 0809187cff
Adding cli to custom klipper helm image (#7682)
Adding cli to custom klipper helm image

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2023-06-28 15:31:58 +00:00
Brad Davidson 5170bc5a04 Improve error response logging
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-06-05 15:31:04 -07:00
Brad Davidson 45d8c1a1a2 Soft-fail on node password verification if the secret cannot be created
Allows nodes to join the cluster during a webhook outage. This also
enhances auditability by creating Kubernetes events for the deferred
verification.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-06-05 15:31:04 -07:00
Derek Nola b0188f5a13
Test Coverage Reports for E2E tests (#7526)
* Move coverage writer into agent and server
* Add coverage report to E2E PR tests
* Add codecov upload to drone

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-06-05 14:15:17 -07:00
Brad Davidson 64a5f58f1e Create new kubeconfig for supervisor use
Only actual admin actions should use the admin kubeconfig; everything done by the supervisor/deploy/helm controllers will now use a distinct account for audit purposes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-30 18:15:11 -07:00
Brad Davidson 8748813a61 Use distinct clients for supervisor, deploy, and helm controllers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-30 18:15:11 -07:00
Brad Davidson 8f450bafe1 Bump helm-controller version for repo auth/ca support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-10 14:57:37 -07:00
Brad Davidson 239021e759 Consistently use constant-time comparison of password hashes
As per https://github.com/golang/go/issues/47001 even subtle.ConstantTimeCompare should never be used with variable-length inputs, as it will return 0 if the lengths do not match. Switch to consistently using constant-time comparisons of hashes for password checks to avoid any possible side-channel leaks that could be combined with other vectors to discover password lengths.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-09 13:54:50 -07:00
Derek Nola c6dc789e25
Add support for `-cover` + integration test code coverage (#7415)
* Add support for -cover in k3s server
* Update codecov reporting
* Sigterm in StopK3sServer
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-05-08 12:46:51 -07:00
Brad Davidson f1b6a3549c Fix stack log on panic
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-28 11:24:34 -07:00
Brad Davidson c44d33d29b Fix race condition in tunnel server startup
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.

Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-28 11:24:34 -07:00
Brad Davidson ad41fb8c96 Create CRDs with schema
Fixes an issue where CRDs were being created without schema, allowing
resources with invalid content to be created, later stalling the
controller ListWatch event channel when the invalid resources could not
be deserialized.

This also requires moving Addon GVK tracking from a status field to
an annotation, as the GroupVersionKind type has special handling
internal to Kubernetes that prevents it from being serialized to the CRD
when schema validation is enabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-27 20:42:46 -07:00
Brad Davidson 977a85559e Add support for cross-signing new certs during ca rotation
We need to send the full chain in order for cross-signing to work
properly during switchover to a new root.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-03-13 16:56:28 -07:00
Brad Davidson 0c302f4341 Fix etcd member deletion
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-14 09:39:41 -08:00
Brad Davidson 3d146d2f1b Allow for multiple sets of leader-elected controllers
Addresses an issue where etcd controllers did not run on etcd-only nodes

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-10 10:46:48 -08:00
Brad Davidson 87f9c4ab11 Ensure that node exists when using node auth
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-07 14:55:04 -08:00
Brad Davidson 373df1c8b0 Add support for `k3s token` command
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-07 14:55:04 -08:00
Brad Davidson 215fb157ff Add `certificate rotate-ca` to write updated CA certs to datastore
This command must be run on a server while the service is running. After this command completes, all the servers in the cluster should be restarted to load the new CA files.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-06 15:09:31 -08:00
Manuel Buil 557fcd28d5 Change the priority of address types depending on flannel-external-ip
Signed-off-by: Manuel Buil <mbuil@suse.com>
2022-11-04 09:02:39 +01:00
Brad Davidson 8d28a38a18 Update helm-controller to pull in refactor
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-10-26 15:03:13 -07:00
Derek Nola 06d81cb936
Replace deprecated ioutil package (#6230)
* Replace ioutil package
* check integration test null pointer
* Remove rotate retries

Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-10-07 17:36:57 -07:00
Brad Davidson 0b96ca92bc Move servicelb into cloudprovider LoadBalancer interface
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-09-30 08:17:20 -07:00
Brad Davidson a15e7e8b68 Move DisableServiceLB/Rootless/ServiceLBNamespace into config.Control
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-09-30 08:17:20 -07:00
Vladimir Kochnev 13af0b1d88 Save agent token to /var/lib/rancher/k3s/server/agent-token
Having separate tokens for server and agent nodes is a nice feature.

However, passing server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without CA hash is insecure when CA is
self-signed, and k3s warns about it in the logs:

```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```

Okay so I need CA hash but where should I get it?

This commit attempts to fix this issue by saving agent token value to
`agent-token` file with CA hash appended.

Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
2022-08-01 14:11:50 -07:00
Brad Davidson 5eaa0a9422 Replace getLocalhostIP with Loopback helper method
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-07-21 16:51:57 -07:00
Derek Nola a9b5a1933f
Delay service readiness until after startuphooks have finished (#5649)
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-06-15 09:00:52 -07:00
Darren Shepherd e6009b1edf Introduce servicelb-namespace parameter
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.

Signed-off-by: Darren Shepherd <darren@acorn.io>
2022-06-14 15:48:58 -07:00
Brad Davidson ce5b9347c9 Replace DefaultProxyDialerFn dialer injection with EgressSelector support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 17:54:36 -07:00
Brad Davidson 3d01ca1309 Make supervisor errors parsable by Kubernetes client libs
This gives nicer errors from Kubernetes components during startup, and
reduces LOC a bit by using the upstream responsewriters module instead
of writing the headers and body by hand.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 09:23:37 -07:00
Brad Davidson b12cd62935 Move IPv4/v6 selection into helpers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson 5b2c14b123 Print a helpful error when trying to join additional servers but etcd is not in use
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson 99851b0f84 Use core constants for cert user/group values
Also update cert gen to ensure leaf certs are regenerated if other key fields change.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson f37e7565b8 Move the apiserver addresses controller into the etcd package
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.

Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-07 11:28:15 -07:00
Brad Davidson 49544e0d49 Allow agents to query non-apiserver supervisors for apiserver endpoints
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-06 13:03:14 -07:00
Roberto Bonafiglia 4afeb9c5c7
Merge pull request #5325 from rbrtbnfgl/fix-etcd-ipv6-url
Fixed etcd URL in case of IPv6 address
2022-04-05 09:55:42 +02:00
Roberto Bonafiglia dda409b041 Updated localhost address on IPv6 only setup
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-29 09:35:54 +02:00
Brad Davidson e811689df9 Fix etcd-only secrets encryption rotation
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-25 10:40:58 -07:00
Brad Davidson 38706eeec0 Defer ensuring node passwords on etcd-only nodes during initial cluster bootstrap
This allows secondary etcd nodes to bootstrap the kubelet before an
apiserver joins the cluster. Rancher waits for all the etcd nodes to
come up before adding the control-plane nodes, so this needs to be
handled properly.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-18 10:58:37 -07:00
Brad Davidson a93b9b6d53 Update helm-controller
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-16 23:49:14 -07:00
Luther Monson 9a849b1bb7
[master] changing package to k3s-io (#4846)
* changing package to k3s-io

Signed-off-by: Luther Monson <luther.monson@gmail.com>

Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Brad Davidson 5014c9e0e8 Fix adding etcd-only node to existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 19:56:08 -08:00
Brad Davidson a1b800f0bf Remove unnecessary copies of etcdconfig struct
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Brad Davidson e7464a17f7 Fix use of agent creds for secrets-encrypt and config validate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-06 12:55:18 -08:00
Derek Nola bcb662926d
Secrets-encryption rotation (#4372)
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-07 14:31:32 -08:00
Brad Davidson 3da1bb3af2 Fix other uses of NewForConfigOrDie in contexts where we could return err
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00
Brad Davidson 5a923ab8dc Add containerd ready channel to delay etcd node join
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-14 14:03:52 -07:00