As per https://github.com/golang/go/issues/47001 even subtle.ConstantTimeCompare should never be used with variable-length inputs, as it will return 0 if the lengths do not match. Switch to consistently using constant-time comparisons of hashes for password checks to avoid any possible side-channel leaks that could be combined with other vectors to discover password lengths.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.
Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes an issue where CRDs were being created without schema, allowing
resources with invalid content to be created, later stalling the
controller ListWatch event channel when the invalid resources could not
be deserialized.
This also requires moving Addon GVK tracking from a status field to
an annotation, as the GroupVersionKind type has special handling
internal to Kubernetes that prevents it from being serialized to the CRD
when schema validation is enabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
We need to send the full chain in order for cross-signing to work
properly during switchover to a new root.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This command must be run on a server while the service is running. After this command completes, all the servers in the cluster should be restarted to load the new CA files.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Having separate tokens for server and agent nodes is a nice feature.
However, passing server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without CA hash is insecure when CA is
self-signed, and k3s warns about it in the logs:
```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```
Okay so I need CA hash but where should I get it?
This commit attempts to fix this issue by saving agent token value to
`agent-token` file with CA hash appended.
Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.
Signed-off-by: Darren Shepherd <darren@acorn.io>
This gives nicer errors from Kubernetes components during startup, and
reduces LOC a bit by using the upstream responsewriters module instead
of writing the headers and body by hand.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.
Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This allows secondary etcd nodes to bootstrap the kubelet before an
apiserver joins the cluster. Rancher waits for all the etcd nodes to
come up before adding the control-plane nodes, so this needs to be
handled properly.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder
Signed-off-by: Derek Nola <derek.nola@suse.com>
Required to support apiextensions.v1 as v1beta1 has been deleted. Also
update helm-controller and dynamiclistener to track wrangler versions.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Sync DisableKubeProxy from cfg into control before sending control to clients,
as it may have been modified by a startup hook.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
the server's built-in helm controller.
Problem:
Testing installation and uninstallation of the Helm Controller on k3s is
not possible if the Helm Controller is baked into the k3s server.
Solution:
The Helm Controller can optionally be disabled, which will allow users
to manage its installation manually.
Signed-off-by: Joe Kralicky <joe.kralicky@suse.com>
* Move registries.yaml handling out to rancher/wharfie
* Add system-default-registry support
* Add CLI support for kubelet image credential providers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Always use static ports for the load-balancers
This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.
This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.
Additional fixes below were required to successfully test and use this change
on etcd-only nodes.
* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
K3s upgrade via watch over file change of static file and manifest
and triggers helm-controller for change. It seems reasonable to
only allow upgrade traefik v1->v2 when there is no existing custom
traefik HelmChartConfig in the cluster to avoid any
incompatibility.
Here also separate the CRDs and put them into a different chart
to support CRD upgrade.
Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
This attempts to update logging statements to make them consistent
through out the code base. It also adds additional context to messages
where possible, simplifies messages, and updates level where necessary.
As seen in issues such as #15#155#518#570 there are situations where
k3s will fail to write the kubeconfig file, but reports that it wrote it
anyway as the success message is printed unconditionally. Also, secondary
actions like setting file mode and creating a symlink are also attempted
even if the file was not created.
This change skips attempting additional actions, and propagates the
failure back upwards.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
In rke2 everything is a static pod so this causes a chicken and egg situation
in which we need the kubelet running before the kube-apiserver can be
launched. By starting the apiserver in the background this allows us to
do this odd bootstrapping.
In k3s today the kubernetes API and the /v1-k3s API are combined into
one http server. In rke2 we are running unmodified, non-embedded Kubernetes
and as such it is preferred to run k8s and the /v1-k3s API on different
ports. The /v1-k3s API port is called the SupervisorPort in the code.
To support this separation of ports a new shim was added on the client in
then pkg/agent/proxy package that will launch two load balancers instead
of just one load balancer. One load balancer for 6443 and the other
for 9345 (which is the supervisor port).