Allow bootstrapping with kubeadm bootstrap token strings or existing
Kubelet certs. This allows agents to join the cluster using kubeadm
bootstrap tokens, as created with the `k3s token create` command.
When the token expires or is deleted, agents can successfully restart by
authenticating with their kubelet certificate via node authentication.
If the token is gone and the node is deleted from the cluster, node auth
will fail and they will be prevented from rejoining the cluster until
provided with a valid token.
Servers still must be bootstrapped with the static cluster token, as
they will need to know it to decrypt the bootstrap data.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* General cleanup of test-helpers functions to address CI failures
* Install awscli in test image
* Log containerd output to file even when running with --debug
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Using the node external IP address for all CNI traffic is a breaking change from previous versions; we should make it an opt-in for distributed clusters instead of default behavior.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Consolidate data dir flag
* Group cluster flags together
* Reorder and group agent flags
* Add additional info around vmodule flag
* Hide deprecated flags, and add warning about their removal
Signed-off-by: Derek Nola <derek.nola@suse.com>
Also honor node-ip when adding the node address to the SAN list, instead
of hardcoding the autodetected IP address.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move registries.yaml handling out to rancher/wharfie
* Add system-default-registry support
* Add CLI support for kubelet image credential providers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If the port wanted by the client load balancer is in TIME_WAIT, startup
will fail. Set SO_REUSEPORT so that it can be listened on again
immediately.
The configurable Listen call wants a context, so plumb that through as
well.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Always use static ports for the load-balancers
This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.
This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.
Additional fixes below were required to successfully test and use this change
on etcd-only nodes.
* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
* Add functionality for etcd snapshot/restore to and from S3 compatible backends.
* Update etcd restore functionality to extract and write certificates and configs from snapshot.
Servers should always be upgraded before agents, but generally this
isn't required because things are compatible between versions. In this
case we're OK with failing closed if the user upgrades out of order, but
we should give a clearer message about what steps are required to fix
the issue.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Adds support for retagging images to appear to have been sourced from
one or more additional registries as they are imported from the tarball.
This is intended to support RKE2 use cases with system-default-registry
where the images need to appear to have been pulled from a registry
other than docker.io.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Problem:
While using ZFS on debian and K3s with docker, I am unable to get k3s working as the snapshotter value is being validated and the validation fails.
Solution:
We should not validate snapshotter value if we are using docker as it's a no-op in that case.
Signed-off-by: Waqar Ahmed <waqarahmedjoyia@live.com>
Removing the cfg.DataDir mutation in 3e4fd7b did not break anything, but
did change some paths in unwanted ways. Rather than mutating the
user-supplied command-line flags, explicitly specify the agent
subdirectory as needed.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This attempts to update logging statements to make them consistent
through out the code base. It also adds additional context to messages
where possible, simplifies messages, and updates level where necessary.