The InstancesV1 interface handled this for us by combining the ProviderName and InstanceID values; the new interface requires us to do it manually
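A minimal sketch of what now has to happen manually, assuming the conventional `<provider-name>://<instance-id>` ProviderID form; the helper name is hypothetical, not the actual k3s code:
```go
package example

import "fmt"

// providerID joins the two values the way InstancesV1 used to do for us,
// producing the string expected in Node.Spec.ProviderID.
func providerID(providerName, instanceID string) string {
	return fmt.Sprintf("%s://%s", providerName, instanceID)
}
```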
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
For 1.24 and earlier, the svclb pods need a ServiceAccount so that we can allow their sysctls in PSPs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit f25419ca2c)
We should be reading from the hijacked bufio.ReadWriter instead of
directly from the net.Conn. There is a race condition where the
underlying http handler may consume bytes from the hijacked request
stream if they arrive in the same packet as the CONNECT header. These
bytes are left in the buffered reader, which we were not using. This was
causing us to occasionally drop a few bytes from the start of the
tunneled connection's client data stream.
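A sketch of the fix with a hypothetical CONNECT handler (backend dialing elided); the key point is draining the hijacked `bufio.ReadWriter` rather than the raw `net.Conn`:
```go
package example

import (
	"io"
	"net/http"
)

// tunnel pipes the hijacked client connection to a backend. Reading from
// bufrw first picks up any client bytes that arrived in the same packet as
// the CONNECT header; reading conn directly would silently drop them.
func tunnel(backend io.ReadWriter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hj, ok := w.(http.Hijacker)
		if !ok {
			http.Error(w, "hijacking unsupported", http.StatusInternalServerError)
			return
		}
		conn, bufrw, err := hj.Hijack()
		if err != nil {
			return
		}
		defer conn.Close()
		go io.Copy(conn, backend)
		io.Copy(backend, bufrw) // not io.Copy(backend, conn)
	}
}
```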
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If CCM and ServiceLB are both disabled, don't run the cloud-controller-manager at all;
this should provide the same CLI flag behavior as previous releases, and not create
problems when users disable the CCM but still want ServiceLB.
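The intended gating, sketched with illustrative flag names rather than the real k3s config fields:
```go
package example

// shouldRunCCM returns false only when both consumers of the embedded
// cloud-controller-manager are disabled; ServiceLB alone still needs it.
func shouldRunCCM(disableCCM, disableServiceLB bool) bool {
	return !(disableCCM && disableServiceLB)
}
```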
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Having separate tokens for server and agent nodes is a nice feature.
However, passing the server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without a CA hash is insecure when the CA is
self-signed, and k3s warns about it in the logs:
```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```
Okay, so I need the CA hash, but where should I get it?
This commit attempts to fix this by saving the agent token value to the
`agent-token` file with the CA hash appended.
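A sketch of building such a token, assuming the node-token convention of a `K10` prefix plus the SHA-256 of the cluster CA certificate; the exact k3s format may differ:
```go
package example

import (
	"crypto/sha256"
	"fmt"
	"os"
)

// fullToken appends the CA hash so the agent can validate a self-signed
// cluster CA instead of logging the warning quoted above.
func fullToken(caCertPath, token string) (string, error) {
	caBytes, err := os.ReadFile(caCertPath)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(caBytes)
	return fmt.Sprintf("K10%x::%s", sum, token), nil
}
```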
Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
(cherry picked from commit 13af0b1d88)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Requires tweaking an existing method signature to allow specifying whether or not IPv6 addresses should be returned URL-safe.
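Roughly the shape of the tweak, with hypothetical names; bare IPv6 addresses need brackets before they can be joined with a port in a URL:
```go
package example

import "strings"

// formatAddress optionally brackets IPv6 addresses so callers can embed
// them in URLs; urlSafe is the new parameter the signature change adds.
func formatAddress(addr string, urlSafe bool) string {
	if urlSafe && strings.Contains(addr, ":") {
		return "[" + addr + "]"
	}
	return addr
}
```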
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 5eaa0a9422)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Use INVOCATION_ID to detect execution under systemd (see the sketch after this list), since as of a9b5a1933f NOTIFY_SOCKET is now cleared by the server code.
* Set the unit type to notify by default for both server and agent, which is what Rancher-managed installs have done for a while.
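The detection itself is a one-liner; a sketch, assuming only that systemd sets INVOCATION_ID for every service invocation and that nothing clears it:
```go
package example

import "os"

// runningUnderSystemd reports whether we were started by systemd. Unlike
// NOTIFY_SOCKET, INVOCATION_ID survives the sd_notify call path.
func runningUnderSystemd() bool {
	return os.Getenv("INVOCATION_ID") != ""
}
```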
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit bd5fdfce33)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Addresses an issue where the compact may take more than 10 seconds on slower disks. These disks probably aren't really suitable for etcd, but apparently run fine otherwise.
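A sketch of giving the compact a longer deadline with the etcd v3 client; the 5-minute value is an illustrative assumption, not the value chosen here:
```go
package example

import (
	"context"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const compactTimeout = 5 * time.Minute // illustrative; the old deadline was ~10s

// compact runs a physical compaction with a deadline generous enough for
// slower disks.
func compact(ctx context.Context, c *clientv3.Client, rev int64) error {
	ctx, cancel := context.WithTimeout(ctx, compactTimeout)
	defer cancel()
	_, err := c.Compact(ctx, rev, clientv3.WithCompactPhysical())
	return err
}
```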
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 1674b9d640)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Increase the default snapshot timeout. The timeout is not currently
configurable from Rancher, and larger clusters are frequently seeing
uploads fail at 30 seconds.
* Enable compression for scheduled snapshots if enabled on the
command-line. The CLI flag was not being passed into the etcd config.
* Only set the S3 content-type to application/zip if the file is zipped.
* Don't run more than one snapshot at once, to prevent misconfigured
etcd snapshot cron schedules from stacking up.
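For the last point, a minimal sketch of the single-snapshot guard; a TryLock turns overlapping cron-triggered snapshots into no-ops instead of letting them queue:
```go
package example

import (
	"errors"
	"sync"
)

var snapshotMu sync.Mutex

// runSnapshot executes do() unless another snapshot is already in flight.
func runSnapshot(do func() error) error {
	if !snapshotMu.TryLock() {
		return errors.New("snapshot already in progress, skipping")
	}
	defer snapshotMu.Unlock()
	return do()
}
```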
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
87e1806697 removed the OwnerReferences
field from the DaemonSet, which makes sense since the Service may now be
in a different namespace than the DaemonSet and cross-namespace owner
references are not supported. Unfortunately, we were relying on
garbage collection to delete the DaemonSet, so this started leaving
orphaned DaemonSets when Services were deleted.
We don't want to add a Service OnRemove handler, since this will add
finalizers to all Services, not just LoadBalancer Services, causing
conformance tests to fail. Instead, manage our own finalizers, and
restore the DaemonSet removal Event that was removed by the same commit.
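A sketch of the finalizer bookkeeping, restricted to LoadBalancer Services so conformance tests stay clean; the finalizer name is hypothetical:
```go
package example

import corev1 "k8s.io/api/core/v1"

const svclbFinalizer = "svccontroller.k3s.cattle.io/daemonset" // hypothetical

// ensureFinalizer adds our finalizer to LoadBalancer Services only, and
// reports whether the caller needs to update the object.
func ensureFinalizer(svc *corev1.Service) bool {
	if svc.Spec.Type != corev1.ServiceTypeLoadBalancer {
		return false // leave all other Services untouched
	}
	for _, f := range svc.Finalizers {
		if f == svclbFinalizer {
			return false
		}
	}
	svc.Finalizers = append(svc.Finalizers, svclbFinalizer)
	return true
}
```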
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Since #4438 removed 2-way sync and treats any changed+newer files on disk as an error, we no longer need to determine if files are newer on disk/db or if there is a conflicting mix of both. Any changed+newer file is an error, unless we're doing a cluster reset in which case everything is unconditionally replaced.
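The simplified check, sketched with assumed field names:
```go
package example

import (
	"fmt"
	"time"
)

type bootstrapFile struct {
	Path    string
	Changed bool      // content differs from the datastore copy
	ModTime time.Time // on-disk modification time
}

// reconcile errors on any changed+newer file unless a cluster reset is in
// progress, in which case everything is replaced unconditionally.
func reconcile(f bootstrapFile, dbTime time.Time, clusterReset bool) error {
	if clusterReset {
		return nil
	}
	if f.Changed && f.ModTime.After(dbTime) {
		return fmt.Errorf("%s has been modified and is newer than the datastore copy", f.Path)
	}
	return nil
}
```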
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Properly skip restoring bootstrap data for files that don't have a path
set because the feature that would set it isn't enabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Use the same kubelet-preferred-address-types setting as RKE2 to improve reliability of the egress selector when using an HTTP proxy. Also, use BindAddressOrLoopback to ensure that the correct supervisor address is used when --bind-address is set.
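A sketch of what a BindAddressOrLoopback-style helper does; this is a hypothetical reimplementation, not the k3s util function:
```go
package example

// bindAddressOrLoopback prefers the configured --bind-address and falls
// back to the loopback address for the requested IP family.
func bindAddressOrLoopback(bindAddress string, ipv6 bool) string {
	if bindAddress != "" {
		return bindAddress
	}
	if ipv6 {
		return "::1"
	}
	return "127.0.0.1"
}
```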
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This parameter controls the namespace in which the klipper-lb pods will be created.
It defaults to kube-system so that k3s does not create a new namespace by
default. It can be changed if users wish to isolate the pods and apply
some policy to them.
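A sketch of how such a flag might be declared in the urfave/cli style k3s uses; the flag name is an assumption based on the description above:
```go
package example

import "github.com/urfave/cli"

var ServiceLBNamespace = cli.StringFlag{
	Name:  "servicelb-namespace",
	Usage: "(components) Namespace in which to create the klipper-lb pods",
	Value: "kube-system", // default: don't create a new namespace
}
```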
Signed-off-by: Darren Shepherd <darren@acorn.io>
(cherry picked from commit e6009b1edf)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
The baseline PodSecurity profile will reject klipper-lb pods from running.
Since klipper-lb pods are put in the same namespace as the Service this
means users cannot use the PodSecurity baseline profile in combination with
the k3s servicelb.
The solution is to move all klipper-lb pods to a dedicated klipper-lb-system
namespace, where their security policy can be different and uniformly
managed.
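A sketch of creating that namespace with its own PodSecurity level via the upstream pod-security.kubernetes.io labels; the `privileged` value is an illustrative assumption:
```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// klipperNamespace builds a namespace whose PodSecurity enforcement differs
// from the cluster-wide baseline, so klipper-lb pods are admitted.
func klipperNamespace(name string) *corev1.Namespace {
	return &corev1.Namespace{
		ObjectMeta: metav1.ObjectMeta{
			Name: name,
			Labels: map[string]string{
				"pod-security.kubernetes.io/enforce": "privileged",
			},
		},
	}
}
```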
Signed-off-by: Darren Shepherd <darren@acorn.io>
(cherry picked from commit f4cc1b8788)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Allow the flannel backend to be specified as
backend=option=val,option2=val2 to select a given backend with extra options (see the parsing sketch after this list).
In particular, this adds the following options to the wireguard-native
backend:
* Mode - flannel wireguard tunnel mode
* PersistentKeepaliveInterval - wireguard persistent keepalive interval
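A hypothetical parser for that syntax, not the exact k3s implementation:
```go
package example

import (
	"fmt"
	"strings"
)

// parseBackend splits "wireguard-native,Mode=separate,PersistentKeepaliveInterval=25"
// into the backend name and its option map.
func parseBackend(s string) (backend string, opts map[string]string, err error) {
	parts := strings.Split(s, ",")
	backend, opts = parts[0], map[string]string{}
	for _, p := range parts[1:] {
		k, v, ok := strings.Cut(p, "=")
		if !ok {
			return "", nil, fmt.Errorf("invalid backend option %q", p)
		}
		opts[k] = v
	}
	return backend, opts, nil
}
```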
Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
* Move the startup hooks WaitGroup into a runtime pointer, and check it before notifying systemd (see the sketch after this list)
* Switch default systemd notification to server
* Add a 1-second delay to allow etcd to write to disk
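A sketch of the resulting flow, using go-systemd; the types and ordering here are illustrative:
```go
package example

import (
	"sync"
	"time"

	"github.com/coreos/go-systemd/v22/daemon"
)

type runtime struct {
	StartupHooksWg *sync.WaitGroup // pointer: may not be populated yet at startup
}

// notifyReady waits for startup hooks, pauses briefly so etcd can write to
// disk, then tells systemd the unit is ready.
func notifyReady(rt *runtime) {
	if rt.StartupHooksWg != nil {
		rt.StartupHooksWg.Wait()
	}
	time.Sleep(time.Second)
	_, _ = daemon.SdNotify(true, daemon.SdNotifyReady)
}
```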
Signed-off-by: Derek Nola <derek.nola@suse.com>
If the user points S3 backups at a bucket containing other files, those
file names may not be valid configmap keys.
For example, RKE1 generates backup files with names like
`s3-c-zrjnb-rs-6hxpk_2022-05-05T12:05:15Z.zip`; the colons in the
timestamp portion of the name are not allowed for use in configmap keys.
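A sketch of guarding against such names; the pattern mirrors the Kubernetes configmap key rules (alphanumerics, `-`, `_`, and `.`):
```go
package example

import "regexp"

var validKey = regexp.MustCompile(`^[-._a-zA-Z0-9]+$`)

// keyable reports whether an S3 object name can be used as a configmap key.
// The RKE1 example above fails because of the colons in its timestamp.
func keyable(name string) bool {
	return validKey.MatchString(name)
}
```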
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: igor <igor@igor.io>
Signed-off-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Igor <igorwwwwwwwwwwwwwwwwwwww@users.noreply.github.com>
Requests in the control-plane context are destined for endpoints outside the
cluster and should not be sent to the proxy.
In agent mode, we don't watch pods and just direct-dial any request for
a non-node address, which is the original behavior.
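A sketch of that dial decision; `isNodeAddr` and the tunnel dialer stand in for the real machinery:
```go
package example

import (
	"context"
	"net"
)

// dial direct-dials anything that is not a node address (the original
// behavior), and only routes node traffic through the tunnel.
func dial(ctx context.Context, addr string, isNodeAddr func(string) bool,
	tunnel func(context.Context, string) (net.Conn, error)) (net.Conn, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	if !isNodeAddr(host) {
		var d net.Dialer
		return d.DialContext(ctx, "tcp", addr)
	}
	return tunnel(ctx, addr)
}
```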
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Watching pods appears to be the most reliable way to ensure that the
proxy routes and authorizes connections.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Remove objects when removed from manifests
If a user puts a file in /var/lib/rancher/k3s/server/manifests/ then the
objects contained therein are deployed to the cluster. If the objects
are removed from that file, they are not removed from the cluster.
This change tracks the GVKs in the files and will remove objects from the
cluster when they are removed from the file.
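The pruning idea, sketched with illustrative types: diff the keys previously applied from a file against the current parse and delete whatever disappeared:
```go
package example

type objKey struct{ GVK, Namespace, Name string }

// toPrune returns the objects that were applied from an earlier version of
// a manifest file but are absent from the current one.
func toPrune(previous, current []objKey) []objKey {
	seen := make(map[objKey]bool, len(current))
	for _, k := range current {
		seen[k] = true
	}
	var prune []objKey
	for _, k := range previous {
		if !seen[k] {
			prune = append(prune, k)
		}
	}
	return prune
}
```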
Signed-off-by: Donnie Adams <donnie.adams@suse.com>
(cherry picked from commit c38a8c3b43)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>