347 Commits (07c2bd4cc22fc60f80df3f7c954349f0d13500c8)

Author SHA1 Message Date
Brad Davidson 884673c8e1 Add support for svclb pod PriorityClassName
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 37f97b33c9)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 09:17:58 -07:00
Brad Davidson 5344e45dc4 Fix etcd snapshot reconcile for agentless nodes
Disable cleanup of orphaned snapshots and patching of node annotations if running agentless

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit edb0440017)
2024-04-11 10:01:23 -07:00
Brad Davidson 349cd3b871 Move error response generation code into util
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7a2a2d075c)
2024-04-11 10:01:23 -07:00
Vitor Savian 9176d7f68a Add tls for kine
* Bump kine
* Add integration tests for kine with tls

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-04-02 11:40:16 -03:00
Derek Nola 5efbb06874 Rename AgentReady to ContainerRuntimeReady for better clarity
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-02-21 13:50:47 -08:00
Derek Nola 4a787b6642 Restore original order of agent startup functions
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-02-21 13:50:47 -08:00
Hussein Galal 1228fea1ae Update flannel to v0.24.0 and remove multiclustercidr flag (#9075)
* update flannel to v0.24.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* remove multiclustercidr flag

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-02-11 17:14:36 +01:00
Brad Davidson f6303cf14d Bump kine and set NotifyInterval to what the apiserver expects
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit de825845b2)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-10 00:49:41 -08:00
Harrison Affel d6c244c627 allow executors to define containerd and docker behavior
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-02-09 16:05:50 -03:00
Brad Davidson 9643d40179 Consistently handle component exit on shutdown
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-07 19:57:53 -08:00
Brad Davidson faf9d4466d Add embedded registry implementation
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 37e9b87f62)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:30:56 -08:00
Brad Davidson 8ab374deed Add server CLI flag and config fields for embedded registry
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ef90da5c6e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:30:56 -08:00
Brad Davidson f81d460ee5 Move registries.yaml load into agent config
Moving it into config.Agent so that we can use or modify it outside the context of containerd setup

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 16d29398ad)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:30:56 -08:00
Brad Davidson 053afed3ef Add support for containerd cri registry config_path
Render cri registry mirrors.x.endpoints and configs.x.tls into config_path; keep
using mirrors.x.rewrites and configs.x.auth, as those do not yet have an
equivalent in the new format.

The new config file format allows disabling containerd's fallback to the
default endpoint when using mirror endpoints; a new CLI flag is added to
control that behavior.

This also re-shares some code that was unnecessarily split into parallel
implementations for linux/windows versions. There is probably more work
to be done on this front but it's a good start.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit c45524e662)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:30:56 -08:00
Brad Davidson 067a6545b4 Remove GA feature-gates (#8970)
Remove KubeletCredentialProviders and JobTrackingWithFinalizers feature-gates, both of which are GA and cannot be disabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 231cb6ed20)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:30:56 -08:00
Flavio Castelli 9e182bb798 Added runtimes for wasm/crun/nvidia
Create a generic helper function that finds extra containerd runtimes.
The code was originally inside of the nvidia container discovery file.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

Discover the containerd shims based on runwasi that are already
available on the node.

The runtimes could have been installed either by a package manager or by
the kwasm operator.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

The containerd configuration on a Linux system now handles the nvidia
and the WebAssembly runtimes.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

---------

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

Added runtime classes for crun/wasm/nvidia

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Added default runtime flag

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2023-12-08 18:19:50 -08:00
Brad Davidson 248a009de5 Skip initial datastore reconcile during cluster-reset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7ecd5874d2)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-16 09:55:41 -08:00
Oliver Larsson d7c1ac7ab6 QoS-class resource configuration
Problem:
Configuring qos-class features in containerd requires a custom containerd configuration template.

Solution:
Look for configuration files in default locations and configure containerd to use them if they exist.

Signed-off-by: Oliver Larsson <larsson.e.oliver@gmail.com>
(cherry picked from commit 30c8ad926d)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-16 09:55:41 -08:00
Edgar Lee 55e61670c3 Add --image-service-endpoint flag (#8279)
* Add --image-service-endpoint flag

Problem:
An external container runtime can be set, but the image service endpoint is unchanged
and is not exposed as a flag. Exposing it is useful for using containerd
snapshotters other than those with built-in support (like
stargz-snapshotter).

Solution:
Add an --image-service-endpoint flag, and default the image service endpoint to
the container runtime endpoint if that is set.

Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
(cherry picked from commit fe18b1fce9)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-17 10:44:19 -07:00
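
The defaulting rule described in the commit above can be sketched as follows. This is a minimal Python illustration, not the K3s implementation (K3s itself is written in Go); the function name `resolve_image_service_endpoint` and the example socket paths are hypothetical.

```python
def resolve_image_service_endpoint(container_runtime_endpoint: str,
                                   image_service_endpoint: str = "") -> str:
    """Pick the image service endpoint (hypothetical helper, for illustration)."""
    # An explicitly configured image service endpoint always wins.
    if image_service_endpoint:
        return image_service_endpoint
    # Otherwise fall back to the container runtime endpoint, if one was set.
    return container_runtime_endpoint


# The socket paths below are made up for illustration.
runtime = "unix:///run/example/containerd.sock"
print(resolve_image_service_endpoint(runtime))
print(resolve_image_service_endpoint(runtime, "unix:///run/example/snapshotter.sock"))
```

With only the runtime endpoint set, both services share one socket; passing a second value models pointing the image service at a separate snapshotter socket.
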
Brad Davidson b0fb6f343e Fix CloudDualStackNodeIPs feature-gate inconsistency
Enable the feature-gate for both kubelet and cloud-controller-manager. Enabling it on only one side breaks RKE2, where feature-gates are not shared due to running in different processes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-17 10:43:37 -07:00
Sean Yen dbea2e68c8 Windows support
Signed-off-by: Sean Yen <seanyen@microsoft.com>
2023-10-16 23:14:58 +02:00
Brad Davidson df0fd0de49 Store extra metadata and cluster ID for snapshots
Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7464007037)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-13 11:09:28 -07:00
Brad Davidson 9826b553c9 Disable HTTP on main etcd client port
Fixes performance issue under load, ref: https://github.com/etcd-io/etcd/issues/15402 and https://github.com/kubernetes/kubernetes/pull/118460

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 8c73fd670b)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-13 11:09:28 -07:00
Roberto Bonafiglia 9ce7972ea3 Use IPv6 if it is the first configured IP with dualstack
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2023-10-13 10:24:56 +02:00
Derek Nola d451d4f34f Server Token Rotation (#8576)
* Consolidate NewCertCommands
* Add support for user defined new token
* Add E2E testlets
* Ensure agent token also changes

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-10-10 13:03:09 -07:00
Manuel Buil 9c70ee4091 Network defaults are duplicated, remove one
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-10-04 08:24:10 +02:00
Manuel Buil dbb6280d70 Take IPFamily precedence based on order
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-09-29 12:38:16 +02:00
Pedro Tashima bd04941a29 Update to v1.27.6 and Go to 1.20.8 (#8356)
* Update to v1.27.6

Signed-off-by: Pedro Tashima <pedro.tashima@suse.com>

* Bump containerd and stargz versions

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Print message on upgrade fail

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Send Bad Gateway instead of Service Unavailable when tunnel dial fails

Works around new handling for Service Unavailable by apiserver aggregation added in kubernetes/kubernetes#119870

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Add 60 seconds to server upgrade wait to account for delays in apiserver readiness

Also change cleanup helper to ensure upgrade test doesn't pollute the
images for the rest of the tests.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

---------

Signed-off-by: Pedro Tashima <pedro.tashima@suse.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Co-authored-by: Pedro Tashima <pedro.tashima@suse.com>
Co-authored-by: Brad Davidson <brad.davidson@rancher.com>
2023-09-19 12:53:43 -03:00
Brad Davidson f365a9cb98 Add new CLI flag to enable TLS SAN CN filtering
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-08-29 08:34:56 -07:00
Brad Davidson f21ae1d949 Make apiserver egress args conditional on egress-selector-mode
Only configure enable-aggregator-routing and egress-selector-config-file
if required by egress-selector-mode.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-07-31 13:59:41 -07:00
Derek Nola be44243353 Adjust default kubeconfig file permissions (#7978)
* Adjust default kubeconfig permissions

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-07-14 15:00:27 -07:00
Bartosz Lenart 34617390d0 Generation of certificates and keys for etcd gated if etcd is disabled. (#6998)
Problem:
When support for etcd was added in 3957142, generation of certificates and keys for etcd was not gated behind use of managed etcd.
Keys are generated and distributed across servers even if managed etcd is not enabled.

Solution:
Allow generation of certificates and keys only if managed etcd is enabled, by checking the config.DisableETCD flag.

Signed-off-by: Bartossh <lenartconsulting@gmail.com>
2023-07-11 10:24:35 -07:00
Vitor Savian 0809187cff Adding cli to custom klipper helm image (#7682)
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2023-06-28 15:31:58 +00:00
guoguangwu 2215870d5d chore: pkg imported more than once
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-06-26 16:58:11 -07:00
Manuel Buil 869e030bdd VPN PoC
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-06-09 12:39:33 +02:00
Brad Davidson 45d8c1a1a2 Soft-fail on node password verification if the secret cannot be created
Allows nodes to join the cluster during a webhook outage. This also
enhances auditability by creating Kubernetes events for the deferred
verification.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-06-05 15:31:04 -07:00
Brad Davidson 64a5f58f1e Create new kubeconfig for supervisor use
Only actual admin actions should use the admin kubeconfig; everything done by the supervisor/deploy/helm controllers will now use a distinct account for audit purposes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-30 18:15:11 -07:00
thomasferrandiz b4bc57d049 Merge pull request #7303 from thomasferrandiz/netpol-log-level
ensure that klog verbosity is set to the same level as logrus
2023-05-10 15:01:06 +02:00
Derek Nola d5f560360e Handle multiple arguments with StringSlice flags (#7380)
* Add helper function for multiple arguments in stringslice

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Cleanup server setup with util function

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-05-02 09:55:48 -07:00
Brad Davidson f1b6a3549c Fix stack log on panic
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-28 11:24:34 -07:00
Brad Davidson c44d33d29b Fix race condition in tunnel server startup
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.

Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-28 11:24:34 -07:00
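
The race described in the commit above can be illustrated with a toy model. This is a simplified Python sketch, not the Wrangler or client-go API: `ToyInformerFactory`, `start_with_hooks`, and the `"Node"` kind are hypothetical names. The only assumption it encodes is that informers created after the factory has started are never started.

```python
class ToyInformerFactory:
    """Toy stand-in for a shared informer factory (not the Wrangler API)."""

    def __init__(self):
        self.handlers = {}    # resource kind -> list of callbacks
        self.running = set()  # kinds whose informer has been started

    def on_change(self, kind, handler):
        # Registering a handler lazily creates the informer for that kind.
        self.handlers.setdefault(kind, []).append(handler)

    def start(self):
        # Only informers that exist at this moment are started.
        self.running.update(self.handlers)

    def emit(self, kind, obj):
        # Events reach handlers only if that kind's informer is running.
        if kind in self.running:
            for handler in self.handlers.get(kind, []):
                handler(obj)


def start_with_hooks(factory, hooks):
    """Run every startup hook first, then start the informers."""
    for hook in hooks:
        hook(factory)
    factory.start()


# Racy pattern: the handler is registered after start() and never fires.
racy_seen = []
racy = ToyInformerFactory()
racy.start()
racy.on_change("Node", racy_seen.append)
racy.emit("Node", "node-1")

# Fixed pattern: registration happens in a hook that runs before start().
fixed_seen = []
fixed = ToyInformerFactory()
start_with_hooks(fixed, [lambda f: f.on_change("Node", fixed_seen.append)])
fixed.emit("Node", "node-1")

print(racy_seen, fixed_seen)
```

In the racy case the event is silently dropped. Running registration inside a hook makes the ordering deterministic instead of dependent on retry timing.
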
Thomas Ferrandiz 66fcca66cb ensure that klog verbosity is set to the same level as logrus
by repeatedly setting it every second during k3s startup

Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2023-04-24 18:08:55 +00:00
Derek Nola 944f811dc5 v1.27.1 CLI Deprecation (#7311)
* Remove Flannel Wireguard
* Remove etcd-snapshot (implicit save)
* Convert ipsec and multiple backend to fatal

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-04-19 12:02:05 -07:00
Roberto Bonafiglia 15ee88964b Added multiClusterCidr feature
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2023-03-14 18:30:52 +01:00
Brad Davidson 977a85559e Add support for cross-signing new certs during ca rotation
We need to send the full chain in order for cross-signing to work
properly during switchover to a new root.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-03-13 16:56:28 -07:00
Brad Davidson 0c302f4341 Fix etcd member deletion
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-14 09:39:41 -08:00
Brad Davidson 3d146d2f1b Allow for multiple sets of leader-elected controllers
Addresses an issue where etcd controllers did not run on etcd-only nodes

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-10 10:46:48 -08:00
Brad Davidson c6d0afd0cb Check for existing resources before creating them
Prevents errors when starting with fail-closed webhooks

Also, use panic instead of Fatalf so that the CloudControllerManager rescue can handle the error

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-09 15:20:49 -08:00
Brad Davidson 992e64993d Add support for kubeadm token and client certificate auth
Allow bootstrapping with kubeadm bootstrap token strings or existing
Kubelet certs. This allows agents to join the cluster using kubeadm
bootstrap tokens, as created with the `k3s token create` command.

When the token expires or is deleted, agents can successfully restart by
authenticating with their kubelet certificate via node authentication.
If the token is gone and the node is deleted from the cluster, node auth
will fail and they will be prevented from rejoining the cluster until
provided with a valid token.

Servers still must be bootstrapped with the static cluster token, as
they will need to know it to decrypt the bootstrap data.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-07 14:55:04 -08:00
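
The agent join rules described in the commit above reduce to a small decision function. The sketch below is a simplified Python illustration rather than K3s code; `agent_may_join` and its boolean parameters are hypothetical, and the real checks are performed against the cluster rather than passed in as flags.

```python
def agent_may_join(token_valid: bool, kubelet_cert_valid: bool, node_exists: bool) -> bool:
    """Decide whether an agent may (re)join the cluster (hypothetical helper)."""
    # A valid bootstrap token is enough to join.
    if token_valid:
        return True
    # Without a token, fall back to node authentication using the kubelet
    # certificate. This only works while the node still exists in the cluster.
    return kubelet_cert_valid and node_exists


# Token expired, but the node is still registered: restart succeeds.
print(agent_may_join(token_valid=False, kubelet_cert_valid=True, node_exists=True))
# Token gone and node deleted: the agent needs a new valid token.
print(agent_may_join(token_valid=False, kubelet_cert_valid=True, node_exists=False))
```

Servers are outside this sketch: per the commit message, they must still be bootstrapped with the static cluster token.
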
Brad Davidson 373df1c8b0 Add support for `k3s token` command
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-02-07 14:55:04 -08:00