Commit Graph

1278 Commits (b8f3967ad14950c40a108cde9fc1b01c4071c410)

Author SHA1 Message Date
Brad Davidson 77846d63c1 Propagate errors up from config.Get
Fixes crash when killing agent while waiting for config from server

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson 16d29398ad Move registries.yaml load into agent config
Moving it into config.Agent so that we can use or modify it outside the context of containerd setup

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson 5c99bdd9bd Pin images instead of locking layers with lease
Layer leases never did what we wanted anyway, and pinning is the new approved interface for ensuring that images do not get garbage-collected

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Vitor Savian 4a92ced8ee Handle etcd status condition on cluster reset and when etcd is disabled
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Set condition if node is unhealthy

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-01-09 11:20:41 -03:00
Aofei Sheng 8d2c40cdac
Use `ipFamilyPolicy: RequireDualStack` for dual-stack kube-dns (#8984)
Signed-off-by: Aofei Sheng <aofei@aofeisheng.com>
2024-01-09 00:44:03 +02:00
Manuel Buil 6330e26bb3 Wait for taint to be gone in the node before starting the netpol controller
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-01-08 12:04:18 +01:00
Brad Davidson b297996b92 Add runtime checking of golang version
Forces other groups packaging k3s to intentionally choose to build k3s with an unvalidated golang version

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 17:22:46 -08:00
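The idea behind a runtime golang version check can be sketched as below. This is a minimal illustration, not the code from the commit: `checkGoVersion` and the way the validated version is obtained are hypothetical, and a real build would more likely stamp the validated version into the binary at build time.

```go
package main

import (
	"fmt"
	"runtime"
)

// checkGoVersion is a hypothetical helper: it returns an error when the
// toolchain that built the binary differs from the validated one.
func checkGoVersion(validated, actual string) error {
	if validated != actual {
		return fmt.Errorf("built with %s, but only %s has been validated", actual, validated)
	}
	return nil
}

func main() {
	// Assumption for illustration: the validated version is taken from the
	// running toolchain so that the example always passes when run as-is.
	validated := runtime.Version()
	if err := checkGoVersion(validated, runtime.Version()); err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("golang version check passed")
}
```

With a check of this shape, a packager using a different toolchain has to change the validated value deliberately rather than by accident.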
Lex Rivera 5fe074b540
Add more paths to crun runtime detection (#9086)
* add usr/local paths for crun detection

Signed-off-by: Lex Rivera <me@lex.io>
2024-01-04 16:51:13 -08:00
Brad Davidson c45524e662 Add support for containerd cri registry config_path
Render cri registry mirrors.x.endpoints and configs.x.tls into config_path; keep
using mirrors.x.rewrites and configs.x.auth, as those do not yet have an
equivalent in the new format.

The new config file format allows disabling containerd's fallback to the
default endpoint when using mirror endpoints; a new CLI flag is added to
control that behavior.

This also re-shares some code that was unnecessarily split into parallel
implementations for linux/windows versions. There is probably more work
to be done on this front but it's a good start.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:50:26 -08:00
Brad Davidson 319dca3e82 Fix nil map in full snapshot configmap reconcile
If a full reconcile wins the race against sync of an individual snapshot resource, or someone intentionally deletes the configmap, the data map could be nil and cause a crash.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:49:58 -08:00
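The crash described in this commit message comes from a general Go rule: writing to a nil map panics. A minimal sketch of the guard follows; the `configMap` type and `setSnapshotEntry` helper are hypothetical stand-ins, not the types or functions used in k3s.

```go
package main

import "fmt"

// configMap is a hypothetical stand-in for a Kubernetes ConfigMap; only the
// Data field matters for this illustration.
type configMap struct {
	Data map[string]string
}

// setSnapshotEntry stores one entry, allocating the map first when it is nil.
// Without the nil check, the assignment below would panic on a nil map.
func setSnapshotEntry(cm *configMap, key, value string) {
	if cm.Data == nil {
		cm.Data = map[string]string{}
	}
	cm.Data[key] = value
}

func main() {
	// Data is nil here, as it could be after the configmap was deleted or
	// when a full reconcile runs before any individual snapshot was synced.
	cm := &configMap{}
	setSnapshotEntry(cm, "snapshot-a", "{}")
	fmt.Println(len(cm.Data)) // prints 1
}
```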
Brad Davidson db7091b3f6 Handle logging flags when parsing kube-proxy args
Also adds a test to ensure this continues to work.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:23:03 -08:00
Brad Davidson 1e663622d2 Fix the OTHER log message that prints the wrong variable
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 15:23:39 -08:00
Brad Davidson a27d660a24 Add ServiceLB support for PodHostIPs FeatureGate
If the feature-gate is enabled, use status.hostIPs for dual-stack externalTrafficPolicy=Local support

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-02 16:00:09 -08:00
Derek Nola aca1c2fd11
Add a retry around updating secrets-encrypt node annotations (#9039)
* Add a retry around updating secrets-encrypt node annotations

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-01-02 12:21:37 -08:00
Pierre bbd68f3a50
Rebase & Squash (#9070)
Signed-off-by: Yodo <pierre@azmed.co>
2024-01-02 12:05:36 -08:00
Derek Nola 3190a5faa2
Remove rotate-keys subcommand (#9079)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-12-20 12:26:41 -08:00
Hussein Galal 9411196406
Update flannel to v0.24.0 and remove multiclustercidr flag (#9075)
* update flannel to v0.24.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* remove multiclustercidr flag

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2023-12-20 00:25:38 +02:00
Hussein Galal 7101af36bb
Update Kubernetes to v1.29.0+k3s1 (#9052)
* Update to v1.29.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update to v1.29.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update go to 1.21.5

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update golangci-lint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update flannel to 0.23.0-k3s1

This update uses k3s' fork of flannel to allow the removal of
multicluster cidr flag logic from the code

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix flannel calls

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update cri-tools to version v1.29.0-k3s1

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Remove GOEXPERIMENT=nounified from arm builds

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Skip golangci-lint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fix setup logging with newer go version

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Move logging flags to components arguments

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* add sysctl commands to the test script

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update scripts/test

Signed-off-by: Brad Davidson <brad@oatmail.org>

* disable secretsencryption tests

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Signed-off-by: Brad Davidson <brad@oatmail.org>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2023-12-19 05:14:02 +02:00
Brad Davidson 231cb6ed20
Remove GA feature-gates (#8970)
Remove KubeletCredentialProviders and JobTrackingWithFinalizers feature-gates, both of which are GA and cannot be disabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-14 22:57:24 +02:00
Brad Davidson 08509a2a90 Allow setting default-runtime on servers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-08 18:18:08 -08:00
Brad Davidson b9c288f702 Bump containerd/runc to v1.7.10-k3s1/v1.1.10
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-08 18:17:19 -08:00
Vitor Savian 03532f7c0b Added runtime classes for crun/wasm/nvidia
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Added default runtime flag

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2023-12-08 15:49:28 -03:00
Brad Davidson 6d3a92a658 Print key instead of file path in snapshot metadata log message
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson b23e70d519 Don't apply s3 retention if S3 client failed to initialize
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson a92c4a0f17 Don't request metadata when listing objects
While some implementations may support it, it appears that most don't,
and some may in fact return an error if it is requested.

We already stat the object to get the metadata anyway, so this was
unnecessary if harmless on most implementations.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson 1e0a7044cf Reorder snapshot configmap reconcile to reduce log spew during initial startup
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-17 10:09:01 -08:00
Vitor Savian e53c189587
Handle nil pointer when runtime core is not ready in etcd
Signed-off-by: Vitor <vitor.savian@suse.com>
2023-11-16 15:58:42 -08:00
Brad Davidson 6c544a4679 Add jitter to client config retry
Also:
* Replaces labeled for/continue RETRY loops with wait helpers for improved readability
* Pulls secrets and nodes from cache for node password verification
* Migrate nodepassword tests to wrangler mocks for better code reuse

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-16 09:53:28 -08:00
Harsimran Singh Maan abc2efdd57
Disable helm CRD installation for disable-helm-controller (#8702)
* Disable helm CRD installation for disable-helm-controller
    The NewContext package requires config as input which would
    require all third-party callers to update when the new go module
    is published.
    
    This change only affects the behaviour of installation of helm
    CRDs. Existing helm crds installed in a cluster would not be removed
    when disable-helm-controller flag is set on the server.
    
    Addresses #8701
* address review comments
* remove redundant check

Signed-off-by: Harsimran Singh Maan <maan.harry@gmail.com>
2023-11-15 14:35:31 -08:00
Jason Costello 07ee854914
Tweaked order of ingress IPs in ServiceLB (#8711)
* Tweaked order of ingress IPs in ServiceLB
    Previously, ingress IPs were only string-sorted when returned.
    Now they are sorted by IP family, and string-sorted within each
    family, as part of the filterByIPFamily method
* Update pkg/cloudprovider/servicelb.go
* Formatting

Signed-off-by: Jason Costello <jason@hazy.com>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2023-11-15 14:33:31 -08:00
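The ordering described in this commit message (group by IP family, then string-sort within each family) can be sketched as follows. `sortByFamily` is a hypothetical helper, not the filterByIPFamily method itself, and placing IPv4 before IPv6 is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// sortByFamily returns the parseable addresses grouped by family (IPv4 first,
// an assumption for this sketch), with each group sorted as plain strings.
func sortByFamily(ips []string) []string {
	var v4, v6 []string
	for _, s := range ips {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			continue // skip anything that is not an IP literal
		}
		if addr.Is4() {
			v4 = append(v4, s)
		} else {
			v6 = append(v6, s)
		}
	}
	sort.Strings(v4)
	sort.Strings(v6)
	return append(v4, v6...)
}

func main() {
	in := []string{"fd00::2", "10.0.0.2", "fd00::1", "10.0.0.10"}
	fmt.Println(sortByFamily(in)) // prints [10.0.0.10 10.0.0.2 fd00::1 fd00::2]
}
```

Because the sort within a family is lexical rather than numeric, `10.0.0.10` comes before `10.0.0.2`.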
Brad Davidson 7ecd5874d2 Skip initial datastore reconcile during cluster-reset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-15 14:31:44 -08:00
Brad Davidson 2088218c5f Fix issue with snapshot metadata configmap
Omit snapshot list configmap entries for snapshots without extra metadata; reduce log level of warnings about missing s3 metadata files.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-15 14:25:28 -08:00
chenk008 b47cbbfd42
add agent flag disable-apiserver-lb (#8717)
* add node flag disable-agent-lb
* add agent flag disable-apiserver-lb

Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: chenk008 <kongchen28@gmail.com>
2023-11-14 15:54:32 -08:00
Oliver Larsson 30c8ad926d QoS-class resource configuration
Problem:
Configuring qos-class features in containerd requires a custom containerd configuration template.

Solution:
Look for configuration files in default locations and configure containerd to use them if they exist.

Signed-off-by: Oliver Larsson <larsson.e.oliver@gmail.com>
2023-11-14 15:53:14 -08:00
Manuel Buil 8f7a8b23b7 Improve dualStack log
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-11-14 10:50:37 +01:00
Hussein Galal f5920d7864
Add warning for multiclustercidr flag (#8758)
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2023-11-14 01:27:52 +02:00
Flavio Castelli ba5fcf13fc
Wasm shims and runtimes detection
Create a generic helper function that finds extra containerd runtimes.
The code was originally inside of the nvidia container discovery file.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

Discover the containerd shims based on runwasi that are already
available on the node.

The runtimes could have been installed either by a package manager or by
the kwasm operator.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

The containerd configuration on a Linux system now handles the nvidia
and the WebAssembly runtimes.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>

---------

Signed-off-by: Flavio Castelli <fcastelli@suse.com>
2023-11-13 14:43:41 -08:00
Vitor Savian c5cd7b3d65
Added etcd status condition
Signed-off-by: Vitor <vitor.savian@suse.com>
2023-11-13 06:39:24 -08:00
Hussein Galal 9e13aad4a8
Update traefik to fix registry value (#8792)
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2023-11-06 23:37:21 +02:00
Hussein Galal 1ae053d944
Upgrade traefik chart to v25.0.0 (#8771)
* Upgrade traefik chart to v25.0.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go generate

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2023-11-03 01:55:03 +02:00
Texot f575a05be2
fix: Access outer scope .SystemdCgroup (#8761)
Signed-off-by: Texot <tete1030@gmail.com>
2023-11-02 10:47:16 -07:00
Brad Davidson 49411e7084 Don't try to read token hash and cluster id during cluster-reset
These fields are only necessary when saving snapshots to S3, and will block restoration if attempted

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-27 15:06:29 -07:00
Brad Davidson 5b6b9685e9 Manually requeue configmap reconcile when no nodes have reconciled snapshots
Silences error message from lasso - this is a normal startup condition
when no snapshots exist so we shouldn't log nasty looking errors.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-18 15:09:25 -07:00
Brad Davidson 3db1d33282 Re-enable etcd endpoint auto-sync
Removing this in 002e6c43ee regressed
control-plane-only nodes, as we rely on the etcd client to update its
endpoint list internally so that we can use it to sync the load-balancer
address list.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-18 08:33:03 -07:00
Brad Davidson b8dc95539b Fix CloudDualStackNodeIPs feature-gate inconsistency
Enable the feature-gate for both kubelet and cloud-controller-manager. Enabling it on only one side breaks RKE2, where feature-gates are not shared due to running in different processes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-17 10:40:12 -07:00
Sean Yen 0c9bf36fe0
[K3s][Windows Port] Build script, multi-call binary, and Flannel (#7259)
* initial windows port.

Signed-off-by: Sean Yen <seanyen@microsoft.com>
Signed-off-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Wei Ran <weiran@microsoft.com>
2023-10-16 14:53:09 -04:00
Derek Nola aaf8409096
Use version.Program not K3s in log (#8653)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-10-16 11:02:12 -07:00
Brad Davidson 9597ea1183 Start etcd client before ensuring self removal
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-13 23:24:16 -07:00
Brad Davidson 3abc8b82ed Bump traefik, golang.org/x/net, google.golang.org/grpc
Fixes exposure to CVE-2023-39325

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-13 09:45:54 -07:00
Roberto Bonafiglia 1ffb4603cd Use IPv6 when it is the first configured IP with dual-stack
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2023-10-13 10:23:31 +02:00