* Adds support for health-checking loadbalancer servers. If a
health-check fails when dialing, all existing connections to the
server will be closed.
* Wires up a remotedialer tunnel connectivity check as the health check
for supervisor/apiserver connections.
* Wires up a simple ping request to the supervisor port as the health
check for etcd connections.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
CRI and containerd APIs disagree about the registry names - CRI supports
index.docker.io as an alias for docker.io, while containerd does not.
Use the actual stored RepoTag to determine what image to ask containerd for.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Don't clobber the providerID field and instance-type/region/zone labels if provided by the kubelet. This allows the user to set these to the correct values when using the embedded CCM in a real cloud environment.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Prevents joining nodes from being stuck with bad initial member list if there is a transient failure, or if they try to join themselves
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fix the wasm shim detection and the containerd configuration generation.
Prior to this commit, the binary and the `RuntimeType` values were not
correct.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
* Set ServerNodeName in snapshot CLI setup
* Raise errer if ServerNodeName ends up empty some other way
* Fix status controller to use etcd node name annotation instead of prefix checking
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Add both dual-stack addresses to the node hosts file
* Add hostname to hosts file as alias for node name to ensure consistent resolution
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Reorder copy order for caching
* Enable longer http timeout requests
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Setup reencrypt controller to run on all apiserver nodes
* Fix reencryption for disabling secrets encryption, reenable drone tests
The nodes controller was reading from the configmaps cache, but doesn't add any handlers, so if no other controller added configmap handlers, the cache would remain empty.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Fix issue with bare host or IP as endpoint
* Fix issue with localhost registries not defaulting to http.
* Move the registry template prep to a separate function,
and adds tests of that function so that we can ensure we're
generating the correct content.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Fixes issue where proxy support only honored server address via K3S_URL, not CLI or config.
* Fixes crash when agent proxy is enabled, but proxy env vars do not return a proxy URL for the server address (server URL is in NO_PROXY list).
* Adds tests
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Moving it into config.Agent so that we can use or modify it outside the context of containerd setup
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Layer leases never did what we wanted anyways, and this is the new approved interface for ensuring that images do not get GCd
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Forces other groups packaging k3s to intentionally choose to build k3s with an unvalidated golang version
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Render cri registry mirrors.x.endpoints and configs.x.tls into config_path; keep
using mirrors.x.rewrites and configs.x.auth those do not yet have an
equivalent in the new format.
The new config file format allows disabling containerd's fallback to the
default endpoint when using mirror endpoints; a new CLI flag is added to
control that behavior.
This also re-shares some code that was unnecessarily split into parallel
implementations for linux/windows versions. There is probably more work
to be done on this front but it's a good start.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If a full reconcile wins the race against sync of an individual snapshot resource, or someone intentionally deletes the configmap, the data map could be nil and cause a crash.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
If the feature-gate is enabled, use status.hostIPs for dual-stack externalTrafficPolicy=Local support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Remove KubeletCredentialProviders and JobTrackingWithFinalizers feature-gates, both of which are GA and cannot be disabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
While some implementations may support it, it appears that most don't,
and some may in fact return an error if it is requested.
We already stat the object to get the metadata anyway, so this was
unnecessary if harmless on most implementations.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Disable helm CRD installation for disable-helm-controller
The NewContext package requires config as input which would
require all third-party callers to update when the new go module
is published.
This change only affects the behaviour of installation of helm
CRDs. Existing helm crds installed in a cluster would not be removed
when disable-helm-controller flag is set on the server.
Addresses #8701
* address review comments
* remove redundant check
Signed-off-by: Harsimran Singh Maan <maan.harry@gmail.com>
* Tweaked order of ingress IPs in ServiceLB
Previously, ingress IPs were only string-sorted when returned
Sorted by IP family and string-sorted in each family as part of
filterByIPFamily method
* Update pkg/cloudprovider/servicelb.go
* Formatting
Signed-off-by: Jason Costello <jason@hazy.com>
Co-authored-by: Brad Davidson <brad@oatmail.org>
Omit snapshot list configmap entries for snapshots without extra metadata; reduce log level of warnings about missing s3 metadata files.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Problem:
Configuring qos-class features in containerd requres a custom containerd configuration template.
Solution:
Look for configuration files in default locations and configure containerd to use them if they exist.
Signed-off-by: Oliver Larsson <larsson.e.oliver@gmail.com>
Create a generic helper function that finds extra containerd runtimes.
The code was originally inside of the nvidia container discovery file.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
Discover the containerd shims based on runwasi that are already
available on the node.
The runtimes could have been installed either by a package manager or by
the kwasm operator.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
The containerd configuration on a Linux system now handles the nvidia
and the WebAssembly runtimes.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
---------
Signed-off-by: Flavio Castelli <fcastelli@suse.com>
These fields are only necessary when saving snapshots to S3, and will block restoration if attempted
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Silences error message from lasso - this is a normal startup condition
when no snapshots exist so we shouldn't log nasty looking errors.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Removing this in 002e6c43ee regressed
control-plane-only nodes, as we rely on the etcd client to update its
endpoint list internally so that we can use it to sync the load-balancer
address list.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Enable the feature-gate for both kubelet and cloud-controller-manager. Enabling it on only one side breaks RKE2, where feature-gates are not shared due to running in different processes.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* initial windows port.
Signed-off-by: Sean Yen <seanyen@microsoft.com>
Signed-off-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Wei Ran <weiran@microsoft.com>
Write the extra metadata both locally and to S3. These files are placed such that they will not be used by older versions of K3s that do not make use of them.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Consolidate NewCertCommands
* Add support for user defined new token
* Add E2E testlets
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Ensure agent token also changes
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Add --image-service-endpoint flag
Problem:
External container runtime can be set but image service endpoint is unchanged
and also is not exposed as a flag. This is useful for using containerd
snapshotters outside of the ones that have built-in support like
stargz-snapshotter.
Solution:
Add a flag --image-service-endpoint and also default image service endpoint to
container runtime endpoint if set.
Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
* Update to v1.28.2
Signed-off-by: Johnatas <johnatasr@hotmail.com>
* Bump containerd and stargz versions
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Print message on upgrade fail
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Send Bad Gateway instead of Service Unavailable when tunnel dial fails
Works around new handling for Service Unavailable by apiserver aggregation added in kubernetes/kubernetes#119870
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Add 60 seconds to server upgrade wait to account for delays in apiserver readiness
Also change cleanup helper to ensure upgrade test doesn't pollute the
images for the rest of the tests.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
---------
Signed-off-by: Johnatas <johnatasr@hotmail.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Co-authored-by: Brad Davidson <brad.davidson@rancher.com>