If health checks are failing for all servers, make a second pass through the server list with health checks ignored before returning failure.
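A minimal sketch of the two-pass behavior described above, not the actual load balancer code. The `server` type, its `healthy` and `dial` fields, and `dialAny` are hypothetical names introduced only for illustration.

```go
package main

import "errors"

// server is a hypothetical stand-in for one load balancer backend.
type server struct {
	address string
	healthy func() bool  // health check result (assumed interface)
	dial    func() error // connection attempt (assumed interface)
}

// dialAny walks the server list twice. The first pass skips servers whose
// health check is failing; the second pass ignores health checks entirely.
// An error is returned only if both passes fail to connect.
func dialAny(servers []server) (string, error) {
	for _, ignoreHealth := range []bool{false, true} {
		for _, s := range servers {
			if !ignoreHealth && !s.healthy() {
				continue
			}
			if err := s.dial(); err == nil {
				return s.address, nil
			}
		}
	}
	return "", errors.New("all servers failed")
}
```

With this shape, a stale or overly pessimistic health check cannot by itself make every server unreachable: the second pass still attempts each one.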
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ca39614d4e)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
It is conceivable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ed23a2bb48)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Refactor agent supervisor listener startup and authn/authz to use upstream
auth delegators to perform SubjectAccessReview for access to
metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
to the pprof endpoint now requires client cert auth, similar to the
spegel registry api endpoint.
* Add prometheus metrics handler.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit ff679fb3ab)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Start shared informer caches when the k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the two elections were won by different nodes, some informers wouldn't be started.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 3d14092f76)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Just enable IP forwarding for all address families regardless of service address families.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 095ecdb034)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Will now use 127.0.0.1:10010, same as containerd's CRI
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
(cherry picked from commit 7374010c0c)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
The default clientaccess request timeout is too short. Wait longer by default, and add the s3 timeout if s3 is enabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Update traefik chart to bump image tag and fix quoting
* Fix image quoting in flat manifests
* Update local-path-provisioner config to stop using deprecated hostpath volume type
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Prefer the address of the etcd member being joined, and seed the full address list immediately on startup.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Bump spegel to v0.0.20-k3s1
* Remove deprecated libp2p Pretty function
* Remove quic-go pin
The pinned version is now out of date; indirect dependencies are now newer and include the CVE fix.
Signed-off-by: Derek Nola <derek.nola@suse.com>
* Adds support for health-checking loadbalancer servers. If a
health-check fails when dialing, all existing connections to the
server will be closed.
* Wires up a remotedialer tunnel connectivity check as the health check
for supervisor/apiserver connections.
* Wires up a simple ping request to the supervisor port as the health
check for etcd connections.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
CRI and containerd APIs disagree about the registry names - CRI supports
index.docker.io as an alias for docker.io, while containerd does not.
Use the actual stored RepoTag to determine what image to ask containerd for.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Don't clobber the providerID field and instance-type/region/zone labels if provided by the kubelet. This allows the user to set these to the correct values when using the embedded CCM in a real cloud environment.
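A simplified sketch of the "fill in only what is missing" rule, using a trimmed-down stand-in for the Node object. The `node` type, `applyDefaults`, and the default values passed to it are illustrative assumptions, not the embedded CCM's actual code.

```go
package main

// node is a hypothetical, trimmed-down stand-in for a Kubernetes Node.
type node struct {
	ProviderID string
	Labels     map[string]string
}

// applyDefaults sets the provider ID and labels only where the kubelet has
// not already supplied a value, so user-provided settings are preserved.
func applyDefaults(n *node, defaultProviderID string, defaultLabels map[string]string) {
	if n.ProviderID == "" {
		n.ProviderID = defaultProviderID
	}
	if n.Labels == nil {
		n.Labels = map[string]string{}
	}
	for key, value := range defaultLabels {
		if _, exists := n.Labels[key]; !exists {
			n.Labels[key] = value
		}
	}
}
```

For example, a node that already carries a `topology.kubernetes.io/zone` label keeps it, while a missing `node.kubernetes.io/instance-type` label is filled in from the defaults.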
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Prevents joining nodes from being stuck with a bad initial member list if there is a transient failure, or if they try to join themselves.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fix the wasm shim detection and the containerd configuration generation.
Prior to this commit, the binary and the `RuntimeType` values were not
correct.
Signed-off-by: Flavio Castelli <fcastelli@suse.com>