Since a recent commit, rootless mode was failing with the following errors:
```
E0122 22:59:47.615567 21 kuberuntime_manager.go:755] createPodSandbox for pod "helm-install-traefik-wf8lc_kube-system(9de0a1b2-e2a2-4ea5-8fb6-22c9272a182f)" failed: rpc error: code = Unknown desc = failed to create network namespace for sandbox "285ab835609387f82d304bac1fefa5fb2a6c49a542a9921995d0c35d33c683d5": failed to setup netns: open /var/run/netns/cni-c628a228-651e-e03e-d27d-bb5e87281846: permission denied
...
E0122 23:31:34.027814 21 pod_workers.go:191] Error syncing pod 1a77d21f-ff3d-4475-9749-224229ddc31a ("coredns-854c77959c-w4d7g_kube-system(1a77d21f-ff3d-4475-9749-224229ddc31a)"), skipping: failed to "CreatePodSandbox" for "coredns-854c77959c-w4d7g_kube-system(1a77d21f-ff3d-4475-9749-224229ddc31a)" with CreatePodSandboxError: "CreatePodSandbox for pod \"coredns-854c77959c-w4d7g_kube-system(1a77d21f-ff3d-4475-9749-224229ddc31a)\" failed: rpc error: code = Unknown desc = failed to create containerd task: io.containerd.runc.v2: create new shim socket: listen unix /run/containerd/s/8f0e40e11a69738407f1ebaf31ced3f08c29bb62022058813314fb004f93c422: bind: permission denied\n: exit status 1: unknown"
```
Remove symlinks to /run/{netns,containerd} so that rootless mode can create their own /run/{netns,containerd}.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* Replace k3s cloud provider wrangler controller with core node informer
Upstream k8s has exposed an interface for cloud providers to access the
cloud controller manager's node cache and shared informer since
Kubernetes 1.9. This is used by all the other in-tree cloud providers;
we should use it too instead of running a dedicated wrangler controller.
Doing so also appears to fix an intermittent issue with the uninitialized
taint not getting cleared on nodes in CI.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Problem:
While using ZFS on debian and K3s with docker, I am unable to get k3s working as the snapshotter value is being validated and the validation fails.
Solution:
We should not validate snapshotter value if we are using docker as it's a no-op in that case.
Signed-off-by: Waqar Ahmed <waqarahmedjoyia@live.com>
We're not setting ``--service-account-issuer` to a https URL, which causes an
error message at startup when the feature gate is enabled. From the
docs on that flag:
> If this option is not a valid URI per the OpenID Discovery 1.0 spec, the
> ServiceAccountIssuerDiscovery feature will remain disabled, even if the
> feature gate is set to true. It is highly recommended that this value
> comply with the OpenID spec:
> https://openid.net/specs/openid-connect-discovery-1_0.html. In practice,
> this means that service-account-issuer must be an https URL. It is also
> highly recommended that this URL be capable of serving OpenID discovery
> documents at {service-account-issuer}/.well-known/openid-configuration.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Solution: Set priorityClassName to system-node-critical of traefik, metrics-server, local storage and coredns deployment
Signed-off-by: transhapHigsn <fet.prashantsingh@gmail.com>
Ubuntu and Debian kernels support mounting real overlayfs inside userns,
but the vanilla kernel still does not allow it.
OTOH fuse-overlayfs can be mounted inside userns with the vanilla kernel (>= 4.18).
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Removing the cfg.DataDir mutation in 3e4fd7b did not break anything, but
did change some paths in unwanted ways. Rather than mutating the
user-supplied command-line flags, explicitly specify the agent
subdirectory as needed.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
As per documentation, the cloud-provider flag should not be passed to
controller-manager when using cloud-controller. However, the legacy
cloud-related controllers still need to be explicitly disabled to
prevent errors from being logged.
Fixing this also prevents controller-manager from creating the
cloud-controller-manager service account that needed extra RBAC.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Related to rancher/rke2#474
Note that anyone who customizes the data-dir path will have to set
CRI_CONFIG_FILE to the correct path when using the wrapped binaries
(crictl, etc). This is better than dropping files in the incorrect
location.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Resolves warning 2 from #2471.
As per https://github.com/kubernetes/cloud-provider/issues/12 the
ClusterID requirement was never really followed through on, so the
flag is probably going to be removed in the future.
One side-effect of this is that the core k8s cloud-controller-manager
also wants to watch nodes, and needs RBAC to do so.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Related to #2455 and containerd/containerd#4684
These were not meant to be enabled by default, break images with many
layers, and will be disabled by default on the next containerd release.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
The --disable/--no-deploy flags actually turn off some built-in
controllers, in addition to preventing manifests from getting loaded.
Make it clear which controllers can still be disabled even when the
packaged components are ommited by the no_stage build tag.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>