Automatic merge from submit-queue
kubenet: Fix host port for rktnetes.
Because rkt pod runs after plugin.SetUpPod() is called, so
getRunningPods() does not return the newly created pod, which
causes the hostport iptable rules to be missing for this new pod.
cc @dcbw @freehan
A follow up fix for https://github.com/kubernetes/kubernetes/pull/27878#issuecomment-227898936
Because rkt pod runs after plugin.SetUpPod() is called, so
getRunningPods() does not return the newly created pod, which
causes the hostport iptable rules to be missing for this new pod.
Use the generic runtime method to get the netns path. Also
move reading the container IP address into cni (based off kubenet)
instead of having it in the Docker manager code. Both old and new
methods use nsenter and /sbin/ip and should be functionally
equivalent.
Automatic merge from submit-queue
Sets IgnoreUnknown=1 in CNI_ARGS
```release-note
release-note-none
```
K8 uses CNI_ARGS to pass pod namespace, name and infra container
id to the CNI network plugin. CNI logic will throw an error
if these args are not known to it, unless the user specifies
IgnoreUnknown as part of CNI_ARGS. This PR sets IgnoreUnknown=1
to prevent the CNI logic from erroring and blocking pod setup.
https://github.com/appc/cni/pull/158https://github.com/appc/cni/issues/126
Automatic merge from submit-queue
Various kubenet fixes (panics and bugs and cidrs, oh my)
This PR fixes the following issues:
1. Corrects an inverse error-check that prevented `shaper.Reset` from ever being called with a correct ip address
2. Fix an issue where `parseCIDR` would fail after a kubelet restart due to an IP being stored instead of a CIDR being stored in the cache.
3. Fix an issue where kubenet could panic in TearDownPod if it was called before SetUpPod (e.g. after a kubelet restart).. because of bug number 1, this didn't happen except in rare situations (see 2 for why such a rare situation might happen)
This adds a test, but more would definitely be useful.
The commits are also granular enough I could split this up more if desired.
I'm also not super-familiar with this code, so review and feedback would be welcome.
Testing done:
```
$ cat examples/egress/egress.yml
apiVersion: v1
kind: Pod
metadata:
labels:
name: egress
name: egress-output
annotations: {"kubernetes.io/ingress-bandwidth": "300k"}
spec:
restartPolicy: Never
containers:
- name: egress
image: busybox
command: ["sh", "-c", "sleep 60"]
$ cat kubelet.log
...
Running: tc filter add dev cbr0 protocol ip parent 1:0 prio 1 u32 match ip dst 10.0.0.5/32 flowid 1:1
# setup
...
Running: tc filter del dev cbr0 parent 1:proto ip prio 1 handle 800::800 u32
# teardown
```
I also did various other bits of manual testing and logging to hunt down the panic and other issues, but don't have anything to paste for that
cc @dcbw @kubernetes/sig-network
Automatic merge from submit-queue
rkt: Pass through podIP
This is needed for the /etc/hosts mount and the downward API to work.
Furthermore, this is required for the reported `PodStatus` to be
correct.
The `Status` bit mostly worked prior to #25062, and this restores that
functionality in addition to the new functionality.
In retrospect, the regression in status is large enough the prior PR should have included at least some of this; my bad for not realizing the full implications there.
#25902 is needed for downwards api stuff, but either merge order is fine as neither will break badly by itself.
cc @yifan-gu @dcbw
The length of an IP can be 4 or 16, and even if 16 it can be a valid
ipv4 address. This check is the more-correct way to handle this, and it
also provides more granular error messages.
Teardown can run before Setup when the kubelet is restarted... in that
case, the shaper was nil and thus calling the shaper resulted in a panic
This fixes that by ensuring the shaper is always set... +1 level of
indirection and all that.
Before this change, the podCIDRs map contained both cidrs and ips
depending on which code path entered a container into it.
Specifically, SetUpPod would enter a CIDR while GetPodNetworkStatus
would enter an IP.
This normalizes both of them to always enter just IP addresses.
This also removes the now-redundant cidr parsing that was used to get
the ip before
When the IP isn't in the internal map, GetPodNetworkStatus() needs
to call the execer for the 'nsenter' program. That means the execer
needs to be !nil, which it wasn't before.
Automatic merge from submit-queue
Make IsQualifiedName return error strings
Part of the larger validation PR, broken out for easier review and merge.
@lavalamp FYI, but I know you're swamped, too.
Automatic merge from submit-queue
kubenet try to retrieve ip inside pod net namespace
Kubenet currently stores the ips of pods inside a map. Kubelet gets pod ip from kubenet during syncpod. If Kubelet restarts, all pods on the node lost their ips in podStatus. This PR adds logic to retrieve pod IP from pod netns.
cc: @yujuhong
Automatic merge from submit-queue
kubenet: fix up CNI bridge TX queue length if needed
CNI's bridge plugin mis-handles the TxQLen when creating the bridge,
leading to a zero-length TX queue. This doesn't typically cause
problems (since virtual interfaces don't have hard queue limits)
but when adding traffic shaping, some qdiscs pull their packet
limits from the TX queue length, leading to a packet limit of 0
in some cases. Until we can depend on a new enough version of
CNI, fix up the TX queue length internally.
Closes: https://github.com/kubernetes/kubernetes/issues/25092
CNI's bridge plugin mis-handles the TxQLen when creating the bridge,
leading to a zero-length TX queue. This doesn't typically cause
problems (since virtual interfaces don't have hard queue limits)
but when adding traffic shaping, some qdiscs pull their packet
limits from the TX queue length, leading to a packet limit of 0
in some cases. Until we can depend on a new enough version of
CNI, fix up the TX queue length internally.
K8 uses CNI_ARGS to pass pod namespace, name and infra container
id to the CNI network plugin. CNI logic will throw an error
if these args are not known to it, unless the user specifies
IgnoreUnknown as part of CNI_ARGS. This PR sets IgnoreUnknown=1
to prevent the CNI logic from erroring and blocking pod setup.
https://github.com/appc/cni/pull/158https://github.com/appc/cni/issues/126