When kubelet starts a pod that refers to non-existing PV, PVC or Node, it
should clearly show that the requested element does not exist.
Previous "PersistentVolumeClaim 'default/ceph-claim-wm' is not in cache"
looks like random kubelet hiccup, while "PersistentVolumeClaim
'default/ceph-claim-wm' not found" suggests that the object may not exist at
all and it might be an user error.
Fixes#27523
Implements part of #24071
I am not familiar with the scheduler enough to know what to do with the scores. Punting for now.
Missing items from the implementation plan: limitranger, rkt support, kubectl
support and user docs
DONE:
1. refactor all predicates: predicates return fitOrNot(bool) and error(Error) in which the latter is of type
PredicateFailureError or InsufficientResourceError. (For violation of either MaxEBSVolumeCount or
MaxGCEPDVolumeCount, returns one same error type as ErrMaxVolumeCountExceeded)
2. GeneralPredicates() is a predicate function, which includes serveral other predicate functions (PodFitsResource,
PodFitsHost, PodFitsHostPort). It is registered as one of the predicates in DefaultAlgorithmProvider, and
is also called in canAdmitPod() in Kubelet and should be called by other components (like rescheduler, etc)
if necessary. See discussion in issue #12744
3. remove podNumber check from GeneralPredicates
4. HostName is now verified in Kubelet's canAdminPod(). add TestHostNameConflicts in kubelet_test.go
5. add getNodeAnyWay() method in Kubelet to get node information in standaloneMode
TODO:
1. determine which predicates should be included in GeneralPredicates()
2. separate GeneralPredicates() into:
a. GeneralPredicatesEvictPod() and
b. GeneralPredicatesNotEvictPod()
3. DaemonSet should use GeneralPredicates()
For certain volume types (e.g. AWS EBS or GCE PD), a limitted
number of such volumes can be attached to a given node. This commit
introduces a predicate with allows cluster admins to cap
the maximum number of volumes matching a particular type attached to a
given node.
The volume type is configurable by passing a pair of filter functions,
and the maximum number of such volumes is configurable to allow node
admins to reserve a certain number of volumes for system use.
By default, the predicate is exposed as MaxEBSVolumeCount and
MaxGCEPDVolumeCount (for AWS ElasticBlocKStore and GCE PersistentDisk
volumes, respectively), each of which can be configured using the
`KUBE_MAX_PD_VOLS` environment variable.
Fixes#7835
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.
So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels. This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.
Support for tagging GCE PVs will follow.
Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).