Commit Graph

54 Commits (a4574bbb042402974a0269651ca42bc9e8137c72)

Author SHA1 Message Date
Di Xu 57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
Di Xu 46b0b3491f refactor nsenter to new pkg/util 2017-09-26 09:56:44 +08:00
Kubernetes Submit Queue 7d9eb60837 Merge pull request #51518 from jianglingxia/jlx8291910
Automatic merge from submit-queue (batch tested with PRs 43016, 50503, 51281, 51518, 51582). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

double const in mount_linux.go

**What this PR does / why we need it**:
fix some typo and double const
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-22 23:35:59 -07:00
Kubernetes Submit Queue 980a8e6367 Merge pull request #50503 from karataliu/mount_clean
Automatic merge from submit-queue (batch tested with PRs 43016, 50503, 51281, 51518, 51582). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..

Clean up diskLooksUnformatted literal

**What this PR does / why we need it**:
#16948 moved the `formatAndMount` function to mount_linux.go, but `diskLooksUnformatted` does not necessarily need to appear in mount_unsupported.go
#31515 Renames `diskLooksUnformatted` to `getDiskFormat`, but did not update the comment

This is to do the small cleanup.

**Which issue this PR fixes**

**Special notes for your reviewer**:

**Release note**:
2017-09-22 23:35:54 -07:00
xiazhang 82c909cc99 enable azure disk mount on windows node
add initial work for mount azure file on windows

fix review comments

full implementation for attach azure file on windows node

working azure file mount

remove useless functions

add a workable implementation about mounting azure file on windows node

fix review comments and make the pod creating successful even azure file mount failed

fix according to review comments

add mount_windows_test

add implementation for IsLikelyNotMountPoint func

remove mount_windows_test.go temporaly

add back unit test for mount_windows.go

add normalizeWindowsPath func

fix normalizeWindowsPath func issue

implment azure disk on windows

update bazel BUILD

revert validation.go change as it's another PR

fix merge issue and compiling issue

fix windows compiling issue

fix according to review comments

fix according to review comments

fix cross-build failure

fix according to review comments

fix test build failure temporalily

fix darwin build failure

fix azure windows test failure

add empty implementation of MakeRShared on windows

fix gofmt errors
2017-09-12 01:52:48 +00:00
jianglingxia 4629c8a54e squash the commits into one 2017-09-04 09:44:53 +08:00
Jan Safranek d9500105d8 Share /var/lib/kubernetes on startup
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
2017-08-30 16:45:04 +02:00
Jan Safranek 0e547bae22 SafeFormatAndMount should use volume.Exec provided by VolumeHost
We need to execute mkfs / fsck where the utilities are.
2017-08-14 12:16:27 +02:00
Dong Liu a6ff000ea5 Clean up diskLooksUnformatted literal 2017-08-11 16:11:39 +08:00
Kubernetes Submit Queue 68ac78ae45 Merge pull request #49640 from jsafrane/systemd-mount-service
Automatic merge from submit-queue

Run mount in its own systemd scope.

Kubelet needs to run /bin/mount in its own cgroup.

- When kubelet runs as a systemd service, "systemctl restart kubelet" may kill all processes in the same cgroup and thus terminate fuse daemons that are needed for gluster and cephfs mounts.

- When kubelet runs in a docker container, restart of the container kills all fuse daemons started in the container.

Killing fuse daemons is bad, it basically unmounts volumes from running pods.

This patch runs mount via "systemd-run --scope /bin/mount ...", which makes sure that any fuse daemons are forked in its own systemd scope (= cgroup) and they will survive restart of kubelet's systemd service or docker container.

This helps with #34965

As a downside, each new fuse daemon will run in its own transient systemd service and systemctl output may be cluttered.

@kubernetes/sig-storage-pr-reviews 
@kubernetes/sig-node-pr-reviews 

```release-note
fuse daemons for GlusterFS and CephFS are now run in their own systemd scope when Kubernetes runs on a system with systemd.
```
2017-08-09 12:05:01 -07:00
Jan Safranek dd03384747 Detect systemd on mounter startup 2017-08-08 15:40:27 +02:00
Kubernetes Submit Queue 5d24a2c199 Merge pull request #49300 from tklauser/syscall-to-x-sys-unix
Automatic merge from submit-queue

Switch from package syscall to golang.org/x/sys/unix

**What this PR does / why we need it**:

The syscall package is locked down and the comment in https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24 advises to switch code to use the corresponding package from golang.org/x/sys. This PR does so and replaces usage of package syscall with package golang.org/x/sys/unix where applicable. This will also allow to get updates and fixes
without having to use a new go version.

In order to get the latest functionality, golang.org/x/sys/ is re-vendored. This also allows to use Eventfd() from this package instead of calling the eventfd() C function.

**Special notes for your reviewer**:

This follows previous works in other Go projects, see e.g. moby/moby#33399, cilium/cilium#588

**Release note**:

```release-note
NONE
```
2017-08-03 04:02:12 -07:00
Jan Safranek 5a8a6110a2 Run mount in its own systemd scope.
Kubelet needs to run /bin/mount in its own cgroup.

- When kubelet runs as a systemd service, "systemctl restart kubelet" may kill
  all processes in the same cgroup and thus terminate fuse daemons that are
  needed for gluster and cephfs mounts.

- When kubelet runs in a docker container, restart of the container kills all
  fuse daemons started in the container.

Killing fuse daemons is bad, it basically unmounts volumes from running pods.

This patch runs mount via "systemd-run --scope /bin/mount ...", which makes
sure that any fuse daemons are forked in its own systemd scope (= cgroup) and
they will survive restart of kubelet's systemd service or docker container.

As a downside, each new fuse daemon will run in its own transient systemd
service and systemctl output may be cluttered.
2017-07-26 16:14:39 +02:00
Tobias Klauser 4a69005fa1 switch from package syscall to x/sys/unix
The syscall package is locked down and the comment in [1] advises to
switch code to use the corresponding package from golang.org/x/sys. Do
so and replace usage of package syscall with package
golang.org/x/sys/unix where applicable.

  [1] https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24

This will also allow to get updates and fixes for syscall wrappers
without having to use a new go version.

Errno, Signal and SysProcAttr aren't changed as they haven't been
implemented in /x/sys/. Stat_t from syscall is used if standard library
packages (e.g. os) require it. syscall.SIGTERM is used for
cross-platform files.
2017-07-21 12:14:42 +02:00
ymqytw 3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Ian Chakeres 2b18d3b6f7 Fixes bind-mount teardown failure with non-mount point Local volumes
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
2017-07-11 17:19:58 -04:00
Jan Safranek 4cf36b8b39 Do not reformat devices with partitions
lsblk reports FSTYPE of devices with partition tables as empty string "",
which is indistinguishable from empty devices. We must look for dependent
devices (i.e. partitions) to see that the device is really empty and report
error otherwise.

I checked that LVM, LUKS and MD RAID have their own FSTYPE in lsblk output,
so it should be only a partition table that has empty FSTYPE.

The main point of this patch is to run lsblk without "-n", i.e. print all
dependent devices and check if they're there.
2017-03-20 13:08:13 +01:00
Kubernetes Submit Queue 81d01a84e0 Merge pull request #41944 from jingxu97/Feb/mounter
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)

Use chroot for containerized mounts

This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-28 09:20:21 -08:00
Kubernetes Submit Queue a426904009 Merge pull request #31515 from jsafrane/format-error
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)

Show specific error when a volume is formatted by unexpected filesystem.

kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.

Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.

Example kubectl describe pod output:

```
  FirstSeen     LastSeen        Count   From                                    SubobjectPath   Type            Reason          Message
  41s           3s              7       {kubelet ip-172-18-3-82.ec2.internal}                   Warning         FailedMount     MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/aws://us-east-1d/vol-ba79c81d" (spec.Name: "pvc-ce175cbb-6b82-11e6-9fe4-0e885cca73d3") pod "3d19cb64-6b83-11e6-9fe4-0e885cca73d3" (UID: "3d19cb64-6b83-11e6-9fe4-0e885cca73d3") with: failed to mount the volume as "ext4", it's already formatted with "xfs". Mount error: mount failed: exit status 32
Mounting arguments: /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-ba79c81d ext4 [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdba,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
```
2017-02-25 02:17:57 -08:00
Jing Xu ac22416835 Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-24 13:46:26 -08:00
Harry Zhang 3bdc3f25ec Use fnv.New32a() in hash instead adler32 2017-02-15 14:03:54 +08:00
Jan Safranek c8df30973b Show specific error when a volume is formatted by unexpected filesystem.
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.

Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
2017-02-13 12:15:34 +01:00
Kubernetes Submit Queue 6dfe5c49f6 Merge pull request #38865 from vwfs/ext4_no_lazy_init
Automatic merge from submit-queue

Enable lazy initialization of ext3/ext4 filesystems

**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240

**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```

**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.

These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount

Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```

The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.

When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system. 

As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory. 

I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.

As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
2017-01-18 09:09:52 -08:00
deads2k 6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Alexander Block 13a2bc8afb Enable lazy initialization of ext3/ext4 filesystems 2016-12-18 11:08:51 +01:00
Jing Xu 37136e9780 Enable containerized mounter only for nfs and glusterfs types
This change is to only enable containerized mounter for nfs and
glusterfs types. For other types such as tmpfs, ext2/3/4 or empty type,
we should still use mount from $PATH
2016-12-02 15:06:24 -08:00
Pengfei Ni f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
Vishnu kannan dd8ec911f3 Revert "Revert "Merge pull request #35821 from vishh/gci-mounter-scope""
This reverts commit 402116aed4.
2016-11-08 11:09:10 -08:00
saadali 402116aed4 Revert "Merge pull request #35821 from vishh/gci-mounter-scope"
This reverts commit 973fa6b334, reversing
changes made to 41b5fe86b6.
2016-11-03 20:23:25 -07:00
Vishnu Kannan 414e4ae549 Revert "Adding a root filesystem override for kubelet mounter"
This reverts commit e861a5761d.
2016-11-02 15:18:09 -07:00
Vishnu Kannan 1ecc12f724 [Kubelet] Do not use custom mounter script for bind mounts, ext* and tmpfs mounts
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2016-11-02 15:18:08 -07:00
Vishnu kannan 7fd03c4b6e Fix source and target path with overriden rootfs in mount utility package
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-27 09:46:33 -07:00
Vishnu kannan e861a5761d Adding a root filesystem override for kubelet mounter
This is useful for supporting hostPath volumes via containerized
mounters in kubelet.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-26 21:42:59 -07:00
Jing Xu b02481708a Fix volume states out of sync problem after kubelet restarts
When kubelet restarts, all the information about the volumes will be
gone from actual/desired states. When update node status with mounted
volumes, the volume list might be empty although there are still volumes
are mounted and in turn causing master to detach those volumes since
they are not in the mounted volumes list. This fix is to make sure only
update mounted volumes list after reconciler starts sync states process.
This sync state process will scan the existing volume directories and
reconstruct actual states if they are missing.

This PR also fixes the problem during orphaned pods' directories. In
case of the pod directory is unmounted but has not yet deleted (e.g.,
interrupted with kubelet restarts), clean up routine will delete the
directory so that the pod directoriy could be cleaned up (it is safe to
delete directory since it is no longer mounted)

The third issue this PR fixes is that during reconstruct volume in
actual state, mounter could not be nil since it is required for creating
container.VolumeMap. If it is nil, it might cause nil pointer exception
in kubelet.

Details are in proposal PR #33203
2016-10-25 12:29:12 -07:00
Michael Taufen dba917c5b7 Include mount command in Kubelet mounter output 2016-10-24 05:50:24 -07:00
Jing Xu 34ef93aa0c Add mounterPath to mounter interface
In order to be able to use new mounter library, this PR adds the
mounterPath flag to kubelet which passes the flag to the mount
interface. If flag is empty, mount uses default mount path.
2016-10-20 14:15:27 -07:00
Brendan Burns 07c8f9a173 Don't return an error if a file doesn't exist for IsPathDevice(...) 2016-09-05 20:45:22 -07:00
Morgan Bauer 92a043e833
ensure pkg/util/mount compiles & crosses
- move compile time check from linux code to generic code
2016-08-21 17:47:24 -07:00
Jing Xu f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Scott Creeley 11d1289afa Add volume and mount logging 2016-07-21 09:10:00 -04:00
Cindy Wang e13c678e3b Make volume unmount more robust using exclusive mount w/ O_EXCL 2016-07-18 16:20:08 -07:00
David McMahon ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
laushinka 7ef585be22 Spelling fixes inspired by github.com/client9/misspell 2016-02-18 06:58:05 +07:00
Sami Wagiaalla c18f342ac6 Use constants for fsck return values 2015-12-08 10:51:12 -05:00
Sami Wagiaalla 10688f1a11 Run fsck before formatting disk
Signed-off-by: Sami Wagiaalla <swagiaal@redhat.com>
2015-12-08 10:50:30 -05:00
Sami Wagiaalla 1880c4eedb move formatAndMount and diskLooksUnformatted to mount_linux 2015-11-06 15:37:46 -05:00
Huamin Chen 3b14135cad mount returns more verbose message upon error
Signed-off-by: Huamin Chen <hchen@redhat.com>
2015-10-21 11:52:02 -04:00
Eric Paris f125ad88ce Rename IsMountPoint to IsLikelyNotMountPoint
IsLikelyNotMountPoint determines if a directory is not a mountpoint.
It is fast but not necessarily ALWAYS correct. If the path is in fact
a bind mount from one part of a mount to another it will not be detected.
mkdir /tmp/a /tmp/b; mount --bin /tmp/a /tmp/b; IsLikelyNotMountPoint("/tmp/b")
will return true. When in fact /tmp/b is a mount point. So this patch
renames the function and switches it from a positive to a negative (I
could think of a good positive name). This should make future users of
this function aware that it isn't quite perfect, but probably good
enough.
2015-08-14 18:45:43 -04:00
markturansky 450002a52e Fixed formatting of error message 2015-06-19 11:21:57 -04:00
Paul Morie e5521234e4 Add NsenterMounter mount implementation 2015-05-04 14:40:04 -04:00