clone volume: cp doesn't support sparse file #279

@stoneshi-yunify

Hostpath: v1.6.2

Cloning a volume calls cp -a <src-vol> <dest_vol>; see

func loadFromFilesystemVolume(hostPathVolume hostPathVolume, destPath string) error {

The cp that ships with Alpine (BusyBox) doesn't support sparse files by default: it copies a sparse file as a regular file. Therefore, if the source volume contains a large sparse file, the cp is extremely slow.

QEMU/VM disk images are a common kind of sparse file; projects like kubevirt use them a lot.
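
To check whether a volume contains such a file, one can compare a file's apparent size with the space actually allocated for it, which is the same gap that ls -l and du show in the test below. The following is a minimal Go sketch, assuming Linux; the names isSparse and fileSizes are hypothetical and not part of hostpath. It is a heuristic only: filesystem compression or inline data can also make the allocated size smaller than the apparent size.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// isSparse reports whether a file occupies fewer bytes on disk than its
// apparent size. Hypothetical helper, heuristic only.
func isSparse(apparentSize, allocatedBytes int64) bool {
	return allocatedBytes < apparentSize
}

// fileSizes returns the apparent size and the allocated size of path.
// Assumes a Unix-like system where os.FileInfo.Sys() is a *syscall.Stat_t.
func fileSizes(path string) (apparent, allocated int64, err error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no stat data available for %s", path)
	}
	// st_blocks is counted in 512-byte units, independent of the
	// filesystem's own block size.
	return info.Size(), int64(st.Blocks) * 512, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: sparsecheck <file>")
		os.Exit(2)
	}
	apparent, allocated, err := fileSizes(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("apparent=%d allocated=%d sparse=%v\n",
		apparent, allocated, isSparse(apparent, allocated))
}
```

With the numbers from the test below (20293720064 bytes apparent, about 17M allocated), isSparse returns true.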

The cp from coreutils supports sparse files by default (--sparse=auto) and shortens the copy time dramatically, so the hostpath image could simply install coreutils.
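
If coreutils were installed, the clone step could also request sparse handling explicitly instead of relying on the default. The sketch below is an illustration only, not the actual hostpath code: cloneArgs and cloneVolume are hypothetical names, and it assumes the cp found on PATH is GNU cp, whose help output below documents --sparse=auto.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// cloneArgs builds the cp arguments for a sparse-aware archive copy.
// --sparse=auto is the GNU cp default; passing it explicitly documents
// the intent. Hypothetical helper.
func cloneArgs(src, dst string) []string {
	return []string{"-a", "--sparse=auto", src, dst}
}

// cloneVolume runs cp and returns its combined output as part of the
// error if the copy fails. Assumes GNU cp is the cp on PATH.
func cloneVolume(src, dst string) error {
	out, err := exec.Command("cp", cloneArgs(src, dst)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("cp failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: clone <src-vol> <dest-vol>")
		os.Exit(2)
	}
	if err := cloneVolume(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The BusyBox help output below does not list a --sparse option, so passing the flag explicitly is also a way to notice when the image is missing GNU cp.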

The cp test:

root@kubevm:~# kubectl -n kube-system exec -it csi-hostpathplugin-0 -c hostpath -- sh
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # ls -l
total 17056
-rw-rw---- 1 root root 20293720064 Apr 23 05:46 disk.img
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # du -sh *
17M	disk.img
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # time cp -a /csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 /csi-data-dir/old-cp
real	2m 35.27s
user	0m 0.02s
sys	1m 58.11s
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # cp --help
BusyBox v1.32.1 () multi-call binary.

Usage: cp [OPTIONS] SOURCE... DEST

Copy SOURCE(s) to DEST

	-a	Same as -dpR
	-R,-r	Recurse
	-d,-P	Preserve symlinks (default if -R)
	-L	Follow all symlinks
	-H	Follow symlinks on command line
	-p	Preserve file attributes if possible
	-f	Overwrite
	-i	Prompt before overwrite
	-l,-s	Create (sym)links
	-T	Treat DEST as a normal file
	-u	Copy only newer files
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # apk add coreutils
fetch https://mirrors.aliyun.com/alpine/v3.13/main/x86_64/APKINDEX.tar.gz
fetch https://mirrors.aliyun.com/alpine/v3.13/community/x86_64/APKINDEX.tar.gz
(1/6) Installing libacl (2.2.53-r0)
(2/6) Installing libattr (2.4.48-r0)
(3/6) Installing skalibs (2.10.0.0-r0)
(4/6) Installing s6-ipcserver (2.10.0.0-r0)
(5/6) Installing utmps (0.1.0.0-r0)
Executing utmps-0.1.0.0-r0.pre-install
(6/6) Installing coreutils (8.32-r2)
Executing busybox-1.32.1-r6.trigger
OK: 14 MiB in 39 packages
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 #
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 #
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # cp --help
Usage: cp [OPTION]... [-T] SOURCE DEST
  or:  cp [OPTION]... SOURCE... DIRECTORY
  or:  cp [OPTION]... -t DIRECTORY SOURCE...
Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.

Mandatory arguments to long options are mandatory for short options too.
  -a, --archive                same as -dR --preserve=all
      --attributes-only        don't copy the file data, just the attributes
      --backup[=CONTROL]       make a backup of each existing destination file
  -b                           like --backup but does not accept an argument
      --copy-contents          copy contents of special files when recursive
  -d                           same as --no-dereference --preserve=links
  -f, --force                  if an existing destination file cannot be
                                 opened, remove it and try again (this option
                                 is ignored when the -n option is also used)
  -i, --interactive            prompt before overwrite (overrides a previous -n
                                  option)
  -H                           follow command-line symbolic links in SOURCE
  -l, --link                   hard link files instead of copying
  -L, --dereference            always follow symbolic links in SOURCE
  -n, --no-clobber             do not overwrite an existing file (overrides
                                 a previous -i option)
  -P, --no-dereference         never follow symbolic links in SOURCE
  -p                           same as --preserve=mode,ownership,timestamps
      --preserve[=ATTR_LIST]   preserve the specified attributes (default:
                                 mode,ownership,timestamps), if possible
                                 additional attributes: context, links, xattr,
                                 all
      --no-preserve=ATTR_LIST  don't preserve the specified attributes
      --parents                use full source file name under DIRECTORY
  -R, -r, --recursive          copy directories recursively
      --reflink[=WHEN]         control clone/CoW copies. See below
      --remove-destination     remove each existing destination file before
                                 attempting to open it (contrast with --force)
      --sparse=WHEN            control creation of sparse files. See below
      --strip-trailing-slashes  remove any trailing slashes from each SOURCE
                                 argument
  -s, --symbolic-link          make symbolic links instead of copying
  -S, --suffix=SUFFIX          override the usual backup suffix
  -t, --target-directory=DIRECTORY  copy all SOURCE arguments into DIRECTORY
  -T, --no-target-directory    treat DEST as a normal file
  -u, --update                 copy only when the SOURCE file is newer
                                 than the destination file or when the
                                 destination file is missing
  -v, --verbose                explain what is being done
  -x, --one-file-system        stay on this file system
  -Z                           set SELinux security context of destination
                                 file to default type
      --context[=CTX]          like -Z, or if CTX is specified then set the
                                 SELinux or SMACK security context to CTX
      --help     display this help and exit
      --version  output version information and exit

By default, sparse SOURCE files are detected by a crude heuristic and the
corresponding DEST file is made sparse as well.  That is the behavior
selected by --sparse=auto.  Specify --sparse=always to create a sparse DEST
file whenever the SOURCE file contains a long enough sequence of zero bytes.
Use --sparse=never to inhibit creation of sparse files.

When --reflink[=always] is specified, perform a lightweight copy, where the
data blocks are copied only when modified.  If this is not possible the copy
fails, or if --reflink=auto is specified, fall back to a standard copy.
Use --reflink=never to ensure a standard copy is performed.

The backup suffix is '~', unless set with --suffix or SIMPLE_BACKUP_SUFFIX.
The version control method may be selected via the --backup option or through
the VERSION_CONTROL environment variable.  Here are the values:

  none, off       never make backups (even if --backup is given)
  numbered, t     make numbered backups
  existing, nil   numbered if numbered backups exist, simple otherwise
  simple, never   always make simple backups

As a special case, cp makes a backup of SOURCE when the force and backup
options are given and SOURCE and DEST are the same name for an existing,
regular file.

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report any translation bugs to <https://translationproject.org/team/>
Full documentation <https://www.gnu.org/software/coreutils/cp>
or available locally via: info '(coreutils) cp invocation'
/csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 # time cp -a /csi-data-dir/24c4a772-a3f7-11eb-bea3-7e7563fe7b85 /csi-data-dir/coreutils-cp
real	0m 0.08s
user	0m 0.00s
sys	0m 0.02s
