Skip to content

Conversation

@nvazquez
Copy link
Contributor

@nvazquez nvazquez commented Oct 29, 2018

Description

Feature Specification: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653548

Live storage migration on KVM under these conditions:

  • From source and destination hosts within the same cluster
  • From NFS primary storage to NFS cluster-wide primary storage
  • Source NFS and destination NFS storage mounted on hosts

In order to enable this functionality, database should be updated in order to enable live storage capacibilty for KVM, if previous conditions are met. This is due to existing conflicts between qemu and libvirt versions. This has been tested on CentOS 6 hosts.

Additional notes:

  1. To use this feature set the storage_motion_supported=1 in the hypervisor_capability table for KVM. This is done by default as the feature may not work in some environments, read below.
  2. This feature of online storage+VM migration for KVM will only work with CentOS6 and possible Ubuntu as KVM hosts but not with CentOS7 due to:

Fix for CentOS 7:

  • Create repo file on /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
  • yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
  • Reboot host

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

Tested on environment having 2 KVM CentOS 6 hosts.

Note: CentOS 7 hosts report an issue around qemu version:
org.libvirt.LibvirtException: internal error: unable to execute QEMU command 'migrate': this feature or command is not currently supported

Fix for CentOS 7:

  • Create repo file on /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
  • yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
  • Reboot host

Copy link
Contributor

@DaanHoogland DaanHoogland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good overall, some comments. A biggy is; there is no integration test. That would seem needed for a functional change like this. @nvazquez /cc @rhtyd @borisstoyanov ?

wido
wido previously requested changes Oct 30, 2018
public boolean executeInSequence() {
return false;
}
} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a newline here :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks


-- PR#1448 update description of 'execute.in.sequence.network.element.commands' parameter to reflect an unused command that has been removed. The removed class command is 'UserDataCommand'.
update `cloud`.`configuration` set description = 'If set to true, DhcpEntryCommand, SavePasswordCommand, VmDataCommand will be synchronized on the agent side. If set to false, these commands become asynchronous. Default value is false.' where name = 'execute.in.sequence.network.element.commands';
update `cloud`.`configuration` set description = 'If set to true, DhcpEntryCommand, SavePasswordCommand, VmDataCommand will be synchronized on the agent side. If set to false, these commands become asynchronous. Default value is false.' where name = 'execute.in.sequence.network.element.commands';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update by accident of this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had previously included an update sql on the hypervisor_cappabilities to enable live storage motion on KVM. But due to conflicting versions on qemu just decided to remove it and must have removed an end of line or extra space, will sort it out


private String getQemuImgPathScript = String.format("which %s >& /dev/null; " +
"if [ $? -gt 0 ]; then echo \"%s\"; else echo \"%s\"; fi",
cloudQemuImgPath, _qemuImgPath, cloudQemuImgPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a Qemu utility in the KVM code, can't we offload this there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this feature requires qemu-img command supporting snapshots I have introduced this check. This cloud-qemu-img can be included on KVM hosts to support snapshots as per: http://www.nux.ro/oldblog/archive/2014/01/Taking_KVM_volume_snapshots_with_Cloudstack_4_2_on_CentOS_6_5.html
If cloud-qemu-img is present then we can use it, and if not just qemu-img. It is also done this way on scripts: https://github.com/apache/cloudstack/search?q=cloud-qemu-img&unscoped_q=cloud-qemu-img.


/* The qemu-img binary. We expect this to be in $PATH */
public String _qemuImgPath = "qemu-img";
private String cloudQemuImgPath = "cloud-qemu-img";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This binary does not exist on Ubuntu/Debian. Why the name change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure it works with snapshot support. If the binary is present on the KVM host it will be used, and if not just the qemu-img.

@nvazquez
Copy link
Contributor Author

Thanks @wido @DaanHoogland for reviewing, I'll address your comments soon. Along with @borisstoyanov we'll be adding integration tests on this PR.

@nvazquez
Copy link
Contributor Author

Feature specification link added to the PR description

@mike-tutkowski
Copy link
Member

@nvazquez Hi Nicolas - Just an FYI that I plan to look at this tomorrow.

@nvazquez
Copy link
Contributor Author

Thanks @mike-tutkowski that's great

Copy link
Member

@mike-tutkowski mike-tutkowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I just had the one comment on the code.

Copy link
Contributor

@borisstoyanov borisstoyanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nvazquez, I think this one is missing the marvin tests we created in the private repo

@rohityadavcloud rohityadavcloud self-assigned this Dec 28, 2018
@rohityadavcloud rohityadavcloud force-pushed the kvmlivestoragemigration-master branch from 50de309 to 5439cf3 Compare December 28, 2018 09:54
@rohityadavcloud
Copy link
Member

@nvazquez I've rebased and fixed conflicts against latest master. Can you check the final rebased commit against the original commit here: 5439cf3

@rohityadavcloud rohityadavcloud force-pushed the kvmlivestoragemigration-master branch from 5439cf3 to 224b74d Compare December 28, 2018 11:37
@rohityadavcloud
Copy link
Member

@blueorangutan package

1 similar comment
@rohityadavcloud
Copy link
Member

@blueorangutan package

@blueorangutan
Copy link

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2856

@nvazquez
Copy link
Contributor Author

nvazquez commented Jun 7, 2019

@blueorangutan test

@blueorangutan
Copy link

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian test result (tid-3662)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 29564 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2983-t3662-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_direct_download.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 69 look OK, 2 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_02_upload_direct_download_certificates Failure 0.04 test_direct_download.py
test_02_redundant_VPC_default_routes Error 0.75 test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics Failure 317.93 test_vpc_redundant.py

@rohityadavcloud
Copy link
Member

@nvazquez can you check the test_02_upload_direct_download_certificates failure?
@blueorangutan package

@blueorangutan
Copy link

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2864

@nvazquez
Copy link
Contributor Author

nvazquez commented Jun 7, 2019

@blueorangutan package

@blueorangutan
Copy link

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2865

@nvazquez
Copy link
Contributor Author

nvazquez commented Jun 7, 2019

@blueorangutan test

@blueorangutan
Copy link

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian test result (tid-3669)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 25753 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2983-t3669-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_hostha_kvm.py
Smoke tests completed. 71 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@nvazquez
Copy link
Contributor Author

nvazquez commented Jun 8, 2019

@rhtyd marvin test failure fixed, now all tests are passing

@nvazquez nvazquez closed this Jun 8, 2019
@nvazquez nvazquez reopened this Jun 8, 2019
@rohityadavcloud rohityadavcloud merged commit 0fbf500 into apache:master Jun 10, 2019
rohityadavcloud added a commit to shapeblue/cloudstack that referenced this pull request Sep 18, 2019
Considering that we have addressed a considerable effort on allowing KVM
to perform storage data motion, this PR proposes updating the
'hypervisor_capabilities' table setting the 'storage_motion_supported'
to '1' for KVM.

PRs that implemented KVM storage motion features:

Non-managed storages
 apache#2997 KVM VM live migration with ROOT volume on file storage type
 apache#2983 KVM live storage migration intra cluster from NFS source and destination
Managed storages
 apache#2298 CLOUDSTACK-9620: Enhancements for managed storage

Signed-off-by: Rohit Yadav <[email protected]>
rohityadavcloud added a commit that referenced this pull request Sep 19, 2019
- Removes CentOS6/el6 packaging (voting thread reference https://markmail.org/message/u3ka4hwn2lzwiero)
- Add upgrade path from 4.13 to 4.14
- Enable live storage migration support for KVM by default as el6 is deprecated
- PRs using live storage migration
  #2997 KVM VM live migration with ROOT volume on file storage type
  #2983 KVM live storage migration intra cluster from NFS source and destination
  #2298 CLOUDSTACK-9620: Enhancements for managed storage

Signed-off-by: Rohit Yadav <[email protected]>
@nvazquez nvazquez deleted the kvmlivestoragemigration-master branch April 6, 2020 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants