@mike-tutkowski mike-tutkowski commented Oct 16, 2017

Allowed zone-wide primary storage based on a custom plug-in to be added via the GUI in a KVM-only environment (previously this only worked for XenServer and VMware)

Added support for root disks on managed storage with KVM

Added support for volume snapshots with managed storage on KVM

Enabled creating a template directly from a volume (i.e. without having to go through a volume snapshot) on KVM with managed storage

Allowed a volume on managed storage with KVM to be resized only if the volume is either not attached to a VM or is attached to a VM in the Stopped state

Included support for Reinstall VM on KVM with managed storage

Enabled offline migration on KVM from non-managed storage to managed storage and vice versa

Included support for online storage migration on KVM with managed storage (NFS and Ceph to managed storage)

Added support to download (extract) a managed-storage volume to a QCOW2 file

When uploading a file from outside of CloudStack to CloudStack, set the min and max IOPS, if applicable.

Included support for the KVM auto-convergence feature

The compression flag was actually added in version 1.0.3 (1000003), not version 1.3.0 (1003000); changed the check to reflect the correct version

On KVM when using iSCSI-based managed storage, if the user shuts a VM down from the guest OS (as opposed to doing so from CloudStack), we need to pass to the KVM agent a list of applicable iSCSI volumes that need to be disconnected.

Added a new Global Setting: kvm.storage.live.migration.wait

For XenServer, added a check to enforce that only volumes from zone-wide managed storage can be storage motioned from a host in one cluster to a host in another cluster (this is not currently possible with volumes from cluster-scoped managed storage)

Don’t allow Storage XenMotion on a VM that has any managed-storage volume with one or more snapshots.

Enabled for managed storage with VMware: Template caching, create snapshot, delete snapshot, create volume from snapshot, and create template from snapshot

Added an SIOC API plug-in to support VMware SIOC

When starting a VM that uses managed storage in a cluster other than the one it last was running in, we need to remove the reference to the iSCSI volume from the original cluster.

Added the ability to revert a volume to a snapshot

Enabled cluster-scoped managed storage

Added support for VMware dynamic discovery
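
The two integers quoted in the compression-flag item above (1000003 and 1003000) are consistent with the packed version format that libvirt uses, major * 1,000,000 + minor * 1,000 + release. A minimal Python sketch of that arithmetic follows, assuming that packing; the function and constant names are hypothetical and are not taken from this PR's code.

```python
from typing import Tuple


def encode_version(major: int, minor: int, release: int) -> int:
    """Pack a version triple as major * 1,000,000 + minor * 1,000 + release."""
    if not (0 <= minor < 1000 and 0 <= release < 1000):
        raise ValueError("minor and release must be in the range 0..999")
    return major * 1_000_000 + minor * 1_000 + release


def decode_version(value: int) -> Tuple[int, int, int]:
    """Unpack a packed version number back into (major, minor, release)."""
    major, rest = divmod(value, 1_000_000)
    minor, release = divmod(rest, 1_000)
    return major, minor, release


# Hypothetical constant name, used only for this illustration.
MIN_VERSION_FOR_COMPRESSION = encode_version(1, 0, 3)


def supports_compression(host_version: int) -> bool:
    """Return True if the packed host version is at least 1.0.3."""
    return host_version >= MIN_VERSION_FOR_COMPRESSION
```

With this packing, 1.0.3 and 1.3.0 differ only in which three-digit field carries the 3, which is how the two values are easy to confuse.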

@mike-tutkowski
Member Author

@syed
Contributor

syed commented Oct 24, 2017

Don’t allow Storage XenMotion on a VM that has any managed-storage volume with one or more snapshots.

Is this VM snapshots on XenServer or snapshots on SolidFire?

@mike-tutkowski
Member Author

mike-tutkowski commented Oct 24, 2017 via email

@mike-tutkowski
Member Author

Although, @syed, now that you mention it, we should probably prohibit Storage XenMotion when either existing backend snapshots (volume snapshots) or existing VM snapshots are present. I believe either snapshot type will be a problem in this scenario.

This, of course, doesn't mean we have to prohibit them forever. Perhaps we will have a good solution in the future. However, at least for this release, they should both be prohibited.

I have made a note to implement the extra check. Thanks!
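
The rule discussed in this comment can be sketched roughly as follows. This is an illustrative Python model only, not CloudStack's actual (Java) implementation; the `Volume` and `VirtualMachine` classes, their fields, and the `storage_xenmotion_blockers` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    name: str
    on_managed_storage: bool
    snapshot_count: int = 0  # backend (volume) snapshots


@dataclass
class VirtualMachine:
    name: str
    volumes: List[Volume]
    vm_snapshot_count: int = 0  # hypervisor-level VM snapshots


def storage_xenmotion_blockers(vm: VirtualMachine) -> List[str]:
    """Collect the reasons, if any, that Storage XenMotion should be refused."""
    reasons = []
    if vm.vm_snapshot_count > 0:
        reasons.append(f"VM {vm.name} has VM snapshots")
    for volume in vm.volumes:
        # Only managed-storage volumes with volume snapshots block migration.
        if volume.on_managed_storage and volume.snapshot_count > 0:
            reasons.append(f"volume {volume.name} has volume snapshots")
    return reasons


def can_storage_xenmotion(vm: VirtualMachine) -> bool:
    """Migration is allowed only when no blocking reason was found."""
    return not storage_xenmotion_blockers(vm)
```

Returning the list of reasons, rather than a bare boolean, makes it straightforward to surface a specific error message to the user when the operation is refused.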

@mike-tutkowski mike-tutkowski force-pushed the managed-storage-enhancements branch from f9eeceb to 87f1dbf Compare November 3, 2017 21:34
@mike-tutkowski
Member Author

@syed I investigated the VM-snapshots issue. As it turns out, there is already logic in place to prohibit Storage XenMotion from occurring if the VM has any VM snapshots on it.

@mike-tutkowski mike-tutkowski force-pushed the managed-storage-enhancements branch 4 times, most recently from b0c5c6b to 466c701 Compare November 16, 2017 06:16
@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-1269

@mike-tutkowski mike-tutkowski force-pushed the managed-storage-enhancements branch 5 times, most recently from 3ee3eb9 to af72fde Compare November 24, 2017 07:18
@rohityadavcloud
Member

@mike-tutkowski is this ready for prime time review and testing, or still in progress?

@mike-tutkowski
Member Author

Hi @rhtyd - This is essentially ready to go minus a small bit of VMware work around iSCSI dynamic discovery that I've been adding to it.

I would say it would be great if people would review it now and just kind of know there might be a couple more VMware updates to it over the next week or so. There shouldn't be much code added to it. The extra VMware code is something I worked on for a customer last month for the 4.6 branch and am now merging to this PR.

@rohityadavcloud
Member

rohityadavcloud commented Nov 30, 2017

@mike-tutkowski thanks for replying, looking forward to your changes. Once you're done, please ping people on this PR and ask explicitly. I may find some time to review by end of next week. Can you also add a link to a FS (if any)?

@mike-tutkowski mike-tutkowski force-pushed the managed-storage-enhancements branch 4 times, most recently from 95a3183 to 6d1da69 Compare December 8, 2017 19:58
@mike-tutkowski mike-tutkowski force-pushed the managed-storage-enhancements branch from 78589f0 to b9a3320 Compare January 13, 2018 02:16
@mike-tutkowski
Member Author

@rhtyd

XenServer:

test_volumes.py::test_07_resize_fail: Fixed

test_vm_snapshots.py::test_change_service_offering_for_vm_with_snapshots: Fixed

KVM:

test_router_dhcphosts.py::test_router_dhcphosts: Doesn’t seem related to PR: AssertionError: Ping to outside world from VM should be successful
test_router_dhcphosts.py::ContextSuite context=TestRouterDHCPHosts>:teardown: Related to the problem in the issue above

test_ssvm.py::test_01_list_sec_storage_vm: Doesn’t seem related to PR: AssertionError: Check gateway with that of corresponding ip range
test_ssvm.py::test_05_stop_ssvm: Doesn’t seem related to this PR: AssertionError: Check gateway with that of corresponding ip range

test_volumes.py::test_07_resize_fail: Fixed (same fix as first issue in XenServer section)

Still investigating for KVM

test_templates.py::test_03_deploy_vm_wrong_checksum
test_templates.py::ContextSuite context=TestCreateTemplateWithDirectDownload>:teardown
test_templates.py::test_04_extract_template

test_usage.py::ContextSuite context=TestISOUsage>:setup

test_volumes.py::test_06_download_detached_volume

VMware:

test_ssvm.py::test_01_list_sec_storage_vm: Doesn’t seem related to PR: AssertionError: Check gateway with that of corresponding ip range
test_ssvm.py::test_05_stop_ssvm: Doesn’t seem related to PR: AssertionError: Check gateway with that of corresponding ip range

test_vm_snapshots.py::test_change_service_offering_for_vm_with_snapshots: Fixed (same fix as second issue in XenServer section)

Still investigating for VMware

test_templates.py::test_04_extract_template

test_usage.py::ContextSuite context=TestISOUsage>:setup

test_volumes.py::test_01_create_volume
test_volumes.py::test_06_download_detached_volume

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖centos6 ✔centos7 ✔debian. JID-1651

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-1652

@rohityadavcloud
Member

@blueorangutan test matrix

@blueorangutan

@rhtyd a Trillian-Jenkins matrix job (centos6 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

@mike-tutkowski
Member Author

@rhtyd I don't seem to see any issues that would stem from those remaining tests (the ones I listed in the "Still investigating" category). As such, I'll wait for the currently running tests to complete and go from there. Thanks!

@blueorangutan

Trillian test result (tid-2147)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33589 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2298-t2147-kvm-centos7.zip
Intermitten failure detected: /marvin/tests/smoke/test_public_ip_range.py
Intermitten failure detected: /marvin/tests/smoke/test_ssvm.py
Intermitten failure detected: /marvin/tests/smoke/test_templates.py
Intermitten failure detected: /marvin/tests/smoke/test_usage.py
Intermitten failure detected: /marvin/tests/smoke/test_volumes.py
Intermitten failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Intermitten failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Smoke tests completed. 63 look OK, 4 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_list_sec_storage_vm | Failure | 0.13 | test_ssvm.py |
| test_05_stop_ssvm | Failure | 79.25 | test_ssvm.py |
| test_03_deploy_vm_wrong_checksum | Error | 24.02 | test_templates.py |
| ContextSuite context=TestCreateTemplateWithDirectDownload>:teardown | Error | 37.58 | test_templates.py |
| test_04_extract_template | Failure | 128.33 | test_templates.py |
| ContextSuite context=TestISOUsage>:setup | Error | 0.00 | test_usage.py |
| test_06_download_detached_volume | Failure | 137.78 | test_volumes.py |

@blueorangutan

Trillian test result (tid-2144)
Environment: vmware-65 (x2), Advanced Networking with Mgmt server 7
Total time taken: 38392 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2298-t2144-vmware-65.zip
Intermitten failure detected: /marvin/tests/smoke/test_vm_snapshots.py
Intermitten failure detected: /marvin/tests/smoke/test_volumes.py
Smoke tests completed. 65 look OK, 2 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_change_service_offering_for_vm_with_snapshots | Failure | 299.25 | test_vm_snapshots.py |
| test_01_create_volume | Failure | 217.50 | test_volumes.py |

@rohityadavcloud
Member

rohityadavcloud commented Jan 14, 2018

@mike-tutkowski /cc @rafaelweingartner @DaanHoogland
I re-examined each of the errors. The only outstanding issue is with the VMware test results; the rest have been addressed in #2403:

test_change_service_offering_for_vm_with_snapshots | Failure | 299.25 | test_vm_snapshots.py

Based on my analysis of the logs and test results, none of the KVM and XenServer failures are related to this PR. The test_public_ip_range.py test adds a fake public IP range which, if not removed, causes a new SSVM/CPVM to pick a fake public IP; that in turn causes ISO/template/volume download and setup-related failures. The new tests (which now fail intermittently) were introduced in #2295.

If you can comment on that one failing VMware test (see above), we can accept this; please look at it ASAP, @mike-tutkowski @DaanHoogland @rafaelweingartner. A fix for the test has been suggested. As soon as the xenserver-65sp1 results are back and they don't show this regression, I can accept the PR.

Member

@rohityadavcloud rohityadavcloud left a comment

LGTM based on the regression smoke tests. There is only one outstanding issue, and it does not block this PR from acceptance.

@mike-tutkowski
Member Author

@rhtyd Thanks, Rohit! By the way, I have just finished manually walking through the steps in test_vm_snapshots.py::test_change_service_offering_for_vm_with_snapshots and they have always worked (I've tried it three times now).

@rohityadavcloud
Member

Okay @mike-tutkowski, I'll take your word for it and request that you work with us in case we find a regression post-merge. I'll cut RC1 tomorrow, but in our experience it's unlikely that RC1 graduates to a release. It will, however, get the community's attention and start the testing, which buys you some time to fix smoke-test/regression and SolidFire-related test failures. Do you agree, and are you willing to work on this?

@mike-tutkowski
Member Author

mike-tutkowski commented Jan 14, 2018 via email

@rohityadavcloud
Member

Fantastic @mike-tutkowski, thanks for your support. I'll merge the PR as soon as the XenServer test results are back (maybe in the next hour).

@blueorangutan

Trillian test result (tid-2150)
Environment: xenserver-65sp1 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35235 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2298-t2150-xenserver-65sp1.zip
Intermitten failure detected: /marvin/tests/smoke/test_templates.py
Smoke tests completed. 66 look OK, 1 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_02_edit_template | Failure | 90.19 | test_templates.py |

@rohityadavcloud
Member

Tests LGTM. Merging this based on two code reviews and the test results. After merging, I'll kick off a smoke-test round and work with @mike-tutkowski and others to fix any regressions. Thanks to everyone for their contributions and work (the author above all!).

@rohityadavcloud rohityadavcloud merged commit a30a31c into apache:master Jan 14, 2018
@mike-tutkowski mike-tutkowski deleted the managed-storage-enhancements branch January 22, 2018 19:50
rohityadavcloud added a commit to shapeblue/cloudstack that referenced this pull request Sep 18, 2019
Considering that considerable effort has gone into allowing KVM
to perform storage data motion, this PR proposes updating the
'hypervisor_capabilities' table, setting 'storage_motion_supported'
to '1' for KVM.

PRs that implemented KVM storage motion features:

Non-managed storages
 apache#2997 KVM VM live migration with ROOT volume on file storage type
 apache#2983 KVM live storage migration intra cluster from NFS source and destination
Managed storages
 apache#2298 CLOUDSTACK-9620: Enhancements for managed storage

Signed-off-by: Rohit Yadav <[email protected]>
rohityadavcloud added a commit that referenced this pull request Sep 19, 2019
- Removes CentOS6/el6 packaging (voting thread reference https://markmail.org/message/u3ka4hwn2lzwiero)
- Add upgrade path from 4.13 to 4.14
- Enable live storage migration support for KVM by default as el6 is deprecated
- PRs using live storage migration
  #2997 KVM VM live migration with ROOT volume on file storage type
  #2983 KVM live storage migration intra cluster from NFS source and destination
  #2298 CLOUDSTACK-9620: Enhancements for managed storage

Signed-off-by: Rohit Yadav <[email protected]>