Allow KVM VM live migration with ROOT volume on file storage type #2997
Conversation
wido left a comment
My main question is how this code handles an Instance which has a disk on local storage and one or more on shared storage, such as NFS or Ceph.
    for (VolumeInfo volumeInfo : volumeInfoSet) {
        StoragePoolVO storagePoolVO = _storagePoolDao.findById(volumeInfo.getPoolId());
        if (storagePoolVO.getPoolType() != StoragePoolType.Filesystem) {
What would happen if the VM has a ROOT volume on a local disk (FileSystem) and has one or more data disks on NFS or Ceph for example?
Unfortunately, this implementation does not work in such a scenario. The user would need to unplug the attached volume first and then migrate. I will check alternatives and ping you if I find a better way. If I am not able to solve it, I will make sure to document it and send a proper log message to the user.
Yes, if it doesn't work we should throw a proper error message saying that we can't migrate the Instance as long as data disks are attached.
Check if numDataVolumes > 0, then throw an Exception.
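For example, a minimal guard could look like the sketch below (illustrative only; numDataVolumes, the volume-type check, and the exception type are assumptions, not code from this patch):

    // Illustrative guard: refuse to live migrate while data disks are still attached.
    int numDataVolumes = 0;
    for (VolumeInfo volumeInfo : volumeInfoSet) {
        if (volumeInfo.getVolumeType() == Volume.Type.DATADISK) {
            numDataVolumes++;
        }
    }
    if (numDataVolumes > 0) {
        throw new CloudRuntimeException("Cannot live migrate the Instance while " + numDataVolumes
                + " data disk(s) are attached; detach them before migrating.");
    }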
I agree with you, @wido, thanks for pointing that out.
Are you sure that it will not work?
I mean, the XML that describes the VM, and that is going to be used on the target host, will have an entry for a disk that is on a shared volume. Therefore, as long as that volume is connected to the host, you should be good to go. That is the tricky part you are doing here, right? Because you are dealing with local storage, you first need to move/declare the volume on the target host and then execute a migration command that takes care of the "copy/paste" of the local disks' contents to the target host.
@wido @rafaelweingartner updated the code and tested it. It now migrates VMs with attached data disks on shared storage.
The added code can be found in StorageSystemDataMotionStrategy.copyAsync(Map, VirtualMachineTO, Host, Host, AsyncCompletionCallback):

    if (!shouldMigrateVolume(sourceStoragePool, destHost, destStoragePool)) {
        continue;
    }
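For illustration, the intent of that helper is roughly the following (a sketch only, under the assumption that volumes already on shared storage reachable from the destination host are simply skipped; the real method may apply more checks):

    // Sketch only, not the actual implementation: volumes already on shared storage that the
    // destination host can reach are left in place, so only local (Filesystem) volumes get migrated.
    protected boolean shouldMigrateVolume(StoragePoolVO sourceStoragePool, Host destHost, StoragePoolVO destStoragePool) {
        return sourceStoragePool.getPoolType() == StoragePoolType.Filesystem;
    }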
Awesome. Thanks!
    if (migrateDiskInfo != null && migrateDiskInfo.isSourceDiskOnStorageFileSystem()) {
        deleteLocalVolume(disk.getDiskPath());
    } else {
        libvirtComputingResource.cleanupDisk(disk);
    }
Would this clean up a disk which is non-local? Isn't there a potential danger that you remove a data disk?
Isn't this the old behaviour anyway?
@wido the libvirtComputingResource.cleanupDisk(disk) call was already being executed. cleanupDisk just disconnects the disk from the source host after the VM has been migrated to another host; it works that way with iSCSI, for instance. I can also work on renaming/documenting the cleanupDisk method if you think that would be good.
The deleteLocalVolume method was added to handle the case where the disk is a file; in that case it deletes the file from the source host (local primary storage).
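For illustration, deleting the local backing file can be as simple as the following sketch (an assumed shape; the actual method may go through libvirt's storage pool API instead, and the s_logger field is assumed):

    // Sketch of the idea: after a successful migration, remove the volume's backing file
    // from the source host's local primary storage.
    private void deleteLocalVolume(String localPath) {
        try {
            java.nio.file.Files.deleteIfExists(java.nio.file.Paths.get(localPath));
        } catch (java.io.IOException e) {
            s_logger.warn("Unable to delete the local volume [" + localPath + "] after migration: " + e.getMessage());
        }
    }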
It will delete the disk only after a successful migration, right?
      if (migrateStorage) {
    -     flags |= 1 << 6;
    +     flags += VIR_MIGRATE_NON_SHARED_DISK;
Really like these changes 👍
Write it as flags |= VIR_MIGRATE_NON_SHARED_DISK. In general, enabling a flag is more readable as a bitwise OR (|) operation than as an addition.
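To illustrate why (VIR_MIGRATE_NON_SHARED_DISK is libvirt's 1 << 6, i.e. 64; the snippet is illustrative only):

    // Why |= over +=: if the flag happens to be set already, OR is a no-op while addition
    // silently flips a different bit in the mask.
    final long VIR_MIGRATE_NON_SHARED_DISK = 1L << 6;          // 64, per libvirt's virDomainMigrateFlags
    long flags = VIR_MIGRATE_NON_SHARED_DISK;                   // flag already enabled
    long orAgain = flags | VIR_MIGRATE_NON_SHARED_DISK;         // still 64
    long addAgain = flags + VIR_MIGRATE_NON_SHARED_DISK;        // 128, a completely different flag bit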
DaanHoogland left a comment
Looks OK at first sight, but the change does not fit in my puny brain. Let's test extensively in all kinds of environments.
        return sourceText;
    }

    public boolean isSourceDiskOnStorageFileSystem() {
For serialised objects like this we had better use wrapper classes instead of primitives; the Gson upgrade is going to demand it.
@DaanHoogland I was not aware of such an issue when using primitives. Shouldn't Gson work with primitives? Can you please share some details? I did some googling but could not find such information; if it is indeed a problem I will also change all the primitives in MigrateCommand.java and MigrateDiskInfo.
Old Gson does, but when we try to upgrade we'll have issues, so let's start using wrapper classes.
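For instance, the boolean shown above would then become something like this (the field name is an assumption; only the type changes):

    // Wrapper type instead of primitive, so an absent JSON property deserialises to null
    // rather than silently becoming false.
    private Boolean sourceDiskOnStorageFileSystem;

    public Boolean isSourceDiskOnStorageFileSystem() {
        return sourceDiskOnStorageFileSystem;
    }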
@DaanHoogland, I was checking, and Gson 2.0 doesn't support type adapters for primitive types. However, primitive types themselves are still supported.
The Gson documentation misleads us into thinking that it does not support primitive types, but it does; what it does not support is registering type adapters for primitive types.
I did some testing with Gson 2.8.5 just to make sure; we can still marshal/unmarshal a primitive variable with Gson 2.x.
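A self-contained sketch of such a check (the stub class below is illustrative, not the real MigrateDiskInfo):

    import com.google.gson.Gson;

    public class GsonPrimitiveCheck {
        // Illustrative stand-in with a primitive field, not the real MigrateDiskInfo class.
        static class MigrateDiskInfoStub {
            boolean sourceDiskOnStorageFileSystem = true;
        }

        public static void main(String[] args) {
            Gson gson = new Gson();
            String json = gson.toJson(new MigrateDiskInfoStub());
            System.out.println(json);   // {"sourceDiskOnStorageFileSystem":true}
            MigrateDiskInfoStub back = gson.fromJson(json, MigrateDiskInfoStub.class);
            System.out.println(back.sourceDiskOnStorageFileSystem);   // true
        }
    }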
Force-pushed from b64176f to 3d81ec4.
    private boolean migrateStorage;
    private boolean autoConvergence;

    /**
We need more of those, and please let's stop bit shifting.
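For illustration, the readable form being asked for could look like this (constant names and values follow libvirt's virDomainMigrateFlags documentation; buildMigrationFlags is a hypothetical helper, not code from the patch):

    // Compose the migration flags from named constants instead of raw bit shifts.
    private static final long VIR_MIGRATE_LIVE            = 1L << 0;
    private static final long VIR_MIGRATE_PERSIST_DEST    = 1L << 3;
    private static final long VIR_MIGRATE_NON_SHARED_DISK = 1L << 6;

    private long buildMigrationFlags(boolean migrateStorage) {
        long flags = VIR_MIGRATE_LIVE | VIR_MIGRATE_PERSIST_DEST;
        if (migrateStorage) {
            flags |= VIR_MIGRATE_NON_SHARED_DISK;   // copy local (non-shared) disks during the migration
        }
        return flags;
    }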
Force-pushed from 3d81ec4 to 2f49d01.
rohityadavcloud left a comment
Left some comments, overall lgtm!
Test run lgtm.
Force-pushed from 58bc835 to ebb81b1.
Looks like some changes were pushed -f, will kick tests.
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Force-pushed from ebb81b1 to 039c1eb.
Packaging result: ✖centos6 ✖centos7 ✖debian. JID-2488
@blueorangutan package
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2489
@blueorangutan test
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-3254)
Thanks for the feedback, @rhtyd, @DaanHoogland, @nvazquez, @wido, @rafaelweingartner, @mike-tutkowski! The Trillian test is good, and we have 4 LGTMs. This one looks ready to merge. The Travis failures are not related to this PR; I have been digging into what is causing Travis to fail on recent PRs.
Force-pushed from 039c1eb to 97771f2.
Integration tests passed, reviews were made, and adjustments applied. Travis problems are not related to this PR. Therefore, I am going to merge it.
A few of my comments were not addressed or replied to, but I'll live with the changes. @GabrielBrascher @rafaelweingartner
@rhtyd which ones?
@rafaelweingartner go to the Files changed tab; there are a few minor comments, but nothing I cannot live with.
Well, I indeed checked everything, and everything seems to be already answered/addressed. There was only one comment you made regarding the flag building, where @GabrielBrascher used a more human-readable method instead of shifting bits around. That seemed to be covered by the feedback from the community, which was very positive.
Considering the considerable effort invested in allowing KVM to perform storage data motion, this PR proposes updating the 'hypervisor_capabilities' table, setting 'storage_motion_supported' to '1' for KVM.
PRs that implemented KVM storage motion features:
Non-managed storages:
- apache#2997 KVM VM live migration with ROOT volume on file storage type
- apache#2983 KVM live storage migration intra cluster from NFS source and destination
Managed storages:
- apache#2298 CLOUDSTACK-9620: Enhancements for managed storage
Signed-off-by: Rohit Yadav <[email protected]>
- Removes CentOS6/el6 packaging (voting thread reference https://markmail.org/message/u3ka4hwn2lzwiero)
- Add upgrade path from 4.13 to 4.14
- Enable live storage migration support for KVM by default as el6 is deprecated
- PRs using live storage migration:
  - #2997 KVM VM live migration with ROOT volume on file storage type
  - #2983 KVM live storage migration intra cluster from NFS source and destination
  - #2298 CLOUDSTACK-9620: Enhancements for managed storage
Signed-off-by: Rohit Yadav <[email protected]>
This PR proposes a new DataMotionStrategy, KvmNonManagedStorageDataMotionStrategy, that extends StorageSystemDataMotionStrategy. This strategy allows migrating KVM VMs with the ROOT volume on primary local file system storage (volumes under /var/lib/libvirt/images/). This design also aims to ease the development of any non-managed KVM volume migration, such as the implementation of @nvazquez in PR #2983 (NFS to NFS).
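As a rough orientation only (not the actual implementation), such a strategy could plug into the data-motion framework along these lines; the overridden method, the StrategyPriority values, and the hypervisor checks are assumptions based on the existing DataMotionStrategy interface:

    // Orientation sketch only: the subclass advertises itself for KVM-to-KVM migrations and
    // lets the parent StorageSystemDataMotionStrategy drive the rest of the copy workflow.
    public class KvmNonManagedStorageDataMotionStrategy extends StorageSystemDataMotionStrategy {

        @Override
        public StrategyPriority canHandle(Map<VolumeInfo, DataStore> volumeMap, Host srcHost, Host destHost) {
            if (srcHost.getHypervisorType() == HypervisorType.KVM
                    && destHost.getHypervisorType() == HypervisorType.KVM) {
                return StrategyPriority.HYPERVISOR;   // assumed priority value, for illustration
            }
            return StrategyPriority.CANT_HANDLE;
        }
    }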