
Data Disks get unavailable when VM is shut down #7490

@VincentHermes

Description

ISSUE TYPE
Bug Report
COMPONENT NAME
Disk Controller
CLOUDSTACK VERSION
4.16
OS / ENVIRONMENT
KVM, Windows, SCSI rootDiskController

SUMMARY

Adding more than 6 disks to a VM causes a second SCSI controller to be created. The type of that controller depends on whether the extra disk is attached while the VM is running or the VM is started with more than 6 disks already attached. If disks are added on the fly, everything works fine. If the VM is stopped and started while it already has more than 6 disks, the second controller is of a type that breaks Windows 2022 (and others, I think; still testing).

STEPS TO REPRODUCE

Normal Disk Setting in XML
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/storpool-byid/nbmn.b.xxxx' index='0'/>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>abcdefghijklmnop42</serial>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>

! Note the alias name in this config
Every other disk up to the 6th is configured the same way; the alias name counts up to "scsi0-0-0-5".

7th Disk Setting in XML if attached live, VM not being stopped
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/storpool-byid/nbmn.b.xxxx' index='7'/>
      <backingStore/>
      <target dev='sdg' bus='scsi'/>
      <serial>abcdefghijklmnop42</serial>
      <alias name='scsi1-0-0-0'/>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>

! Note the alias name in this config: it's now "scsi1-0-0-0", which is okay. It has three zeroes for some reason, and the OS recognizes the controller as a "RedHat Virtio SCSI controller". All disks work correctly in the OS this way.

7th Disk Setting in XML if the VM has been stopped and then started again (XML gets recreated)
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/storpool-byid/nbmn.b.xxxx' index='7'/>
      <backingStore/>
      <target dev='sdg' bus='scsi'/>
      <serial>abcdefghijklmnop42</serial>
      <alias name='scsi1-0-0'/>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>

! Note the alias name in this config: it's now "scsi1-0-0", so it is missing a zero, and the controller also becomes a different type. The RedHat driver no longer works. The only driver that can be installed for this device is the VMware PVSCSI driver, but the attached disks still remain unavailable and Windows fails to boot (BSOD), even though the root disk is on the other controller. To recover, you need to remove disks until only 6 are left and then start the VM. If you attach a disk again, it shows up as a new "unknown device" again.
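
To spot the difference described above without reading the XML by eye, the two variants of the 7th disk can be compared programmatically. The following is a minimal Python sketch, not part of CloudStack or libvirt. The `<disk>` wrapper element with its attributes and the helper names `describe_disk` and `is_suspect` are assumptions made for illustration. It treats a three-field alias as a hint that the disk ended up on a different controller type, which matches what was observed here but is not a confirmed rule.

```python
import re
import xml.etree.ElementTree as ET

# Fragments modeled on the snippets in this report. The enclosing <disk>
# element and its attributes are assumed; the report shows only the children.
LIVE_ATTACHED = """
<disk type='block' device='disk'>
  <target dev='sdg' bus='scsi'/>
  <alias name='scsi1-0-0-0'/>
  <address type='drive' controller='1' bus='0' target='0' unit='0'/>
</disk>
"""

AFTER_RESTART = """
<disk type='block' device='disk'>
  <target dev='sdg' bus='scsi'/>
  <alias name='scsi1-0-0'/>
  <address type='drive' controller='1' bus='0' target='0' unit='0'/>
</disk>
"""

# Alias shape seen on the working disks in this report: scsi<C>-<n>-<n>-<n>
FOUR_FIELD_ALIAS = re.compile(r"^scsi\d+-\d+-\d+-\d+$")


def describe_disk(disk_xml):
    """Extract device name, alias, and controller index from one <disk> fragment."""
    disk = ET.fromstring(disk_xml)
    alias = disk.find("alias").get("name")
    address = disk.find("address")
    return {
        "dev": disk.find("target").get("dev"),
        "alias": alias,
        "controller": int(address.get("controller")),
        "four_field_alias": bool(FOUR_FIELD_ALIAS.match(alias)),
    }


def is_suspect(info):
    """Flag disks whose alias does not have the four-field shape (assumed heuristic)."""
    return not info["four_field_alias"]


if __name__ == "__main__":
    for label, fragment in (("live attach", LIVE_ATTACHED),
                            ("after restart", AFTER_RESTART)):
        info = describe_disk(fragment)
        status = "SUSPECT" if is_suspect(info) else "ok"
        print(label, info["dev"], info["alias"], status)
```

In practice the fragments would come from `virsh dumpxml` output for the affected VM, taken once after a live attach and once after a stop/start.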

I wonder what happens if the VM has virtio instead of SCSI as rootDiskController. Checking that out.

EXPECTED RESULTS
The second controller should at least keep the same type, whether the disk is attached live or the VM is stopped and started.
ACTUAL RESULTS
Customers brick their VMs by stopping them once, because the disks on the second controller are then missing.
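
One way to check the expected result is to list the model of each SCSI controller in the domain XML and confirm they match. The following Python sketch is only illustrative. The sample domain XML is hypothetical (this report does not include the `<controller>` elements), and `scsi_controller_models` and `models_consistent` are made-up helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML: the controller elements below are NOT from this
# report. They only illustrate a case where the second controller has no
# explicit model while the first one does.
SAMPLE_DOMAIN = """
<domain type='kvm'>
  <devices>
    <controller type='scsi' index='0' model='virtio-scsi'/>
    <controller type='scsi' index='1'/>
    <controller type='usb' index='0'/>
  </devices>
</domain>
"""


def scsi_controller_models(domain_xml):
    """Map each SCSI controller index to its model attribute, if any."""
    root = ET.fromstring(domain_xml)
    models = {}
    for ctrl in root.findall("./devices/controller[@type='scsi']"):
        models[int(ctrl.get("index"))] = ctrl.get("model", "(unspecified)")
    return models


def models_consistent(models):
    """True when every SCSI controller reports the same model."""
    return len(set(models.values())) <= 1


if __name__ == "__main__":
    found = scsi_controller_models(SAMPLE_DOMAIN)
    for index, model in sorted(found.items()):
        print(f"scsi controller {index}: {model}")
    print("consistent" if models_consistent(found) else "MISMATCH")
```

Running this against `virsh dumpxml` output taken before and after a stop/start would show whether the second controller's model changes between the two states.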
