ISSUE TYPE
- Bug Report
COMPONENT NAME
KVM Agent
CLOUDSTACK VERSION
4.11.2.0-41120rc2
CONFIGURATION
OS / ENVIRONMENT
SUMMARY
Also see comment thread on PR #2722
We installed an RC release that includes PR #2722 on a test system, expecting the host to be marked as `Disconnected` after we used iptables to drop NFS requests. Instead, the host is marked as `Down`. My investigation shows that the line `storage = conn.storagePoolLookupByUUIDString(uuid);` blocks indefinitely. As a result, `kvmheartbeat.sh` is never executed, a host investigation is started, the host with blocked NFS is marked as `Down`, and finally all VMs on that host are rescheduled, resulting in duplicate VMs.
I pulled a thread dump and found that the KVMHAMonitor thread hangs here until NFS is unblocked:
java.lang.Thread.State: RUNNABLE
at com.sun.jna.Native.invokePointer(Native Method)
at com.sun.jna.Function.invokePointer(Function.java:470)
at com.sun.jna.Function.invoke(Function.java:404)
at com.sun.jna.Function.invoke(Function.java:315)
at com.sun.jna.Library$Handler.invoke(Library.java:212)
at com.sun.proxy.$Proxy3.virStoragePoolLookupByUUIDString(Unknown Source)
at org.libvirt.Connect.storagePoolLookupByUUIDString(Unknown Source)
at com.cloud.hypervisor.kvm.resource.KVMHAMonitor$Monitor.runInContext(KVMHAMonitor.java:95)
- locked <1afb3370> (a java.util.concurrent.ConcurrentHashMap)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
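One possible mitigation for a hang like the one in the thread dump is to run the blocking lookup on a separate worker thread and stop waiting after a deadline, so the monitor loop can still reach the heartbeat script. The following is only a minimal sketch of that idea, not the CloudStack fix: `BoundedLookup` and `callWithTimeout` are hypothetical names, the timeout values are arbitrary, and a `Thread.sleep` stands in for the libvirt call. Note that a thread blocked inside native (JNA) code does not respond to interruption, so the worker stays stuck until NFS recovers; the sketch only frees the caller.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedLookup {
    // Daemon threads, so a worker stuck in a native call cannot keep the JVM alive.
    // A cached pool is used so one stuck worker does not block later calls.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-lookup");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the call on a worker thread and waits at most timeoutMs for it.
     * Returns Optional.empty() on timeout (or if the call returned null).
     */
    public static <T> Optional<T> callWithTimeout(Callable<T> call, long timeoutMs)
            throws ExecutionException, InterruptedException {
        Future<T> future = POOL.submit(call);
        try {
            return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Best effort only: a thread blocked in native code ignores the interrupt.
            future.cancel(true);
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for conn.storagePoolLookupByUUIDString(uuid) on a blocked NFS mount.
        Optional<String> hung = callWithTimeout(() -> {
            Thread.sleep(5_000);
            return "pool";
        }, 200);
        // Stand-in for the same lookup when storage responds promptly.
        Optional<String> ok = callWithTimeout(() -> "pool", 2_000);

        System.out.println("hung lookup returned a pool: " + hung.isPresent());
        System.out.println("fast lookup returned a pool: " + ok.isPresent());
    }
}
```

With this approach, every lookup that times out leaves one worker thread blocked until the storage recovers, so a real implementation would likely need to cap or track outstanding lookups. The thread dump also shows the monitor holding a lock on a `ConcurrentHashMap` during the call, so the lookup would have to be bounded without keeping that lock held for the whole wait.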
STEPS TO REPRODUCE
EXPECTED RESULTS
The host still runs kvmheartbeat.sh and shows as `Disconnected`
ACTUAL RESULTS
The host heartbeat hangs and the host gets marked as `Down` via host investigation