@Pearl1594
Contributor

Description

This PR fixes an NPE that occurs when the following steps are performed:

  1. Download a volume, so that download_url is set in the volume_store_ref table for that volume
  2. Perform an operation that clears download_url, for example destroying the SSVM
  3. After the SSVM is re-created, re-initiate the download of the same volume; it fails with an NPE:
2020-12-10 08:56:03,164 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-2:ctx-ef7f5a39 job-652) (logid:169bd39d) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
java.lang.NullPointerException
	at com.cloud.hypervisor.guru.VMwareGuru.getCommandHostDelegation(VMwareGuru.java:274)
	at com.cloud.hypervisor.HypervisorGuruManagerImpl.getGuruProcessedCommandTargetHost(HypervisorGuruManagerImpl.java:76)
	at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:120)
	at org.apache.cloudstack.storage.datastore.driver.CloudStackImageStoreDriverImpl.createEntityExtractUrl(CloudStackImageStoreDriverImpl.java:83)
	at org.apache.cloudstack.storage.image.store.ImageStoreImpl.createEntityExtractUrl(ImageStoreImpl.java:212)
	at com.cloud.storage.VolumeApiServiceImpl.setExtractVolumeSearchCriteria(VolumeApiServiceImpl.java:2887)
	at com.cloud.storage.VolumeApiServiceImpl.extractVolume(VolumeApiServiceImpl.java:2800)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
	at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:107)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
	at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
	at com.sun.proxy.$Proxy215.extractVolume(Unknown Source)
	at org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd.execute(ExtractVolumeCmd.java:137)
	at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:156)
	at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
	at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:620)
	at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
	at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
	at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
	at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:568)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
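
The failing scenario in the steps above is a volume that still exists on the secondary store but whose stored download URL is null. A minimal sketch of a null-safe "reuse or regenerate" lookup follows. It is an illustration only, not the change merged in this PR: `VolumeStoreRow`, `ExtractUrlSketch`, `resolveDownloadUrl`, and the example URL are hypothetical names, not CloudStack APIs.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for one volume_store_ref row; not a CloudStack class.
final class VolumeStoreRow {
    String downloadUrl; // null once the URL has been cleared (step 2)

    VolumeStoreRow(String downloadUrl) {
        this.downloadUrl = downloadUrl;
    }
}

final class ExtractUrlSketch {
    // Reuse the stored URL when one exists; otherwise generate a new one and store it on the row.
    static String resolveDownloadUrl(VolumeStoreRow row, Supplier<String> urlGenerator) {
        if (row == null) {
            throw new IllegalArgumentException("volume is not on the secondary store");
        }
        if (row.downloadUrl != null && !row.downloadUrl.isEmpty()) {
            return row.downloadUrl;
        }
        String fresh = urlGenerator.get();
        if (fresh == null) {
            // Fail with a descriptive error instead of letting a null propagate.
            throw new IllegalStateException("could not create a download URL");
        }
        row.downloadUrl = fresh;
        return fresh;
    }

    public static void main(String[] args) {
        // Row whose download_url was cleared, as after the SSVM is destroyed.
        VolumeStoreRow row = new VolumeStoreRow(null);
        System.out.println(resolveDownloadUrl(row, () -> "https://example.invalid/userdata/new.qcow2"));
    }
}
```

The point of the sketch is that a missing URL is treated as an expected state that triggers regeneration, while a failed regeneration surfaces as an explicit error rather than an NPE further down the call chain.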

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

How Has This Been Tested?

Performed the steps above; the download URL is now created successfully.

@Pearl1594
Contributor Author

@blueorangutan package

@blueorangutan

@Pearl1594 a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos7 ✔centos8 ✔debian. JID-2477

@Pearl1594
Contributor Author

@blueorangutan test

@blueorangutan

@Pearl1594 a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@weizhouapache
Member

@Pearl1594 is this also a bug in 4.14?

@Pearl1594
Contributor Author

@weizhouapache I believe this got introduced in 4.15

@weizhouapache
Member

ok thanks @Pearl1594

@weizhouapache
Member

@Pearl1594 by the way, do you know which code change caused it? Are there any other related issues?

@Pearl1594
Contributor Author

@weizhouapache I think it was introduced by https://github.com/apache/cloudstack/pull/4078/files#diff-b93da5c5293f8cb4470ff6b1642d312131bb053df07b71b4b951b08dff0ad9abR2881, and I have not noticed any other issues related to it.

@weizhouapache
Member

thanks @Pearl1594
we will look into it @ravening

@ravening
Member

@Pearl1594 I'm not able to reproduce the issue on KVM. Is it specific to VMware, as your logs suggest?

Before destroying the system VM

mysql> select * from volume_store_ref\G
*************************** 1. row ***************************
                  id: 1
            store_id: 1
           volume_id: 962
      download_state: DOWNLOADED
        install_path: volumes/2/962/14961c07-b490-43f4-991a-1628e5878178.qcow2
                 url: NULL
        download_url: https://10-135-122-128.cloud.net/userdata/e21f48ea-3ffd-448f-95ad-defdb9d08245.qcow2
               state: Ready
           destroyed: 0
        update_count: 2
             ref_cnt: 0
             updated: 2020-12-10 16:20:15
download_url_created: 2020-12-10 16:20:17

After destroying the system VM

mysql> select * from volume_store_ref\G
*************************** 1. row ***************************
      download_state: DOWNLOADED
        install_path: volumes/2/962/14961c07-b490-43f4-991a-1628e5878178.qcow2
                 url: NULL
        download_url: NULL
               state: Ready
           destroyed: 0
        update_count: 2
             ref_cnt: 0
             updated: 2020-12-10 16:20:15
download_url_created: 1969-12-31 00:00:00
1 row in set (0.01 sec)

After re-creating the system VM

mysql> select * from volume_store_ref\G
*************************** 1. row ***************************
      download_state: DOWNLOADED
        install_path: volumes/2/962/14961c07-b490-43f4-991a-1628e5878178.qcow2
                 url: NULL
        download_url: https://10-135-122-128.cloud.net/userdata/26df39b8-1a43-45db-b515-f8a6ec9150a9.qcow2
               state: Ready
           destroyed: 0
        update_count: 2
             ref_cnt: 0
             updated: 2020-12-10 16:20:15
download_url_created: 2020-12-10 16:22:40
1 row in set (0.00 sec)

@Pearl1594
Contributor Author

@ravening I faced this issue on a KVM environment.

@rohityadavcloud added this to the 4.15.0.0 milestone on Dec 10, 2020

@blueorangutan

Trillian test result (tid-3330)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35253 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4530-t3330-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_kubernetes_clusters.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Smoke tests completed. 86 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

Contributor

@harikrishna-patnala left a comment

I could reproduce the issue on a VMware environment.
I've tested the fix and it works well.
LGTM

@rohityadavcloud merged commit edd5f23 into apache:master on Dec 11, 2020
qrry added a commit to qrry/cloudstack that referenced this pull request Dec 23, 2020
* master:
  server: add conditions for custom offerings (apache#4540)
  vr: Ensuring dnsmasq.leases file is populated (apache#4529)
  template: Ensuring template is cross zone if type changed to system (apache#4522)
  storage: Fix hypervisor type cast to string (apache#4516)
  db upgrade: fix sql exception: Access denied; you need (at least one of) the SUPER privilege(s) for this operation (apache#4533)
  CLOUDSTACK-10423:Potential sensitive information disclosure (apache#4536)
  jobs: The patch remove the password from resultObject and make it be humanreadable (apache#4538)
  listphysicalnetworks: Honouring keyword parameter (apache#4511)
  Fix NPE when Volume exists on secondary store but doesn't have a download URL (apache#4530)
  apidoc issue (apache#4532)
  db: Fix description of volume.stats.interval which is in milliseconds not seconds (apache#4526)
  kvm: set cpu topology only if cpucore per socket is positive value (apache#4527)
  xenserver: check and eject patch vbd for systemvms (apache#4525)
  Fix warning when setup cloudstack-common (apache#4523)
  kvm: FIX cpucorespersocket is not working on KVM (apache#4497)
  change debug to warn for unknown exceptions (apache#4521)
  Fix failure in validating IP address in case of multiple Management Servers (apache#4507)
  Update log output for FirstFitPlanner (apache#4515)
  ui: deprecate old UI and move to legacy to be served at /client/legacy (apache#4518)