Cache Collisions when you ignore cache for a re-run #375

@gordonwatts

This happens with the current 3.0 alpha 16.

To repro:

  1. Run a query.
  2. Re-run the query and tell the system to ignore the cache.
  3. Re-run the query again, this time without ignoring the cache.

You'll get a cache collision error:

0000.0845 - INFO - root - Using release 22.2.107 for type information.
0000.1185 - WARNING - func_adl.type_based_replacement - Unknown type for name len
0000.8609 - INFO - root - Building ServiceX query
0000.8610 - INFO - root - Using dataset mc20_13TeV.364157.Sherpa_221_NNPDF30NNLO_Wmunu_MAXHTPTV0_70_CFilterBVeto.deriv.DAOD_PHYSLITE.e5340_s3681_r13145_p6026.
0000.8611 - INFO - root - Running on 10 files of dataset.
0000.8615 - INFO - root - Starting ServiceX query
0000.9198 - INFO - servicex.servicex_client - Returning code generators from cache

Traceback (most recent call last):
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 356, in <module>
    main(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 119, in main
    files = query_servicex(
  File "/home/gwatts/code/iris-hep/idap-200gbps-atlas/servicex/servicex_materialize_branches.py", line 92, in query_servicex
    results = sx.deliver(spec)
  File "/venv/lib/python3.9/site-packages/servicex/servicex_client.py", line 107, in deliver
    results = group.as_signed_urls()
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 63, in wrapped_call
    return _sync_version_of_function(fn, *args, **kwargs)
  File "/venv/lib/python3.9/site-packages/make_it_sync/func_wrapper.py", line 14, in _sync_version_of_function
    return loop.run_until_complete(r)
  File "/usr/AnalysisBaseExternals/25.2.2/InstallArea/x86_64-el9-gcc13-opt/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/venv/lib/python3.9/site-packages/servicex/dataset_group.py", line 76, in as_signed_urls_async
    return await asyncio.gather(*self.tasks)
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 521, in as_signed_urls_async
    return await self.submit_and_download(
  File "/venv/lib/python3.9/site-packages/servicex/query.py", line 210, in submit_and_download
    self.cache.get_transform_by_hash(sx_request.compute_hash())
  File "/venv/lib/python3.9/site-packages/servicex/query_cache.py", line 84, in get_transform_by_hash
    raise CacheException("Multiple records found in db for hash")
servicex.query_cache.CacheException: Multiple records found in db for hash
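
A minimal sketch of how I think this can happen, using a toy in-memory cache rather than the real servicex code. Everything here except the exception name, the `get_transform_by_hash` method name, and the error message (all taken from the traceback) is hypothetical: `ToyQueryCache`, `run_query`, `reproduce`, and the `upsert` flag are made up for illustration. The assumption is that an ignore-cache run writes a second record under the same hash, so the next normal lookup finds two.

```python
class CacheException(Exception):
    """Stand-in for the exception named in the traceback."""


class ToyQueryCache:
    """Hypothetical in-memory model; NOT the servicex implementation."""

    def __init__(self):
        self.records = []  # each record: {"hash": ..., "request_id": ...}

    def get_transform_by_hash(self, request_hash):
        matches = [r for r in self.records if r["hash"] == request_hash]
        if len(matches) > 1:
            raise CacheException("Multiple records found in db for hash")
        return matches[0] if matches else None


def run_query(cache, request_hash, request_id, ignore_cache=False, upsert=False):
    """Look up the cache unless told to ignore it, then store the new result."""
    if not ignore_cache:
        hit = cache.get_transform_by_hash(request_hash)
        if hit is not None:
            return hit
    record = {"hash": request_hash, "request_id": request_id}
    if upsert:
        # Assumed mitigation: drop any older record for this hash before writing.
        cache.records = [r for r in cache.records if r["hash"] != request_hash]
    cache.records.append(record)
    return record


def reproduce(upsert):
    """Replay the three repro steps; return the request id or the error text."""
    cache = ToyQueryCache()
    run_query(cache, "abc", "req-1", upsert=upsert)                     # step 1
    run_query(cache, "abc", "req-2", ignore_cache=True, upsert=upsert)  # step 2
    try:
        return run_query(cache, "abc", "req-3", upsert=upsert)["request_id"]  # step 3
    except CacheException as err:
        return str(err)


if __name__ == "__main__":
    print(reproduce(upsert=False))
    print(reproduce(upsert=True))
```

In this toy model, replacing the existing record when the cache is ignored leaves a single entry per hash, so step 3 is served from the cache again. Whether that matches what the real cache should do is a guess on my part.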

Labels: bug (Something isn't working)
