Description
There appear to be 2 bugs in cwltool's handling of InitialWorkDirRequirement, in its use of symlinks as an optimization to avoid copying directories:
- Directories get corrupted with incorrect symlinks that point to themselves.
- Symlinks that point to themselves cause an infinite loop in cwltool.
The issue is subtle, and seems to manifest when different parts of a single directory structure are referenced.
Expected Behavior
Input directories should be left pristine, and output directories should be emitted correctly, when using InitialWorkDirRequirement. In the example workflow, a subdirectory with one test file should be the final output.
Actual Behavior
A symlink is installed at dir/subdir/file with itself as the target (1st bug), causing an infinite loop in cwltool as it tries to dereference the symlink within PathMapper.visit() (2nd bug).
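To confirm the first bug after a run, the directory tree can be scanned for symlinks that point to themselves. The following is a minimal diagnostic sketch, not part of cwltool; the helper name `find_self_links` is hypothetical, and it only detects links whose immediate target is the link itself (longer cycles are not covered).

```python
import os


def find_self_links(root):
    """Return paths under root that are symlinks whose target is the link itself.

    Hypothetical helper for diagnosing this issue; not cwltool code.
    """
    hits = []
    # os.walk does not follow symlinks by default, so a broken or
    # self-referential link is simply listed as an entry of its parent.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            # Relative targets are resolved against the link's own directory;
            # os.path.join leaves an absolute target unchanged.
            resolved = os.path.normpath(os.path.join(dirpath, target))
            if resolved == os.path.normpath(path):
                hits.append(path)
    return hits
```

Calling `find_self_links("dir")` on the directory produced by the `mkdirs` step after the workflow has run should list dir/subdir/file if the corruption described above has occurred.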
Workflow Code
Here is a code example that demonstrates the issue (test.cwl):
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow
inputs: []
steps:
  # Create a test directory structure; could be done outside CWL and passed in as input.
  # This input directory should be left pristine.
  mkdirs:
    run:
      class: CommandLineTool
      baseCommand: [bash, '-c', 'mkdir dir dir/subdir && touch dir/subdir/file', '-']
      inputs: []
      outputs:
        mkdirs_out:
          type: Directory
          outputBinding:
            glob: dir
    in: []
    out: [mkdirs_out]
  # Given an input directory, emit a subdirectory as output.
  passthrough1:
    run:
      class: CommandLineTool
      requirements:
        - class: InitialWorkDirRequirement
          listing:
            - entry: $(inputs.passthrough1_in)
              writable: false
      baseCommand: ["true"]
      inputs:
        passthrough1_in:
          type: Directory
      outputs:
        passthrough1_subdir:
          type: Directory
          outputBinding:
            glob: $(inputs.passthrough1_in.basename)/subdir
    in:
      passthrough1_in: mkdirs/mkdirs_out
    out: [passthrough1_subdir]
  # Given a (sub-)directory, emit it unchanged.
  passthrough2:
    run:
      class: CommandLineTool
      requirements:
        - class: InitialWorkDirRequirement
          listing:
            - entry: $(inputs.passthrough2_in)
              writable: false
      baseCommand: ["true"]
      inputs:
        passthrough2_in:
          type: Directory
      outputs:
        passthrough2_subdir:
          type: Directory
          outputBinding:
            glob: $(inputs.passthrough2_in.basename)
    in:
      passthrough2_in: passthrough1/passthrough1_subdir
    out: [passthrough2_subdir]
outputs:
  out:
    type: Directory
    outputSource: passthrough2/passthrough2_subdir
As input, use this empty test.yaml:
{}
Full Traceback
Infinite loop in pathmapper.py around line 249, in PathMapper.visit(); see the comment line # Dereference symbolic links.
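For the second bug, one possible mitigation is to make manual symlink dereferencing cycle-aware. The sketch below is an assumption about the general shape of such a loop and is not copied from pathmapper.py; the function name `deref_with_cycle_check` is hypothetical. It follows links with `os.lstat` and `os.readlink`, and raises an error as soon as a path is visited twice instead of spinning forever.

```python
import os
import stat


def deref_with_cycle_check(path):
    """Follow a chain of symlinks by hand, raising instead of looping forever.

    Hypothetical sketch; not the actual PathMapper.visit() logic.
    """
    seen = set()
    current = os.path.normpath(path)
    while stat.S_ISLNK(os.lstat(current).st_mode):
        if current in seen:
            raise OSError("symlink cycle detected at {}".format(current))
        seen.add(current)
        target = os.readlink(current)
        if not os.path.isabs(target):
            # Relative targets are resolved against the link's directory.
            target = os.path.join(os.path.dirname(current), target)
        # normpath collapses "." and ".." lexically, which is a simplification:
        # it can be wrong when ".." crosses a symlinked directory.
        current = os.path.normpath(target)
    return current
```

With a guard like this, the self-referential dir/subdir/file link would surface as an error rather than a hang, which would at least make the first bug visible. It would not fix the corruption itself.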
Your Environment
- cwltool version: 1.0.20181217162649