Issue/real time logging #1600

kannon92 · 2022-01-21T15:27:00Z

The goal of this PR is to enable logging of plugins while a workflow runs.

Added a CLI option --log-dir
If you specify log-dir, it will generate a unique id for stdout/stderr (Helps keep logs unique for all instances of plugin)
Logs will be written to the directory you specify.
In the CLT, you still need to specify stdout and/or stderr to actually log the plugins stdout and stderr, respectively.

Caveats:

If you want to capture logs as outputs, this will not work.

lgtm-com · 2022-01-21T15:37:16Z

This pull request introduces 1 alert when merging e33b357 into 073568f - view on LGTM.com

new alerts:

1 for Module is imported more than once

codecov · 2022-01-21T15:40:09Z

Codecov Report

Merging #1600 (717f3f4) into main (2232bcc) will increase coverage by 0.13%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1600      +/-   ##
==========================================
+ Coverage   66.48%   66.62%   +0.13%     
==========================================
  Files          93       93              
  Lines       16575    16607      +32     
  Branches     4398     4402       +4     
==========================================
+ Hits        11020    11064      +44     
+ Misses       4409     4402       -7     
+ Partials     1146     1141       -5

Impacted Files	Coverage Δ
cwltool/argparser.py	`91.63% <100.00%> (+0.03%)`	⬆️
cwltool/context.py	`97.56% <100.00%> (+0.26%)`	⬆️
cwltool/job.py	`78.65% <100.00%> (+2.27%)`	⬆️
cwltool/main.py	`71.71% <100.00%> (+0.03%)`	⬆️
cwltool/cwltool/context.py	`88.41% <0.00%> (-4.83%)`	⬇️
cwltool/cwltool/main.py	`32.17% <0.00%> (+0.09%)`	⬆️
cwltool/cwltool/argparser.py	`75.42% <0.00%> (+0.10%)`	⬆️
cwltool/command_line_tool.py	`77.19% <0.00%> (+0.41%)`	⬆️
cwltool/cwltool/job.py	`62.57% <0.00%> (+1.03%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2232bcc...717f3f4. Read the comment docs.

lgtm-com · 2022-01-21T15:50:20Z

This pull request introduces 1 alert when merging ad69355 into 073568f - view on LGTM.com

new alerts:

1 for Module is imported more than once

mr-c · 2022-01-21T16:07:21Z

hmm.. this causes failures for me for any CWL document that expects the stdout or stderr to have a particular filename. Maybe the uuid should be for a subdirectory, and not the filename itself?

To test yourself, checkout https://github.com/common-workflow-language/cwl-v1.2/tree/v1.2.0 pip install cwltest and run ./run_test.sh RUNNER=cwltool EXTRA="--log-dir $PWD/logs"

kannon92 · 2022-01-21T18:36:49Z

hmm.. this causes failures for me for any CWL document that expects the stdout or stderr to have a particular filename. Maybe the uuid should be for a subdirectory, and not the filename itself?

Good call.

One issue I find with specying stdout/stderr is that every call of the tool will write out the same file name. If you include the same tool in a workflow multiple times, the log files will conflict with each other.

I got around this but just including stdout/stderr as an output so the name is auto generated.

I'd like to keep the same behavior. Generate a random name for the log file so every invocation of the tool produces a unqiue log. Open to suggestions on this?

To test yourself, checkout https://github.com/common-workflow-language/cwl-v1.2/tree/v1.2.0 pip install cwltest and run ./run_test.sh RUNNER=cwltool EXTRA="--log-dir $PWD/logs"

I am having some trouble with this call. If I remove EXTRA,

./run_test.sh RUNNER=cwltool

importlib.metadata.PackageNotFoundError: cwltool

37 tests passed, 299 failures, 0 unsupported features

mr-c · 2022-01-22T08:39:08Z

I am having some trouble with this call.

I presume you have a virtualenv active with both cwltest and your branch installed? pip install -e path_to_branch

kannon92 · 2022-01-24T14:02:10Z

I am having some trouble with this call.

I presume you have a virtualenv active with both cwltest and your branch installed? pip install -e path_to_branch

Thank you! I was missing the pip install -e path_to_branch. Thought I could just activate my cwltool branch and install cwltest into that. But that didn't seem to work.

They are running now.

kannon92 · 2022-01-24T14:46:24Z

@mr-c I ran these tests and I realize that they are failing. I notice that if you specify the log-dir parameter, any staging of the output fails because that code tries to find the output in the container. I was thinking that if you specify the log-dir command, you should not have stdout/stderr captured as outputs.

So I think that outputs for stdout/stderr is incompatible with saying you want to capture logs outside of the container.

mr-c · 2022-01-24T14:58:36Z

@mr-c I ran these tests and I realize that they are failing. I notice that if you specify the log-dir parameter, any staging of the output fails because that code tries to find the output in the container. I was thinking that if you specify the log-dir command, you should not have stdout/stderr captured as outputs.

So I think that outputs for stdout/stderr is incompatible with saying you want to capture logs outside of the container.

Hmm.. what is a user supposed to do if their workflow has some steps with stdout and other steps they want live logging from? (Or they want live logging from all steps)

kannon92 · 2022-01-24T15:17:52Z

Hmm.. what is a user supposed to do if their workflow has some steps with stdout and other steps they want live logging from? (Or they want live logging from all steps)

Good point. I realize now that having it as an cli command would mean that you wouldn't be able to capture stdout/stderr as outputs for any steps of your workflow. ie that parameter would mean that outputs can not be used to capture stdout/stderr.

For our cloud/hpc plugins that we are building, we realize that relying on stdout/stderr as outputs is difficult as stdout/stderr are typically expected to be logs that are streamed as the plugin runs. So in my case, I was thinking that every plugin/workflow would have live logging turned on and the only thing that you would put in outputs is the actual output of the plugin, not logs.

Selectively turning on logs means that we may want to add this on a step level basis? Maybe this would be a future specification for workflows?

mr-c · 2022-01-24T15:32:20Z

Selectively turning on logs means that we may want to add this on a step level basis?

I see a couple options:

Change the code so that --log-dir only affects CommandLineTools without stdout/stderr

or

Fix the code to still work in the above cases

I feel like a step-level hint would be rather obnoxious and that the above would be less work

kannon92 · 2022-01-24T17:28:34Z

Selectively turning on logs means that we may want to add this on a step level basis?

I see a couple options:

Change the code so that --log-dir only affects CommandLineTools without stdout/stderr

or

Fix the code to still work in the above cases

I feel like a step-level hint would be rather obnoxious and that the above would be less work

Let me walk through the expected behavior just so I understand what you are saying.

If a user says they want outputs from stdout/stderr, then we should ignore the log-dir field.

If a user specifies stdout/stderr fields which will tell the tool to log stdout/stderr and give the log file a name. This would work with the log-dir parameter. If you specify a log-dir parameter, we generate a uuid under the log directory and stick the named log files there.

If a user does not put stdout/stderr in the CLT definition and no stdout/stderr outputs, then I won't log, ignoring the log-dir, because we aren't telling the tool that we want logs.

The problem arises if someone specifies stdout/stderr and output fields. The output looks in the container and tries to stage to host system.

So if outputs and stdout/stderr are not being captured, if someone uses log-dir, then we assume that they want to capture logs for every tool. And we just generate a random uuid for the log files?

kannon92 · 2022-01-24T21:44:37Z

Forget what I wrote above.

Reading through the code. I think the best thing to do is to override the staging code (which copies outputs from container to outdir) to use my new log path. If a user wants the logs as part of the output, we will move them from their log-dir directory to the outdir. Should be same behavior as before.

tests/test_stdout_stderr_log_dir.py

Co-authored-by: Michael R. Crusoe <[email protected]>

kannon92 · 2022-01-26T13:39:24Z

Hey @mr-c, is there anything else you'd want from me for this PR? Otherwise, I think its good to go whenever you want.

mr-c · 2022-01-26T14:35:40Z

Lets ask @tetron about this at today's meeting

cwltool/context.py

tests/test_stdout_stderr_log_dir.py

cwltool/argparser.py

cwltool/job.py

tests/test_stdout_stderr_log_dir.py

Co-authored-by: Michael R. Crusoe <[email protected]>

kannon92 added 5 commits January 18, 2022 16:52

feat: Enable logging to host system as container runs

1fb91f6

Merge branch 'main' into issue/real-time-logging

a3c83dd

Merge branch 'main' into issue/real-time-logging

fbd5745

feat: auto generate unique id from stdout/stderr

7c67b17

feat: rename argument

e33b357

feat: ran formatter

ad69355

kannon92 added 2 commits January 21, 2022 16:14

feat: use uuid as directory rather than add to name

35e67cd

feat: remove double import

39433f7

Merge branch 'main' into issue/real-time-logging

70ba0f8

kannon92 added 5 commits January 24, 2022 17:41

feat: add tests and copy logs to output if specified

82e85b8

Merge branch 'main' into issue/real-time-logging

b554ddf

feat: add a new test

63cf0cd

feat: format

5ac19c3

feat: mypy

52d8828

mr-c reviewed Jan 25, 2022

View reviewed changes

tests/test_stdout_stderr_log_dir.py Outdated Show resolved Hide resolved

kannon92 and others added 2 commits January 25, 2022 10:03

Update tests/test_stdout_stderr_log_dir.py

d7a1441

Co-authored-by: Michael R. Crusoe <[email protected]>

feat: lint

8cdda1d

mr-c requested a review from tetron January 26, 2022 14:35

kannon92 added 2 commits January 27, 2022 15:13

feat: add callback for log copying

31d8b30

feat: added callback for getting a log dir file path

1eafad8

kannon92 changed the title ~~Issue/real time logging~~ draft: Issue/real time logging Jan 31, 2022

kannon92 changed the title ~~draft: Issue/real time logging~~ Issue/real time logging Feb 2, 2022

kannon92 mentioned this pull request Feb 2, 2022

Issues/4005 add real time logs DataBiosphere/toil#4031

Merged

19 tasks

Merge branch 'main' into issue/real-time-logging

20b667a

mr-c requested changes Feb 2, 2022

View reviewed changes

kannon92 and others added 9 commits February 2, 2022 10:34

Update cwltool/context.py

3be874e

Co-authored-by: Michael R. Crusoe <[email protected]>

Update cwltool/context.py

6c4e8bb

Co-authored-by: Michael R. Crusoe <[email protected]>

Update tests/test_stdout_stderr_log_dir.py

90d2128

Co-authored-by: Michael R. Crusoe <[email protected]>

Merge branch 'main' into issue/real-time-logging

a4c5b30

feat: format

30a4086

feat: add tmp_path fixture and validate output

a8cc3d7

feat: format test file

e02e7b5

Merge branch 'main' into issue/real-time-logging

3ffacc1

improve tests

34fe9f0

mr-c approved these changes Feb 3, 2022

View reviewed changes

Merge branch 'main' into issue/real-time-logging

717f3f4

mr-c enabled auto-merge (rebase) February 4, 2022 08:21

mr-c merged commit cc9efc0 into common-workflow-language:main Feb 4, 2022

kannon92 mentioned this pull request Feb 14, 2022

document troubleshooting techniques #1434

Open

Uh oh!

Issue/real time logging #1600

Issue/real time logging #1600

Uh oh!

Conversation

kannon92 commented Jan 21, 2022

Uh oh!

lgtm-com bot commented Jan 21, 2022

Uh oh!

codecov bot commented Jan 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

lgtm-com bot commented Jan 21, 2022

Uh oh!

mr-c commented Jan 21, 2022

Uh oh!

kannon92 commented Jan 21, 2022

Uh oh!

mr-c commented Jan 22, 2022

Uh oh!

kannon92 commented Jan 24, 2022

Uh oh!

kannon92 commented Jan 24, 2022

Uh oh!

mr-c commented Jan 24, 2022

Uh oh!

kannon92 commented Jan 24, 2022

Uh oh!

mr-c commented Jan 24, 2022

Uh oh!

kannon92 commented Jan 24, 2022

Uh oh!

kannon92 commented Jan 24, 2022

Uh oh!

Uh oh!

kannon92 commented Jan 26, 2022

Uh oh!

mr-c commented Jan 26, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Jan 21, 2022 •

edited

Loading