Skip to content

Conversation

@0x2b3bfa0
Copy link
Member

@0x2b3bfa0 0x2b3bfa0 commented May 27, 2022

Use npm install --global github:iterative/cml#7bd9257 to use this feature.

Usage

tee report.md <<END
# Report
![cat](cat.jpg)
END

npx github:iterative/cml#7bd9257 send-comment --publish --watch report.md &

while sleep 30; do
  curl --location https://thecatapi.com/api/images/get?format=src > cat.jpg
done

When using locally (as opposed to running from CI/CD), provide also --repo https://github.com/user/repository, --token ghp_personal_access_token and --commit-sha a1b2c3d pointing to a commit on that repository, preferably part of an open pull request.

Behavior

  • --publish — uploads and replaces all the local paths on report.md (e.g. links & images)
    • e.g. ![description](outputs/plot.png) becomes ![description](https://assets.cml.dev/...)
  • --watch — watches report.md and all the local paths it contains
    • i.e. when any of them changes, updates the comment in the forge

Experimental

  • --trigger-file — specify a trigger file with the same behavior as DVC checkpoint file-based API1
    • only effective along with --watch

Using a trigger file

cml send-comment --publish --watch --trigger-file=example report.md &

while true; do
  date > report.md # modify the report
  touch example # trigger an update
  while test -f example; do sleep 1; done # wait for the update to finish
done

Pending

Reverted

Questions

  • is there any use case for calling cml publish directly after we implement this?
    • maybe (e.g. publish for use outside markdown reports?), though probably fine to keep for backward-compat? — @casperdcl dixit

    • fine to keep hidden, perhaps, as any other deprecated command? — @0x2b3bfa0 respondebat

    • p3-whatever@casperdcl 🙃

  • How does this work outside CI? Are we asking users to run this before using DVCLive?

Footnotes

  1. As the DVC team may know, a file-based API needs synchronization to avoid all sorts of pitfalls; e.g. like lost events, rate limits, corrupted files... et cetera. This makes me question whether we should follow this approach or not. 2

@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 27, 2022 02:31 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 27, 2022 03:00 Inactive
@casperdcl
Copy link
Contributor

btw shouldn't cml publish --update be a prerequisite (i.e. overwriting existing files on assets.cml.dev)?

@0x2b3bfa0
Copy link
Member Author

0x2b3bfa0 commented May 27, 2022

@casperdcl, cml publish --update is not necessary, because storage is content-addressed.

$ echo one > file
$ cml publish file
https://host/4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865
$ echo two > file
$ cml publish file
https://host/53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3
$ echo two > file
$ cml publish file # same CONTENT, thus same URL
https://host/53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3

Location-addressed storage is probably what you’re looking for with cml publish --update

$ echo one > file
$ cml publish file
https://host/file
$ echo two > file
$ cml publish file
https://host/file
$ echo three > file
$ cml publish --update file # same PATH, thus same URL
https://host/file

However, paths generated this second way are:

  • Insecure, i.e. easy to IDOR without authentication
  • Prone to undesirable collisions across users, repositories and workflow runs
  • Unable to bypass GitHub user content cache; it rewrites all the image source URLs to camo.githubusercontent.com1

Footnotes

  1. If dynamic status badges work despite this, it may be a non-issue (?)

@0x2b3bfa0
Copy link
Member Author

An intermediate solution to avoid clutter and preserve desirable properties of storage is:

  • Generate a UUIDv4 when the daemon starts
  • Use it as a prefix for location-based storage

@casperdcl
Copy link
Contributor

casperdcl commented May 27, 2022

I mean this:

$ echo one > file
$ ONE_URL=$(cml publish file)
$ curl -I $ONE_URL | head -n1
HTTP/1.1 200 OK
$ echo two > file
$ TWO_URL=$(cml publish file)
warn: CML detected subsequent publish of same filename in same session. Deleting old file.
$ curl -I $TWO_URL | head -n1
HTTP/1.1 200 OK
$ curl -I $ONE_URL | head -n1
HTTP/1.1 404 Not Found

To avoid us hosting 1M images per training run.

@0x2b3bfa0
Copy link
Member Author

CML detected subsequent publish of same filename in same session

Sounds like #1026 (comment)?

@0x2b3bfa0
Copy link
Member Author

Related to iterative/dvclive#91 (comment)

@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 29, 2022 23:32 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 29, 2022 23:35 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 30d35c0 to eaf33bc Compare May 30, 2022 00:00
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 00:00 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from eaf33bc to 85374e4 Compare May 30, 2022 00:19
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 00:19 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 85374e4 to 88d15df Compare May 30, 2022 01:46
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 01:46 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 88d15df to 869ad75 Compare May 30, 2022 02:06
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 02:06 Inactive
@0x2b3bfa0
Copy link
Member Author

0x2b3bfa0 commented May 30, 2022

After #1026 (comment) and some other limitations1 of the file event interface, I wonder if this is the right approach to integrate CML with other tools. 🤔

Footnotes

  1. Namely, the inability of synchronizing events with rate limits, and the issues with corrupted files during write.

@casperdcl casperdcl mentioned this pull request May 30, 2022
7 tasks
@casperdcl casperdcl added cml-publish Subcommand cml-comment Subcommand labels May 30, 2022
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 12, 2022 20:01 Inactive
dacbd
dacbd previously approved these changes Jun 12, 2022
Copy link
Contributor

@dacbd dacbd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, functionality I'll defer to others. 😁 🥼

@0x2b3bfa0 0x2b3bfa0 linked an issue Jun 13, 2022 that may be closed by this pull request
7 tasks
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 00:46 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 763b4ff to 484143d Compare June 14, 2022 00:50
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 00:50 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 21:38 Inactive
@0x2b3bfa0
Copy link
Member Author

@iterative/cml, shall we merge this?

@DavidGOrtega DavidGOrtega temporarily deployed to internal June 15, 2022 18:21 Inactive
@0x2b3bfa0 0x2b3bfa0 mentioned this pull request Jun 15, 2022
1 task
},
triggerFile: {
type: 'string',
description: 'File used to trigger the watcher',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trigger how?

Suggested change
description: 'File used to trigger the watcher',
description: 'If specified, --watch will trigger only when the lockfile is present, and will delete the lockfile',

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly as stated in the “using a trigger file” section on #1026 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to describe in a help text. 😅

  • Takes a path to a regular file that doesn't exist yet
  • Creating a regular file in that path triggers an --update
  • Once the --update finishes, the file gets automatically deleted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: 'File used to trigger the watcher',
description: 'Path to the watcher trigger; create a file on that path to triger an update, then wait until the file disappears',

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to do #762 before rewording this again :)

Copy link
Contributor

@casperdcl casperdcl Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update: If you'd prefer to do #762 immediately after this PR please do feel free to leave this unresolved here @0x2b3bfa0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're happy to see #1073 merged immediately after this pull request, fine by me. I would push a new commit to #1073 with unified periods and (perhaps) some better descriptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is a hidden option and we don't know if DVCLive is going to use it, I'd rather not care too much about rewording (?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolving unresolved

Copy link
Contributor

@casperdcl casperdcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only had a super quick glance, just checking - is this "solution to avoid clutter and preserve desirable properties of storage" or equivalent implemented?

@0x2b3bfa0
Copy link
Member Author

only had a super quick glance, just checking - is this "solution to avoid clutter and preserve desirable properties of storage" or equivalent implemented?

yes, yes, YES

@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 23, 2022 14:03 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cml-comment Subcommand cml-publish Subcommand enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

epic: CML <> DVCLive

7 participants