frankmcsherry
Member

This is a cleaned-up version of #326 which does the same thing, but removes some unintended noise from related changes.

The change here is to properly populate log messages for progress statements. This has the potential to be very expensive, and we may quickly return here to put this under a separate logger. The main risk is that there is no data reduction going on, so we are logging something proportional to the volume of progress messages rather than a summary of how much work we are doing. In addition, timestamps are converted to strings from their native types (which are often non-allocating), which could be another source of overhead.

This is potentially mitigated by the recent change of the default progress tracking mode to the demand variant, which reduces the risk of runaway volumes of progress information.

In principle, this and #321 could log into a different logger that could be enabled or disabled as appropriate. I think no one was using these messages, as they weren't properly populated, so while it would be technically breaking to do that, I don't think the fix would be more than changing enums around to not reference this variant.

@frankmcsherry
Member Author

For example, I think I'm a bit more comfortable with the following diff, which allows a user to override logging of progress events by specifying a new destination (which could just discard the logged records; I suppose in that case we might want a clearer way to "disable" to prevent the formation of the events in the first place).

--- a/timely/src/progress/broadcast.rs
+++ b/timely/src/progress/broadcast.rs
@@ -38,6 +38,13 @@ impl<T:Timestamp+Send> Progcaster<T> {
             identifier: channel_identifier,
             kind: crate::logging::CommChannelKind::Progress,
         }));
+
+        // If progress logging is enabled, route messages there instead of to
+        // the general event logging stream.
+        if let Some(progress) = worker.log_register().get::<crate::logging::TimelyEvent>("timely/progress") {
+            logging = Some(progress);
+        }
+
         let worker_index = worker.index();
         let addr = path.clone();
         Progcaster {

This would allow folks to opt out of feeding the events to the logging stream, but not opt out of the cost of generating the events.
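To make the distinction concrete, here is a minimal std-only sketch of what opting out of the generation cost might look like: the event is only built when a logger is actually present. `ProgressEvent`, `Logger`, and `log_progress` are hypothetical stand-ins for illustration, not timely's actual types, and the string rendering of timestamps mirrors the behaviour described earlier in this PR.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a progress event; not timely's actual type.
#[derive(Debug, Clone, PartialEq)]
pub struct ProgressEvent {
    pub is_send: bool,
    /// (port, timestamp rendered as a string, delta)
    pub updates: Vec<(usize, String, i64)>,
}

/// Hypothetical logger handle that appends events to a shared buffer.
#[derive(Clone, Default)]
pub struct Logger {
    buffer: Rc<RefCell<Vec<ProgressEvent>>>,
}

impl Logger {
    pub fn log(&self, event: ProgressEvent) {
        self.buffer.borrow_mut().push(event);
    }

    /// Number of events logged so far.
    pub fn len(&self) -> usize {
        self.buffer.borrow().len()
    }
}

/// Builds and logs an event only when a logger is present.
/// Returns whether the event was constructed at all.
pub fn log_progress(logger: Option<&Logger>, changes: &[(usize, u64, i64)]) -> bool {
    if let Some(logger) = logger {
        // The allocation and string formatting happen only on this branch.
        let updates = changes
            .iter()
            .map(|&(port, time, delta)| (port, format!("{:?}", time), delta))
            .collect();
        logger.log(ProgressEvent { is_send: true, updates });
        true
    } else {
        // Disabled: no event is formed, so nothing is allocated.
        false
    }
}
```

With `None` passed as the logger, the function returns without touching the changes, which is the "disabled" behaviour the diff above does not yet provide.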

Ignoring ergonomics, the logging options may look a fair bit like the recently added Config stuff, which maps strings to Box<Any> types, which can then be downcast to e.g. Logger types among others. That would allow more general instructions, like expecting an Option<Logger> to signify either a new destination or an explicit nothing to avoid producing the event (i.e. a disabled logger). It would be great to avoid a random global stash of crud, but it would also be nice to be able to easily avoid expensive logging.
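A rough std-only sketch of that idea, assuming a string-keyed map of type-erased values: looking up an `Option<Logger>` distinguishes "nothing registered", "explicitly disabled", and "send it here". `Registry`, `Logger`, and `Destination` are hypothetical names for illustration and are not the Config API itself; only the `"timely/progress"` key is taken from the diff above.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical logger; stands in for a real logger handle.
#[derive(Debug, Clone, PartialEq)]
pub struct Logger {
    pub name: String,
}

/// What a lookup by name can tell the caller.
#[derive(Debug, PartialEq)]
pub enum Destination {
    /// Nothing usable registered: fall back to the default stream.
    Default,
    /// Explicitly disabled: do not even build the events.
    Disabled,
    /// Route events to this logger.
    Override(Logger),
}

/// Hypothetical string-keyed store of type-erased values.
#[derive(Default)]
pub struct Registry {
    entries: HashMap<String, Box<dyn Any>>,
}

impl Registry {
    /// Stores any `'static` value under `key`, replacing a previous entry.
    pub fn set<V: Any>(&mut self, key: &str, value: V) {
        self.entries.insert(key.to_owned(), Box::new(value));
    }

    /// Interprets the entry under `key` as an `Option<Logger>`.
    /// A missing entry, or one of a different type, yields `Default`.
    pub fn logger(&self, key: &str) -> Destination {
        match self
            .entries
            .get(key)
            .and_then(|value| value.downcast_ref::<Option<Logger>>())
        {
            None => Destination::Default,
            Some(None) => Destination::Disabled,
            Some(Some(logger)) => Destination::Override(logger.clone()),
        }
    }
}
```

Registering `None::<Logger>` under `"timely/progress"` would then signal that progress events should not be produced at all, while `Some(logger)` redirects them. Whether a mistyped entry should silently fall back to the default, as it does here, or be reported is a design choice this sketch does not settle.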

… for timestamps (#353)

* Separate the progress logging stream, use dyn trait instead of String for timestamps

* Remove the serialization machinery from progress logging, provide dynamic type information instead

* Add example for progress logging

* Always box logging progress vectors on construction

* Explain why we need the `ProgressEventTimestampVec` trait
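
As a rough illustration of the approach these commits describe, the std-only sketch below boxes a whole vector of updates once and exposes the timestamps as trait objects, so nothing is formatted to a `String` up front. The names `ErasedTimestamp`, `ErasedUpdates`, `make_event`, and `u64_times` are hypothetical, and the signatures are assumptions rather than the ones merged here. A second trait for the vector is needed in this sketch because a `Vec<(usize, T, i64)>` cannot be viewed as a vector of trait objects without boxing every element.

```rust
use std::any::Any;
use std::fmt::Debug;

/// Hypothetical type-erased view of a single timestamp.
pub trait ErasedTimestamp: Debug + Any {
    /// Upcast so callers can `downcast_ref` to the concrete type.
    fn as_any(&self) -> &dyn Any;
    /// Name of the concrete type, as dynamic type information.
    fn type_name(&self) -> &'static str;
}

impl<T: Debug + Any> ErasedTimestamp for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn type_name(&self) -> &'static str {
        std::any::type_name::<T>()
    }
}

/// Hypothetical type-erased view of a vector of (port, time, delta) updates.
pub trait ErasedUpdates: Debug {
    fn updates<'a>(
        &'a self,
    ) -> Box<dyn Iterator<Item = (usize, &'a dyn ErasedTimestamp, i64)> + 'a>;
}

impl<T: Debug + Any> ErasedUpdates for Vec<(usize, T, i64)> {
    fn updates<'a>(
        &'a self,
    ) -> Box<dyn Iterator<Item = (usize, &'a dyn ErasedTimestamp, i64)> + 'a> {
        // Borrow each timestamp as a trait object; no per-element allocation.
        Box::new(
            self.iter()
                .map(|(port, time, delta)| (*port, time as &dyn ErasedTimestamp, *delta)),
        )
    }
}

/// Hypothetical progress event holding its updates behind a single box.
#[derive(Debug)]
pub struct ProgressEvent {
    pub is_send: bool,
    pub messages: Box<dyn ErasedUpdates>,
}

/// Boxes the vector once on construction.
pub fn make_event<T: Debug + Any>(is_send: bool, updates: Vec<(usize, T, i64)>) -> ProgressEvent {
    ProgressEvent { is_send, messages: Box::new(updates) }
}

/// Recovers `u64` timestamps from an event, skipping any other type.
pub fn u64_times(event: &ProgressEvent) -> Vec<u64> {
    event
        .messages
        .updates()
        .filter_map(|(_, time, _)| time.as_any().downcast_ref::<u64>().copied())
        .collect()
}
```

A consumer that knows the timestamp type can downcast as in `u64_times`; one that does not can still fall back to the `Debug` output or the type name.
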
@frankmcsherry
Member Author

Okies, I think technically this bounces back to @utaal for thoughts, but as most of the PR is now his maybe there isn't much to say.

@frankmcsherry frankmcsherry merged commit 392a476 into master Jan 22, 2021
@github-actions github-actions bot mentioned this pull request Oct 29, 2024
@antiguru antiguru deleted the log_progress_accurately branch October 29, 2024 19:14