Skip to content

Conversation

@utaal
Copy link
Member

@utaal utaal commented Jan 19, 2021

This only supports cfg(not(feature = "bincode")), bincode support upcoming.
Currently based on the log_progress_accurately branch.

@frankmcsherry

Comment on lines 77 to 78
pub trait ProgressEventTimestamp: std::fmt::Debug + std::fmt::Display + std::any::Any {
/// Encodes a typed reference into a binary buffer.
Copy link
Member

@frankmcsherry frankmcsherry Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check, Timestamp has already Debug+Any+Abomonation/Serialize. Is the motivation for the trait to add Display, or because the communication::Data trait isn't actually helpful enough?

Copy link
Member Author

@utaal utaal Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s because Data can’t be turned into a dyn trait, because entomb takes a type parameter. Should probably add a comment to remind future readers.

@frankmcsherry
Copy link
Member

Thanks very much!

I think this looks good, though I have to admit not knowing about ease/pain of logging out. Strings look pretty good, but other formats could be hard and I wouldn't know. This does the good things (imo) of extracting the progress logging to a new stream so that it can be turned off cheaply. I had a comment or two about the rationale for some traits, not because I doubt the rationale but more to write it down so that in a few months I can remember why it wasn't another way. :)

@utaal utaal force-pushed the log_progress_accurately branch from d3aec0d to 08964d8 Compare January 21, 2021 08:26
@utaal utaal marked this pull request as ready for review January 21, 2021 08:26
@utaal
Copy link
Member Author

utaal commented Jan 21, 2021

I changed things to be serialization-agnostic, but to still offer the option of recovering the real timestamp type in the registered log function. This way, someone who needs detailed, typed information, can downcast to the concrete types of the timestamp present in the dataflow.

I also actually sent native types and not formatted strings, which I had failed to do previously (in Progcaster).


#[derive(Serialize, Deserialize, Abomonation, Debug, Clone, Hash, Eq, PartialEq, Ord, PartialOrd)]
/// Encapsulates Any and Debug for dynamically typed timestamps in logs
pub trait ProgressEventTimestamp: std::fmt::Debug + std::any::Any {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random comment, not a change but: I bet this trait eventually becomes useful beyond progress events. At least, assuming there is any other reason to report information about timestamps (e.g., maybe other timestamp things, I dunno) they'll probably make use of this too. Again: nothing to change, just thinking out loud.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I thought the same when I wrote it, but it seemed unnecessary to push it to a separate module at this stage.

fn type_name(&self) -> &'static str { std::any::type_name::<T>() }
}

/// A vector of progress updates in logs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but this would be a great place to explain the need for a trait (vs a type or a struct that matches the same description).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the doc in the latest commit.

Comment on lines +67 to +68
let mut messages = Box::new(Vec::with_capacity(changes.len()));
let mut internal = Box::new(Vec::with_capacity(changes.len()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm super confused that this works. I would have thought that as soon as this is boxed up you lose access to the push methods. Or is it because there is an auto-conversion from Box<T> to Box<dyn Trait> for any T: Trait? No change needed, just surprising to me!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe push works because of https://doc.rust-lang.org/std/boxed/struct.Box.html#impl-Deref (Deref impl for Box). And yes, you can upcast a Box<T> to a Box<dyn Trait> for the same reason you can upcast a &T to a &dyn Trait for T: Trait.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, the type of messages and internal is still the concrete, non-dyn vector type. The upcast happens on assignment to the target struct below.

Comment on lines 157 to 158
messages: Box::new(messages),
internal: Box::new(internal),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is what I thought you would need to do. Is there a reason this is needed here but not previously? If so I think we should doc it, and if not we should unify up the two cases to look the same (maybe just aesthetics, but also to avoid randoms (myself) asking this question in the future).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The asymmetry was unintentional (and I guess a potential perf defect). Fixed in the last commit.

Copy link
Member

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good. I have some questions, but none of them need to block merging this in to the PR (and potentially then merging the PR). I'll not merge right away in case any of the comments seem actionable to you, but otherwise I'm happy to do it now!

@utaal
Copy link
Member Author

utaal commented Jan 21, 2021

I believe I addressed the comments: so go ahead and merge if the latest changes look good! Thanks for the review.

@frankmcsherry frankmcsherry merged commit 009a178 into TimelyDataflow:log_progress_accurately Jan 22, 2021
@utaal utaal deleted the log_progress_accurately branch January 22, 2021 06:45
frankmcsherry added a commit that referenced this pull request Jan 22, 2021
* Actually log progress updates

* remove commented code, improve comments

* Separate the progress logging stream, use dyn trait instead of String for timestamps (#353)

* Separate the progress logging stream, use dyn trait instead of String for timestamps

* Remove the serialization machinery from progress logging, provide dynamic type information instead

* Add example for progress logging

* Always box logging progress vectors on construction

* Explain why we need the `ProgressEventTimestampVec` trait

Co-authored-by: Andrea Lattuada <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants