Skip to content

Conversation

@scoopr
Copy link
Contributor

@scoopr scoopr commented Sep 15, 2020

This is wip, mostly a conversation starter.

I've attempted to have some labels show up in the errors, outside of only creation errors.
Steps to do that include,

  1. Retain the label in the underlying struct (here, Buffer). It is guarded by debug_assertions
  2. Add the label option to the error variant, and show it
  3. Add label option to the Element::Error variant, so that invalid ids retain the label also. Didn't guard it with debug_assertions yet, but probably should. buffer_error now takes the label option.
  4. Added a query for invalid id labels
  5. Add Global::buffer_label for returning the label from the struct, or the invalid id.
  6. Change the error variants to query the buffer_label when creating the error

with that (and little bit of fudging to get in to the error state), I can get

wgpu error: Validation error

Caused by:
    In CommandEncoder::copy_buffer_to_buffer
    buffer (0, 1, Metal) (label: Some("Staging buffer")) is invalid

instead of

wgpu error: Validation error

Caused by:
    In CommandEncoder::copy_buffer_to_buffer
    buffer (0, 1, Metal) is invalid

Some of the hub handling etc. is a bit iffy for me, I really have no idea what I'm doing.

The point would be to massage this to the point that there would be a simple steps to follow on how to decorate all the errors with labels to whatever they refer to.

Other ideas I had, included trying to have impl Debug for Id<BufferId> to print out the label (from the struct, wouldn't work for invalid ids), but in that context I don't think I have access to the Global, so I'm not sure how resolve the id to anything useful. I suppose there could a whole separate singleton storage for id to label maps which could be easier to access in any context. Then the errors could just store the Id, and not have a separate label field.

Copy link
Contributor

@monocodus monocodus bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an autogenerated code review.

Checker summary (by rust_clippy):
The tool has found 0 warnings, 1 errors.

@kvark
Copy link
Member

kvark commented Sep 16, 2020

Thank you for the PR!

What was the original motivation for this? What problem are you solving? Just trying to get better error messages?

What is the end goal that you see in this direction? Every error on any object carrying around the label strings?

@scoopr
Copy link
Contributor Author

scoopr commented Sep 16, 2020

Yes, in the rust community it is pretty easy to get spoiled by really good error messages, I think. And having an error message that pinpoints the whole context is, I think, such a breath of fresh air after years of looking at GL_INVALID_OPERATION.

Showing the label of all the involved resources would put it really far in that direction. I haven't researched what different errors would look like before/after, but if wgpu knows the context of an error, it would be really nice to show it. Especially in debug builds.

I would rather have the errors have the ids of the resources than the label itself, there would be less cloning of the label, and most of the work would just be in Error Display time. And it would give the possibility of showing even more context of the resource, if useful.

@kvark
Copy link
Member

kvark commented Sep 16, 2020

Thank you for explanation!

I would rather have the errors have the ids of the resources than the label itself

This appears to not be possible, since we need the global context in order to grab the labels.

What I was thinking, originally, is that: if you see an error that is confusing, you could record an API trace. Then you'd be able to see what descriptors (including the label, but also other useful things) associated with any IDs mentioned in the error messages.

Showing the label of all the involved resources would put it really far in that direction

With that other workflow in mind (using an API trace), my concern was whether labels are going to be enough. If we'd then decide to go all the way and basically provide the full descriptor (passed to resource creation), then it would seem inferior to just making the users to record API traces. If, on the other hand, labels is all we need, then I think it's a good improvement, assuming it doesn't add too much code complexity and maintenance load.

TL;DR: I think the feature is useful. We should probably just do a String instead of Option<String>, leaving it empty if there is no label.

@scoopr
Copy link
Contributor Author

scoopr commented Sep 16, 2020

I removed the Option.

I also tried to use the cfg(debug_assertions) more carefully, though I'm not totally sure about how I feel about it. Retaining the labels in release seem certainly superfluous for wgpu-rs applications. For browsers it would certainly be compiled with release, so missing them there would be kind of odd – and then the error messages claiming an empty string as labels suddenly become more confusing than helpful.

Oh, and the _error methods are going to have an api change as they accept the label, but this is still implementing it only for buffer_error.

I'm a bit annoyed that I'm again making the error generation more verbose, and adding those labels to all kinds of error variants all over is going to be a bit messy and annoying work.
I kinda wish I could augment the errors with invalid_buffer_error.add_note("Source buffer label {}",label) or some such. Might read nicely in the error message if they would be formatted like the compiler notes. Then the errors themselves wouldn't carry the label. So yeah, I'm a bit concerned about the maintenance burden as well.

What I was thinking, originally, is that: if you see an error that is confusing, you could record an API trace.

For complex cases, you may be right, that reaching for other tools might be better. Focusing on making the simple cases crystal clear is certainly enough. There is value to have the simpler error explained in-situ, without trying to reproduce with tools.

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this some more, I believe this logic should (once again!) land in wgpu-rs. It can resolve all the errors and replace the IDs (which are totally useless for wgpu-rs consumer) to the labels.

@kvark
Copy link
Member

kvark commented Sep 17, 2020

Notably, the public methods like buffer_label that this PR has would be required for these wgpu-rs changes. Also, it may be useful to make the error types generic over the indices, so that we could replace them by labels in wgpu-rs.

@kvark
Copy link
Member

kvark commented Nov 2, 2020

Heads up - this PR is still very much needed, just focused on bringing the logic on wgpu-rs side.

@kvark
Copy link
Member

kvark commented Nov 3, 2020

Here is a more concrete proposal. First of all, the goal we are pursuing here is to have more descriptive messages. So in a sense you don't actually need the proper error structs with String in them. You just need a resulting error message string.

I think we can do this by just being smart in the error handler. For example, transfer operations in wgpu-rs could use the following code to produce an error string:

pub trait PrettyError {
    fn fmt_pretty(&self, f: &mut fmt::Formatter<'_>, context: &Context) -> Result<(), fmt::Error>;
}

impl PrettyError for wgc::command::TransferError {
    fn fmt_pretty(&self, f: &mut fmt::Formatter<'_>, context: &Context) -> Result<(), fmt::Error> {
        use wgc::command::TransferError as Te;
        match *self {
            Te::InvalidBuffer(id) => {
                let name = context.0.buffer_label(id);
                f.debug_tuple("InvalidBuffer").field(&name).finish()
            }
            Te::InvalidTexture(id) => {
                let name = context.0.texture_label(id);
                f.debug_tuple("InvalidTexture").field(&name).finish()
            }
            _ => self.fmt(f), // relies on the classy `Debug` implementation
        }
    }
}

As you can see, it skips all the boilerplate for variants that don't contain IDs. It takes care of the ID variants only, and that boilerplate turns out to be pretty light. We'll need that trait implemented for all the errors that we are currently unwrapping with PrettyResult. The PrettyResult implementation would need to be adjusted to use this new fmt_pretty.

And of course, we'd need those buffer_label, texture_label, etc, in wgpu-core in order to query this information back.

@scoopr
Copy link
Contributor Author

scoopr commented Nov 9, 2020

Ahh, but I need to traverse the errors.. by going through err.source(), which I would then have to downcast_ref to the PrettyError, but you can't downcast to a trait, but only to the concrete type? .. which would mean in wgpu-rs there would be this massive downcast attempt to any known type that implements PrettyError?

With matching concrete type, I did manage to make it work for a single instance!

wgpu error: Validation Error

Caused by:
    In CommandEncoder::copy_buffer_to_buffer
    Copy error
    destination buffer/texture is missing the `COPY_DST` usage flag
      destination label=staging_buffer

For that I added the buffer id to TransferError::MissingCopyDstUsageFlag, and matched that in the fmt_pretty. Also I couldn't easily use the std::fmt::Formatter machinery, so the fmt_pretty returns a String currently.

And this of course means that the wgpu::Error needs to carry the formatted String around, and I also some of the indenting is now applied all over the place .

@scoopr
Copy link
Contributor Author

scoopr commented Nov 11, 2020

The latest commits makes this pull just about retaining the labels and related things. The formatting would then happen in in wgpu-rs side, with gfx-rs/wgpu-rs#617

#[allow(unused_variables)]
match buffer_guard.get(buffer_id) {
Ok(buffer) => {
let buffer = buffer; // silence warning
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't we already allowing unused variables?

Ok(buffer) => {
let buffer = buffer; // silence warning
#[cfg(debug_assertions)]
return buffer.label.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't need return


pub(crate) fn label_for_invalid_id(&self, id: I) -> &str {
let (index, _, _) = id.unzip();
match &self.map[index as usize] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not match on a reference, please

Vacant,
Occupied(T, Epoch),
#[cfg(debug_assertions)]
Error(Epoch, String),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can carry String unconditionally here. If it's empty, it doesn't cost much, and the Occupied already takes more size, so adding String to Error variant wouldn't change the total size of the enum.

impl<B: hal::Backend> Access<RenderPipeline<B>> for Device<B> {}
impl<B: hal::Backend> Access<RenderPipeline<B>> for BindGroup<B> {}
impl<B: hal::Backend> Access<RenderPipeline<B>> for ComputePipeline<B> {}
impl<B: hal::Backend> Access<RenderPipeline<B>> for Root {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's not modify this list please. Until we get a stronger mean of protection, we should strive to keep this minimal.
For your purposes, all of these things (RenderPipeline, Buffer, Texture) are accessible in the Device context. So you can lock the device guard for reading first, and then get any of these.

@scoopr scoopr force-pushed the retained-labels branch 2 times, most recently from 626b0fb to e5a2e08 Compare November 12, 2020 21:42
@scoopr
Copy link
Contributor Author

scoopr commented Nov 12, 2020

Alright, I organized some stuff differently, in attempt to DRY it a bit. I added the label_for_resource to Registry, but that meant it needed to have a trait to ask the label from the type, so that what HasLabel is.

It kind of goes around the Token system, I'm not entirely sure how important that is here, but it would of had to make some custom token with all access anyway, as I understood it.

@scoopr
Copy link
Contributor Author

scoopr commented Nov 12, 2020

Implemented retaining of the labels for the rest of the types.
I can squash them if needed

pub(crate) bind_group_layout_ids: ArrayVec<[Valid<BindGroupLayoutId>; MAX_BIND_GROUPS]>,
pub(crate) push_constant_ranges: ArrayVec<[wgt::PushConstantRange; SHADER_STAGE_COUNT]>,
#[cfg(debug_assertions)]
pub(crate) label: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering... if this is on all objects, and life_guard is in all objects, can we put the label inside it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least I can't see it on BindGroupLayout or CommandBuffer. But I guess it could be moved for the ones that do have it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It looks like all the things that have life_guard also need the label, so we should move it in.
For others few we can have it aside, like your PR does now.

@scoopr
Copy link
Contributor Author

scoopr commented Nov 13, 2020

Moved some labels to LifeGuard, hid some repeating code behind LabelHelpers.

@scoopr
Copy link
Contributor Author

scoopr commented Nov 13, 2020

Oh, also, I didn't yet update my -rs side locally, so the latest hasn't seen as much testing, but It Compiles(tm)

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here! I have a few suggestions on how we can make it better, and avoid repeating ourselves in the codebase.

};

#[allow(unused_imports)]
use crate::LabelHelpers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

}

pub(crate) fn insert_error(&mut self, id: I) {
#[allow(unused_variables)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it need unused variables?

Some(value)
}
Element::Error(_) => None,
Element::Error(_, ..) => None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh, that's a weird thing, why not just Error(..)?

ref_count: Option<RefCount>,
submission_index: AtomicUsize,
#[cfg(debug_assertions)]
pub(crate) label: Label<'static>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to not make it a String?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yeah, since I noticed this Label typed used elsewhere, I figured might as well use it here too. The main benefit is that some of the internal labels don't need to be constructed as String, but can utilize the Cow<'static, str> here, for example in swap_chain_get_current_texture_view the label for the TextureView.

If it made a heap allocation here, it would be tempting to leave it be a empty string (which I think doesn't heap allocate). But it would actually be nice to have the internally created resource to be labeled too.

But this is a tiny thing, I'm fine plain 'ol String as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to deal with swap_chain_get_current_texture_view later. For now, just make it something like

if cfg!(debug_assertions) { "swapchain".to_string() } else { String::new() }

fn get_surface_mut(surface: &mut Surface) -> &mut Self::Surface;
}

pub trait HasLabel {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that this trait repeats logic that we pretty much have already in the code.

First, the label thing. We have Borrow<RefCount> implemented for all the resources, and all of them just return &life_guard.ref_count. So we should instead probably do something like:

trait Resource {
  const TYPE: &'static str;
  fn life_guard(&self) -> &LifeGuard;
}

We'd then use this trait in the state tracker instead of relying on Borrow<RefCount>.

Secondly, the "type" thing is already specified in Hub::new function when registries are created. We should unite the two, possibly by having something like this:

impl<T: Resource, I: TypedId, F: IdentityHandlerFactory<I>> Registry<T, I, F> {
    fn new_resource(backend: Backend, factory: &F) -> Self {
      Self::new(backend, factory, T::TYPE)
    }
}

And then call Registry::new_resource instead of Registry::new for everything that implements Resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are few of them that don't have LifeGuard, quickly glancing at least CommandBuffer and BindGroupLayout.

I'm not sure if there is really a semantic relation with LifeGuard and having a label, or is it just something convenient many of them have and could be coerced to function as something common for all of them? Might just be a naming thing.

The type name makes sense though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just convenience, I suppose. I don't feel strongly about this. If you feel that it shouldn't be a part of LifeGuard, feel free to discard that idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, didn't notice that the kind str is already given at Registry::new, should I just ask it from the Storage?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's one way to reduce duplication, I suppose.
What I suggested above was to use the Resource::TYPE to give it to the storage, instead

@scoopr
Copy link
Contributor Author

scoopr commented Nov 17, 2020

Fixed the stupid mistakes, but did not yet touch the haslabel/lifeguard/typename thing.

@scoopr
Copy link
Contributor Author

scoopr commented Nov 17, 2020

I tried the Resource trait, I think it turned out ok, I use the life_guard() of it for the default impl for label() but it can be then implemented when there is no life_guard to be had.

It wasn't obvious to me how the Borrow<RefCount> should be replaced, but I didn't try to dig too deeply.

I should probably prepare a PR in wgpu-rs that just does the minimal braking change update (the labels for the error-ids), and handle the error message notes separately

@scoopr
Copy link
Contributor Author

scoopr commented Nov 18, 2020

A toned down the use of Label in preference to plain 'ol String. What do you think now?

@scoopr
Copy link
Contributor Author

scoopr commented Nov 18, 2020

Made the *_error methods take Option<&str>. I noticed while looking at the wgpu-rs side, that the input is almost always like that, and the call sites looked silly doing conversions to Option<String>.

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It's looking great, I think we are almost done.
There is a few more suggestions, plus we'd need this squashed for landing.

pub type Label<'a> = Option<Cow<'a, str>>;

trait LabelHelpers<'a> {
fn new_borrowed(s: &'a str) -> Label<'a>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it's used only once, maybe we don't need this helper?

impl LifeGuard {
fn new() -> Self {
#[allow(unused_variables)]
fn new(label: String) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead do something like this:

pub fn new(label: &str) -> Self {
// something under `cfg(debug_assertions)` inside
}

This would allow all the call sites to not care about debug_assertions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, how does the callsite here care about debug_assertions?

It could be &str though I guess

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misread. It will only affect the call sites with static strings, like

            #[cfg(debug_assertions)]
            life_guard: LifeGuard::new("<device>".to_string()),
            #[cfg(not(debug_assertions))]
            life_guard: LifeGuard::new(String::new()),

@kvark kvark changed the title [wip] Show label of Buffer in error message Show label of Buffer in error message Nov 18, 2020
Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last note, and please squash!

},
life_guard: LifeGuard::new(),
#[cfg(debug_assertions)]
life_guard: LifeGuard::new("<SwapChain View>"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to split that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh duh, of course :)

queue_group,
life_guard: LifeGuard::new(),
#[cfg(debug_assertions)]
life_guard: LifeGuard::new("<device>"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to split that

limits,
life_guard: LifeGuard::new(),
#[cfg(debug_assertions)]
life_guard: LifeGuard::new("<Adapter>"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

let swap_chain = swap_chain::SwapChain {
life_guard: LifeGuard::new(),
#[cfg(debug_assertions)]
life_guard: LifeGuard::new("<SwapChain>"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

As a braking api change, adds also label for the error id generation,
for labeling invalid ids too.

Also adds query methods the label.
@scoopr
Copy link
Contributor Author

scoopr commented Nov 18, 2020

squash

@scoopr scoopr changed the title Show label of Buffer in error message Retain labels for resources Nov 18, 2020
@kvark
Copy link
Member

kvark commented Nov 18, 2020 via email

@bors
Copy link
Contributor

bors bot commented Nov 18, 2020

@bors bors bot merged commit 19b4ec5 into gfx-rs:master Nov 18, 2020
@scoopr scoopr deleted the retained-labels branch November 19, 2020 21:38
bors bot added a commit that referenced this pull request Nov 21, 2020
1036: Stop using Borrow<RefCount> r=scoopr a=kvark

**Connections**
Follow-up to #931

**Description**
Takes advantage of the new `trait Resource` in the state tracker, reducing the code.

**Testing**
Not tested outside of `cargo test`

Co-authored-by: Dzmitry Malyshau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants