
Conversation

Victoronz
Contributor

@Victoronz Victoronz commented Mar 19, 2025

Objective

Related to #16547.

Currently, we wrap HashSet/HashMap/IndexSet/IndexMap around the Entity + EntityHash combination. This allows them to be used with EntitySet functionality.
However, when Entity is newtyped, these wrappers can suddenly no longer be used!
This causes types like MainEntityHashSet/MainEntityHashMap to fall back to the plain hashbrown/indexmap versions again.

Solution

We allow keying our map and set wrappers with K: TrustedEntityBorrow types, not just Entity.

This requires a new trait to be sound: TrustedBuildHasher<H: Hash>: BuildHasher.
Because both a BuildHasher and the hashed Hash type could introduce nondeterminism into the final hash, we have to guarantee that such combinations produce valid results. Collisions are fine, since set and map types can deal with them; only nondeterministic hashes present an issue.

On account of our wrappers only using EntityHash, we need only bound them on EntityHash: TrustedBuildHasher<T>. Not all TrustedEntityBorrow types make sense to be hashed with EntityHash, however, so the only EntityHash: TrustedBuildHasher implementations for now are for Entity, entity newtypes, and the EntityRef family.
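Roughly, the new trait and the impls it gates look like this (a minimal sketch, not the exact code in this PR; the import path and safety wording are assumptions):

```rust
use core::hash::{BuildHasher, Hash};

use bevy_ecs::entity::{Entity, EntityHash}; // assumed import path

/// A BuildHasher that is trusted to hash values of `H` without
/// introducing nondeterminism. Collisions remain fine.
///
/// # Safety
/// Hashing any value of `H` with hashers from this builder must be
/// deterministic within a program execution.
pub unsafe trait TrustedBuildHasher<H: Hash>: BuildHasher {}

// EntityHash is only vouched for with Entity itself and with types that
// hash exactly like an Entity (entity newtypes, the EntityRef family).
unsafe impl TrustedBuildHasher<Entity> for EntityHash {}
```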

Adding a generic in place of Entity presents an issue for HashMap and IndexMap:
If we want to retain the main case of HashMap<Entity, V>, one would think to make Entity the default for K.
But if we were to do that, we'd need to swap the positions for the K and V generics, as defaulted type parameters need to trail.

Because map types with swapped K and V would be very misleading, we abstain from a defaulted generic, and use a type alias for the EntityHashMap/EntityIndexMap case instead.
Since a type alias and the aliased type need separate names, we name these new, broader versions EntityEquivalent* instead of just Entity*.
To keep the names consistent, we do the same for the set versions.
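Concretely, the alias approach looks something like this (a simplified sketch using the names above; the real definitions carry more impls, and TrustedBuildHasher is the trait sketched earlier):

```rust
use core::hash::Hash;

use bevy_ecs::entity::{Entity, EntityHash, TrustedEntityBorrow}; // assumed paths

// Broader wrapper keyed by any Entity-equivalent type K.
pub struct EntityEquivalentHashMap<K, V>
where
    K: TrustedEntityBorrow + Hash,
    EntityHash: TrustedBuildHasher<K>,
{
    inner: hashbrown::HashMap<K, V, EntityHash>,
}

// The common Entity case keeps the old, shorter name as an alias, so the
// key generic stays in front of the value generic and existing
// EntityHashMap<V> usage keeps compiling.
pub type EntityHashMap<V> = EntityEquivalentHashMap<Entity, V>;
```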

The name EntityEquivalent is more accurate to the actual meaning of the TrustedEntityBorrow trait, so it may be worth renaming it in another PR, alongside the currently little-used EntityBorrow. (EntityBorrow could get a separate name, so we would not need a Trusted prefix for EntityEquivalent.)

While the additions/deletions seem quite ominous, what this PR does is conceptually not that complex:

In hash.rs:

  • Add TrustedBuildHasher trait and its impls for Entity/EntityRef-family

In hash_set.rs/hash_map.rs/index_set.rs/index_map.rs:

  • Rename set/map type from Entity* to EntityEquivalent*
  • Make aliases with the old Entity* names
  • Replace Entity key with K generic
  • Bound K on TrustedEntityBorrow + Hash plus whatever specific impls require
  • Add where EntityHash: TrustedBuildHasher<K> bounds to type definitions and impl blocks

In sync_world.rs:

  • Add TrustedBuildHasher impls with MainEntity/RenderEntity
  • Make MainEntityHashSet/Map aliases of EntityEquivalentHashSet/Map respectively.

The rest:

  • Fix up some HashMap::default to MainEntityHashMap::default
  • Replace map/set imports
  • Adjust a mention of hash_set::Iter

It is painful to have wrapper types that take up this much space and should really not be necessary, but on the other hand, it strengthens our justification for requesting the required changes upstream (hashbrown, then indexmap)!

With future upstream changes, we'd ultimately only need to retain the TrustedBuildHasher trait, some aliases and the added bounds in some impls.
If we'd then like to keep the wrappers around, we still could as well, but they could be made a lot lighter.

Migration Guide

MainEntityHashSet/MainEntityHashMap are no longer aliases of the hashbrown types, so calls to associated functions must name the proper types or their aliases instead of HashMap. Example: HashMap::default -> MainEntityHashMap::default

Types associated with any of EntityHashSet, EntityHashMap, EntityIndexSet, EntityIndexMap now have an additional K generic. To maintain the previous meaning, use Entity for K.
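For illustration, the kind of change required (MyValue is a placeholder value type):

```rust
// Before: MainEntityHashMap was a plain hashbrown alias, so this compiled:
// let map: MainEntityHashMap<MyValue> = HashMap::default();

// After: name the wrapper type (or one of its aliases) directly:
let map: MainEntityHashMap<MyValue> = MainEntityHashMap::default();

// Where an associated type of EntityHashSet/EntityHashMap/... is named,
// the new K generic takes Entity to keep the previous meaning.
```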

@Victoronz Victoronz added A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Mar 19, 2025
@chescock
Contributor

This requires a new trait to be sound: TrustedBuildHasher<H: Hash>: BuildHasher. Because both a BuildHasher and the hashed Hash type could introduce nondeterminism into the final hash, we have to guarantee that such combinations produce valid results. Collisions are fine, since set and map types can deal with them; only nondeterministic hashes present an issue.

On account of our wrappers only using EntityHash, we need only bound them on EntityHash: TrustedBuildHasher<T>. Not all TrustedEntityBorrow types make sense to be hashed with EntityHash, however, so the only EntityHash: TrustedBuildHasher implementations for now are for Entity, entity newtypes, and the EntityRef family.

I don't understand the attack here. TrustedEntityBorrow requires that Hash "must be equivalent for Self and its underlying entity". Since hashing Entity with EntityHash is deterministic, wouldn't hashing any valid TrustedEntityBorrow with EntityHash also be deterministic? If we didn't have TrustedBuildHasher, how could a malicious user cause nondeterminism without violating the existing safety contract of TrustedEntityBorrow?

@Victoronz
Contributor Author

Victoronz commented Mar 20, 2025

This requires a new trait to be sound: TrustedBuildHasher<H: Hash>: BuildHasher. Because both a BuildHasher and the hashed Hash type could introduce nondeterminism into the final hash, we have to guarantee that such combinations produce valid results. Collisions are fine, since set and map types can deal with them; only nondeterministic hashes present an issue.
On account of our wrappers only using EntityHash, we need only bound them on EntityHash: TrustedBuildHasher<T>. Not all TrustedEntityBorrow types make sense to be hashed with EntityHash, however, so the only EntityHash: TrustedBuildHasher implementations for now are for Entity, entity newtypes, and the EntityRef family.

I don't understand the attack here. TrustedEntityBorrow requires that Hash "must be equivalent for Self and its underlying entity". Since hashing Entity with EntityHash is deterministic, wouldn't hashing any valid TrustedEntityBorrow with EntityHash also be deterministic? If we didn't have TrustedBuildHasher, how could a malicious user cause nondeterminism without violating the existing safety contract of TrustedEntityBorrow?

You're right, I didn't use the correct language here. For the current implementation, because Entity hashes deterministically, all TrustedEntityBorrow types must too. And because EntityHash does not introduce nondeterminism, the implementation is sound.

To be more accurate with my intent: "To ensure this implementation remains sound under future extensions/refactors, we introduce a new trait...".

It is feasible that we will want to extend the implementation to more hashers, whether to support TrustedEntityBorrow types which are not Entity values themselves (any pointer to Entity, f.e.), or to support BuildHashers with focuses other than performance, like cryptographic hashers.

Even aside from the soundness, we want to have a trait that limits BuildHasher/Hash combinations like this, because we only want to use TrustedEntityBorrow types that EntityHash is optimized to hash for.

@chescock
Contributor

To ensure this implementation remains sound under future extensions/refactors, we introduce a new trait...

Do you have concrete plans for those extensions? Otherwise it feels like premature generalization. If we don't need it yet, then there's no reason to build it yet!

Either way, I still don't see why TrustedBuildHasher needs to be parameterized by H: Hash. Any realistic BuildHasher is deterministic for any sequence of bytes. What sort of BuildHasher would be deterministic for some (deterministic) H: Hash but not another?

Even aside from the soundness, we want to have a trait that limits BuildHasher/Hash combinations like this, because we only want to use TrustedEntityBorrow types that EntityHash is optimized to hash for.

Why would we want to limit the combinations? As I said above, TrustedEntityBorrow requires that Hash "must be equivalent for Self and its underlying entity". How can there be a valid TrustedEntityBorrow that doesn't work with EntityHash?

@Victoronz
Contributor Author

Victoronz commented Mar 20, 2025

To ensure this implementation remains sound under future extensions/refactors, we introduce a new trait...

Do you have concrete plans for those extensions? Otherwise it feels like premature generalization. If we don't need it yet, then there's no reason to build it yet!

That might be the case; I did conflate the purposes of "must be a sound BuildHasher/Hash combination" with "should be a designed-for BuildHasher/Hash combination".

Either way, I still don't see why TrustedBuildHasher needs to be parameterized by H: Hash. Any realistic BuildHasher is deterministic for any sequence of bytes. What sort of BuildHasher would be deterministic for some (deterministic) H: Hash but not another?

The problem is that both the Hasher and Hash are in control of parts of the hashing process. Whether a BuildHasher is deterministic for any sequence of bytes does not matter if the same bytes for the same value aren't being fed in the first place.
A Hash impl could even mem::swap a Hasher out entirely if it wanted to.
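As a contrived illustration of how the Hash side alone can spoil determinism (hypothetical type, not from Bevy):

```rust
use std::hash::{Hash, Hasher};

struct Chaotic(u64);

impl Hash for Chaotic {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // The Hash impl decides what the Hasher sees. Feeding it the
        // address of a stack local makes the result vary between runs,
        // no matter how deterministic the BuildHasher itself is.
        let local = 0u8;
        (&local as *const u8 as usize).hash(state);
        self.0.hash(state);
    }
}
```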

Even aside from the soundness, we want to have a trait that limits BuildHasher/Hash combinations like this, because we only want to use TrustedEntityBorrow types that EntityHash is optimized to hash for.

Why would we want to limit the combinations? As I said above, TrustedEntityBorrow requires that Hash "must be equivalent for Self and its underlying entity". How can there be a valid TrustedEntityBorrow that doesn't work with EntityHash?

The intent is not for TrustedEntityBorrow to limit Hash impls to be equal to the Entity impl, but for the hashes to form the same total order as Entity!
The very purpose of EntityHash is to be performance optimized, and if we begin hashing pointers instead of literal Entity, we defeat the purpose.
This is one of the reasons why having more BuildHashers might make sense, so other TrustedEntityBorrow types could use their own dedicated hashers.

It seems we might want to expand on the TrustedEntityBorrow safety comment somewhat.
The steps toward concluding that EntityHash is sound with any TrustedEntityBorrow type are quite nebulous with the current documentation.

@chescock
Contributor

The intent is not for TrustedEntityBorrow to limit Hash impls to be equal to the Entity impl, but for the hashes to form the same total order as Entity!

Ah, this must be the miscommunication! I've been assuming it works the same way as Borrow, where they have to be equal so that you can get the hash from a &str to key a HashMap<String, T>.

I'd be in favor of changing that, if we can. I think it would just be a doc change, since all the existing impls do follow that. And those of us who don't read docs very carefully are likely to assume that TrustedEntityBorrow works the same as Borrow<Entity>, no matter what we write :).

The problem is that both the Hasher and Hash are in control of parts of the hashing process. Whether a BuildHasher is deterministic for any sequence of bytes does not matter if the same bytes for the same value aren't being fed in the first place.

Yup, I agree we need an unsafe trait for both parts! I just think we can make the requirements independent, so that we don't need to verify each combination. (And then, since this PR only uses a single BuildHasher type, we don't need TrustedBuildHasher just yet.)

@Victoronz
Contributor Author

The intent is not for TrustedEntityBorrow to limit Hash impls to be equal to the Entity impl, but for the hashes to form the same total order as Entity!

Ah, this must be the miscommunication! I've been assuming it works the same way as Borrow, where they have to be equal so that you can get the hash from a &str to key a HashMap<String, T>.

I'd be in favor of changing that, if we can. I think it would just be a doc change, since all the existing impls do follow that. And those of us who don't read docs very carefully are likely to assume that TrustedEntityBorrow works the same as Borrow<Entity>, no matter what we write :).

I made a rename/doc PR concerning this (and other doc/name issues) in #18470!

@Victoronz
Contributor Author

The problem is that both the Hasher and Hash are in control of parts of the hashing process. Whether a BuildHasher is deterministic for any sequence of bytes does not matter if the same bytes for the same value aren't being fed in the first place.

Yup, I agree we need an unsafe trait for both parts! I just think we can make the requirements independent, so that we don't need to verify each combination. (And then, since this PR only uses a single BuildHasher type, we don't need TrustedBuildHasher just yet.)

If we split the trait, we'd end up with this structure:

  • unsafe trait DeterministicBuildHasher
  • unsafe trait DeterministicHash
  • trait CorrectBuildHasher<H: DeterministicHash>: DeterministicBuildHasher
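A minimal sketch of that structure (hypothetical safety wording; not code from this PR):

```rust
use core::hash::{BuildHasher, Hash};

/// # Safety
/// Hashers produced by this builder must be deterministic for any
/// deterministic sequence of `write_*` calls.
pub unsafe trait DeterministicBuildHasher: BuildHasher {}

/// # Safety
/// `hash` must issue a deterministic sequence of `Hasher` calls for any
/// given value.
pub unsafe trait DeterministicHash: Hash {}

/// Safe "combination" layer: this builder is a designed-for choice for
/// hashing `H` (no panics, no pathological collisions).
pub trait CorrectBuildHasher<H: DeterministicHash>: DeterministicBuildHasher {}
```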

The BuildHasher/Hasher/Hash traits are quite a strange bit of design. While the Hasher API can handle just plain bytes, real BuildHasher/Hash combinations are not built to work in all cases. Some BuildHashers will panic when encountering any but their designed-for type, and some Hash impls will do their own thing, and still be deterministic.
Either side can have its own notion of "this is what hashing is for", which can cause pathological logic errors when mixed.

The only way to prevent this is to have a layer that accounts for the full picture, which means an impl on one side that carries a generic of the other. (Instead of requiring that a Hash type be aware of various BuildHashers that can be used to hash it, it makes more sense to require a BuildHasher to determine what types it should be used to hash for. A BuildHasher usually does not interact with non-hashing API, reducing the pain of newtyping by a lot.)

In the case of this PR, you are right that we can punt on DeterministicBuildHasher because we currently only use EntityHash, and DeterministicHash because TrustedEntityBorrow already ensures determinism. But that doesn't enable what you thought it would!

We still need to limit inputs to EntityHasher to only u64 fields, otherwise there will not just be hashes, but panics as well!
This is an aspect of the Hash impl that TrustedEntityBorrow does not, and should not, promise.

Meaning that while we can punt on the safety parts, the combination checking and consequent bounds on our set/map types will need to remain, just under a different name. :P

@Victoronz
Contributor Author

Victoronz commented Mar 22, 2025

In the case of this PR, you are right that we can punt on DeterministicBuildHasher because we currently only use EntityHash, and DeterministicHash because TrustedEntityBorrow already ensures determinism. But that doesn't enable what you thought it would!

I said here TrustedEntityBorrow ensures determinism, but I just realized that it does not do that, because the current wording doesn't take into account that the Hash impl alone cannot guarantee correct behavior, ugh.

While we might not need the traits themselves, we should still describe this situation in the safety comment of TrustedEntityBorrow, because we are otherwise requiring a promise that an implementer cannot make by themselves.

@chescock
Contributor

We still need to limit inputs to EntityHasher to only u64 fields, otherwise there will not just be hashes, but panics as well! This is an aspect of the Hash impl that TrustedEntityBorrow does not, and should not, promise.

Why do we need to limit the combinations? std::collections::HashMap doesn't.

The only safety requirement we have is that the hashing be deterministic. It's fine for hash() to panic, as long as it panics deterministically, since there's no way to get duplicates of a value that fails to hash.

I said here TrustedEntityBorrow ensures determinism, but I just realized that it does not do that, because the current wording doesn't take into account that the Hash impl alone cannot guarantee correct behavior, ugh.

Sure, it does. A deterministic Hash is one that makes a deterministic sequence of calls to the provided Hasher. The fact that the Hasher can do non-deterministic things doesn't change the fact that the Hash was deterministic.

Entity has a deterministic Hash, and TrustedEntityBorrow is required to have the same Hash behavior as Entity, so TrustedEntityBorrow is required to have a deterministic Hash.

@Victoronz
Contributor Author

Victoronz commented Mar 23, 2025

This is a really subtle discussion! It has design considerations on multiple levels, which I find can make it difficult to home in on where the different aspects come from. I hope the back and forth isn't getting frustrating!
That being said, I'll try and be more thorough:

We still need to limit inputs to EntityHasher to only u64 fields, otherwise there will not just be hashes, but panics as well! This is an aspect of the Hash impl that TrustedEntityBorrow does not, and should not, promise.

Why do we need to limit the combinations? std::collections::HashMap doesn't.

Safety-wise, panicking is completely fine!
This part is more about what an "EntityEquivalentHashMap" should be. To be a useful and compatible design, I think we should restrict it to use cases that make sense. Why does std not limit combinations? Because their DefaultHasher is intentionally as compatible as feasible!

First, let me paint a situation from a standpoint of compatibility:
We add an EntityEquivalentHashMap that has no restriction on T: TrustedEntityBorrow.
Some library/code comes along, and makes use of EntityEquivalentHashSet<&Entity>.
This code compiles perfectly fine! However, once somebody on a 32-bit platform comes along, their bevy program suddenly explodes.
That is because on 64-bit platforms, where pointers are u64s, EntityHash will not panic. On 32-bit platforms, where pointers are u32s, EntityHash is suddenly being fed something other than u64 and panics.
Note that this can occur with any type that has platform/feature-dependent fields (which most pointer types do).
If we allow all combinations, we will still have to list the types that should not be used, and have users manage this concern manually.

Let's have a look from a performance standpoint:
EntityHash is designed to hash Entity. Designing hash functions is complicated, and the more they can assume about their input, the better. As a consequence, if we then pass data that no longer follows those assumptions, we might run into some pathologically bad performance! An example is the recently created foldhash.
I remember some people running into issues with a large performance regression when foldhash was introduced as the default. IIRC, this turned out to be because they were hashing pointers, and in their case, those pointers followed a distribution that the hash function could not handle performantly, i.e. there were a lot of collisions, regressing performance to a large degree.

The problems that are created by incompatible Hasher/Hash combinations are difficult to identify, and are made up of a lot of not-easily-found details. How is a user/library author supposed to know/predict when a combination fails?
In the first situation I described, I talked about how a u64 type can succeed with EntityHash, and a u32 type doesn't.
However, you cannot actually tell from the type size whether EntityHash will panic. A Hash impl itself can make decisions about how it should be hashed, and pass less or more data than what the type actually consists of.

The only safety requirement we have is that the hashing be deterministic. It's fine for hash() to panic, as long as it panics deterministically, since there's no way to get duplicates of a value that fails to hash.

Yeah! Though I think even panicking non-deterministically is fine; "being deterministic" is a property only required when function execution does not diverge.

I said here TrustedEntityBorrow ensures determinism, but I just realized that it does not do that, because the current wording doesn't take into account that the Hash impl alone cannot guarantee correct behavior, ugh.

Sure, it does. A deterministic Hash is one that makes a deterministic sequence of calls to the provided Hasher. The fact that the Hasher can do non-deterministic things doesn't change the fact that the Hash was deterministic.

Entity has a deterministic Hash, and TrustedEntityBorrow is required to have the same Hash behavior as Entity, so TrustedEntityBorrow is required to have a deterministic Hash.

Calling Hash::hash on Entity is not deterministic for all Hashers. Even if Hash calls Hasher deterministically, what happens inside of Hash is reliant on different trait code, that of Hasher. And unless we bind the code of Hasher to a safety contract, the execution of a Hash impl cannot be relied upon as deterministic by unsafe code.

And because Hash of Entity is not deterministic, the Hash of any TrustedEntityBorrow type doesn't have to be deterministic either!
Were I to paraphrase what the current wording in the safety contract of TrustedEntityBorrow is requiring:
"Ensure that all foreign Hasher impls are deterministic"
This is impossible to fulfill, which makes for an invalid safety contract, or at least one with no valid implementors.

Note, we can simply say "matches in Hash behavior when called with EntityHasher/a deterministic Hasher".
We have then limited the scope of Hash impls we want guarantees about.

All of this is extremely subtle, to the point where I believe, that by this technicality, the official Rust docs for Hash are incorrect:
#[derive(PartialEq, Eq, Hash)] does not prevent logic errors in all cases! The H: Hasher parameter would need a DeterministicHasher bound for this to be ensured.

Does this mean that those official docs have to change? Probably not, because most people don't actually have to worry about this. It is unsafe code (and thus TrustedEntityBorrow) that does!

Ultimately, I believe restricting combinations can avoid pushing the entire headache toward the user side:
Do you want a map over Entity? Use EntityHashMap.
Do you want a map over an Entity newtype/EntityRef-like type? Use EntityEquivalentHashMap.
Do you want a map not covered by the above? Use HashMap.

The first two benefit from performance and EntitySet-compatibility; with the third, the user is free to manage these aspects themselves.
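In code, that guidance reads roughly like this (a sketch; SomeOtherKey and the Default calls on the wrappers are assumptions):

```rust
// A map keyed by plain Entity: the dedicated alias.
let by_entity: EntityHashMap<u32> = EntityHashMap::default();

// A map keyed by an Entity newtype or EntityRef-like type: the broader wrapper.
let by_main: EntityEquivalentHashMap<MainEntity, u32> = EntityEquivalentHashMap::default();

// Anything else: a plain HashMap with a hasher of your choice.
let other: HashMap<SomeOtherKey, u32> = HashMap::default();
```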

@Victoronz Victoronz added D-Complex Quite challenging from either a design or technical perspective. Ask for help! and removed D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes labels Mar 23, 2025
@chescock
Contributor

This is a really subtle discussion! It has design considerations on multiple levels, which I find can make it difficult to home in on where the different aspects come from. I hope the back and forth isn't getting frustrating!

Nope! I'm worried we'll scare off other reviewers, though :).

Safety-wise, panicking is completely fine! This part is more about what an "EntityEquivalentHashMap" should be. To be a useful and compatible design, I think we should restrict it to use cases that make sense. Why does std not limit combinations? Because their DefaultHasher is intentionally as compatible as feasible!

This may be the fundamental disagreement. I think we should follow a "simple things should be simple, complex things should be possible" approach. So I agree that we want good defaults here: EntityHashMap and EntityEquivalentHashMap should default to EntityHash. But I don't think we should spend a lot of effort limiting what users can do.

First, let me paint a situation from a standpoint of compatibility: We add an EntityEquivalentHashMap that has no restriction on T: TrustedEntityBorrow. Some library/code comes along, and makes use of EntityEquivalentHashSet<&Entity>. This code compiles perfectly fine! However, once somebody on a 32-bit platform comes along, their bevy program suddenly explodes. That is because on 64-bit platforms, where pointers are u64s, EntityHash will not panic. On 32-bit platforms, where pointers are u32s, EntityHash is suddenly being fed something other than u64 and panics. Note that this can occur with any type that has platform/feature-dependent fields (which most pointer types do). If we allow all combinations, we will still have to list the types that should not be used, and have users manage this concern manually.

I mean, &Entity does impl TrustedEntityBorrow, which is fine because <&T as Hash>::hash() just delegates to T::Hash. If you meant *const Entity, that calls write_usize and would fail consistently on all platforms.

But, why would we lift the T: TrustedEntityBorrow restriction? Wouldn't that just wind up being an ordinary HashSet? There's nothing we can do to stop HashSet<*const Entity, EntityHash>.

This may be the other fundamental disagreement: I think we might agree that we don't need DeterministicBuildHasher yet, but you see a future use that I don't have the context for. So I'd like to just punt the whole thing for now, and when you make a future change to loosen one of these bounds, I (and other reviewers) will be able to see the trait alongside the context that makes it necessary.

Yeah! Though I think even panicking non-deterministically is fine; "being deterministic" is a property only required when function execution does not diverge.

(Yeah, it's fine for Hash to panic non-deterministically, but it's not fine for Hasher to, because Hash could catch_unwind and turn it into regular non-determinism. But panicking non-deterministically isn't useful, and the rules are simpler if we prohibit it.)

Sure, it does. A deterministic Hash is one that makes a deterministic sequence of calls to the provided Hasher. The fact that the Hasher can do non-deterministic things doesn't change the fact that the Hash was deterministic.
Entity has a deterministic Hash, and TrustedEntityBorrow is required to have the same Hash behavior as Entity, so TrustedEntityBorrow is required to have a deterministic Hash.

Calling Hash::hash on Entity is not deterministic for all Hashers. Even if Hash calls Hasher deterministically, what happens inside of Hash is reliant on different trait code, that of Hasher. And unless we bind the code of Hasher to a safety contract, the execution of a Hash impl cannot be relied upon as deterministic by unsafe code.

This is arguing semantics, but I don't think it's useful to consider a Hash nondeterministic just because you call it with a nondeterministic Hasher. It seems a bit like saying + is nondeterministic because I can write rng.random() + 1.

Either way, I think "makes the same sequence of calls to the provided Hasher that self.entity().hash(state) would" is the behavior we need from TrustedEntityBorrow.
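For example, a hypothetical newtype meeting that contract would forward everything to the wrapped Entity (ShadowEntity is made up; this assumes the current EntityBorrow/TrustedEntityBorrow shape with its entity() accessor):

```rust
use core::hash::{Hash, Hasher};

use bevy_ecs::entity::{Entity, EntityBorrow, TrustedEntityBorrow}; // assumed paths

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ShadowEntity(Entity);

impl Hash for ShadowEntity {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Exactly the sequence of Hasher calls that self.entity().hash(state) makes.
        self.0.hash(state);
    }
}

impl EntityBorrow for ShadowEntity {
    fn entity(&self) -> Entity {
        self.0
    }
}

// SAFETY: Eq, Ord and Hash all delegate to the wrapped Entity.
unsafe impl TrustedEntityBorrow for ShadowEntity {}
```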

@Victoronz
Contributor Author

Victoronz commented Mar 24, 2025

This is a really subtle discussion! It has design considerations on multiple levels, which I find can make it difficult to home in on where the different aspects come from. I hope the back and forth isn't getting frustrating!

Nope! I'm worried we'll scare off other reviewers, though :).

Yeah, this has become a large wall of text. :)

Safety-wise, panicking is completely fine! This part is more about what an "EntityEquivalentHashMap" should be. To be a useful and compatible design, I think we should restrict it to use cases that make sense. Why does std not limit combinations? Because their DefaultHasher is intentionally as compatible as feasible!

This may be the fundamental disagreement. I think we should follow a "simple things should be simple, complex things should be possible" approach. So I agree that we want good defaults here: EntityHashMap and EntityEquivalentHashMap should default to EntityHash. But I don't think we should spend a lot of effort limiting what users can do.

First, let me paint a situation from a standpoint of compatibility: We add an EntityEquivalentHashMap that has no restriction on T: TrustedEntityBorrow. Some library/code comes along, and makes use of EntityEquivalentHashSet<&Entity>. This code compiles perfectly fine! However, once somebody on a 32-bit platform comes along, their bevy program suddenly explodes. That is because on 64-bit platforms, where pointers are u64s, EntityHash will not panic. On 32-bit platforms, where pointers are u32s, EntityHash is suddenly being fed something other than u64 and panics. Note that this can occur with any type that has platform/feature-dependent fields (which most pointer types do). If we allow all combinations, we will still have to list the types that should not be used, and have users manage this concern manually.

I mean, &Entity does impl TrustedEntityBorrow, which is fine because <&T as Hash>::hash() just delegates to T::Hash. If you meant *const Entity, that calls write_usize and would fail consistently on all platforms.

Ah, this was a mistake on my part; I meant more a Ptr that has a custom Hash impl. But you're right, all the types we implement TrustedEntityBorrow for delegate to the underlying Hash::hash, and this is only an issue for weird pointer types.
Meaning we should punt on this entirely!

But, why would we lift the T: TrustedEntityBorrow restriction? Wouldn't that just wind up being an ordinary HashSet? There's nothing we can do to stop HashSet<*const Entity, EntityHash>.

This may be the other fundamental disagreement: I think we might agree that we don't need DeterministicBuildHasher yet, but you see a future use that I don't have the context for. So I'd like to just punt the whole thing for now, and when you make a future change to loosen one of these bounds, I (and other reviewers) will be able to see the trait alongside the context that makes it necessary.

A bit of miscommunication here: by "restriction" I mean the combination restriction, TrustedEntityBorrow would always stay.

Yeah! Though I think even panicking non-deterministically is fine; "being deterministic" is a property only required when function execution does not diverge.

(Yeah, it's fine for Hash to panic non-deterministically, but it's not fine for Hasher to, because Hash could catch_unwind and turn it into regular non-determinism. But panicking non-deterministically isn't useful, and the rules are simpler if we prohibit it.)

This is more of a consequence of "diverging programs are safe" than anything we're describing. We can omit talking about this detail entirely!

Sure, it does. A deterministic Hash is one that makes a deterministic sequence of calls to the provided Hasher. The fact that the Hasher can do non-deterministic things doesn't change the fact that the Hash was deterministic.
Entity has a deterministic Hash, and TrustedEntityBorrow is required to have the same Hash behavior as Entity, so TrustedEntityBorrow is required to have a deterministic Hash.

Calling Hash::hash on Entity is not deterministic for all Hashers. Even if Hash calls Hasher deterministically, what happens inside of Hash is reliant on different trait code, that of Hasher. And unless we bind the code of Hasher to a safety contract, the execution of a Hash impl cannot be relied upon as deterministic by unsafe code.

This is arguing semantics, but I don't think it's useful to consider a Hash nondeterministic just because you call it with a nondeterministic Hasher. It seems a bit like saying + is nondeterministic because I can write rng.random() + 1.

Either way, I think "makes the same sequence of calls to the provided Hasher that self.entity().hash(state) would" is the behavior we need from TrustedEntityBorrow.

I agree that we can require "delegate the Hash impl to another TrustedEntityBorrow type". This will inevitably have all implementors delegate to Entity, and doesn't step on the toes of the pointers that only delegate one step down.

As for the usefulness of these semantics, I agree it is not useful for an implementor to worry about, but more so because they have no control over it. What is important to communicate is that "Correct Hash" does not guarantee correct hashing!
Not for an implementor, but for those that would need to decide whether they can call UniqueEntityIter::from_iterator_unchecked with some HashSet<T, SomeWeirdHasher>::iter.
TrustedEntityBorrow is load-bearing for the definition of "unique" that all other EntitySet code uses.
Meaning it also serves as a basis for why we wrap EntityHashMap the way we do.

The reason we are able to use EntityHashSet as an EntitySet rests on several assumptions, which might not be clear to a user that does some bespoke solution. Another such assumption is the collision-handling logic I touched on in #18470 (comment)

I think the simplest solution for now is to punt on more Hasher-handling logic.
Once we do introduce it: instead of some thorough Hash/Hasher combination safety traits, we can introduce one additional trait, DeterministicBuildHasher, and then rely on the Hash impl of TrustedEntityBorrow being delegated to Entity for the rest.
This approach makes using pathological, but deterministic hashers a logic error!

@Victoronz Victoronz added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Mar 24, 2025
github-merge-queue bot pushed a commit that referenced this pull request Mar 30, 2025
…alent (#18470)

# Objective

Fixes #9367.

Yet another follow-up to #16547.

These traits were initially based on `Borrow<Entity>` because that trait
was what they were replacing, and it felt close enough in meaning.
However, they ultimately don't quite match: `borrow` always returns
references, whereas `EntityBorrow` always returns a plain `Entity`.
Additionally, `EntityBorrow` can imply that we are borrowing an `Entity`
from the ECS, which is not what it does.

Due to its safety contract, `TrustedEntityBorrow` is an important and
widely used trait for `EntitySet` functionality.
In contrast, the safe `EntityBorrow` does not see much use, because even
outside of `EntitySet`-related functionality, it is a better idea to
accept `TrustedEntityBorrow` over `EntityBorrow`.

Furthermore, as #9367 points out, abstracting over returning `Entity`
from pointers/structs that contain it can skip some ergonomic friction.

On top of that, there are aspects of #18319 and #18408 that are relevant
to naming:
We've run into the issue that relying on a type default can switch
generic order. This is livable in some contexts, but unacceptable in
others.

To remedy that, we'd need to switch to a type alias approach: 
The "defaulted" `Entity` case becomes a
`UniqueEntity*`/`Entity*Map`/`Entity*Set` alias, and the base type
receives a more general name. `TrustedEntityBorrow` does not mesh
clearly with sensible base type names.

## Solution
Replace any `EntityBorrow` bounds with `TrustedEntityBorrow`.
+
Rename them as such:
`EntityBorrow` -> `ContainsEntity`
`TrustedEntityBorrow` -> `EntityEquivalent`

For `EntityBorrow` we produce a change in meaning: we designate it for
types that aren't necessarily strict wrappers around `Entity` or some
pointer to `Entity`, but rather any of the myriad of types that contain
a single associated `Entity`.
This pattern can already be seen in the common `entity`/`id` methods
across the engine.
We do not mean for `ContainsEntity` to be a trait that abstracts input
API (like how `AsRef<T>` is often used, f.e.), because eliding
`entity()` would be too implicit in the general case.

We prefix "Contains" to match the intuition of a struct with an `Entity`
field, like some contain a `length` or `capacity`.
It gives the impression of structure, which avoids the implication of a
relationship to the `ECS`.
`HasEntity`, f.e., could be interpreted as "a currently live entity".

As an input trait for APIs like #9367 envisioned, `TrustedEntityBorrow`
is a better fit, because it *does* restrict itself to strict wrappers
and pointers. Which is why we replace any
`EntityBorrow`/`ContainsEntity` bounds with
`TrustedEntityBorrow`/`EntityEquivalent`.

Here, the name `EntityEquivalent` is a lot closer to its actual meaning,
which is "A type that is both equivalent to an `Entity`, and forms the
same total order when compared".
Prior art for this is the
[`Equivalent`](https://docs.rs/hashbrown/latest/hashbrown/trait.Equivalent.html)
trait in `hashbrown`, which utilizes both `Borrow` and `Eq` for its one
blanket impl!

Given that we lose the `Borrow` moniker, and `Equivalent` can carry
various meanings, we expand on the safety comment of `EntityEquivalent`
somewhat. That should help prevent the confusion we saw in
[#18408](#18408 (comment)).

The new name meshes a lot better with the type aliasing approach in
#18408, by aligning with the base name `EntityEquivalentHashMap`.
For a consistent scheme among all set types, we can use this scheme for
the `UniqueEntity*` wrapper types as well!
This allows us to undo the switched generic order that was introduced to
`UniqueEntityArray` by its `Entity` default.

Even without the type aliases, I think these renames are worth doing!

## Migration Guide

Any use of `EntityBorrow` becomes `ContainsEntity`.
Any use of `TrustedEntityBorrow` becomes `EntityEquivalent`.
mockersf pushed a commit that referenced this pull request Mar 30, 2025
…alent (#18470)

github-merge-queue bot pushed a commit that referenced this pull request Apr 3, 2025
# Objective

Newest installment of the #16547 series.

In #18319 we introduced `Entity` defaults to accommodate the most common
use case for these types, however that resulted in the switch of the `T`
and `N` generics of `UniqueEntityArray`.
Swapping generics might be somewhat acceptable for `UniqueEntityArray`,
it is not at all acceptable for map and set types, which we would make
generic over `T: EntityEquivalent` in #18408.

Leaving these defaults in place would result in a glaring inconsistency
between these set collections and the others.

Additionally, the current standard in the engine is for "entity" to mean
`Entity`. APIs could be changed to accept `EntityEquivalent`, however
that is a separate and contentious discussion.

## Solution

Name these set collections `UniqueEntityEquivalent*`, and retain the
`UniqueEntity*` name for an alias of the `Entity` case.
While more verbose, this allows for all generics to be in proper order,
full consistency between all set types*, and the "entity" name to be
restricted to `Entity`.
On top of that, `UniqueEntity*` now always have one generic fewer, whereas
previously this was not enforced for the default case.

*`UniqueEntityIter<I: Iterator<T: EntityEquivalent>>` is the sole
exception to this. Aliases are unable to enforce bounds
(`lazy_type_alias` is needed for this), so for this type, doing this
split would be a mere suggestion, and in no way enforced.
Iterator types are rarely ever named, and this specific one is intended
to be aliased when it sees more use, like we do for the corresponding
set collection iterators.
Furthermore, the `EntityEquivalent` precursor `Borrow<Entity>` was used
exactly because of such iterator bounds!
Because of that, we leave it as is.

While no migration guide for 0.15 users, for those that upgrade from
main:
`UniqueEntityVec<T>` -> `UniqueEntityEquivalentVec<T>`
`UniqueEntitySlice<T>` -> `UniqueEntityEquivalentSlice<T>`
`UniqueEntityArray<N, T>` -> `UniqueEntityEquivalentArray<T, N>`
mockersf pushed a commit that referenced this pull request Apr 3, 2025