-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
implement UniqueEntityVec #17549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement UniqueEntityVec #17549
Conversation
|
I'd like to add support for this as a relations storage (for e.g. Children) too. I think that's a really powerful and sensible invariant to uphold there (and something we should probably swap to internally in a follow-up). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent stuff: this hits all the notes I was hoping for here. Great docs, clear and unsurprising API, easy conversion back into a normal vec, simple construction from trusted set-like data structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I left a few questions and suggestions, but nothing blocking.
| /// This type is best obtained by its `FromEntitySetIterator` impl, via either | ||
| /// `EntityIterator::collect_set` or `UniqueEntityVec::from_entity_iter`. | ||
| /// | ||
| /// While this type can be constructed via `Iterator::collect`, doing so is inefficient, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The O(N^2) FromIterator and Extend implementations seem like a bit of a performance footgun. Would it make sense to only offer them as named functions so that folks don't use them accidentally?
... Oh, I see, FromIterator is a supertrait of FromEntitySetIterator. Why is that a supertrait? UniqueEntityVec seems like an example of a type that should be FromEntitySetIterator but should not be FromIterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FomEntitySetIterator is a subtrait of FromIterator because it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".
Any call to collect/from_iter really should evaluate to the FromEntitySetIterator version. And that is what supertrait item shadowing is for! That RFC has been accepted, and is in the process of moving through the rust dev pipeline.
The O(n^2) impls aren't necessarily wrong either, given that this struct is not bounded on either Ord or Hash. While rare, we want to leave the possibility for T to only implement Eq open.
Performance-wise, these impls can make sense for very low N, where the construction cost of things like a HashSet outweigh the few direct comparisons.
I did document that normal collect should be avoided on the struct itself, and the very concept of the struct should tell that extending this struct is not free, however I've added further documentation to these trait method impls.
It'd be nice if unintentional use of collect over collect_set could be linted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FomEntitySetIteratoris a subtrait ofFromIteratorbecause it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".
I don't follow. Why couldn't it instead represent "this can only be constructed from an iterator of unique values"?
Not every function that accepts an EntitySetIterator can accept a plain Iterator. That's why we have these traits in the first place! So it seems perfectly reasonable to have a collection that can be created from an EntitySetIterator but not from a plain Iterator. If you want to build one from a non-unique iterator then you should call a method to validate or de-duplicate it first.
given that this struct is not bounded on either
OrdorHash
It's bounded on TrustedEntityBorrow, though, so we can always get the Entity and hash or sort that.
I did document that normal
collectshould be avoided on the struct itself
Yeah, but not everyone is going to read the documentation. :) They'll just write .collect() to make the type checker happy and assume it's good enough. It would be good to push them into the pit of success instead!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FomEntitySetIteratoris a subtrait ofFromIteratorbecause it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".I don't follow. Why couldn't it instead represent "this can only be constructed from an iterator of unique values"?
Not every function that accepts an
EntitySetIteratorcan accept a plainIterator. That's why we have these traits in the first place! So it seems perfectly reasonable to have a collection that can be created from anEntitySetIteratorbut not from a plainIterator. If you want to build one from a non-unique iterator then you should call a method to validate or de-duplicate it first.
At first I also thought to exclude these FromIterator and Extend impls, however what made add them back is the interaction with the greater ecosystem: Rust generally structures things by composing them, and excluding these impls means we lose on composition:
What is a collection? Different crates and traits define this differently. Sometimes everything that impls FromIterator, i.e. can be "collected" is a collection. Sometimes, if something can be made into an iterator (IntoIterator), it is a collection.
Sometimes if can be extended (Extend) it is a collection, and often, it is some composition of those three.
By excluding these trait impls, UniqueEntityVec will not qualify as a "collection" in other places.
An important example is RelationshipSourceCollection: This trait defines "collection" differently than the EntitySet logic does. Whether that is the best way to do it remains to be seen! However right now, it has a safe add method.
If UniqueEntityVec does not have a safe way to add elements, then it should not be RelationshipSourceCollection, because the safety contract cannot be propagated to a safe trait method.
The same goes for other traits: If a FromIterator impl does not exist, then an unsatisfied bound might make it not at all be possible to use UniqueEntityVec. In many cases a default/provided trait method impl can be overridden, collect is just one of the unfortunate exceptions.
I agree that this a footgun, but it is a temporary, documented, and a lintable one! (I've added UniqueEntityVec::from_iter and EntitySetIterator::collect::<UniqueEntityVec> to disallowed-methods in clippy-toml).
It is definitely a trade-off, but I lean towards not making useful things impossible, and linting/documenting sketchy uses instead.
| /// This type is best obtained by its `FromEntitySetIterator` impl, via either | ||
| /// `EntityIterator::collect_set` or `UniqueEntityVec::from_entity_iter`. | ||
| /// | ||
| /// While this type can be constructed via `Iterator::collect`, doing so is inefficient, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, would it be useful to have a method that does a safe O(N log N) collection using sort and dedup? It can't be the default since it reorders the input, but I think that will often be the most efficient approach for untrusted input, and I don't think the current API exposes a way to do it without unsafe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do have another trait I want to add, ToEntitySetIterator. Since all iterators can be made unique, it is simplest to do so with a trait, it would do this with a HashSet though, since that can be built iteratively, and you could define a set of entities to exclude (like parent).
That would be the main way to get a trusted EntitySet from untrusted input. I think there are still specific cases where a sort + dedup would be preferred, but I'd consider them rare enough that it is okay to leave them up to the user to then do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, I bet what we'll actually want for sort + dedup is something like SortedVecSet from #16784. Then you could collect into a SortedVecSet want that kind of deduplication, and do a cheap trusted conversion to UniqueEntityVec if you need that instead.
(That's all future stuff, obviously, and shouldn't affect this PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think about it, "all elements are in sorted order" is a property a collection holds, just like "all elements are unique".
As long as a property belongs to a collection, it can be managed and preserved the same way we now do for uniqueness.
Although, for "sortedness" you'd additionally need to carry along "what it is sorted by".
Having that information would then be useful for insertion, modification and again, skipping validation.
You could even compose these properties, like in a SortedUniqueVec.
Definitely useful, though whether that should be done can be left for the future!
…ds, clarify reason
# Objective In bevyengine#16547, we added `EntitySet`s/`EntitySetIterator`s. We can know whenever an iterator only contains unique entities, however we do not yet have the ability to collect and reuse these without either the unsafe `UniqueEntityIter::from_iterator_unchecked`, or the expensive `HashSet::from_iter`. An important piece for being able to do this is a `Vec` that maintains the uniqueness property, can be collected into, and is itself `EntitySet`. A lot of entity collections are already intended to be "unique", but have no way of expressing that when stored, other than using an aforementioned `HashSet`. Such a type helps by limiting or even removing the need for unsafe on the user side when not using a validated `Set` type, and makes it easier to interface with other infrastructure like f.e. `RelationshipSourceCollection`s. ## Solution We implement `UniqueEntityVec`. This is a wrapper around `Vec`, that only ever contains unique elements. It mirrors the API of `Vec`, however restricts any mutation as to not violate the uniqueness guarantee. Meaning: - Any inherent method which can introduce new elements or mutate existing ones is now unsafe, f.e.: `insert`, `retain_mut` - Methods that are impossible to use safely are omitted, f.e.: `fill`, `extend_from_within` A handful of the unsafe methods can do element-wise mutation (`retain_mut`, `dedup_by`), which can be an unwind safety hazard were the element-wise operation to panic. For those methods, we require that each individual execution of the operation upholds uniqueness, not just the entire method as a whole. To be safe for mutable usage, slicing and the associated slice methods require a matching `UniqueEntitySlice` type , which we leave for a follow-up PR. Because this type will deref into the `UniqueEntitySlice` type, we also offer the immutable `Vec` methods on this type (which only amount to a handful). "as inner" functionality is covered by additional `as_vec`/`as_mut_vec` methods + `AsRef`/`Borrow` trait impls. Like `UniqueEntityIter::from_iterator_unchecked`, this type has a `from_vec_unchecked` method as well. The canonical way to safely obtain this type however is via `EntitySetIterator::collect_set` or `UniqueEntityVec::from_entity_set_iter`. Like mentioned in bevyengine#17513, these are named suboptimally until supertrait item shadowing arrives, since a normal `collect` will still run equality checks.
# Objective Follow-up to #17549 and #16547. A large part of `Vec`s usefulness is behind its ability to be sliced, like sorting f.e., so we want the same to be possible for `UniqueEntityVec`. ## Solution Add a `UniqueEntitySlice` type. It is a wrapper around `[T]`, and itself a DST. Because `mem::swap` has a `Sized` bound, DSTs cannot be swapped, and we can freely hand out mutable subslices without worrying about the uniqueness invariant of the backing collection! `UniqueEntityVec` and the relevant `UniqueEntityIter`s now have methods and trait impls that return `UniqueEntitySlice`s. `UniqueEntitySlice` itself can deref into normal slices, which means we can avoid implementing the vast majority of immutable slice methods. Most of the remaining methods: - split a slice/collection in further unique subsections/slices - reorder the slice: `sort`, `rotate_*`, `swap` - construct/deconstruct/convert pointer-like types: `Box`, `Arc`, `Rc`, `Cow` - are comparison trait impls As this PR is already larger than I'd like, we leave several things to follow-ups: - `UniqueEntityArray` and the related slice methods that would return it - denoted by "chunk", "array_*" for iterators - Methods that return iterators with `UniqueEntitySlice` as their item - `windows`, `chunks` and `split` families - All methods that are capable of actively mutating individual elements. While they could be offered unsafely, subslicing makes their safety contract weird enough to warrant its own discussion. - `fill_with`, `swap_with_slice`, `iter_mut`, `split_first/last_mut`, `select_nth_unstable_*` Note that `Arc`, `Rc` and `Cow` are not fundamental types, so even if they contain `UniqueEntitySlice`, we cannot write direct trait impls for them. On top of that, `Cow` is not a receiver (like `self: Arc<Self>` is) so we cannot write inherent methods for it either.
# Objective Follow-up to bevyengine#17549 and bevyengine#16547. A large part of `Vec`s usefulness is behind its ability to be sliced, like sorting f.e., so we want the same to be possible for `UniqueEntityVec`. ## Solution Add a `UniqueEntitySlice` type. It is a wrapper around `[T]`, and itself a DST. Because `mem::swap` has a `Sized` bound, DSTs cannot be swapped, and we can freely hand out mutable subslices without worrying about the uniqueness invariant of the backing collection! `UniqueEntityVec` and the relevant `UniqueEntityIter`s now have methods and trait impls that return `UniqueEntitySlice`s. `UniqueEntitySlice` itself can deref into normal slices, which means we can avoid implementing the vast majority of immutable slice methods. Most of the remaining methods: - split a slice/collection in further unique subsections/slices - reorder the slice: `sort`, `rotate_*`, `swap` - construct/deconstruct/convert pointer-like types: `Box`, `Arc`, `Rc`, `Cow` - are comparison trait impls As this PR is already larger than I'd like, we leave several things to follow-ups: - `UniqueEntityArray` and the related slice methods that would return it - denoted by "chunk", "array_*" for iterators - Methods that return iterators with `UniqueEntitySlice` as their item - `windows`, `chunks` and `split` families - All methods that are capable of actively mutating individual elements. While they could be offered unsafely, subslicing makes their safety contract weird enough to warrant its own discussion. - `fill_with`, `swap_with_slice`, `iter_mut`, `split_first/last_mut`, `select_nth_unstable_*` Note that `Arc`, `Rc` and `Cow` are not fundamental types, so even if they contain `UniqueEntitySlice`, we cannot write direct trait impls for them. On top of that, `Cow` is not a receiver (like `self: Arc<Self>` is) so we cannot write inherent methods for it either.
Objective
In #16547, we added
EntitySets/EntitySetIterators. We can know whenever an iterator only contains unique entities, however we do not yet have the ability to collect and reuse these without either the unsafeUniqueEntityIter::from_iterator_unchecked, or the expensiveHashSet::from_iter.An important piece for being able to do this is a
Vecthat maintains the uniqueness property, can be collected into, and is itselfEntitySet.A lot of entity collections are already intended to be "unique", but have no way of expressing that when stored, other than using an aforementioned
HashSet. Such a type helps by limiting or even removing the need for unsafe on the user side when not using a validatedSettype, and makes it easier to interface with other infrastructure like f.e.RelationshipSourceCollections.Solution
We implement
UniqueEntityVec.This is a wrapper around
Vec, that only ever contains unique elements. It mirrors the API ofVec, however restricts any mutation as to not violate the uniqueness guarantee. Meaning:insert,retain_mutfill,extend_from_withinA handful of the unsafe methods can do element-wise mutation (
retain_mut,dedup_by), which can be an unwind safety hazard were the element-wise operation to panic. For those methods, we require that each individual execution of the operation upholds uniqueness, not just the entire method as a whole.To be safe for mutable usage, slicing and the associated slice methods require a matching
UniqueEntitySlicetype , which we leave for a follow-up PR.Because this type will deref into the
UniqueEntitySlicetype, we also offer the immutableVecmethods on this type (which only amount to a handful). "as inner" functionality is covered by additionalas_vec/as_mut_vecmethods +AsRef/Borrowtrait impls.Like
UniqueEntityIter::from_iterator_unchecked, this type has afrom_vec_uncheckedmethod as well.The canonical way to safely obtain this type however is via
EntitySetIterator::collect_setorUniqueEntityVec::from_entity_set_iter. Like mentioned in #17513, these are named suboptimally until supertrait item shadowing arrives, since a normalcollectwill still run equality checks.