Remove the ParamSpace separation from formal and actual generics in rustc. #35605
☔ The latest upstream changes (presumably #35592) made this pull request unmergeable. Please resolve the merge conflicts.
Force-push: cdcc1cd to 7060002
☔ The latest upstream changes (presumably #35138) made this pull request unmergeable. Please resolve the merge conflicts.
Force-push: 907d823 to 9e5f42f
a bit yucky but...ok. :)
You should look at this at the end of the PR ;).
☔ The latest upstream changes (presumably #35162) made this pull request unmergeable. Please resolve the merge conflicts.
OK, @eddyb, I read (or at least skimmed) the code here and it's lovely as usual. That said, when I've contemplated generalizing
For example, one thing I was thinking about was adopting more of a debruijn-index-like approach: in this case, we would have instead of Space an index, counting backwards (so, on a trait impl, 0 might be the fn substs, 1 might be the self substs, 2 the trait substs, etc). Then the
Admittedly I probably should have chatted with you some to hash out the desired approach better before you opened this whole PR. That said, I think you were like 75% of the way through before I found out you were working on it. 👅
Anyway, I would be happy to land this PR as is, and consider refinements (like maybe the one I just proposed) separately. It seems like overall the refinements you've made (e.g., the helper methods for creating substs) are good, and would presumably make it easier to pursue other variations.
Thoughts? Do you agree that this code will be easier to adapt in the future if we choose a different strategy?
@nikomatsakis With something like
However, I think that it doesn't bring us anything, we don't translate anything other than
But yes, many of the changes here are essential to making any other design changes wrt generics.
I am thinking specifically of a world where
r+ from me.
@bors r=nikomatsakis p=1 (this should really not bitrot)
📌 Commit 9453d9b has been approved by
⌛ Testing commit 9453d9b with merge d6d0590...
Remove the ParamSpace separation from formal and actual generics in rustc.
This is the first step towards enabling the typesystem implemented by `rustc` to be extended (with generic modules, HKT associated types, generics over constants, etc.).
The current implementation splits all formal (`ty::Generics`) and actual (`Substs`) lifetime and type parameters (and even `where` clauses) into 3 "parameter spaces":
* `TypeSpace` for `enum`, `struct`, `trait` and `impl`
* `SelfSpace` for `Self` in a `trait`
* `FnSpace` for functions and methods
For example, in `<X as Trait<A, B>>::method::<T, U>`, the `Substs` are `[[A, B], [X], [T, U]]`.
The representation uses a single `Vec` with 2 indices where it's split into the 3 "parameter spaces".
Such a simplistic approach doesn't scale beyond the Rust 1.0 typesystem, and its existence was mainly motivated by keeping code manipulating generic parameters correct, across all possible situations.
Summary of changes:
* `ty::Generics` are uniformly stored and can be queried with `tcx.lookup_generics(def_id)`
  * the `typeck::collect` changes for this resulted in a function to lazily compute the `ty::Generics` for a local node, given only its `DefId` - this can be further generalized to other kinds of type information
* `ty::Generics` and `ty::GenericPredicates` now contain only their own parameters (or `where` clauses, respectively), and refer to their "parent", forming a linked list
  * right now most items have one level of nesting, only associated items and variants having two
  * in the future, if `<X as mod1<A>::mod2<B>::mod3::Trait<C>>::Assoc<Y>` is supported, it would be represented by an item with the path `mod1::mod2::mod3::Trait::Assoc`, and 4 levels of generics: `mod1` with `[A]`, `mod2` with `[B]`, `Trait` with `[X, C]` and `Assoc` with `[Y]`
* `Substs` gets two new APIs for working with arbitrary items:
  * `Substs::for_item(def_id, mk_region, mk_type)` will construct `Substs` expected by the definition `def_id`, calling `mk_region` for lifetime parameters and `mk_type` for type parameters, and it's guaranteed to *always* return `Substs` compatible with `def_id`
  * `substs.rebase_onto(from_base_def_id, to_base_substs)` can be used if `substs` is for an item nested within `from_base_def_id` (e.g. an associated item), to replace the "outer parameters" with `to_base_substs` - for example, you can translate a method's `Substs` between a `trait` and an `impl` (in both directions) if you have the `DefId` of one and `Substs` for the other
* trait objects, without a `Self` in their `Substs`, use *solely* `ExistentialTraitRef` now, letting `TraitRef` assume it *always* has a `Self` present
  * both `TraitRef` and `ExistentialTraitRef` get methods which do operations on their `Substs` which are valid only for traits (or trait objects, respectively)
* `Substs` loses its "parameter spaces" distinction, with effectively no code creating `Substs` in an ad-hoc manner, or inspecting them, without knowing what shape they have already
Future plans:
* combine both lifetimes and types in a single `Vec<Kind<'tcx>>` where `Kind` would be a tagged pointer that can be `Ty<'tcx>`, `&'tcx ty::Region` or, in the future, potentially-polymorphic constants
  * this would require some performance investigation, if it implies a lot of dynamic checks
* introduce an abstraction for `(T, Substs)`, where the `Substs` are even more hidden away from code manipulating it; a precedent for this is `Instance` in trans, which has `T = DefId`; @nikomatsakis also referred to this as "lazy substitution", when `T = Ty`
  * rewrite type pretty-printing to fully take advantage of this to inject actual generic parameters in the exact places of the formal ones in any path
* extend the set of type-level information (e.g. beyond `ty::Generics`) that can be lazily queried during `typeck` and introduce a way to do those queries from code that can't refer to `typeck` directly
  * this is almost unrelated but is necessary for DAG-shaped recursion between constant evaluation and type-level information, i.e. for implementing generics over constants
r? @nikomatsakis
cc @rust-lang/compiler
cc @nrc Could we get any perf numbers ahead of merging this?
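To make the parent-linked `ty::Generics` and the two new `Substs` helpers more concrete, here is a rough toy model, not the compiler's actual code. Everything in it is an assumption made for illustration: `DefId` is a plain integer, parameters and substitutions are strings, the `Tcx` struct and `example_tcx` are hypothetical, and a single `mk` closure stands in for the separate `mk_region` and `mk_type` callbacks.
```rust
use std::collections::HashMap;

// Hypothetical stand-in: the real `DefId` is not a plain integer.
type DefId = u32;

const TRAIT: DefId = 0; // trait Trait<A> { fn method<T>(); }
const METHOD: DefId = 1;
const IMPL: DefId = 2; // impl<U> Trait<U> for Vec<U>

/// An item's own parameters plus a link to its parent's generics.
struct Generics {
    parent: Option<DefId>,
    own_params: Vec<&'static str>,
}

/// Toy type context holding the generics of every item.
struct Tcx {
    generics: HashMap<DefId, Generics>,
}

impl Tcx {
    /// Total number of parameters, counting all parents.
    fn count(&self, def_id: DefId) -> usize {
        let g = &self.generics[&def_id];
        g.parent.map_or(0, |p| self.count(p)) + g.own_params.len()
    }

    /// Builds substs shaped exactly like `def_id` expects:
    /// parent parameters first, then the item's own.
    fn for_item<F>(&self, def_id: DefId, mk: &mut F) -> Vec<String>
    where
        F: FnMut(&str) -> String,
    {
        let g = &self.generics[&def_id];
        let mut substs = match g.parent {
            Some(p) => self.for_item(p, &mut *mk),
            None => Vec::new(),
        };
        for &name in &g.own_params {
            substs.push(mk(name));
        }
        substs
    }

    /// Replaces the "outer" parameters (those of `from_base`) with
    /// `to_base_substs`, keeping the nested item's own parameters.
    fn rebase_onto(
        &self,
        substs: &[String],
        from_base: DefId,
        to_base_substs: &[String],
    ) -> Vec<String> {
        let skip = self.count(from_base);
        to_base_substs.iter().chain(&substs[skip..]).cloned().collect()
    }
}

/// Hypothetical example data for the trait, its method and the impl.
fn example_tcx() -> Tcx {
    let mut generics = HashMap::new();
    generics.insert(TRAIT, Generics { parent: None, own_params: vec!["Self", "A"] });
    generics.insert(METHOD, Generics { parent: Some(TRAIT), own_params: vec!["T"] });
    generics.insert(IMPL, Generics { parent: None, own_params: vec!["U"] });
    Tcx { generics }
}
```
In this model, rebasing the trait-side method substs `[Vec<u8>, u8, i32]` from `TRAIT` onto the impl substs `[u8]` drops the two trait parameters and keeps the method's own `i32`, which is the trait-to-impl translation described in the summary.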
Combine types and regions in Substs into one interleaved list.
Previously, `Substs` would contain types and regions, in two separate vectors, for example:
```rust
<X as Trait<'a, 'b, A, B>>::method::<'p, 'q, T, U>
/* corresponds to */
Substs { regions: ['a, 'b, 'p, 'q], types: [X, A, B, T, U] }
```
This PR continues the work started in rust-lang#35605 by further removing the distinction.
A new abstraction over types and regions is introduced in the compiler, `Kind`.
Each `Kind` is a pointer (`&TyS` or `&Region`), with the lowest two bits used as a tag.
Two bits were used instead of just one (type = `0`, region = `1`) to allow adding more kinds.
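As a rough illustration of the tagging scheme (a sketch under stated assumptions, not the compiler's implementation): the `TyS` and `Region` structs below are hypothetical stand-ins forced to 4-byte alignment so that the two lowest address bits are free, and the tag values follow the ones quoted above (type = `0`, region = `1`). The `Unpacked` enum and the method names are invented for this example.
```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for interned type and region data.
// `align(4)` guarantees the two lowest address bits are zero.
#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct TyS(&'static str);

#[repr(align(4))]
#[derive(Debug, PartialEq)]
struct Region(&'static str);

const TAG_MASK: usize = 0b11;
const TYPE_TAG: usize = 0b00;
const REGION_TAG: usize = 0b01;

/// One pointer-sized word: address in the high bits, tag in the lowest two.
#[derive(Clone, Copy)]
struct Kind<'tcx> {
    bits: usize,
    marker: PhantomData<(&'tcx TyS, &'tcx Region)>,
}

/// The decoded form of a `Kind`.
#[derive(Debug, PartialEq)]
enum Unpacked<'tcx> {
    Type(&'tcx TyS),
    Region(&'tcx Region),
}

impl<'tcx> Kind<'tcx> {
    fn from_ty(ty: &'tcx TyS) -> Self {
        let addr = ty as *const TyS as usize;
        debug_assert_eq!(addr & TAG_MASK, 0);
        Kind { bits: addr | TYPE_TAG, marker: PhantomData }
    }

    fn from_region(region: &'tcx Region) -> Self {
        let addr = region as *const Region as usize;
        debug_assert_eq!(addr & TAG_MASK, 0);
        Kind { bits: addr | REGION_TAG, marker: PhantomData }
    }

    /// Strips the tag and recovers the original typed reference.
    fn unpack(self) -> Unpacked<'tcx> {
        let addr = self.bits & !TAG_MASK;
        // SAFETY: `bits` was built from a live `&'tcx` reference of the type
        // named by the tag, and the tag only occupies alignment bits.
        unsafe {
            match self.bits & TAG_MASK {
                TYPE_TAG => Unpacked::Type(&*(addr as *const TyS)),
                REGION_TAG => Unpacked::Region(&*(addr as *const Region)),
                _ => unreachable!("tag values 2 and 3 are unused in this sketch"),
            }
        }
    }
}
```
The two unused tag values are where further kinds could be added later without changing the size of `Kind`, which stays one machine word.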
`Substs` contain only a `Vec<Kind>`, with `Self` first, followed by regions and types (in the definition order):
```rust
Substs { params: [X, 'a, 'b, A, B, 'p, 'q, T, U] }
```
The resulting interleaved list has the property of being the concatenation of parameters for the (potentially) nested generic items it describes, and can be sliced back into those components:
```rust
params[0..5] = [X, 'a, 'b, A, B] // <X as Trait<'a, 'b, A, B>>
params[5..9] = ['p, 'q, T, U] // <_>::method::<'p, 'q, T, U>
```
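The slicing can be sketched with ordinary slices. This is only a toy model: `Param`, `GenericsShape`, `split_params` and `example_params` are hypothetical names, and parameters are represented as strings rather than tagged pointers.
```rust
/// Toy substitution entry; the real list holds tagged pointers.
#[derive(Debug, Clone, PartialEq)]
enum Param {
    Type(&'static str),
    Region(&'static str),
}

/// Hypothetical summary of an item's generics: how many parameters it
/// inherits from its parent and how many it declares itself.
struct GenericsShape {
    parent_count: usize,
    own_count: usize,
}

/// Splits a flat parameter list into (parent parameters, own parameters).
fn split_params<'a>(
    params: &'a [Param],
    shape: &GenericsShape,
) -> (&'a [Param], &'a [Param]) {
    assert_eq!(params.len(), shape.parent_count + shape.own_count);
    params.split_at(shape.parent_count)
}

/// The list for `<X as Trait<'a, 'b, A, B>>::method::<'p, 'q, T, U>`.
fn example_params() -> Vec<Param> {
    vec![
        Param::Type("X"),
        Param::Region("'a"),
        Param::Region("'b"),
        Param::Type("A"),
        Param::Type("B"),
        Param::Region("'p"),
        Param::Region("'q"),
        Param::Type("T"),
        Param::Type("U"),
    ]
}
```
With `parent_count = 5` and `own_count = 4`, the first slice covers the trait's parameters (including `Self`) and the second covers the method's own.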
r? @nikomatsakis
Don't let a type parameter named "Self" unchanged past HIR lowering.
Fixes #36638 by rewriting `Self` type parameters (which are a parse error) to a `gensym("Self")`.
Background: #35605 introduced code across rustc that determines `Self` by its keyword name. Reverting the sanity checks around that would inadvertently cause confusion between the true `Self` of a `trait` and other type parameters named `Self` (which have caused parse errors already).
I do not like to use `gensym`, and we may do something different here in the future, but this should work.