-
Notifications
You must be signed in to change notification settings - Fork 25
Compute layouts on demand #800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
One thing that should be discussed: |
|
This is encoding a huge amount of logic which in the end doesn't compute the same layout as rustc, which would be observable from e.g. the output of If what you need are generic layouts, the inevitable discrepancy with e.g. offset_of is a problem for me. We can't be ouputting a generic layout if it doesn't end up matching data that could be observed from running the program. |
|
It should compute the same layout as rustc, up to my explicitly states assumptions. And I can't tell yet what specific problem I need this for, since I'm still in the middle of working on it (for reference: https://codeberg.org/firefighterduck/Brush_rs). However, I can say that I need as much and as precise information as possible, since biabduction is known to work best with precise information and to my knowledge symbolic layouts have never used with it before. I have not tried the new |
|
So basically this PR wants to compute the generic layout? I.e., the expression of the layout info? E.g., a type Btw, just out of curiosity, how will you represent DST layout? Will you compute the "prefix length" corresponding to the fields before the final (potential) DST field? |
No, the facility is meant to collect more available precise layout information that Charon currently can't use. E.g. for types such as |
|
OK, thanks for the explanation. So you mean to fix two things for Charon: 1. Charon currently provides layout for only the |
|
I just remembered an important feature that's still missing: the layout computer should be able to check whether a type has representation annotations, e.g. |
de51d8e to
28009b0
Compare
charon/src/ast/layout_computer.rs
Outdated
| /// Compute the layout of a tuple as precise as possible. | ||
| /// | ||
| /// If the elements have different sizes and thus might need padding, | ||
| /// it returns a hint, otherwise the precise layout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust gives no guarantees about tuple layout, not even that the fields are in the expected order. Also size isn't the only thing at play, types can have the same size and different alignment (which is likely to cause reorderings). I guess for a tuple with all same size and alignment we can reasonably expect that the fields are in order, but I'd rather ask rustc directly if possible.
| /// Computes the layout based on the simple algorithm for C types. | ||
| fn compute_c_layout_from_fields( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C layout doesn't mean "all fields in order in the obvious way". There are weird alignment shenanigans on windows for example, hence this recent RFC: rust-lang/rfcs#3845. We must therefore only provide layout hints for repr(C).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your layout hints! Happy to integrate that. Otherwise I'd prefer to never construct a Layout ourselves except for pointers and literals (where the layout is guaranteed and trivial), because everything about layouts is subtle and soundness-critical.
charon/src/ast/layout_computer.rs
Outdated
| let mut max_align = 1; | ||
| let mut uninhabited_part = false; | ||
|
|
||
| // If it is a new-type struct, we should be able to get a layout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no guarantee that a newtype struct has the same layout as its contents (unless repr(transparent)). In fact it's easy to change that:
#[repr(align(64))]
struct S(u8);
charon/src/ast/layout_computer.rs
Outdated
| min_size: total_min_size, | ||
| min_alignment: max_align, | ||
| possibly_uninhabited: uninhabited_part, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For enum variants, if there's an uninhabited field the whole variant may have size zero (rust-lang/unsafe-code-guidelines#443). Hence min_size should be zero if possibly_uninhabited is true and we're in an enum variant. I don't know the rule for alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if repr(packed) then the alignment can be wrong so min_alignment should be 1 I think.
Let me give an updated overview over how different types are handled (not yet pushed). The following types will get a precise layout:
For the following types, only a hint is returned:
How does this sound? |
39f6799 to
a573fea
Compare
Sounds very good! For Box, beware that it has a second type parameter, it's not just a pointer if |
|
Not having |
Good point, I kinda had in mind, that the allocator API was only usable in rustc yet, and therefore didn't matter for user code. |
I've added a sizedness check already, so that we can also get this information in charon.
That's a good point. In my PR, I've tried to add facilities to charon to get layout information after the translation to (U)LLBC, thus invoking rustc isnt't really an option. However, we could also think about adding something similar to the translation to also be able to query rustc. |
Unfortunately it's available on nightly so users can use it. Wait actually charon has a |
| fn substitute_frees(self, generics: &GenericArgs) -> Self { | ||
| self.substitute_with_self_frees(generics, &TraitRefKind::SelfId) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, that reminds me I should make Free variables optional, they were added to ease the transition to De Bruijn indices. Could you document this function to explain what it does?
29104f4 to
6250f63
Compare
|
So I've been thinking that maybe I could try to move the layout computation from happening as part of the |
|
It would be rather easy to add layout info to Is that what you're asking for though? You seem to say you'd want the layout of a given |
Exactly, since that is what tools actually need (some more, some less). They need to have information about all types that occur in the program, not just the ones available from definitions. And it's not just/primarily the layout, oftentimes some related properties are of great interest, e.g. whether a type is/can be a ZST, a DST, whether it's FFI-safe, whether it's uninhabited, whether two types can be transmuted into each other, etc. Some of these properties can be approximated from the type alone, but it would be more convenient to have all such information in one place such that all downstream tools can easily access it. |
|
Setting aside what some other tools may want, what do you personally need? In particular, do you need this info for non-monomorphized types too? My thinking is that we clearly won't precompute that for every type, but we could store some kind of logical representation for each type declaration. If needed we could give a "whether two types can be transmuted into each other" is sufficiently complicated that there's been an unfinished project for years to let rustc know about that in a trait (safe transmute). Even with the ability to query rustc that's hard to do right. That particular one I propose we have a separate issue for it if you want it. |
|
Here's a list of information bits my tool currently queries from my charon fork (the branch in this PR):
And yes, I also need all of this information for non-monomorphized types, since my tool is meant to verify the memory safety of libraries and functions in a compositional way (one logical representation for all possible type arguments, that then gets instantiated/refined at the call site). About transmute: I don't necessarily want to query rustc but just get enough information about the memory layout to prove in separation logic that a specific transmute is memory safe (and upholds the validity invariants of the types involved). |
Do you need inhabitedness/ZST as a property of the final layout or of what a user is allowed to assume? E.g. a struct may be a ZST today but with private fields so the ZST-ness may not be a guarantee and thus unsound to rely on. Same for inhabitedness. Also consider If it's just the final layout, I think these two properties are straightforward to compute? We could add a method to Note that a similar question applies to the layouts we have today, and to inter-transmutability. If you want to prove soundness you should not be relying on an accidental layout choice. In my head this calls for two different notions of layout: there's the implemented layout (which we have today for mono types), and the specified layout which we'll have to construct by reading the Rust Reference and seeing what's guaranteed or not. The specified layout would make sense for poly types. It would be extremely cool to have something like that.
The |
I need it for the the exact type used at time of verification in the code (be it a specific type, a type variable, a generic type, ...). I don't need to handle specific types with
I expect the logical contract for transmutability to express some constant layout requirements that would not be satisfied by all layout choices unless its really guaranteed to work. So the proof would necessarily fail if just some layout choice makes it possible.
That sounds quite intriguing. Most of the time I'd probably want to use the specified layout, anyway, and only utilize the implemented layout where it coincides with the specified one (especially for primitive types and
On a first glance, it might do. |
Ok, that makes sense to me. I assume however that you want a verification result to not depend (too much) on the version of rustc used, and therefore not depend on layout choices that may change in the future. If so then the way forward I have in mind looks like:
|
Description
To take another step to having full layout information for as many types as possible, this PR introduces a new
LayoutComputer(name up for discussion), which can be used to compute memoized layouts (or partial layouts I callLayoutHints) for all kinds of types.Goal
This is meant to cater to the need of tools like my own (now in the process of being rewritten in Rust and using
charon_lib) which want to know layout information about many types, not just those with a type declaration.Status
The PR contains a lot of assumptions about when we can get which information and many of these should be discussed in greater detail. Right now it tries to supports the following cases:
All other types are given no layout or hint. This is especially unfortunate for opaque ADTs from
stdfor which we could do a lot (e.g.Result<NonZero<u32>,()>doesn't get a layout becauseNonZerois opaque). We could probably try to hardcode some layout schema for some of these types.