Skip to content

[CIR] Add special type and new operations for vptrs #1745

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

andykaylor
Copy link
Collaborator

@andykaylor andykaylor commented Jul 15, 2025

This change introduces a new type, cir.vptr, and two new operations, cir.vtable.get_vptr and cir.vtable.get_virtual_fn_addr to make operations involving vptrs more explicit. This also replaces cases where cir.vtable.address_point was being used as a general GEP-like operation and not actually returning the address point of a vtable.

Copy link
Collaborator

@xlauko xlauko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comments – potentially worth clarifying in documentation, naming, or as future improvements:

  • Is this type intended only for top-level vtable manipulation, i.e., not for representing specific vtable elements? It's currently unclear whether pointer arithmetic is meant to be allowed here.

    If the intention is to also represent individual vtable elements (e.g., a pointer to a function pointer), a name like !cir.vtable_element (or similar) might be more descriptive. As it stands, !cir.ptr<!cir.vtable> suggests that another level of indirection or some additional operation is required to access the actual function pointer.

    On the other hand, I dislike use of !cir.vtable_element in cases like dynamic_cast, where a different abstraction may be more appropriate.

    Perhaps it would make sense to introduce two distinct types:

    • !cir.vtable_ptr (instead of !cir.ptr<!cir.vtable>) to disallow arbitrary pointer arithmetic on vtables.
    • A separate type to represent specific vtable elements.

I’m basing this feedback primarily on what’s visible in this PR—I haven’t done a deep dive into where and how vtables are intended to be used across the broader system. Nonetheless, these use cases might be worth clarifying in the type documentation.

  • It might be worth considering the addition of dedicated vtable-related operations in the future, to avoid relying on long chains of casts, loads, and stores?

  • How does this representation relate to cir::MethodType? Some clarification around this would be helpful, especially in terms of how method types are expected to interact with vtables. But this is probably for further discussion as method types feels half baked.

Also, I’ve been reviewing how other high-level IRs handle similar concepts. One worth examining is Swift SIL.

Comment on lines 270 to 272
def CIR_AnyVTableType : CIR_TypeBase<"::cir::VTableType",
"vtable type">;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def CIR_AnyVTableType : CIR_TypeBase<"::cir::VTableType",
"vtable type">;
def CIR_AnyVTableType : CIR_TypeBase<"::cir::VTableType", "vtable type">;

Comment on lines 350 to 352
def CIR_VTableType : CIR_Type<"VTable", "vtable",
[DeclareTypeInterfaceMethods<DataLayoutTypeInterface>]> {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def CIR_VTableType : CIR_Type<"VTable", "vtable",
[DeclareTypeInterfaceMethods<DataLayoutTypeInterface>]> {
def CIR_VTableType : CIR_Type<"VTable", "vtable", [
DeclareTypeInterfaceMethods<DataLayoutTypeInterface>
]> {

@andykaylor
Copy link
Collaborator Author

@xlauko Thanks for the review! It does seem as though I've greatly oversimplified this. I'm going to put it aside for now, because it does need some more thought. I'll definitely add more documentation when I get back to it.

@lanza lanza force-pushed the main branch 2 times, most recently from d2c4ab8 to 8f89224 Compare July 23, 2025 17:04
This change introduces a new type, cir.vtable, to be used with operations
that return a pointer to a class' vtable.
@andykaylor andykaylor requested review from xlauko and erichkeane July 30, 2025 17:45
@andykaylor andykaylor changed the title [CIR] Add special type for vtables [CIR] Add special type and new operations for vptrs Jul 30, 2025
@andykaylor
Copy link
Collaborator Author

@tommymcm I couldn't add you as a reviewer, but I'd like to hear your thoughts on this change.

@tommymcm
Copy link
Contributor

tommymcm commented Jul 30, 2025

Thanks for the ping. Overall, I like the design. Here's some questions/notes.

Does a cir.vptr point to the start of the virtual functions, or the top of the vtable contents? Similarly, where does a cir.ptr<cir.vtable> point to? Is there a difference between the two?

Clarifying question about offsets: If I have the class definitions:

class Base { virtual int foo(); virtual int bar(); };
class Derived : Base { int foo() override; int bar() override; };

Do I get a pointer to Derived::bar() with cir.vtable.get_virtual_fn_addr(%vptr, index = 1)? Or am I indexing from somewhere else in the vtable?

Question: Does this work with multiple inheritance? For example:

class Duck { virtual ~Duck(); virtual int quack(); };
class Dog { virtual ~Dog(); virtual int woof(); };
class DuckDog : Duck, Dog { int quack() override; }

How do I get the vptr for Dog from a DuckDog? Do I need to gep into the Dog sub-object first?

Similar question: Does this work out of the box for virtual inheritance? Or will things need to change to support that?

@andykaylor
Copy link
Collaborator Author

Thanks for the ping. Overall, I like the design. Here's some questions/notes.

Does a cir.vptr point to the start of the virtual functions, or the top of the vtable contents? Similarly, where does a cir.ptr<cir.vtable> point to? Is there a difference between the two?

That depends on what is returned by cir.vtable.address_point (or cir.vtt.address_point in the case of classes with virtual bases) which in turn depends on the ABI (though the current implementation of cir.vtable.address_point seems to encode a bit of ABI information). For the Itanium ABI and the relative offset ABI, the vptr points to the start of the virtual function table (unless a VTT is required).

I've eliminated the cir.vtable type that I had introduced in the first draft of this PR. It could be worth introducing new types for VTables and VTTs, but I'm not attempting that here.

Also, I should mention that this PR doesn't change the physical layout of these objects at all. I'm just changing the abstractions used to access the objects.

Clarifying question about offsets: If I have the class definitions:

class Base { virtual int foo(); virtual int bar(); };
class Derived : Base { int foo() override; int bar() override; };

Do I get a pointer to Derived::bar() with cir.vtable.get_virtual_fn_addr(%vptr, index = 1)? Or am I indexing from somewhere else in the vtable?

cir.vtable.get_virtual_fn_addr(%vptr, index = 1) will index into the virtual function table provided by the %vptr operand, so it may point to Derived::bar() or Base::bar(). It depends on the dynamic type of the object that was used to get the vptr.

Possible dragon: Does this work with multiple inheritance? For example:

class Duck { virtual ~Duck(); virtual int quack(); };
class Dog { virtual ~Dog(); virtual int woof(); };
class DuckDog : Duck, Dog { int quack() override; }

How do I get the vptr for Dog from a DuckDog? Do I need to gep into the Dog sub-object first?

Good question. You would use cir.base_class_addr to get a pointer to the Dog base within DuckDog and then use cir.get_vptr to get the Dog vptr.

Similar question: Does this work out of the box for virtual inheritance? Or will things need to change to support that?

Virtual inheritance requires the use of VTTs, which are implemented, but may benefit from a similar change in abstractions. I haven't dug into that much yet.

@tommymcm
Copy link
Contributor

Wicked, thanks for the answers and sorry for some of them being a bit vague. I think this is a good change and fixes a lot of the headaches I have had when debugging C++ support in CIRGen. It might be good to add some C++ examples in the op descriptions to document what you described above.

Comment on lines +78 to +80
// AFTER-NEXT: %[[#ELEM_PTR:]] = cir.cast(bitcast, %[[#VPTR]] : !cir.vptr), !cir.ptr<!s64i>
// AFTER-NEXT: %[[#MINUS_TWO:]] = cir.const #cir.int<-2> : !s64i
// AFTER-NEXT: %[[#BASE_OFFSET_PTR:]] = cir.ptr_stride(%[[#ELEM_PTR]] : !cir.ptr<!s64i>, %[[#MINUS_TWO:]] : !s64i), !cir.ptr<!s64i>
Copy link
Member

@Lancern Lancern Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sequence of operations effectively get the vtable element at address point -2 which records the base-to-complete offset. I believe we will have a dedicated op for this operation (and also for other special address points such as the RTTI ptr) later so we don't have to bit-cast beween !cir.vptr and !cir.ptr<...>, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is my intent. The old sequence here did the same thing, but it was overloading the cir.vtable.address_point operation in a way that was rejected by the verifier when I added the constraint that cir.vtable.address_point should return a pointer to cir.vptr. I plan to add an operations in a future change that explicitly retrieves a pointer to the base-to-complete offset entry. I'm thinking something like cir.vtable.get_complete_offset_addr which will take a vptr as an input and return a pointer to integer.

Copy link
Collaborator

@xlauko xlauko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this seems reasonable, I just have few technical and style comments

@@ -2624,6 +2625,101 @@ def CIR_VTableAddrPointOp : CIR_Op<"vtable.address_point",[
let hasVerifier = 1;
}

//===----------------------------------------------------------------------===//
// VTableGetVptr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: to be consistent with type name this should be VTableGetVPtr

Comment on lines +2650 to +2652
let arguments = (ins
Arg<CIR_PointerType, "the vptr address", [MemRead]>:$src);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let arguments = (ins
Arg<CIR_PointerType, "the vptr address", [MemRead]>:$src);
let arguments = (ins
Arg<CIR_PointerType, "the vptr address", [MemRead]>:$src
);

Arg<CIR_PointerType, "the vptr address", [MemRead]>:$src);

let results = (outs CIR_PtrToVPtr:$vptr_ty);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

let arguments = (ins
Arg<CIR_PointerType, "the vptr address", [MemRead]>:$src);

let results = (outs CIR_PtrToVPtr:$vptr_ty);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actual pointer result not a type, so $result or $vptr or $results_vptr is more appropriate name.

Suggested change
let results = (outs CIR_PtrToVPtr:$vptr_ty);
let results = (outs CIR_PtrToVPtr:$result);

Comment on lines +2671 to +2675
The `vtable.get_virtual_fn_addr` operation retrieves the address of a
virtual function pointer from an object's vtable (__vptr).
This is an abstraction to perform the basic pointer arithmetic to get
the address of the virtual function pointer, which can then be loaded and
called.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe reference relatoin to !cir.vptr here.

Comment on lines +270 to +271
def CIR_AnyVPtrType : CIR_TypeBase<"::cir::VPtrType",
"vptr type">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def CIR_AnyVPtrType : CIR_TypeBase<"::cir::VPtrType",
"vptr type">;
def CIR_AnyVPtrType : CIR_TypeBase<"::cir::VPtrType", "vptr type">;

Comment on lines +2707 to +2720
let builders = [
OpBuilder<(ins "mlir::Type":$type,
"mlir::Value":$value,
"unsigned":$index),
[{
mlir::APInt fnIdx(64, index);
build($_builder, $_state, type, value, fnIdx);
}]>
];

let extraClassDeclaration = [{
/// Return the index of the record member being accessed.
uint64_t getIndex() { return getIndexAttr().getZExtValue(); }
}];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change type of $index from IndexAttr to I64Attr in arguments these all can be deleted, as it will auto generate getIndex returning underlying uint64_t and is buildable from unsigned.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also a nice improvement. I was following the example of GetMemberOp and a few others, which can also be cleaned up in the way you suggest.

Comment on lines +350 to +352
def CIR_VPtrType : CIR_Type<"VPtr", "vptr",
[DeclareTypeInterfaceMethods<DataLayoutTypeInterface>]> {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def CIR_VPtrType : CIR_Type<"VPtr", "vptr",
[DeclareTypeInterfaceMethods<DataLayoutTypeInterface>]> {
def CIR_VPtrType : CIR_Type<"VPtr", "vptr", [
DeclareTypeInterfaceMethods<DataLayoutTypeInterface>
]> {

cir::AddressPointAttr::get(CGF.getBuilder().getContext(), 0,
VTableIndex));
auto VTableSlotPtr =
CGF.getBuilder().create<cir::VTableGetVirtualFnAddrOp>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add auto &builder = CGM.getBuilder(); at top of the function to clean up multiple uses of the builder

auto fnTy = cir::FuncType::get({}, intTy);

auto resTy = cir::PointerType::get(cir::PointerType::get(fnTy));
auto resTy = cir::PointerType::get(cir::VPtrType::get(getContext()));

if (resultType != resTy)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I suggest to add let cppFunctionName = "isPtrToVPtrType"; into CIR_PtrToVPtr constraint which can be used in places like this if (isPtrToVPtrType(resultType)) without need to explicitly construct cir::PointerType::get(cir::VPtrType::get(getContext())); before.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is even needed here, since the return type is now encoded as CIR_PtrToVPtr in the op definition, but I'll keep this tip in mind for future uses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants