[examples][mlir] Basic MLIR compilation and execution example #10
Conversation
Adds a simple end-to-end example demonstrating programmatic transform schedule creation, MLIR JIT compilation, execution, and numerical verification of the result. Additionally, a 'utils' submodule is added with basic tools to simplify creation of ctypes arguments in the format accepted by the jitted function.
CC: @tkarna
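For readers skimming the thread, here is a minimal sketch of the kind of end-to-end flow the example demonstrates: parse some IR, lower it, JIT it with ExecutionEngine, pack ctypes arguments for memrefs, and verify the result numerically. This is not the PR's actual code; the IR, pass pipeline, and helper usage are illustrative, and the real example builds its transform schedule programmatically instead of relying on a fixed pass pipeline.

```python
import ctypes
import numpy as np
from mlir import ir
from mlir.passmanager import PassManager
from mlir.execution_engine import ExecutionEngine
from mlir.runtime import get_ranked_memref_descriptor

# Toy payload IR: add 1.0 to every element of a 4-element buffer in place.
MODULE = """
func.func @add_one(%arg0: memref<4xf32>) attributes {llvm.emit_c_interface} {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c4 = arith.constant 4 : index
  %one = arith.constant 1.0 : f32
  scf.for %i = %c0 to %c4 step %c1 {
    %v = memref.load %arg0[%i] : memref<4xf32>
    %r = arith.addf %v, %one : f32
    memref.store %r, %arg0[%i] : memref<4xf32>
  }
  return
}
"""

with ir.Context(), ir.Location.unknown():
    module = ir.Module.parse(MODULE)

    # Lower to the LLVM dialect; the exact pipeline depends on the MLIR version.
    PassManager.parse(
        "builtin.module(convert-scf-to-cf,convert-arith-to-llvm,"
        "finalize-memref-to-llvm,convert-cf-to-llvm,convert-func-to-llvm,"
        "reconcile-unrealized-casts)"
    ).run(module.operation)

    # JIT-compile and call through the C interface wrapper generated by
    # llvm.emit_c_interface; memref arguments are passed as pointers to
    # ranked memref descriptors built from NumPy buffers.
    engine = ExecutionEngine(module, opt_level=2)
    buf = np.zeros(4, dtype=np.float32)
    arg = ctypes.pointer(ctypes.pointer(get_ranked_memref_descriptor(buf)))
    engine.invoke("add_one", arg)

    # Numerical verification of the result.
    np.testing.assert_allclose(buf, np.ones(4, dtype=np.float32))
```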
I'm happy for this to go as is. It's not how we want to go long term, but it does show the basic steps to create a pipeline and get some results in the end.
Next steps will be to explode that script into modules to read the various inputs (PyTorch, XLA, ONNX, MLIR), produce the various schedules and pipelines (static, dynamic, etc) and execute on the selected target (CPU, GPU, etc).
Sure, this could go in as is. It provides a minimal example of how to bring together the different things we have been working on/towards.
Basically just one overall comment, in general for our use of the Python bindings in this project:
As a general principle we can set for ourselves right at the start of the project: in case the upstream bindings already have a workable snake_case version of an op, let's use that over the CamelCaseOp version. The crux of the argument is that this will make the Python code look closer to a terse version of the MLIR textual format.
```python
# Create entry point transformation sequence.
with ir.InsertionPoint(schedule.body):
    named_seq = transform.NamedSequenceOp(
```
I believe transform.named_sequence should already work.
AFAIK, the snake_case wrappers expose and call only the base class of the ops.
In many cases that is enough, and I think it's good to prioritize their usage overall.
However, for more complex ops it means we miss out on the QoL overloads and extra initialization steps.
For named sequence, creating transform.NamedSequenceOp gives direct access to easier-to-use APIs and also initializes the region's blocks. So, I think in such cases it's better to use the CamelCase ops instead.
The same goes for ApplyPatternsOp or TileUsingForOp.
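For reference, a hedged sketch of the pattern being described here (not code from this PR; the exact constructor arguments may differ between MLIR versions):

```python
from mlir import ir
from mlir.dialects import transform

with ir.Context(), ir.Location.unknown():
    module = ir.Module.create()
    with ir.InsertionPoint(module.body):
        # The derived NamedSequenceOp builds the function_type attribute from
        # the given input/result types and appends the entry block for us.
        named_seq = transform.NamedSequenceOp(
            "__transform_main",
            [transform.AnyOpType.get()],  # input types
            [],                           # result types
        )
    with ir.InsertionPoint(named_seq.body):
        target = named_seq.bodyTarget  # convenience accessor on the derived class
        # ... transform ops consuming `target` would go here ...
        transform.YieldOp()
```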
> AFAIK, the snake_case wrappers expose and call only the base class of the ops.
This is only the case if the person adding a derived class forgot to also add a new wrapper alongside their derived class (the old wrapper will indeed only wrap the base class). This is a bug and should be fixed upstream. Whenever we encounter this, let's track all instances in an issue and I will fix it.
> For named sequence, creating transform.NamedSequenceOp gives direct access to easier-to-use APIs and also initializes the region's blocks. So, I think in such cases it's better to use the CamelCase ops instead.
I don't follow. In Python, transform.named_sequence is defined as follows:
```python
def named_sequence(
    sym_name, function_type, *,
    sym_visibility=None, arg_attrs=None, res_attrs=None, loc=None, ip=None,
) -> NamedSequenceOp:
    return NamedSequenceOp(
        sym_name=sym_name, function_type=function_type,
        sym_visibility=sym_visibility, arg_attrs=arg_attrs,
        res_attrs=res_attrs, loc=loc, ip=ip,
    )
```
Could you explain how using transform.NamedSequenceOp gives you more flexibility? (In my experience this hasn't been the case; that is, I have been using transform.named_sequence(...).body etc. without issue in my code.)
(I know some of the wrappers return TransformOp(...).result rather than TransformOp(...). I believe we should treat that as a bug as well and track and fix the instances we encounter.)
When I tried using transform.named_sequence, it goes through the wrapper you show above. Following the calls further, that wrapper seems to construct the base version:
```python
@_ods_cext.register_operation(_Dialect)
class NamedSequenceOp(_ods_ir.OpView):
```
instead of the derived one:
```python
@_ods_cext.register_operation(_Dialect, replace=True)
class NamedSequenceOp(NamedSequenceOp):
```
whose initializer takes care of building function_type (less verbose user code), initializes the region (self.regions[0].blocks.append(*input_types)), and provides convenient accessors:
```python
@property
def body(self) -> Block:
    return self.regions[0].blocks[0]

@property
def bodyTarget(self) -> Value:
    return self.body.arguments[0]

@property
def bodyExtraArgs(self) -> BlockArgumentList:
    return self.body.arguments[1:]
```
I'm still pretty green in MLIR's Python world, so it might be that I'm just doing something really stupid 😅
Another example: in the case of tiling it's just verbosity:
```python
structured.TileUsingForOp(op, sizes=[1, 32])
```
vs
```python
structured.structured_tile_using_for(
    anytype,
    [anytype, anytype],
    op,
    dynamic_sizes=[],
    static_sizes=[1, 32],
    scalable_sizes=[False, False],
)
```
You are right! transform.named_sequence is only wrapping the base class version! My mistake, I should have checked (my code does access body, but does so as a Region, as that is what the base class exposes) -- sorry.
I will put up a PR upstream to fix this for the main transform ops, such as named_sequence, ASAP. Here's an upstream issue to track this: llvm/llvm-project#166765. Let's update it whenever we come across these issues. Please add any other ops you encounter such issues with!
Thanks, will do 👍
I'm still all in for pushing snake_case.
CamelCase usage in lighthouse will also serve as a small tracker of what needs to be tweaked.
Adds a simple end-to-end example demonstrating programmatic transform schedule creation, MLIR JIT compilation, execution, and numerical verification of the result.
Additionally, a 'utils' submodule is added with basic tools to simplify creation of ctypes arguments in the format expected by MLIR-jitted functions.
Co-authored-by: Rolf Morel [email protected]
Co-authored-by: Tuomas Kärnä [email protected]
Co-authored-by: Frank Schlimbach [email protected]
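For context on the utils part of the change, here is a hedged sketch of the kind of ctypes plumbing such helpers typically wrap when talking to an MLIR-jitted function through the upstream mlir.runtime module. The helper names memref_arg and empty_memref_result are illustrative and not the PR's actual API.

```python
import ctypes
import numpy as np
from mlir.runtime import (
    get_ranked_memref_descriptor,
    make_nd_memref_descriptor,
    ranked_memref_to_numpy,
)

def memref_arg(array: np.ndarray):
    # Input buffers: wrap a NumPy array in a ranked memref descriptor and pass
    # a pointer to it, as the llvm.emit_c_interface wrappers expect.
    return ctypes.pointer(ctypes.pointer(get_ranked_memref_descriptor(array)))

def empty_memref_result(rank: int, dtype):
    # Returned memrefs: allocate an empty descriptor for the jitted function to
    # fill in; convert it back to NumPy afterwards with ranked_memref_to_numpy.
    ctype = np.ctypeslib.as_ctypes_type(np.dtype(dtype))
    return ctypes.pointer(ctypes.pointer(make_nd_memref_descriptor(rank, ctype)()))

# Illustrative usage with an ExecutionEngine instance named `engine`
# (the C interface passes memref results before the inputs):
# res = empty_memref_result(1, np.float32)
# engine.invoke("compute", res, memref_arg(a), memref_arg(b))
# out = ranked_memref_to_numpy(res[0])
```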