Description
This is a global tracking issue for landing the initial PRs of TensorIR scheduling. The original RFC can be found here. Please feel free to bring in more questions to the corresponding thread.
To make the improvement manageable, we will break the code into several steps. The main architecture scaffolding and data structure changes will come first; then we will upstream individual scheduling primitives incrementally, with each component being self-contained.
Key Data Structures
The following key data structures will be introduced to handle the support of TensorIR.
Logically, the data structures belong to the following layers:
- D0: Block and BlockRealize: these are nodes to enable the block declarations in the stmt
- D1: ScheduleState: an auxiliary data structure to encode additional dependency information used for schedule transformations
  - Most schedule primitives can be viewed as ScheduleState => ScheduleState transformations
  - We will also introduce SRef, a reference pointer into the AST (referring to loops and blocks), to facilitate transformations
- D2: Schedule: interface for running schedule transformations, like the current te.schedule
These data structures will form the initial scaffolding. The transformation primitives will then behave like passes that depend only on D0 and D1. Each primitive will be exposed through a user-facing API via D2.
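The layering above can be sketched with a toy model in plain Python. This is only an illustration under stated assumptions, not the TVM implementation: the classes reuse the names `Block`, `SRef`, `ScheduleState` and `Schedule` from the list above, but `Loop`, the field names, the `split` signature and the `_o`/`_i` naming are hypothetical, and the toy only handles a single chain of loops ending in one block.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Block:
    """D0 (toy stand-in): a leaf computation identified by name."""
    name: str


@dataclass(frozen=True)
class Loop:
    """D0 (toy stand-in): a loop with a single statement as its body."""
    var: str
    extent: int
    body: "Stmt"


Stmt = Union[Block, Loop]


@dataclass(frozen=True)
class SRef:
    """Reference to a loop or block in the AST, linked to its parent."""
    stmt: Stmt
    parent: Optional["SRef"]


class ScheduleState:
    """D1: the AST plus auxiliary lookup information (here, only SRefs)."""

    def __init__(self, root: Stmt) -> None:
        self.root = root
        self.srefs: Dict[str, SRef] = {}
        parent: Optional[SRef] = None
        node: Stmt = root
        while True:
            key = node.var if isinstance(node, Loop) else node.name
            parent = SRef(node, parent)
            self.srefs[key] = parent
            if isinstance(node, Block):
                break
            node = node.body


def split(state: ScheduleState, loop_var: str, factor: int) -> ScheduleState:
    """A primitive as a ScheduleState => ScheduleState transformation."""
    target = state.srefs[loop_var].stmt
    if not isinstance(target, Loop) or target.extent % factor != 0:
        raise ValueError("toy split needs a loop whose extent is divisible")

    def rewrite(node: Stmt) -> Stmt:
        if node is target:
            inner = Loop(target.var + "_i", factor, target.body)
            return Loop(target.var + "_o", target.extent // factor, inner)
        if isinstance(node, Loop):
            return Loop(node.var, node.extent, rewrite(node.body))
        return node

    return ScheduleState(rewrite(state.root))


class Schedule:
    """D2: user-facing interface that drives the primitives."""

    def __init__(self, root: Stmt) -> None:
        self.state = ScheduleState(root)

    def get_loops(self, block_name: str) -> List[str]:
        """Loop variables around a block, outermost first."""
        loops: List[str] = []
        sref = self.state.srefs[block_name].parent
        while sref is not None:
            loops.append(sref.stmt.var)
            sref = sref.parent
        return loops[::-1]

    def split(self, loop_var: str, factor: int) -> List[str]:
        self.state = split(self.state, loop_var, factor)
        return [loop_var + "_o", loop_var + "_i"]


sch = Schedule(Loop("i", 128, Loop("j", 64, Block("C"))))
sch.split("i", 32)
print(sch.get_loops("C"))  # ['i_o', 'i_i', 'j']
```

In this sketch the primitive touches only the D0 nodes and the D1 state, and `Schedule` merely forwards to it, which mirrors the dependency rule described above.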
Steps
- M1a PR0: The additional data structures needed for TensorIR, in particular the Block structure, and changes to related fields. [TensorIR][M1a] Introduce Block and BlockRealize #7553
- M1a PR1: Changes to the TVMScript parser and printer to allow us to ingest and print the new data structures. [TensorIR][M1a] TVMScript Parser/Printer #7630
- M1b PR0: Initial scaffolding of the ScheduleState and Schedule data structures, with basic utilities and no primitives
  - Enhancing visitors [TIR] Add PreOrderVisit and VisitPrimFuncs #7627
  - Arithmetic analysis on iter-affine-map: [ARITH] normalize iter affine map expr to PrimExpr #7759 [ARITH] Subspace division #7760
  - Core data structure: ScheduleState [TensorIR][M1b] Scaffolding ScheduleState data structure #7765
  - The schedule class [TensorIR][M1b] Schedule class #7847
- M1c PRs: Overall lowering API to lower a new TIR function to the codegen.
  - LowerInitBlock: [TensorIR][M1c] LowerInitBlock #7806
  - LCA detection for buffer allocation site: [TensorIR][M1c] LCA detector #7848
  - Plan and update buffer allocation location: [TensorIR][PASS][M1c] PlanUpdateBufferAllocationLocation #7873
  - Compact buffer allocation: [TensorIR][PASS][M1c] CompactBufferAllocation #7923
  - Flatten buffer: Flatten buffer access into 1D (similar to storage flatten in the mainline) [TensorIR][Pass][M1c] FlattenBuffer #7962
  - Lower and build TensorIR: [TensorIR][M1c] Lower and build TensorIR #8044
  - Sub-region buffer match: [TensorIR] Support for match_buffer from subregion #8585
  - VerifyBufferMatch (optional for now, verify that buffer layout constraints are met in sub region matches)
  - Create TIR from TE: [TensorIR] CreatePrimFunc from TE #7987
- M2a PRs: multiple PRs to support scheduling primitives, one PR per primitive, with related test cases.
  - Checks for region cover property [TensorIR][M2a] Verification of cached flags #8114
  - Structural error reporting in schedule primitives [TensorIR][M2a] Structural Error Reporting #8121
  - Primitive: compute-inline & reverse-compute-inline [TensorIR][M2a] Compute-Inline,Reverse-Compute-Inline #8170
  - Primitive: fuse & split [TensorIR][M2a] Fuse, Split #8467
  - Primitive: rfactor [TensorIR][M2a] Reduction Factoring (RFactor) #8544
  - Primitive: storage-align [TensorIR][M2a] Storage Align #8693
  - Primitive: vectorize & unroll & bind thread [TensorIR][M2a] Parallel, Vectorize, Bind & Unroll #8716
  - Primitive: reorder [TensorIR][M2a] Reorder #8767
  - Primitive: cache-read & cache-write [TensorIR][M2a] CacheRead/Write #8863
  - Primitive: compute-at & reverse-compute-at [TensorIR][M2a] Compute-At #8943
  - Primitive: decompose-reduction [TensorIR][M2a] Decompose-Reduction #9041
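For readers unfamiliar with the terms, the index arithmetic behind two of the steps above (flattening a buffer access into 1D, and the fuse & split primitives) can be sketched in plain Python. This is a minimal illustration and not the code from the linked PRs: the function names are hypothetical, a row-major layout is assumed for flattening, and the bounds guard in the split loop is the usual way to handle a factor that does not divide the extent.

```python
from typing import List, Sequence, Tuple


def flatten_index(indices: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major 1D offset of a multi-dimensional buffer access."""
    if len(indices) != len(shape):
        raise ValueError("indices and shape must have the same rank")
    offset = 0
    for idx, dim in zip(indices, shape):
        if not 0 <= idx < dim:
            raise IndexError("index out of bounds")
        offset = offset * dim + idx
    return offset


def split_iterations(extent: int, factor: int) -> List[int]:
    """Values of i visited after splitting `for i in range(extent)`."""
    outer_extent = (extent + factor - 1) // factor  # ceiling division
    visited = []
    for outer in range(outer_extent):
        for inner in range(factor):
            i = outer * factor + inner
            if i < extent:  # guard for extents not divisible by factor
                visited.append(i)
    return visited


def fuse_iterations(extent_i: int, extent_j: int) -> List[Tuple[int, int]]:
    """(i, j) pairs recovered from a single fused loop variable."""
    return [divmod(fused, extent_j) for fused in range(extent_i * extent_j)]


print(flatten_index((1, 2, 3), (4, 5, 6)))  # 45
print(split_iterations(10, 4) == list(range(10)))  # True
print(fuse_iterations(2, 3))
```

Both loop rewrites visit exactly the iterations of the original loop nest in the same order, which is the property a schedule primitive has to preserve.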