- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.2k
Expand capability of CoreCLR Interpreter #113292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The opcode name, its argument type and a method for decoding the opcode enum value from IL byte stream.
| Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch | 
These will be used for arg/local loads/stores and various other move operations.
With some floating point exceptions that require special handling
487a2d3    to
    6ad4cd1      
    Compare
  
    This is done as an initial iteration over the IL code, with the main objective of creating the initial set of basic blocks and mapping from il offset to basic block. In future changes, when we compile the code, these basic blocks will be linked together, according to the control flow resulting from IL instructions.
The interpreter will require additional data structures in the future: linked list, hashtable, bitset.
6ad4cd1    to
    113ae1d      
    Compare
  
    From the initial set of basic blocks we will start generating code for the current method. Each basic block has a stack state, representing the state of the IL stack at the moment when we enter. Codegen starts from a dummy entry bblock which has an empty stack state. As we iterate over instructions, we check if a new bblock starts at this offset (this is determined based on the m_ppOffsetToBB mapping created in a previous commit). If we are starting generating code inside a new bblock, various things need to be done. We need to check if the basic block is a fall through from the previous bblock (in which case we initialize the stack state of this new bblock). Also if the bblock is not fall through, we can only generate code into it if the stack state is initialized. Otherwise, we will skip through its instructions and, once the traverse the entire IL, we will do a reiteration generating code only in the bblocks that haven't yet been emitted. When emitting the final code, one basic block at a time, we update the real native offset of each basic block. When we need to emit a branch, in case we know the offset of the target bblock, we emit it directly in the code. Otherwise we add a relocation record in an array. Once we finish generating code for all basic blocks, we traverse the relocation array and patch all recorded instruction slots with the now known branch offset.
Add a new InterpTypeByRef, to be used to precisely track vars that might contain interior pointers and need to be reported to the GC. We didn't have this type on mono.
squash into branches
Unlike other instructions, INTOP_CALL doesn't have a fixed number of sVars, since the method to be called can have any signature. Instead it will hold a -1 terminated array of sVars. The arguments will be placed one after the other, at an offset referred to as callArgsOffset, representing the only source offset for the call instruction. Vars that are sources for a call will be allocated in this sequential position by the var offset allocator, which is not yet included as part of this commit. INTOP_CALL instruction receives an additional fixed argument, which is a tagged `MethodDesc*`. Rather than embedding the pointer into the instruction stream, this is passed as an index into a table. This will both reduce code memory use, since identical data will be stored at the same index, as well as provide us with an unique slot to atomically patch the data. During first execution of a call, we would observe that the data item is a tagged pointer, suggesting this is a MethodDesc*, in which case we try to obtain the actual method code, triggering method compilation if necessary. Once this is done, the MethodDesc pointer will be replaced with the actual code pointer. In the future, this pointer can be either interpreter IR or jit compiled code. Given we currently only have support to specify a single method to be interpreted, in order to expand on this, this commit introduces a temporary hack behavior where once we enter the interpreter, we don't exit it for methods from the same module. In the future we might want to switch this to passing an env var specifying which assembly to be interpreted and which to be jitted.
The var offset allocator allocates for each var a positive offset. All instructions will end up referencing this offset instead of the var index. During execution, these offsets are added to the frame's stack pointer to obtain the actual location on the interpreter stack for the value. We have three logical areas on the interpreter stack where vars reside. The global vars space, the local vars space followed at last by the param area space. In the global vars space, each global var will have an offset allocated to it, that its constant throughout method execution. This space is first filled by the argument vars (which must be at the beginning of the stack, according to the interpreter ccall convention). Then we allocate each IL local in this space. Once we finish generating code for the method, we enter the actual var offset allocator. First we detect which vars are used in multiple basic blocks, these vars will be marked as global and have an unique offset allocated. The space following the global vars will be used for allocating vars used in a single basic block. We will traverse each bblock and, as a local var is defined, we will allocate it at the current offset. We have liveStart and liveEnd computed for each local var (measured as instruction indexes in the corresponding basic block). Based on these liveness markers, we determine when to pop vars from the set of active vars (vars that are currently live). Variables that are arguments to a call will be allocated instead in the param area. Given we don't know the exact offset of the param area during bblock iteration (because it follows the local vars space, which is determined only after traversing all bblocks), these vars will be initially allocated with an offset relative within the param area. At the very end, we will offset these with the param area offset. Vars passed to calls have to die immediately following the call. If the call arg var can't satisify this constraint (for example the var is global, or a local var that could be referenced following the call), we will create a new temporary var that we copy the arg into, and this new var will serve as the call arg instead.
This will have to be enabled via an env var, rather than statically in code.
Because we currently don't have support for interoping between interpreter and jit, in order to signal test failure in interpreter test we will just crash instead.
Minor fixes to ensure tests work. Add support for returns of any type. Add opcode for loading full I4.
113ae1d    to
    0dc9815      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR expands the CoreCLR Interpreter tests by adding support for a range of arithmetic operations and control flow through new test functions.
- Added tests for summing numbers recursively, multiplying four numbers, and testing a switch-based operation.
- Introduced a power computation loop test to validate branching and arithmetic behavior.
Reviewed Changes
| File | Description | 
|---|---|
| src/tests/JIT/interpreter/Interpreter.cs | Adds new test methods for arithmetic and branching operations in the interpreter. | 
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (3)
src/tests/JIT/interpreter/Interpreter.cs:20
- Consider providing a descriptive message in the 'Environment.FailFast' call to facilitate debugging in failure scenarios.
Environment.FailFast(null);
src/tests/JIT/interpreter/Interpreter.cs:22
- Consider providing a descriptive message in the 'Environment.FailFast' call to facilitate debugging in failure scenarios.
Environment.FailFast(null);
src/tests/JIT/interpreter/Interpreter.cs:65
- Consider providing a descriptive message in the 'Environment.FailFast' call to facilitate debugging in failure scenarios.
Environment.FailFast(null);
| break; | ||
| case INTOP_CONV_U8_R4: | ||
| case INTOP_CONV_U8_R8: | ||
| // TODO | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these TODOs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The int/float conversions are not as trivial as one would hope. We've had had quite a bit of issues when we switched to running tests on mono interpreter from the runtime repo. For example the implementation for this conversion on mono is
runtime/src/mono/mono/mini/jit-icalls.c
Line 929 in d15a82e
| mono_fconv_u8 (double v) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These helpers are implemented as JIT helpers in CoreCLR:
runtime/src/coreclr/vm/jithelpers.cpp
Lines 375 to 432 in 890de13
| HCIMPL1_V(int64_t, JIT_Dbl2Lng, double val) | |
| { | |
| FCALL_CONTRACT; | |
| #if defined(TARGET_X86) || defined(TARGET_AMD64) || defined(TARGET_ARM) | |
| const double int64_min = -2147483648.0 * 4294967296.0; | |
| const double int64_max = 2147483648.0 * 4294967296.0; | |
| return (val != val) ? 0 : (val <= int64_min) ? INT64_MIN : (val >= int64_max) ? INT64_MAX : (int64_t)val; | |
| #else | |
| return (int64_t)val; | |
| #endif | |
| } | |
| HCIMPLEND | |
| /*********************************************************************/ | |
| HCIMPL1_V(uint32_t, JIT_Dbl2UInt, double val) | |
| { | |
| FCALL_CONTRACT; | |
| #if defined(TARGET_X86) || defined(TARGET_AMD64) | |
| const double uint_max = 4294967295.0; | |
| // Note that this expression also works properly for val = NaN case | |
| return (val >= 0) ? ((val >= uint_max) ? UINT32_MAX : (uint32_t)val) : 0; | |
| #else | |
| return (uint32_t)val; | |
| #endif | |
| } | |
| HCIMPLEND | |
| /*********************************************************************/ | |
| HCIMPL1_V(int32_t, JIT_Dbl2Int, double val) | |
| { | |
| FCALL_CONTRACT; | |
| #if defined(TARGET_X86) || defined(TARGET_AMD64) | |
| const double int32_min = -2147483648.0; | |
| const double int32_max_plus_1 = 2147483648.0; | |
| return (val != val) ? 0 : (val <= int32_min) ? INT32_MIN : (val >= int32_max_plus_1) ? INT32_MAX : (int32_t)val; | |
| #else | |
| return (int32_t)val; | |
| #endif | |
| } | |
| HCIMPLEND | |
| /*********************************************************************/ | |
| HCIMPL1_V(uint64_t, JIT_Dbl2ULng, double val) | |
| { | |
| FCALL_CONTRACT; | |
| #if defined(TARGET_X86) || defined(TARGET_AMD64) | |
| const double uint64_max_plus_1 = 4294967296.0 * 4294967296.0; | |
| // Note that this expression also works properly for val = NaN case | |
| return (val >= 0) ? ((val >= uint64_max_plus_1) ? UINT64_MAX : (uint64_t)val) : 0; | |
| #else | |
| return (uint64_t)val; | |
| #endif | |
| } | |
| HCIMPLEND | 
I expect that you will need functionality that is implemented as a JIT helper in number of places in the executor, so we will need to invent the preferred pattern to do that. The easiest option is to just call the JIT helper from the interpreter.
| ip += 3; | ||
| break; | ||
| case INTOP_CONV_I1_R4: | ||
| LOCAL_VAR(ip[1], int32_t) = (int8_t)(int32_t)LOCAL_VAR(ip[2], float); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the floating point -> int conversions will need more work to handle the overflow cases deterministically according to the current .NET spec.
| I added a new commit containing all the changes from PR suggestions. Let me know if I missed something. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
| Don't recall this failing before: @janvorli I would assume the correct fix for this would be, in  | 
| 
 Just  The reason why you haven't seen it before is that the contracts are only checked on Windows. | 
Adds support for a few binary and unary operators, loading args/locals, branching operations and basic block support, direct calls and adds an offset allocator for vars so everything can work.