Array interface method devirtualization

The jit cannot devirtualize interface calls made through arrays, because this aspect of arrays has a special implementation in the runtime. I noticed this a while back but never wrote up anything. I was reminded of this again by #62266.

Consider a simple program like:
```c#
using System;
using System.Collections.Generic;

class X {
    public static void Main() {
        IList<int> list = new int[10];
        for (int i = 0; i < 100; i++) {
            foreach (var _ in list) {
                // nothing
            }
        }
    }
}
```
Here all types are known and we should be able to de-abstract this and come close to the equivalent for loop.

Currently none of the interface calls devirtualize or inline, the method allocates, and the try/finally can't be optimized away, despite the empty `Dispose` from the enumerator. 

Fixing all this requires resolving a number of issues:
1. `resolveVirtualMethodHelper` on the jit interface calls `CanCastToInterface` -- this bypasses special checks in `CanCastTo` that handle interface methods for arrays.
2.  Once that is fixed, the method that gets devirtualized to is `SZArrayHelper.GetEnumerator<T>` which is a generic method, not an instance method on a generic class. So the context handling in `resolveVirtualMethodHelper` needs to be updated here and also over in `impDevirtualizeMethod` -- currently we always assume class context.
3. Similar changes are needed in crossgen2 (haven't looked into this yet)
4. Once that is fixed, the jit can devirtualize the `GetEnumerator` call, but this happens "too late" to devirtualize the subsequent `MoveNext` and `get_Current` calls through the enumerator. The fix here is likely to mark these `GetEnumerator` methods as intrinsics and extend `getSpecialIntrinsicExactReturnType` to propagate the right type downstream during importation.
5. The `GetEnumerator` call is too big to inline by default. Likely we can just mark this with `AggressiveInlining`. But pragmatically we might want to boost inlining for methods that return a type (especially a sealed type) that is more derived that the declared return type. Doing so likely requires resolving more tokens during the pre-inline scan, but perhaps we're already doing this for `newobj` and it might not be too costly.
6. If we inline (and haven't fixed (4) yet) then we can late devirtualize the `MoveNext` and `Dispose` as we see the improved type flowing out of the inlined method -- but not the call to `get_Current` -- not sure why just yet.
7. The enumerator is a class, not a struct; if we made it a struct we might have a shot at removing the boxing (rather than proving we can stack allocate a class).  But this box would have multiple uses, so we'd also need to tackle #9118.
8. It remains to be seen if we actually do inline all those methods and undo the boxing whether we can then enable struct promotion and remove the overhead of iterating via an enumerator.

I will point out that the "equivalent" direct code (with loop non-empty to avoid it being optimized away entirely)
```c#
    public static int M_enum() 
    {
        int[] a = new int[1000];
        int result = 0;
        foreach (var v in a)
        {
           result ^= v;
        }
        return result;
    } 
```

is marginally less efficient than the for loop version:
```c#
    public static int M() 
    {
        int[] a = new int[1000];
        int result = 0;
        for (int i = 0; i < a.Length; i++)
        {
           result ^= a[i]; 
        }
        return result;
    }
```

In the more general case where we don't know the exact collection class, we might hope PGO would get us to an equivalent place. But here there are additional challenges.

9. The enumerator GDV doesn't allow us to know the enumerator type in the subsequent loop. We ought to notice a correlation between the type of the enumerator returned and the class type seen in the loop interface calls (or the fact that the latter are invariants) and decide to clone the loop based on a type test. That would remove the GDV tests from within the loop.
10. Ideally then we see that the loop cloning check is partially redundant based on the outcome of the `GetEnumerator` GDV test and simplify that path further. So, in the "good" case we do one type test to see what kind of collection we have, and then branch to a specialized loop that knows exactly what kind of enumerator we have.

This is perhaps the simplest case of de-abstraction. Other collection types have more complex enumerators.



category:cq
theme:devirtualization
skill-level:expert
cost:small
impact:medium

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Array interface method devirtualization #62457

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Array interface method devirtualization #62457

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions