-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
change @allocated again
#59278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
change @allocated again
#59278
Conversation
|
From the test failures:
So I want to mark the test functions as constprop :none, but I'm a bit worried this will break many package tests since some of them probably depend on the constprop, possibly intentionally. Unfortunately the old implementation of If we want the constprop behavior, then I think the right thing is to wrap the given expression in a closure every time to make sure it is always optimized as a function on its own. But then you are going to get a fair number of false negatives. |
af3f6e4 to
6b38358
Compare
@allocated x = f(...) work as before@allocated x = f(...) more reliable and handle assignment syntax
@allocated x = f(...) more reliable and handle assignment syntax@allocated more reliable and handle assignment syntax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not fix #58780, since the example there looks like this:
a = @allocated if d_rotor[end] > 0
RPl, RPw = discretizeRotor(floris.rotor_points)
else
RPl = [0.0 0.0 0.0]
RPw = [1.0]
end
(small nit: making the first line of the commit message say "fix #xxxxx" is not useful, since the number is meaningless to others, but it reduces the amount of useful descriptive text you can put in that line before it may get truncated, and you have to repeat that number in the body anyways)
|
Where is that from? That example is not in the issue. And unfortunately I don't see any way to make that work. |
|
It is the code posted for the issue to be reopened (#58780 (comment)) |
|
Goodness, I never imagined anyone would use this in-line with their code. It's intended for testing and profiling. I don't think there's any way to support both use cases except to have two macros. In that example, what you want is basically where every byte allocated in that region, for whatever reason, is captured. But that is not what most users of I agree that the first use is compatible with the docstring, which says "during evaluation of the expression". But there are insane numbers of tests out there assuming a different meaning. I propose:
|
I think we had changed that because that will break a lot of tests |
|
In that case the only way to fix it would be to have some way to avoid the latestworld updates between statements. |
|
Actually this was only a problem inside testsets, since the code they generate is so complex. Maybe there is a workaround possible there. Update: turns out this is only caused by our deliberately inserting world updates in testsets. I don't know if we can just remove that, or maybe make it an option? |
|
OK here's another proposal: use the new implementation for call expressions, for anything more complex use the old implementation. That might reduce the failing cases to an acceptably low level. |
6b38358 to
36ccb0e
Compare
|
I believe that was tried before, and failed several packages also. "Just a call" is far more restrictive than people assume, since it is just exactly one call--not even fusion is possible. I also have a branch that supports the generalized version which does support multiple calls, before I concluded that heuristic mess is a bad idea |
|
Yeah, any sort of heuristic will never fully work. I think long term the best thing is to have two macros, one that works like the old In the meantime, I believe this could be fixed by fixing #58511; then we would not need world age updates in testsets and it will generally work much more like it did before. |
|
Why is it that |
|
It looks like the compiler is able to delete the allocation of the ReshapedArray in one case but not the other, amazingly, due to the slightly different code in the two macros. I'll see if I can fix this. |
36ccb0e to
1b6b8c5
Compare
|
In the linked MWE, the trouble is that Ready for review. |
1b6b8c5 to
9f7c9f4
Compare
@allocated more reliable and handle assignment syntax@allocated and add @allocated inside
|
It looks like we might need to subtract allocations from the compiler. But, I'm not sure it's worth it since in those cases you probably get argument boxing as well so |
3fe18cd to
c8fd1e4
Compare
7793572 to
fc63221
Compare
@allocated and add @allocated inside@allocated
fc63221 to
3919825
Compare
|
OK I came up with another good trick and now I think the behavior is very accurate and predictable (see tests). You will get different numbers sometimes, but that was going to happen anyway. The good part is you now have pretty good control over what happens, without any new features. If you were relying on the call site getting optimized to get 0 allocations, you can just wrap in an extra closure. If you get spurious new allocations, refactor to measure call sites individually or wrap in a closure. If you want to see spurious allocations from argument boxing, wrap your call in a begin...end block 😛 |
|
IIRC, I didn't use this form since it didn't fix most of the PkgEval failures, so we should check on that |
@allocated@allocated again
|
There was also a more advanced version of this PR in the original PR (#58057 (comment)) that did fix more PkgEval failures, but I thought it was getting a bit too complicated and magical (I think I have a branch somewhere with that comment turned into a working example) |
01e3bfa to
102ad25
Compare
Also add `@constprop :none` and `@noinline` to prevent the measuring code from interfering with the subject code. Fixes #58780
102ad25 to
23633c2
Compare
Also add
donotdeleteto ensure the code actually runs.fixes #58780