Generate proper stacktrace when calling reraise in Elixir #1804

mat-hek · 2025-08-20T08:38:05Z

👋 @bettio @pguyot I need your help on this one, as I'm not sure if it doesn't break anything, and it relies on at least one hack, described in the comments. Also, I'm not sure how to include the reproduction (snippet below) in the tests - I can translate it to Erlang, but it requires defining several modules and each test is currently a single module AFAIK.

The problem occurs when running the Elixir compiler and can be reproduced in the following way:

defmodule Raiser do
  def raise_error() do
    raise "foo"
  end
end

defmodule IntermediateCaller do
  def call_raiser() do
    Raiser.raise_error()
    :ok
  end
end

defmodule Reraiser do
  def reraise_error() do
    try do
      IntermediateCaller.call_raiser()
    rescue
      e -> reraise e, __STACKTRACE__
    end

    :ok
  end
end

try do
  Reraiser.reraise_error()
rescue
  e -> {e, __STACKTRACE__}
end

Currently, the stacktrace returned only includes Reraiser.reraise_error/0. After the fixes in this PR, it also includes IntermediateCaller.call_raiser/0 and Raiser.raise_error/0, as it should.

I found out that the reason is that nif_erlang_raise ignores its third argument.
The problem only occurs when the functions are in separate modules - when they're in one module, everything works as expected, so I suppose the compiler generates op_raise instead of nif_erlang_raise in that case (?).

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@bblaszkow06

Instead of ignoring the stacktrace passed to `:erlang.raise/3` it properly (?) handles it. Thans to that, the following code: ```elixir try do # This needs a fix BTW inspect(fn x -> x + 1 end) rescue e -> {e, __STACKTRACE__} end ``` that used to return: ```elixir {%ArgumentError{message: "argument error"}, [ {:elixir, :eval_external_handler, 3, [file: ~c"patches/estdlib/elixir.erl", line: 145]}, {:erl_eval, :avmo_do_apply, 4, [file: ~c"erl_eval", line: 754]}, {:erl_eval, :avmo_do_apply, 7, [file: ~c"erl_eval", line: 750]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1052]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1060]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1077]}, {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 364]}, {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 374]}, {Module.ParallelChecker, :verify, 1, [file: ~c"lib/module/parallel_checker.ex", line: 112]}, {Module.ParallelChecker, :verify, 1, [file: ~c"lib/module/parallel_checker.ex", line: 111]}, {Code, :avmo_validated_eval_string, 3, [file: ~c"lib/code.ex", line: 572]}, {RunExpr, :run, 1, [file: ~c"tmp/modules/117000009/_build/code.ex", line: 17]}, {RunExpr, :start, 0, [file: ~c"tmp/modules/117000009/_build/code.ex", line: 6]} ]} ``` now returns: ```elixir {%ArgumentError{message: "argument error"}, [ {Inspect.Function, :uniq, 1, [file: ~c"lib/inspect.ex", line: 497]}, {Inspect.Function, :default_inspect, 2, [file: ~c"lib/inspect.ex", line: 479]}, {Inspect.Algebra, :to_doc, 2, [file: ~c"lib/inspect/algebra.ex", line: 396]}, {Kernel, :inspect, 2, [file: ~c"lib/kernel.ex", line: 2381]}, {:elixir_eval, :__FILE__, 1, [ file: ~c"/Users/matheksm/ewr/fission_lib/tmp/modules/117000009/_build/code.ex", line: 18 ]}, {:elixir, :eval_external_handler, 3, [file: ~c"patches/estdlib/elixir.erl", line: 98]}, {:elixir, :eval_external_handler, 3, [file: ~c"patches/estdlib/elixir.erl", line: 99]}, {:erl_eval, :avmo_do_apply, 4, [file: ~c"erl_eval", line: 754]}, {:erl_eval, :avmo_do_apply, 7, [file: ~c"erl_eval", line: 750]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1052]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1060]}, {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1077]}, {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 364]}, {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 374]}, {Module.ParallelChecker, :verify, 1, [file: ~c"lib/module/parallel_checker.ex", line: 112]}, {Module.ParallelChecker, :verify, 1, [file: ~c"lib/module/parallel_checker.ex", line: 111]}, {Code, :avmo_validated_eval_string, 3, [file: ~c"lib/code.ex", line: 572]}, {RunExpr, :run, 1, [file: ~c"tmp/modules/117000009/_build/code.ex", line: 17]} ]} ``` However, I'm not 100% sure this doesn't break something ¯\\_(ツ)\_/¯ I plan to upstream this shortly and hopefully get some feedback. cc @bblaszkow06 @FKubisSWM --- These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later Signed-off-by: Mateusz Front <[email protected]>

mat-hek · 2025-08-20T08:39:34Z

src/libAtomVM/opcodesswitch.h

+                    } else {
+                        x_regs[0] = stacktrace_exception_class(stacktrace);
+                        x_regs[1] = exc_value;
+                        x_regs[2] = stacktrace_create_raw(ctx, mod, saved_pc - code, x_regs[0]);


Here I don't understand why we create a stacktrace, while it's already created and we use it to retrieve the exception class and value 🤔

mat-hek · 2025-08-20T08:44:53Z

src/libAtomVM/opcodesswitch.h

+                        //  FIXME: This is a temporary solution as in some niche cases of reraise the x_regs[0] is
+                        //  overwritten and it does not represent exception class
+                        if (!is_exception_class(x_regs[0])) {
+                            x_regs[0] = ERROR_ATOM;
+                        }


As described in the comment, the exception class may not be in x_regs[0] at this point, and then the error/exception is effectively lost. To avoid that, we set the exception class to error, but it may have been exit or throw as well. We can't get the exception class from the stacktrace, as it's already built, so I'm not sure where to get it from. Generally, keeping the exception class as part of the raw stacktrace feels a bit hacky to me, but I'm not sure where to keep it instead @bettio @pguyot

pguyot · 2025-08-20T20:14:39Z

You definitely can have a test with several modules. We already do. See catch_from_other_module.erl and raise_badmatch.erl.

Signed-off-by: Mateusz Front <[email protected]>

mat-hek · 2025-08-25T12:42:08Z

@pguyot Ok, I added the test. As it turns out, for reproduction it's necessary to wrap erlang:raise with erlang:error, as Elixir's reraise does, or basically do some calls before erlang:raise

bettio

I left a comment about an odd predicate function. However the most important change right now is making sure that code compiles on CI and all tests pass (right now it fails on some targets).

About your questions, I remember that Elixir code was relying a lot on raw_raise, did you check that? I think it is involved with reraise.

bettio · 2025-09-09T21:51:59Z

src/libAtomVM/stacktrace.c

+    } else if (term_is_list(stack_info_or_stacktrace)) {
+        return true;
+    } else {
+        fprintf(stderr, "Error: invalid stack_info or stacktrace passed to stacktrace_is_built");


This functions looks like a bit odd:

This function might assert (and never returns), but under other conditions it returns false, it's kind of unusual behavior

nitpicking: I don't expect sideffects (such ad I/O) in a predicate fuction

Made it abort if passed an invalid argument, is that what you meant? Left the printf so it's easier to debug.

Signed-off-by: Mateusz Front <[email protected]>

bettio · 2025-10-02T17:05:43Z

Let's try to summarize everything a little bit:

I think this PR is a workaround for a simple reason: as far as I understand, registers are not meant to hold error information while returning from the function that raised the error. So it is likely building on something quite wrong.

This is likely a wrong pattern:

  70 #define RAISE(a, b)  \
  71     ctx->x[0] = (a); \
  72     ctx->x[1] = (b); \
  73     return term_invalid_term();

We should store relevant information as part of Context, so they don't get clobbered by mistake.

I think this kind of workaround, that @mat-hek did, explains pretty well the issue (I think) we have:

static bool is_exception_class(term t)
{
    return t == ERROR_ATOM || t == LOWERCASE_EXIT_ATOM || t == THROW_ATOM;
}
[...]
if (stacktrace_is_built(stacktrace)) {
    //  FIXME: This is a temporary solution as in some niche cases of reraise the x_regs[0] is
    //  overwritten and it does not represent exception class
    if (!is_exception_class(x_regs[0])) {
        x_regs[0] = ERROR_ATOM;
    }

@mat-hek by the way, do you have any code sample where this corner case is visible and reproducible?

Any drawback of adding additional fields to Context?

I think we should be able to manage this adding to Context something like:

context_flags_t flags; // error, exit, throw, ... (space for more flags)
term reason;
term stacktrace;

In the future we might even add:

term value;

This can be set from VALIDATE_VALUE macro for improved debugging.
Also avoiding overwriting x[0]...x[2] registers, allows us to have improved stacktraces, than can contain function arguments and so on.

What we should do now?

I suggest finding if merging this PR (as a temporary workaround) is doable. Can we update this PR to JIT changes with a reasonable effort? If so, Popcorn can benefit this, since they can use an updated AtomVM. Otherwise it would be hard for them to stay on par with upstream (until we find some kind of better error management).

Also, I really think we should instead revamp error handling as soon as we have less moving parts on JIT side.

@pguyot your opinion is important here

Comments?

bettio

Let's discuss the proposal I did on my previous comment.

mat-hek · 2025-10-03T10:07:53Z

@mat-hek by the way, do you have any code sample where this corner case is visible and reproducible?

@bettio To be clear, this problem only occurs if you bring the changes from this PR without the fix. The problem is reproducible with popcorn, see this PR. It would require running popcorn master with fissionvm without the fix. I can set it up for you if needed.

Also, this PR doesn't fix any critical bugs - it just improves generated stacktraces, thus we can run popcorn without it. It'll be harder to debug though, so it may delay abandoning fissionvm indeed.

mat-hek · 2025-10-06T14:01:57Z

@bettio BTW, if you decide on how error handling should be eventually implemented and provide some guidance, we can try to contribute it ;)

mat-hek commented Aug 20, 2025

View reviewed changes

mat-hek changed the title ~~Generate proper stacktrace when using reraise in Elixir~~ Generate proper stacktrace when calling reraise in Elixir Aug 20, 2025

mat-hek force-pushed the mf/fix-reraise-stacktrace branch 6 times, most recently from 0a5605a to 171d517 Compare August 25, 2025 10:54

Fixed reraise not caught issue

256c096

Signed-off-by: Mateusz Front <[email protected]>

mat-hek force-pushed the mf/fix-reraise-stacktrace branch from 171d517 to c950098 Compare August 25, 2025 11:02

add tests

553054e

Signed-off-by: Mateusz Front <[email protected]>

mat-hek force-pushed the mf/fix-reraise-stacktrace branch from c950098 to 553054e Compare August 25, 2025 11:02

bettio requested changes Sep 9, 2025

View reviewed changes

CR

1cf1710

Signed-off-by: Mateusz Front <[email protected]>

mat-hek requested a review from bettio September 12, 2025 09:04

bettio reviewed Oct 2, 2025

View reviewed changes

mat-hek mentioned this pull request Oct 30, 2025

Adjust to the downstreamed AtomVM software-mansion/popcorn#301

Open

bettio mentioned this pull request Nov 1, 2025

Rework error handling #1951

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Generate proper stacktrace when calling reraise in Elixir #1804

Generate proper stacktrace when calling reraise in Elixir #1804

Uh oh!

mat-hek commented Aug 20, 2025 •

edited

Loading

Uh oh!

mat-hek Aug 20, 2025

Uh oh!

mat-hek Aug 20, 2025

Uh oh!

pguyot commented Aug 20, 2025

Uh oh!

mat-hek commented Aug 25, 2025

Uh oh!

bettio left a comment

Uh oh!

bettio Sep 9, 2025

Uh oh!

mat-hek Sep 12, 2025

Uh oh!

bettio commented Oct 2, 2025

Uh oh!

bettio left a comment

Uh oh!

mat-hek commented Oct 3, 2025 •

edited

Loading

Uh oh!

mat-hek commented Oct 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Generate proper stacktrace when calling reraise in Elixir #1804

Are you sure you want to change the base?

Generate proper stacktrace when calling reraise in Elixir #1804

Uh oh!

Conversation

mat-hek commented Aug 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mat-hek Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

mat-hek Aug 20, 2025

Choose a reason for hiding this comment

Uh oh!

pguyot commented Aug 20, 2025

Uh oh!

mat-hek commented Aug 25, 2025

Uh oh!

bettio left a comment

Choose a reason for hiding this comment

Uh oh!

bettio Sep 9, 2025

Choose a reason for hiding this comment

Uh oh!

mat-hek Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

bettio commented Oct 2, 2025

Uh oh!

bettio left a comment

Choose a reason for hiding this comment

Uh oh!

mat-hek commented Oct 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mat-hek commented Oct 6, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mat-hek commented Aug 20, 2025 •

edited

Loading

mat-hek commented Oct 3, 2025 •

edited

Loading