Skip to content

Conversation

@mat-hek
Copy link
Contributor

@mat-hek mat-hek commented Aug 20, 2025

👋 @bettio @pguyot I need your help on this one, as I'm not sure if it doesn't break anything, and it relies on at least one hack, described in the comments. Also, I'm not sure how to include the reproduction (snippet below) in the tests - I can translate it to Erlang, but it requires defining several modules and each test is currently a single module AFAIK.

The problem occurs when running the Elixir compiler and can be reproduced in the following way:

defmodule Raiser do
  def raise_error() do
    raise "foo"
  end
end

defmodule IntermediateCaller do
  def call_raiser() do
    Raiser.raise_error()
    :ok
  end
end

defmodule Reraiser do
  def reraise_error() do
    try do
      IntermediateCaller.call_raiser()
    rescue
      e -> reraise e, __STACKTRACE__
    end

    :ok
  end
end

try do
  Reraiser.reraise_error()
rescue
  e -> {e, __STACKTRACE__}
end

Currently, the stacktrace returned only includes Reraiser.reraise_error/0. After the fixes in this PR, it also includes IntermediateCaller.call_raiser/0 and Raiser.raise_error/0, as it should.

I found out that the reason is that nif_erlang_raise ignores its third argument.
The problem only occurs when the functions are in separate modules - when they're in one module, everything works as expected, so I suppose the compiler generates op_raise instead of nif_erlang_raise in that case (?).

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

Instead of ignoring the stacktrace passed to `:erlang.raise/3` it
properly (?) handles it. Thans to that, the following code:
```elixir
try do
  # This needs a fix BTW
  inspect(fn x -> x + 1 end)
rescue
  e -> {e, __STACKTRACE__}
end
```

that used to return:

```elixir
{%ArgumentError{message: "argument error"},
 [
   {:elixir, :eval_external_handler, 3,
    [file: ~c"patches/estdlib/elixir.erl", line: 145]},
   {:erl_eval, :avmo_do_apply, 4, [file: ~c"erl_eval", line: 754]},
   {:erl_eval, :avmo_do_apply, 7, [file: ~c"erl_eval", line: 750]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1052]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1060]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1077]},
   {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 364]},
   {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 374]},
   {Module.ParallelChecker, :verify, 1,
    [file: ~c"lib/module/parallel_checker.ex", line: 112]},
   {Module.ParallelChecker, :verify, 1,
    [file: ~c"lib/module/parallel_checker.ex", line: 111]},
   {Code, :avmo_validated_eval_string, 3, [file: ~c"lib/code.ex", line: 572]},
   {RunExpr, :run, 1,
    [file: ~c"tmp/modules/117000009/_build/code.ex", line: 17]},
   {RunExpr, :start, 0,
    [file: ~c"tmp/modules/117000009/_build/code.ex", line: 6]}
 ]}
```

now returns:

```elixir
{%ArgumentError{message: "argument error"},
 [
   {Inspect.Function, :uniq, 1, [file: ~c"lib/inspect.ex", line: 497]},
   {Inspect.Function, :default_inspect, 2,
    [file: ~c"lib/inspect.ex", line: 479]},
   {Inspect.Algebra, :to_doc, 2, [file: ~c"lib/inspect/algebra.ex", line: 396]},
   {Kernel, :inspect, 2, [file: ~c"lib/kernel.ex", line: 2381]},
   {:elixir_eval, :__FILE__, 1,
    [
      file: ~c"/Users/matheksm/ewr/fission_lib/tmp/modules/117000009/_build/code.ex",
      line: 18
    ]},
   {:elixir, :eval_external_handler, 3,
    [file: ~c"patches/estdlib/elixir.erl", line: 98]},
   {:elixir, :eval_external_handler, 3,
    [file: ~c"patches/estdlib/elixir.erl", line: 99]},
   {:erl_eval, :avmo_do_apply, 4, [file: ~c"erl_eval", line: 754]},
   {:erl_eval, :avmo_do_apply, 7, [file: ~c"erl_eval", line: 750]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1052]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1060]},
   {:erl_eval, :avmo_try_clauses, 10, [file: ~c"erl_eval", line: 1077]},
   {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 364]},
   {:elixir, :avmo_eval_forms, 4, [file: ~c"src/elixir.erl", line: 374]},
   {Module.ParallelChecker, :verify, 1,
    [file: ~c"lib/module/parallel_checker.ex", line: 112]},
   {Module.ParallelChecker, :verify, 1,
    [file: ~c"lib/module/parallel_checker.ex", line: 111]},
   {Code, :avmo_validated_eval_string, 3, [file: ~c"lib/code.ex", line: 572]},
   {RunExpr, :run, 1,
    [file: ~c"tmp/modules/117000009/_build/code.ex", line: 17]}
 ]}
```

However, I'm not 100% sure this doesn't break something ¯\\_(ツ)\_/¯ I
plan to upstream this shortly and hopefully get some feedback.

cc @bblaszkow06 @FKubisSWM

---

These changes are made under both the "Apache 2.0" and the "GNU Lesser
General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Signed-off-by: Mateusz Front <[email protected]>
} else {
x_regs[0] = stacktrace_exception_class(stacktrace);
x_regs[1] = exc_value;
x_regs[2] = stacktrace_create_raw(ctx, mod, saved_pc - code, x_regs[0]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I don't understand why we create a stacktrace, while it's already created and we use it to retrieve the exception class and value 🤔

Comment on lines +3753 to +3757
// FIXME: This is a temporary solution as in some niche cases of reraise the x_regs[0] is
// overwritten and it does not represent exception class
if (!is_exception_class(x_regs[0])) {
x_regs[0] = ERROR_ATOM;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As described in the comment, the exception class may not be in x_regs[0] at this point, and then the error/exception is effectively lost. To avoid that, we set the exception class to error, but it may have been exit or throw as well. We can't get the exception class from the stacktrace, as it's already built, so I'm not sure where to get it from. Generally, keeping the exception class as part of the raw stacktrace feels a bit hacky to me, but I'm not sure where to keep it instead @bettio @pguyot

@mat-hek mat-hek changed the title Generate proper stacktrace when using reraise in Elixir Generate proper stacktrace when calling reraise in Elixir Aug 20, 2025
@pguyot
Copy link
Collaborator

pguyot commented Aug 20, 2025

You definitely can have a test with several modules. We already do. See catch_from_other_module.erl and raise_badmatch.erl.

@mat-hek mat-hek force-pushed the mf/fix-reraise-stacktrace branch 6 times, most recently from 0a5605a to 171d517 Compare August 25, 2025 10:54
@mat-hek mat-hek force-pushed the mf/fix-reraise-stacktrace branch from 171d517 to c950098 Compare August 25, 2025 11:02
Signed-off-by: Mateusz Front <[email protected]>
@mat-hek mat-hek force-pushed the mf/fix-reraise-stacktrace branch from c950098 to 553054e Compare August 25, 2025 11:02
@mat-hek
Copy link
Contributor Author

mat-hek commented Aug 25, 2025

@pguyot Ok, I added the test. As it turns out, for reproduction it's necessary to wrap erlang:raise with erlang:error, as Elixir's reraise does, or basically do some calls before erlang:raise

Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about an odd predicate function. However the most important change right now is making sure that code compiles on CI and all tests pass (right now it fails on some targets).

About your questions, I remember that Elixir code was relying a lot on raw_raise, did you check that? I think it is involved with reraise.

} else if (term_is_list(stack_info_or_stacktrace)) {
return true;
} else {
fprintf(stderr, "Error: invalid stack_info or stacktrace passed to stacktrace_is_built");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functions looks like a bit odd:

  • This function might assert (and never returns), but under other conditions it returns false, it's kind of unusual behavior
  • nitpicking: I don't expect sideffects (such ad I/O) in a predicate fuction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it abort if passed an invalid argument, is that what you meant? Left the printf so it's easier to debug.

Signed-off-by: Mateusz Front <[email protected]>
@mat-hek mat-hek requested a review from bettio September 12, 2025 09:04
@bettio
Copy link
Collaborator

bettio commented Oct 2, 2025

Let's try to summarize everything a little bit:

  1. I think this PR is a workaround for a simple reason: as far as I understand, registers are not meant to hold error information while returning from the function that raised the error. So it is likely building on something quite wrong.

This is likely a wrong pattern:

  70 #define RAISE(a, b)  \
  71     ctx->x[0] = (a); \
  72     ctx->x[1] = (b); \
  73     return term_invalid_term();

We should store relevant information as part of Context, so they don't get clobbered by mistake.

I think this kind of workaround, that @mat-hek did, explains pretty well the issue (I think) we have:

static bool is_exception_class(term t)
{
    return t == ERROR_ATOM || t == LOWERCASE_EXIT_ATOM || t == THROW_ATOM;
}
[...]
if (stacktrace_is_built(stacktrace)) {
    //  FIXME: This is a temporary solution as in some niche cases of reraise the x_regs[0] is
    //  overwritten and it does not represent exception class
    if (!is_exception_class(x_regs[0])) {
        x_regs[0] = ERROR_ATOM;
    }

@mat-hek by the way, do you have any code sample where this corner case is visible and reproducible?

  1. Any drawback of adding additional fields to Context?

I think we should be able to manage this adding to Context something like:

context_flags_t flags; // error, exit, throw, ... (space for more flags)
term reason;
term stacktrace;

In the future we might even add:

term value;

This can be set from VALIDATE_VALUE macro for improved debugging.
Also avoiding overwriting x[0]...x[2] registers, allows us to have improved stacktraces, than can contain function arguments and so on.

  1. What we should do now?

I suggest finding if merging this PR (as a temporary workaround) is doable. Can we update this PR to JIT changes with a reasonable effort? If so, Popcorn can benefit this, since they can use an updated AtomVM. Otherwise it would be hard for them to stay on par with upstream (until we find some kind of better error management).

Also, I really think we should instead revamp error handling as soon as we have less moving parts on JIT side.

@pguyot your opinion is important here

Comments?

Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss the proposal I did on my previous comment.

@mat-hek
Copy link
Contributor Author

mat-hek commented Oct 3, 2025

@mat-hek by the way, do you have any code sample where this corner case is visible and reproducible?

@bettio To be clear, this problem only occurs if you bring the changes from this PR without the fix. The problem is reproducible with popcorn, see this PR. It would require running popcorn master with fissionvm without the fix. I can set it up for you if needed.

Also, this PR doesn't fix any critical bugs - it just improves generated stacktraces, thus we can run popcorn without it. It'll be harder to debug though, so it may delay abandoning fissionvm indeed.

@mat-hek
Copy link
Contributor Author

mat-hek commented Oct 6, 2025

@bettio BTW, if you decide on how error handling should be eventually implemented and provide some guidance, we can try to contribute it ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants