- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
Add libbacktrace support for Apple platforms (resubmitted) #44251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This programs compiled without -g on macOS still provide the resolve to actual symbols, instead of `<unknown>` everywhere.
| (rust_highfive has picked a reviewer for you, use r? to override) | 
| @bors: r+ | 
| 📌 Commit 7169fe5 has been approved by  | 
| ⌛ Testing commit 7169fe5 with merge caa5f7fc5b3e84b57827a21219f196a570afeb4d... | 
| 💔 Test failed - status-travis | 
| 
  | 
| ⌛ Testing commit 7169fe5 with merge e3761e1b85e0019044513195cda81e93f3e51cf4... | 
| 💔 Test failed - status-travis | 
| 
 (Now trying to reproduce locally.) Update: Reproduced. Looks like the line numbers are just missing in  | 
| @alexcrichton  The particular test case is fixed in 6466475b98c9019b7bcdd0a1eec346cccae23cea (tested locally on both  | 
| Hm I'm getting pretty nervous about this, we're adding a pretty big wad of umaintained C to the standard library for all Rust programs on OSX and we've already found a bug in it? I'd personally feel more comfortable if we could get this upstream first, or if we know that this had gone through something like fuzzers and whatnot (or had it off by default maybe?) | 
| 
 That's hard to do, as the upstream has no code updates for 1 year already, with both ianlancetaylor/libbacktrace#2 and ianlancetaylor/libbacktrace#6 idling 😅. Maybe rust-lang needs to fork the repo, or switch to a more maintained library ( 
 There is a test file in https://github.com/ianlancetaylor/libbacktrace/blob/master/btest.c but I haven't tried to run it. Not sure if it was tested in #43422. I think it's not fuzzed 😨. | 
| I've run the btest program but it's not even as comprehensive as Rust's existing test. I'm not sure the current fix is fixing it in the right place: macho_add should not return 0 if it fails to find a dsym file or symbol table. It should only do so on a hard error condition, which is why I originally passed the error up the chain. Since the test is passing when this is bypassed it appears macho_add is failing on one of the loaded dylibs, and that this dylib is either only loaded by 32-bit executables or only the 32-bit slice throws the code for a loop. I'll try to pin down which one. | 
| @JohnColanduoni The problem is that when loading the linked system libraries ( Still, if we compare with the ELF code, I think how 6466475 handles it is closer in spirit. (Snippet of the ELF code)int
backtrace_initialize (struct backtrace_state *state, int descriptor,
		      backtrace_error_callback error_callback,
		      void *data, fileline *fileline_fn)
{
  // ...
  dl_iterate_phdr (phdr_callback, (void *) &pd);
  // ...
}
static int
phdr_callback (struct dl_phdr_info *info, size_t size ATTRIBUTE_UNUSED,
	       void *pdata)
{
  // ...
  if (elf_add (pd->state, descriptor, info->dlpi_addr, pd->error_callback,
	       pd->data, &elf_fileline_fn, pd->found_sym, &found_dwarf, 0))
    {
      if (found_dwarf)
	{
	  *pd->found_dwarf = 1;
	  *pd->fileline_fn = elf_fileline_fn;
	}
    }
  return 0;
}Update: On i686 the NATIVE_CPU_TYPE is 7 (good) but the computed  Update 2: ಠ_ಠ @@ -288,7 +288,7 @@ macho_get_commands (struct backtrace_state *state, int descriptor,
 
       archs_total_size = arch_count * sizeof (struct fat_arch);
 
-      if (!backtrace_get_view (state, descriptor, sizeof (fat_header),
+      if (!backtrace_get_view (state, descriptor, sizeof (struct fat_header),
                                archs_total_size, error_callback,
                                data, &fat_archs_view))Fixing this now gives the  | 
6466475    to
    e53638c      
    Compare
  
    | @kennytm It looks like the problem is lazy linking on i386 causing the virtual address slide for those system libraries to be zero. Adding a  | 
| @JohnColanduoni The vmslide is also zero when ASLR is disabled, e.g. when running under LLDB. I don't think that's a valid solution. | 
| @kennytm Ah you're right. That should only be for the main executable though, any dylibs will need to be loaded somewhere else even without ASLR (and the main executable will never be lazy loaded anyway). Perhaps  | 
e53638c    to
    aa6bd11      
    Compare
  
    | @JohnColanduoni Added to aa6bd11 The remaining problem is to demonstrate that the code is correct and safe (#44251 (comment)). | 
| Thanks for the investigation and the fix here @kennytm and @JohnColanduoni! To be clear I don't want to necessarily block this from landing until it's 100% perfect. To some degree the problem here is basically "we need to decide to take on the maintenance burden". Using libbacktrace at all has been a maintenance burden many times before, but if we went back and did it all over again I feel like we'd still decide to include it because line numbers are just so damn useful. In that sense I think we're basically commited to "we really need this in one form or another." There's a lot of OSX users and they'll be quite happy to start seeing line numbers in backtraces I imagine! We basically, I think, just need to think about possible mitigations. Some mitigations for making this as low-impact as possible are: 
 Y'all have already finished (1), I don't think we'd get much benefit from (2), and I don't think that (3) is well-defined enough to ever actually be finished/completed. In that sense while I still do personally worry about this I think it's something we should land and take under our maintenance wing. We've effectively done that for MinGW and libbacktrace, and this isn't really adding all that much extra from that point of view. Perhaps one day we can switch to  What do y'all think of that? | 
| 
 You mean an option in additional to  | 
| In theory, yes, but again I don't think it's worth it. | 
| @alexcrichton How hard would it be to use the  | 
| @JohnColanduoni in theory not too hard, but we've never done it yet so I suspect it would take quite some time to integrate and get it fully implemented. I also think those aren't the parts that we'd need here? I thought that what this PR does is adding support for basically locating the  | 
| @JohnColanduoni Note that  I think a RIIR  | 
| @alexcrichton Yes, and ideally  @kennytm Yeah that was my thought as well. Is there a reason not to use DbgHelp when dealing with PDBs? | 
| @kennytm and @JohnColanduoni do y'all still think we should land this at this moment? (I think so, myself) | 
| @JohnColanduoni Thanks. I just searched for  @alexcrichton Yes I think so. | 
| @bors: r+ | 
| 📌 Commit aa6bd11 has been approved by  | 
| Any way to get line numbers in OSX is better than no way. There's nothing to stop switching to a better implementation later. Line numbers are pretty backward compatible. (If we had to have this behind runtime flags / compile time flags that would do - we just need those line numbers). Really looking forward to this in nightly! | 
Add libbacktrace support for Apple platforms (resubmitted) Resubmitting #43422 rebased on the current master (cc @JohnColanduoni). I have added an additional commit to fallback to `dladdr`-based `resolve_symbol` if `libbacktrace` returns `None`, otherwise the stack trace will be full of `<unknown>` when you forget to pass the `-g` flag (actually it seems — at least on macOS — the `dladdr` symbol is more accurate than the `libbacktrace` one).
| ☀️ Test successful - status-appveyor, status-travis | 
| CC #24346 | 
| Could this be the cause of #45731, where on MacOS "cargo test with RUST_BACKTRACE causes SIGBUS"? | 
| FYI I've made  | 
| @whitequark Awesome! I'll try to take a crack at making this work this weekend. I'll sleep easier knowing if I introduced any more bugs like #45731 nobody is likely to find them 😉 | 
Resubmitting #43422 rebased on the current master (cc @JohnColanduoni).
I have added an additional commit to fallback to
dladdr-basedresolve_symboliflibbacktracereturnsNone, otherwise the stack trace will be full of<unknown>when you forget to pass the-gflag (actually it seems — at least on macOS — thedladdrsymbol is more accurate than thelibbacktraceone).