Skip to content

Conversation

@DinoKengi
Copy link

@DinoKengi DinoKengi commented Sep 23, 2025

This section should be able to ABI-Parse binary files that are fed.
Note: Is currently being fed a file, will need to connect ELF parser in the future.
Another note: rebasing last time made an oopsies so anything after the 'fixed abi parsing' commits shouldn't have happened.

DinoKengi and others added 19 commits April 17, 2025 10:37
…he elf header

Completed Section and Program header parsing

implement function to store lsda into a binary file and another to close the elf file

fixed formatting

fixed syntax errors
…, move loading of the headers to the constructor, reformat to fit libhal formatting standard

move headers and section data to be stored in unordered_maps and vectors

Added Doxygen documentation and reorganized folders

fix formatting to match libhal styling guide
…he elf header

Completed Section and Program header parsing

implement function to store lsda into a binary file and another to close the elf file

fixed formatting

fixed syntax errors
…, move loading of the headers to the constructor, reformat to fit libhal formatting standard

move headers and section data to be stored in unordered_maps and vectors

Added Doxygen documentation and reorganized folders

fix formatting to match libhal styling guide
* Created POC GCC IPA callgraph parser
Script is designed for testing and workshopping. Script is written in python and documented on how it works

* 👷 Added debugging settings and defaulted to Debug builds

* ❇️ Started to port the parsing code for gcc
No API yet
@PhazonicRidley
Copy link
Collaborator

Resolve your conflicts and rebase with CI

@PhazonicRidley PhazonicRidley self-requested a review October 26, 2025 01:02
Copy link
Member

@kammce kammce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a good idea to pick a style for naming across the code base. But you can leave that for later.

Mchan2003 and others added 7 commits November 3, 2025 10:54
…he elf header

Completed Section and Program header parsing

implement function to store lsda into a binary file and another to close the elf file

fixed formatting

fixed syntax errors
…, move loading of the headers to the constructor, reformat to fit libhal formatting standard

move headers and section data to be stored in unordered_maps and vectors

Added Doxygen documentation and reorganized folders

fix formatting to match libhal styling guide
* Created POC GCC IPA callgraph parser
Script is designed for testing and workshopping. Script is written in python and documented on how it works

* 👷 Added debugging settings and defaulted to Debug builds

* ❇️ Started to port the parsing code for gcc
No API yet
PhazonicRidley and others added 12 commits November 3, 2025 10:55
* fixed missing directory

* 💚 Created basic CI
Only works on linux currently
TODO: Add windows and mac pipelines
TODO: Add lint checks
TODO: Add deployment

---------

Co-authored-by: Mchan2003 <[email protected]>
…he elf header

Completed Section and Program header parsing

implement function to store lsda into a binary file and another to close the elf file

fixed formatting

fixed syntax errors
…, move loading of the headers to the constructor, reformat to fit libhal formatting standard

move headers and section data to be stored in unordered_maps and vectors

Added Doxygen documentation and reorganized folders

fix formatting to match libhal styling guide
* Created POC GCC IPA callgraph parser
Script is designed for testing and workshopping. Script is written in python and documented on how it works

* 👷 Added debugging settings and defaulted to Debug builds

* ❇️ Started to port the parsing code for gcc
No API yet
Copy link
Collaborator

@PhazonicRidley PhazonicRidley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. Please make a unit test with your current logic in your main.cpp.

std::ignore = argv;
std::println("Hello C++: {}", __cplusplus);
// temporarily reading LSDA file before merging with main
const char* path = (argc >= 2 ? argv[1] : "LSDA/lsda");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test, should be done in a unit test program. We want to use Michael's main

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is for a test, move to testing folders and out of root, same goes for the other files in this folder

int64_t next; // next action in list
};

class Abi_parser
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Seems like we're going with PascalCase for our classes

private:
void check(size_t n) const; // checks that n bytes remains in LSDA buffer before read

const std::vector<uint8_t>& data; // raw LSDA byte data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason the ABI parser class doesn't own the raw data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll chime in and say that you should not store references within the fields of a C++ class. This class should own the vector that owns the data. This is asking for pointer invalidation.

while (i < buf.size()) {
uint8_t byte = buf[i++];
result |= static_cast<uint64_t>(byte & 0x7F) << shift;
if ((byte & 0x80) == 0) break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Personally, I prefer {} around any if/while/for clauses. nbd if you wanna keep it this way

}

if (indirect) {
throw std::runtime_error("indirect doesn't support raw LSDA");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix formatting, just run clang format via vs code

start_enc = read8();
if (start_enc != 0xFF) {
const uint64_t pcrel_base = static_cast<uint64_t>(index);
(void) r_encode(start_enc, pcrel_base); // reads and ignores
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::ignore instead of (void)

Action a{};
a.type = read_sleb128(data, index);
if (index > limit_end)
throw std::runtime_error("action parsing went over limit");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{} around single line if blocks, very easy to have an intent error and have the if block attach to things you dont want

Copy link
Member

@kammce kammce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this parser provides to the code base. Does it just parse the entire exception data table? If so, how is this associated with the functions it belongs to? Also, the expanded data of the call site and action table isn't what we need, its the association between scopes and caught types which does not exist in this commit. But thats fine if that'll be done in a future PR.

private:
void check(size_t n) const; // checks that n bytes remains in LSDA buffer before read

const std::vector<uint8_t>& data; // raw LSDA byte data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll chime in and say that you should not store references within the fields of a C++ class. This class should own the vector that owns the data. This is asking for pointer invalidation.

Comment on lines +37 to +38
for (int i = 0; i < 2; ++i)
v |= static_cast<uint16_t>(data[index++]) << (i * 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I highly recommend using curly braces for all blocks.

int64_t next; // next action in list
};

class Abi_parser
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name of the this class is very vague. Rename this to something like gcc_lsda_parser.

uint64_t start; // start of protected range
uint64_t length; // length of protected range
uint64_t landing_pad; // landing pad address
int64_t action; // throw to action table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int64_t action; // throw to action table
int64_t action; // offset into action table

Comment on lines +26 to +30
struct Action
{
int64_t type; // index type
int64_t next; // next action in list
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is whats within the action table, it makes more sense to associate the call site with the ordered list of types caught by that scope. Then you can use that to compare thrown type against caught type.

{
public:
explicit Abi_parser(const std::vector<uint8_t>& lsda_data);
void parse();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not parse at construction? Why do we need another API for starting the parse?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants