Optimize multicore critical section impl #797
Looks like there are either a few relocations too many in the direct boot LDs, or a few too few. Hard to do without actually knowing anything about linker scripts, but the massaging done in aab8e77 doesn't change the output of |
MabezDev
left a comment
LGTM, thanks!
* Optimize multicore critical section impl
* Assert reserved bits, explain bit choices, remove redundant checks
* Don't assume the bit reads as 0
* Simplify code generated for thread_id()
* Use non-0 value for unlocked
* Optimise release
* Assume reserved bits read as 0
* Add changelog entry
* Clean up warning
* Fix direct boot ld
The impl is optimized for the first acquire.
The multi-core lock is now initialized to a value chosen so that we only have to extract a single bit from the PRID register (Xtensa) to create a thread id value. The previous implementation either branched on the bit or incremented the value read after masking, neither of which is necessary.
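To illustrate the idea, here is a minimal sketch, not the code from this PR. It assumes that bit 13 is the only PRID bit that differs between the two cores, and it uses `0xCDCD` and `0xABAB` as illustrative register values. The names `CORE_BIT`, `UNLOCKED` and `thread_id_from_prid` are hypothetical. The register value is passed in as a plain argument so the sketch runs on any host.

```rust
/// Assumption: the single PRID bit that differs between the two cores.
pub const CORE_BIT: u32 = 1 << 13;

/// Hypothetical "nobody holds the lock" value. It is non-zero and lies
/// outside CORE_BIT, so it can never collide with a thread id.
pub const UNLOCKED: u32 = 0x1;

/// Derive a thread id with one mask: no branch on the bit and no
/// increment of the masked value. Callers on hardware would pass the
/// value read from the PRID register.
pub fn thread_id_from_prid(prid: u32) -> u32 {
    prid & CORE_BIT
}
```

With this choice the thread id is either `0` or `CORE_BIT`, and the lock word can be compared against it directly because the unlocked value is neither of the two.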
We no longer count reentries (as lock_api did) because the count doesn't matter: the contract of critical_section requires that every acquire be followed by a matching release, with nested acquires and releases properly paired. This means we have to differentiate the first entry from reentries, but not any two subsequent reentries from each other.
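The pairing argument can be sketched as follows. This is a simplified host-side illustration under my own assumptions, not the implementation in this PR: `ReentrantLock`, `Token` and the unlocked value `1` are hypothetical, and the thread id is passed in by the caller. The acquire path tries the uncontended case first, in the spirit of optimizing for the first acquire.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// What an acquire returns; the caller hands it back to `release`.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Token {
    First,
    Reentry,
}

/// Hypothetical reentrant lock that stores only the owner, not a count.
pub struct ReentrantLock {
    owner: AtomicU32,
}

impl ReentrantLock {
    /// Assumed "unlocked" value; it must differ from every thread id.
    const UNLOCKED: u32 = 1;

    pub const fn new() -> Self {
        Self { owner: AtomicU32::new(Self::UNLOCKED) }
    }

    pub fn acquire(&self, thread_id: u32) -> Token {
        debug_assert_ne!(thread_id, Self::UNLOCKED);
        loop {
            // Fast path: try to take a free lock before anything else.
            match self.owner.compare_exchange(
                Self::UNLOCKED,
                thread_id,
                Ordering::Acquire,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Token::First,
                // We already own the lock: a reentry, no counting needed.
                Err(current) if current == thread_id => return Token::Reentry,
                // Another thread owns it: spin and retry.
                Err(_) => std::hint::spin_loop(),
            }
        }
    }

    pub fn release(&self, token: Token) {
        // Only the release paired with the first acquire unlocks.
        if token == Token::First {
            self.owner.store(Self::UNLOCKED, Ordering::Release);
        }
    }

    pub fn is_locked(&self) -> bool {
        self.owner.load(Ordering::Relaxed) != Self::UNLOCKED
    }
}
```

Because releases arrive in reverse order of the acquires, the `First` token is always the last one returned, so the lock needs no counter to know when to unlock.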
Assembly before (67 + 14 instructions)
Assembly after (25 + 9 instructions)
Assembly if I'm allowed to assume that reserved bits read as 0 (23 + 9 instructions)