-
Notifications
You must be signed in to change notification settings - Fork 492
Spend less time preparing to write log lines when the logger is disabled #352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spend less time preparing to write log lines when the logger is disabled #352
Conversation
In callers like HashiCorp Terraform, the plugin logger is completely disabled (hclog.Off) by default, but previously go-plugin would nonetheless waste time trying to parse log lines to decide what log level to use when calling into the logger, creating memory allocation garbage and spending time in the JSON parser only to immediately discard the results. The hclog.Logger API exposes information about the log level so that callers can skip doing expensive preparation work unless it will actually be used. We'll now use that to totally skip all of the log preparation when the logger is completely disabled, and to skip _some_ of the work if the log line's level is not sufficient to match the logger's level.
|
Hi @jbardin! I'm sorry for the very rude nudge, but I think you've typically been interested in the way go-plugin logging interacts with Terraform (based on the changes made to this function in the past) and so maybe you'd find this change interesting. |
|
Thanks for the heads up @apparentlymart. This all makes sense to me, so I'll give it a more thorough review shortly. I also noticed that someone committed #292, which breaks Terraform's plugin panic handling, so I'll also have to fix that before a new release is tagged. |
|
If you're going to be potentially making other changes in this area anyway... While I was working on this I noticed that a plugin can make its client crash by writing a JSON log line where Lines 47 to 65 in a938e9a
Since all commonly-used plugins are presumably currently generating their JSON logs also using (I didn't do anything about that in this PR just because I wanted to focus on one concern at a time.) |
03310e0 to
6d5fc6f
Compare
|
When I looked at this again today I noticed I neglected to push a final set of changes to correct a test failure I'd caused; the latest version of this which I just forced-pushed is what I had intended to submit. Sorry for the noise. |
Previously the JSON log line parsing code would allocate separately for each additional key/value pair in a log line. Using a flat slice of logEntryKV instead of indirection through pointers amortizes that memory allocation cost a little better, for a marginal decrease in allocations per log line. The resulting logEntryKV objects are always passed and used all together anyway, so there is no clear advantage to splitting each into a separate memory allocation.
6d5fc6f to
a0c0381
Compare
jbardin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT!
|
@hashicorp/team-ip-compliance, waiting on review please |
Description
In callers like HashiCorp Terraform, the plugin logger is completely disabled (
hclog.Off) by default, but previously go-plugin would nonetheless waste time trying to parse log lines to decide what log level to use when calling into the logger, creating memory allocation garbage and spending time in the JSON parser only to immediately discard the results.The
hclog.LoggerAPI exposes information about the log level so that callers can skip doing expensive preparation work unless it will actually be used. This now uses that to totally skip all of the log preparation when the logger is completely disabled, and to skip some of the work if the log line's level is not sufficient to match the logger's level.Modern Terraform plugins tend to generate quite a lot of log messages and it seems that at least some of them generate it by default even when none of the logging environment variables are set. I'm not sure if that's intentional, but nonetheless this PR reduces the overhead of processing and discarding those unsolicited log lines enough to make a measurable difference for some providers.
The second commit removes a level of indirection in the internal modelling of additional key/value pairs in the JSON logs, so that instead of performing one allocation per JSON property these allocations are amortized by gradually growing the backing array of the slice. This is a far less significant improvement than the first commit, but it's also a pretty easy and non-invasive change that saves at least a few memory allocations per log line.
The benchmark added in the first commit demonstrates both the most happy path of having the logger disabled entirely, and the slightly-less-happy path of having the logger enabled but filtering out certain levels so it does still need to perform JSON parsing to check whether the log line needs to be logged.
I believe that the "off" case above represents what Terraform's behavior would be in the default case where the
TF_LOG/etc environment variables are not set. The "error" case representsTF_LOG=error, and "trace" representsTF_LOG=trace. As far as I know, Terraform currently behaves like this "trace" case regardless of what the environment variables are set to.How Has This Been Tested?
Aside from the new benchmark mentioned above, I also ran the existing test suite and it passed without any modifications.