-
Notifications
You must be signed in to change notification settings - Fork 912
fix(sdk-trace-base): nanosecond precision of span start time #5804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix(sdk-trace-base): nanosecond precision of span start time #5804
Conversation
|
9c84977
to
e1a43e6
Compare
e1a43e6
to
0fe3c79
Compare
if (typeof input === 'number') { | ||
if (input <= otperformance.now()) { | ||
// must be the result of `performance.now`, which is the time since program start | ||
// translate it to an actual timestamp with hight resolution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// translate it to an actual timestamp with hight resolution | |
// translate it to an actual timestamp with high resolution |
Reviewing this now but I have to warn you there be demons in this part of the code. Time handling on different platforms is not at all consistent or obvious. Most notably, the performance timer cannot be counted on to run continuously. It is monotonic, so it never goes backwards, but it is paused in all sorts of non-obvious situations. |
Yes, I've found out that this is a very tricky subject... But the current implementation makes things a bit weird for any operation with sub millisecond duration. Because it breaks actual order of spans (all spans starting during the same millisecond are seen as starting at the start of this millisecond, which will lead to confusion that the shortest span ended first, which is not always true). |
I've for example seen that on CloudFlare Workers, |
Which problem is this PR solving?
If no
startTime
is provided at span creation, the SpanstartTime
will have a millisecond precision. OTEL spec allows up to nanosecond precision, which is the default for all major languages implementations.Fixes # (issue)
Short description of the changes
The problem reside in the Clock Drifting correction logic.
To avoid clock drifting, we try to avoid the use of HR times. While it provides an nanosecond precision level, it can drift from actual NTP time by multiple seconds.
The problem is that by default, if nothing is provided by the user, the time used is
Date.now()
instead ofperformance.now()
, even if there is a whole logic to avoid drifting.I have refactored the logic to clearly separate the handling of user input and the drift correction. This way, it's clearer what the code is doing (I had a very hard time understanding what was going one, I had to go read the PR comments).
Now, the start time is in nanosecond precision by default.
Type of change
How Has This Been Tested?
I've tested this with my otel integration from Hive Gateway (graphql-hive/gateway#1343).
Checklist:
[ ] Documentation has been updated