Add get AppContext properties diagnostic IPC command #86610

jander-msft · 2023-05-22T21:20:00Z

These changes allow diagnostic tools to enumerate the AppContext properties of a running application over the diagnostic communication channel for CoreCLR (I'm not sure if Mono has the AppContext notion and have opted out of implementing it there by returning DS_IPC_E_NOTSUPPORTED; similarly for NativeAOT).

The returned payload is exactly the same as the DS_PROCESS_COMMANDID_GET_PROCESS_ENV command. Each property name and value is concatenated together with an equal sign character (=) and delimited with a null-terminator; this represents a single entry in the AppContext.

Corresponding diagnostic repo change is dotnet/diagnostics#3901

Partially addresses #83756

cc @dotnet/dotnet-monitor

ghost · 2023-05-22T21:20:19Z

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

These changes allow diagnostic tools to enumerate the AppContext properties of a running application over the diagnostic communication channel for CoreCLR (I'm not sure if Mono has the AppContext notion and have opted out of implementing it there by returning DS_IPC_E_NOTSUPPORTED; similarly for NativeAOT).

The returned payload is exactly the same as the DS_PROCESS_COMMANDID_GET_PROCESS_ENV command. Each property name and value is concatenated together with an equal sign character (=) and delimited with a null-terminator; this represents a single entry in the AppContext.

Corresponding diagnostic repo change is dotnet/diagnostics#3901

Partially addresses #83756

cc @dotnet/dotnet-monitor

Author:	jander-msft
Assignees:	-
Labels:	`area-Diagnostics-coreclr`, `community-contribution`
Milestone:	-

src/coreclr/vm/eventing/eventpipe/ds-rt-coreclr.h

jander-msft · 2023-05-22T21:55:06Z

src/coreclr/vm/eventing/eventpipe/ds-rt-coreclr.h

+	STATIC_CONTRACT_NOTHROW;
+	EP_ASSERT (props_array != NULL);
+
+	return Configuration::EnumerateKnobs([](const LPCWSTR& name, const LPCWSTR& value, void* context) {


I've tapped into the Configuration class for this enumeration since it is the owner of the app config properties starting here and the diagnostic server/command handling is started later via here.

The other statically available places to access this information were possibly from the host or coreclr exports and contracts, but I felt that it was not appropriate to add the enumeration API there for external consumption; also, based on discussion in #83756, there may be some desire to filter the properties before the AppContext gets to report them, which feels like it would naturally fit in at configuration initialization. Finally, I will have a follow up PR that allows modification of these values, which the Configuration class can be more easily made to do than say the host and coreclr properties since those are already "sealed" at this point in time.

wiktork · 2023-05-23T00:24:56Z

src/coreclr/vm/eventing/eventpipe/ds-rt-coreclr.h

+	STATIC_CONTRACT_NOTHROW;
+	EP_ASSERT (props_array != NULL);
+
+	return Configuration::EnumerateKnobs([](const LPCWSTR& name, const LPCWSTR& value, void* context) {


Are there any properties that we might want to omit from here?

Here's the list for a very basic .NET 8 console app:

FX_DEPS_FILE

TRUSTED_PLATFORM_ASSEMBLIES

NATIVE_DLL_SEARCH_DIRECTORIES

PLATFORM_RESOURCE_ROOTS

APP_CONTEXT_BASE_DIRECTORY

APP_CONTEXT_DEPS_FILES

PROBING_DIRECTORIES

RUNTIME_IDENTIFIER

System.Reflection.Metadata.MetadataUpdater.IsSupported

HOST_RUNTIME_CONTRACT

STARTUP_HOOKS

Other identifiers are added to this list when the app contains them in the runtimeOptions:configProperties section in the *.runtimeconfig.json file.

Probably HOST_RUNTIME_CONTRACT can be skipped because its value is a in-memory address to the host runtime contract. Everything else seems to be directory and file paths of various types.

My default position would be to only filter out HOST_RUNTIME_CONTRACT and leave everything else in in the list for reading values. In a future change that may allow modification of values, I can see some more of these being effectively made read-only, but I would have to defer to others on what's desirable to allowed to be read.

@vitek-karas and @elinor-fung, do you have suggestions on what to filter (if any and where to do it) for this initial PR that enables reading the app config values?

The total size of all these switches can be in 10's kB. I assume that this event is going to be typically enabled by default. Do you really want to send 10's kB over the diagnostic channel during startup by default?

I would ask a different question. What do you expect that this data is going to be used for?

The total size of all these switches can be in 10's kB. I assume that this event is going to be typically enabled by default. Do you really want to send 10's kB over the diagnostic channel during startup by default?

There's already an existing diagnostic command for collecting the entire environment block of the target process. I would hazard a guess that it's has similar size characteristics. We haven't seen any significant slowdown during startup nor have we received any customer feedback about it.

The only one that I see right now that is significantly large is TRUSTED_PLATFORM_ASSEMBLIES, which we could probably do without. Maybe we can categorize the data and have those categories as part of the request payload on the diagnostic command in order to limit the response to the interested data.

I would ask a different question. What do you expect that this data is going to be used for?

Originally, we only wanted to get startup hook information, and append our own startup hook to the list which the process is held up during the diagnostics suspension (which occurs just before managed code execution is started); this request is #83756. It was suggested that this be generalized to get / set app context properties. Expanding it to this general concept also allows us to:

read RUNTIME_IDENTIFIER to determine what type of ICorProfilerCallback implementation we should load in the target process, especially differentiating on glibc vs musl-libc implementations.

The following are hypothetical but likely candidates for use:

read System.StartupHookProvider.IsSupported and determine if we should bother with the startup hook at all. This would also allow us to automatically tell users if features that depend on the startup hook will work. Other app context switches may be useful; not sure how to determine the full set of "in box" switches though.

read BUNDLE_PROBE \ HOSTPOLICY_EMBEDDED \ PINVOKE_OVERRIDE to check if the process is single file; some diagnostic scenarios do not work in single file applications. It would be good to know whether we bother trying to set up for those scenarios or be able to give the user a better indication why the diagnostic scenario didn't work (e.g. "sorry, this feature is not compatible with single file applications").

read APP_CONTEXT_BASE_DIRECTORY and tell our profiler what that is; this may help in discovering application symbol information that was published with the application so that we can report that back to the .NET Monitor process in order to get SourceLink information. We can use that information in combination with diagnostic artifacts that are collected to help guide users or automated tooling back to real source code.

(this is a stretch) read / set APP_PATHS / PROBING_DIRECTORIES to get our managed assemblies loadable without having to rely on managed assembly resolving and LoadFrom context.

Those are the obvious ones that I can think of right now; I can probably dig out more scenarios where some of this data may be useful.

For my own reference, some (all?) feature switches seem to be documented here and surfaced in the SDK here.

Re single-file crash dump: Wouldn't it be cleaner if the runtime responded via the IPC channel if the dump could be collected or not? After all, the runtime knows the answer... using hints like single-file is just guessing. I understand that guessing is all we have right now, but if we're introducing new functionality why not design it for the original purpose?

I agree. I'm dropping the "other" bonus usage and won't mention it anymore unless others keep poking at it. I just want startup hooks #83756 and runtime identifiers #74476

Startup hooks only execute at startup... they are never evaluated again after the first processing of the startup hooks at the very beginning of managed execution

We can extend this contract via diagnostic command. The startup hook instantiation is load assembly + invocation of the well-known entrypoint. Load assembly has built-in synchronization that ensures that given assembly is loaded exactly once. And once the entrypoint is executing, it can do its own synchronization if there is a chance that it can be invoked multiple times by the diagnostic command.

Once you have managed hook running in the process, you can do any inspection you want, with any precision you like. For example, you do not have to depend on guessing to figure out what kind of libc is loaded based on the RID. You can inspect the process from inside to see what kind of libc is in use. You do not need to depend on runtime adding built-in tracing for every piece of information that you need.

For example, you do not have to depend on guessing to figure out what kind of libc is loaded based on the RID.

Determining the libc type during diagnostic suspension is not for startup hook usage; it's for determining which variant of our profiler to use.

For loading the profiler, either AddStartupProfiler and AttachProfiler diagnostic commands are invoked, both of which take a full file path to the profiler implementation. The former command is only usable during the suspension and does not evaluate whether the library is loadable or not because the actual load comes later after the runtime has been resumed. There is no feedback on whether the library is loaded or not, so we can't even just try one and fallback to the other; we need to know the RID before the runtime is resumed. For the latter, we could just try one and then the other if the first fails, but that feels like a waste if we already know what the RID is due to needing it for AddStartupProfiler.

You do not need to depend on runtime adding built-in tracing for every piece of information that you need.

I have never asked for that. There is no tracing involved.

To make decisions on what needs to be loaded or enable at diagnostic suspension, we need diagnostic commands to get specific information (such as RID) because some features (such as profilers) require native-specific information and managed code is not executing at that time thus cannot help in a platform agnostic way.

we need diagnostic commands to get specific information (such as RID) because some features (such as profilers)

RID is not appropriate for this need. For example, what are you going to do when you see RID void-x64?

If you need libc type to load the right profiler flavor, we should introduce diagnostic command that returns the libc type.

Alternatively, you can find out the libc that the process has loaded using /proc file system. Is that an option?

If you need libc type to load the right profiler flavor, we should introduce diagnostic command that returns the libc type.

I logged #74476 specifically for this a 9 months ago. I'll try to get to this if the diagnostics team cannot.

Alternatively, you can find out the libc that the process has loaded using /proc file system. Is that an option?

No, .NET Monitor is typically not running in the same process namespace in its primary scenario (multi-container environments such as Kubernetes, where .NET Monitor is typically running as a sidecar container and maybe be running as a different user). We can however use that as a fallback mechanism of almost last resort if it happens to be running in the same namespace, which we may do for runtime versions lower than .NET 8.

wiktork · 2023-05-23T00:33:38Z

@jander-msft @davmason Would this subsume Proposed spec for GetProcessInfo3 IPC command by davmason · Pull Request #3476 · dotnet/diagnostics (github.com)?

jander-msft · 2023-05-23T01:47:22Z

@jander-msft @davmason Would this subsume Proposed spec for GetProcessInfo3 IPC command by davmason · Pull Request #3476 · dotnet/diagnostics (github.com)?

I would agree to that only if we are guaranteed that RUNTIME_IDENTIFIER will always be available as one of the app config properties.

lateralusX · 2023-05-23T06:58:20Z

I'm not sure if Mono has the AppContext notion

Mono supports AppContext, it gets registered here during startup, https://github.com/dotnet/runtime/blob/main/src/mono/mono/metadata/appdomain.c#L836, and kept in two internal arrays with keys and values, but after the properties have been installed into the managed app context, https://github.com/dotnet/runtime/blob/main/src/mono/mono/metadata/appdomain.c#L860, they seemed to be freed, so if we would need to keep them around in order to handle the IPC command without extracting the data from managed side, then we would need to alter the logic to not free the keys/values after they have been installed into the AppContext. The size of the AppContext data could be rather big, so its not optimal to keep double copies in memory just in case it is needed, so maybe a better alternative would be to extract the data from managed side if/when needed by this IPC command.

MichalStrehovsky · 2023-05-24T00:50:23Z

How does this relate to:

runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/RuntimeEventSource.cs

Line 129 in ce689d9

AppContext.LogSwitchValues(this);

Do we need both?

jander-msft · 2023-05-24T01:06:20Z

How does this relate to:

runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/RuntimeEventSource.cs

Line 129 in ce689d9

AppContext.LogSwitchValues(this);

Do we need both?

That's not helpful in the intended scenario. We need to be able to get/set the startup hook (this PR is only helping with the "get" portion) at the diagnostic startup suspension point, which occurs before managed execution; event sources haven't executed yet at this point.

MichalStrehovsky · 2023-05-24T01:16:02Z

How does this relate to:

runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Tracing/RuntimeEventSource.cs

Line 129 in ce689d9

AppContext.LogSwitchValues(this);

Do we need both?

That's not helpful in the intended scenario. We need to be able to get/set the startup hook (this PR is only helping with the "get" portion) at the diagnostic startup suspension point, which occurs before managed execution; event sources haven't executed yet at this point.

To be clear the question wasn't whether we can use the existing data. The question was whether we evaluate existing instrumentation for redundancy when adding new ones.

Add get AppContext properties diagnostic IPC command

7e6bab4

jander-msft requested review from MichalStrehovsky, lambdageek and lateralusX as code owners May 22, 2023 21:20

ghost added area-Diagnostics-coreclr community-contribution Indicates that the PR has been added by a community member labels May 22, 2023

jander-msft mentioned this pull request May 22, 2023

Add GetAppContextProperties diagnostic IPC command dotnet/diagnostics#3901

Closed

jander-msft commented May 22, 2023

View reviewed changes

src/coreclr/vm/eventing/eventpipe/ds-rt-coreclr.h Outdated Show resolved Hide resolved

jander-msft commented May 22, 2023

View reviewed changes

Use memcpy instead of swprintf_s

38314d0

wiktork reviewed May 23, 2023

View reviewed changes

build-analysis bot mentioned this pull request May 23, 2023

Could not load file or assembly 'Microsoft.CodeAnalysis.NetAnalyzers #84995

Closed

Changer nullptr to NULL

c644c6b

This was referenced May 23, 2023

IOException running NuGet-Migrations during tests in dotnet CLI first run #80619

Closed

Assert failure in GC/API/NoGCRegion/Callback_Svr test #86612

Closed

runfoapp bot mentioned this pull request May 23, 2023

Infra improvements for Helix #68176

Closed

jander-msft mentioned this pull request May 24, 2023

[Feature Request] Diagnostic IPC command to dynamically add startup hook at diagnostic startup suspension #83756

Closed

jander-msft closed this May 25, 2023

jander-msft mentioned this pull request Jun 15, 2023

Add ProcessInfo3 diagnostic IPC command to include libc type or equivalent #74476

Closed

ghost locked as resolved and limited conversation to collaborators Jun 24, 2023

Add get AppContext properties diagnostic IPC command #86610

Add get AppContext properties diagnostic IPC command #86610

Uh oh!

Conversation

jander-msft commented May 22, 2023

Uh oh!

ghost commented May 22, 2023

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jkotas May 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wiktork commented May 23, 2023

Uh oh!

jander-msft commented May 23, 2023

Uh oh!

lateralusX commented May 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MichalStrehovsky commented May 24, 2023

Uh oh!

jander-msft commented May 24, 2023

Uh oh!

MichalStrehovsky commented May 24, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

jkotas May 23, 2023 •

edited

Loading

lateralusX commented May 23, 2023 •

edited

Loading