Skip to content

DiagnosticsClient.GetPublishedProcesses returns dead processes PIDs on Linux and macOS #3699

@valco1994

Description

@valco1994

Description

Currently, DiagnosticsClient .GetPublishedProcesses produces PIDs relying on the presence of donet-diagnostic-* sockets/pipes, see the code below:

public static IEnumerable<int> GetPublishedProcesses()
{
static IEnumerable<int> GetAllPublishedProcesses(string[] files)
{
foreach (var port in files)
{
var fileName = new FileInfo(port).Name;
var match = Regex.Match(fileName, PidIpcEndpoint.DiagnosticsPortPattern);
if (!match.Success) continue;
var group = match.Groups[1].Value;
if (!int.TryParse(group, NumberStyles.Integer, CultureInfo.InvariantCulture, out var processId))
continue;
yield return processId;
}
}
try
{
string[] files = Directory.GetFiles(PidIpcEndpoint.IpcRootPath);
return GetAllPublishedProcesses(files).Distinct();
}
catch ( UnauthorizedAccessException ex)
{
if (PidIpcEndpoint.IpcRootPath.StartsWith(@"\\.\pipe"))
{
throw new DiagnosticsClientException($"Enumerating {PidIpcEndpoint.IpcRootPath} is not authorized", ex);
}
else
{
throw;
}
}
}

But there is a problem: on *nix these sockets are not deleted when .NET process was killed and "abandoned" files for UNIX domain sockets are left on the disk. So, on *nix it's necessary to additionally filter this list of PIDs checking if the process is actually alive. It would be nice to also remove abandoned sockets, but it's not necessary.

I can propose following repro for the issue:

Application 1 - SleepApp:

Thread.Sleep(TimeSpan.FromMinutes(2));

Application 2 - GetPublishedProcessesIssueRepro:

using System.Diagnostics;
using Microsoft.Diagnostics.NETCore.Client;

var appToKillPath = [INSERT PATH TO SleepApp EXECUTABLE HERE]; 
var appsToKillCount = 10; // any other value can be set here

var processes = new List<Process>(appsToKillCount);
var pids = new List<int>(appsToKillCount);
for (int i = 0; i < appsToKillCount; ++i)
{
    var process = new Process();
    process.StartInfo.FileName = appToKillPath;
    process.Start();
    processes.Add(process);
    pids.Add(process.Id);
}

// to be sure that runtime is started for all of the processes
Thread.Sleep(TimeSpan.FromSeconds(2));

foreach(var process in processes)
    process.Kill();

    
var publishedProcessesPids = DiagnosticsClient.GetPublishedProcesses();

Console.WriteLine($"Count of processes launched and killed: {pids.Count}");
Console.WriteLine($"Count of pids returned by DiagnosticsClient.GetPublishedProcesses: {publishedProcessesPids.Count()}");
Console.WriteLine($"Intersection count: {pids.Intersect(publishedProcessesPids).Count()}");

Expected behaviour - GetPublishedProcessesIssueRepro prints "Intersection count: 0".
Actual behaviour - GetPublishedProcessesIssueRepro prints "Intersection count: ${appsToKillCount}".

For example, on my mac M1 I get following result from running GetPublishedProcessesIssueRepro, if I remove all dotnet-diagnostic-* sockets from ${TMPDIR} before the start:

Count of processes launched and killed: 10
Count of pids returned by DiagnosticsClient.GetPublishedProcesses: 13
Intersection count: 10

Configuration

It's relevant for any platform that uses UNIX domain sockets for Diagnostics Port. I suppose, that Linux and macOS are the most important. Also, I don't check what happens on the platforms that use TCP sockets for this purpose.

Other information

Of course, it's possible to filter DiagnosticsClient.GetPublishedProcesses() result on the side of the application using the library. But current behaviour breaks the documented guarantee to "Get all the active processes that can be attached to".

How to fix it?

It's necessary to check if a process with found pid is actually alive. It can be done using kill(pid, 0) call. See documentation for details:

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions