Skip to content

Conversation

@cshung
Copy link
Contributor

@cshung cshung commented Apr 9, 2021

In 2019, we introduced the large page support in the GC. However, this feature is not regularly tested, and we found it broken (and fixed) once in a while due to careless changes.

It is time for us to introduce automated testing to avoid future regression. I have got some initial pointers from @Maoni0 and @safern. Here I am just bootstrapping the effort, as one can see, the change is kinda empty since I am not sure what to do there.

From the GC's perspective, the large page support only changes how the GC interacts with the operating system in terms of how memory pages are acquired. Therefore, it is not particularly meaningful to run many tests. A handful of tests that exercise the various hard limits config and do some simple allocation would do.

Here are the two things that need to happen:

  • To only run tests on windows amd64 machines with the SeLockMemoryPrivilege enabled (I knew some infrastructure work is already done to enable the privilege), and
  • To specify some COMPLUS variables so tests can pick it to do the specific large page testing when we need it.

@ghost
Copy link

ghost commented Apr 9, 2021

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@ghost
Copy link

ghost commented Apr 9, 2021

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

In 2019, we introduced the large page support in the GC. However, this feature is not regularly tested, and we found it broken (and fixed) once in a while due to careless changes.

It is time for us to introduce automated testing to avoid future regression. I have got some initial pointers from @Maoni0 and @safern. Here I am just bootstrapping the effort, as one can see, the change is kinda empty since I am not sure what to do there.

From the GC's perspective, the large page support only changes how the GC interacts with the operating system in terms of how memory pages are acquired. Therefore, it is not particularly meaningful to run many tests. A handful of tests that exercise the various hard limits config and do some simple allocation would do.

The key challenge to this work is that the tests will require special machines (e.g. on Windows, we will need the SeLockMemoryPrivilege). That is why I need to make these changes.

Author: cshung
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

@Maoni0
Copy link
Member

Maoni0 commented Apr 9, 2021

I don't think I understand this part -

The key challenge to this work is that the tests will require special machines (e.g. on Windows, we will need the SeLockMemoryPrivilege). That is why I need to make these changes.

this change was already there as I mentioned on Teams. the work that's left is to add the GCLargePages env var so tests can pick it and run with it.

@safern
Copy link
Member

safern commented Apr 9, 2021

cc: @BruceForstall

Copy link
Contributor

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should add this to the "gcstress-extra" job. You should create a new job. E.g., there already is gc-longrunning and gc-simulator (I don't know if they actually run).

You should change eng\pipelines\common\templates\runtimes\run-test-job.yml. The eng\pipelines\libraries-run-test-job.yml is used (by coreclr testing) for running coreclr stress tests over the libraries test assets.

Copy link
Member

@safern safern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to add a helixQueueGroup for this as well so that we only run this tests in the windows helix queue that has this support enabled.

https://github.com/dotnet/runtime/blob/7810035f31bfd4e2f5d83cc405cf9307d346d77c/eng/pipelines/coreclr/templates/helix-queues-setup.yml#L117

@cshung cshung force-pushed the public/large-pages-testing branch from 6713137 to 6bc2ab5 Compare April 12, 2021 21:44
@cshung cshung mentioned this pull request Apr 14, 2021
@cshung cshung closed this Apr 14, 2021
@cshung
Copy link
Contributor Author

cshung commented Apr 14, 2021

We will continue this work on #51255

@cshung cshung deleted the public/large-pages-testing branch April 14, 2021 18:20
@ghost ghost locked as resolved and limited conversation to collaborators May 14, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants