Skip to content

Conversation

jerboaa
Copy link
Contributor

@jerboaa jerboaa commented Mar 11, 2024

Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (podman, docker, crio), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line:

[0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present

This state is being exposed by the Java Metrics API class using the new (still JDK internal) isContainerized() method. Example:

java -XshowSettings:system --version
Operating System Metrics:
    Provider: cgroupv1
    System not containerized.
openjdk 23-internal 2024-09-17
OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk)
OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)

The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that OSContainer::is_containerized() == true in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed.

Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system.

Testing:

  • GHA (risc-v failure seems infra related)
  • Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests)
  • Some manual testing using cri-o

Thoughts?


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18201/head:pull/18201
$ git checkout pull/18201

Update a local copy of the PR:
$ git checkout pull/18201
$ git pull https://git.openjdk.org/jdk.git pull/18201/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18201

View PR using the GUI difftool:
$ git pr show -t 18201

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18201.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 11, 2024

👋 Welcome back sgehwolf! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Mar 11, 2024

@jerboaa The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 11, 2024
@mlbridge
Copy link

mlbridge bot commented Mar 11, 2024

Webrevs

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

@jerboaa This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container

Reviewed-by: stuefe, iklam

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 8 new commits pushed to the master branch:

  • d9bcf06: 8335217: Fix memory ordering in ClassLoaderData::ChunkedHandleList
  • bb18498: 8335349: jcmd VM.classloaders "fold" option should be optional
  • 8350b1d: 8335294: Fix simple -Wzero-as-null-pointer-constant warnings in gc code
  • 5d866bf: 8335252: Reduce size of j.u.Formatter.Conversion#isValid
  • 166f9d9: 8335221: Some C2 intrinsics incorrectly assume that type argument is compile-time constant
  • 3e23e9c: 8335344: test/jdk/sun/security/tools/keytool/NssTest.java fails to compile
  • 79a3554: 8335124: com/sun/management/ThreadMXBean/ThreadCpuTimeArray.java failed with CPU time out of expected range
  • 45c4eaa: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@jerboaa
Copy link
Contributor Author

jerboaa commented Mar 22, 2024

Anyone willing to review this?

@jerboaa
Copy link
Contributor Author

jerboaa commented Apr 11, 2024

Gentle ping.

@jankratochvil
Copy link
Contributor

IMHO is_containerized() is OK to return false even when running in a container but with no limitations set.

Container detection is IIUC/AFAIK being used to maximize resource usage by OpenJDK. But if OpenJDK runs in a container with the same limits as the hardware box OpenJDK should still use reduced resources as it is sharing them with other processes on the hardware box.

is-containerized.patch.txt

@jerboaa
Copy link
Contributor Author

jerboaa commented Apr 16, 2024

IMHO is_containerized() is OK to return false even when running in a container but with no limitations set.

The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when -XX:-UseContainerSupport should be used.

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not enough of a container expert to judge if the basic approach is right - I trust you on this. This is just a technical code review.

@jankratochvil
Copy link
Contributor

The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints.

The in-container tuning means to use all the available resources. Containers in the real world have some memory limits set which is where my modified patch still correctly identifies it as a container to use all the available resources of the node which is the whole goal of the container detection code.

In order to do this, we need a reliable way to distinguish that vs. non-containerized setup.

I expect it should have been written "We need a reliable way to distinguish real world in-container vs. non-containerized setup. We do not mind behavior for artificial containers on OpenJDK development machines.". Which is what my patch does in an easier and less error-prone way.

If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when -XX:-UseContainerSupport should be used.

That behaves still the same with my patch.

Could you give a countercase where my patch behaves wrongly?

@jerboaa
Copy link
Contributor Author

jerboaa commented Apr 18, 2024

@jankratochvil I believe this boils down to what we actually want. Should OSContainer::is_containerized() return false when run inside a container? If so, when is it OK to do that? Should OSContainer::is_containerized() return true on a physical Linux deployment? IMO, the read-only property of the mount points was something that fit naturally since, we already scan those anyway for (cgv1 vs cgv2 detection). Therefore it was something to consider to make heuristics more accurate.

The truth table of the patch in this PR looks like this:

OSContainer::is_containerized() value Actual deployment scenario
true OpenJDK runs in an unprivileged container without a cpu/memory limit
true OpenJDK runs in an unprivileged container with a cpu/memory limit
true OpenJDK runs in a privileged container with a cpu/memory limit
false OpenJDK runs in a privileged container without a cpu/memory limit
false OpenJDK runs in a systemd slice without a cpu/memory limit
true OpenJDK runs in a systemd slice with a cpu/memory limit
false OpenJDK runs on a physical Linux system (VM or bare metal)

As you can see, the case of "OpenJDK runs in a privileged container without a cpu/memory limit" gives the wrong result. However, I consider this a fairly uncommon setup and there isn't really anything we can do to detect this kind of setup. Even if we did manage to detect it, why would we care? It's a question of probability.

Could you give a countercase where my patch behaves wrongly?

Again, it's not a case of right or wrong IMO. Since we are in the land of heuristics, they will be wrong in some cases. We should make them so that we cover the common cases and, perhaps, are able to use those in serviceability tools to help guide diagnosis and/or further tuning. So far the existing capabilities were OK, but prevent further out-of-the-box tuning and/or accurate data collection.

Your proposed patch (it's one I had in a previous iteration too, fwiw) would also return false for the case of "OpenJDK runs in an unprivileged container without a cpu/memory limit", which seems counterintuitive if OpenJDK actually runs in a container! What's more, it seems a fairly common case. Also, there is a chance of the OpenJDK heuristics to fail memory/cpu limit detection because of bugs and new developments. It seems the safer option to know that OpenJDK is containerized (using other heuristics) in that case. Maybe that's just me.

Let's have that discussion more broadly and see if we can reach consensus!

@jankratochvil
Copy link
Contributor

Could not we rename is_containerized() to use_container_limit() ? As that is the current only purpose of is_containerized().

I did not test it but I expect the values will be:

your patch my trivial patch Actual deployment scenario
true false OpenJDK runs in an unprivileged container without a cpu/memory limit
true true OpenJDK runs in an unprivileged container with a cpu/memory limit
false false OpenJDK runs in a privileged container without a cpu/memory limit
true true OpenJDK runs in a privileged container with a cpu/memory limit
false false OpenJDK runs in a systemd slice without a cpu/memory limit
true true OpenJDK runs in a systemd slice with a cpu/memory limit
false false OpenJDK runs on a physical Linux system (VM or bare metal)

@jerboaa
Copy link
Contributor Author

jerboaa commented Apr 18, 2024

Could not we rename is_containerized() to use_container_limit() ? As that is the current only purpose of is_containerized().

I'm not sure. There is value to have is_containerized() like it would behave after this patch. Specifically the first table row difference in your comment concerns me. JVMs running in a container without limit wouldn't be detected as "containerized". That seems a large share of deployments to miss.

@mlbridge
Copy link

mlbridge bot commented Apr 19, 2024

Mailing list message from Laurence Cable on serviceability-dev:

On 4/18/24 9:38 AM, Severin Gehwolf wrote:

On Thu, 18 Apr 2024 13:27:38 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:

Could not we rename `is_containerized()` to `use_container_limit()` ? As that is the current only purpose of `is_containerized()`.
I'm not sure. There is value to have `is_containerized()` like it would behave after this patch. Specifically the first table row difference in [your comment](https://github.com//pull/18201#issuecomment-2063868908) concerns me. JVMs running in a container without limit wouldn't be detected as "containerized". That seems a large share of deployments to miss.

agree 100%

@mlbridge
Copy link

mlbridge bot commented Apr 19, 2024

Mailing list message from Laurence Cable on serviceability-dev:

On 4/18/24 2:54 AM, Severin Gehwolf wrote:

On Wed, 17 Apr 2024 01:07:04 GMT, Jan Kratochvil <jkratochvil at openjdk.org> wrote:

IMHO `is_containerized()` is OK to return `false` even when running in a container but with no limitations set.
The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used.
The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints.
The in-container tuning means to use all the available resources. Containers in the real world have some memory limits set which is where my modified patch still correctly identifies it as a container to use all the available resources of the node which is the whole goal of the container detection code.

In order to do this, we need a reliable way to distinguish that vs. non-containerized setup.
I expect it should have been written "We need a reliable way to distinguish real world in-container vs. non-containerized setup. We do not mind behavior for artificial containers on OpenJDK development machines.". Which is what my patch does in an easier and less error-prone way.

If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used.
That behaves still the same with my patch.

Could you give a countercase where my patch behaves wrongly?
@jankratochvil I believe this boils down to what we actually want. Should `OSContainer::is_containerized()` return `false` when run *inside* a container? If so, when is it OK to do that? Should `OSContainer::is_containerized()` return `true` on a physical Linux deployment? IMO, the read-only property of the mount points was something that fit naturally since, we already scan those anyway for (cgv1 vs cgv2 detection). Therefore it was something to consider to make heuristics more accurate.

The truth table of the patch in this PR looks like this:
| `OSContainer::is_containerized()` value | Actual deployment scenario |
| ------------- | ------------- |
| `true` | OpenJDK runs in an unprivileged container **without** a cpu/memory limit |
| `true` | OpenJDK runs in an unprivileged container **with** a cpu/memory limit |
| `true` | OpenJDK runs in a privileged container **with** a cpu/memory limit |
| `false` | OpenJDK runs in a privileged container **without** a cpu/memory limit |
| `false` | OpenJDK runs in a systemd slice **without** a cpu/memory limit |
| `true` | OpenJDK runs in a systemd slice **with** a cpu/memory limit |
| `false` | OpenJDK runs on a physical Linux system (VM or bare metal) |

As you can see, the case of "OpenJDK runs in a privileged container *without* a cpu/memory limit" gives the wrong result. However, I consider this a fairly uncommon setup and there isn't really anything we can do to detect this kind of setup. Even if we did manage to detect it, why would we care? It's a question of probability.

Could you give a countercase where my patch behaves wrongly?
Again, it's not a case of right or wrong IMO. Since we are in the land of heuristics, they will be wrong in some cases. We should make them so that we cover the common cases and, perhaps, are able to use those in serviceability tools to help guide diagnosis and/or further tuning. So far the existing capabilities were OK, but prevent further out-of-the-box tuning and/or accurate data collection.

I think (I am agreeing with you Severin) that the goal of the heuristic
is to inform the JVM (and any associated serviceability tools) that the
JVM is in a resource constrained/managed execution context...

Rgds

- Larry

@jerboaa
Copy link
Contributor Author

jerboaa commented Apr 19, 2024

Thanks for your input Larry!

@jankratochvil
Copy link
Contributor

I think (I am agreeing with you Severin) that the goal of the heuristic is to inform the JVM (and any associated serviceability tools) that the JVM is in a resource constrained/managed execution context...

"resource constrained" (my patch) vs. "managed" (this patch) is the difference of the two patches being discussed.

Anyway in this patch one could unify naming across variables/parameters, the same value is called _is_ro, is_read_only, ro_opt, read_only, ro.

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 12, 2024

tools/javac/annotations/typeAnnotations/api/ArrayCreationTree test failure in GHA on 32 bit Linux seems unrelated.

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 20, 2024

@tstuefe Do you mind to take another look? Thank you!

Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems okay. I don't understand the depths of V1 vs V2 controller files as well as you do, @jerboaa , but I trust you there. The mechanics seem fine.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 20, 2024
@jankratochvil
Copy link
Contributor

Currently this patch conflicts a lot with #19085 (jerboaa:jdk-8331560-cgroup-controller-delegation).

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 25, 2024

Currently this patch conflicts a lot with #19085 (jerboaa:jdk-8331560-cgroup-controller-delegation).

Yes, I'll resolve it one way or another depending on which one goes in first.

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 26, 2024

Could I get a second review on this please? @larry-cable maybe? Thanks!

@mlbridge
Copy link

mlbridge bot commented Jun 26, 2024

Mailing list message from Laurence Cable on serviceability-dev:

will do!

On 6/26/24 3:04 AM, Severin Gehwolf wrote:

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 28, 2024

Thanks for the review!

@jerboaa
Copy link
Contributor Author

jerboaa commented Jun 28, 2024

@adinn @iklam Could one of you please help with a second review, please? Not sure if @larry-cable review gets recorded with him having no OpenJDK project role :-/ Thanks in advance!

@larry-cable
Copy link
Contributor

larry-cable commented Jun 28, 2024 via email

@mlbridge
Copy link

mlbridge bot commented Jun 29, 2024

Mailing list message from Laurence Cable on serviceability-dev:

On 6/28/24 8:45 AM, Severin Gehwolf wrote:

On Thu, 27 Jun 2024 18:40:09 GMT, Larry Cable <duke at openjdk.org> wrote:

Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits:

- Refactor mount info matching to helper function
- Merge branch 'master' into jdk-8261242-is-containerized-fix
- Remove problem listing of PlainRead which is reworked here
- Merge branch 'master' into jdk-8261242-is-containerized-fix
- Merge branch 'master' into jdk-8261242-is-containerized-fix
- Add doc for mountinfo scanning.
- Unify naming of variables
- Merge branch 'master' into jdk-8261242-is-containerized-fix
- Merge branch 'master' into jdk-8261242-is-containerized-fix
- jcheck fixes
- ... and 7 more: https://git.openjdk.org/jdk/compare/baafa662...532ea33
src/hotspot/share/prims/jvm.cpp line 504:

502: JVM_LEAF(jboolean, JVM_IsContainerized(void))
503: #ifdef LINUX
504: if (OSContainer::is_containerized()) {
// nit: personal preference...

return OSContainer::isContainerized() ? JNI_TRUE : JNI_FALSE;
I've kept this as is, since the suggestion generates this code after preprocessing on Linux:

return OSContainer::is_containerized() ? JNI_TRUE : JNI_FALSE;
return JNI_FALSE;

over the existing version:

if (OSContainer::is_containerized()) {
return JNI_TRUE;
}
return JNI_FALSE;

Uugh ... makes sense (I guess) :)

Copy link
Member

@iklam iklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me

@jerboaa
Copy link
Contributor Author

jerboaa commented Jul 1, 2024

Thank you for the review!

@jerboaa
Copy link
Contributor Author

jerboaa commented Jul 1, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Jul 1, 2024

Going to push as commit 0a6ffa5.
Since your change was applied there have been 11 commits pushed to the master branch:

  • 71e3798: 8335308: compiler/uncommontrap/DeoptReallocFailure.java times out with SerialGC on Windows
  • c7e9ebb: 8331732: [PPC64] Unify and optimize code which converts != 0 to 1
  • 53242cd: 8335283: Build failure due to 'no_sanitize' attribute directive ignored
  • d9bcf06: 8335217: Fix memory ordering in ClassLoaderData::ChunkedHandleList
  • bb18498: 8335349: jcmd VM.classloaders "fold" option should be optional
  • 8350b1d: 8335294: Fix simple -Wzero-as-null-pointer-constant warnings in gc code
  • 5d866bf: 8335252: Reduce size of j.u.Formatter.Conversion#isValid
  • 166f9d9: 8335221: Some C2 intrinsics incorrectly assume that type argument is compile-time constant
  • 3e23e9c: 8335344: test/jdk/sun/security/tools/keytool/NssTest.java fails to compile
  • 79a3554: 8335124: com/sun/management/ThreadMXBean/ThreadCpuTimeArray.java failed with CPU time out of expected range
  • ... and 1 more: https://git.openjdk.org/jdk/compare/486aa11e74d0772ba84c2adc3c62fc1fcbf52604...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 1, 2024
@openjdk openjdk bot closed this Jul 1, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 1, 2024
@openjdk
Copy link

openjdk bot commented Jul 1, 2024

@jerboaa Pushed as commit 0a6ffa5.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants