Skip to content

Conversation

minborg
Copy link
Contributor

@minborg minborg commented Nov 29, 2024

This PR proposes to rewrite the StringSupport::chunkedStrlen* methods, fixing a bug in the short_strlen variant for odd offsets (offset % 2 != 0).

This PR also improves performance on modern hardware, as there is no need for pre-looping alignment. Removing this improves performance by about 30% for larger strings.

It passes the jdk_foreign test suit. Also, tier1 through tier3 passes on various platforms and configurations.

Base:

Benchmark                               (size)  Mode  Cnt    Score   Error  Units
InternalStrLen.changedElementQuad            1  avgt   30    2.057 ? 0.012  ns/op
InternalStrLen.changedElementQuad            4  avgt   30    3.776 ? 0.031  ns/op
InternalStrLen.changedElementQuad           16  avgt   30    6.690 ? 0.060  ns/op
InternalStrLen.changedElementQuad          251  avgt   30   48.581 ? 0.764  ns/op
InternalStrLen.changedElementQuad         1024  avgt   30  196.188 ? 3.484  ns/op
InternalStrLen.chunkedDouble                 1  avgt   30    1.903 ? 0.013  ns/op
InternalStrLen.chunkedDouble                 4  avgt   30    3.446 ? 0.025  ns/op
InternalStrLen.chunkedDouble                16  avgt   30    5.759 ? 0.062  ns/op
InternalStrLen.chunkedDouble               251  avgt   30   26.892 ? 0.141  ns/op
InternalStrLen.chunkedDouble              1024  avgt   30   72.940 ? 1.562  ns/op
InternalStrLen.chunkedSingle                 1  avgt   30    1.897 ? 0.015  ns/op
InternalStrLen.chunkedSingle                 4  avgt   30    5.357 ? 0.560  ns/op
InternalStrLen.chunkedSingle                16  avgt   30    3.821 ? 0.052  ns/op
InternalStrLen.chunkedSingle               251  avgt   30   19.482 ? 0.190  ns/op
InternalStrLen.chunkedSingle              1024  avgt   30   38.938 ? 0.411  ns/op
InternalStrLen.chunkedSingleMisaligned       1  avgt   30    2.230 ? 0.147  ns/op
InternalStrLen.chunkedSingleMisaligned       4  avgt   30    5.424 ? 0.688  ns/op
InternalStrLen.chunkedSingleMisaligned      16  avgt   30    9.573 ? 0.063  ns/op
InternalStrLen.chunkedSingleMisaligned     251  avgt   30   22.242 ? 0.182  ns/op
InternalStrLen.chunkedSingleMisaligned    1024  avgt   30   45.442 ? 0.252  ns/op
InternalStrLen.elementByteMisaligned         1  avgt   30    1.616 ? 0.041  ns/op
InternalStrLen.elementByteMisaligned         4  avgt   30    2.982 ? 0.018  ns/op
InternalStrLen.elementByteMisaligned        16  avgt   30    8.662 ? 0.085  ns/op
InternalStrLen.elementByteMisaligned       251  avgt   30  126.644 ? 0.902  ns/op
InternalStrLen.elementByteMisaligned      1024  avgt   30  492.736 ? 3.254  ns/op
InternalStrLen.elementDouble                 1  avgt   30    1.900 ? 0.016  ns/op
InternalStrLen.elementDouble                 4  avgt   30    3.931 ? 0.027  ns/op
InternalStrLen.elementDouble                16  avgt   30   12.310 ? 0.109  ns/op
InternalStrLen.elementDouble               251  avgt   30  203.665 ? 6.778  ns/op
InternalStrLen.elementDouble              1024  avgt   30  786.320 ? 5.104  ns/op
InternalStrLen.elementQuad                   1  avgt   30    1.922 ? 0.031  ns/op
InternalStrLen.elementQuad                   4  avgt   30    4.078 ? 0.175  ns/op
InternalStrLen.elementQuad                  16  avgt   30   12.538 ? 0.330  ns/op
InternalStrLen.elementQuad                 251  avgt   30  202.175 ? 3.537  ns/op
InternalStrLen.elementQuad                1024  avgt   30  798.846 ? 7.323  ns/op
InternalStrLen.elementSingle                 1  avgt   30    1.614 ? 0.045  ns/op
InternalStrLen.elementSingle                 4  avgt   30    2.992 ? 0.010  ns/op
InternalStrLen.elementSingle                16  avgt   30    8.773 ? 0.095  ns/op
InternalStrLen.elementSingle               251  avgt   30  126.975 ? 1.201  ns/op
InternalStrLen.elementSingle              1024  avgt   30  499.057 ? 6.561  ns/op

Patch:

Benchmark                               (size)  Mode  Cnt    Score    Error  Units
InternalStrLen.changedElementQuad            1  avgt   30    1.457 ?  0.024  ns/op
InternalStrLen.changedElementQuad            4  avgt   30    3.596 ?  0.056  ns/op
InternalStrLen.changedElementQuad           16  avgt   30    5.711 ?  0.090  ns/op
InternalStrLen.changedElementQuad          251  avgt   30   38.945 ?  0.859  ns/op
InternalStrLen.changedElementQuad         1024  avgt   30  144.427 ?  4.495  ns/op
InternalStrLen.chunkedDouble                 1  avgt   30    2.217 ?  0.018  ns/op
InternalStrLen.chunkedDouble                 4  avgt   30    2.532 ?  0.044  ns/op
InternalStrLen.chunkedDouble                16  avgt   30    3.795 ?  0.082  ns/op
InternalStrLen.chunkedDouble               251  avgt   30   22.104 ?  0.158  ns/op
InternalStrLen.chunkedDouble              1024  avgt   30   65.326 ?  0.287  ns/op
InternalStrLen.chunkedSingle                 1  avgt   30    2.206 ?  0.042  ns/op
InternalStrLen.chunkedSingle                 4  avgt   30    2.826 ?  0.024  ns/op
InternalStrLen.chunkedSingle                16  avgt   30    3.518 ?  0.067  ns/op
InternalStrLen.chunkedSingle               251  avgt   30   12.504 ?  0.062  ns/op
InternalStrLen.chunkedSingle              1024  avgt   30   34.515 ?  0.305  ns/op
InternalStrLen.chunkedSingleMisaligned       1  avgt   30    2.510 ?  0.024  ns/op
InternalStrLen.chunkedSingleMisaligned       4  avgt   30    3.296 ?  0.033  ns/op
InternalStrLen.chunkedSingleMisaligned      16  avgt   30    4.244 ?  0.351  ns/op
InternalStrLen.chunkedSingleMisaligned     251  avgt   30   13.013 ?  0.305  ns/op
InternalStrLen.chunkedSingleMisaligned    1024  avgt   30   36.129 ?  1.311  ns/op
InternalStrLen.elementByteMisaligned         1  avgt   30    1.502 ?  0.041  ns/op
InternalStrLen.elementByteMisaligned         4  avgt   30    2.836 ?  0.028  ns/op
InternalStrLen.elementByteMisaligned        16  avgt   30    8.528 ?  0.120  ns/op
InternalStrLen.elementByteMisaligned       251  avgt   30  127.409 ?  2.418  ns/op
InternalStrLen.elementByteMisaligned      1024  avgt   30  494.441 ?  7.887  ns/op
InternalStrLen.elementDouble                 1  avgt   30    1.780 ?  0.043  ns/op
InternalStrLen.elementDouble                 4  avgt   30    3.798 ?  0.096  ns/op
InternalStrLen.elementDouble                16  avgt   30   12.194 ?  0.331  ns/op
InternalStrLen.elementDouble               251  avgt   30  195.002 ?  1.206  ns/op
InternalStrLen.elementDouble              1024  avgt   30  787.829 ? 12.224  ns/op
InternalStrLen.elementQuad                   1  avgt   30    1.789 ?  0.054  ns/op
InternalStrLen.elementQuad                   4  avgt   30    3.873 ?  0.171  ns/op
InternalStrLen.elementQuad                  16  avgt   30   12.198 ?  0.284  ns/op
InternalStrLen.elementQuad                 251  avgt   30  198.470 ?  2.636  ns/op
InternalStrLen.elementQuad                1024  avgt   30  795.685 ?  6.419  ns/op
InternalStrLen.elementSingle                 1  avgt   30    1.501 ?  0.044  ns/op
InternalStrLen.elementSingle                 4  avgt   30    2.834 ?  0.039  ns/op
InternalStrLen.elementSingle                16  avgt   30    8.506 ?  0.126  ns/op
InternalStrLen.elementSingle               251  avgt   30  131.504 ?  6.495  ns/op
InternalStrLen.elementSingle              1024  avgt   30  499.206 ? 15.157  ns/op

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8345120: A likely bug in StringSupport::chunkedStrlenShort (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22451/head:pull/22451
$ git checkout pull/22451

Update a local copy of the PR:
$ git checkout pull/22451
$ git pull https://git.openjdk.org/jdk.git pull/22451/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22451

View PR using the GUI difftool:
$ git pr show -t 22451

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22451.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 29, 2024

👋 Welcome back pminborg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 29, 2024

@minborg This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8345120: A likely bug in StringSupport::chunkedStrlenShort

Reviewed-by: mcimadamore

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 4 new commits pushed to the master branch:

  • 30b8bbe: 8345060: Remove Security Manager dependencies from java.security.KeyStore and Identity APIs and implementations
  • 1ca7644: 8339480: Build static-jdk image with a statically linked launcher
  • d589baf: 8345218: Clean out references to windows-x86 in jib profiles
  • b823398: 8345267: Fix memory leak in JVMCIEnv dtor

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented Nov 29, 2024

@minborg The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@minborg
Copy link
Contributor Author

minborg commented Nov 29, 2024

Here is a chart of the performance improvements on a Mac M1 with 251 element strings:

image

@minborg minborg marked this pull request as ready for review November 29, 2024 08:08
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 29, 2024
@mlbridge
Copy link

mlbridge bot commented Nov 29, 2024

if (curr == 0) {
return offset;
@ForceInline
public static int strlenByte(final AbstractMemorySegmentImpl segment,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: clients of this method seems to always pass segment.byteSize() for toOffset. What is the value of passing it as an explicit parameter instead of having byteSize() being called inside the method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is an internal API, I tried to keep it as similar as possible to the other bulk operations. But as you say, the value used here for the last parameter is always segment.byteSize(). We could easily change it. What's your preference here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the method works correctly on all end offsets (even when the offset is different from the segment size), then I'm ok leaving it as is. I'm mostly trying to make sure that endOffset = size is not an hidden invariant we depend on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test for the end offset.

@minborg
Copy link
Contributor Author

minborg commented Dec 2, 2024

@TheRealMDoerr and @offamitkumar ; Could you take a look at this and make sure it works using BE?

@TheRealMDoerr
Copy link
Contributor

Thanks for the notification. The java/foreign suite has passed on linux on Power (Big and Little Endian) as well as on AIX (Big Endian). I can run more tests over night if needed.

@offamitkumar
Copy link
Member

I ran tier1 on s390x, didn't see any regression.

@minborg
Copy link
Contributor Author

minborg commented Dec 2, 2024

Thanks, @TheRealMDoerr and @offamitkumar, for testing on BE machines.

for (var charset : standardCharsets()) {
for (var arena: arenas()) {
try (arena) {
MemorySegment strSegment = arena.allocateFrom(testString, charset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just use String::getBytes ? E.g. you don't really need the strSegment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. But then again, I do not need the byte array either.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 2, 2024
@minborg
Copy link
Contributor Author

minborg commented Dec 3, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Dec 3, 2024

Going to push as commit 8dada73.
Since your change was applied there have been 27 commits pushed to the master branch:

  • 659f70b: 8343418: Unnecessary Hashtable usage in CSS.htmlAttrToCssAttrMap
  • 5c8cb2e: 8337199: Add jcmd Thread.vthread_scheduler and Thread.vthread_pollers diagnostic commands
  • 3eb5461: 8343791: Socket.connect API should document whether the socket will be closed when hostname resolution fails or another error occurs
  • 4ac2e47: 8343800: Cleanup definition of NULL_WORD
  • a3b58ee: 8339983: [s390x] secondary_super_cache does not scale well: C1 and interpreter
  • e023add: 8345297: test/jdk/javax/swing/Action/8133039/bug8133039.java fails in ubuntu22.04
  • 40ae469: 8235786: Javadoc for com/sun/net/httpserver/HttpExchange.java#setAttribute is unclear
  • 24983dd: 7038838: Unspecified NPE in java.net.IDN methods
  • 325366e: 8345141: Remove uses of SecurityManager in ShellFolder related classes
  • d88c7b3: 8345279: Mistake in javadoc of javax.sql.rowset.BaseRowSet#setBigDecimal
  • ... and 17 more: https://git.openjdk.org/jdk/compare/9a48e4d9d2637bf152d6611061a0a0a195cc2caf...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 3, 2024
@openjdk openjdk bot closed this Dec 3, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 3, 2024
@openjdk
Copy link

openjdk bot commented Dec 3, 2024

@minborg Pushed as commit 8dada73.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

5 participants