-
Notifications
You must be signed in to change notification settings - Fork 2.4k
BATCH-2434 Improve TransactionAwareBufferedWriter #387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I have signed and agree to the terms of the SpringSource Individual Contributor License Agreement. |
Hi @marschall , Thank you for this PR!
How did you measure this? I would like to be able to measure the improvement of this PR before applying any change. |
This changes a bit with Java 9+ where Doing a heap dump while a transaction is active and looking at the retained heap of a |
Thank you for your answer, I was aware of the theory and just wanted to know how you practically measured the 50% improvement in terms of memory usage. I see the PR is opened against the I will add some inline comments about the changes. |
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Outdated
Show resolved
Hide resolved
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Outdated
Show resolved
Hide resolved
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Show resolved
Hide resolved
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Outdated
Show resolved
Hide resolved
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Outdated
Show resolved
Hide resolved
abcccb2
to
694b99e
Compare
...tch-infrastructure/src/main/java/org/springframework/batch/item/file/FlatFileItemWriter.java
Outdated
Show resolved
Hide resolved
.../main/java/org/springframework/batch/support/transaction/TransactionAwareBufferedWriter.java
Outdated
Show resolved
Hide resolved
this.writer = new OutputStreamWriter(outputStream, encoding); | ||
} | ||
|
||
int length() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is used in org.springframework.batch.support.transaction.TransactionAwareBufferedWriter#getBufferSize
. According to its Javadoc, the method should return the size of unflushed buffered data. However, the method length
flushes the writer before returning the size of the outputStream. Do we need to flush the writer to get the length of the buffer? A get
method should be side-effect free IMO. I tried to remove the writer.flush
but this makes several tests to fail. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that OutputStreamWriter
has an internal buffer in sun.nio.cs.StreamEncoder
. If we want to know how many bytes we will write this will be the bytes in the ByteArrayOutputStream
plus the bytes in the OutputStreamWriter
/ sun.nio.cs.StreamEncoder
. There is no way of accessing the number of bytes in the internal buffer in OutputStreamWriter
/ sun.nio.cs.StreamEncoder
. The only way to be sure is to flush those bytes to the ByteArrayOutputStream
.
@marschall Thank you for all these updates! While I like the improvement suggested here and the build is passing with this PR, I still have a couple of concerns:
@marschall What do you think? I also would like to have a second opinion on that PR, @mminella What are your thoughts? Anything I could have missed or overlooked? Thank you both. |
@benas can you be a bit more specific what changes you would like to see or which approach you would prefer? A deprecated constructor is provided for backwards comparability. Should I change the class Javadoc as well? |
|
||
private FileChannel channel; | ||
|
||
private OutputStream outputStream; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would still require a FileOutputStream
given the dependency on the FileChannel
so I'd make this be a FileOutputStream
We have been discussing this with Michael and here are our thoughts. Since the PR contains two improvements, I will address them separately. 1. buffer in a ByteArrayOutputStream instead of StringBuilder
Indeed, this is basically JEP 254. Since v4.3 will be the last version in the v4 line (as announced last year) and since the next major version will have a Java9+ baseline anyway, we think the deprecation + additional complexity here is not worth it for the remaining short lifetime of v4. 2. overwrite #write(String) methods to avoid copiesThis enhancement makes sense and could be merged. I tried it here and the build passes without any regression. @marschall Could you please update the PR to keep only the second enhancement? It should be good to merge with that. Thank you upfront. |
@benas yes I can do this |
TransactionAwareBufferedWriter offers a number of optimization potentials. First it creates an unnecessary local, temporary char[] in * avoid local, temporary char[] in #write(char[], int, int) * overwrite #write(String) methods to avoid copies Together these two changes should help to reduce allocation rate. Issue: BATCH-2434 https://jira.spring.io/browse/BATCH-2434
I pushed the new version. |
LGTM. Rebased and merged as 10fd371. Thank you very much for your contribution! |
TransactionAwareBufferedWriter offers a number of optimization
potentials. First it buffers at the char level. Buffering at the byte
level instead saves about 50% memory usage in common cases. Second it
does not overwrite any of the #write(String) methods leading to
unnecessary intermediate copies.
Together these two changes should help to reduce both live set size and
allocation rate.
Issue: BATCH-2434
https://jira.spring.io/browse/BATCH-2434