HADOOP-11601 FileStatus.getBlocksize() >0 for non-empty files #50
Closed
steveloughran wants to merge 4 commits into apache:trunk from steveloughran:stevel/HADOOP-11601-min-blocksize
1929671 to f42f554
Member: +1. Will commit this week.
Enhance the FS spec and tests to mandate FileStatus.getBlocksize() > 0 for non-empty files.
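For illustration only, the requirement could be expressed as a check along the following lines. This is a rough sketch and not the code in this PR: `BlockSizeContract` and its nested `Status` class are hypothetical stand-ins that model just the length, block size, and directory flag of a file status, so the example runs without Hadoop on the classpath. In Hadoop itself the values would come from `FileStatus.getLen()`, `FileStatus.getBlockSize()`, and `FileStatus.isDirectory()`. The sketch assumes directories and empty files are left unconstrained, since the description only covers non-empty files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockSizeContract {

    /** Hypothetical stand-in for the FileStatus fields this check needs. */
    public static final class Status {
        public final String path;
        public final long len;
        public final long blockSize;
        public final boolean directory;

        public Status(String path, long len, long blockSize, boolean directory) {
            this.path = path;
            this.len = len;
            this.blockSize = blockSize;
            this.directory = directory;
        }
    }

    /**
     * Returns true if the status satisfies the rule "block size > 0 for
     * non-empty files". Directories and zero-length files always pass
     * (an assumption of this sketch).
     */
    public static boolean satisfiesContract(Status s) {
        if (s.directory || s.len == 0) {
            return true;
        }
        return s.blockSize > 0;
    }

    /** Collects the paths of all entries in a listing that break the rule. */
    public static List<String> violations(List<Status> listing) {
        List<String> bad = new ArrayList<>();
        for (Status s : listing) {
            if (!satisfiesContract(s)) {
                bad.add(s.path);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<Status> listing = Arrays.asList(
            new Status("/data/part-0000", 1024L, 134217728L, false), // non-empty, valid block size
            new Status("/data/empty", 0L, 0L, false),                // empty file, unconstrained
            new Status("/data", 0L, 0L, true),                       // directory, unconstrained
            new Status("/data/broken", 512L, 0L, false));            // non-empty with block size 0
        System.out.println("Violations: " + violations(listing));
    }
}
```

A contract test for a real filesystem would apply the same predicate to each `FileStatus` returned by `getFileStatus()` or `listStatus()` after writing a non-empty file.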