Description
TarHelpers.GetTrimmedUtf8String() is called here to extract strings from tar headers:
runtime/src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Read.cs
Line 387 in 9e5e6aa
| name: TarHelpers.GetTrimmedUtf8String(buffer.Slice(FieldLocations.Name, FieldLengths.Name)), | 
Internally, TarHelpers.GetTrimmedUtf8String() calls TarHelpers.TrimEndingNullsAndSpaces(), which, as the name implies, trims only trailing '\0' (0x00) and ' ' (0x20) characters.
According to these specs, these fields are null-terminated character strings.
The name, linkname, magic, uname, and gname are null-terminated character strings.
So the correct behavior would be to read characters up to the first null terminator and ignore everything after it.
Not doing so causes issues with this tar, which contains extra bytes after the null terminator.
Here's a hex-dump of the header for one of the entries:

.NET interprets this name as "python/bin/idle3.13\0dle3.13" which becomes "python/bin/idle3.13_dle3.13" once extracted. The correct name would be "python/bin/idle3.13".
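To make the difference concrete, here is a minimal sketch that contrasts the two readings of the name field. It is not the runtime's actual code: `TarNameSketch`, `TrimTrailing`, and `ReadNullTerminated` are hypothetical names, `TrimTrailing` only approximates the trimming described above, and the 100-byte field is filled by hand to mimic the entry from this archive.

```csharp
using System;
using System.Text;

// Hypothetical helpers for illustration only; not part of System.Formats.Tar.
static class TarNameSketch
{
    // Approximates the behavior described in this issue:
    // only trailing NUL (0x00) and space (0x20) bytes are removed.
    public static string TrimTrailing(ReadOnlySpan<byte> field)
    {
        int end = field.Length;
        while (end > 0 && (field[end - 1] == 0 || field[end - 1] == (byte)' '))
        {
            end--;
        }
        return Encoding.UTF8.GetString(field.Slice(0, end));
    }

    // Suggested reading: keep the bytes before the first NUL,
    // or the whole field if it contains no NUL at all.
    public static string ReadNullTerminated(ReadOnlySpan<byte> field)
    {
        int nul = field.IndexOf((byte)0);
        return Encoding.UTF8.GetString(nul >= 0 ? field.Slice(0, nul) : field);
    }
}

static class Program
{
    static void Main()
    {
        // A 100-byte name field with stray bytes after the terminator.
        byte[] field = new byte[100];
        Encoding.ASCII.GetBytes("python/bin/idle3.13\0dle3.13").CopyTo(field, 0);

        // Make embedded NULs visible when printing.
        string trimmed = TarNameSketch.TrimTrailing(field).Replace("\0", "\\0");
        string terminated = TarNameSketch.ReadNullTerminated(field);

        Console.WriteLine(trimmed);    // python/bin/idle3.13\0dle3.13
        Console.WriteLine(terminated); // python/bin/idle3.13
    }
}
```

Falling back to the whole field when no NUL is present keeps names that fill all 100 bytes working, since those have no room for a terminator.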
Reproduction Steps
Download cpython-3.13.5+20250702-x86_64-unknown-linux-gnu-install_only_stripped.tar.gz and place it in the appropriate directory.
Extract it with the following code:
using System.Formats.Tar;
using System.IO;
using System.IO.Compression;

using Stream fileStream = File.OpenRead(@"cpython-3.13.5+20250702-x86_64-unknown-linux-gnu-install_only_stripped.tar.gz");
Directory.CreateDirectory(@"~temp");
using GZipStream tarStream = new(fileStream, CompressionMode.Decompress);
await TarFile.ExtractToDirectoryAsync(tarStream, @"~temp", false);
Expected behavior
File names should respect null-terminated strings, such as in this example:
 
Actual behavior
File names of the extracted files are incorrect:
 
Regression?
No response
Known Workarounds
No response
Configuration
.NET 9.0.300
Other information
No response