- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.2k
          Add WidenLatin1ToUtf16_MisalignedAddress
          #90319
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Closed
      
      
    
  
     Closed
                    Changes from all commits
      Commits
    
    
            Show all changes
          
          
            3 commits
          
        
        Select commit
          Hold shift + click to select a range
      
      
    File filter
Filter by extension
Conversations
          Failed to load comments.   
        
        
          
      Loading
        
  Jump to
        
          Jump to file
        
      
      
          Failed to load files.   
        
        
          
      Loading
        
  Diff view
Diff view
There are no files selected for viewing
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
              
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the need of this, I am seeing that we already align data in
runtime/src/libraries/System.Private.CoreLib/src/System/Text/Latin1Utility.cs
Lines 990 to 1015 in 4218f4c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the address that
pUtf16Bufferpoints to is naturally aligned, then this logic should work as expected.The problem is:
which does not effectively ensure correct alignment for a misaligned
char*.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EgorBo As an alternative approach, change the following to
Sse2.Storeand accept the performance hit pre-Nehalem.runtime/src/libraries/System.Private.CoreLib/src/System/Text/Latin1Utility.cs
Line 899 in 4218f4c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just use unaligned load/store here. It may have some small hit on hardware that's more than 10-12 years old, but it simplifies things overall and will be increasingly rare to encounter such hardware over time. Plus, we don't use aligned loads/stores in most of our workloads already, so this won't be any different.
Opportunistically aligning the data is still goodness and will avoid cache line splits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like
Sse2.Storeapproach is better since unaligned char* should be rare? (e.g. char[] is always aligned)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The consideration would be that
Sse2.Storedoesn't allow containment on pre-AVX hardware and on pre-AVX hardwaremovupswas frequently slower thanmovapseven if the data was definitely aligned.But, given the age of the hardware involved and the logic we use elsewhere, it shouldn't be a huge consideration (and the cost also wasn't huge, typically an extra cycle or so; the split cache line accesses were much worse and closer to 20 cycles for loads).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WidenLatin1ToUtf16_Sse2andNarrowUtf16ToLatin1_Sse2are the only methods in the repository that useStoreAligned.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A potential case is the public API
System.Text.Encoding.GetChars(System.Byte*, System.Int32, System.Char*, System.Int32).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, and so preserving the alignment here won't make a big difference to downlevel hardware. We should just normalize to not using
StoreAlignedso the code can be simpler and robust in the face of unalignable data.