Skip to content

Conversation

@yeonsookimdev
Copy link
Contributor

@yeonsookimdev yeonsookimdev commented Jun 2, 2018

Fix #27267

write() function from 0.7.0-alpha:

function write(io::IO, c::Char) 
    # Let c == '한'
    # v == 0xed959c00
    #v = reinterpret(UInt32, c)
    
    # u = 0x009c95ed
    u = bswap(reinterpret(UInt32, c))
    
    # n |    1 |    2 |    3 |    4  
    #   | 0xed | 0x95 | 0x9c | 0x00
    n = 1
    while true
        # n == 1, write 0xed
        # n == 2, write 0x95
        # n == 3, write 0x9c
        write(io, u % UInt8)
        # return 3
        (u >>= 8) == 0 && return n
    n += 1
    end
end
julia> @time write(stdout, '한') == 3
���  0.000629 seconds (13 allocations: 400 bytes)
true

julia> @time write(stdout, '한') == 3
���  0.000445 seconds (10 allocations: 256 bytes)
true

julia> @time write(stdout, '한') == 3
���  0.000226 seconds (10 allocations: 256 bytes)
true

julia> @time write(stdout, '한') == 3
���  0.000305 seconds (10 allocations: 256 bytes)
true

julia> @time write(stdout, '한') == 3
���  0.000350 seconds (10 allocations: 256 bytes)
true

julia>

write() function from alkorang/display:

function write(io::IO, c::Char)
    # Let c == '한'
    # v == 0xed959c00
    v = reinterpret(UInt32, c)
    u = hton(v) # BIG-endian
    
    # p[0] | p[1] | p[2] | p[3]
    # 0xed | 0x95 | 0x9c | 0x00
    p = unsafe_convert(Ptr{Cvoid}, Ref{UInt32}(u))
    
    # Compare with v, not u
    if      v & 0x000000FF != 0
        unsafe_write(io, p, 4)
    elseif  v & 0x0000FFFF != 0
        # Write 0xed, 0x95, 0x9c at once, return 3
        unsafe_write(io, p, 3)
    elseif  v & 0x00FFFFFF != 0
        unsafe_write(io, p, 2)
    else
        unsafe_write(io, p, 1)
    end
end
julia> @time write(stdout, '한') == 3
한  0.000234 seconds (8 allocations: 320 bytes)
true

julia> @time write(stdout, '한') == 3
한  0.000253 seconds (5 allocations: 176 bytes)
true

julia> @time write(stdout, '한') == 3
한  0.000136 seconds (5 allocations: 176 bytes)
true

julia> @time write(stdout, '한') == 3
한  0.000183 seconds (5 allocations: 176 bytes)
true

julia> @time write(stdout, '한') == 3
한  0.000212 seconds (5 allocations: 176 bytes)
true

julia>

@appleparan
Copy link

appleparan commented Jun 2, 2018

Any tests?

@yeonsookimdev
Copy link
Contributor Author

Any tests?

It passed Base.runtests(["strings/io", "char", "iobuffer"]).

julia> Base.runtests(["strings/io", "char", "iobuffer"]);
Test   (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
strings/io  (2) |     4.76 |   0.14 |  3.0 |     326.25 |   239.07
char        (3) |     6.60 |   0.27 |  4.1 |     612.73 |   241.36
iobuffer    (2) |     5.00 |   0.13 |  2.7 |     661.03 |   260.16

Test Summary: |  Pass  Total
  Overall     | 14420  14420
    SUCCESS

julia>

@strickek
Copy link
Contributor

strickek commented Jun 3, 2018

Is speed for one byte Chars also better?

@yeonsookimdev
Copy link
Contributor Author

yeonsookimdev commented Jun 3, 2018

Is speed for one byte Chars also better?

function foo(io::IO, c::Char, n::Int)
    for _ in 1:5
        @time for _ in 1:n
            write(io, c)
        end
    end
end

To make it more clear, I tried with IOBuffer after few warming-up.

0.7.0-alpha:

julia> GC.gc(); io = IOBuffer(); foo(io, 'a', 1000)
  0.000024 seconds (5 allocations: 2.172 KiB)
  0.000025 seconds (1 allocation: 2.125 KiB)
  0.000027 seconds (1 allocation: 2.000 KiB)
  0.000022 seconds
  0.000026 seconds (1 allocation: 4.000 KiB)

julia> GC.gc(); io = IOBuffer(); foo(io, 'δ', 1000)
  0.000041 seconds (6 allocations: 4.297 KiB)
  0.000042 seconds (1 allocation: 2.000 KiB)
  0.000050 seconds (1 allocation: 4.000 KiB)
  0.000043 seconds
  0.000049 seconds (1 allocation: 8.000 KiB)

julia> GC.gc(); io = IOBuffer(); foo(io, '한', 1000)
  0.000060 seconds (7 allocations: 6.297 KiB)
  0.000055 seconds (1 allocation: 4.000 KiB)
  0.000058 seconds (1 allocation: 8.000 KiB)
  0.000051 seconds
  0.000056 seconds

julia>

alkorang/display:

julia> GC.gc(); io = IOBuffer(); foo(io, 'a', 1000)
  0.000030 seconds (5 allocations: 2.172 KiB)
  0.000030 seconds (1 allocation: 2.125 KiB)
  0.000033 seconds (1 allocation: 2.000 KiB)
  0.000028 seconds
  0.000033 seconds (1 allocation: 4.000 KiB)

julia> GC.gc(); io = IOBuffer(); foo(io, 'δ', 1000)
  0.000034 seconds (6 allocations: 4.297 KiB)
  0.000034 seconds (1 allocation: 2.000 KiB)
  0.000033 seconds (1 allocation: 4.000 KiB)
  0.000035 seconds
  0.000043 seconds (1 allocation: 8.000 KiB)

julia> GC.gc(); io = IOBuffer(); foo(io, '한', 1000)
  0.000039 seconds (7 allocations: 6.297 KiB)
  0.000035 seconds (1 allocation: 4.000 KiB)
  0.000036 seconds (1 allocation: 8.000 KiB)
  0.000030 seconds
  0.000030 seconds

julia>

Is there any better way to measure them?

@nalimilan
Copy link
Member

Is there any better way to measure them?

Use @btime from BenchmarkTools.

@nalimilan nalimilan added the io Involving the I/O subsystem: libuv, read, write, etc. label Jun 3, 2018
@yeonsookimdev
Copy link
Contributor Author

yeonsookimdev commented Jun 3, 2018

Use @btime from BenchmarkTools.

Great package!

0.7.0-alpha:

julia> GC.gc(); io = IOBuffer(); @btime write(io, 'a');
  47.265 ns (0 allocations: 0 bytes)

julia> GC.gc(); io = IOBuffer(); @btime write(io, 'δ');
  63.404 ns (0 allocations: 0 bytes)

julia> GC.gc(); io = IOBuffer(); @btime write(io, '한');
  79.160 ns (0 allocations: 0 bytes)

julia>

alkorang/display:

julia> GC.gc(); io = IOBuffer(); @btime write(io, 'a');
  55.335 ns (0 allocations: 0 bytes)

julia> GC.gc(); io = IOBuffer(); @btime write(io, 'δ');
  54.950 ns (0 allocations: 0 bytes)

julia> GC.gc(); io = IOBuffer(); @btime write(io, '한');
  54.950 ns (0 allocations: 0 bytes)

julia>

base/io.jl Outdated
n += 1
v = reinterpret(UInt32, c)
u = hton(v) # BIG-endian
p = unsafe_convert(Ptr{Cvoid}, Ref{UInt32}(u))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this Ref needs a GC.@preserve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it requires GC.@preserve for Ref.

function f0(c::Char)
    u = hton(reinterpret(UInt32, c))
    p = unsafe_convert(Ptr{Cvoid}, Ref{UInt32}(u))
    show(unsafe_wrap(Array{UInt8,1}, Ptr{UInt8}(p), (4,)))
end

function f1(c::Char)
    u = hton(reinterpret(UInt32, c))
    GC.@preserve u begin
        p = unsafe_convert(Ptr{Cvoid}, Ref{UInt32}(u))
        show(unsafe_wrap(Array{UInt8,1}, Ptr{UInt8}(p), (4,)))
    end
end

function f2(c::Char)
    u = Ref{UInt32}(hton(reinterpret(UInt32, c)))
    GC.@preserve u begin
        p = unsafe_convert(Ptr{Cvoid}, u)
        show(unsafe_wrap(Array{UInt8,1}, Ptr{UInt8}(p), (4,)))
    end
end

UInt8[0xed, 0x95, 0x9c, 0x00] must be printed:

julia> f0('한')
UInt8[0x40, 0x27, 0x3e, 0x3c]
julia> f0('한')
UInt8[0x01, 0x00, 0x00, 0x00]
julia>

julia> f1('한')
UInt8[0x20, 0x1c, 0x3e, 0x3c]
julia> f1('한')
UInt8[0x01, 0x00, 0x00, 0x00]
julia>

julia> f2('한')
UInt8[0xed, 0x95, 0x9c, 0x00]
julia> f2('한')
UInt8[0xed, 0x95, 0x9c, 0x00]
julia>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right – if @time isn't showing the extra allocation of the Ref object, this is broken (as you are demonstrating above)

@appleparan
Copy link

It passed Base.runtests(["strings/io", "char", "iobuffer"]).

I mean for your issue. e.g. '한'. I think you should create tests for this issue.

@yeonsookimdev
Copy link
Contributor Author

I mean for your issue. e.g. '한'. I think you should create tests for this issue.

julia/test/char.jl

Lines 213 to 239 in 5583ae4

@testset "overlong codes" begin
function test_overlong(c::Char, n::Integer, rep::String)
if isvalid(c)
@test Int(c) == n
else
@test_throws Base.InvalidCharError UInt32(c)
end
@test sprint(show, c) == rep
if Base.isoverlong(c)
@test occursin(rep*": [overlong]", sprint(show, "text/plain", c))
end
end
# TODO: use char syntax once #25072 is fixed
test_overlong('\0', 0, "'\\0'")
test_overlong("\xc0\x80"[1], 0, "'\\xc0\\x80'")
test_overlong("\xe0\x80\x80"[1], 0, "'\\xe0\\x80\\x80'")
test_overlong("\xf0\x80\x80\x80"[1], 0, "'\\xf0\\x80\\x80\\x80'")
test_overlong('\x30', 0x30, "'0'")
test_overlong("\xc0\xb0"[1], 0x30, "'\\xc0\\xb0'")
test_overlong("\xe0\x80\xb0"[1], 0x30, "'\\xe0\\x80\\xb0'")
test_overlong("\xf0\x80\x80\xb0"[1], 0x30, "'\\xf0\\x80\\x80\\xb0'")
test_overlong('\u8430', 0x8430, "'萰'")
test_overlong("\xf0\x88\x90\xb0"[1], 0x8430, "'\\xf0\\x88\\x90\\xb0'")
end

Here are some tests for multibyte cases. test_overlong calls sprint(show, c) which eventually calls write(io::IO, c::Char). If write did not work correctly, it cannot pass these tests.

(u >>= 8) == 0 && return n
n += 1
v = reinterpret(UInt32, c)
u = Ref{UInt32}(hton(v)) # BIG-endian
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an unconditional bswap since UTF-8 is independent of the endianess of the system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UTF-8 is independent of the endianess of the system.

Yes, it is true, but UInt32 isn't.

base/io.jl Outdated
unsafe_write(io, p, 2)
else
unsafe_write(io, p, 1)
end
Copy link
Member

@StefanKarpinski StefanKarpinski Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can replace this with something along the lines of

unsafe_write(io, p, min(4, (trailing_zeros(v) >>> 3) + 1))

Copy link
Contributor Author

@yeonsookimdev yeonsookimdev Jun 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Copy link
Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Shame about AppVeyor timing out. Perhaps we should wait until that's sorted out and rebase since testing Windows seems fairly key here.

@yeonsookimdev
Copy link
Contributor Author

yeonsookimdev commented Jun 18, 2018

Perhaps we should wait until that's sorted out and rebase since testing Windows seems fairly key here.

@StefanKarpinski @JeffBezanson
Some other tests failed on Windows. I first thought it will be okay if I rebase it, but no.

One of them is from cmdlineargs.jl.

julia/test/cmdlineargs.jl

Lines 544 to 545 in b28496a

write(infile, "(1, 2+3)")
@test read(pipeline(exename, stdin=infile), String) == "(1, 5)\n"

Error message from 64bit AppVeyor

Error in testset cmdlineargs:
Test Failed at C:\projects\julia\julia-df4bb5401f\share\julia\test\cmdlineargs.jl:545
  Expression: read(pipeline(exename, stdin=infile), String) == "(1, 5)\n"
   Evaluated: "\x001, 5)\n" == "(1, 5)\n"

Error message from 32bit AppVeyor

Test Failed at C:\projects\julia\julia-df4bb5401f\share\julia\test\cmdlineargs.jl:545
  Expression: read(pipeline(exename, stdin=infile), String) == "(1, 5)\n"
   Evaluated: "\xcc1, 5)\n" == "(1, 5)\n"
ERROR: LoadError: Test run finished with errors

I guess the problem is garbage collected Ref, because the first broken Char keep changing like \x00 or \xd0 on my computer. But I don't understand, I protect Ref with GC.@preserve.

Those tests are spawning a process using @cmd, spawn, or pipeline and reading/writing data from/to redirected stdin/stdout.

@yeonsookimdev
Copy link
Contributor Author

@vtjnash
I think this pr won't be available until finding out the reason of broken bytes from unsafe_write. Fixing libuv to use buffering looks more general and nicer.

@yeonsookimdev
Copy link
Contributor Author

#28314 Fixes the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

io Involving the I/O subsystem: libuv, read, write, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants