Why does this allocate? (with example)

It looks like in a ccall, a string doesn’t allocate but a string view does. Is this expected? Why? To me it’s unexpected.

This doesn’t allocate

st = "experiment number 1 and 2"
fmt = "experiment number %d %n"  # %d matches the Cint refs below; %ld would write a C long
function test_alloc(st, fmt)
    i_ref = Ref{Cint}(0);
    j_ref = Ref{Cint}(0);
    x = @allocated ccall(:sscanf, Cint, (Cstring, Ptr{UInt8}, Ptr{Cint}, Ptr{Cint}), st, fmt, i_ref, j_ref)
    println("alloc=$x, i_ref=$(i_ref[]), j_ref=$(j_ref[])")
end
test_alloc(st, fmt)

The code above prints:

alloc=0, i_ref=1, j_ref=20

(I print i_ref and j_ref just so I’m 100% sure that sscanf is being called correctly and doing what I want.)
No allocs, as expected. However…

st_view = @view st[1:20]
test_alloc(st_view, fmt)

prints

alloc=48, i_ref=1, j_ref=20

So two questions: A) Why does this allocate? And B) is there a way to pass views into sscanf in a way that doesn’t allocate?

Thank you

EDIT:

julia> versioninfo()
Julia Version 1.7.0
Commit 3bf9d17731 (2021-11-30 12:12 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 

Probably because you promised a Cstring, but a view of a String is a SubString, so Julia makes a copy when passing it to ccall. A Cstring is nothing but a pointer to UInt8, so you can replace st in the call with pointer(st). If you need a view, you can offset the pointer, e.g. pointer(st) + 2.

IIUC, since it’s a Cstring, you can’t take arbitrary slices, because C strings are NUL-terminated.
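A minimal sketch of the pointer-offset idea, in case it helps. The helper name scan_from and the offset of 18 are made up for illustration, and the GC.@preserve is my addition so the string stays rooted while the raw pointer is in use. It also uses the varargs form of ccall, since sscanf is variadic.

```julia
# Hypothetical helper: scan an integer starting `offset` bytes into `st`.
# Only the start moves, so the parent String's trailing NUL is still the end.
function scan_from(st::String, offset::Integer)
    0 <= offset <= ncodeunits(st) || throw(ArgumentError("offset out of range"))
    r = Ref{Cint}(0)
    GC.@preserve st begin
        p = pointer(st) + offset   # raw Ptr{UInt8}, no copy of the string
        ccall(:sscanf, Cint, (Ptr{UInt8}, Cstring, Ptr{Cint}...), p, "%d", r)
    end
    return r[]
end

st = "experiment number 1 and 2"
scan_from(st, 18)   # scanning starts at the "1"
```

Note that offsets are in bytes, not characters, so this needs care with non-ASCII strings.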

Nailed it. Thank you

You have to be a bit careful here, because C code generally expects null-terminated strings. A String in Julia is already null-terminated, but this is really an implementation detail, which is why Cstring exists for C interop: it signifies that the string needs to be null-terminated. This is why the conversion from String is generally a no-op, but since a SubString generally isn’t null-terminated, a copy is made in that case.
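One way to see the difference is to look at what Base.cconvert hands to ccall for each argument type. This is only a sketch; it relies on how Base currently implements the Cstring conversion, which is an internal detail and could change.

```julia
st  = "experiment number 1 and 2"
sub = @view st[1:20]                       # SubString{String}

whole  = Base.cconvert(Cstring, st)        # for a String: the string itself
copied = Base.cconvert(Cstring, sub)       # for a SubString: a fresh String

same_memory = pointer(whole) == pointer(st)     # no copy was made
new_memory  = pointer(copied) != pointer(st)    # the view was copied
```

The fresh String created for the SubString is where the allocation in the original question comes from.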

Does sscanf care though?

Yes, it does. See for example:

julia> s = SubString("1234", 1:2)
"12"

julia> r = Ref{Cint}(0)
Base.RefValue{Int32}(0)

julia> ccall(:sscanf, Cint, (Cstring, Cstring, Ptr{Cint}), s, "%d", r)
1

julia> r
Base.RefValue{Int32}(12)

julia> ccall(:sscanf, Cint, (Ptr{UInt8}, Cstring, Ptr{Cint}), s, "%d", r)
1

julia> r
Base.RefValue{Int32}(1234)

For my use case I have reasonable certainty that the format string won’t run past the scanned string.

As I was saying, offsetting the head is fine because the string still runs to its original end, so the NUL termination is unchanged. But if you want a slice of a String that ends early, you can’t avoid a copy.
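If the copy itself is acceptable and only the fresh String per call is the problem, one option might be to copy the slice into a scratch buffer that is reused between calls and terminate it by hand. A rough sketch, where scan_slice! and the buffer size of 64 are hypothetical names and values of mine:

```julia
# Hypothetical helper: copy the bytes of `s` into `buf`, add the NUL
# terminator explicitly, then let sscanf read from the buffer.
function scan_slice!(buf::Vector{UInt8}, s::AbstractString)
    n = ncodeunits(s)
    n < length(buf) || throw(ArgumentError("scratch buffer too small"))
    copyto!(buf, 1, codeunits(s), 1, n)   # copy the slice's bytes
    buf[n + 1] = 0x00                     # terminate where the slice ends
    r = Ref{Cint}(0)
    ccall(:sscanf, Cint, (Ptr{UInt8}, Cstring, Ptr{Cint}...), buf, "%d", r)
    return r[]
end

buf = zeros(UInt8, 64)                    # allocated once, reused per line
scan_slice!(buf, SubString("1234", 1:2))
```

The bytes are still copied, but into storage you control, so sscanf stops at the end of the slice rather than at the end of the parent string.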

Alright guys, follow up.

How is it that this allocates:

const test_line1 = "\"x\":1622764914150,\"y\":31534070432,\0";
const fmt1 = "\"x\":%ld,\"y\":%ld,\0";
function pp1(func::Function, line::AbstractString)
    x_ref = Ref{Clong}();
    y_ref = Ref{Clong}();
    result = ccall(:sscanf, Cint, (Cstring, Ptr{UInt8}, Ptr{Clong}, Ptr{Clong}), pointer(line), fmt1, x_ref, y_ref)
    # println("$(x_ref[]), $(y_ref[])")
    return result
end
function test_alloc1(line)
    counter = 0
    r = pp1((x)->counter+=1, line)
end
@allocated test_alloc1(test_line1)

But this doesn’t:

const test_line2 = "\"x\":1622764914150,";
const fmt2 = "\"x\":%ld,\0";
function pp2(func::Function, line::AbstractString)
    x_ref = Ref{Clong}();
    result = ccall(:sscanf, Cint, (Cstring, Ptr{UInt8}, Ptr{Clong}), pointer(line), fmt2, x_ref)
    # println("$(x_ref[])")
    return result
end
function test_alloc2(line)
    counter = 0
    r = pp2((x)->counter+=1, line)
end
@allocated test_alloc2(test_line2)

I’ve left inside commented out printlns which you can remove to check that the ccall is actually doing the right thing.

EDIT:

I’ve found that the @code_typed output of these two functions is quite different. The Type{Core.Box} in the first one looks super suspicious to me.

julia> @code_typed test_alloc1(test_line1)
CodeInfo(
1 ─ %1 = Core.Box::Type{Core.Box}
│   %2 = %new(%1)::Core.Box
│        Core.setfield!(%2, :contents, 0)::Int64
│   %4 = %new(Main.:(var"#15#16"), %2)::var"#15#16"
│   %5 = invoke Main.pp1(%4::Function, line::String)::Int32
└──      return %5
) => Int32

julia> @code_typed test_alloc2(test_line2)
CodeInfo(
1 ─ %1 = %new(Base.RefValue{Int64})::Base.RefValue{Int64}
│   %2 = $(Expr(:foreigncall, :(:jl_string_ptr), Ptr{UInt8}, svec(Any), 0, :(:ccall), Core.Argument(2)))::Ptr{UInt8}
│   %3 = Base.bitcast(Base.Cstring, %2)::Cstring
│   %4 = Main.fmt2::String
│   %5 = $(Expr(:foreigncall, :(:jl_string_ptr), Ptr{UInt8}, svec(Any), 0, :(:ccall), :(%4)))::Ptr{UInt8}
│   %6 = $(Expr(:foreigncall, :(:jl_value_ptr), Ptr{Nothing}, svec(Any), 0, :(:ccall), :(%1)))::Ptr{Nothing}
│   %7 = Base.bitcast(Ptr{Int64}, %6)::Ptr{Int64}
│   %8 = $(Expr(:foreigncall, :(:sscanf), Int32, svec(Cstring, Ptr{UInt8}, Ptr{Int64}), 0, :(:ccall), :(%3), :(%5), :(%7), :(%1), :(%4), :(%3)))::Int32
└──      return %8
) => Int32

EDIT2:
Some further hints. In the first code, replacing counter+=1 with counter+1 makes the boxing disappear, and allocations drop from 32 to 16. The @code_typed in this case becomes:

julia> @code_typed test_alloc1(test_line1)
CodeInfo(
1 ─ %1 = π (0, Int64)
│   %2 = %new(var"#31#32"{Int64}, %1)::var"#31#32"{Int64}
│   %3 = invoke Main.pp1(%2::Function, line::String)::Int32
└──      return %3
) => Int32

So the questions remain: why is boxing necessary? counter is Int and so is 1. What is the %new in the latest @code_typed?

Looks a lot like JuliaLang/julia issue #15276, "performance of captured variables in closures".
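As I understand that issue, a captured variable that is reassigned inside the closure gets wrapped in a Core.Box, whatever its type. A small sketch of the pattern and of one common workaround (the function names are made up for illustration): capture a Ref and mutate its contents, so the binding itself is never reassigned.

```julia
# `counter` is reassigned inside the closure, so it is captured as a Core.Box.
function count_boxed(n)
    counter = 0
    f = x -> counter += 1
    for i in 1:n
        f(i)
    end
    return counter
end

# `counter` always refers to the same Ref; only its contents change,
# so the closure can capture it with a concrete type.
function count_ref(n)
    counter = Ref(0)
    f = x -> counter[] += 1
    for i in 1:n
        f(i)
    end
    return counter[]
end
```

Comparing @code_warntype count_boxed(3) with @code_warntype count_ref(3) should show the Core.Box only in the first version.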

Thanks for the link, very interesting.

I think you might be on to something.

OK, so what’s the alternative here?
This function is a parser parse(func::Function, line::AbstractString), which parses data into types and calls func on those types:

for line in lines
    parse(handler, line)
end

But of course, for the handler to do something useful with the data it has to put it somewhere, so I have something which is conceptually like

v = []
for line in lines
    parse( (x)->push!(v, x), line)
end

I get that in this example I might allocate on push!, but A) it’s amortized, B) I could pick a different container, and C) it still doesn’t explain the box. The problem is that to get the data out of the handler, the handler has to be a closure. What do you suggest?
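For what it’s worth, a sketch of how this pattern might look without a box, under a few assumptions of mine: the parser is called parse_line here (a hypothetical stand-in, which also avoids clashing with Base.parse), the container has a concrete element type, and the closure only mutates v without ever reassigning it.

```julia
# Hypothetical stand-in for the real parser: converts the line and calls
# the handler. The `where {F}` asks Julia to specialize on the handler type.
function parse_line(handler::F, line::AbstractString) where {F}
    handler(parse(Int, line))
    return nothing
end

function collect_values(lines)
    v = Int[]                        # concretely typed, unlike `v = []`
    for line in lines
        # `v` is mutated but never reassigned, so no Core.Box is needed.
        parse_line(x -> push!(v, x), line)
    end
    return v
end

collect_values(["1", "2", "3"])
```

The box in the earlier example comes from counter += 1 rebinding the captured variable; push!(v, x) only mutates the object that v refers to.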