I just wrapped the C function strtok in Julia as practice. I wanted to try it on my own, using only the documentation.
Can I leave this approach as it is, or are there better options? I wasn’t sure how best to handle the NULL case.
Yes! Thanks for pointing this out. I was about to write to @mwolff that in this case the user must ensure that str is not deallocated by the GC. It works in the example because the string is stored in a global variable, right?
@yuyichao, just to improve my understanding: I know that you cannot mutate a string in Julia, so if I pass a string as an argument to a function, it should not be modified after the function is called. However, in this example it is indeed modified:
stok = "- This, a sample string."
println(stok)
p = strtok(stok, " ,.-")
println(stok)
julia> include("strok.jl")
- This, a sample string.
- This a sample string.
It’ll also simply work if the compiler and GC don’t feel like messing with it. You are merely giving them a license to free it, which may happen at any time.
You can certainly write code that mutates a string in Julia; it’s just that doing so is undefined behavior. You cannot expect Julia to act in any sane way after that point.
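One way to stay clear of this is to hand strtok a mutable copy of the bytes instead of the String itself. A minimal sketch, assuming a Unix-like system where the bare `:strtok` symbol resolves in the running process (that lookup is my assumption, not the code posted in this thread):

```julia
s = "- This, a sample string."

# Copy the bytes into a mutable buffer and append the NUL terminator C expects.
buf = Vector{UInt8}(codeunits(s))
push!(buf, 0x00)

# Keep `buf` rooted while the raw pointer returned by strtok is still in use.
first_token = GC.@preserve buf begin
    tok = ccall(:strtok, Ptr{UInt8}, (Ptr{UInt8}, Cstring), pointer(buf), " ,.-")
    tok == C_NULL ? "" : unsafe_string(tok)
end

println(first_token)  # first token found by strtok
println(s)            # the String was never handed to C, so it is unchanged
```

strtok still writes a NUL over the delimiter that ends the token, but it now does so in `buf`, which Julia allows you to mutate.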
If you have a little spare time, could you take a look at this modified version? This kind of integration is something I am really interested in:
function strtok(str::Union{Nothing,Vector{String}}, delim::String)
    if str === nothing
        ptr = Cstring(C_NULL)
    else
        if length(str) != 1
            error("The vector `str` must have only one element.")
        end
        GC.@preserve str (ptr = convert(Cstring, pointer(str[1])))
    end
    ptr_tok = ccall((:strtok, "libc"), Cstring, (Cstring, Cstring), ptr, delim)
    tok = ptr_tok != C_NULL ? unsafe_string(ptr_tok) : ""
    return tok
end
stok = ["- This, a sample string."]
p = strtok(stok, " ,.-")
while p ≠ ""
    global p
    println("Token : $p")
    p = strtok(nothing, " ,.-")
end
But according to the documentation, Cstring (with a String) should be used for char* arguments of C functions. A Vector, passed as Ptr{UInt8}, is only needed if I have to allocate the memory myself. I think strtok does this internally, because the original string is broken up and new strings are returned. Sorry, I have to translate everything from German to English…
No. Passing a pointer simply means the caller is now responsible for making sure the Julia object doesn’t get freed; there’s absolutely no way you can cheat on this. If you pass a pointer that comes from a string, then it’s wrong.
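To make the scope of that responsibility concrete: `GC.@preserve` only protects the object while the expression it wraps is running, so every use of the raw pointer has to sit inside that expression. A small sketch using `strlen` as a harmless stand-in for a C function (again assuming the symbol resolves in the running process):

```julia
buf = Vector{UInt8}(codeunits("hello"))
push!(buf, 0x00)  # NUL terminator for the C side

# Not enough: the protection ends with the wrapped expression,
# but the pointer would be used afterwards.
#   p = GC.@preserve buf pointer(buf)
#   n = ccall(:strlen, Csize_t, (Ptr{UInt8},), p)

# Every use of the raw pointer is inside the preserved region.
n = GC.@preserve buf begin
    p = pointer(buf)
    ccall(:strlen, Csize_t, (Ptr{UInt8},), p)
end

println(n)
```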
Now I see. That would answer tons of questions I had about random segmentation faults while creating TextUserInterfaces.jl. Some ncurses functions need a null-terminated string (char*) to create menus. That string must not be deallocated, since it is not copied anywhere; only the pointer is stored.
Since I need a mutable type, I came up with this solution, in which I convert the string to a const UInt8 array. Of course, this will only work for ASCII characters:
const _vstr = UInt8[]
function strtok(str::Union{Nothing,String}, delim::String)
    GC.@preserve _vstr begin
        if str === nothing
            ptr = C_NULL
        else
            # Empty the current string vector.
            empty!(_vstr)
            # Create the new string vector.
            for c in str
                push!(_vstr, UInt8(c))
            end
            # We need a null-terminated string.
            push!(_vstr, 0x00)
            # Get the pointer.
            ptr = convert(Ptr{Cvoid}, pointer(_vstr))
        end
        ptr_tok = ccall((:strtok, "libc"), Cstring, (Ptr{Cvoid}, Cstring), ptr, delim)
        tok = ptr_tok != C_NULL ? unsafe_string(ptr_tok) : ""
        return tok
    end
end
p = strtok("- This, a sample string.", " ,.-")
while p ≠ ""
    global p
    println("Token : $p")
    p = strtok(nothing, " ,.-")
end
In this case, is everything right? (I know it is ugly, but is it at least right?)
It was the very first thing that came to my mind. I cannot declare it as a local inside the function, because it would get deallocated, right? AFAIK, strtok stores the pointer and uses it when called with NULL as input.
@yuyichao, by the way, I am wondering: if the C function does not change the string and only reads it (ncurses does that), would it then be wrong to pass the pointer to the string?
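A possible alternative to the global buffer, as a sketch only: since strtok keeps the pointer between calls, a local buffer should be fine as long as the whole tokenization loop runs while that buffer is preserved. The helper name `tokenize` is hypothetical, and the bare `:strtok` lookup assumes a Unix-like system where the symbol resolves in the running process:

```julia
# Hypothetical helper: runs every strtok call for one input inside a single
# preserved region, so the saved pointer never outlives the buffer.
function tokenize(str::String, delim::String)
    buf = Vector{UInt8}(codeunits(str))  # mutable copy of the bytes
    push!(buf, 0x00)                     # NUL terminator for the C side
    tokens = String[]
    GC.@preserve buf begin
        tok = ccall(:strtok, Ptr{UInt8}, (Ptr{UInt8}, Cstring), pointer(buf), delim)
        while tok != C_NULL
            # unsafe_string copies the bytes, so the result does not depend on buf.
            push!(tokens, unsafe_string(tok))
            tok = ccall(:strtok, Ptr{UInt8}, (Ptr{UInt8}, Cstring), C_NULL, delim)
        end
    end
    return tokens
end

for t in tokenize("- This, a sample string.", " ,.-")
    println("Token : $t")
end
```

Because the bytes are copied with `codeunits`, this version is not limited to ASCII input. strtok itself still keeps hidden global state, so interleaving two tokenizations or calling it from several threads remains unsafe.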