How to convert a pointer to String back to a String?


#1

I can create a pointer to a String using pointer. If I have a pointer to a string, what’s the way to “convert” it back? Basically, I want to create a String whose first byte is the one the pointer points at.

s = "abc"
ptr = pointer(s)
# how do I convert the ptr back to a string? By pointing to it?

Update
Actually, I am looking for a way to make xp and yp the same, so as to conserve memory.

x = "abc"
xp = pointer(x)
y = unsafe_string(xp)
yp = pointer(y)
yp == xp # false

#2
julia> s = "abc"
"abc"

julia> p = pointer(s)
Ptr{UInt8} @0x0000000121d16058

julia> unsafe_string(p)
"abc"

Perhaps also of interest:

julia> unsafe_load(p)
0x61

julia> unsafe_load(p+1)
0x62

julia> unsafe_load(p+2)
0x63

julia> unsafe_load(p+3)
0x00
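
For readers on newer Julia versions, here is a minimal sketch of the same calls wrapped in a function. It assumes Julia 0.7 or later, where GC.@preserve is available to keep the string alive while its raw pointer is in use; the function name peek_string is made up for illustration.

```julia
# Hypothetical helper (not from the thread): read a string back through its pointer.
function peek_string(s::String)
    GC.@preserve s begin
        p = pointer(s)
        whole  = unsafe_string(p)      # copies bytes up to the terminating NUL
        prefix = unsafe_string(p, 2)   # copies exactly 2 bytes
        # unsafe_load(p, i) is 1-based, so i = 1 reads the byte at p itself
        bytes  = [unsafe_load(p, i) for i in 1:sizeof(s)]
    end
    return whole, prefix, bytes
end

whole, prefix, bytes = peek_string("abc")
```

Both unsafe_string forms return a fresh String, so neither shares memory with the original.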

#3

Yep. Just don’t forget to use Base.unsafe_string or import Base.unsafe_string.


#4

What version of Julia? In 0.6.1, unsafe_string doesn’t need to be imported.


#5

I am using 0.6.2.

Also

x = "abc"
xp = pointer(x)
y = unsafe_string(xp)
yp = pointer(y)
yp == xp # false

Actually I was looking for a way so that xp and yp are the same.


#6

unsafe_string makes a copy of the string, as you discovered. Why not just set yp equal to xp if you want two pointers to the same string?


#7

My use case is string grouping. I have an array of Strings, and many of its elements point to the same underlying location. So instead of grouping the strings by their underlying value (slow!), I group them by their pointer (fast!), but I still want the underlying string values back in the final step of the algorithm.

Anyway, I can just call unsafe_string once for each group, I guess.
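
A rough sketch of that idea, under two assumptions that are not from the thread: the array keeps every string alive for the duration of the call, and the garbage collector does not move strings while grouping runs. Instead of calling unsafe_string, this variant stores the first String seen for each pointer, which avoids the copy altogether. The name group_by_pointer is hypothetical.

```julia
# Hypothetical helper: group array indices by the address of each string's data.
function group_by_pointer(strs::Vector{String})
    groups = Dict{Ptr{UInt8}, Tuple{String, Vector{Int}}}()
    for (i, s) in enumerate(strs)
        # Keep the first string seen at this address as the group's representative.
        _, idxs = get!(() -> (s, Int[]), groups, pointer(s))
        push!(idxs, i)
    end
    return collect(values(groups))   # order of groups is unspecified
end

a = string("ab", "c")
b = string("ab", "c")                # same contents, separate allocation
result = group_by_pointer([a, b, a])
```

Note that a and b land in different groups even though their contents are equal, so this only helps when equal strings already share storage.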


#8

Not sure if you want to do this in production code 😛:

julia> unsafe_pointer_to_objref(xp-8) === x
true
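
A hedged note on where the 8 comes from: in current Julia implementations a String object appears to be laid out as one pointer-sized length field followed by the bytes, so stepping back by the word size lands on the object header. This is an undocumented implementation detail and may change. The sketch below uses sizeof(Ptr{Cvoid}) instead of a hard-coded 8 and compares data pointers, since on recent versions === on strings compares contents rather than identity.

```julia
x = string("ab", "c")
header = sizeof(Ptr{Cvoid})          # assumed size of the length field before the data

same_storage = GC.@preserve x begin
    xp = pointer(x)
    y = unsafe_pointer_to_objref(xp - header)
    # y should be the very same String object, so its data pointer matches xp
    y isa String && pointer(y) == xp
end
```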

#9

You generally don’t want to take pointers to strings like this. The main reason you’d want a pointer to a string is to interact with C, and for that you’ll want to use the Cstring type in the ccall signature. It checks that the string has no embedded nulls and ensures that the string data is null-terminated. See

https://docs.julialang.org/en/stable/manual/calling-c-and-fortran-code/

Edit: if you just want a pointer and a length, then you’ll want to use Ptr{UInt8} for the pointer field and Csize_t for the length (according to the C API, of course).
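
As an illustration of both signatures, here is a small sketch that calls the C library's strlen and memcmp. It assumes those symbols are reachable from the running Julia process, which is normally the case.

```julia
# Cstring: the string is checked for embedded NULs before the call.
n = ccall(:strlen, Csize_t, (Cstring,), "abc")

# A string with an embedded NUL is rejected instead of being silently truncated.
rejected = try
    ccall(:strlen, Csize_t, (Cstring,), "a\0bc")
    false
catch err
    err isa ArgumentError
end

# Pointer plus length: Ptr{UInt8} for the data, Csize_t for the byte count.
a, b = "abc", "abd"
c = ccall(:memcmp, Cint, (Ptr{UInt8}, Ptr{UInt8}, Csize_t), a, b, sizeof(a))
```

Passing the String itself and letting ccall do the conversion means there is no need to call pointer by hand, and the string is kept alive for the duration of the call.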


#10

What about interning the strings? There is a package now that does just that.
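
To show the idea only, here is a hand-rolled sketch of interning with a plain Dict. It is not the API of the package mentioned above; the name intern! and the pool argument are made up for illustration.

```julia
# Hypothetical helper: return the pooled instance for s, adding s if it is new.
intern!(pool::Dict{String, String}, s::String) = get!(pool, s, s)

pool = Dict{String, String}()
a = string("ab", "c")
b = string("ab", "c")                # equal contents, separate allocation

distinct_before = pointer(a) != pointer(b)
ia, ib = intern!(pool, a), intern!(pool, b)
shared_after = pointer(ia) == pointer(ib)
```

Once every string has gone through the pool, equal strings share one object, so grouping by pointer and grouping by value give the same result.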


#11

Yeah, at some point I will make an optimisation for that, which is basically pointer-based sorting/grouping (which, by the way, is already 40% faster than R’s string radixsort at this stage).