Errors while writing to CuDeviceArray buffers

I have a function that is called from my kernel:

function foo(F, material, state, buffer,fiber,family)
    grid = material.convexstrategy.grid
    grid_F = material.convexstrategy.grid_F
    𝐠 = material.convexstrategy.g
    # compute some things and write into grid,grid_F and 𝐠
    #...
    #....
    plus_idx =  findfirst(x -> x >= F[1], grid)
    minus_idx = findlast(x -> x <= F[1], grid)
    ipnext = findfirst(x->x>F[1],grid)
    iplast = findlast(x->x<F[1],grid)

    # What I'd like to do:
    #buffer.W_min[fiber*family] = (𝐠[iplast] + ((𝐠[ipnext] - 𝐠[iplast])/(grid[ipnext] - grid[iplast])) * (F[1] - grid[iplast]))
    #buffer.F_points[fiber*family,1] = grid_F[plus_idx] 
    #buffer.F_points[fiber*family,2] = grid_F[minus_idx]
    # But only constant assignments work, such as:
    buffer.W_min[fiber*family] = (𝐠[1] + ((𝐠[2] - 𝐠[1])/(grid[2] - grid[1])) * (F[1] - grid[1]))
    buffer.F_points[fiber*family,1] = grid_F[1]
    buffer.F_points[fiber*family,2] = grid_F[2]
    return nothing
end

fiber and family are my thread ids; buffer is a struct that holds a few CuArrays, which are converted with Adapt.jl to the corresponding CuDeviceArrays.
If I uncomment the "What I’d like to do" lines and comment out the constant assignments, I get string-related errors that I do not understand:

Reason: unsupported dynamic function invocation (call to print)
Stacktrace:
  [1] print_to_string
    @ strings/io.jl:135
  [2] string
    @ strings/io.jl:174
  [3] to_index
    @ indices.jl:300
  [4] to_index
    @ indices.jl:277
  [5] to_indices
    @ indices.jl:333
  [6] to_indices
    @ indices.jl:325
  [7] getindex
    @ abstractarray.jl:1170
  [8] convexify
    @ ~/convexified-damage/src/convexification.jl:77
  [9] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [10] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] _growend!
    @ array.jl:884
  [2] resize!
    @ array.jl:1104
  [3] print_to_string
    @ strings/io.jl:137
  [4] string
    @ strings/io.jl:174
  [5] to_index
    @ indices.jl:300
  [6] to_index
    @ indices.jl:277
  [7] to_indices
    @ indices.jl:333
  [8] to_indices
    @ indices.jl:325
  [9] getindex
    @ abstractarray.jl:1170
 [10] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [11] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [12] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] _deleteend!
    @ array.jl:893
  [2] resize!
    @ array.jl:1109
  [3] print_to_string
    @ strings/io.jl:137
  [4] string
    @ strings/io.jl:174
  [5] to_index
    @ indices.jl:300
  [6] to_index
    @ indices.jl:277
  [7] to_indices
    @ indices.jl:333
  [8] to_indices
    @ indices.jl:325
  [9] getindex
    @ abstractarray.jl:1170
 [10] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [11] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [12] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] String
    @ strings/string.jl:53
  [2] print_to_string
    @ strings/io.jl:137
  [3] string
    @ strings/io.jl:174
  [4] to_index
    @ indices.jl:300
  [5] to_index
    @ indices.jl:277
  [6] to_indices
    @ indices.jl:333
  [7] to_indices
    @ indices.jl:325
  [8] getindex
    @ abstractarray.jl:1170
  [9] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [10] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [11] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] _string_n
    @ strings/string.jl:74
  [2] StringVector
    @ iobuffer.jl:31
  [3] #IOBuffer#361
    @ iobuffer.jl:114
  [4] print_to_string
    @ strings/io.jl:133
  [5] string
    @ strings/io.jl:174
  [6] to_index
    @ indices.jl:300
  [7] to_index
    @ indices.jl:277
  [8] to_indices
    @ indices.jl:333
  [9] to_indices
    @ indices.jl:325
 [10] getindex
    @ abstractarray.jl:1170
 [11] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [12] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [13] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
  [1] unsafe_wrap
    @ strings/string.jl:85
  [2] StringVector
    @ iobuffer.jl:31
  [3] #IOBuffer#361
    @ iobuffer.jl:114
  [4] print_to_string
    @ strings/io.jl:133
  [5] string
    @ strings/io.jl:174
  [6] to_index
    @ indices.jl:300
  [7] to_index
    @ indices.jl:277
  [8] to_indices
    @ indices.jl:333
  [9] to_indices
    @ indices.jl:325
 [10] getindex
    @ abstractarray.jl:1170
 [11] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [12] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [13] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported call through a literal pointer (call to __memset_avx2_unaligned)
Stacktrace:
  [1] fill!
    @ array.jl:406
  [2] #IOBuffer#361
    @ iobuffer.jl:121
  [3] print_to_string
    @ strings/io.jl:133
  [4] string
    @ strings/io.jl:174
  [5] to_index
    @ indices.jl:300
  [6] to_index
    @ indices.jl:277
  [7] to_indices
    @ indices.jl:333
  [8] to_indices
    @ indices.jl:325
  [9] getindex
    @ abstractarray.jl:1170
 [10] convexify
    @ ~/convexified-damage/src/convexification.jl:77
 [11] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [12] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584
Reason: unsupported dynamic function invocation (call to var"#sprint#385"(context, sizehint::Integer, ::typeof(sprint), f::Function, args...) in Base at strings/io.jl:100)
Stacktrace:
  [1] #repr#386
    @ strings/io.jl:219
  [2] limitrepr
    @ strings/io.jl:221
  [3] to_index
    @ indices.jl:300
  [4] to_index
    @ indices.jl:277
  [5] to_indices
    @ indices.jl:333
  [6] to_indices
    @ indices.jl:325
  [7] getindex
    @ abstractarray.jl:1170
  [8] convexify
    @ ~/convexified-damage/src/convexification.jl:77
  [9] constitutive_driver
    @ ~/convexified-damage/src/material.jl:654
 [10] integrate_fiber
    @ ~/convexified-damage/src/material.jl:584

Line 77 corresponds to the non-constant buffer.W_min assignment.

If I leave that line commented out and instead uncomment one of the buffer.F_points lines, e.g. buffer.F_points[i*j,1] = grid_F[plus_idx], I get a stack trace that no longer points to my code. Note that all of this is with the -g 2 flag.

Reason: unsupported dynamic function invocation (call to print)
Stacktrace:
 [1] print_to_string
   @ strings/io.jl:135
 [2] string
   @ strings/io.jl:174
 [3] to_index
   @ indices.jl:300
 [4] to_index
   @ indices.jl:277
 [5] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] _growend!
   @ array.jl:884
 [2] resize!
   @ array.jl:1104
 [3] print_to_string
   @ strings/io.jl:137
 [4] string
   @ strings/io.jl:174
 [5] to_index
   @ indices.jl:300
 [6] to_index
   @ indices.jl:277
 [7] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] _deleteend!
   @ array.jl:893
 [2] resize!
   @ array.jl:1109
 [3] print_to_string
   @ strings/io.jl:137
 [4] string
   @ strings/io.jl:174
 [5] to_index
   @ indices.jl:300
 [6] to_index
   @ indices.jl:277
 [7] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] String
   @ strings/string.jl:53
 [2] print_to_string
   @ strings/io.jl:137
 [3] string
   @ strings/io.jl:174
 [4] to_index
   @ indices.jl:300
 [5] to_index
   @ indices.jl:277
 [6] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] _string_n
   @ strings/string.jl:74
 [2] StringVector
   @ iobuffer.jl:31
 [3] #IOBuffer#361
   @ iobuffer.jl:114
 [4] print_to_string
   @ strings/io.jl:133
 [5] string
   @ strings/io.jl:174
 [6] to_index
   @ indices.jl:300
 [7] to_index
   @ indices.jl:277
 [8] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to )
Stacktrace:
 [1] unsafe_wrap
   @ strings/string.jl:85
 [2] StringVector
   @ iobuffer.jl:31
 [3] #IOBuffer#361
   @ iobuffer.jl:114
 [4] print_to_string
   @ strings/io.jl:133
 [5] string
   @ strings/io.jl:174
 [6] to_index
   @ indices.jl:300
 [7] to_index
   @ indices.jl:277
 [8] multiple call sites
   @ unknown:0
Reason: unsupported call through a literal pointer (call to __memset_avx2_unaligned)
Stacktrace:
 [1] fill!
   @ array.jl:406
 [2] #IOBuffer#361
   @ iobuffer.jl:121
 [3] print_to_string
   @ strings/io.jl:133
 [4] string
   @ strings/io.jl:174
 [5] to_index
   @ indices.jl:300
 [6] to_index
   @ indices.jl:277
 [7] multiple call sites
   @ unknown:0
Reason: unsupported dynamic function invocation (call to var"#sprint#385"(context, sizehint::Integer, ::typeof(sprint), f::Function, args...) in Base at strings/io.jl:100)
Stacktrace:
 [1] #repr#386
   @ strings/io.jl:219
 [2] limitrepr
   @ strings/io.jl:221
 [3] to_index
   @ indices.jl:300
 [4] to_index
   @ indices.jl:277
 [5] multiple call sites
   @ unknown:0

Am I missing something conceptually? If I can write into my buffers using constant indices, the data types should match. I only get errors when I use dynamic indices such as plus_idx, which are obtained from findfirst and similar functions. I checked that the indices have integer values, so they aren’t of type Nothing.

I’d be grateful if anyone could point out conceptual problems or give hints for debugging this further. I tried @device_code_warntype interactive=true, but couldn’t make sense of the IR for this portion of the code.

This should be the relevant part of the @device_code_warntype output:

294 ─ %910 = Base.getfield(buffer, :F_points)::CuDeviceMatrix{Tensor{2, 1, Float64, 1}, 1}
│     %911 = Base.mul_int(fiber, family)::Int64
│     %912 = Core.tuple(%911, 2)::Tuple{Int64, Int64}
│     %913 = Base.getfield(%910, :shape)::Tuple{Int64, Int64}
│     %914 = Base.getfield(%913, 1, true)::Int64
│     %915 = Base.slt_int(%914, 0)::Bool
│     %916 = Base.ifelse(%915, 0, %914)::Int64
│     %917 = Base.getfield(%913, 2, true)::Int64
│     %918 = Base.slt_int(%917, 0)::Bool
│     %919 = Base.ifelse(%918, 0, %917)::Int64
│     %920 = Base.sle_int(1, %911)::Bool
│     %921 = Base.sle_int(%911, %916)::Bool
│     %922 = Base.and_int(%920, %921)::Bool
│     %923 = Base.sle_int(1, 2)::Bool
│     %924 = Base.sle_int(2, %919)::Bool
│     %925 = Base.and_int(%923, %924)::Bool
│     %926 = Base.and_int(%925, true)::Bool
│     %927 = Base.and_int(%922, %926)::Bool
└────        goto #296 if not %927
295 ─        goto #297
296 ─        invoke Base.throw_boundserror(%910::CuDeviceMatrix{Tensor{2, 1, Float64, 1}, 1}, %912::Tuple{Int64, Int64})::Union{}
└────        unreachable
297 ─ %932 = Base.getfield(%910, :shape)::Tuple{Int64, Int64}
│     %933 = Base.getfield(%932, 1, true)::Int64
│     %934 = Base.slt_int(%933, 0)::Bool
│     %935 = Base.ifelse(%934, 0, %933)::Int64
│     %936 = Base.sub_int(%935, 0)::Int64
│     %937 = Base.mul_int(1, %936)::Int64
│     %938 = Base.sub_int(%911, 1)::Int64
│     %939 = Base.mul_int(%938, 1)::Int64
│     %940 = Base.add_int(1, %939)::Int64
│     %941 = Base.sub_int(2, 1)::Int64
│     %942 = Base.mul_int(%941, %937)::Int64
│     %943 = Base.add_int(%940, %942)::Int64
└────        goto #298
298 ─ %945 = Base.getfield(%910, :ptr)::Core.LLVMPtr{Tensor{2, 1, Float64, 1}, 1}
│     %946 = Base.llvmcall::Core.IntrinsicFunction
│     %947 = Core.tuple("; ModuleID = 'llvmcall'\nsource_filename = \"llvmcall\"\n\n; Function Attrs: alwaysinline\ndefine void @entry(i8 addrspace(1)* %0, [1 x [1 x double]] %1, i64 %2) #0 {\nentry:\n  %3 = bitcast i8 addrspace(1)* %0 to [1 x [1 x double]] addrspace(1)*\n  %4 = getelementptr inbounds [1 x [1 x double]], [1 x [1 x double]] addrspace(1)* %3, i64 %2\n  store [1 x [1 x double]] %1, [1 x [1 x double]] addrspace(1)* %4, align 8, !tbaa !0\n  ret void\n}\n\nattributes #0 = { alwaysinline }\n\n!0 = !{!1, !1, i64 0, i64 0}\n!1 = !{!\"custom_tbaa_addrspace(1)\", !2, i64 0}\n!2 = !{!\"custom_tbaa\"}\n", "entry")::Tuple{String, String}
│     %948 = Base.sub_int(%943, 1)::Int64
│            (%946)(%947, Nothing, Tuple{Core.LLVMPtr{Tensor{2, 1, Float64, 1}, 1}, Tensor{2, 1, Float64, 1}, Int64}, %945, %904, %948)::Nothing
└────        goto #299
299 ─        goto #300
300 ─        goto #301
301 ─        goto #302
302 ─        nothing::Nothing
303 ┄        return ConvexDamage.nothing
)

findfirst and findlast are the kinds of array operations that are not supported by CuDeviceArray, so that’s probably why indexing doesn’t work.
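
A possible reading of the stack traces, offered as a guess rather than a diagnosis: every trace passes through to_index at indices.jl:300, which appears to be the fallback that throws "invalid index" for index types that are not supported. findfirst and findlast return nothing when there is no match, so the compiler may have to compile the indexing path for a Nothing index as well, and that path builds an error string. The following CPU-only sketch shows that behaviour; the grid values are made up for illustration and are not from the original code.

```julia
# CPU-only illustration; no GPU or CUDA.jl is needed to run this.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]   # made-up sample grid

# A match exists: the result is an Int.
hit = findfirst(x -> x >= 0.3, grid)

# No match: the result is `nothing`, so the possible results are Int or Nothing.
miss = findfirst(x -> x >= 2.0, grid)

# Indexing with `nothing` takes the error path, which formats a message string.
err = try
    grid[miss]
catch e
    e
end

println(hit, " ", miss === nothing, " ", err isa ArgumentError)
```

Even if the indices are always integers at run time, the error branch still has to compile for the kernel, which would explain why only the constant-index version works.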

Damn, is there a substitute, or should I write my own small function? I guess it’s not supported because of the type instability?

Is there an overview somewhere of which operations aren’t supported?
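
One way to write such a small function, as a sketch only: plain scalar loops that return an Int in every branch, with 0 as a "not found" sentinel instead of nothing. The names first_index_ge and last_index_le are hypothetical (they are not part of Base or CUDA.jl), and I have only checked this on the CPU, not inside a kernel.

```julia
# Hypothetical helpers; the names are illustrative, not from Base or CUDA.jl.
# Each returns an Int in every branch; 0 signals "no match".

# Index of the first element >= x, or 0 if there is none.
function first_index_ge(grid, x)
    for i in 1:length(grid)
        grid[i] >= x && return i
    end
    return 0
end

# Index of the last element <= x, or 0 if there is none.
function last_index_le(grid, x)
    for i in length(grid):-1:1
        grid[i] <= x && return i
    end
    return 0
end

grid = [0.0, 0.25, 0.5, 0.75, 1.0]   # made-up sample grid
println(first_index_ge(grid, 0.3), " ", last_index_le(grid, 0.3))
```

The strict variants (> and <) follow the same pattern. The caller has to check for 0 before using the result as an index.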

No, union types are supported. It’s not implemented because kernel code is supposed to be scalar. If the input to such functions were large, you’d have a single thread iterating in a very data-divergent manner, which wouldn’t perform well at all. Generally, array operations are to be called from the host (spawning many threads to perform the operation in parallel), while kernel code performs scalar operations.

One notable exception is StaticArrays, but that’s intended to be used on small sets of values, which can be fine in the context of GPU kernels.

Yeah, I know that kernel code should be scalar, but I have a use case with several thousand independent evaluations of a fairly simple function that cannot be expressed as purely scalar operations. There will be some drawbacks, but I hope it will eventually outperform my threaded CPU version.
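
If the grid happens to be sorted in ascending order (an assumption; the thread does not say so), the per-thread search could be a binary search, which keeps the work per evaluation logarithmic in the grid size. Below is a minimal sketch of the bracketing plus the linear interpolation from the W_min assignment. bracket_index and interpolate_linear are hypothetical names, the sample data is made up, and this has only been run on the CPU.

```julia
# Hypothetical helper: assumes `grid` is sorted ascending with length >= 2.
# Returns lo such that grid[lo] <= x <= grid[lo + 1]; out-of-range x is
# clamped to the first or last interval. Always returns an Int.
function bracket_index(grid, x)
    lo, hi = 1, length(grid)
    while hi - lo > 1
        mid = (lo + hi) >>> 1
        if grid[mid] <= x
            lo = mid
        else
            hi = mid
        end
    end
    return lo
end

# Linear interpolation of the samples g (defined on grid) at x.
# For x outside the grid this extrapolates from the nearest interval.
function interpolate_linear(grid, g, x)
    lo = bracket_index(grid, x)
    hi = lo + 1
    slope = (g[hi] - g[lo]) / (grid[hi] - grid[lo])
    return g[lo] + slope * (x - grid[lo])
end

grid = [0.0, 0.25, 0.5, 0.75, 1.0]   # made-up sample grid
g = grid .^ 2                        # made-up sample values
println(interpolate_linear(grid, g, 0.3))
```

Note that this differs from the findfirst/findlast version when x lies exactly on a grid point: here the bracket is always a single cell, whereas the strict comparisons in the original pick the neighbours on either side.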