Type stability on Arrays of Arrays in nested for loops?

swishmas · February 19, 2021, 2:31pm

I was writing a function that would ideally be written as

  function ideal(supervec::Array{Array{Int64,1},1}, vec::Array{Int64,1})
    return vcat(supervec[vec])
  end

But this takes a bit more time than I think is unnecessary

julia> using BenchmarkTools

julia> @btime ideal([[1],[2]],[1,2])
  232.497 ns (6 allocations: 576 bytes)
2-element Array{Array{Int64,1},1}:
 [1]
 [2]

So, I tried to write a longer version

  function simplecount(supervec::Array{Array{Int64,1},1}, vec::Array{Int64,1})
    consize = sum(a->length(supervec[a]),vec)
    con = Array{Int64,1}(undef,consize)
    counter = 0
  
    for j = 1:length(vec)
      b = vec[j]
      altvec = supervec[b]
      for r = 1:length(supervec[b])
        p = altvec[r]
        counter += 1
        con[counter] = p
      end
    end
    return con
  end

This performs better

julia> @btime simplecount([[1],[2]],[1,2])
  177.572 ns (5 allocations: 480 bytes)
2-element Array{Int64,1}:
 1
 2

But it has a type instability when I use @code_warntype

julia> @code_warntype simplecount([[1],[2]],[1,2])
Variables
  #self#::Core.Compiler.Const(simplecount, false)
  supervec::Array{Array{Int64,1},1}
  vec::Array{Int64,1}
  #97::var"#97#98"{Array{Array{Int64,1},1}}
  val::Nothing
  consize::Int64
  con::Array{Int64,1}
  counter::Int64
  @_9::Union{Nothing, Tuple{Int64,Int64}}
  j::Int64
  b::Int64
  altvec::Array{Int64,1}
  @_13::Union{Nothing, Tuple{Int64,Int64}}
  r::Int64
  p::Int64

Body::Array{Int64,1}
1 ─       Core.NewvarNode(:(val))
│   %2  = Main.:(var"#97#98")::Core.Compiler.Const(var"#97#98", false)
│   %3  = Core.typeof(supervec)::Core.Compiler.Const(Array{Array{Int64,1},1}, false)
│   %4  = Core.apply_type(%2, %3)::Core.Compiler.Const(var"#97#98"{Array{Array{Int64,1},1}}, false)
│         (#97 = %new(%4, supervec))
│   %6  = #97::var"#97#98"{Array{Array{Int64,1},1}}
│         (consize = Main.sum(%6, vec))
│   %8  = Core.apply_type(Main.Array, Main.intType, 1)::Core.Compiler.Const(Array{Int64,1}, false)
│         (con = (%8)(Main.undef, consize))
│         (counter = 0)
│         $(Expr(:inbounds, true))
│   %12 = Main.length(vec)::Int64
│   %13 = (1:%12)::Core.Compiler.PartialStruct(UnitRange{Int64}, Any[Core.Compiler.Const(1, false), Int64])
│         (@_9 = Base.iterate(%13))
│   %15 = (@_9 === nothing)::Bool
│   %16 = Base.not_int(%15)::Bool
└──       goto #7 if not %16
2 ┄ %18 = @_9::Tuple{Int64,Int64}::Tuple{Int64,Int64}
│         (j = Core.getfield(%18, 1))
│   %20 = Core.getfield(%18, 2)::Int64
│         (b = Base.getindex(vec, j))
│         (altvec = Base.getindex(supervec, b))
│   %23 = Base.getindex(supervec, b)::Array{Int64,1}
│   %24 = Main.length(%23)::Int64
│   %25 = (1:%24)::Core.Compiler.PartialStruct(UnitRange{Int64}, Any[Core.Compiler.Const(1, false), Int64])
│         (@_13 = Base.iterate(%25))
│   %27 = (@_13 === nothing)::Bool
│   %28 = Base.not_int(%27)::Bool
└──       goto #5 if not %28
3 ┄ %30 = @_13::Tuple{Int64,Int64}::Tuple{Int64,Int64}
│         (r = Core.getfield(%30, 1))
│   %32 = Core.getfield(%30, 2)::Int64
│         (p = Base.getindex(altvec, r))
│         (counter = counter + 1)
│         Base.setindex!(con, p, counter)
│         (@_13 = Base.iterate(%25, %32))
│   %37 = (@_13 === nothing)::Bool
│   %38 = Base.not_int(%37)::Bool
└──       goto #5 if not %38
4 ─       goto #3
5 ┄       (@_9 = Base.iterate(%13, %20))
│   %42 = (@_9 === nothing)::Bool
│   %43 = Base.not_int(%42)::Bool
└──       goto #7 if not %43
6 ─       goto #2
7 ┄       (val = nothing)
│         $(Expr(:inbounds, :pop))
│         val
└──       return con

Why is the iterator in the innermost for-loop unable to distinguish between Nothing and Tuple (the line @_9::Union{Nothing, Tuple{Int64,Int64}} is yellow)? Is there a way to force the compiler to know which one it should be? Is there is a simpler way to write something like this? Thanks a lot!

lmiq · February 19, 2021, 2:57pm

swishmas · February 19, 2021, 3:20pm

Ok, I’ll assume it is ok then. The reason that I asked was when I run ProfileView, this is the only function in the whole code that shows up as red on the resulting graph. So, I guess I should ignore the red bar there and trust this comment and @code_warntype. Thank you!

pixel27 · February 19, 2021, 4:17pm

You could change it (both) to a while loop, then there is no yellow.

swishmas · February 19, 2021, 4:25pm

Interesting, that seems to work, @pixel27:

  function whilesimplecount(supervec::Array{Array{Int64,1},1}, vec::Array{Int64,1})
    consize = sum(a->length(supervec[a]),vec)
    con = Array{Int64,1}(undef,consize)
    counter = 0
  
    j = 0
    while j < length(vec)
      j += 1
      b = vec[j]
      altvec = supervec[b]
      r = 0
      while r < length(supervec[b])
        r += 1
        p = altvec[r]
        counter += 1
        con[counter] = p
      end
    end
    return con
  end

with no loss in efficiency when tested with @btime

@btime whilesimplecount([[1],[2]],[1,2])
  179.348 ns (5 allocations: 480 bytes)

Thank you!

lmiq · February 19, 2021, 4:34pm

but no gain either, I expect

pixel27 · February 19, 2021, 5:16pm

I was seeing differences of about 6ns but that was really just fluctuation based on other activity on the computer. However it sounds like ProfileView is giving false positives, and unfortunately false positives have a habit of hiding true positives.

swishmas · February 19, 2021, 5:18pm

With the new improved “while” version, the troublesome spots in ProfileView are now gone, so good news there.

I think for an individual function call, there is no difference that I can see. For larger inputs and repeated calls to this function, I can see a bit of a difference. Thank you both so much for your help! I really appreciate your input!

Jeff_Emanuel · February 19, 2021, 6:29pm

I think this is a really bad idiom, especially since it has no advantage other than removing an inconsequential warning. It spreads the loop control logic over three lines and makes it harder for the reader to parse than the clear, concise, and much less error prone for j=1:length(vec). It screams “maintenance land mine” to me.

rdeits · February 19, 2021, 6:33pm

Yeah, agreed–the compiler turns you for loop into a while loop internally anyway, obeying the iteration protocol discussed here: Writing Iterators in Julia 0.7

If you find the while loop easier to write or read, then by all means go for it, but there should be no efficiency advantage in turning all your for loops into while loops just to make @code_warntype less colorful.

Seif_Shebl · February 19, 2021, 8:34pm

I’m not sure if it’s a good idea, but if you change the function signature as below, you get rid of all warnings. Performance is unchanged, of course. BTW, what is intType?

function simplecount(supervec::Array{Array{Int64,1},1}, 
                     vec::Array{Int64,1},
                     L=length(vec))
    ...
    for j = 1:L
    ...

swishmas · February 19, 2021, 9:23pm

intType was just trying to copy an HDF5 variable name…it’s just Int64 here

swishmas · February 19, 2021, 9:25pm

My main reason for this is that the code might be used by other people, so if they notice that there is potentially a type stability issue, I will hear about it. They may or may not know about this non-issue.

It’s a bit of a shame that the major focus of debugging in Julia is to get rid of type stability and that phantom issues can appear…but it’s a relief that I didn’t do anything wrong!

Nosferican · February 19, 2021, 9:50pm

Maybe try

using BenchmarkTools
ideal(x::AbstractVector{<:AbstractVector}, y::AbstractVector{<:Integer}) =
    reduce(vcat, x[elem] for elem in y)
x = [[1], [2]]
y = [1, 2]

julia> @code_warntype ideal(x, y) # Seems to be gucci
Variables
  #self#::Core.Compiler.Const(ideal, false)
  x::Array{Array{Int64,1},1}
  y::Array{Int64,1}
  #15::var"#15#16"{Array{Array{Int64,1},1}}

Body::Array{Int64,1}
1 ─ %1 = Main.:(var"#15#16")::Core.Compiler.Const(var"#15#16", false)
│   %2 = Core.typeof(x)::Core.Compiler.Const(Array{Array{Int64,1},1}, false)
│   %3 = Core.apply_type(%1, %2)::Core.Compiler.Const(var"#15#16"{Array{Array{Int64,1},1}}, false)
│        (#15 = %new(%3, x))
│   %5 = #15::var"#15#16"{Array{Array{Int64,1},1}}
│   %6 = Base.Generator(%5, y)::Base.Generator{Array{Int64,1},var"#15#16"{Array{Array{Int64,1},1}}}
│   %7 = Main.reduce(Main.vcat, %6)::Array{Int64,1}
└──      return %7

julia> @btime ideal($x, $y)
  49.821 ns (1 allocation: 96 bytes)
2-element Array{Int64,1}:
 1
 2

rdeits · February 19, 2021, 10:41pm

That’s certainly your choice to make. But note that if you go down this road, you won’t be able to use any iterators of any kind (not just for loops). That’s a huge price to pay to avoid false positive issue reports from users. particularly when you can just point those users to the excellent explanations which have already been written, e.g. Union-splitting: what it is, and why you should care

swishmas · February 19, 2021, 11:55pm

That is very slick code. Thank you. Just for anyone stumbling onto this code in the future, had I initialize the inputs outside of the function call like they are here, then the results essentially match. Thank you! Very clever!

swishmas · February 20, 2021, 12:01am

I very much agree here. You are correct that people should know. Had this been an issue that extended to more than one function, I would leave it alone, but for some reason some similar functions in the code don’t have this issue. I might change it to the very clever implementation with reduce above.

Thanks again for your input, I very much appreciate it!

Topic		Replies	Views
Question about "type-stability" of Arrays in Julia 0.6 General Usage performance , type , arrays , type-stability	3	858	August 16, 2017
Type stability and array of matrices Performance arrays , type-stability	5	656	July 19, 2018
How to make type-stable: passing an element of a vector of functions Performance question	4	562	April 26, 2021
Struggling with type stability - is SMatrix type stable? Performance type-stability , staticarrays	9	989	August 6, 2018
Array of Arrays times Array is type unstable General Usage	6	457	July 31, 2020

Type stability on Arrays of Arrays in nested for loops?

Related topics