Very slow loop

Henrique_Becker · January 19, 2021, 8:58pm

I have to disagree here.

Keyword arguments may be completely independent from the positional arguments of the same function/method. There is no special reason to make them dependent of the remaining positional arguments if this is not the case.

Many times the default of a keyword argument is just an instance of some object representing a safe default, and if this (often immutable) object is used in other places, then it is common for it to be stored in a global constant (for both the “no magical numbers” pattern and ease of change of the value in all contexts with a new release version).

lmiq · January 19, 2021, 9:08pm

Ok, if there was a global constant x yes, although probably I would choose a better name, like x0. Yet it was not the case here.

jmcastro2109 · January 21, 2021, 11:58am

Leandro, one last question. Do you mind explaining me why is the fragment

if N>NF
        A= vcat(A,A,A)
        H = cat(H ,H ,H ,dims=2)
        P= cat(P,P,P,dims=2)
    end

so detrimental for performance even in we were working on a case in which N==NF, so we should not be entering that loop?

lmiq · January 21, 2021, 12:46pm

As far as I understand, the problem is in cat(...,dims=2). That can return an array of different kind, depending on the input. For example:

julia> B = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> cat(B,B,dims=2)
3×2 Array{Int64,2}:
 1  1
 2  2
 3  3

In this case, the input is a vector and the output is a two-dimensional array. In the code that contains that conditional clause, there is no way for the compiler to know in advance if that change in dimensions will occur or not, because it is dependent on an input parameter of the function. Thus, the code does not become specialized for the type of array that you are operating on. That is independent on entering or not the conditional, the problem is the possibility of entering and changing the meaning of the foregoing operations.

If you put that in an outer function, as I suggested, the inner function will get specialized for the type of array received every time. Consider the following minimal example:

julia> function f(A,N)
         if N == 0
           println("got here")
           A = cat(A,A,dims=2)
         end
         s = 0.
         for a in A
           s += a
         end
         s
       end
f (generic function with 1 method)

julia> function outer(A,N)
         if N == 0
           println("got here")
           A = cat(A,A,dims=2)
         end
         g(A)
       end
       function g(A)
         s = 0.
         for a in A
           s += a
         end
         s
       end
g (generic function with 1 method)

julia> A = rand(3,3);

julia> using BenchmarkTools

julia> @btime f($A,1)
  269.185 ns (18 allocations: 288 bytes)
4.425721907655399

julia> @btime outer($A,1)
  17.008 ns (1 allocation: 16 bytes)
4.425721907655399

Note that it also has the same problem, and the same solution. If you do @code_warntype on f(A,1) or on outer(A,1), you will see that both are type-unstable. Yet, no matter which is the input of g(A), it is always type-stable. Since g is the function doing the expensive computations, that specialization is absolutely determinant to performance.

Sometimes, depending on the conditional involved, the compiler figures out that it can split the function in two branches, and that does not happen. But here it cannot do it.

mcabbott · January 21, 2021, 12:54pm

You can make cat type-stable, like so:

julia> @code_warntype cat(B, B, dims=2)
...
Body::Any

julia> @code_warntype hcat(B, B)
...
Body::Matrix{Int64}

julia> @code_warntype cat(B, B, dims=Val(2))
...
Body::Matrix{Int64}

The issue with the first one is that cat(B, B; dims=1) isa Vector – same input types, but different output.

lmiq · January 21, 2021, 1:11pm

Why does this make any difference? What is less constant in the number 2 than in Val(2)?

Henrique_Becker · January 21, 2021, 1:24pm

2 is a value of type Int
Val(2) is a value of type Val{2}

Therefore, the dimension is mapped to type-space.

mcabbott · January 21, 2021, 1:25pm

One answer is that @code_warntype only looks one layer deep. For example, it tells us map is type-stable here, because it evaluates the inputs before getting to work:

julia> @code_warntype map(exp, cat(B,B, dims=rand(1:3)))
Body::Vector{Float64}

julia> @code_warntype map(exp, cat(B,B, dims=rand(1:3)))  # another run of the same
Body::Array{Float64, 3}

But another answer is that if you literally write dims = Val(2) in the source code, Julia is usually smart enough to propagate that constant literal 2 outwards, to Val{2}() which has it in the type domain. Writing d = 2; cat(B, B, dims=Val(d)) would in general work less well (even though @code_warntype acting on cat won’t tell you this), but sometimes it will still manage:

julia> mycat(B) = begin d=1; cat(B,B, dims=Val(d+1)) end;

julia> @code_warntype mycat(B)
Body::Matrix{Int64}

julia> mycat2(B, d=2) = cat(B,B, dims=Val(d));

julia> @code_warntype mycat2(B)  # seems easier but fails
Body::Any

lmiq · January 21, 2021, 1:31pm

Thanks @Henrique_Becker and @mcabbott , but in that case I think I have to go to the books to understand that… or maybe not even that. It is very unclear to me how the compiler will have more information in case than in the other. The compiler won’t as effectively propagate the type information of a Int value than a value of type Val{2}? Seems something very specific about the way that the type-inference is implemented rather than a fundamental principle, isn’t it?

dmolina · January 21, 2021, 1:36pm

Yes, it is an interesting topic but a little advanced, the use of Val.

From my point of view, consider comments from @mcabbott, I get the conclusion that it is usually better to use vcat or hcat when it is possible than general cat, because the most general function like that could return generic types.

mcabbott · January 21, 2021, 1:37pm

Yes, I don’t think these follow logically from the definition of the type system. In older Julia code you will see much more Val{2}() or types Val{2} used instead of Val(2), as it used to be less smart about propagating things. The precise details of when this will & won’t happen are above my pay grade, I just learn by poking it & seeing what happens.

@dmolina yes, I think writing hcat is usually best & clearest when you know in advance. There is no special name for cat(A, B, dims=Val(3)) so that’s when it’s worth knowing that it accepts Val.

Henrique_Becker · January 21, 2021, 3:40pm

Not obligatorily. The compiler could, of course, if sufficient smart propagate the value 2 of type Int and obtain the same results. But at the moment 2 is encoded in the type Val{2} then the function receiving it will (probably) be specialized for that type what leads to easier analysis (at cost of compiling one more specialization of cat). It seems to me that it follows from the concept of function barrier, but I may be wrong. Anyway, considering that type-instability always takes inference in account, and inference is not entirely defined by the language (i.e., versions of Julia may change what they are able to infer without this being considered breaking), then it is somewhat type-inference specific.

lmiq · January 24, 2021, 1:49pm

I just noticed that if you annotate the output of cat, the instability is solved in this case as well (at least in the MWE). The result is as good as using dims=Val(2), and probably more intuitive:

julia> function g(A,N)
         if N == 0
           println("got here")
           A = cat(A,A,dims=2)::Array{Float64,2}
         end
         s = 0.
         for a in A
           s += a
         end
         s
       end
g (generic function with 1 method)

julia>  A = rand(3,3);

julia> @btime g($A,1)
  5.923 ns (0 allocations: 0 bytes)
5.379437443399205

DNF · January 24, 2021, 7:26pm

It’s slightly less general than hcat et al, though, since it will fail for N-d arrays for N>2.