Best practices regarding Adjoint, when should it be used instead of Array, best way to 'style it out'

Very often I generate a structure like this, when all I want (as far as I know) is an array:

10×2 Adjoint{Float64,Array{Float64,2}}:
 0.4988    0.5012
 0.378885  0.621115
 0.651367  0.348633
 0.474066  0.525934
 0.405123  0.594877
 0.579569  0.420431
 0.309181  0.690819
 0.355957  0.644043
 0.324029  0.675971
 0.337925  0.662075

Can someone explain:

  1. the best practice way of dealing with [unwanted] Adjoints?
  2. why I would want to potentially write functions that accept/return Adjoint instead of just array?

I know it might look stupid but this is what I do now (is there a better way?):

Always convert the output of some algebraic statement to array, no matter what it’s supposed to return, just in case it’s Adjoint and not a vector or array.

Use copy to materialize it, just as it says in in the docs.

You really need to post your code here, because I don’t see a reason a priori for a function to not accept Adjoints.

If you use arrays and not matrices (i.e. you don’t care about the algebraic structure, which seems to be your case), you should use permutedims.

1 Like

I don’t know your specific use case but I find that for most functions I wrote in the pre Adjoint days that took arrays as arguments (i.e. f(x::Array)) work just fine with adjoints. Just need to change f(x::Array) to f(x::AbstractArray) and things mostly just work. An important exception is if you’re passing the array outside of julia into some c or fortran linear algebra routine that requires a contiguous array. But I would argue that in this case you should hold onto the adjoint as long as possible and just convert to an Array before passing it outside of julia.

Depending on your use case, working with adjoints instead of converting them to new Arrays could save significant allocations. It’s not just about doing linear algebra rigorously. Here’s a simple example illustrating this

julia> using BenchmarkTools

julia> f1(x) = x'
f1 (generic function with 1 method)

julia> f2(x) = Array(x')
f2 (generic function with 1 method)

julia> a = rand(5000,5000);

julia> @benchmark f1($a)
BenchmarkTools.Trial: 
  memory estimate:  16 bytes
  allocs estimate:  1
  --------------
  minimum time:     4.055 ns (0.00% GC)
  median time:      8.516 ns (0.00% GC)
  mean time:        8.439 ns (7.66% GC)
  maximum time:     6.475 μs (99.86% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark f2($a)
BenchmarkTools.Trial: 
  memory estimate:  190.73 MiB
  allocs estimate:  3
  --------------
  minimum time:     212.697 ms (0.00% GC)
  median time:      215.247 ms (0.05% GC)
  mean time:        222.924 ms (3.94% GC)
  maximum time:     276.981 ms (22.93% GC)
  --------------
  samples:          23
  evals/sample:     1

The f1 function returns an Adjoint, which is just a wrapper around the original array. No need to copy the contents into a new array. The second version has to allocate a new array to store the transposed matrix.

2 Likes

Thanks, when would an array and a matrix not be used interchangeably? Matrix operations work fine on arrays as far as I know.

Array is just a collection of numbers, while matrix is an algebraic object that doesn’t need to be represented as such (see Diagonal or Tridiagonal for trivial examples). Adjoint has no meaning in context of an array, so you calling it to transpose a 2D array is in a sense abuse of functionality that can lead to you shooting yourself in the foot.

2 Likes

It’s probably a good idea here to distinguish between the mathematical concept of a matrix and the julia type alias Matrix{T}. In julia Matrix{T} is just an alias for Array{T,2}. The special julia matrix types like Diagonal and Tridiagonal are special data structures used to represent matrices (matrices in the mathematical sense now) with special structure. A diagonal matrix—matrix in the mathematical sense again—can be stored as a dense 2D Array with zeros everywhere except on the diagonal, but it can also be stored in the special Diagonal type, which only stores the diagonal elements in the vector and not the structural zeros.

Note that Diagonal <: Matrix is false. We do have, however, that Diagonal <: AbstractMatrix. And AbstractMatrix is just an alias for AbstractArray{T,2} where T.

As I mentioned earlier in this thread, there are some situations where you really do need to work with a dense array, stored as an Array in julia. But if you don’t have a specific reason to limit yourself to plain Arrays, it’s best to take advantage of all the great AbstractArray subtypes offered by julia.

2 Likes