ForwardDiff - Multiple gradient evaluations at once?

LudiWin · July 17, 2019, 6:13pm

Hi,

so I’m fairly new to Julia coming from Python.

I was playing around with the ForwardDiff package and stumbled upon the following ‘problem’:

The pdf of a 2-D Gaussian from the Distributions package can take a matrix of size (2, N) and output N probability densities.

Now I wanted to evaluate the gradient at multiple data points, similar to PyTorch, but apparently the gradient can only be evaluated for single data vectors and not for data matrices.

Here is a MWE of my ‘problem’:

using Distributions
using Plots
using ForwardDiff

μ = [ 1. ; 2.]
Σ = [ 1. 0 ; 0 1]
dist = MvNormal(μ, Σ)

# samples = rand(dist, 2000)
# scatter(samples[1,:], samples[2,:]) for visualization purposes

matrix = [ 1 1 ; 2. 2]
vec = [ 1. ; 2.]

prob_vec = pdf(dist, vec) # evaluates to a scalar
prob_matrix = pdf(dist, matrix) # evaluates to an array with two values, as it should
grad_vec = ForwardDiff.gradient(x -> pdf(dist, x), vec) # evaluates to a gradient of two zeros, as it should, because vec sits at the mean
grad_matrix = ForwardDiff.gradient(x -> pdf(dist, x), matrix) # doesn't work

So is there any way to evaluate the gradient of multiple data points or do I really have to put in a for loop?
Or vectorize it with the dot operator although I haven’t figured out how to do that.

Thank you in advance for your time and effort! =)

PS: This was the error message
ERROR: MethodError: no method matching extract_gradient!(::Type{ForwardDiff.Tag{getfield(Main, Symbol("##31#32")),Float64}}, ::Array{Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Main, Symbol("##31#32")),Float64},Float64,4},1},2}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{getfield(Main, Symbol("##31#32")),Float64},Float64,4},1})

DNF · July 17, 2019, 6:32pm

As far as I can tell you are asking for the value of the pdf of a vector distribution at a matrix value, which doesn’t make sense. You can’t get a matrix value from a vector distribution.

So yes, you either have to write a loop or broadcast over eachcol(matrix).
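A minimal sketch of both options, assuming the same MvNormal setup as in the original post. The names f, X, grads_loop, grads_map and G are made up for illustration, and X is hypothetical data with one point per column:

```julia
using Distributions
using ForwardDiff

μ = [1.0, 2.0]
Σ = [1.0 0.0; 0.0 1.0]
dist = MvNormal(μ, Σ)

# Scalar-valued function of a single data point, which is what
# ForwardDiff.gradient expects.
f(x) = pdf(dist, x)

# Hypothetical data: each column is one 2-D point.
X = [1.0 2.0;
     2.0 2.0]

# Option 1: an explicit loop over the columns.
grads_loop = Vector{Vector{Float64}}()
for col in eachcol(X)
    push!(grads_loop, ForwardDiff.gradient(f, col))
end

# Option 2: map the gradient over the columns.
grads_map = map(col -> ForwardDiff.gradient(f, col), eachcol(X))

# Stack the per-point gradients into a 2×N matrix, one column per point.
G = reduce(hcat, grads_map)
```

Both options call ForwardDiff.gradient once per column, so they should produce the same gradients; the choice is mostly a matter of readability.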

LudiWin · July 17, 2019, 6:35pm

Mathematically you’re absolutely right.

Yet I was hoping for some kind of batched operator for evaluating the gradient?

Thanks for your answer. =)

rdeits · July 17, 2019, 6:39pm

There is generally no need to “vectorize” code like this in Julia. If you want to do something multiple times, just use a loop. Loops are fast.

You can probably broadcast over eachcol as suggested above, but the only reason to do so would be if it makes your code easier to understand or if you can let broadcast fusion combine multiple operations. Otherwise, just use a loop.

LudiWin · July 17, 2019, 6:44pm

I just found a solution (although it’s a bit clunky) by using an array of vectors and the broadcast (dot) operator:

matrix = [ [ 1. ; 2.], [ 2. ; 2.]]
grad = ForwardDiff.gradient.(x -> pdf(dist, x), matrix)

@rdeits:
So I know that, because of the JIT compiler, for-loops in Julia are not as bad as in Python (where they are actually horribly slow).
In terms of performance, can I readily use for-loops in Julia and rely on the JIT compiler to optimize them well?

kristoffer.carlsson · July 17, 2019, 6:49pm

Yes, although batching can still be useful in some cases, e.g. when it sets up the memory layout more favourably.
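One possible reading of the memory-layout point, as a sketch rather than a recommendation: keep all points as columns of a single matrix (contiguous in Julia's column-major layout), write the gradients into a preallocated matrix, and reuse one ForwardDiff.GradientConfig across iterations. The names f, X, G, g and cfg are hypothetical, and the data is made up:

```julia
using Distributions
using ForwardDiff

dist = MvNormal([1.0, 2.0], [1.0 0.0; 0.0 1.0])
f(x) = pdf(dist, x)

# Hypothetical data: three 2-D points stored as contiguous columns.
X = [1.0 2.0 0.0;
     2.0 2.0 0.0]

G = similar(X)                          # preallocated output, one gradient per column
g = zeros(size(X, 1))                   # reusable buffer for a single gradient
cfg = ForwardDiff.GradientConfig(f, g)  # work buffers shared by all iterations

for j in axes(X, 2)
    # Differentiate at column j without copying it, then store the result.
    ForwardDiff.gradient!(g, f, view(X, :, j), cfg)
    G[:, j] = g
end
```

Whether this is measurably faster than broadcasting over an array of small vectors depends on the problem size, so it is worth benchmarking before committing to either layout.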

LudiWin · July 17, 2019, 6:56pm

Cool, thanks a lot folks

rdeits · July 17, 2019, 6:58pm

Loops in Julia are as fast as loops in C, C++, Fortran, etc.

Moreover, broadcasting (including dot-calls like gradient.(x)) is internally implemented as loops, just like all the “vectorized” operations in tools like NumPy.

DNF · July 17, 2019, 6:59pm

Those batched operations are really just for-loops under the hood. In Python or MATLAB they just call out to for-loops written in a fast language. Julia is such a fast language.
