I’m totally new in Julia. Now, I’m working on derivative. I found that ‘2nd derivatives’ (ex. function derivative_sec_fd(f,x)), but still couldn’t find how to do partial derivative…
Could you tell me which command should I use?
Do you try to do numeric, automatic or symbolic differentiation?
Related (maybe xref?): Partial derivatives in Julia - Stack Overflow
You could use automatic differentiation to calculate partial derivatives of a function
julia> using ForwardDiff
julia> f(x) = 2*x[2]^2+x[1]^2 # some function
f (generic function with 2 methods)
julia> g = x -> ForwardDiff.gradient(f, x); # g is now a function representing the gradient of f
julia> g([1,2]) # evaluate the partial derivatives (gradient) at some point x
2-element Array{Int64,1}:
2
8
Thank you for all of replies. I’ll try to do it !!
Is there anything wrong with doing it this way?
julia> using ForwardDiff
julia> f(x, y) = 2y^2 + x^2
f (generic function with 1 method)
julia> ∂f_∂x(x, y) = ForwardDiff.derivative(x -> f(x, y), x)
∂f_∂x (generic function with 1 method)
julia> ∂f_∂y(x, y) = ForwardDiff.derivative(y -> f(x, y), y)
∂f_∂y (generic function with 1 method)
julia> ∂f_∂x(1, 2)
2
julia> ∂f_∂y(1, 2)
8
Just thought I’d ask to check if it’s unhealthy to define partial derivatives like so.
I think it’s fine. You’re formally introducing y
via closure, which is usually pretty efficient in Julia.
Disclaimer: I do this myself for some of my codes.
No particularly, but if you need both partial derivatives anyway, just define a function on a vector and use ForwardDiff.gradient
.
This is true, but given f(x, y)
,
g(x) = f(x[1], x[2])
∇g(x) = ForwardDiff.gradient(g, x)
∂f_∂x(x, y) = ∇g([x, y])[1]
∂f_∂y(x, y) = ∇g([x, y])[2]
is noticeably longer and somewhat less elegant than
∂f_∂x(x, y) = ForwardDiff.derivative(x -> f(x, y), x)
∂f_∂y(x, y) = ForwardDiff.derivative(y -> f(x, y), y)
in my opinion.
I’ve needed second derivatives too, so the latter method is much less verbose (and accomplishes the same thing, AFAIU).
Some one in slack (sorry, I forgot whom) gave me this snippet
using ForwardDiff: derivative
partiali(f, x, i) = ForwardDiff.partials(f([Dual{}(x[j], 1.0*(i==j)) for j in eachindex(x)]))[]
struct ∂{k, F} <: Function
f::F
∂{k}(f::F) where {k, F} = new{k, F}(f)
end
function (pd::∂{k})(args...) where {k}
N = length(args)
derivative(argk -> f(ntuple(i -> i == k ? argk : args[i], Val(N))...), args[k])
end
f(x, y, z, w) = x + 2y + 3z + 4w
∂{3}(f)(1,1,1,1)
Addendum via @cscherrer
julia> using StructArrays
julia> using ForwardDiff
julia> function partiali(n,i)
ith = zeros(n)
ith[i] += 1
function (f,x)
sa = StructArray{ForwardDiff.Dual{}}((x, ith))
return f(sa)
end
end
partiali (generic function with 2 methods)
julia> f(x) = sum(x -> x^2, x)
f (generic function with 1 method)
julia> x = randn(1000);
julia> ∂₅₀ = partiali(1000,50);
julia> x[50]
2.1129129387789667
julia> @btime $f($x)
82.987 ns (0 allocations: 0 bytes)
1000.4848272576928
julia> @btime $∂₅₀($f,$x)
167.038 ns (0 allocations: 0 bytes)
Dual{Nothing}(1000.4848272576928,4.225825877557933)
Thanks @mschauer
This is the approach I’d go with if you just need one partial at a time, say for coordinate descent or Gibbs sampling. A big advantage of doing it this way is that by using a StructArray you can use the original data with no new allocations.
Hi @cscherrer , can you explain the unnamed function inside partiali()
?
Say you have a function like
julia> f(x, y, z) = x^2 * y - x * y * z
f (generic function with 1 method)
At the point (2,3,4)
, this is
julia> f(2,3,4)
-12
If you wanted \frac{\partial f}{\partial y} at this point, you could get this by replacing the values with ForwardDiff.Dual
s:
julia> using ForwardDiff: Dual
julia> f(Dual{}(2,0), Dual{}(3,1), Dual{}(4,0))
Dual{Nothing}(-12,-4)
If the function instead takes an array, like
julia> function g(arr)
x,y,z = arr
x^2 * y - x * y * z
end
g (generic function with 1 method)
then we could write an input as e.g.
julia> [Dual{}(2,0), Dual{}(3,1), Dual{}(4,0)]
3-element Vector{Dual{Nothing, Int64, 1}}:
Dual{Nothing}(2,0)
Dual{Nothing}(3,1)
Dual{Nothing}(4,0)
To save typing, we could instead write this as
julia> Dual{}.([2,3,4],[0,1,0])
3-element Vector{Dual{Nothing, Int64, 1}}:
Dual{Nothing}(2,0)
Dual{Nothing}(3,1)
Dual{Nothing}(4,0)
Now, notice that [2,3,4]
is just the point we want to evaluate at, and [0,1,0]
is a “one-hot-encoding” of the variable we want the partial with respect to. The latter is ith
in @mschauer’s post above.
Finally, even if x
and ith
are given, building the Dual
vector in this way requires allocation:
julia> x = [2,3,4]
3-element Vector{Int64}:
2
3
4
julia> ith = [0,1,0]
3-element Vector{Int64}:
0
1
0
julia> @allocated Dual{}.(x, ith)
176
The idea behind the StructArray
is that it lets you avoid this allocation. There are a few other packages that would work in the same way - you could instead use MappedArrays or LazyArrays, and the effect would be the same.
Thank you, thats a great explanation!! Seems like something the package could implement, but i suppose they don’t want to bloat it. But then again, implementing this is not perfectly strait forward to the novice user…