NumPy diff in Julia DataFrames?

I can’t find a rolling diff in Julia, and I don’t want to use PyCall just for this one thing.

close_data = [147.82,149.5,149.78,149.86,149.93,150.89,152.39,153.74,152.79,151.23,151.78]

In pandas/numpy I can just run

r = np.diff(np.log(close_data))

and r would result in

[ 0.011301  0.001871  0.000534  0.000467  0.006383  0.009892  0.00882
 -0.006198 -0.010263  0.00363 ]

which is effectively, for each consecutive pair i,

log(close_data[i+1] / close_data[i])

I'm happy to code it by hand over the array, but I wondered whether there is an off-the-shelf way of getting this done.

diff is included in Base: https://docs.julialang.org/en/v1/base/arrays/#Base.diff

julia> diff(log.(close_data))
10-element Vector{Float64}:
  0.011301075474000832
  0.001871157990479766
  0.0005339741149432697
  ...
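To tie this back to the log-return formula in the question, here is a minimal sketch that checks `diff(log.(close_data))` against an explicit comprehension over consecutive pairs. The variable name `manual` is only illustrative.

```julia
close_data = [147.82, 149.5, 149.78, 149.86, 149.93, 150.89,
              152.39, 153.74, 152.79, 151.23, 151.78]

# Broadcast log over the vector, then take first differences with Base.diff.
r = diff(log.(close_data))

# The same quantity written out by hand: the log of each consecutive ratio.
manual = [log(close_data[i+1] / close_data[i]) for i in 1:length(close_data)-1]

# The result has one element fewer than the input, and the two versions
# agree up to floating-point rounding.
@assert length(r) == length(close_data) - 1
@assert isapprox(r, manual)
```

Note the dot in `log.(close_data)`: it applies `log` elementwise, which is the counterpart of `np.log` on an array.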

And if diff weren’t provided by Base, here’s what I would suggest as a trivial ‘off the shelf’ replacement:

mydiff(v::AbstractVector) = @views v[(begin+1):end] .- v[begin:(end-1)]

In words, this is just saying

Take a view of v from the second index to its last index, and then from that, do a (broadcasted) subtraction with a view from the first index to the second-to-last index

In a handful of tests I did, this appears to perform just as well as or better than the version from Base, and it is in some ways ‘smarter’ because it takes advantage of range types naturally to avoid unnecessary work:

julia> diff(1:4)
3-element Vector{Int64}:
 1
 1
 1

julia> mydiff(1:4)
StepRangeLen(1, 0, 3)
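As a quick sanity check, the sketch below repeats the `mydiff` definition from above and compares it with Base's `diff`. The sample values are just the first few prices from the question; this checks agreement only, not performance.

```julia
# Same definition as in the post above.
mydiff(v::AbstractVector) = @views v[(begin+1):end] .- v[begin:(end-1)]

x = log.([147.82, 149.5, 149.78, 149.86])

# On an ordinary Vector the two functions give the same values.
@assert mydiff(x) == diff(x)

# On a range, the post above shows mydiff returning a range rather than a
# Vector; either way, its elements match what diff returns.
@assert collect(mydiff(1:4)) == diff(1:4)
```

Using `begin` and `end` instead of hard-coded `1` and `length(v)` keeps the indexing valid for vectors whose first index is not 1.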

WOW! I missed this one BIG time! Sorry about that. Thanks for the solution. I just went to the docs and typed in “diff” and it was the first answer. MAKE IT STOP!! Thanks again.

ALSO
@Mason thanks for taking the time to give me some more thinking to do. I am STILL having difficulties with many of the Julia concepts, and great examples like this are really helpful.
