Broadcasting and function composition (circ)

tbeason · September 17, 2020, 5:04am

How does one easily compose and broadcast?

For example,

x = randn(100,10)
mapslices(cumprod ∘ exp, x; dims=1)

fails because we need to broadcast exp.

One workaround is to do the following

x = randn(100,10)
expdot(v) = exp.(v)
mapslices(cumprod ∘ expdot, x; dims=1)

Are there other, hopefully better, ways?

mcabbott · September 17, 2020, 5:38am

You can write

∘̇(f,g) = x -> f(g.(x))
mapslices(cumprod ∘̇ exp, x; dims=1)

but perhaps shouldn’t. Better:

using Underscores
@_ mapslices(cumprod(exp.(_)), x; dims=1)

xiaodai · September 17, 2020, 6:53am

I would do

x = randn(100,10)
mapslices(xrow->exp.(xrow) |> cumprod, x; dims=1)

DNF · September 17, 2020, 7:12am

These suggestions all create an intermediate array that seems like it is redundant. But it’s not so easy to come up with a trivial solution that avoids it. I thought maybe accumulate would help, but I actually need something like mapaccumulate:

mapaccumulate(exp, *, x; dims=1)

Unfortunately, there is no mapaccumulate.

Looks like a loop is needed, unless I’m missing something. Perhaps Transducers.jl has something helpful.

mcabbott · September 17, 2020, 7:26am

mapaccumulate is an interesting idea. You could also do this as a lazy broadcast, to avoid making too many intermediates. This might even be fast:

reduce(hcat, map(eachcol(x)) do slice
    lazy = Broadcast.Broadcasted(exp, (slice,))
    cumsum(lazy)
end)

avoiding mapslices too. Unfortunately this doesn’t work:

y = similar(x);
foreach(eachcol(y), eachcol(x)) do ycol, xcol
    lazy = Broadcast.Broadcasted(exp, (xcol,))
    cumsum!(ycol, lazy; dims=1)
end

josuagrw · September 17, 2020, 8:52am

The problem with the lazy approach seems to be that cumprod has no method for Generators:

x = rand(10, 100)
lazy_exp(v) = (exp(y) for y in v)
lazy_cumprod(x) = Iterators.accumulate(*, x)
mapslices(collect ∘ lazy_cumprod ∘ lazy_exp, x; dims=1)

xiaodai · September 17, 2020, 9:49am

map(v->accumulate((x,y)->x*exp(y), x; dims=1)

josuagrw · September 17, 2020, 9:59am

mapslices(v->accumulate((x,y)->x*exp(y), v), x; dims=1)

EDIT: Oh, this doesn’t exponentiate the first element of each row

tbeason · September 17, 2020, 5:31pm

this is not equivalent to the other methods, something must be off. Your previous post is slightly faster than my original method.

I should add that the dimensions I gave for the example are way off of the true size. The number of columns is way way bigger than rows. randn(750,100000) is pretty close to the actual size of the input.

josuagrw · September 18, 2020, 4:17pm

This approach is 4 times fast than the mapslices approach


using BenchmarkTools
function cumprod_exp(x)
    y = similar(x)
    foreach(
        [view(y, :, i) for i in 1:size(y, 2)],
        [view(x, :, i) for i in 1:size(x, 2)]
    ) do v, u
        cumprod!(v, exp.(u))
    end
    return y
end

@btime cumprod_exp($(rand(10, 100)))

You can make it even faster if x is no longer needed

function cumprod_exp(x)
    y = similar(x)
    foreach(
        [view(y, :, i) for i in 1:size(y, 2)],
        [view(x, :, i) for i in 1:size(x, 2)]
    ) do v, u
        u .= exp.(u)
        cumprod!(v, u)
    end
    return y
end

Note, that you can easily save even more time by preallocating y – this just avoids the vcats in mapslices

tbeason · September 18, 2020, 5:13pm

Yea I ended up rolling my own along those lines once I realized that there was no in-place mapslices (why!?). I also started chaining together 2 additional operations, a trailing sum of N elements and then a log difference from N periods ago which meant that stuff would allocate regardless unless I was smart about those (since they don’t just use consecutive entries). I ended up using CircularBuffer from DataStructures.jl for that part. Now the only thing that allocates is the buffer.

I guess the limiting part is really just that I’m doing this 10s or 100s of thousands of times. Unfortunate that the way this is being used I can’t parallelize it across columns. Oh well. Thanks for the suggestions!