Broadcasting for a new type & passing a value by ref

Hi,

I have a struct as follows:

struct P
    df::DataFrame
end

where the DataFrame contains labels (qs) & prior probabilities (probs). I’ve implemented multiplication & normalization:

Base.:(*)(p::P, v) = P(DataFrame(qs=p.df.qs, probs=p.df.probs .* v))
normalize!(p::P) = p.df.probs = p.df.probs ./ sum(p.df.probs)

which allows Bayesian updating:

prior = P(DataFrame(qs = ["heads", "tail"], probs=[0.5, 0.5])) # illustrative. Not how it's done
likelihood = [0.75, 0.5]
posterior = prior * likelihood
normalize!(posterior)

Now for the question: say I want to create an update function that will update a probability distribution with new data (likelihood):

# Does not work
function update!(p::P, likelihood)
  p *= likelihood
  normalize!(p)
end

This doesn’t work: although p refers to the caller’s object (Julia passes arguments by sharing), p *= likelihood is lowered to p = p * likelihood, which allocates a new P and rebinds only the local name p. The caller’s object is never modified.
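
To make the rebinding concrete, here is a minimal sketch. Wrapper, rebind and mutate! are hypothetical names, and a plain Vector stands in for the DataFrame column so the snippet runs without DataFrames:

```julia
# Hypothetical stand-in for P: a plain vector replaces the DataFrame column.
struct Wrapper
    probs::Vector{Float64}
end

Base.:(*)(w::Wrapper, v) = Wrapper(w.probs .* v)

# `w *= v` means `w = w * v`: only the local name `w` is rebound.
function rebind(w::Wrapper, v)
    w *= v
    return w
end

# `.*=` on the field writes into the vector the caller also sees.
function mutate!(w::Wrapper, v)
    w.probs .*= v
    return w
end

a = Wrapper([0.5, 0.5])
b = rebind(a, [0.75, 0.5])    # b is a new object; a.probs is still [0.5, 0.5]
c = mutate!(a, [0.75, 0.5])   # c === a; a.probs is now [0.375, 0.25]
```

The struct itself is immutable in both cases; what differs is whether the existing vector is overwritten or a new one is created.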

OK, fine. As the multiplication is really a broadcast operation - what about supporting broadcast for the new type, so I can write:

p .*= likelihood

This (if it works) should result in modification of the elements of the DataFrame, which should preserve the pass-by-ref semantics.

Trouble is, I can’t figure out how to actually implement the broadcasting interface. I’m lost in a sea of broadcastable and BroadcastStyle…

How do I define a simple broadcast interface for this type? Is there any other way to get update! to work?

(obviously, I can just write:

p.df.probs .*= likelihood

or define a function that will do that for me:

mul!(p::P, likelihood) = p.df.probs .*= likelihood

but that’s a cop-out)

Thanks
DD

Let’s start with the interface:

https://docs.julialang.org/en/v1/manual/interfaces/#man-interfaces-broadcasting

From here, you can see “Methods to implement”:

  • Base.BroadcastStyle
  • Base.similar

So that is the absolute minimum you need. The former is a trait, with a style type you define, that lets broadcasting dispatch to your methods; the latter tells the broadcasting machinery how to allocate an object of your type (with a given element type). In addition to the above, you’ll also want to define copyto! for your broadcast style for in-place broadcasting.
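
As a rough sketch of those two methods, assuming the type can be made an AbstractVector (the struct in the question is not): Dist and find_dist are hypothetical names, and plain vectors stand in for the DataFrame columns. With an array subtype, the generic copyto! fallback already covers in-place broadcasting through setindex!, so no custom copyto! is written here.

```julia
# Hypothetical stand-in for P: plain vectors replace the DataFrame columns,
# and the type is declared an AbstractVector so the array fallbacks apply.
struct Dist <: AbstractVector{Float64}
    qs::Vector{String}
    ps::Vector{Float64}
end

# Minimal array interface: a Dist behaves like its probability vector.
Base.size(d::Dist) = size(d.ps)
Base.IndexStyle(::Type{Dist}) = IndexLinear()
Base.getindex(d::Dist, i::Int) = d.ps[i]
Base.setindex!(d::Dist, v, i::Int) = (d.ps[i] = v)

# The trait: broadcasts involving a Dist dispatch on this style.
Base.BroadcastStyle(::Type{Dist}) = Broadcast.ArrayStyle{Dist}()

# Locate the first Dist among the (possibly nested) broadcast arguments.
find_dist(bc::Broadcast.Broadcasted) = find_dist(bc.args)
find_dist(args::Tuple) = find_dist(find_dist(args[1]), Base.tail(args))
find_dist(x) = x
find_dist(::Tuple{}) = nothing
find_dist(d::Dist, rest) = d
find_dist(::Any, rest) = find_dist(rest)

# Output allocation for out-of-place broadcasts such as `d .* v`.
function Base.similar(bc::Broadcast.Broadcasted{Broadcast.ArrayStyle{Dist}},
                      ::Type{T}) where {T}
    d = find_dist(bc)
    if T === Float64 && axes(bc) == axes(d)
        return Dist(copy(d.qs), similar(d.ps))   # keep the labels
    end
    return similar(Array{T}, axes(bc))           # anything else: plain Array
end

normalize!(d::Dist) = (d.ps ./= sum(d.ps); d)

function update!(d::Dist, likelihood)
    d .*= likelihood          # in place, via the generic copyto! and setindex!
    return normalize!(d)
end

prior = Dist(["heads", "tails"], [0.5, 0.5])
posterior = prior .* [0.75, 0.5]   # a new Dist with the same labels
update!(prior, [0.75, 0.5])        # modifies prior itself
```

The find_dist helper follows the pattern used in the manual's custom-array broadcasting example. Falling back to a plain Array for non-Float64 results is a design choice of this sketch, not a requirement of the interface.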

I’d recommend giving this talk a watch; its second half introduces broadcasting for a custom array:

Thanks, the lecture was pretty helpful.

I’ve progressed and now have:

using DataFrames
struct P
    df::DataFrame
end
P(q,p) = P(DataFrame(qs=q, ps=p))
Base.size(p::P) = size(p.df)
Base.getindex(p::P, i) = p.df[findfirst(==(i), p.df.qs), :ps]
Base.show(io::IO, m::MIME{Symbol("text/html")}, p::P) = Base.show(io, m, p.df)
Base.ndims(n::Type{P}) = 1
Base.Broadcast.broadcasted(::typeof(*), p::P, x) = P(p.df.qs, p.df.ps .* x)

I can now do:

g=P(["heads", "tails"], [0.5, 0.5])
g = g .* [0.2, 0.5]

However, the following doesn’t work:

g .*= [0.2, 0.5]

Julia complains about

MethodError: no method matching copyto!(::P, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(identity), Tuple{P}})

How do I define this method correctly?

function Base.copyto!(d::P, bc::Base.Broadcast.Broadcasted)
   # ???
end

Regards,
DD

EDIT:
the following seems to work:

function Base.copyto!(d::P, bc::Base.Broadcast.Broadcasted)
    if bc.f == identity
        d.df.ps = bc.args[1].df.ps
    else
       # ???
    end
    return d
end

g=P(["heads", "tails"], [0.5, 0.5])
f(g) = g .*= [0.5, 0.5]
f(g)
g

I get the expected result:

2 rows × 2 columns

     qs      ps
     String  Float64
1    heads   0.25
2    tails   0.25

However, I have no idea what goes into the else clause. In addition, this solution involves too much copying.