Usage of multiple dispatch in Base (wikipedia entry)

tim.holy · July 13, 2017, 11:07am

I happened to stumble across Wikipedia’s page on Multiple dispatch and noted that (at the time) Julia’s only mention was a single link, and all serious discussion was focused on other languages (many of which do not have native support for MD).

If you follow that link, you’ll see I fixed that . In particular, folks may be interested in the graphs I added analyzing the usage of multiple dispatch in practice. For archival purposes, here’s my analysis code:

# Pairwise comparison of argument types in method signatures
struct ArgSharingCounts
    # counts by argument position of same/different
    # argsame[n][s] or argdiff[n][s]: methods with n arguments, comparing slot s
    argsame::Vector{Vector{Int}}
    argdiff::Vector{Vector{Int}}
    # histogram of total numbers of identical arguments
    # histsame[n][k]: number of method-pairs with n arguments of which k-1 are the same
    histsame::Vector{Vector{Int}}
end
ArgSharingCounts() = ArgSharingCounts(Vector{Vector{Int}}(0), Vector{Vector{Int}}(0), Vector{Vector{Int}}(0))

Base.length(asc::ArgSharingCounts) = length(asc.histsame)

function grow_to!(asc::ArgSharingCounts, n)
    while (m = length(asc)) < n
        push!(asc.argsame, zeros(Int, m+1))
        push!(asc.argdiff, zeros(Int, m+1))
        push!(asc.histsame, zeros(Int, m+2)) # histogram has extra slot for k=0 case
    end
    asc
end

sig(m::Method) = sig(m.sig)
sig(s) = s
sig(s::UnionAll) = Base.unwrap_unionall(s)

function Base.push!(asc::ArgSharingCounts, m1::Method, m2::Method)
    p1, p2 = sig(m1).parameters, sig(m2).parameters
    (l = length(p1)) == length(p2) || return asc
    grow_to!(asc, l-1)  # first arg is typeof(f), skip it
    k = 0
    for i = 1:l-1
        if p1[i+1] == p2[i+1]
            asc.argsame[l-1][i] += 1
            k += 1
        else
            asc.argdiff[l-1][i] += 1
        end
    end
    asc.histsame[l-1][k+1] += 1
    asc
end

funcdict = Dict{Symbol,ArgSharingCounts}()
for name in names(Base)
    @show name
    f = getfield(Base, name)
    if isa(f, Function)
        mths = collect(methods(f))
        asc = ArgSharingCounts()
        for j = 1:length(mths)
            mj = mths[j]
            for i = 1:j-1
                mi = mths[i]
                push!(asc, mi, mj)
            end
        end
        funcdict[name] = asc
    end
end

# Total counts
asct = ArgSharingCounts()
for (fsym, asc) in funcdict
    grow_to!(asct, length(asc))
    for i = 1:length(asc)
        asct.argsame[i] .+= asc.argsame[i]
        asct.argdiff[i] .+= asc.argdiff[i]
        asct.histsame[i] .+= asc.histsame[i]
    end
end

# normalized analysis of 2-argument functions
nmethods = Int[]
p0 = Float64[]
p1 = Float64[]
for (fsym, asc) in funcdict
    if length(asc) >= 2
        push!(nmethods, length(collect(Iterators.filter(m->length(sig(m).parameters)==3, methods(getfield(Base, fsym))))))
        nt = sum(asc.histsame[2])
        if nt > 0
            push!(p0, asc.histsame[2][1]/nt)
            push!(p1, asc.histsame[2][2]/nt)
        end
    end
end

using Gadfly
phist = plot(x=nmethods, Guide.xlabel("with # methods"), Guide.ylabel("# 2-arg funcs"), Scale.x_log10, Geom.histogram);
pshared = plot(x=p0, Guide.xlabel("Frac. pairs, 0 shared arguments"), Guide.ylabel("Count"), Geom.histogram, Coord.Cartesian(xmin=0.0, xmax=1.0));
vstack(phist, pshared)

Tamas_Papp · July 13, 2017, 12:17pm

Nice addition.

As much as I love CL, as Julia matures and grows in popularity, it may make sense to extend (or, heaven forbid, replace) the CL examples with Julia. But that’s now something I would push for at the moment.

stevengj · July 13, 2017, 2:23pm

Honestly, it’s pretty questionable to add that data to Wikipedia, as it probably falls under the no original research ban unless you publish it elsewhere.

tim.holy · July 13, 2017, 2:26pm

Does it count if published here? It’s definitely not worth a paper. But the world is a worse place without it.

stevengj · July 13, 2017, 2:29pm

Highly questionable. Put it this way: if you were writing an academic paper, would you unquestioningly cite data that was published only by some “random person” in an online discussion board somewhere?

If it were in an official Julia blog post, however, that could arguably pass the bar. (A blog post takes more work than dashing off a script on discourse, but less work than an academic paper at least.)

tim.holy · July 13, 2017, 2:29pm

Didn’t remember that policy, but the whole reason I posted the source code here was to ensure reproducibility and verifiability. Skimming the policy, to me it looks like this is a valid use. (EDIT: my reply here was to your first post, we had a race condition in our replies.)

Steven_Sagaert · July 13, 2017, 2:34pm

If wikipedia gives you a hard time about the policy you could always put it on wikiversity and maybe put a link to the wikiversity page on the wikipedia page

stevengj · July 13, 2017, 2:34pm

Self-published primary sources are supposed to be used with extreme caution in Wikipedia. Wikipedia articles are only supposed to cite reputable sources.

(Full disclosure: I’ve been a Wikipedia editor since 2003, and actually have admin privileges there, although I’m not as active these days. The “no original research” ban and “reliable sources” policy are extremely important to Wikipedia, e.g. it is the first line of defense against science cranks.)

stevengj · July 13, 2017, 2:38pm

It seems better to replace it with the data on “quantifying multiple dispatch” that was published in the SIAM Review article (arxiv: [1411.1607] Julia: A Fresh Approach to Numerical Computing). This is especially nice because it compares multiple languages, using data from another paper.

ChrisRackauckas · July 13, 2017, 2:48pm

Is it really the same if it’s a code to reproduce the data yourself? No one has to trust @tim.holy here, they can just run the script. The true primary source here then is the Base installation of Julia itself, and what’s shown in this thread is just how one can query that data.

stevengj · July 13, 2017, 3:16pm

Yes. (You could make the same argument for novel mathematical arguments you could “check yourself”, or computer cold-fusion simulations “you could run yourself”.)

The basic epistemological issue here is that Wikipedia editors are not in a position to check the validity of the code or statistical methodology of anyone. There are no arguments about validity of code or math on Wikipedia (or at least, there aren’t supposed to be), only arguments about whether a source is reputable/notable.

A random code snippet posted to a forum somewhere is not a reputable source of statistical methodology.

tim.holy · July 13, 2017, 3:24pm

Thanks for stepping up and making the edits!

Topic		Replies	Views
Article about Multiple Dispatch Community multidispatch	7	1980	August 12, 2022
Why is multiple dispatch a feature? Internals & Design question	33	10654	August 23, 2020
Julia adoption: Method confusion New to Julia question	12	857	May 8, 2025
The Unreasonable Efficiency and Effectiveness of Multiple Dispatch: Your Favourite Examples General Usage question , multiple-dispatch	36	2847	January 10, 2025
What makes multiple dispatch so special? New to Julia	12	1521	January 5, 2023

Usage of multiple dispatch in Base (wikipedia entry)

Related topics