Seven Lines of Julia (examples sought)

JeffreySarnoff · November 19, 2020, 7:37am

I am looking for examples: show me some real computational power in seven or fewer lines of Julia that read clearly. It is ok to be using other packages, though no more than four (and it is ok to use fewer). Each example is intended to make a compelling case for Julia to skilled practitioners of whatever is done.

baggepinnen · November 19, 2020, 7:44am

This rule seems arbitrary and useless to me, packages in Julia are often small and composable making it reasonable to use many of them to accomplish what you need. You can also make a single meta package that uses many other packages and make it appear as if you’re only using one package.

JeffreySarnoff · November 19, 2020, 7:58am

I was thinking less constructively – what I meant to convey:
There are many expertly crafted packages, a few of which may be essential to demonstrating something important by virtue of their reason for being. To have more than 4 packages used would make it less easy for others to see the underlying powerful simplicity (here of composition as other’s efforts are working together with one’s current intent). I appreciate your perspective, and indeed that is something to be shown … perhaps better seen after showing some that require less of the viewer/reader as domain experts are not always equally expert at programming.

rafael.guerra · November 19, 2020, 10:09pm

Julia code for the Lorenz problem provided in Wikipedia that uses DifferentialEquations.jl.
Lorenz’s equations published in 1963 describe atmospheric convection and are very sensitive to the initial parameters. The work went unnoticed until 1972, when Lorenz gave a conference entitled “Predictability: does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?”, which made famous the “butterfly effect” and led to the chaos theory.
The electronic computer used (Royal McBee LGP-30) was according to Lorenz only 1000 times faster than manual computing… Julia code allows writing this problem compactly with clear syntax, and to collect the solution on today’s laptops within 1 ms.

using DifferentialEquations, ParameterizedFunctions, Plots
lorenz = @ode_def begin                  # define the system
    dx = σ*(y - x);  dy = x*(ρ - z) - y; dz = x*y - β*z
 end σ ρ β
u0=[1.,1.,1.]; tspan=(0.,60.); p=[10.,28.,8/3];  # initial conds., timespan and parms.
prob = ODEProblem(lorenz, u0, tspan, p)  # define the problem
sol0 = solve(prob, reltol=1e-9)          # solve it
plot(sol0, vars = (1, 2, 3))             # display it

lorenz

mthelm85 · November 19, 2020, 11:53pm

This probably violates the read clearly part, but:

using CurrentPopulationSurvey
using DataFrames
using DataFramesMeta

oct20 = cpsdata(2020,10) |> DataFrame                                           
unemployment_rate = (@linq oct20 |>
           where(map(x -> x in 3:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1] /
           (@linq oct20 |>
           where(map(x -> x in 1:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1]
0.06593909683981009

With that, I downloaded the October '20 Current Population Survey public use microdata .zip file from the Census Bureau, unzipped it, parsed it, piped it into a DataFrame, and then computed the non-seasonally-adjusted unemployment rate for the U.S. (go ahead and check it at the Bureau of Labor Statistics’ data hub ). I think that’s kind of cool.

malacroi · November 20, 2020, 1:40am

Seven is a really small number, but here goes.

Declare a Type to represent the k-th partition of the integer n in lexicographic order for which every part is a power of 2.
Add a method to Base.iterate to describe how to view such a partition one part at a time.
The body of the function.
Does end even count as a line?
Extending Base.show gives us a way to print out our partitions.
Populate an array that powers everything.
Print out all the binary partitions of size at most 30

struct BinaryPartition k::Int64; n::Int64; b::Vector{Int64} end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing ; k<=b.b[(n-1)>>1+1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-b.b[(n-1)>>1+1],n>>1,i+1))
end
Base.show(io::IO, b::BinaryPartition) = (f=0; for x in b print(io,(f+=1) == 1 ? "" : "+",x) end)
maxn = 30; rt = Vector{Int64}(undef,maxn+1) ; rt[1] = 1 ; for i=2:maxn+1 rt[i] = rt[i-1]+rt[(i+1)>>1] end
for n=1:maxn, k=1:rt[n>>1+1] println(n," = ",BinaryPartition(k,n,rt)) end

These are admittedly some very compressed lines, and the interplay between line 3 and line 6 is pretty magical, and specific to this use case. It’s probably not even showing off computational power, but it does highlight interface flexibility.

JeffreySarnoff · November 20, 2020, 2:34am

Ok – its much less about counting to seven than it is about being clear and coherent in the eye of the perceiver. Would you do that again without such “run on lines” please.
end and a comment or an empty line are not counted
comments should reduce the “line” count, not add to it

kevbonham · November 20, 2020, 2:53am

Principle coordinates analysis with a plot. CSV is not really necessary, could do the same with readdlm but I don’t feel like looking up the syntax. This is not tested, but it’s something like this:

using CSV, Distances, MultivariateStats, StatsPlots

data = CSV.read("data.csv", Matrix, skipto=2)
dm = pairwise(data, BrayCurtis()) # distance matrix
pcoa = fit(MDS, dm, distances=true) # multi dimensional scaling

plot(pcoa)

The thing about this that’s most amazing to me is that the third and fifth lines make use of
things that I (partly) implemented with small PRs to amazing and widely used packages, and the maintainers for those packages were so kind, thoughtful, and patient in guiding me through that (and improving my code).

malacroi · November 20, 2020, 4:26am

We’ll call it seven steps of Julia.

Make a Type, BinaryPartitionNumberTable to hold a sequence of values with a custom constructor to populate the table from an initial value and a recurrence.
The table has repeated values, and needs to be indexed from 0 to make sense, so we add a method to Base.getindex() for our type to implement the indexing.
Actually generate a table.
The table has all the information we need to list the parts of the k-th binary partition of n in lexicographic order. Create a Type, BinaryPartition to represent this intent.
Since what we really want to do with these partitions is look at them one part at a time, instead of constructing the partitions, we add new method to Base.iterate() describing how to produce one part of one partition, and which partition describes the remaining parts. The iterator doesn’t know how many parts will be produced ahead of time, but it does know that each part is an Int64, so we implement Base.IteratorSize() and Base.eltype() for our type to convey this.
We would also like to view our partitions, so we add a method to Base.show(). A lot of flexibility comes here, since the show method can be aware of the MIME-type of target output, and produce different results depending on context. For now, we just want to write a partition as a parenthesized sum, but we could just as easily write LaTeX to pretty print it, or do something that would make our data easy to copy and paste somewhere else in our workflow.
Actually view our partitions.

# A table of the number of binary partitions of integers starting at 0
# These are defined by the recurrence b[2n]=b[2n-1]+b[n], and b[2n+1]=b[2n]
# with the initial condition b[0]=1.  Since each value is duplicated,
# we back the type with a vector half the size and implement custom indexing,
# and only actually compute the even terms.
struct BinaryPartitionNumberTable
    table::Vector{Int64}
    function BinaryPartitionNumberTable(maxn)
        # rt[i+1] = b[2i]
        rt = Vector{Int64}(undef,maxn>>1+1)
        rt[1] = 1 # Initial condition
        for i=2:maxn>>1+1 
            rt[i] = rt[i-1] + rt[(i+1)>>1] # Recursion
        end
        return new(rt)
    end
end
# Implement custom indexing for our table.
Base.getindex(b::BinaryPartitionNumberTable, i) = b.table[i>>1+1]
Base.firstindex(b::BinaryPnums) = 0
Base.length(b::BinaryPnums) = 2*length(b.table)
Base.lastindex(b::BinaryPnums) = length(b)-1
Base.iterate(b::BinaryPnums, i=0) = i<length(b) ? (b[i],i+1) : nothing

maxn = 30 
const bpt = BinaryPartitionNumberTable(maxn)

# Given our table, we can iterate over the parts of a partition, only constructing
# it one part at a time as needed.  We make a type to hold initial values for
# the iteration.
struct BinaryPartition 
    k::Int64 # The k'th binary partition
    n::Int64 # of the integer n
end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing
    # For lexicographic order, the first bpt[n-1] partitions are a 1 followed by
    # a partition of n-1, the remaining ones are double a partition of n/2.
    # Either output 2^doublings encountered so far and tell the iterator to
    # proceed to n-1, or proceed directly to n/2, telling the iterator to
    # add an extra power of 2 to every remaining part.
    k<=bpt[n-1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-bpt[n-1],n>>1,i+1))
end
Base.IteratorSize(::BinaryPartition) = Base.SizeUnknown() # Permits collect()
Base.eltype(::BinaryPartition) = Int64 # Lets collect use an array of the correct type

# We can now define a custom rule for displaying our partitions.
function Base.show(io::IO, b::BinaryPartition) 
    termnumber = 0
    for x in b print(io,(termnumber+=1)==1 ? "(" : "+",x) end
    print(io,")")
end

for n=1:maxn, k=1:bpt[n] println(n,"=",BinaryPartition(k,n)) end

I really like the flexibility for defining interactions with user-defined types. Indexing, iteration, and display rules can all be quickly extended.

FedericoStra · November 20, 2020, 6:46pm

The code is a bit golfed and I’m not exactly sure how to count the lines.

using Plots
function mandelbrot(z) w = z
    for n in 1:74
        abs2(w) < 4 ? w = w^2 + z : return n
    end; 75
end
x, y = range(-0.65, -0.45; length=1600), range(0.51, 0.71; length=1600)
heatmap(x, y, -log.(mandelbrot.(x' .+ y .* im)); aspect_ratio=1)

This is the output:
mandel

kevbonham · November 20, 2020, 7:27pm

Here’s a shorter version actually. Doesn’t use Bray-Curtis dissimilarity, but it gives a good idea of the point

using MultivariateStats, RDatasets, StatsPlots

iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
M = fit(MDS, X'; maxoutdim=2)

plot(M, group=iris.Species)

FedericoStra · November 20, 2020, 11:20pm

using Plots, StaticArrays
const θ = π/3; const m = SA[cos(θ) -sin(θ); sin(θ) cos(θ)]
const p = SA[1/3,0]; const q = p + m*p; const r = 2p
f(n) = n == 0 ? [0,0] : (v = f(n-1)/3; [v p.+m*v q.+m'*v r.+v])
a = f(6); b = [a [1,0] .+ (m*m)'*a [1/2,-sin(θ)] .+ (m*m)*a [0,0]]
plot(b[1,:], b[2,:]; aspect_ratio=1, size=(800,800), legend=false)

I’ll let you discover what it produces.

Spoiler

Koch snowflake:
koch

rafael.guerra · November 21, 2020, 12:36am

π is the most studied and fascinating number in the history of our world. Like other technical languages, Julia provides a constant for π but that is special (see blog).
The following code, edited from StackOverflow, prints π to N-digits after decimal point (c-digits per row), without rounding-off the last one. It plots also the relative frequency of each digit, computed using StatsBase. Some of Julia language features used: unicode symbols, arbitrary-precision BigFloats, string concatenation as a “product or power” and comprehensions:

using StatsBase, Plots
N, c = 1789, 80; r = mod(N,c)  # N-digits, print c digits/row, r=remainder
setprecision(Int(ceil(log2(10)*(N+1)))) do
    pi_str = string(BigFloat(pi))[1:N+2] * ' '^(c-r)  # pad with blanks
    println("π to " * string(N) * " digits past decimal point: \n" * pi_str[1:2])
    [println("  " * pi_str[i:i+c-1]) for i in 3:c:N+3-r]
    m = sort(countmap(pi_str))  # counts occurences of each digit
    plot(collect(keys(m))[3:end], collect(values(m))[3:end]/N, t=[:bar], leg=false, title="π digits frequency")
    # savefig("pi_digits.png")
end

Result:

π to 1789 digits past decimal point: 
3.
  14159265358979323846264338327950288419716939937510582097494459230781640628620899
  86280348253421170679821480865132823066470938446095505822317253594081284811174502
  84102701938521105559644622948954930381964428810975665933446128475648233786783165
  27120190914564856692346034861045432664821339360726024914127372458700660631558817
  48815209209628292540917153643678925903600113305305488204665213841469519415116094
  33057270365759591953092186117381932611793105118548074462379962749567351885752724
  89122793818301194912983367336244065664308602139494639522473719070217986094370277
  05392171762931767523846748184676694051320005681271452635608277857713427577896091
  73637178721468440901224953430146549585371050792279689258923542019956112129021960
  86403441815981362977477130996051870721134999999837297804995105973173281609631859
  50244594553469083026425223082533446850352619311881710100031378387528865875332083
  81420617177669147303598253490428755468731159562863882353787593751957781857780532
  17122680661300192787661119590921642019893809525720106548586327886593615338182796
  82303019520353018529689957736225994138912497217752834791315155748572424541506959
  50829533116861727855889075098381754637464939319255060400927701671139009848824012
  85836160356370766010471018194295559619894676783744944825537977472684710404753464
  62080466842590694912933136770289891521047521620569660240580381501935112533824300
  35587640247496473263914199272604269922796782354781636009341721641219924586315030
  28618297455570674983850549458858692699569092721079750930295532116534498720275596
  02364806654991198818347977535663698074265425278625518184175746728909777727938000
  81647060016145249192173217214772350141441973568548161361157352552133475741849468
  43852332390739414333454776241686251898356948556209921922218427255025425688767179
  04946016534668049886272327917

pi_digits

mforets · November 21, 2020, 1:06am

A visual proof of positive invariance in 7 LOC:

using ReachabilityAnalysis, Plots

@taylorize function vanderpol!(dx, x, p, t)
    dx[1] = x[2]; dx[2] = x[2] * (1 - x[1]^2) - x[1]
end

X0 = (1.25 .. 1.55) × (2.35 .. 2.45)
prob = @ivp(x' = vanderpol!(x), dim=2, x(0) ∈ X0)

sol = solve(prob, tspan=(0.0, 7.0))

plot(sol, vars=(1, 2), xlabel="x₁", ylabel="x₂")

Further details of what is done can be found in ARCH-COMP20 Category Report: Continuous and Hybrid Systems with Nonlinear Dynamics.

Raf · November 21, 2020, 2:39am

A complete definition of the game of life, and a simulation in the REPL:

using DynamicGrids
const born_survive = (0, 0, 0, 1, 0, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0, 0, 0, 0)
rule = Neighbors(Moore(1)) do neighborhood, state
    born_survive[state + 1][sum(neighborhood) + 1]
end
output = REPLOutput(rand(Bool, 100, 100); tspan=1:1000, fps=50, color=:blue)
sim!(output, rule)

(A description of what is going on as requested by @viralbshah)

This cellular automata rule is run over the grid defined by rand(Bool, 100, 100), for every cell at every timestep in tstep. The neighborhood is an iterable over the surrounding cells. In the most well-known game of life rule, a cell becomes active when the surrounding 8 cells (a “Moore” neighborhood of radius 1) sum to 2. If the cell is already active it will stay active when the sum is 2 or 3. Swapping the 1s and 0s in born_survive will give you different GOL rules.

                             █ █  ▄▀▀▄             ▄█▄▀█▄▀▀▄      
      ▀▀▀                     ▀    ▀▀             ▀   ▄  ██▀      
    █     █                                         ▄   ██▄       
    ▀     ▀                           ▄▄▄             ▀  ▀     ▄▀▀
      ▀▀▀                                              ▄        ▀▀
                             ██     ██                 █          
                                       ▄           ▄▄▄   ▄▄▄      
                                       █               ▄       ▄▀▄
                                           ██          █       ▀▄▀
   ▄▄                              ██                             
   ▀▀                       ██                      ██   ██       
                                       ▄▄                         
                      ▄▀▄              ▀▀                        ▄
                     ▀▄▄▀       ▄▄▄   ▀▀▄▀ █▄                    ▀
                               █▀▀█ █   ▄▄ ▀                █▄█   
                            ▄   ▀    ▄█▀▄▄▄▄    ██           ▄▄█  
      ▄▄      ▀▀▀▄        ▄█▀▀▄     ▀▄▄▀▄▀  ▄▀█▄▄▄▀         ▄█ ▄ ▀
      ▀▀    █     ▀▄     ▄   ▄▀█    █▄▀ ▀█▄▄    ▀▀           ▀▄▀▄▀
            ▀▄     █   ▄    ▄█▀    ▀ █   ▀▄ █                     
              ▀▄▄▄   ▀▀█   ▀█▄    ▀  ▀▄█   ▀         █▀▀ ▄        
     ▄▀ ▀▄▄              ▄   █▀▄   █ ▀ ▄ ▄█          ▀ ▀▀  ▄▄▄    
  ▄▀▄█▀ ▄█ █              ▄▄  █▀    ██ █▄▀    ▄▀▄        ▄        
 ▀▄ ▄█    ▀                ▀█▀        ▀       ▀▄▀        █        
    ▀▀                            ██                              
       ▄                                                          
      ▀▄█    ▄▄                                                   
             ▀▀                                    ▄              
                      ▀▄                          █ ▀▄

FedericoStra · November 21, 2020, 3:16pm

Hello World! in Brainfuck:

function bf(s) m, l, i, j, o = zeros(UInt8,2^15), Int[], 1, 1, UInt8(1)
  while i<=length(s) c=s[i]
    c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o : c=='-' ? m[j]-=o :
    c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? push!(l,i+1) : c==']' ? (m[j]!=0 ? (i=last(l);continue) : pop!(l)) : 0
    i+=1 end end
bf("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.")

The implementation is non-standard because when it encounters the character '[' it unconditionally executes the block at least once instead of skipping to the matching ']' if the current byte is zero. In terms of the translation table described here, it is as if '[' were translated to do {...} while (*ptr) instead of while (*ptr) {...}.

Edit 1

Here is a version with the correct implementation for '['. The function jmp is used to jump to the instruction after the matching ']'.

jmp(s,i) = (n=1; i+=1; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i)
function bf(s) m=zeros(UInt8,2^15); l=Int[]; i=j=1; o=UInt8(1)
  while i<=length(s) c=s[i]; c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o :
    c=='-' ? m[j]-=o : c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? (m[j]==0 ? (i=jmp(s,i);continue) : push!(l,i+1)) :
    c==']' ? (m[j]!=0 ? (i=l[end];continue) : pop!(l)) : 0; i+=1 end end
bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")

The Brainfuck code at the end generates the Thue–Morse sequence (I hope: I haven’t checked) and is taken from here. The resulting program is non-terminating.

Some comments about the code

s is the Brainfuck source code, m is the memory tape, i is the current instruction index, c is the current instruction code character, j is the index of the current memory byte, l is the loop stack, o is one as UInt8 used to increment/decrement the memory.

Edit 2

Here is a version that transpiles Brainfuck to Julia code using metaprogramming.

const t = Dict('>'=>:(j+=1), '<'=>:(j-=1), '+'=>:(m[j]+=UInt8(1)), '-'=>:(m[j]-=UInt8(1)), '.'=>:(print(Char(m[j]))), ','=>:(m[j]=read(stdin,UInt8)))
le(s) = (n=1; i=2; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i-1)
function bf(s) if isempty(s) nothing
  elseif s[1]=='[' l=le(s); n=l+1; :( while m[j]!=0; $(bf(s[2:l-1])) end; $(bf(s[l+1:end])) )
  else :($(get(t,s[1],nothing)); $(bf(s[2:end]))) end end
macro bf(s) :(()->(m=zeros(UInt8,2^15); j=1; $(bf(s)))) end
@bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()

The function le is used to find the loop end (that is, the index of the matching ]). The function bf transforms a Brainfuck string to a Julia expression. The macro @bf returns an anonymous function that initializes the memory, the memory index and executes the code generated by bf.

The code generated by bf is not particularly nice to read because it is deeply nested. There are better ways to generate the code but they require more lines (at least, I haven’t found a smart way to do it). You can inspect the code of the function generated by the macro with

@code_warntype @bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()

Glen · November 22, 2020, 4:17am

using Measurements, Unitful, Unitful.DefaultSymbols
# Calculate the thickness of tinfoil including uncertainties
ρ = (2.7 ± 0.2)g/cm^3    # density of aluminum
# For a roll of tinfoil
mass = (250 ± 10)g
width = (30.5 ± 0.2)cm
length = (14.24 ± 0.2)m
# ρ = mass/(width*length*thickness)
@show thickness = μm(mass/(ρ*width*length))

thickness = μm(mass / (ρ * width * length)) = 21.3 ± 1.8 μm

jlperla · November 22, 2020, 4:57am

Pretty much any of the examples in https://github.com/JuliaApproximation/ApproxFun.jl . Many are less then 7 lines already.

TheCedarPrince · November 22, 2020, 5:18am

You may find this interesting! Was benchmarking the latest on Arrow with its work on DataFrames and a personal project I am working on:

Also, this is just nice to show the clarity of Julia and its ease of use:

FedericoStra · November 22, 2020, 2:54pm

You probably had the global variable m already defined before using Unitful.DefaultSymbols.