Seven Lines of Julia (examples sought)

This probably violates the read clearly part, but:

using CurrentPopulationSurvey
using DataFrames
using DataFramesMeta

oct20 = cpsdata(2020,10) |> DataFrame                                           
unemployment_rate = (@linq oct20 |>
           where(map(x -> x in 3:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1] /
           (@linq oct20 |>
           where(map(x -> x in 1:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1]

With that, I downloaded the October '20 Current Population Survey public use microdata .zip file from the Census Bureau, unzipped it, parsed it, piped it into a DataFrame, and then computed the non-seasonally-adjusted unemployment rate for the U.S. (go ahead and check it at the Bureau of Labor Statistics’ data hub :slightly_smiling_face:). I think that’s kind of cool.


Seven is a really small number, but here goes.

  1. Declare a Type to represent the k-th partition of the integer n in lexicographic order for which every part is a power of 2.
  2. Add a method to Base.iterate to describe how to view such a partition one part at a time.
  3. The body of the function.
  4. Does end even count as a line?
  5. Extending gives us a way to print out our partitions.
  6. Populate an array that powers everything.
  7. Print out all the binary partitions of size at most 30
struct BinaryPartition k::Int64; n::Int64; b::Vector{Int64} end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing ; k<=b.b[(n-1)>>1+1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-b.b[(n-1)>>1+1],n>>1,i+1))
end, b::BinaryPartition) = (f=0; for x in b print(io,(f+=1) == 1 ? "" : "+",x) end)
maxn = 30; rt = Vector{Int64}(undef,maxn+1) ; rt[1] = 1 ; for i=2:maxn+1 rt[i] = rt[i-1]+rt[(i+1)>>1] end
for n=1:maxn, k=1:rt[n>>1+1] println(n," = ",BinaryPartition(k,n,rt)) end

These are admittedly some very compressed lines, and the interplay between line 3 and line 6 is pretty magical, and specific to this use case. It’s probably not even showing off computational power, but it does highlight interface flexibility.

1 Like

Ok – its much less about counting to seven than it is about being clear and coherent in the eye of the perceiver. Would you do that again without such “run on lines” :slight_smile: please.
end and a comment or an empty line are not counted
comments should reduce the “line” count, not add to it

<the different approaches and perspectives are helpful, leading to more exemplary arrows for the compelling Julia code quiver>


Principle coordinates analysis with a plot. CSV is not really necessary, could do the same with readdlm but I don’t feel like looking up the syntax. This is not tested, but it’s something like this:

using CSV, Distances, MultivariateStats, StatsPlots

data ="data.csv", Matrix, skipto=2)
dm = pairwise(data, BrayCurtis()) # distance matrix
pcoa = fit(MDS, dm, distances=true) # multi dimensional scaling


The thing about this that’s most amazing to me is that the third and fifth lines make use of
things that I (partly) implemented with small PRs to amazing and widely used packages, and the maintainers for those packages were so kind, thoughtful, and patient in guiding me through that (and improving my code).


We’ll call it seven steps of Julia.

  1. Make a Type, BinaryPartitionNumberTable to hold a sequence of values with a custom constructor to populate the table from an initial value and a recurrence.
  2. The table has repeated values, and needs to be indexed from 0 to make sense, so we add a method to Base.getindex() for our type to implement the indexing.
  3. Actually generate a table.
  4. The table has all the information we need to list the parts of the k-th binary partition of n in lexicographic order. Create a Type, BinaryPartition to represent this intent.
  5. Since what we really want to do with these partitions is look at them one part at a time, instead of constructing the partitions, we add new method to Base.iterate() describing how to produce one part of one partition, and which partition describes the remaining parts. The iterator doesn’t know how many parts will be produced ahead of time, but it does know that each part is an Int64, so we implement Base.IteratorSize() and Base.eltype() for our type to convey this.
  6. We would also like to view our partitions, so we add a method to A lot of flexibility comes here, since the show method can be aware of the MIME-type of target output, and produce different results depending on context. For now, we just want to write a partition as a parenthesized sum, but we could just as easily write LaTeX to pretty print it, or do something that would make our data easy to copy and paste somewhere else in our workflow.
  7. Actually view our partitions.
# A table of the number of binary partitions of integers starting at 0
# These are defined by the recurrence b[2n]=b[2n-1]+b[n], and b[2n+1]=b[2n]
# with the initial condition b[0]=1.  Since each value is duplicated,
# we back the type with a vector half the size and implement custom indexing,
# and only actually compute the even terms.
struct BinaryPartitionNumberTable
    function BinaryPartitionNumberTable(maxn)
        # rt[i+1] = b[2i]
        rt = Vector{Int64}(undef,maxn>>1+1)
        rt[1] = 1 # Initial condition
        for i=2:maxn>>1+1 
            rt[i] = rt[i-1] + rt[(i+1)>>1] # Recursion
        return new(rt)
# Implement custom indexing for our table.
Base.getindex(b::BinaryPartitionNumberTable, i) = b.table[i>>1+1]
Base.firstindex(b::BinaryPnums) = 0
Base.length(b::BinaryPnums) = 2*length(b.table)
Base.lastindex(b::BinaryPnums) = length(b)-1
Base.iterate(b::BinaryPnums, i=0) = i<length(b) ? (b[i],i+1) : nothing

maxn = 30 
const bpt = BinaryPartitionNumberTable(maxn)

# Given our table, we can iterate over the parts of a partition, only constructing
# it one part at a time as needed.  We make a type to hold initial values for
# the iteration.
struct BinaryPartition 
    k::Int64 # The k'th binary partition
    n::Int64 # of the integer n
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing
    # For lexicographic order, the first bpt[n-1] partitions are a 1 followed by
    # a partition of n-1, the remaining ones are double a partition of n/2.
    # Either output 2^doublings encountered so far and tell the iterator to
    # proceed to n-1, or proceed directly to n/2, telling the iterator to
    # add an extra power of 2 to every remaining part.
    k<=bpt[n-1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-bpt[n-1],n>>1,i+1))
Base.IteratorSize(::BinaryPartition) = Base.SizeUnknown() # Permits collect()
Base.eltype(::BinaryPartition) = Int64 # Lets collect use an array of the correct type

# We can now define a custom rule for displaying our partitions.
function, b::BinaryPartition) 
    termnumber = 0
    for x in b print(io,(termnumber+=1)==1 ? "(" : "+",x) end

for n=1:maxn, k=1:bpt[n] println(n,"=",BinaryPartition(k,n)) end

I really like the flexibility for defining interactions with user-defined types. Indexing, iteration, and display rules can all be quickly extended.


The code is a bit golfed and I’m not exactly sure how to count the lines.

using Plots
function mandelbrot(z) w = z
    for n in 1:74
        abs2(w) < 4 ? w = w^2 + z : return n
    end; 75
x, y = range(-0.65, -0.45; length=1600), range(0.51, 0.71; length=1600)
heatmap(x, y, -log.(mandelbrot.(x' .+ y .* im)); aspect_ratio=1)

This is the output:


Here’s a shorter version actually. Doesn’t use Bray-Curtis dissimilarity, but it gives a good idea of the point :man_shrugging:

using MultivariateStats, RDatasets, StatsPlots

iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
M = fit(MDS, X'; maxoutdim=2)

plot(M, group=iris.Species)


using Plots, StaticArrays
const θ = π/3; const m = SA[cos(θ) -sin(θ); sin(θ) cos(θ)]
const p = SA[1/3,0]; const q = p + m*p; const r = 2p
f(n) = n == 0 ? [0,0] : (v = f(n-1)/3; [v p.+m*v q.+m'*v r.+v])
a = f(6); b = [a [1,0] .+ (m*m)'*a [1/2,-sin(θ)] .+ (m*m)*a [0,0]]
plot(b[1,:], b[2,:]; aspect_ratio=1, size=(800,800), legend=false)

I’ll let you discover what it produces.


Koch snowflake:


π is the most studied and fascinating number in the history of our world. Like other technical languages, Julia provides a constant for π but that is special (see blog).
The following code, edited from StackOverflow, prints π to N-digits after decimal point (c-digits per row), without rounding-off the last one. It plots also the relative frequency of each digit, computed using StatsBase. Some of Julia language features used: unicode symbols, arbitrary-precision BigFloats, string concatenation as a “product or power” and comprehensions:

using StatsBase, Plots
N, c = 1789, 80; r = mod(N,c)  # N-digits, print c digits/row, r=remainder
setprecision(Int(ceil(log2(10)*(N+1)))) do
    pi_str = string(BigFloat(pi))[1:N+2] * ' '^(c-r)  # pad with blanks
    println("π to " * string(N) * " digits past decimal point: \n" * pi_str[1:2])
    [println("  " * pi_str[i:i+c-1]) for i in 3:c:N+3-r]
    m = sort(countmap(pi_str))  # counts occurences of each digit
    plot(collect(keys(m))[3:end], collect(values(m))[3:end]/N, t=[:bar], leg=false, title="π digits frequency")
    # savefig("pi_digits.png")


π to 1789 digits past decimal point: 



A visual proof of positive invariance in 7 LOC:

using ReachabilityAnalysis, Plots

@taylorize function vanderpol!(dx, x, p, t)
    dx[1] = x[2]; dx[2] = x[2] * (1 - x[1]^2) - x[1]

X0 = (1.25 .. 1.55) × (2.35 .. 2.45)
prob = @ivp(x' = vanderpol!(x), dim=2, x(0) ∈ X0)

sol = solve(prob, tspan=(0.0, 7.0))

plot(sol, vars=(1, 2), xlabel="x₁", ylabel="x₂")

Further details of what is done can be found in ARCH-COMP20 Category Report: Continuous and Hybrid Systems with Nonlinear Dynamics.


A complete definition of the game of life, and a simulation in the REPL:

using DynamicGrids
const born_survive = (0, 0, 0, 1, 0, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0, 0, 0, 0)
rule = Neighbors(Moore(1)) do neighborhood, state
    born_survive[state + 1][sum(neighborhood) + 1]
output = REPLOutput(rand(Bool, 100, 100); tspan=1:1000, fps=50, color=:blue)
sim!(output, rule)

(A description of what is going on as requested by @viralbshah)

This cellular automata rule is run over the grid defined by rand(Bool, 100, 100), for every cell at every timestep in tstep. The neighborhood is an iterable over the surrounding cells. In the most well-known game of life rule, a cell becomes active when the surrounding 8 cells (a “Moore” neighborhood of radius 1) sum to 2. If the cell is already active it will stay active when the sum is 2 or 3. Swapping the 1s and 0s in born_survive will give you different GOL rules.

                             █ █  ▄▀▀▄             ▄█▄▀█▄▀▀▄      
      ▀▀▀                     ▀    ▀▀             ▀   ▄  ██▀      
    █     █                                         ▄   ██▄       
    ▀     ▀                           ▄▄▄             ▀  ▀     ▄▀▀
      ▀▀▀                                              ▄        ▀▀
                             ██     ██                 █          
                                       ▄           ▄▄▄   ▄▄▄      
                                       █               ▄       ▄▀▄
                                           ██          █       ▀▄▀
   ▄▄                              ██                             
   ▀▀                       ██                      ██   ██       
                      ▄▀▄              ▀▀                        ▄
                     ▀▄▄▀       ▄▄▄   ▀▀▄▀ █▄                    ▀
                               █▀▀█ █   ▄▄ ▀                █▄█   
                            ▄   ▀    ▄█▀▄▄▄▄    ██           ▄▄█  
      ▄▄      ▀▀▀▄        ▄█▀▀▄     ▀▄▄▀▄▀  ▄▀█▄▄▄▀         ▄█ ▄ ▀
      ▀▀    █     ▀▄     ▄   ▄▀█    █▄▀ ▀█▄▄    ▀▀           ▀▄▀▄▀
            ▀▄     █   ▄    ▄█▀    ▀ █   ▀▄ █                     
              ▀▄▄▄   ▀▀█   ▀█▄    ▀  ▀▄█   ▀         █▀▀ ▄        
     ▄▀ ▀▄▄              ▄   █▀▄   █ ▀ ▄ ▄█          ▀ ▀▀  ▄▄▄    
  ▄▀▄█▀ ▄█ █              ▄▄  █▀    ██ █▄▀    ▄▀▄        ▄        
 ▀▄ ▄█    ▀                ▀█▀        ▀       ▀▄▀        █        
    ▀▀                            ██                              
      ▀▄█    ▄▄                                                   
             ▀▀                                    ▄              
                      ▀▄                          █ ▀▄            


Hello World! in Brainfuck:

function bf(s) m, l, i, j, o = zeros(UInt8,2^15), Int[], 1, 1, UInt8(1)
  while i<=length(s) c=s[i]
    c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o : c=='-' ? m[j]-=o :
    c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? push!(l,i+1) : c==']' ? (m[j]!=0 ? (i=last(l);continue) : pop!(l)) : 0
    i+=1 end end

The implementation is non-standard because when it encounters the character '[' it unconditionally executes the block at least once instead of skipping to the matching ']' if the current byte is zero. In terms of the translation table described here, it is as if '[' were translated to do {...} while (*ptr) instead of while (*ptr) {...}.

Edit 1

Here is a version with the correct implementation for '['. The function jmp is used to jump to the instruction after the matching ']'.

jmp(s,i) = (n=1; i+=1; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i)
function bf(s) m=zeros(UInt8,2^15); l=Int[]; i=j=1; o=UInt8(1)
  while i<=length(s) c=s[i]; c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o :
    c=='-' ? m[j]-=o : c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? (m[j]==0 ? (i=jmp(s,i);continue) : push!(l,i+1)) :
    c==']' ? (m[j]!=0 ? (i=l[end];continue) : pop!(l)) : 0; i+=1 end end

The Brainfuck code at the end generates the Thue–Morse sequence (I hope: I haven’t checked) and is taken from here. The resulting program is non-terminating.

Some comments about the code

s is the Brainfuck source code, m is the memory tape, i is the current instruction index, c is the current instruction code character, j is the index of the current memory byte, l is the loop stack, o is one as UInt8 used to increment/decrement the memory.

Edit 2

Here is a version that transpiles Brainfuck to Julia code using metaprogramming.

const t = Dict('>'=>:(j+=1), '<'=>:(j-=1), '+'=>:(m[j]+=UInt8(1)), '-'=>:(m[j]-=UInt8(1)), '.'=>:(print(Char(m[j]))), ','=>:(m[j]=read(stdin,UInt8)))
le(s) = (n=1; i=2; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i-1)
function bf(s) if isempty(s) nothing
  elseif s[1]=='[' l=le(s); n=l+1; :( while m[j]!=0; $(bf(s[2:l-1])) end; $(bf(s[l+1:end])) )
  else :($(get(t,s[1],nothing)); $(bf(s[2:end]))) end end
macro bf(s) :(()->(m=zeros(UInt8,2^15); j=1; $(bf(s)))) end

The function le is used to find the loop end (that is, the index of the matching ]). The function bf transforms a Brainfuck string to a Julia expression. The macro @bf returns an anonymous function that initializes the memory, the memory index and executes the code generated by bf.

The code generated by bf is not particularly nice to read because it is deeply nested. There are better ways to generate the code but they require more lines (at least, I haven’t found a smart way to do it). You can inspect the code of the function generated by the macro with

@code_warntype @bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()
using Measurements, Unitful, Unitful.DefaultSymbols
# Calculate the thickness of tinfoil including uncertainties
ρ = (2.7 ± 0.2)g/cm^3    # density of aluminum
# For a roll of tinfoil
mass = (250 ± 10)g
width = (30.5 ± 0.2)cm
length = (14.24 ± 0.2)m
# ρ = mass/(width*length*thickness)
@show thickness = μm(mass/(ρ*width*length))

thickness = μm(mass / (ρ * width * length)) = 21.3 ± 1.8 μm


Pretty much any of the examples in . Many are less then 7 lines already.


You may find this interesting! Was benchmarking the latest on Arrow with its work on DataFrames and a personal project I am working on:

Also, this is just nice to show the clarity of Julia and its ease of use:


You probably had the global variable m already defined before using Unitful.DefaultSymbols.

1 Like

I’m loving all the examples being presented here. I would like to encourage folks to submit as many different kinds of examples as possible, and I’ll work on collecting them and getting them onto and generally on to social media with something like a #WhyJulia or #WorldOfJulia tag. I think @jiahao used to maintain such a thing a while ago for presentations and such.

It would greatly help to have a 3-5 line description of what it is doing, and why it is interesting, to go along with the code snippet.



The examples are nice and interesting, but IMO <10 LOC is not where Julia really shines. When not relying on packages, this domain is mostly about surface syntax tricks baked into the language.

The amazing powers of the language start to show around 100 LOC, and come into play fully from 1-5 kLOC, replicating functionality and speed of much larger libraries in other languages.


Agree with Tamas, but it is also nice to have a couple of very short examples. Here a Gibbs sampler sampling the posterior parameters of normal observations with mean distributed a priori as N(μ0, σ0) and precision 1/σ^2 a priori as Gamma(α, β):

using Distributions, Statistics
function mc(x, iterations, μ0 = 0., τ0 = 1e-7, α = 0.0001, β = 0.0001)
    sumx, n = sum(x), length(x)
    μ, τ = sumx/n, 1/var(x)
    samples = [(μ, τ)]
    for i in 1:iterations
        μ = rand(Normal((τ0*μ0 + τ*sumx)/(τ0 + n*τ), 1/sqrt(τ0 + n*τ)))
        τ = rand(Gamma(α + n/2, 1/(β + 0.5*sum((xᵢ-μ)^2 for xᵢ in x))))
        push!(samples, (μ, τ))


samples = mc(rand(Normal(0.2, 1.7^(-0.5)), 1_000), 10_000)
μs, τs = first.(samples), last.(samples)
println("mean μ = ", mean(μs), " ± ", std(μs))
println("precision τ = ", mean(τs), " ± ", std(τs))


mean μ = 0.2076960556449643 ± 0.023693641548604788
precision τ = 1.8301418451485263 ± 0.08225999887566306


Couldn’t this be written without the need for a loop and push!? I think this would get you to seven lines while also being to my eye quite readable:

using Distributions, Statistics

function mc(x, iterations, μ₀ = 0., τ₀ = 1e-7, α = 0.0001, β = 0.0001)
    ∑x, n = sum(x), length(x)
    μ, τ = ∑x/n, 1/var(x)
    μₛ = rand(Normal((τ₀*μ₀ + τ*∑x)/(τ₀ + n*τ), 1/√(τ₀ + n*τ)), iterations)
    τₛ = rand(Gamma(α + n/2, 1/(β + 0.5*sum((xᵢ-μ)^2 for xᵢ in x))), iterations)
    samples = [(μ, τ); collect(zip(μₛ, τₛ))]