Seven Lines of Julia (examples sought)

This probably violates the read clearly part, but:

using CurrentPopulationSurvey
using DataFrames
using DataFramesMeta

oct20 = cpsdata(2020,10) |> DataFrame                                           
unemployment_rate = (@linq oct20 |>
           where(map(x -> x in 3:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1] /
           (@linq oct20 |>
           where(map(x -> x in 1:4, :pemlr)) |>
           combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1]
0.06593909683981009

With that, I downloaded the October '20 Current Population Survey public use microdata .zip file from the Census Bureau, unzipped it, parsed it, piped it into a DataFrame, and then computed the non-seasonally-adjusted unemployment rate for the U.S. (go ahead and check it at the Bureau of Labor Statistics’ data hub :slightly_smiling_face:). I think that’s kind of cool.

22 Likes

Seven is a really small number, but here goes.

  1. Declare a Type to represent the k-th partition of the integer n in lexicographic order for which every part is a power of 2.
  2. Add a method to Base.iterate to describe how to view such a partition one part at a time.
  3. The body of the function.
  4. Does end even count as a line?
  5. Extending Base.show gives us a way to print out our partitions.
  6. Populate an array that powers everything.
  7. Print out all the binary partitions of size at most 30
struct BinaryPartition k::Int64; n::Int64; b::Vector{Int64} end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing ; k<=b.b[(n-1)>>1+1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-b.b[(n-1)>>1+1],n>>1,i+1))
end
Base.show(io::IO, b::BinaryPartition) = (f=0; for x in b print(io,(f+=1) == 1 ? "" : "+",x) end)
maxn = 30; rt = Vector{Int64}(undef,maxn+1) ; rt[1] = 1 ; for i=2:maxn+1 rt[i] = rt[i-1]+rt[(i+1)>>1] end
for n=1:maxn, k=1:rt[n>>1+1] println(n," = ",BinaryPartition(k,n,rt)) end

These are admittedly some very compressed lines, and the interplay between line 3 and line 6 is pretty magical, and specific to this use case. It’s probably not even showing off computational power, but it does highlight interface flexibility.

1 Like

Ok – its much less about counting to seven than it is about being clear and coherent in the eye of the perceiver. Would you do that again without such “run on lines” :slight_smile: please.
end and a comment or an empty line are not counted
comments should reduce the “line” count, not add to it

<the different approaches and perspectives are helpful, leading to more exemplary arrows for the compelling Julia code quiver>

3 Likes

Principle coordinates analysis with a plot. CSV is not really necessary, could do the same with readdlm but I don’t feel like looking up the syntax. This is not tested, but it’s something like this:

using CSV, Distances, MultivariateStats, StatsPlots

data = CSV.read("data.csv", Matrix, skipto=2)
dm = pairwise(data, BrayCurtis()) # distance matrix
pcoa = fit(MDS, dm, distances=true) # multi dimensional scaling

plot(pcoa)

The thing about this that’s most amazing to me is that the third and fifth lines make use of
things that I (partly) implemented with small PRs to amazing and widely used packages, and the maintainers for those packages were so kind, thoughtful, and patient in guiding me through that (and improving my code).

7 Likes

We’ll call it seven steps of Julia.

  1. Make a Type, BinaryPartitionNumberTable to hold a sequence of values with a custom constructor to populate the table from an initial value and a recurrence.
  2. The table has repeated values, and needs to be indexed from 0 to make sense, so we add a method to Base.getindex() for our type to implement the indexing.
  3. Actually generate a table.
  4. The table has all the information we need to list the parts of the k-th binary partition of n in lexicographic order. Create a Type, BinaryPartition to represent this intent.
  5. Since what we really want to do with these partitions is look at them one part at a time, instead of constructing the partitions, we add new method to Base.iterate() describing how to produce one part of one partition, and which partition describes the remaining parts. The iterator doesn’t know how many parts will be produced ahead of time, but it does know that each part is an Int64, so we implement Base.IteratorSize() and Base.eltype() for our type to convey this.
  6. We would also like to view our partitions, so we add a method to Base.show(). A lot of flexibility comes here, since the show method can be aware of the MIME-type of target output, and produce different results depending on context. For now, we just want to write a partition as a parenthesized sum, but we could just as easily write LaTeX to pretty print it, or do something that would make our data easy to copy and paste somewhere else in our workflow.
  7. Actually view our partitions.
# A table of the number of binary partitions of integers starting at 0
# These are defined by the recurrence b[2n]=b[2n-1]+b[n], and b[2n+1]=b[2n]
# with the initial condition b[0]=1.  Since each value is duplicated,
# we back the type with a vector half the size and implement custom indexing,
# and only actually compute the even terms.
struct BinaryPartitionNumberTable
    table::Vector{Int64}
    function BinaryPartitionNumberTable(maxn)
        # rt[i+1] = b[2i]
        rt = Vector{Int64}(undef,maxn>>1+1)
        rt[1] = 1 # Initial condition
        for i=2:maxn>>1+1 
            rt[i] = rt[i-1] + rt[(i+1)>>1] # Recursion
        end
        return new(rt)
    end
end
# Implement custom indexing for our table.
Base.getindex(b::BinaryPartitionNumberTable, i) = b.table[i>>1+1]
Base.firstindex(b::BinaryPnums) = 0
Base.length(b::BinaryPnums) = 2*length(b.table)
Base.lastindex(b::BinaryPnums) = length(b)-1
Base.iterate(b::BinaryPnums, i=0) = i<length(b) ? (b[i],i+1) : nothing

maxn = 30 
const bpt = BinaryPartitionNumberTable(maxn)

# Given our table, we can iterate over the parts of a partition, only constructing
# it one part at a time as needed.  We make a type to hold initial values for
# the iteration.
struct BinaryPartition 
    k::Int64 # The k'th binary partition
    n::Int64 # of the integer n
end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
    n == 0 && return nothing
    # For lexicographic order, the first bpt[n-1] partitions are a 1 followed by
    # a partition of n-1, the remaining ones are double a partition of n/2.
    # Either output 2^doublings encountered so far and tell the iterator to
    # proceed to n-1, or proceed directly to n/2, telling the iterator to
    # add an extra power of 2 to every remaining part.
    k<=bpt[n-1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-bpt[n-1],n>>1,i+1))
end
Base.IteratorSize(::BinaryPartition) = Base.SizeUnknown() # Permits collect()
Base.eltype(::BinaryPartition) = Int64 # Lets collect use an array of the correct type

# We can now define a custom rule for displaying our partitions.
function Base.show(io::IO, b::BinaryPartition) 
    termnumber = 0
    for x in b print(io,(termnumber+=1)==1 ? "(" : "+",x) end
    print(io,")")
end

for n=1:maxn, k=1:bpt[n] println(n,"=",BinaryPartition(k,n)) end

I really like the flexibility for defining interactions with user-defined types. Indexing, iteration, and display rules can all be quickly extended.

2 Likes

The code is a bit golfed and I’m not exactly sure how to count the lines.

using Plots
function mandelbrot(z) w = z
    for n in 1:74
        abs2(w) < 4 ? w = w^2 + z : return n
    end; 75
end
x, y = range(-0.65, -0.45; length=1600), range(0.51, 0.71; length=1600)
heatmap(x, y, -log.(mandelbrot.(x' .+ y .* im)); aspect_ratio=1)

This is the output:
mandel

54 Likes

Here’s a shorter version actually. Doesn’t use Bray-Curtis dissimilarity, but it gives a good idea of the point :man_shrugging:

using MultivariateStats, RDatasets, StatsPlots

iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
M = fit(MDS, X'; maxoutdim=2)

plot(M, group=iris.Species)

image

8 Likes
using Plots, StaticArrays
const θ = π/3; const m = SA[cos(θ) -sin(θ); sin(θ) cos(θ)]
const p = SA[1/3,0]; const q = p + m*p; const r = 2p
f(n) = n == 0 ? [0,0] : (v = f(n-1)/3; [v p.+m*v q.+m'*v r.+v])
a = f(6); b = [a [1,0] .+ (m*m)'*a [1/2,-sin(θ)] .+ (m*m)*a [0,0]]
plot(b[1,:], b[2,:]; aspect_ratio=1, size=(800,800), legend=false)

I’ll let you discover what it produces.

Spoiler

Koch snowflake:
koch

7 Likes

π is the most studied and fascinating number in the history of our world. Like other technical languages, Julia provides a constant for π but that is special (see blog).
The following code, edited from StackOverflow, prints π to N-digits after decimal point (c-digits per row), without rounding-off the last one. It plots also the relative frequency of each digit, computed using StatsBase. Some of Julia language features used: unicode symbols, arbitrary-precision BigFloats, string concatenation as a “product or power” and comprehensions:

using StatsBase, Plots
N, c = 1789, 80; r = mod(N,c)  # N-digits, print c digits/row, r=remainder
setprecision(Int(ceil(log2(10)*(N+1)))) do
    pi_str = string(BigFloat(pi))[1:N+2] * ' '^(c-r)  # pad with blanks
    println("π to " * string(N) * " digits past decimal point: \n" * pi_str[1:2])
    [println("  " * pi_str[i:i+c-1]) for i in 3:c:N+3-r]
    m = sort(countmap(pi_str))  # counts occurences of each digit
    plot(collect(keys(m))[3:end], collect(values(m))[3:end]/N, t=[:bar], leg=false, title="π digits frequency")
    # savefig("pi_digits.png")
end

Result:

π to 1789 digits past decimal point: 
3.
  14159265358979323846264338327950288419716939937510582097494459230781640628620899
  86280348253421170679821480865132823066470938446095505822317253594081284811174502
  84102701938521105559644622948954930381964428810975665933446128475648233786783165
  27120190914564856692346034861045432664821339360726024914127372458700660631558817
  48815209209628292540917153643678925903600113305305488204665213841469519415116094
  33057270365759591953092186117381932611793105118548074462379962749567351885752724
  89122793818301194912983367336244065664308602139494639522473719070217986094370277
  05392171762931767523846748184676694051320005681271452635608277857713427577896091
  73637178721468440901224953430146549585371050792279689258923542019956112129021960
  86403441815981362977477130996051870721134999999837297804995105973173281609631859
  50244594553469083026425223082533446850352619311881710100031378387528865875332083
  81420617177669147303598253490428755468731159562863882353787593751957781857780532
  17122680661300192787661119590921642019893809525720106548586327886593615338182796
  82303019520353018529689957736225994138912497217752834791315155748572424541506959
  50829533116861727855889075098381754637464939319255060400927701671139009848824012
  85836160356370766010471018194295559619894676783744944825537977472684710404753464
  62080466842590694912933136770289891521047521620569660240580381501935112533824300
  35587640247496473263914199272604269922796782354781636009341721641219924586315030
  28618297455570674983850549458858692699569092721079750930295532116534498720275596
  02364806654991198818347977535663698074265425278625518184175746728909777727938000
  81647060016145249192173217214772350141441973568548161361157352552133475741849468
  43852332390739414333454776241686251898356948556209921922218427255025425688767179
  04946016534668049886272327917

pi_digits

7 Likes

A visual proof of positive invariance in 7 LOC:

using ReachabilityAnalysis, Plots

@taylorize function vanderpol!(dx, x, p, t)
    dx[1] = x[2]; dx[2] = x[2] * (1 - x[1]^2) - x[1]
end

X0 = (1.25 .. 1.55) × (2.35 .. 2.45)
prob = @ivp(x' = vanderpol!(x), dim=2, x(0) ∈ X0)

sol = solve(prob, tspan=(0.0, 7.0))

plot(sol, vars=(1, 2), xlabel="x₁", ylabel="x₂")

Further details of what is done can be found in ARCH-COMP20 Category Report: Continuous and Hybrid Systems with Nonlinear Dynamics.

26 Likes

A complete definition of the game of life, and a simulation in the REPL:

using DynamicGrids
const born_survive = (0, 0, 0, 1, 0, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0, 0, 0, 0)
rule = Neighbors(Moore(1)) do neighborhood, state
    born_survive[state + 1][sum(neighborhood) + 1]
end
output = REPLOutput(rand(Bool, 100, 100); tspan=1:1000, fps=50, color=:blue)
sim!(output, rule)

(A description of what is going on as requested by @viralbshah)

This cellular automata rule is run over the grid defined by rand(Bool, 100, 100), for every cell at every timestep in tstep. The neighborhood is an iterable over the surrounding cells. In the most well-known game of life rule, a cell becomes active when the surrounding 8 cells (a “Moore” neighborhood of radius 1) sum to 2. If the cell is already active it will stay active when the sum is 2 or 3. Swapping the 1s and 0s in born_survive will give you different GOL rules.

                             █ █  ▄▀▀▄             ▄█▄▀█▄▀▀▄      
      ▀▀▀                     ▀    ▀▀             ▀   ▄  ██▀      
    █     █                                         ▄   ██▄       
    ▀     ▀                           ▄▄▄             ▀  ▀     ▄▀▀
      ▀▀▀                                              ▄        ▀▀
                             ██     ██                 █          
                                       ▄           ▄▄▄   ▄▄▄      
                                       █               ▄       ▄▀▄
                                           ██          █       ▀▄▀
   ▄▄                              ██                             
   ▀▀                       ██                      ██   ██       
                                       ▄▄                         
                      ▄▀▄              ▀▀                        ▄
                     ▀▄▄▀       ▄▄▄   ▀▀▄▀ █▄                    ▀
                               █▀▀█ █   ▄▄ ▀                █▄█   
                            ▄   ▀    ▄█▀▄▄▄▄    ██           ▄▄█  
      ▄▄      ▀▀▀▄        ▄█▀▀▄     ▀▄▄▀▄▀  ▄▀█▄▄▄▀         ▄█ ▄ ▀
      ▀▀    █     ▀▄     ▄   ▄▀█    █▄▀ ▀█▄▄    ▀▀           ▀▄▀▄▀
            ▀▄     █   ▄    ▄█▀    ▀ █   ▀▄ █                     
              ▀▄▄▄   ▀▀█   ▀█▄    ▀  ▀▄█   ▀         █▀▀ ▄        
     ▄▀ ▀▄▄              ▄   █▀▄   █ ▀ ▄ ▄█          ▀ ▀▀  ▄▄▄    
  ▄▀▄█▀ ▄█ █              ▄▄  █▀    ██ █▄▀    ▄▀▄        ▄        
 ▀▄ ▄█    ▀                ▀█▀        ▀       ▀▄▀        █        
    ▀▀                            ██                              
       ▄                                                          
      ▀▄█    ▄▄                                                   
             ▀▀                                    ▄              
                      ▀▄                          █ ▀▄            


50 Likes

Hello World! in Brainfuck:

function bf(s) m, l, i, j, o = zeros(UInt8,2^15), Int[], 1, 1, UInt8(1)
  while i<=length(s) c=s[i]
    c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o : c=='-' ? m[j]-=o :
    c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? push!(l,i+1) : c==']' ? (m[j]!=0 ? (i=last(l);continue) : pop!(l)) : 0
    i+=1 end end
bf("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.")

The implementation is non-standard because when it encounters the character '[' it unconditionally executes the block at least once instead of skipping to the matching ']' if the current byte is zero. In terms of the translation table described here, it is as if '[' were translated to do {...} while (*ptr) instead of while (*ptr) {...}.

Edit 1

Here is a version with the correct implementation for '['. The function jmp is used to jump to the instruction after the matching ']'.

jmp(s,i) = (n=1; i+=1; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i)
function bf(s) m=zeros(UInt8,2^15); l=Int[]; i=j=1; o=UInt8(1)
  while i<=length(s) c=s[i]; c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o :
    c=='-' ? m[j]-=o : c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
    c=='[' ? (m[j]==0 ? (i=jmp(s,i);continue) : push!(l,i+1)) :
    c==']' ? (m[j]!=0 ? (i=l[end];continue) : pop!(l)) : 0; i+=1 end end
bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")

The Brainfuck code at the end generates the Thue–Morse sequence (I hope: I haven’t checked) and is taken from here. The resulting program is non-terminating.

Some comments about the code

s is the Brainfuck source code, m is the memory tape, i is the current instruction index, c is the current instruction code character, j is the index of the current memory byte, l is the loop stack, o is one as UInt8 used to increment/decrement the memory.

Edit 2

Here is a version that transpiles Brainfuck to Julia code using metaprogramming.

const t = Dict('>'=>:(j+=1), '<'=>:(j-=1), '+'=>:(m[j]+=UInt8(1)), '-'=>:(m[j]-=UInt8(1)), '.'=>:(print(Char(m[j]))), ','=>:(m[j]=read(stdin,UInt8)))
le(s) = (n=1; i=2; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i-1)
function bf(s) if isempty(s) nothing
  elseif s[1]=='[' l=le(s); n=l+1; :( while m[j]!=0; $(bf(s[2:l-1])) end; $(bf(s[l+1:end])) )
  else :($(get(t,s[1],nothing)); $(bf(s[2:end]))) end end
macro bf(s) :(()->(m=zeros(UInt8,2^15); j=1; $(bf(s)))) end
@bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()

The function le is used to find the loop end (that is, the index of the matching ]). The function bf transforms a Brainfuck string to a Julia expression. The macro @bf returns an anonymous function that initializes the memory, the memory index and executes the code generated by bf.

The code generated by bf is not particularly nice to read because it is deeply nested. There are better ways to generate the code but they require more lines (at least, I haven’t found a smart way to do it). You can inspect the code of the function generated by the macro with

@code_warntype @bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()
11 Likes
using Measurements, Unitful, Unitful.DefaultSymbols
# Calculate the thickness of tinfoil including uncertainties
ρ = (2.7 ± 0.2)g/cm^3    # density of aluminum
# For a roll of tinfoil
mass = (250 ± 10)g
width = (30.5 ± 0.2)cm
length = (14.24 ± 0.2)m
# ρ = mass/(width*length*thickness)
@show thickness = μm(mass/(ρ*width*length))

thickness = μm(mass / (ρ * width * length)) = 21.3 ± 1.8 μm

27 Likes

Pretty much any of the examples in https://github.com/JuliaApproximation/ApproxFun.jl . Many are less then 7 lines already.

9 Likes

You may find this interesting! Was benchmarking the latest on Arrow with its work on DataFrames and a personal project I am working on:

Also, this is just nice to show the clarity of Julia and its ease of use:

4 Likes

You probably had the global variable m already defined before using Unitful.DefaultSymbols.

1 Like

I’m loving all the examples being presented here. I would like to encourage folks to submit as many different kinds of examples as possible, and I’ll work on collecting them and getting them onto julialang.org and generally on to social media with something like a #WhyJulia or #WorldOfJulia tag. I think @jiahao used to maintain such a thing a while ago for presentations and such.

It would greatly help to have a 3-5 line description of what it is doing, and why it is interesting, to go along with the code snippet.

-viral

25 Likes

The examples are nice and interesting, but IMO <10 LOC is not where Julia really shines. When not relying on packages, this domain is mostly about surface syntax tricks baked into the language.

The amazing powers of the language start to show around 100 LOC, and come into play fully from 1-5 kLOC, replicating functionality and speed of much larger libraries in other languages.

12 Likes

Agree with Tamas, but it is also nice to have a couple of very short examples. Here a Gibbs sampler sampling the posterior parameters of normal observations with mean distributed a priori as N(μ0, σ0) and precision 1/σ^2 a priori as Gamma(α, β):

using Distributions, Statistics
function mc(x, iterations, μ0 = 0., τ0 = 1e-7, α = 0.0001, β = 0.0001)
    sumx, n = sum(x), length(x)
    μ, τ = sumx/n, 1/var(x)
    samples = [(μ, τ)]
    for i in 1:iterations
        μ = rand(Normal((τ0*μ0 + τ*sumx)/(τ0 + n*τ), 1/sqrt(τ0 + n*τ)))
        τ = rand(Gamma(α + n/2, 1/(β + 0.5*sum((xᵢ-μ)^2 for xᵢ in x))))
        push!(samples, (μ, τ))
    end
    samples
end

Usage

samples = mc(rand(Normal(0.2, 1.7^(-0.5)), 1_000), 10_000)
μs, τs = first.(samples), last.(samples)
println("mean μ = ", mean(μs), " ± ", std(μs))
println("precision τ = ", mean(τs), " ± ", std(τs))

Output:

mean μ = 0.2076960556449643 ± 0.023693641548604788
precision τ = 1.8301418451485263 ± 0.08225999887566306

9 Likes

Couldn’t this be written without the need for a loop and push!? I think this would get you to seven lines while also being to my eye quite readable:

using Distributions, Statistics

function mc(x, iterations, μ₀ = 0., τ₀ = 1e-7, α = 0.0001, β = 0.0001)
    ∑x, n = sum(x), length(x)
    μ, τ = ∑x/n, 1/var(x)
    μₛ = rand(Normal((τ₀*μ₀ + τ*∑x)/(τ₀ + n*τ), 1/√(τ₀ + n*τ)), iterations)
    τₛ = rand(Gamma(α + n/2, 1/(β + 0.5*sum((xᵢ-μ)^2 for xᵢ in x))), iterations)
    samples = [(μ, τ); collect(zip(μₛ, τₛ))]
end
3 Likes