I am looking for examples: show me some real computational power in seven or fewer lines of Julia that read clearly. It is ok to be using
other packages, though no more than four (and it is ok to use fewer). Each example is intended to make a compelling case for Julia to skilled practitioners of whatever is done.
This rule seems arbitrary and useless to me, packages in Julia are often small and composable making it reasonable to use many of them to accomplish what you need. You can also make a single meta package that uses many other packages and make it appear as if youβre only using one package.
I was thinking less constructively β what I meant to convey:
There are many expertly crafted packages, a few of which may be essential to demonstrating something important by virtue of their reason for being. To have more than 4 packages used would make it less easy for others to see the underlying powerful simplicity (here of composition as otherβs efforts are working together with oneβs current intent). I appreciate your perspective, and indeed that is something to be shown β¦ perhaps better seen after showing some that require less of the viewer/reader as domain experts are not always equally expert at programming.
Julia code for the Lorenz problem provided in Wikipedia that uses DifferentialEquations.jl.
Lorenzβs equations published in 1963 describe atmospheric convection and are very sensitive to the initial parameters. The work went unnoticed until 1972, when Lorenz gave a conference entitled βPredictability: does the flap of a butterflyβs wings in Brazil set off a tornado in Texas?β, which made famous the βbutterfly effectβ and led to the chaos theory.
The electronic computer used (Royal McBee LGP-30) was according to Lorenz only 1000 times faster than manual computingβ¦ Julia code allows writing this problem compactly with clear syntax, and to collect the solution on todayβs laptops within 1 ms.
using DifferentialEquations, ParameterizedFunctions, Plots
lorenz = @ode_def begin # define the system
dx = Ο*(y - x); dy = x*(Ο - z) - y; dz = x*y - Ξ²*z
end Ο Ο Ξ²
u0=[1.,1.,1.]; tspan=(0.,60.); p=[10.,28.,8/3]; # initial conds., timespan and parms.
prob = ODEProblem(lorenz, u0, tspan, p) # define the problem
sol0 = solve(prob, reltol=1e-9) # solve it
plot(sol0, vars = (1, 2, 3)) # display it
This probably violates the read clearly
part, but:
using CurrentPopulationSurvey
using DataFrames
using DataFramesMeta
oct20 = cpsdata(2020,10) |> DataFrame
unemployment_rate = (@linq oct20 |>
where(map(x -> x in 3:4, :pemlr)) |>
combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1] /
(@linq oct20 |>
where(map(x -> x in 1:4, :pemlr)) |>
combine(:pwcmpwgt => x -> sum(x) / 10_000))[1,1]
0.06593909683981009
With that, I downloaded the October '20 Current Population Survey public use microdata .zip file from the Census Bureau, unzipped it, parsed it, piped it into a DataFrame, and then computed the non-seasonally-adjusted unemployment rate for the U.S. (go ahead and check it at the Bureau of Labor Statisticsβ data hub ). I think thatβs kind of cool.
Seven is a really small number, but here goes.
- Declare a Type to represent the
k
-th partition of the integern
in lexicographic order for which every part is a power of2
. - Add a method to
Base.iterate
to describe how to view such a partition one part at a time. - The body of the function.
- Does
end
even count as a line? - Extending
Base.show
gives us a way to print out our partitions. - Populate an array that powers everything.
- Print out all the binary partitions of size at most 30
struct BinaryPartition k::Int64; n::Int64; b::Vector{Int64} end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
n == 0 && return nothing ; k<=b.b[(n-1)>>1+1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-b.b[(n-1)>>1+1],n>>1,i+1))
end
Base.show(io::IO, b::BinaryPartition) = (f=0; for x in b print(io,(f+=1) == 1 ? "" : "+",x) end)
maxn = 30; rt = Vector{Int64}(undef,maxn+1) ; rt[1] = 1 ; for i=2:maxn+1 rt[i] = rt[i-1]+rt[(i+1)>>1] end
for n=1:maxn, k=1:rt[n>>1+1] println(n," = ",BinaryPartition(k,n,rt)) end
These are admittedly some very compressed lines, and the interplay between line 3 and line 6 is pretty magical, and specific to this use case. Itβs probably not even showing off computational power, but it does highlight interface flexibility.
Ok β its much less about counting to seven than it is about being clear and coherent in the eye of the perceiver. Would you do that again without such βrun on linesβ please.
end
and a comment or an empty line are not counted
comments should reduce the βlineβ count, not add to it
<the different approaches and perspectives are helpful, leading to more exemplary arrows for the compelling Julia code quiver>
Principle coordinates analysis with a plot. CSV is not really necessary, could do the same with readdlm
but I donβt feel like looking up the syntax. This is not tested, but itβs something like this:
using CSV, Distances, MultivariateStats, StatsPlots
data = CSV.read("data.csv", Matrix, skipto=2)
dm = pairwise(data, BrayCurtis()) # distance matrix
pcoa = fit(MDS, dm, distances=true) # multi dimensional scaling
plot(pcoa)
The thing about this thatβs most amazing to me is that the third and fifth lines make use of
things that I (partly) implemented with small PRs to amazing and widely used packages, and the maintainers for those packages were so kind, thoughtful, and patient in guiding me through that (and improving my code).
Weβll call it seven steps of Julia.
- Make a Type,
BinaryPartitionNumberTable
to hold a sequence of values with a custom constructor to populate the table from an initial value and a recurrence. - The table has repeated values, and needs to be indexed from
0
to make sense, so we add a method toBase.getindex()
for our type to implement the indexing. - Actually generate a table.
- The table has all the information we need to list the parts of the
k
-th binary partition ofn
in lexicographic order. Create a Type,BinaryPartition
to represent this intent. - Since what we really want to do with these partitions is look at them one part at a time, instead of constructing the partitions, we add new method to
Base.iterate()
describing how to produce one part of one partition, and which partition describes the remaining parts. The iterator doesnβt know how many parts will be produced ahead of time, but it does know that each part is anInt64
, so we implementBase.IteratorSize()
andBase.eltype()
for our type to convey this. - We would also like to view our partitions, so we add a method to
Base.show()
. A lot of flexibility comes here, since the show method can be aware of the MIME-type of target output, and produce different results depending on context. For now, we just want to write a partition as a parenthesized sum, but we could just as easily write LaTeX to pretty print it, or do something that would make our data easy to copy and paste somewhere else in our workflow. - Actually view our partitions.
# A table of the number of binary partitions of integers starting at 0
# These are defined by the recurrence b[2n]=b[2n-1]+b[n], and b[2n+1]=b[2n]
# with the initial condition b[0]=1. Since each value is duplicated,
# we back the type with a vector half the size and implement custom indexing,
# and only actually compute the even terms.
struct BinaryPartitionNumberTable
table::Vector{Int64}
function BinaryPartitionNumberTable(maxn)
# rt[i+1] = b[2i]
rt = Vector{Int64}(undef,maxn>>1+1)
rt[1] = 1 # Initial condition
for i=2:maxn>>1+1
rt[i] = rt[i-1] + rt[(i+1)>>1] # Recursion
end
return new(rt)
end
end
# Implement custom indexing for our table.
Base.getindex(b::BinaryPartitionNumberTable, i) = b.table[i>>1+1]
Base.firstindex(b::BinaryPnums) = 0
Base.length(b::BinaryPnums) = 2*length(b.table)
Base.lastindex(b::BinaryPnums) = length(b)-1
Base.iterate(b::BinaryPnums, i=0) = i<length(b) ? (b[i],i+1) : nothing
maxn = 30
const bpt = BinaryPartitionNumberTable(maxn)
# Given our table, we can iterate over the parts of a partition, only constructing
# it one part at a time as needed. We make a type to hold initial values for
# the iteration.
struct BinaryPartition
k::Int64 # The k'th binary partition
n::Int64 # of the integer n
end
function Base.iterate(b::BinaryPartition, (k,n,i)=(b.k,b.n,0))
n == 0 && return nothing
# For lexicographic order, the first bpt[n-1] partitions are a 1 followed by
# a partition of n-1, the remaining ones are double a partition of n/2.
# Either output 2^doublings encountered so far and tell the iterator to
# proceed to n-1, or proceed directly to n/2, telling the iterator to
# add an extra power of 2 to every remaining part.
k<=bpt[n-1] ? (1<<i,(k,n-1,i)) : iterate(b,(k-bpt[n-1],n>>1,i+1))
end
Base.IteratorSize(::BinaryPartition) = Base.SizeUnknown() # Permits collect()
Base.eltype(::BinaryPartition) = Int64 # Lets collect use an array of the correct type
# We can now define a custom rule for displaying our partitions.
function Base.show(io::IO, b::BinaryPartition)
termnumber = 0
for x in b print(io,(termnumber+=1)==1 ? "(" : "+",x) end
print(io,")")
end
for n=1:maxn, k=1:bpt[n] println(n,"=",BinaryPartition(k,n)) end
I really like the flexibility for defining interactions with user-defined types. Indexing, iteration, and display rules can all be quickly extended.
The code is a bit golfed and Iβm not exactly sure how to count the lines.
using Plots
function mandelbrot(z) w = z
for n in 1:74
abs2(w) < 4 ? w = w^2 + z : return n
end; 75
end
x, y = range(-0.65, -0.45; length=1600), range(0.51, 0.71; length=1600)
heatmap(x, y, -log.(mandelbrot.(x' .+ y .* im)); aspect_ratio=1)
This is the output:
Hereβs a shorter version actually. Doesnβt use Bray-Curtis dissimilarity, but it gives a good idea of the point
using MultivariateStats, RDatasets, StatsPlots
iris = dataset("datasets", "iris")
X = convert(Matrix, iris[:, 1:4])
M = fit(MDS, X'; maxoutdim=2)
plot(M, group=iris.Species)
using Plots, StaticArrays
const ΞΈ = Ο/3; const m = SA[cos(ΞΈ) -sin(ΞΈ); sin(ΞΈ) cos(ΞΈ)]
const p = SA[1/3,0]; const q = p + m*p; const r = 2p
f(n) = n == 0 ? [0,0] : (v = f(n-1)/3; [v p.+m*v q.+m'*v r.+v])
a = f(6); b = [a [1,0] .+ (m*m)'*a [1/2,-sin(ΞΈ)] .+ (m*m)*a [0,0]]
plot(b[1,:], b[2,:]; aspect_ratio=1, size=(800,800), legend=false)
Iβll let you discover what it produces.
Spoiler
Ο is the most studied and fascinating number in the history of our world. Like other technical languages, Julia provides a constant for Ο but that is special (see blog).
The following code, edited from StackOverflow, prints Ο to N-digits after decimal point (c-digits per row), without rounding-off the last one. It plots also the relative frequency of each digit, computed using StatsBase. Some of Julia language features used: unicode symbols, arbitrary-precision BigFloats, string concatenation as a βproduct or powerβ and comprehensions:
using StatsBase, Plots
N, c = 1789, 80; r = mod(N,c) # N-digits, print c digits/row, r=remainder
setprecision(Int(ceil(log2(10)*(N+1)))) do
pi_str = string(BigFloat(pi))[1:N+2] * ' '^(c-r) # pad with blanks
println("Ο to " * string(N) * " digits past decimal point: \n" * pi_str[1:2])
[println(" " * pi_str[i:i+c-1]) for i in 3:c:N+3-r]
m = sort(countmap(pi_str)) # counts occurences of each digit
plot(collect(keys(m))[3:end], collect(values(m))[3:end]/N, t=[:bar], leg=false, title="Ο digits frequency")
# savefig("pi_digits.png")
end
Result:
Ο to 1789 digits past decimal point:
3.
14159265358979323846264338327950288419716939937510582097494459230781640628620899
86280348253421170679821480865132823066470938446095505822317253594081284811174502
84102701938521105559644622948954930381964428810975665933446128475648233786783165
27120190914564856692346034861045432664821339360726024914127372458700660631558817
48815209209628292540917153643678925903600113305305488204665213841469519415116094
33057270365759591953092186117381932611793105118548074462379962749567351885752724
89122793818301194912983367336244065664308602139494639522473719070217986094370277
05392171762931767523846748184676694051320005681271452635608277857713427577896091
73637178721468440901224953430146549585371050792279689258923542019956112129021960
86403441815981362977477130996051870721134999999837297804995105973173281609631859
50244594553469083026425223082533446850352619311881710100031378387528865875332083
81420617177669147303598253490428755468731159562863882353787593751957781857780532
17122680661300192787661119590921642019893809525720106548586327886593615338182796
82303019520353018529689957736225994138912497217752834791315155748572424541506959
50829533116861727855889075098381754637464939319255060400927701671139009848824012
85836160356370766010471018194295559619894676783744944825537977472684710404753464
62080466842590694912933136770289891521047521620569660240580381501935112533824300
35587640247496473263914199272604269922796782354781636009341721641219924586315030
28618297455570674983850549458858692699569092721079750930295532116534498720275596
02364806654991198818347977535663698074265425278625518184175746728909777727938000
81647060016145249192173217214772350141441973568548161361157352552133475741849468
43852332390739414333454776241686251898356948556209921922218427255025425688767179
04946016534668049886272327917
A visual proof of positive invariance in 7 LOC:
using ReachabilityAnalysis, Plots
@taylorize function vanderpol!(dx, x, p, t)
dx[1] = x[2]; dx[2] = x[2] * (1 - x[1]^2) - x[1]
end
X0 = (1.25 .. 1.55) Γ (2.35 .. 2.45)
prob = @ivp(x' = vanderpol!(x), dim=2, x(0) β X0)
sol = solve(prob, tspan=(0.0, 7.0))
plot(sol, vars=(1, 2), xlabel="xβ", ylabel="xβ")
Further details of what is done can be found in ARCH-COMP20 Category Report: Continuous and Hybrid Systems with Nonlinear Dynamics.
A complete definition of the game of life, and a simulation in the REPL:
using DynamicGrids
const born_survive = (0, 0, 0, 1, 0, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0, 0, 0, 0)
rule = Neighbors(Moore(1)) do neighborhood, state
born_survive[state + 1][sum(neighborhood) + 1]
end
output = REPLOutput(rand(Bool, 100, 100); tspan=1:1000, fps=50, color=:blue)
sim!(output, rule)
(A description of what is going on as requested by @viralbshah)
This cellular automata rule is run over the grid defined by rand(Bool, 100, 100)
, for every cell at every timestep in tstep
. The neighborhood
is an iterable over the surrounding cells. In the most well-known game of life rule, a cell becomes active when the surrounding 8 cells (a βMooreβ neighborhood of radius 1) sum to 2. If the cell is already active it will stay active when the sum is 2 or 3. Swapping the 1s and 0s in born_survive
will give you different GOL rules.
β β ββββ βββββββββ
βββ β ββ β β βββ
β β β βββ
β β βββ β β βββ
βββ β ββ
ββ ββ β
β βββ βββ
β β βββ
ββ β βββ
ββ ββ
ββ ββ ββ ββ
ββ
βββ ββ β
ββββ βββ ββββ ββ β
ββββ β ββ β βββ
β β βββββββ ββ βββ
ββ ββββ βββββ ββββββ βββββββ ββ β β
ββ β ββ β βββ βββ ββββ ββ βββββ
ββ β β βββ β β ββ β
ββββ βββ βββ β βββ β βββ β
ββ βββ β βββ β β β ββ β ββ βββ
βββββ ββ β ββ ββ ββ βββ βββ β
ββ ββ β βββ β βββ β
ββ ββ
β
βββ ββ
ββ β
ββ β ββ
function bf(s) m, l, i, j, o = zeros(UInt8,2^15), Int[], 1, 1, UInt8(1)
while i<=length(s) c=s[i]
c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o : c=='-' ? m[j]-=o :
c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
c=='[' ? push!(l,i+1) : c==']' ? (m[j]!=0 ? (i=last(l);continue) : pop!(l)) : 0
i+=1 end end
bf("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.")
The implementation is non-standard because when it encounters the character '['
it unconditionally executes the block at least once instead of skipping to the matching ']'
if the current byte is zero. In terms of the translation table described here, it is as if '['
were translated to do {...} while (*ptr)
instead of while (*ptr) {...}
.
Edit 1
Here is a version with the correct implementation for '['
. The function jmp
is used to jump to the instruction after the matching ']'
.
jmp(s,i) = (n=1; i+=1; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i)
function bf(s) m=zeros(UInt8,2^15); l=Int[]; i=j=1; o=UInt8(1)
while i<=length(s) c=s[i]; c=='>' ? j+=1 : c=='<' ? j-=1 : c=='+' ? m[j]+=o :
c=='-' ? m[j]-=o : c=='.' ? print(Char(m[j])) : c==',' ? m[j]=read(stdin,UInt8) :
c=='[' ? (m[j]==0 ? (i=jmp(s,i);continue) : push!(l,i+1)) :
c==']' ? (m[j]!=0 ? (i=l[end];continue) : pop!(l)) : 0; i+=1 end end
bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")
The Brainfuck code at the end generates the ThueβMorse sequence (I hope: I havenβt checked) and is taken from here. The resulting program is non-terminating.
Some comments about the code
s
is the Brainfuck source code, m
is the memory tape, i
is the current instruction index, c
is the current instruction code character, j
is the index of the current memory byte, l
is the loop stack, o
is one as UInt8
used to increment/decrement the memory.
Edit 2
Here is a version that transpiles Brainfuck to Julia code using metaprogramming.
const t = Dict('>'=>:(j+=1), '<'=>:(j-=1), '+'=>:(m[j]+=UInt8(1)), '-'=>:(m[j]-=UInt8(1)), '.'=>:(print(Char(m[j]))), ','=>:(m[j]=read(stdin,UInt8)))
le(s) = (n=1; i=2; while n>0 s[i]=='[' ? n+=1 : s[i]==']' ? n-=1 : 0; i+=1 end; i-1)
function bf(s) if isempty(s) nothing
elseif s[1]=='[' l=le(s); n=l+1; :( while m[j]!=0; $(bf(s[2:l-1])) end; $(bf(s[l+1:end])) )
else :($(get(t,s[1],nothing)); $(bf(s[2:end]))) end end
macro bf(s) :(()->(m=zeros(UInt8,2^15); j=1; $(bf(s)))) end
@bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()
The function le
is used to find the loop end (that is, the index of the matching ]
). The function bf
transforms a Brainfuck string to a Julia expression. The macro @bf
returns an anonymous function that initializes the memory, the memory index and executes the code generated by bf
.
The code generated by bf
is not particularly nice to read because it is deeply nested. There are better ways to generate the code but they require more lines (at least, I havenβt found a smart way to do it). You can inspect the code of the function generated by the macro with
@code_warntype @bf(">>++++++[>++++++++<-]+[[>.[>]+<<[->-<<<]>[>+<<]>]>++<++]")()
using Measurements, Unitful, Unitful.DefaultSymbols
# Calculate the thickness of tinfoil including uncertainties
Ο = (2.7 Β± 0.2)g/cm^3 # density of aluminum
# For a roll of tinfoil
mass = (250 Β± 10)g
width = (30.5 Β± 0.2)cm
length = (14.24 Β± 0.2)m
# Ο = mass/(width*length*thickness)
@show thickness = ΞΌm(mass/(Ο*width*length))
thickness = ΞΌm(mass / (Ο * width * length)) = 21.3 Β± 1.8 ΞΌm
Pretty much any of the examples in https://github.com/JuliaApproximation/ApproxFun.jl . Many are less then 7 lines already.
You may find this interesting! Was benchmarking the latest on Arrow with its work on DataFrames and a personal project I am working on:
Also, this is just nice to show the clarity of Julia and its ease of use:
You probably had the global variable m
already defined before using Unitful.DefaultSymbols
.