[ANN] MLStyle.jl v0.4.0

MLStyle.jl v0.4.0 has been released. The main features are

  • performance improvements,
  • readability of the generated code (referential transparency), and
  • “runtime support free” code generation.

Performance

The full benchmark results are available here. EDIT: clicking the figure in the GitHub README leads you to the dataframe of benchmark results (a text file). In the same directory, under the same file name but with a different extension, you can find the Julia source code of the benchmark.

MLStyle was already fast, but in v0.3 it had some performance overhead in array matching. That overhead is now gone.

To achieve this, I refactored the implementation using the approach I presented at the JuliaCN meetup 2019, which makes pattern matching “first-class” at the top level. The slides can be found here.

An MLStyle pattern is now a normal Julia object, and its visibility follows the same rules as any other Julia object. Specifically, an MLStyle pattern object is something that implements

  • pattern_uncall(::typeof(P), _, tparams, targs, args), for patterns of the form P{targs...}(args...) where {tparams...}
  • pattern_unref(::typeof(P), _, args), for patterns of the form P[args...]

Generating Readable Code

“Readable” code means the generated code can be read by humans to some degree:

EDIT: a better example:

julia> macroexpand_nonrec(x) = macroexpand(Main, x, recursive=false)
macroexpand_nonrec (generic function with 1 method)

julia> quote @match x begin
           [1, 2, ::Int] => 1
           [_, _..., _] => 2
           (42, 5, ::String) => 3
       end end |> macroexpand_nonrec

quote
    #= REPL[39]:1 =#
    let
        true
        var"##return#640" = nothing
        var"##642" = x
        if var"##642" isa Tuple{Int64,Int64,String}
            #= REPL[39]:4 =#
            if var"##642"[1] === 42 && (var"##642"[2] === 5 && begin
                            var"##643" = var"##642"[3]
                            var"##643" isa String
                        end)
                var"##return#640" = let
                        3
                    end
                #==# @goto var"####final#641#645"
            end
        end
        if var"##642" isa AbstractArray
            #= REPL[39]:2 =#
            if length(var"##642") === 3 && (var"##642"[1] === 1 && (var"##642"[2] === 2 && begin
                                var"##644" = var"##642"[3]
                                var"##644" isa Int64
                            end))
                var"##return#640" = let
                        1
                    end
                #==# @goto var"####final#641#645"
            end
            #= REPL[39]:3 =#
            if ndims(var"##642") === 1 && length(var"##642") >= 2
                var"##return#640" = let
                        2
                    end
                #==# @goto var"####final#641#645"
            end
        end
        (error)("matching non-exhaustive, at #= REPL[39]:1 =#")
        #= =# @label var"####final#641#645"
        var"##return#640"
    end
end

This is quite readable compared to the code produced by MLStyle 0.3.1.

By avoiding tricky or massive generated code, it’s now possible for users to understand why their code behaves unexpectedly, and bugs are easier to track down, which certainly makes using MLStyle more “transparent”.

Also, you can see that the line number nodes from your macro call sites are correctly inserted into the generated code. This makes error messages much more readable.

Another reason for doing this is that we’re planning to make MLStyle a development-time-only dependency, i.e., you can choose not to distribute your package with MLStyle as a dependency even if you use MLStyle to develop it.

In this case, generating readable code makes sense unless you want to encrypt your source code…

The development-time-only dependency feature is discussed below.

Generating “Runtime Support Free” Code

“Runtime support free” means that the code MLStyle generates no longer introduces any third-party Julia package dependencies, not even MLStyle itself.

The generated code relies only on the standard library, as long as you don’t mix in things from other libraries.
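To make this concrete, here is a hand-written sketch with roughly the same decision logic as the expansion shown earlier for the three-branch @match. It is not actual MLStyle output, and the name match_example is made up for illustration; the point is only that nothing in it needs `using MLStyle`.

```julia
# Illustrative, hand-written equivalent of the generated decision logic.
# Assumption: a 64-bit machine, so `Int` is `Int64` as in the expansion.
function match_example(x)
    # (42, 5, ::String) => 3
    if x isa Tuple{Int,Int,String}
        x[1] === 42 && x[2] === 5 && return 3
    end
    if x isa AbstractArray
        # [1, 2, ::Int] => 1
        if length(x) === 3 && x[1] === 1 && x[2] === 2 && x[3] isa Int
            return 1
        end
        # [_, _..., _] => 2
        if ndims(x) === 1 && length(x) >= 2
            return 2
        end
    end
    error("matching non-exhaustive")
end

match_example([1, 2, 3])       # first array branch
match_example(["a", "b"])      # second array branch
match_example((42, 5, "x"))    # tuple branch
```

The branches are tested in the same order as in the expansion (the tuple check first, then the two array patterns), which is harmless here because a value cannot be both a Tuple and an AbstractArray.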

All extensions of MLStyle, like Record matching and Active patterns, are also “runtime support free” and can run with only Stdlib.

You might find that the generated code contains GlobalRefs like $MLStyle.pattern_uncall(...). In that case, just delete them; they don’t affect the execution of the generated code, because they are used only for code generation.

Unfortunately, this feature is still difficult to use so far, for the following two reasons:

  1. There is still no way to dump Julia ASTs into runnable Julia source code, even though I don’t need to expand macro calls like @inline or @goto (only MLStyle macros need expanding).
    For instance, even this simple AST is not shown correctly:
      julia> :(if begin a = 1
                   true
             end
             a
        end)
      :(if #= REPL[13]:1 =#, a = 1, #= REPL[13]:2 =#, true
          #= REPL[13]:4 =#
          a
       end)
    
  2. If you generate source code and want it to work as part of a Julia package, the generated file has to take the place of your original source file. We might need some standard for static code generation…
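The first point can be probed with a small round-trip check: print an expression to text, parse the text back, and compare the two ASTs. This is only a sketch; strip_lines and roundtrips are hypothetical helper names, and the result for a given expression may depend on your Julia version.

```julia
# Return a copy of `ex` without line number nodes (non-Expr values pass through).
strip_lines(ex) = ex isa Expr ? Base.remove_linenums!(deepcopy(ex)) : ex

# Hypothetical helper: does printing `ex` and parsing the text back give the same AST?
function roundtrips(ex)
    src = string(strip_lines(ex))
    reparsed = try
        Meta.parse(src)
    catch
        return false    # the printed text is not even valid Julia
    end
    return strip_lines(reparsed) == strip_lines(ex)
end

roundtrips(:(a + b))
```

Calling roundtrips on the `if begin ... end` expression above tells you whether your Julia version prints it in a form that parses back to the same AST.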

Something Interesting: A Real Automatic Test Data Generator

To benchmark MLStyle more reasonably, I developed a random data generation framework, which can generate data of any type for which you register a generator. Data types themselves can also be generated.

The implementation is in this file, and it also serves as a good exercise in GADTs and pattern matching itself.
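The “register a generator” idea can be sketched with the standard library alone. The following is not the implementation from that file; GENERATORS, register!, Lit, Wildcard and OfType are all hypothetical names chosen for this illustration, and the real framework builds its specs with the @spec macro instead.

```julia
using Random

# Hypothetical registry: maps a type to a zero-argument function
# that produces a random value of that type.
const GENERATORS = Dict{Type,Function}()
register!(f::Function, T::Type) = (GENERATORS[T] = f; nothing)

register!(() -> rand(Int), Int)
register!(() -> rand(), Float64)
register!(() -> randstring(8), String)
register!(() -> Symbol(randstring(4)), Symbol)

# A tiny spec language, hand-built instead of parsed from syntax.
struct Lit                # a literal: always generates itself
    value
end
struct Wildcard end       # `_`: anything with a registered generator
struct OfType             # `::T`: any registered type that is a subtype of T
    T::Type
end

generate(s::Lit) = s.value
generate(::Wildcard) = generate(OfType(Any))
function generate(s::OfType)
    candidates = [T for T in keys(GENERATORS) if T <: s.T]
    isempty(candidates) && error("no generator registered for $(s.T)")
    return GENERATORS[rand(candidates)]()
end
generate(specs::Tuple) = map(generate, specs)   # tuple spec => tuple of data

# Roughly the shape (1, 2, _, ::Real):
generate((Lit(1), Lit(2), Wildcard(), OfType(Real)))
```

Picking a random registered subtype for ::Real is one simple design choice; it is why the last slot can come out as an Int on one call and a Float64 on the next.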

Using it is easy: you can generate (nested) tuples, arrays, ASTs, user-defined data structures, anything, from a syntactic specification.

Example 1: generate a tuple that looks like (1, 2, _, ::Real).

_ means anything, and ::Real means a datum of type Real.

julia> spec1 = @spec (1, 2, _, ::Real);

julia> generate(spec1)
(1, 2, "1ERPLOA6K:qLsj_d>^XLm8L>wZ9:VoyMYVw", 91236578346507817552027220903026613041)

julia> generate(spec1)
(1, 2, 0xe36af271b15de98bc9cde9b1e0f98fe5, false)

julia> generate(spec1)
(1, 2, 0.56469476f0, Float16(0.635))

Example 2: generate an array of the form [1, _, <integers in 1:10>, [::DataType, ::Symbol]]:

julia> spec2 = @spec [1, _, (::Int) isa this_shape, [::DataType, ::Symbol]];

julia> generate(spec2)
4-element Array{Any,1}:
 1
  Float16(0.507)
 6
  Any[Real, Symbol("")]

Example 3: generate 2, 3, or 4 instances of a user-defined struct:

julia> struct MyData{T}
           a :: T
           b :: Symbol
       end

julia> spec3 = @spec [MyData(_, ::Symbol){2, 4}...];

julia> generate(spec3)
2-element Array{MyData,1}:
 MyData{Symbol}(Symbol("\\\\"), Symbol(""))
 MyData{Complex{Float64}}(0.4846091787580733 + 0.6129709720838106im, Symbol(""))

Example 4: generate ASTs from the Expr constructor or from quotations.

julia> generate(@spec :(a % $(::Int) == $_))
:(a % -4985509457280692999 == Any)

julia> generate(@spec Expr(:ref, ::DataType, (::Symbol){1, 2}...))
:((Real)[var"=;vX3=", var""])

Well, this random data generator might be taking up too large a part of this post… but I’d say it’s just too interesting. When I came up with the idea I got so excited that I didn’t sleep that night and just kept playing with it…
