It is definitely a good idea to move dfexpr out of the function body, to help with compile times. Better still, when which derivatives I’m taking is known at compile time, the generated-function version doesn’t have to do the “if” checks at all.
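For reference, a minimal sketch of the structure I mean, with a placeholder body standing in for the real derivative expressions (gamma_moment_sketch, dfexpr, and its body here are purely illustrative, not the actual definitions):

# Build the expressions once, at top level, rather than inside the
# generated function body:
dfexpr(N) = :((a + $N) / b)        # placeholder; the real version builds the N-th derivative
const dfexprs = ntuple(dfexpr, 5)  # all five expressions, computed up front

# Each Val{N} method gets its expression spliced in at compile time,
# so the compiled code contains no runtime "if" checks:
@generated function gamma_moment_sketch(a, b, ::Val{N}) where {N}
    return dfexprs[N]
end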
A comparison on a different computer; the _nofm version means without @fastmath:
julia> @benchmark gamma_moment(2,2,Val(2))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.175 ns (0.00% GC)
median time: 4.537 ns (0.00% GC)
mean time: 4.686 ns (0.00% GC)
maximum time: 21.161 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark gamma_moment(2,2,2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 23.837 ns (0.00% GC)
median time: 23.851 ns (0.00% GC)
mean time: 24.677 ns (0.00% GC)
maximum time: 44.803 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 996
julia> @benchmark gamma_moment_nofm(2,2,Val(2))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 15.349 ns (0.00% GC)
median time: 15.489 ns (0.00% GC)
mean time: 15.960 ns (0.00% GC)
maximum time: 37.048 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998
It’s also easy to wrap the Val version to support derivative orders chosen at runtime (up to a fixed maximum):
julia> function gamma_moment(a, b, N::Int)
           Base.Cartesian.@nif 5 i -> (i == N) i -> gamma_moment(a, b, Val{i}()) i -> error(string(N) * "-th not supported")
       end
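(Note that error already throws, so the throw wrapper isn’t needed.) For reference, the @nif call above unrolls into a chain of branches roughly like the following, which also means N == 5 lands in the error branch; bump the 5 if higher orders are needed:

if 1 == N
    gamma_moment(a, b, Val{1}())
elseif 2 == N
    gamma_moment(a, b, Val{2}())
elseif 3 == N
    gamma_moment(a, b, Val{3}())
elseif 4 == N
    gamma_moment(a, b, Val{4}())
else
    error(string(N) * "-th not supported")
end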
julia> @benchmark gamma_moment(2,2,2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 14.129 ns (0.00% GC)
median time: 14.131 ns (0.00% GC)
mean time: 14.722 ns (0.00% GC)
maximum time: 30.733 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998
In my application, which derivatives are being taken is known at compile time, so I can stick with the Val version.
The problem with wanting to use it for symbolic differentiation is that it wouldn’t support things like for loops or repeated assignments (unless I’m mistaken):
function example(a, b)
    target = 3*a*b
    target += 2*(1-a)*(1-b)
    target
end
so the thing to do would be to write a macro that translates the above function body into the single expression :(3*a*b + 2*(1-a)*(1-b)), taking care to correctly support things like
function example(a, b)
    z = 3
    target = z*a*b
    z -= 1
    target += z*(1-a)*(1-b)
    target
end
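A naive sketch of the substitution idea (flatten_body and substitute are hypothetical helpers; this only handles straight-line =, +=, and -= assignments, and loops and branches are where it gets hard):

# Replace every symbol that has a known symbolic value:
substitute(ex, env) = get(env, ex, ex)   # symbols and literals
substitute(ex::Expr, env) = Expr(ex.head, map(a -> substitute(a, env), ex.args)...)

# Walk the statements in order, tracking each variable's current
# symbolic value, so reassignments like z -= 1 are respected:
function flatten_body(stmts)
    env = Dict{Symbol,Any}()
    result = nothing
    for ex in stmts
        ex isa LineNumberNode && continue
        if ex isa Expr && ex.head == :(=)
            env[ex.args[1]] = substitute(ex.args[2], env)
        elseif ex isa Expr && ex.head == Symbol("+=")
            env[ex.args[1]] = :($(env[ex.args[1]]) + $(substitute(ex.args[2], env)))
        elseif ex isa Expr && ex.head == Symbol("-=")
            env[ex.args[1]] = :($(env[ex.args[1]]) - $(substitute(ex.args[2], env)))
        else
            result = substitute(ex, env)   # e.g. the trailing target
        end
    end
    result
end

On the second example this yields roughly :(3*a*b + (3 - 1)*(1-a)*(1-b)), which is correct, though a real version would also want constant folding; that is another place a CAS would help.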
One of my motivations here is realizing how slow the AD packages I’ve tried are at finding Hessians with respect to x of functions of the form:
f(x,A) = x' * A * x / 2
where the answer is simply (A + A')/2, which is just A when A is symmetric. Instead of being essentially a no-op, they’re pretty slow. A symbolic approach should be able to simplify things and get there quite easily. I can’t help but think packages like XGrad would be easier to write if they could use a CAS. Perhaps it could also be combined with DataFlow (as a better @fastmath) to eliminate redundant code when computing all the derivatives 0,…,k.
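For concreteness, here is the kind of check I mean, using ForwardDiff as one example (a hypothetical snippet, not a claim about any particular package’s internals):

using ForwardDiff, BenchmarkTools

f(x, A) = x' * A * x / 2

A = [2.0 1.0; 1.0 3.0]   # symmetric, so the Hessian is exactly A
x = ones(2)

ForwardDiff.hessian(z -> f(z, A), x)   # returns A, computed the hard way via nested duals
@benchmark ForwardDiff.hessian(z -> f(z, $A), $x)

A symbolic pass that recognizes the quadratic form could instead emit (A + A')/2 directly.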