Evolutionary Programming

I’m trying to work through Dan Simon’s Evolutionary Optimization Algorithms: Biologically-Inspired and Population-Based Approaches to Computer Intelligence, and would like, to figure out how to turn his sample Genetic Programming lisp code into Julia code. (His code is here, though I can’t find the id to link directly to chapter 7.)

The manual says that it’s possible to write “true Lisp-style macros operating at the level of abstract syntax trees”. But I’m having trouble finding any information on how to do so, and I keep getting errors. I may be able to reinvent the wheel after lots of trial and error, but I’d be very interested in resources.

2 Likes

The chapter you link is one of the best starting points. If you have specific questions, just ask them on this forum, which you should also search for “macro” to find a lot of discussions about macros.

That said, metaprogramming is not needed in Julia as frequently as in CL. See eg

In particular, try higher order functions first.

2 Likes

Thanks for the response.

Metaprogramming is important here, because the point of Genetic Programming is to run the following program:

Randomly generate n parent programs, parents, that attempt to perform some task.
While ! termination condition.
   for prog in parents
      Measure f_{prog}.
   end
Randomly select n/2 duples parents, pairs. #More fit programs more likely to be selected.
children = {code}[] #Empty vector (or tuple or...) of code.
for pair in pairs
   select nodes in syntax tree of pair[1] par[2]
   child1 child2 = parents with subtrees swapped at those nodes
   push!(children,child1)
   push!(children,child2)
end
parents = children
end

(This is my attempt at producing the pseudocode.)

The goal is to evolve optimal code for some task. Because this algorithm operates on the abstract syntax tree, metaprogramming is necessary here. (Though perhaps I should take the warning to heart and say this is too advanced me.)

In Lisp, it looks like algorithm is executed by operating on the strings, and I’ve played around with that in Julia, but it doesn’t work well, and the link makes sounds like its a bad practice. I’ve also tried playing around with building expressions with :(). But when I try to randomize the operation (from a list of, say, arithmetic operations) and the leaves, I run into problems related to how Julia parses literals, that I don’t understand well. I could ask about them, but I also suspect (though I don’t know) that this also isn’t best practice.

I think best practice would be to operate on the syntax tree itself with commands like:

code = :(+(2,3))
operators = [:+, :-, :*]
code.args[1] = rand(operators)
eval(code)

which randomly adds, subtracts, or multiplies 2 and 3. But I run into troubles doing much more.

For instance, if I wanted say, to iteratively create a vector, I could use:

v = Vector{Float64}()
for condition
   arg = foo
   push!(v,arg)
end

But I don’t know how to do something similar with code.

Similarly, with a DataFrame df, I can iterate through the columns with:

for name in Symbol.(names(df))

(Or something similar, I may have that a little off. That’s code’s similar to what Adrian Salceanu put in Julia Programming Projects, but I may have missed some nuances here.)

But I don’t know how to do anything like that here.

There still may not be a specific question there. What I’d like is examples of best practices for operating on, iterating through, and getting information about abstract syntax trees.

Check out Why use MLStyle.jl? — MLStyle.jl Documentation and

macrotools.jl

1 Like

I think that metaprogramming and ASTs are are distraction here. Cf

execute(f, args) = f(args...)
args = (2, 3)
f = rand((+, -, /))
execute(f, args)

Just operate on functions and values, and use a functional style. Depending on the space your genetic algorithm operates on, you could use a tree-like stucture or similar. Also, coding this to be type stable requires some tricks, but those can wait until the code stabilizes (and performance will never be worse than eval anyway).

FWIW, the same techniques apply in most Lisps, very little is Julia specific here. A modern Lisp approach could just use closures instead of AST construction, benefiting from modern Lisp compiler optimizations.

I am very surprised that some texts recommend AST manipulation. That is similar to using namespaces as a hash table: acceptable for texts from the 1980s, but not something a contemporary Lisp programmer would do.

7 Likes

Thanks! That’s helpful.

FWIW, this (I think it’s a grad school course project at MIT) implemented GP in Julia by operating on the AST’s. And, this talk from ClojureConj 2015 gives the same explanation for GP in Lisp as Simon does (and also around 13:30 he explicitly works on subtrees).

I think AST is important for re-combinations and mutations, not for the generation of the initial programs. That is, for programs to evolve, they need to produce children that are like the parents, but with “lines of code” (or something comparable) swapped. (Also, the functions not only need to be generated and evaluated, they need to be stored somewhere so they can be used to produce child functions. The ‘‘execute’’ function above returns a number, not a program, so the program the snippet produces can’t be used to produce a child program.) So, for instance, we’d like to be able to do something like the following example (from Simon), using Matlab code to the two parent programs, Program 1 and Program 2:

#Program 1
if x<1
    z = [1,2,3,4,5];
else
   for i = 1:5
      z(i) = i/x;
   end
end

and

#Program 2
for i = 1:10
   if x > 5
      z(i) = x^i;
   else
      z(i) = x/i;
   end
end
#Child program
for i = 1:10
   if x > 5
else
   for i = 1:5
      z(i) = i/x;
   end
end

Where lines 1 and 2 of the child program come from parent 2, and the rest of the lines come from parent 1. Obviously, this cannot work as the child program is garbled nonsense.

But, he says, because s-expressions in lisp correspond to nodes in the AST, we can perform that sort of operation in lisp. For instance (his example), given parent 1 and parent 2 as follows:

#parent 1
(+ (* x y) abs z)
#parent 2
(- (* (+ x z) x) (+ z (/ y x)))

we can swap the s-expression (*x y) from parent 1 with the s-expression (+ z (/ y x)) from parent 2 to get two children:

#child 1
(+ (+ z (/ y x)) abs z) # parent 1, but (+ z (/ y x)) where (* x y) had been.
#child 2
(- (* (+ x z) x) (* x y)) #parent 2, but with (* x y) where (+ z (/ y x)) had been.

This swap can (he says) be performed in lisp by finding an opening parenthesis and a matching closing parenthesis in each parent program, and swapping everything between them in the two programs. As I said, the Conjure Conj talk I linked above gives a similar example.

So though your the functional example is very helpful (especially since one of my goals in looking into this is understanding Julia better), I don’t think it can get me all the way there.

Anyway, I wanted to look into this so I could understand how Julia works better, and even if I don’t find answers to all the questions, I understand Julia a lot better.

Anyway, I think I confused things by putting eval() in my program above. I really am interested in generating a code that can be stored and operated on (by mixing it with other code), not in a convoluted method for generating a function with a random operator—though that’s not what I made it sound like last night.

I think eval() was left over from my sandboxing with code in julia, and my attempts to make sure I understood what was happening. That is, I was using it as a check so I could make sure I get the code I thought I got, and wasn’t just misreading the generated code (which, when I do things wrong can look right, but not be).

You can try this tutorial:

and notebooks here:

Some of the syntax is a bit old, but I think most of the metaprogramming stuff should still be OK.
And the ideas of course are all the same.

A more up-to-date version of the notebooks is here:

Hope that helps!

2 Likes

This is exactly what I was looking for. Thank you!

1 Like

I suspect that is the main concern: Just randomly mucking around with the full blown power of syntax trees are very likely to generate “garbage” that doesn’t run, for example getting into infinite loops, etc. Nevermind the learning curve to understand how to use the syntax.

When I played with Evolutionary Programming almost 20 years ago it was in the context of evolving a formula. It was simply done in C, by defining a tree: Leave nodes are literals or variables, branch nodes are binary operators, for example *, +, -, , with links to a left node and a right node. Unary operators like sqrt, exp, etc. are also easy as non-splitting branch nodes. You can easily perform cross-over by taking a node and the sub-tree below it and swapping it with a node in a different parent.
Extending it to the ternary operator ?: (which is actually just an alias for if syntax) should also not be too difficult.

Adding for loop should again be possible, but you probably then also need to think about storing variables etc.

While it might sound like it quickly converges to the full AST, I would still argue it is better to add it manually to force thinking through what you want to allow and what not. For example:

  • May the program loop and go into an infinite loop?
  • May it allocate new variables? If so how much, what happens if it allocates too much and crashes your whole Julia session, etc.
4 Likes