I have set an ambitious goal of modeling some Wolfram Physics Project features into Julia. Because actually, Julia works faster and more efficiently, while the subject is very interesting.
A bit later I will be happy to present a beta version of a small package in which everyone can participate.
In the meantime, I’d like to ask a few questions.
One of the key concepts in WPP is the Multiway system. Its essence is simple.
Let there be an initial string "A" and dictionary of rules: ("A" => "BBB", "BB" => "A").
Using them, we first get an obvious result: “A” → “BBB”
But the next step we can do in several ways (greedy method, first occurence method etc), of which I am interested in Multiway method, when we collect all possible results, so: “BBB” → [“AB”, “BA”] (this is like the Multiverse).
Looks like Julia’s built-in features don’t allow this. We will have to make an array from the string and analyze it manually. In this case, probably should abandon the strings at all?
But maybe I don’t know all the possibilities, so I’ll take any advice.
Hmm, ok, may be I should do it with Regex, but what about replacement (and to collect different ways)?
And! What is maximal string length that not overload Regex searching?
I would also consider a domain modeling approach where you can create your own data type and maybe implement the AbstractTrees interface over it.
As a general rule, if performance is paramount and you have a limited set of rules, and all the key / replacement values are known at the compile time, then you might want to give up the String anyway and replace it with a more memory friendly type (you might still generate a String output as final result).
I’m not sure it does what you want, but using similar techniques to write a “Rule to Julia” compiler might be a performant (though not necessarily simple) way to do what you want.
Yes, your answer is very close to what I’ve found. AbstractTrees is musthave for Multiway System, thank you!
Sooner I’ll publish a small “SubstitutionSystem” project and will be glad of your advice!
The problem is just find sub-vector in vector! How to make my_findfirst_vec,
so my_findfirst_vec([1, 1], [0, 1, 0 , 0, 1, 1, 1, 1]) return 5:6 (0_0)
I cannot find such recipe in community
If you know all replacement rules in advance the data structure you’re looking for is Aho-Corasik automaton (sometimes known as index automaton). It allows e.g. to find all matches in time linear with the length of the input string.
However it is not clear to me how do you derive A → [AB, BA]. To me your rules generate an infinite chain (or rather a tree) of replacements which you could in theory traverse breadth-first. The question is what do you want to obtain?
(to avoid infinite chains/branches one usually adds shift-invariant well-order on the free monoid of strings (e.g. shorter-then-lexicographical) – this of course is not trouble free, as choosing the right order for your application is a problem of its own
I’m going to implement functions SubstitutionSystem and MultiwaySystem from Wolfram ecosystem on Julia.
In some ways they are similar to Aho–Corasick algorithm, but in many ways they are not.
First of all, there are no problems with infinite string expansion and numbers of replacement results.
In the near future I will post a link to Git, and I really hope that Julia’ community will help me to create a good package!
Aho-Corasik is just (just?!) a data structure that allows you to find all occurrences of LHSes in a given input string in optimal time. Here by “string” I mean anything that is linearly ordered. I’m pretty sure Mathematica uses it behind the scenes to describe the “evolution of MultiwaySystem”. After reading their docs it’s still very much unclear to me what does it really mean, but hey, maybe your package is going to do a better job! I’m looking forward to reading it!