Now in v1.10 the parser is written in Julia, but we can still enjoy the FemtoLisp REPL with the --lisp flag. Is FemtoLisp purely an Easter egg now? Or does it do anything?
It’s still used for lowering for the time being. The flisp parser is also still available as a fallback and for bootstrap.
It still does. It did parsing and “lowering”, I think. Now only the latter, unless you opt into the older parser. That’s still possible in case there are bugs, but it will likely go away, since there don’t seem to be any bugs anymore, or at most a few.
I expect that one remaining use to go away as well, though I’m not the expert on when, or on how much of a problem that would be. And with it likely the “Easter egg”… It might become an optional JLL/package in case you miss it. Do you?
I’m not sure the command line option will survive, does anyone rely on that undocumented feature?
What do you mean by bootstrap? I’m not sure I understand it well enough; I see changes that may be related to it(?). I think I understand lowering well enough, but for the flisp part of it, and for bootstrapping, would you say it’s a lot of code, or is it likely easy to replace?
If the parser is written in Julia, how do you parse the parser without already having a Julia parser?
One answer would be to have bootstrapping be done from a (yet to be made) Julia with S-expressions. This could be automatically generated by the parser so you only need 1 implementation of the parser.
Yeah, that would be a nice way to do it. In the meantime the flisp parser parses the Julia parser.
If we have a binary of the previous parser it can be used to build the next version.
That’s not very appealing, because in order to build any version from scratch you have to generate a long chain of compiled versions, and if any step of that chain stops working you no longer have a reproducible build process. Checking binaries in is ugly, and then you just have these binaries that you need but whose origin you can’t really trace.
By contrast, Oscar’s suggestion of having a simple s-expr parser written in C would work like this:
- Parser is written in Julia
- Initially the Parser is parsed with the legacy flisp parser
- The parser parses itself and emits s-exprs for the parser code
- Commit these generated s-exprs along with Julia source for parser
- Also implement a simple s-expr parser in C that can parse saved s-exprs
- Bootstrapping: the simple s-expr parser parses the saved s-exprs, which produces a working Julia parser that can be used to parse Julia code
Normally the Julia parser goes straight from Julia source => Julia AST, which is fine when you have a fully working parser already. This approach effectively uses s-exprs as a “virtual machine” for parsing, decoupling the complex parsing part, i.e. Julia source => s-exprs, from the platform-specific part, i.e. s-exprs => Julia AST. This is analogous to how JVM byte code is used to separate the Java language front-end, which can “just” produce VM byte code, from the platform-specific part that needs to take byte code and actually run it on a specific platform. But in this case, “interpreting” s-expr byte code is just a matter of turning a file with s-exprs into Julia AST, which is really very straightforward (much more straightforward than implementing a JVM). This way you can do the Julia source => s-exprs part on a system where you already have a working parser, save the s-exprs (which are system-independent), move them to another system where you don’t have a working parser, and use a simple s-expr parser to convert them into in-memory Julia AST on the new system, which can then actually be evaluated.
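To give a feel for how small the “s-exprs => AST” step can be, here is a minimal sketch of an s-expr reader and writer. It is written in Python for brevity, not the C the proposal calls for, and everything in it is an assumption made for illustration: the function names (`tokenize`, `read_sexpr`, `read_all`, `write_sexpr`), the `(call + 1 ...)` layout (loosely modeled on how a Julia call expression is structured as a head plus arguments), and the atom rules. It deliberately ignores strings, comments, and quoting, which a real serialization format would need.

```python
# Hypothetical sketch: a tiny s-expression reader/writer.
# Nested lists stand in for AST nodes; this is NOT Julia's actual format.

def tokenize(text):
    """Split s-expression text into '(', ')' and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse_atom(token):
    """Integers become int; every other atom is kept as a symbol (str)."""
    try:
        return int(token)
    except ValueError:
        return token

def read_sexpr(tokens, pos=0):
    """Read one expression starting at tokens[pos]; return (expr, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == ")":
        raise SyntaxError("unexpected ')'")
    if token != "(":
        return parse_atom(token), pos + 1
    items = []
    pos += 1  # skip the opening parenthesis
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = read_sexpr(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return items, pos + 1  # skip the closing parenthesis

def read_all(text):
    """Read every top-level expression in a saved s-expr file's text."""
    tokens = tokenize(text)
    exprs, pos = [], 0
    while pos < len(tokens):
        expr, pos = read_sexpr(tokens, pos)
        exprs.append(expr)
    return exprs

def write_sexpr(expr):
    """Inverse of read_sexpr: nested lists back to s-expression text."""
    if isinstance(expr, list):
        return "(" + " ".join(write_sexpr(e) for e in expr) + ")"
    return str(expr)

if __name__ == "__main__":
    saved = "(call + 1 (call * x 2))"
    tree = read_all(saved)[0]
    print(tree)
    print(write_sexpr(tree) == saved)
```

The writer corresponds to the “parser parses itself and emits s-exprs” step, and the reader to the bootstrapping step. The point of the sketch is only that the reader side has no knowledge of Julia syntax at all, which is what would keep the C version small.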
When using one of these binaries in a new build, one would be required to trust every binary ever used in its “ancestry”. Each would be an attack vector not only for the parser being built, but for anything the new parser ever generates (its “descendants”).
This reminded me of wasm, since it uses s-exprs for its text format, and the wasm binary format acts similar to bytecode for the JVM.
But that has me wondering—would such a “bootstrapped” s-expr Julia parser make for easier porting to new environments (for the parser, at least)? Or perhaps act as a “target format” for alternate front-ends?
Yes, porting to new environments is one of the main use cases: on a new system you “just” need to compile the C s-expr parser (and have a working LLVM backend). It could also be a target for people wanting to use a different syntax front end for Julia.
I guess I’m playing devil’s advocate, but I don’t think you’re strictly correct. There’s been academic research on this topic (countering Ken Thompson), and Linux (and other) software distributions have to deal with the same and related issues: https://reproducible-builds.org
Afaik with reproducible builds you only need to trust the original inputs; intermediate builds are deterministic and verifiable.
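As a rough illustration of what “verifiable” means in practice: if a build is deterministic, two parties building from the same inputs should get byte-identical artifacts, so comparing cryptographic hashes is enough to check an intermediate binary. The sketch below uses Python’s standard `hashlib`; the helper names (`sha256_of_file`, `builds_match`) and the idea of comparing two local files are assumptions for the example, not part of any particular project’s tooling.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def builds_match(path_a, path_b):
    """True if two independently built artifacts are byte-identical."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

A mismatch does not say which build is wrong, only that the two differ, so this check is useful only when the build process is deterministic in the first place.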
The Rust compiler is built using its own previous minor version.
The Zig folks used WASM/WASI to self-host.
https://ziglang.org/news/goodbye-cpp/
That’s a very cool approach, but I think for us the s-expr approach is much simpler. Why can we do that whereas Zig can’t? One reason is that Julia AST is already basically s-exprs, so this is really easy for us, whereas Zig would have to design a syntax serialization format that’s simple to parse yet general enough for the entire Zig syntax (which presumably wasn’t designed with s-exprs in mind). The other major difference is that Zig is trying to fully bootstrap their entire compiler, so they need a way to run the whole thing on a new system, not just the parser. Julia, on the other hand, isn’t fully bootstrapped (our core runtime is written in C), so we just don’t have as hard a bootstrapping problem as they do, and a simpler solution works for us.