Juliac using packages

I use juliac to generate small binaries from Julia code. It is a great tool.
Unfortunately, only a few packages can be used with juliac (e.g. DataStructures).
Using other useful packages (e.g. TOML, XML, CSV, DataFrames) results in errors at compile time: verifier errors occur
(e.g. “unresolved call from statement DataFrames.length”).

Why does compilation fail?
Is it because the packages are not type-stable enough for juliac?

It would be very beneficial to be able to use packages when generating stand-alone binaries with juliac.
Is it possible, or is there interest, to re-implement some of these packages so that they can be used with juliac?

Yeah, those packages are type-unstable. Maybe something like TypedTables works? But dealing with formats like those in a static way that looks like what Julia usually does is quite tricky, sadly. You would need to implement something more akin to what the C++ or even C versions of such libraries look like. The SciML ecosystem mostly works, for example, thanks to years of getting rid of all type instabilities in it.

Direct reimplementation to 100% type-stable, generic code would often cause other compilation problems, see: Why DataFrame is not type stable and when it matters | juliabloggers.com. Even statically typed languages will handle similar situations dynamically, just fundamentally limited (single dispatch, static type/interface restrictions on dynamically dispatchable methods, designated virtual methods, sum types, pattern matching) to the point a compiler can compile every possible call and objects can simply look those up. Julia chose different limitations, but that also means we can’t reasonably make a lookup table of compiled functions (not the same as the global method table) in advance for a lot of things out there. I expect actively developed packages to follow base Julia in the shift toward supporting JuliaC, but it’s very likely that different, more limited implementations will crop up. Limited doesn’t mean simple here, a lot can happen within limitations.
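
To illustrate what such a "limited" design could look like, here is a minimal sketch of a sum-type-like column using only Base Julia. The names Column and column_total are hypothetical and do not come from any package. Because the set of possible column types is closed, every call in each branch can be resolved ahead of time. I have not checked this against juliac itself.

```julia
# Hypothetical closed set of column types (a small "sum type")
const Column = Union{Vector{Int}, Vector{Float64}}

# Branch explicitly on the concrete type; inside each branch the
# compiler knows exactly which `sum` method is called.
function column_total(col::Column)::Float64
    if col isa Vector{Int}
        return Float64(sum(col))
    else
        return sum(col)
    end
end

cols = Column[[1, 2, 3], [0.5, 1.5]]
totals = [column_total(c) for c in cols]
```

The trade-off is that adding a new column type means extending the Union and the branches by hand, which is the kind of restriction statically compiled languages impose as well.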

File IO packages like those fundamentally need a special design to be type-stable. In particular, the user needs to explicitly pass types for the expected structure, be it CSV column names + types or TOML tree structure.
Naturally, this wasn’t a priority or focus of many people before, and even now there’s only a relatively small subset of Julia users who really need full type-stability end-to-end. But it’s possible!
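
A minimal sketch of what "explicitly passing types" could look like, using only Base Julia. The function parse_row is hypothetical (not from CSV.jl or any other package), it handles only types that `parse` accepts such as Int and Float64, and I have not tested it with juliac.

```julia
# Hypothetical helper: parse one comma-separated line into a NamedTuple
# whose names and field types are supplied by the caller as a type.
function parse_row(::Type{NamedTuple{names, types}}, line::AbstractString) where {names, types}
    fields = split(line, ',')
    # The tuple length and each field type come from the type parameters,
    # not from the file contents.
    vals = ntuple(i -> parse(fieldtype(types, i), strip(fields[i])), Val(length(names)))
    return NamedTuple{names, types}(vals)
end

row = parse_row(@NamedTuple{a::Int, b::Float64}, "1, 2.5")
```

The schema lives in the call site, so the result type does not depend on what happens to be in the file; a mismatch shows up as a runtime parse error instead.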

After data is loaded, the situation is different – there’s no shortage of type-stable tables and operations on them!
Types include at least base Julia vector-of-namedtuple, StructArrays, TypedTables; tabular operations – base Julia map/filter/…, and SplitApplyCombine, DataManipulation, etc packages.
I have no idea whether these work with juliac out of the box, but at least they are mostly designed to be type-stable already.
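
For reference, the Base Julia vector-of-namedtuple variant looks roughly like this. It is only a sketch with made-up data, and whether it compiles with juliac is something I have not verified.

```julia
# A "table" as a plain vector of NamedTuples; the element type is concrete
rows = [(a = 1, b = 2.0), (a = 2, b = 4.0), (a = 3, b = 6.0)]

# Ordinary Base operations act as tabular operations
big = filter(r -> r.b > 3, rows)   # keep rows with b > 3
sums = map(r -> r.a + r.b, big)    # derive a new column
```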

Thank you! Your replies give hope that it is possible to use packages when compiling with juliac. Probably the packages have to be adapted or restricted.

I did try TypedTables. It works when the column names are known within the Table constructor, e.g. t = Table(a = [1, 2], b = [2, 4]). However, I need to construct a table dynamically, i.e. from a vector of column names. juliac does not seem to compile this so far. For example,

function (@main)(args::Vector{String})::Cint
  column_names = [:a, :b]
  values = [[1, 2], [2, 4]]
  n = (; zip(column_names, values)...)
  return 0
end

results in error messages.
Even NamedTuple{(:a,:b)}([[1,2],[2,4]]) cannot be compiled (see https://github.com/JuliaLang/JuliaC.jl/issues/93).

The NamedTuple problem seems to be that, given a Vector{T} argument and a tuple of N symbols, Julia doesn’t infer that the resulting type for the values needs to be NTuple{N, T}. So in your case, NamedTuple{(:a,:b)}([[1,2],[2,4]]), you have 2 symbols (:a, :b), so you’d expect the value type NTuple{2, Vector{Int}}. But instead you get a UnionAll. Here’s an example. I had to work around the fact that printing of the UnionAll type is broken (it tries to use the @NamedTuple{a::X, b::Y} style, but there’s no X and Y).

julia> function without_value_types()
           nt = NamedTuple{(:a, :b)}
           nt([1, 2])
       end
without_value_types (generic function with 1 method)

julia> # this is a UnionAll type, Julia seemingly doesn't infer how many Int64s there should be,
       # even though it can only be 2 without erroring
       Base._return_type(without_value_types, Tuple{}).body.parameters
svec((:a, :b), NTuple{N, Int64})

julia> function without_value_types_tuple()
           nt = NamedTuple{(:a, :b)}
           nt((1, 2))
       end
without_value_types_tuple (generic function with 1 method)

julia> # this is concrete because (1, 2) has length 2 in the type
       Base._return_type(without_value_types_tuple, Tuple{}).parameters[2]
Tuple{Int64, Int64}

julia> function with_value_types()
           nt = NamedTuple{(:a, :b), Tuple{Int,Int}}
           nt([1, 2])
       end
with_value_types (generic function with 1 method)

julia> # and giving the value type explicitly as Tuple{Int,Int} is also fine
       Base._return_type(with_value_types, Tuple{}).parameters[2]
Tuple{Int64, Int64}

This is fundamentally type-unstable

@Jules Thank you! From your examples and description, I found that compiling with juliac succeeds as soon as the names and the values are passed as tuples instead of vectors.
For example:

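using TypedTables

function (@main)(args::Vector{String})::Cint
  column_names = (:a, :b)  # a tuple, so the names are part of the type
  values = [[1, 2], [3, 4]]
  # Build the values as a tuple of fixed length instead of passing the vector
  nt = NamedTuple{column_names}((values[1], values[2]))
  t = Table(nt)
  return 0
end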

Constructing a named tuple from a vector of column names or values does not work. Also converting a vector to a tuple with e.g.

column_names = Tuple(column_names)

or

column_names = NTuple{length(column_names), Symbol}(column_names)

fails when compiling.
So it seems to be fundamentally type-unstable, as @aplavin mentions.
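
The difference can be made visible in a normal Julia session with Base.return_types (a reflection helper, not a stable public API). This is only a sketch; to_tuple and to_pair are hypothetical names for illustration. The length of a Vector is not part of its type, so the tuple built from it cannot have a concrete type.

```julia
# Length comes from the runtime value of `v`
to_tuple(v::Vector{Symbol}) = Tuple(v)

# Length is fixed in the source code
to_pair(v::Vector{Symbol}) = (v[1], v[2])

inferred_dynamic = only(Base.return_types(to_tuple, (Vector{Symbol},)))
inferred_fixed = only(Base.return_types(to_pair, (Vector{Symbol},)))

println(isconcretetype(inferred_dynamic))  # tuple of unknown length
println(inferred_fixed)
```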

Well, all parts of the NamedTuple type need to be knowable at compile time, so runtime names or runtime value types don’t work. But you can still have type-stable table code with runtime names, just not with a NamedTuple.
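
As a rough sketch of that idea (RuntimeTable and column are hypothetical names, all columns are assumed to be Vector{Int} for simplicity, and this has not been checked against juliac): the names are stored as ordinary runtime data, while every field of the container has a concrete type.

```julia
# Hypothetical container: names are data, not type parameters
struct RuntimeTable
    names::Vector{Symbol}
    columns::Vector{Vector{Int}}
end

# Look up a column by name at runtime; the return type is always Vector{Int}
function column(t::RuntimeTable, name::Symbol)::Vector{Int}
    i = findfirst(==(name), t.names)
    i === nothing && throw(KeyError(name))
    return t.columns[i]
end

t = RuntimeTable([:a, :b], [[1, 2], [2, 4]])
b = column(t, :b)
```

The price is that `t.b`-style access with a per-column element type is gone; mixed column types would need something like a small closed Union of vector types.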

If I forget about the experimental trim option (and hence type stability issues), is it easy to use juliac as a convenient command-line front-end of good old PackageCompiler to generate standalone app bundles?

@jules

Can you give an example?
I used a matrix (of strings), but I would prefer something more in the direction of typed tables.

@greatpet Yes, without the trim option these type-stability issues do not arise. For example, for generating a typed table from a vector of column names and values, juliac produces a 189 MB stand-alone executable that does the job.

In case you know, is the precompile_execution_file option of PackageCompiler exposed to juliac? (Of course, you’d only need such an option for non-trim builds.)

@greatpet At least I did not find the precompile_execution_file option in the source code of juliac.

How can one use the SciML ecosystem if one cannot load the data for it?

Regarding the issue of constructing a NamedTuple from a vector of values, and thus a TypedTables Table from a vector of values: it can be compiled when the types are annotated for the compiler. Here is an example:

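using TypedTables

function (@main)(args::Vector{String})::Cint
  column_names = (:a, :b)
  values = [[1, 2], [3, 4]]
  # Put the tuple length and element type into the type itself so the compiler can infer them
  values_tuple = NTuple{length(column_names), Vector{Int64}}(values)
  nt = NamedTuple{column_names}(values_tuple)
  t = Table(nt)
  zeige(t)  # zeige ("show" in German): presumably a user-defined display helper, not shown in this post
  return 0
end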