Details:
I am calling C functions and using C structures from Julia, with Clang.jl automating the wrapping process. The C code contains multiple arrays of one or more dimensions. To keep a one-to-one mapping between C and Julia structs, Clang.jl converts C arrays into tuples, and multidimensional arrays into nested tuples. Here is a toy example:
typedef struct {
int x[11][22][33];
} s1;
is converted into
struct s1
x::NTuple{11, NTuple{22, NTuple{33, Cint}}}
end
Problem:
The problem is that the code takes a long time to compile (~6 minutes on the first run and ~0.2 seconds for subsequent ones). After doing some research, I realized that large (and nested?) tuples take a long time to compile, as discussed in this issue and this one.
To study this issue, I used SnoopCompile.jl to analyze the inference time and Cthulhu.jl to squash type instabilities but, after more time than I care to admit, I realized that inference was hardly the bottleneck of the initial run.
I already had a great discussion with the maintainer of Clang.jl and we came up with some solutions/workarounds. See this discussion for more information.
Additionally, I work with (relatively) large tuples of structs so this could contribute to the lengthy compilation process.
Possible solutions:
1. Precompile: This is the most obvious one. However, the code that is slowing me down is in the package I am working on. Therefore, every time I make changes I would have to precompile again.
2. Flatten the nested tuples: My theory was that if I create linear tuples instead of nested ones, the compilation time will decrease and I will finally be happy. For example, replace:
struct s1
x::NTuple{11, NTuple{22, NTuple{33, Cint}}}
end
with
struct s1
x::NTuple{11 * 22 * 33, Cint}
end
Alas, I only saved 10 seconds out of 6 minutes. Not good. Let’s keep moving.
3. Create and allocate complex structs using C APIs: This idea was suggested by the maintainer of Clang.jl, @Gnimuc. I am still working on this one, but I am having trouble keeping some of the variables passed to C from being garbage collected. Results of this approach are still pending.
4. Create aliases to the structs with arrays: The approach is to create another struct (s1bar for example) with all the arrays converted to pointers. Then, populate s1bar in Julia and pass it to C. Finally, copy the data appropriately from s1bar to s1. s1bar could look like this:
typedef struct {
int *x; //[11][22][33]
} s1bar;
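To make option 4 more concrete, here is a minimal sketch of what the C side of the copy step could look like. The helper names s1_from_bar and s1_flat_index and the NX/NY/NZ constants are hypothetical (they are not part of my real code or of Clang.jl), and the sketch assumes the buffer behind s1bar.x holds 11 * 22 * 33 ints in row-major order, which is the same layout as the flattened tuple in option 2.

```c
#include <stddef.h>
#include <string.h>

/* Dimensions of the toy example (hypothetical names). */
enum { NX = 11, NY = 22, NZ = 33 };

/* Original struct: the array is stored inline, which is what ends up
 * as nested NTuples on the Julia side. */
typedef struct {
    int x[NX][NY][NZ];
} s1;

/* Alias struct: the array is replaced by a pointer to a flat buffer. */
typedef struct {
    int *x; /* assumed to point at NX * NY * NZ ints, row-major */
} s1bar;

/* Copy the flat buffer behind src->x into the inline array of dst.
 * Returns 0 on success and -1 if any pointer is NULL. */
int s1_from_bar(s1 *dst, const s1bar *src)
{
    if (dst == NULL || src == NULL || src->x == NULL)
        return -1;
    /* A C multidimensional array is contiguous and row-major, so a
     * single memcpy of the whole array is enough. */
    memcpy(dst->x, src->x, sizeof dst->x);
    return 0;
}

/* Offset of x[i][j][k] inside the flat buffer. */
size_t s1_flat_index(size_t i, size_t j, size_t k)
{
    return (i * NY + j) * NZ + k;
}
```

With something like this, Julia would only need to wrap s1bar and the helper, and s1 itself could stay on the C side. The Julia buffer that s1bar.x points to still has to be kept alive for the duration of the call, for example with GC.@preserve.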
Questions:
- Is there a workaround in Julia?
- Can I profile the compilation process? I want to zero in on the functions that take the longest to compile and tackle them somehow.
Workarounds:
One of the workarounds I found on the Julia Discourse is to turn off optimizations in Julia: julia -O0. This approach gets the compilation time down to ~1 minute and 30 seconds at the cost of runtime slowdown, obviously.
Any help would be much appreciated. Thank you in advance.