Making Int's size runtime-selectable with no performance kill

Dear all,

In my application (a mesh generator for CFD) I would like the user to determine whether Ints should be Int32 or Int64. The reason is that Int32 is usually large enough: meshes are still rarely larger than 2 billion points.

What I did is that in my main() function I define

const Label::DataType = eval(parse("Int$(parsedArgs["intsize"])"))

so the user can pass 32 or 64 as a command line argument.

Also, I define two types in the top-level scope of the modules as follows

const Face{T} = SVector{4, T}
const Cell{T} = SVector{8, T}

The code generates large lists of types Label and Face{Label}.
I use Label to parametrize all my functions and types using

where {Label <: Integer}

For instance,

type Block{Label <: Integer}
    vertexLabels::Vector{Label}
    vertices::Vector{Point}
    points::Vector{Point}
    cells::Vector{Cell{Label}}
    boundaryFaces::Vector{Vector{Face{Label}}}
    edgePoints::Vector{Vector{Point}}
    edgeWeights::Vector{Vector{Float64}}
    curvedEdges::Vector{CurvedEdge}
    nCells::SVector{3, Integer}
    gradingType::String
    grading::Vector{Any}
end

and

function make_block_edges!(block::Block{Label}) where {Label <: Integer}

The code works. But using Label has totally killed the performance, and using ProfileView I see that most of the diagram is red and filled with things from inference.jl. So, obviously, I've screwed this up. I am wondering if I could get any help on what I did wrong and how to do this properly.

Kind regards,
timofey

Just have the user pass Int32 or Int64, e.g.

function make_block(::Type{T}) where T <: Integer
    Block{T}(...)
end

make_block(Int64) 

Also,
nCells::SVector{3, Integer} will be slow. Either parameterize or use a concrete type.
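To illustrate the difference, here is a minimal sketch in current Julia syntax (struct rather than this thread's pre-0.7 type keyword). The struct names are hypothetical, and plain tuples stand in for SVector so the snippet needs no packages: a field typed with the abstract Integer is stored boxed and defeats inference, while a parameterized field is concrete.

```julia
# Abstractly-typed field: the element type Integer is abstract, so the
# tuple is stored boxed and every access goes through runtime dispatch.
struct SlowCounts
    n::NTuple{3, Integer}
end

# Parameterized field: Label is concrete for each instance, so accesses
# are fully inferred and unboxed.
struct FastCounts{Label <: Integer}
    n::NTuple{3, Label}
end

total(c) = c.n[1] + c.n[2] + c.n[3]

total(SlowCounts((2, 3, 4)))        # works, but type-unstable → 9
total(FastCounts{Int32}((2, 3, 4))) # statically inferred → 9
```

Running @code_warntype on the two calls should make the difference visible: the first infers a return type of Any, the second a concrete Int32.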


Hej! Thank you for the reply.

This is more or less what I do, I think. I pass Label as an argument to the function that creates the blocks. I can't have the "user" provide the type itself, though; I have to parse it from text first.

Good catch with nCells; I've fixed that now.

If Label is a constant, then do you even need Block to be a parameterized type?

You could do something like this:

const Label = parsedArgs["intsize"] == "32" ? Int32 : Int64
type Block                                                    
     vertexLabels::Vector{Label}                                                 
     ...
end

Hello!

True, but then Label must be in module scope, whereas currently it is parsed inside the main() function.

In other words, I currently have a module that provides all the functionality (parametrized wrt Label) and a script that uses that module with a main() function.
Basically, if the definition of Block is not in the same scope as where Label is defined, I have to parameterize, right?

OK. Yes, that seems to be the sensible way to do it. I’ve done similar things and gotten the same performance as with a fixed type, so I have no idea why it didn’t work for you.


Just look at the string and instantiate a Block{Int64} or Block{Int32} in main, then? A function barrier when you call something from main will take care of the type instability.
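As a hedged sketch of that pattern (run_mesh, the field, and the argument handling are hypothetical stand-ins for the real code, written in current struct syntax):

```julia
struct Block{Label <: Integer}
    vertexLabels::Vector{Label}
end

# Everything behind this call compiles specialized for the concrete Label.
run_mesh(b::Block{Label}) where {Label <: Integer} = sum(b.vertexLabels)

function main(intsize::AbstractString)
    T = intsize == "32" ? Int32 : Int64  # runtime choice: main itself is type-unstable here
    block = Block{T}(T[1, 2, 3])
    return run_mesh(block)               # function barrier: dispatch happens once, here
end

main("32")  # → 6
```

Only the single dynamic dispatch at the barrier is paid at runtime; the mesh-generation work itself runs at full speed for whichever T was chosen.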

It appears that the Integer in the type definition was in fact the main performance killer. Before parametrizing I had Int64 there…
Such things are scary, of course… who knows where else I have stuff that makes things 2x slower :slight_smile:.
Anyway, thank you for the help!

You should really rethink the eval parse stuff, though. It prevents your code from being precompiled, for example, for no benefit. Almost every time you use eval parse, there are better ways of doing it.

Ok, thank you! I will try to think of a different way. Although I have actually grown fond of eval parse!

If the only two alternatives are Int32 and Int64, then something like this should do it:

const Label = parsedArgs["intsize"] == "32" ? Int32 : Int64

Besides the precompile issue, eval can also be a security issue, and can lead to really strange error messages when given unexpected input, e.g. if you have parsedArgs["intsize"] = "32;String".


I think the simplest solution is to just tell the users to give the type name directly as the command-line argument and set the type of the argument to ::DataType. But I like your solution too, it is really neat!

I have no problem sacrificing parse eval here, but I use it in another part of the code, which reads expressions from a JSON file and evaluates them. This allows the user to define variables in the JSON and then use them in the mesh definition. For example, defining L=1 and then placing a point with the x coordinate equal to L. I'll have to think about how to do that in another way…

Unless you want to support arbitrary Julia code, you can just write a small parser for the stuff you need (e.g. variables).

The use case here seems quite simple. Depending on the value of a string, you want to instantiate a type with different parameters. A direct translation gives you if str == "..." instantiate(...) etc. Creating a new global variable that is read by the module when it is compiled seems like a very roundabout way of doing it.

You don’t even need to write a parser. Using Julia’s built-in parse works beautifully. (This is one of the things I really love about Julia!) All you need then is an expression evaluator that stores variables in a dictionary instead of polluting the local namespace. I wouldn’t be surprised if there’s already one around somewhere. Otherwise it could probably be done in like ten lines of code.
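A hedged sketch of such an evaluator, roughly in the spirit described (evalexpr is a hypothetical name, and in current Julia the built-in parser is Meta.parse rather than the bare parse used in this thread):

```julia
# Evaluate a parsed expression against a dictionary of variables instead of
# eval'ing it, so user-defined variables never touch a real namespace.
# Only numbers, known variables, and calls to Base functions are supported.
evalexpr(x::Number, vars) = x
evalexpr(s::Symbol, vars) = haskey(vars, s) ? vars[s] : error("unknown variable $s")
function evalexpr(e::Expr, vars)
    e.head === :call || error("unsupported syntax: $(e.head)")
    f = getfield(Base, e.args[1])  # resolve the function name, e.g. :+ or :sin
    return f((evalexpr(a, vars) for a in e.args[2:end])...)
end

vars = Dict(:L => 1.0)
evalexpr(Meta.parse("2 * L + 1"), vars)  # → 3.0
```

Because unknown variables and non-call syntax raise errors, this also rejects input like "32;String" cleanly instead of silently evaluating it.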

If you only support a small amount of syntax, writing your own parser/interpreter should make it easier to give good error messages and reject bad syntax than messing around with Julia's AST.