Functions in an input file-- how?

Stephen_Vavasis · May 14, 2017, 4:14pm

In finite element software, there is a notion of an ‘input file’, that is, input to the solver program, that contains the mesh, the boundary conditions, material properties, algorithmic choices and so on. In their full generality, boundary conditions are functions that map (timestep, coordinate) to a scalar or vector. What is the best way to implement functions as part of the input file? My current thinking is that the input file should have anonymous functions written in the Julia source language (with the -> operator). The solver can call ‘parse’ on these and then store the resulting expression in a dictionary: Dict{DOFIndex, Function}, where DOFIndex is the index of the degree-of-freedom to which the condition is applied. This raises a few questions:

(1) Is there a better approach?

(2) After the solver has parsed these anonymous functions and stored them in the Dict, what steps are necessary to prevent an excessive amount of run-time dispatch? Type annotations?

(3) If, down the road, this code is published, then I would like the solver to have safeguards that ensure that the anonymous functions in the ‘input file’ don’t have unwanted or even malicious side effects. Is there a notion of ‘safe’ functions in Julia, or perhaps a way to execute a function in a sandbox?

Thanks,
Steve Vavasis

yuyichao · May 14, 2017, 5:08pm

Why does it have to be input file? Can’t you just let the caller of your library supply those mapping as function arguments?

IIUC “input file” are usually used in stand alone cmdline/GUI tools in general. Scriptable library usualy don’t do that as the lowest level interface. You can certainly (and easily) build an interface for standalone tools based on a scriptable API using a julia script.

No. there’s no safe mode.

JaredCrean2 · May 14, 2017, 5:26pm

The way I handled this for my code was to create a small interface that lets the user interact with the code programmatically rather than textually (ie. an input file). The script the user has to write looks like:

using PDESolver  # load the package


# define IC and BC functions
function myIC(args..)
  # function body
end

function myBC(args...)
  # function body
end

# add these functions to the dictionary of IC and BC functions
register_ic(myIC)
register_bc(myBC)

run_solver(ARGS[1])  # take the input file as a command line argument
                     # The input file can specify myIC and myBC as 
                     # the IC and BCs to use because they are
                     # already registered

One nice attribute of this approach is that if the user doesn’t want to register their own ICs and BCs (presumably your solver has a few common ones built in), then the script becomes very simple:

using PDESolver
run_solver(ARGS[1])

I would include this script in your package, so users don’t have to use the scripting interface if they don’t need the additional functionality it offers.

kristoffer.carlsson · May 14, 2017, 5:30pm

Building a pure textual interface seems quite out of place for a language like Julia where you can just pass in the actual functions to the solver.

PetrKryslUCSD · May 14, 2017, 11:48pm

I am just at this moment also pondering a similar question. My code consists of modules which use stored functions to call back into the user code, for instance to define the thickness of a shell at any particular point or to define a local coordinate system.

So far so good: the problem is how to ensure good performance? I understand that this used to be a hard problem in Julia, given that a function is a function, and the compiler does not get to see the return type when it compiles the workhorse modules. I am hoping that this has been resolved by now. I know that there have been some attempts to provide fast anonymous functions (Tim) and function wrappers and so on.

Has this problem been solved?

Thanks,

Petr

yuyichao · May 15, 2017, 12:38am

That problem is long solved (in 0.5). Function wrappers (at least FunctionWrappers.jl) is actually solving exactly the opposite problem on top of 0.5.

Stephen_Vavasis · May 15, 2017, 1:16am

With regard to the suggestion from several posters that a Julia FEM solver in should not have an ‘input file’ but rather should be programmatically scripted, I will agree, with some reservations. In the world of commercial software (Abaqus, MSC Nastran, etc), the idea of ‘input file’ is deeply entrenched. Furthermore, there is some benefit in distinguishing the ‘input’ to a program from the program itself. However, since Julia promises a ‘fresh’ look at scientific computing, then maybe it’s time for fresh reconsideration of the paradigm, and @JaredCrean2 's suggestion looks promising.

With regard to the other issue of how to avoid excessive run-time dispatch, I don’t quite understand the proposed solution. Suppose my code looks like this. Here, bc is a Dict{DOFIndex, Function}.

   for dof in doflist
      v = bc[dof](timeval)
      u[unpack(dof)] += v
    end

How much run-time dispatch is present in this code, and if there is a lot, how can it be mitigated?

ChrisRackauckas · May 15, 2017, 2:54am

To add to this, instead of iteratively building some global in the background, just have a model type that you mutate to build. Then register_ic!(m,myIC) etc. could iteratively build onto this model m. You can look at JuMP as a good example here.

If you have a good Julia interface, writing something that parses an input file to that interface isn’t difficult (string macro or something like that). So I wouldn’t worry about that: I’d get a Julia interface first.

Every function is a different type, so bc[dof] is type-unstable. So v probably won’t be inferred, making the next setindex! dynamically dispatch at runtime when it gets the type information.

But that’s just not the right way to write this. First of all, you can strictly type a collection of functions if you instead use a Tuple. Then a recursive algorithm on the tuple can be type-stable.

Many times, level of inference actually isn’t needed and can actually get in the way by increasing compile times (every new function you want to solve would recompile, so there’s a tradeoff which depends on how quick the function calls are. Depending on where you are in the tradeoff, @yuyichao’s suggestion of FunctionWrappers.jl could be the right solution, or in many cases just using a type-assertion to properly type v is perfectly fine for halting “the growth of type-noninferability”.

kristoffer.carlsson · May 15, 2017, 6:43am

I would argue this is the wrong way (inverse way) of arranging things. You do not want to have a Dict{DOFIndex, Function}, you want something similar to a list of (Function, Vector{DOFIndex).

In my code, each boundary condition has an associated function and a list of dofs that it is applied to. The loop then (as can be seen in https://github.com/KristofferC/JuAFEM.jl/blob/master/src/Dofs/DirichletBoundaryConditions.jl#L133) is over all boundary conditions instead. For each boundary condition I go through a function barrier so that dynamic dispatch only happens once per boundary condition. You typically don’t have that many boundary conditions so this work very well in my opinion.

In my opinion, storing the functions in tuples and doing some recursive shenanigans, or using FunctionWrappers is totally not the right approach here since it is so easy to contain the runtime dispatch to be of the order of the number of boundary conditions insteaf of order of dofs.

ChrisRackauckas · May 15, 2017, 1:14pm

kristoffer.carlsson:

In my code, each boundary condition has an associated function and a list of dofs that it is applied to. The loop then (as can be seen in https://github.com/KristofferC/JuAFEM.jl/blob/master/src/Dofs/DirichletBoundaryConditions.jl#L133) is over all boundary conditions instead. For each boundary condition I go through a function barrier so that dynamic dispatch only happens once per boundary condition. You typically don’t have that many boundary conditions so this work very well in my opinion.

In my opinion, storing the functions in tuples and doing some recursive shenanigans, or using FunctionWrappers is totally not the right approach here since it is so easy to contain the runtime dispatch to be of the order of the number of boundary conditions insteaf of order of dofs.

Well yeah, of course that’s an option and I do the same thing as you in my old FEM codes. But I’m not sure that actually covers his use case. That’s done because for simple FEM models, it’s easy to just write a single function. But it sounds like he’s trying to build a large FEM simulation from a file or have some iterative model-building API similar to what @JaredCrean2 proposed (which in turn is similar to JuMP). Your solution would require that the API for BCs would be “give me one function that does all of the Dirichlet BCs”. But it sounds like he wants a much more user friendly approach: let them choose points and register BCs to those points using smaller functions.

Of course one solution for structuring this better internally would be to build a single Dirichlet BC funciton on all points with a Dirchlet boundary condition. I.e, take in a bunch of user functions, and use a generated function or recursive on tuples algorithm to type-stability calculate the BC at all Dirchlet points. But that’s just internal organization and that shouldn’t end up being a big API constraint. Using the tools I mentioned, you can still have the user API build a Dict{DOFIndex,Function} (or just a Vector of a type for this relation), then have one step that changes that to a tuple of functions and builds a fast overarching BC function. Changing the API to say “I only will take in one big function that covers all of your Dirichlet BCs” because you want to skip writing one generated function / recursive algorithm gets rid of a ton of API-friendliness for almost no complexity change to the internal algorithm.

(Unless you’re actually proposing that the user writes a bunch of functions in an input file, and that is read using a string macro when reading the input to generate a single function for them? That’s also an option which gets to the same place in the end, but I think staying away from big macros is usually a good idea.)

kristoffer.carlsson · May 15, 2017, 1:27pm

I don’t think you understood what I said… You do not need generated functions or tuple recursiveness (which performance is very hard to reason about). You only need to make sure that dynamic dispatch doesn’t leak into inner loops.

I never said a single function. In fact I explicitly wrote " list of (Function, Vector{DOFIndex)". And as I showed in the code I linked, I am looping over a vector of boundary conditions, each having a separate function for a set of dofs. Then you have a function barrier so that the Function is known for the loop over the corresponding dofs.

Which is exactly how I do in my code. See for example: https://github.com/KristofferC/JuAFEM.jl/blob/master/examples/hyperelasticity.ipynb

dbc = DirichletBoundaryConditions(dh)
# Add a homogenoush boundary condition on the "clamped" edge
add!(dbc, :u, getnodeset(grid, "clamped"), (x,t) -> [0.0, 0.0, 0.0], collect(1:dim))
add!(dbc, :u, getnodeset(grid, "rotation"), (x,t) -> rotation(x, t), collect(1:dim))
close!(dbc)
t = 0.5
update!(dbc, t)

No, no generated or recursive stuff. Just do as I wrote in the start of this post. Recursive tuple unpacking is extremely finicky and limited by hard coded constants in inference .

JaredCrean2 · May 15, 2017, 1:32pm

That’s exactly the approach I use as well. Getting the dynamic dispatch to be order of number of boundary conditions is exactly the kind of mitigation the OP was asking for.

No. This solution is quite generic because it allows setting arbitrary functions for arbitrary sets of DOFs.

ChrisRackauckas · May 15, 2017, 1:51pm

I see, and if the sets are sufficiently large the amount of dynamic dispatch is just small enough that you don’t care. Yeah, that would work in all of the cases I’ve had to deal with. In that case, yes, a generated function to try to bring this lower would be over-engineering in the vast majority of cases.

kristoffer.carlsson · May 15, 2017, 2:02pm

I don’t see why you expect generated functions to give you any performance benefit here. They are just normal functions subject to the same dynamic dispatch penalties.

PetrKryslUCSD · May 15, 2017, 2:14pm

Would you please care to elaborate a bit? Are you referring to return type annotations? Or something else?
And I don’t understand the bit about solving the opposite problem…

Many thanks.

Petr

ChrisRackauckas · May 15, 2017, 2:21pm

Once you know your list of functions, you can put them in a tightly typed collected (like a tuple) and then build a function which evaluates all of the BCs without dynamic dispatching. Essentially, you can make it do fs[1](t,x) then fs[2](t,x) since indexing tuples with literals (but not variables) is inferrable. So it would be:

Transform whatever list into a tuple
Generated function to build a BC function on that tuple where all indexing is with literals. That now builds one BC function that avoids dynamic dispatch.
Pass this stuff into an inner function. You’ll incur a dynamic dispatch there, but once is fine.
Internally call the single generated BC function using the strictly-typed tuple of functions.

I agree this is completely unnecessary if the number of points in each boundary condition is sufficiently large, but it would be a way to handle the edge case of wanting to internally use a large list of functions in an inferrable way.

PetrKryslUCSD · May 15, 2017, 2:22pm

@JaredCrean2, @kristoffer.carlsson:

I have followed this discussion with great interest. But the use cases you listed do not cover all possibilities of when the library code might wish to call back into the user code. Example: Laplace(u) = f. In order to solve this with the finite element method, the function f() would need to be evaluated for each quadrature point. 2 million triangles means 2 million evaluations of the function. Can this be handled not only by directly passing the function into the code, but possibly even storing it in some type as f::Function and then calling it when the time comes?

Petr

kristoffer.carlsson · May 15, 2017, 2:26pm

For Neumann boundary conditions, just pass the function in as a normal argument into the assembly routines and the routine will be specialized for that function and run fast.

You can store it in a type too but then you should store it like:

immutable Neumann{F}
    f::F
end

and not

immutable Neumann
    f::Function
end

ChrisRackauckas · May 15, 2017, 2:27pm

In v0.5 and onwards, every function is its own type and Function is an abstract type. That’s nice because now they are fast and can inline anywhere. Anonymous functions are essentially just a generic function which creates some type whose name is unique you don’t usually care about it. Example:

f = (x) -> x+2
typeof(f)
##11#12

That ##11#12 is the type of f, and you’ll see that every anonymous function you make has a unique type (every function too, but more aptly named). Since Julia functions will automatically specialize on concrete types, this gives you type-stability and speed. Yay.

However, take a function like:

function g(f)
  # do some stuff
  x = f(...)
  # do something else
end

where the actual function is called once in a very long function call. This g would specialize on every function f you give it. This means that it will have to recompile every time. Or if looping over functions, it would have to dispatch each time if the function types cannot be inferred. If the actual function call part is small, then this recompilation is really unnecessary: it doesn’t really make anything faster anymore, but it will cause lots of recompilation. So FunctionWrappers fixes this opposite problem where now Julia may be over-specializing on functions by default, which can add dispatch or more compilation.

This fact about Function being an abstract type is exactly the difference between the two Neumann types @kristoffer.carlsson shows. What you use depends on what you’re trying to do: specialize, or avoid recompilation.

kristoffer.carlsson · May 15, 2017, 2:27pm

In Finite Element code, the function passed is typically small and is evaluated millions of times so you definitely want to specialize.

Topic		Replies	Views
Function in type definition General Usage	24	7102	May 19, 2017
Avoiding Dynamic dispatch General Usage	11	4387	October 13, 2017
Suggestions to wrap the inputs for a Diff Equation solver (types and structures) General Usage question , diffeq	14	511	January 6, 2021
Strict typing and restricting types in a method's signature General Usage	25	6311	December 3, 2016
How to add (anonymous) functions? General Usage question	18	1039	November 27, 2018

Functions in an input file-- how?

Related topics