I want to formalize a Julep for a design idea, and I am asking for the community's help. Please read and criticise freely.
The problems it is addressing:
- Type piracy: it is too easy to break code in module M, even without directly evaluating code into it, by adding methods to functions in Base or in any other module that M depends upon.
Example:
import Base: +
+(a::Float64, b::Float64) = 0.0  # pirated method: neither + nor Float64 belongs to us
sum(rand(5))
# returns 0.0
- Conflicting names: if both module M and module N export the same function name f, along with other useful stuff, then any use of f must be qualified, even though there is a good chance the methods could be distinguished by their argument types (different modules usually handle different types). Julia issues a warning in this case, even if there is no usage of f in the enclosing module.
- Binary compilation of modules: because of the way multiple dispatch currently works, it is impossible to compile module M to a useful encapsulated binary, since any subsequent module import or load may change one or more of its functions. This is especially true for Base, as the current state of affairs encourages changes to it.
As a consequence, Julia is only eventually fast: in practice, most of the time spent developing goes into recompiling old code. Revise solves this issue within a running session. I believe it can be solved globally, so that Julia will be slow the first time, but just the first time … ever.
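The name-conflict problem from the list above can be reproduced with a small sketch. The module names PkgA and PkgB are hypothetical and exist only for illustration; the methods handle disjoint argument types, yet the unqualified name cannot be used.

```julia
module PkgA
export f
f(x::Int) = "PkgA.f(::Int)"
end

module PkgB
export f
f(x::String) = "PkgB.f(::String)"
end

using .PkgA, .PkgB

# Qualified calls always work, because each module owns a separate function f.
println(PkgA.f(1))
println(PkgB.f("a"))

# The unqualified name is ambiguous between the two exports, so using it fails
# even though the argument types alone would be enough to pick a method.
try
    f(1)
catch err
    println("unqualified f failed with ", typeof(err))
end
```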
Background
Recall that there are two types of method calls: invoke, which invokes a specific method in a specific module for specific argument types, and call, which dispatches to the most appropriate method based on the argument types.
Most of the time the compiler is able to infer the argument types at compile time, and the choice has zero cost, since the call is replaced by an invoke statement.
If it cannot, the function call is more expensive, but on par with function calls in Python and R.
To speed things up, once the appropriate method is compiled, it is cached and a pointer to it is saved in the method table for faster access.
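The difference between the two kinds of call can be illustrated at user level. This is only a sketch: it uses the Base function invoke as a stand-in for the compiler's internal invoke statement, and the function describe is a hypothetical name introduced for the example.

```julia
describe(x::Number) = "some number"
describe(x::Int)    = "an Int"

# call: dispatch selects the most specific applicable method for the argument.
by_call = describe(1)

# invoke: dispatch is bypassed and the method matching the given
# signature is run, even though a more specific one exists.
by_invoke = invoke(describe, Tuple{Number}, 1)

println(by_call)
println(by_invoke)
```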
This julep introduces a direction that can keep 99% of Base as it is, while solving the 3 issues mentioned above.
Design idea
I propose to handle multiple dispatch (and only multiple dispatch) differently.
A. Remove the notion of “extending” functions in Base or in any other module.
B. Given a module M that defines a function f, using a module N that exports a function f will fuse both sets of function signatures into f in M:
  - calling M.f will dispatch either to f in M or to f in N, according to standard dispatch rules (most concrete signature)
  - calling N.f will invoke N.f
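Rule B can be approximated in today's Julia with a forwarding fallback. This is a rough emulation under stated assumptions, not the proposed mechanism: the catch-all method in M is something I added by hand, and it only matches the proposal when M and N handle disjoint argument types (a true fusion would also prefer a more concrete method in N over a less concrete one in M).

```julia
module N
export f
f(x::String) = "N.f(::String)"
end

module M
using ..N                      # N exports f, but M defines its own f below
f(x::Int) = "M.f(::Int)"
# Hand-written stand-in for "fusing": anything M does not handle goes to N.f.
f(args...) = N.f(args...)
end

println(M.f(1))      # handled by the method defined in M
println(M.f("a"))    # forwarded to the method defined in N
println(N.f("a"))    # N.f only ever sees N's own methods
```

Calling N.f(1) still fails with a MethodError, which matches the intent that N is not affected by what M defines.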
Having just A and B would break the concept of interfaces, at least the more complex ones, where part of the implementation of the interface is split into several sub-functions.
Therefore:
Resolving dispatch can be defined recursively upwards in the AST (toward the caller) as follows:
- If func is not an exported function, it is resolved in its own module (according to B).
- If it is an exported function, it is resolved in its own module and also in the scope of its calling function, provided the caller has it imported (via the keyword using, or trivially if both are in the same module).
- The most concrete implementation is chosen. If more than one implementation has the same concreteness, the “closest” module in terms of AST distance is chosen.
- If two are equally close, then and only then a warning is issued that use of the function must be qualified.
Example: given the following AST (capital letters denote the module containing the function; the prefix _ denotes an exported function):
(M) f → (B) _sum → imp → _start
The function start is resolved first in B, then in imp (trivially), then in sum (trivially again, since it is in the same module), and then in M, since sum is an exported function. The most concrete method is used.
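The selection step (most concrete first, then closest, then warn on a tie) can be sketched as a small simulation. Everything here is hypothetical: the Candidate struct, the resolve function, and the use of plain subtyping as the measure of "concreteness" are my assumptions for illustration, and subtyping is a simplification of Julia's real specificity rules.

```julia
# Hypothetical record of a method found while walking up the call chain.
struct Candidate
    mod::Symbol      # module that owns the method
    sig::Type        # method signature as a Tuple type
    distance::Int    # steps from the call site toward the caller (0 = own module)
end

# `a` is strictly more specific than `b` if a.sig <: b.sig but not vice versa.
more_specific(a::Candidate, b::Candidate) = a.sig <: b.sig && !(b.sig <: a.sig)

function resolve(cands::Vector{Candidate}, argtypes::Type)
    applicable = filter(c -> argtypes <: c.sig, cands)
    isempty(applicable) && error("no applicable method")
    # Keep the candidates that no other applicable candidate beats on specificity.
    best = filter(c -> !any(o -> more_specific(o, c), applicable), applicable)
    # Among equally concrete candidates, prefer the closest module.
    dmin = minimum(c.distance for c in best)
    closest = filter(c -> c.distance == dmin, best)
    length(closest) > 1 && @warn "ambiguous: use of the function must be qualified"
    return first(closest)
end

# A more concrete method in the distant module M beats a generic one in B.
r1 = resolve([Candidate(:B, Tuple{Number}, 0), Candidate(:M, Tuple{Int}, 3)], Tuple{Int})

# With equal concreteness, the closer module B wins.
r2 = resolve([Candidate(:B, Tuple{Int}, 0), Candidate(:M, Tuple{Int}, 3)], Tuple{Int})

println(r1.mod, " ", r2.mod)
```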
Criticism
All criticism is welcome; this thread is intended for discussion and the exchange of opinions and ideas.
The best way to contribute would be to give an example that works in the current design and breaks in the proposed design, and possibly what it would take to make it work in the proposed design. I will give an example in the first comment of this thread.
Binary compilation
An exported function will be cached in the topmost module that resolved it.
The proposed design guarantees that the resolution of a method call in module M is fixed as long as M and its imports do not change. This in turn allows binary caching (.o files) and potentially reduces the task of building an executable or compiling a function to just linking (given that all methods were previously compiled).
Of course, dynamic code can invalidate a binary cache, so some Revise mechanism is still needed.
That's it for now, waiting for your input!