Functions accepting functions and keeping their types

I’m new to Julia and looking for a good way / advice on passing functions as arguments and keeping the types. My searches did not yield anything that seemed applicable, or at least I understood to be applicable.

As a sample project below I use parser combinators. Here we have two functions match and repeat that yield a parser than transforms a string into a (remainder :: String, result :: SomeType).

The function match returns Char, so I would like repeat( match('z') ) to return Vector{Char}. How can I accomplish this? Keep in mind, I want to reuse repeat to other parsers that might yield a totally different type.

Some options I thought of:

  • Add a parameter to repeat to denote the input and output result type β†’ ugly
  • Annotate the signature of the function β†’ can’t do this in Julia I think

Any insight would be appreciated :slight_smile:

function match(c :: Char)
    function charParser(s :: String) :: Tuple{String, Char}
        if s[1] == c
            return (s[2:end], c)
        else
            throw(ErrorException("Expected a specific charater")) 
        end
    end
    return charParser 
end



function repeat(parser)
    function repeatParser(s :: String) :: Tuple{String, Vector{Any}}
        resultList = Vector{}()
        
        try
            
            while true
                (s, result) = parser(s)
                push!(resultList, result)
            end
        catch e

        end
        return (s, resultList)
    end
    return repeatParser
end

input = "zzz"

parser = repeat( match('z') )

println(
    parser(input)
)

Output:

("", Any['z', 'z', 'z'])

Not sure how advisable this is, but you can maybe use Core.Compiler.return_type:

julia> Core.Compiler.return_type(match('z'), (String,))
Tuple{String, Char}

I thought what you want to do is similar to what Julia does for array comprehensions like [Char(i) for i in 97:100]: it makes a vector of the correct type for the generator function. Looking at the implementation I found this return_type function.

1 Like

Interesting!

So then I would do something like this to extract the type of the parser and then the return type is known:

_, return_type = Core.Compiler.return_type(parser, (String,)).types

If you’re willing to make match a callable object rather than a plain function, you can associate some traits to it, like its return type.

Applied to your example, this technique would look like:

# A parser matching a single character
struct Match
    c :: Char
end

# Make objects from this type callable
# -> actually match the character in the given string
function (m::Match)(s::String)
    if s[1] == m.c
        return (s[2:end], m.c)
    else
        throw(ErrorException("Expected a specific charater"))
    end
end

# Match objects produce a Char output
ret_type(::Match) = Char


function repeat(parser)
    function repeatParser(s :: String)
        T = ret_type(parser)  # adequate type for the given parser
        resultList = T[]
        try
            while true
                (s, result) = parser(s)
                push!(resultList, result)
            end
        catch e

        end
        return (s, resultList)
    end
    return repeatParser
end

Output from simple tests:

julia> parser = repeat( Match('z') )
(::var"#repeatParser#1"{Match}) (generic function with 1 method)

julia> parser("zzz")
("", ['z', 'z', 'z'])

julia> parser("zaz")
("az", ['z'])

julia> parser("")
("", Char[])
6 Likes

You could also just take the first element from the parser and use that to construct your vector. That way you do not have to rely on the internals of Core.Compiler.return_type which is not supposed to be used, it can break with any new release of julia

function match(c :: Char)
    function charParser(s :: String) 
        if s[1] == c
            return (s[2:end], c)
        else
            throw(ErrorException("Expected a specific charater")) 
        end
    end
    return charParser 
end



function repeat(parser)
    function repeatParser(s :: String) 
        (s, result) = parser(s)
        resultList = [result]
        try
            
            while true
                (s, result) = parser(s)
                push!(resultList, result)
            end
        catch e

        end
        return (s, resultList)
    end
    return repeatParser
end

note, this version needs to handle the empty case better.

You also create a variable

Vector{}()

which will always have unknown element type, and your force the Tuple{String, Vector{Any}} return type for no reason. In general, you shouldn’t have to annotate the return types, inference should handle that for you if the code is reasonably written.

2 Likes

@ffevotte Callable objects/functors I saw before, but didn’t grasp them. Thanks for the explanation, I roughly get the idea. Thanks!

With this I kindof get the code I had in mind if I could annotate return types:

struct Parser{O}
    f :: Function
    outputType :: Type{O}
end

function(p::Parser{O})(s::String) :: Tuple{String, O} where {O}
    return p.f(s)
end

Parser{O}(f) where {O} = Parser(f, O)

matchZ = Parser{Char}(
    s -> begin 
        if s[1] == 'z'
            return (s[2:end], s[1])
        else
            throw(ErrorException("Expected a specific charater"))
        end
    end)

function rep(p :: Parser{O}) where {O}
    return Parser{Vector{O}}(
        s -> begin
            resultList = Vector{O}()  
            try
                
                while true
                    (s, result) = parser(s)
                    push!(resultList, result)
                end
            catch e

            end
            return (s, resultList)
        end
    )
end

println(
    matchZ
)    
println(
    matchZ("zz")
)    
println(
    rep(matchZ)("zz")
)    

@baggepinnen Good point, thanks :slight_smile: Definitely a good idea to construct with the first element then.

I like annotating because it helps me see what flows in/out of functions. But I understand I’m not doing myself a favor by forcing it to be Any.

Having a more generic Parser type is a good idea, but brings a few caveats. In particular, you should make sure all its fields are concretely typed (and be aware that Function is an abstract type). Also, if your type is parameterized by O, there is no need to store O again as a field.

All in all, your previous solution could evolve to something like:

struct Parser{F, O}   # O -> return type
    f :: F            # F -> function type (better if it is concrete)
end

# A more convenient constructor
Parser{O}(f) where {O} = Parser{typeof(f), O}(f)

function(p::Parser)(args...)
    return p.f(args...)
end

function match(c::Char)
    function f(s)
        if s[1] == c
            return (s[2:end], s[1])
        else
            throw(ErrorException("Expected a specific charater"))
        end
    end
    return Parser{Char}(f)
end

function repeat(parser :: Parser{F, O}) where {F, O}
    Parser{Vector{O}}() do s    # "fancier" with a do block
        resultList = O[]
        try
            while true
                (s, result) = parser(s)
                push!(resultList, result)
            end
        catch e
        end
        return (s, resultList)
    end
end

which works like before:

julia> parser = repeat(match('z'))
Parser{var"#2#3"{Char, Parser{var"#f#1"{Char}, Char}}, Vector{Char}}(var"#2#3"{Char, Parser{var"#f#1"{Char}, Char}}(Parser{var"#f#1"{Char}, Char}(var"#f#1"{Char}('z'))))

julia> parser("zzz")
("", ['z', 'z', 'z'])

julia> parser("zaz")
("az", ['z'])

julia> parser("")
("", Char[])
1 Like

Nice, thanks!

I notice it takes a bit to change my perspective wrt the typing in Julia. Glad I picked a non trivial example and to get responses like these on the forum.

The approach outlined by @ffevotte is interesting, but I would recommend an approach similar to what @baggepinnen recommended. In my opinion, trying to restrict the return types of functions is fighting the language. Functions in Julia are generic, which means that any function foo can have many different methods, each of which has a different type signature. Even if all the methods of foo conform to a parametric signature, like [T] -> T (a function that accepts a vector of T and returns a T), anyone can add a new method to foo that violates that parametric signature, like Int -> String. (Even if a function is nominally private, it’s still possible for users to extend it.)

In general, I would recommend a generic programming approach that uses a combination of duck typing and parametric methods. You generally want to make the signatures of your functions as minimally restrictive as possible.

Here’s how I would approach this problem:

function match(c::Char)
    function parser(s)
        first, rest = Iterators.peel(s)
        (first == c) ? (c, rest) : nothing
    end
end

function match(n::Int)
    function parser(s)
        first, rest = Iterators.peel(s)
        # tryparse doesn't work on characters
        x = tryparse(Int, string(first))
        if isnothing(x)
            nothing
        else
            (x == n) ? (n, rest) : nothing
        end
    end
end

function repeat(parser; init=[])
    function repeat_parser(s)
        parsed = parser(s)
        if isnothing(parsed)
            return init, s
        end

        T = typeof(parsed[1])
        results = T[]

        local rest
        while !isnothing(parsed)
            first, rest = parsed
            push!(results, first)
            parsed = parser(rest)
        end

        results, join(rest)
    end
end

Here it is in action:

julia> parse_z = repeat(match('z'));

julia> parse_z("zzzaaa")
(['z', 'z', 'z'], "aaa")

julia> parse_1 = repeat(match(1));

julia> parse_1("111aaa")
([1, 1, 1], "aaa")

Note that I’ve added an optional init argument to repeat that defaults to an empty array of Any. You can use this to ensure that you get a typed empty array when the parser doesn’t find any matches. Here is the default behavior, which returns an array of Any:

julia> parse_1("aaa")
(Any[], "aaa")

And here is the behavior when you specify init:

julia> parse_1_Int_init = repeat(match(1); init=Int[]);

julia> parse_1_Int_init("aaa")
(Int64[], "aaa")

Notes:

  • I flipped the order of your tuple, since (results, rest) is more natural when you are using Iterators.peel.
  • For performance, x === nothing is recommended over isnothing(x). However, I think that was fixed recently, but I’m not sure which version of Julia that fix was in or will be in.
4 Likes

Thanks @CameronBieganek

I’ll probably write a complete example to parse, say JSON format, to figure out how it would look like on a larger piece of code.

As a follow up question: How would you indicate how functions/methods should be used, if not with type annotation? In other languages it is nice if your editor warns that you pass a function as parameter that is not expected to work. Do we let the Julia compiler figure this out?

The best way to indicate how a function is to be used is to write a docstring for the function. Of course docstrings are for humans, not for the compiler…

Julia is a dynamic language, so type errors will only crop up at runtime. So the best you can do is write good unit tests. (The VS Code linter will catch some errors, like undefined variable errors, but it won’t catch type errors.) That being said, we do now have JET.jl for doing static analysis. Here is an example:

g(x::Int) = 2x
h(x::Int) = string(x)
foo(f, x) = f(x) + 1
julia> using JET

julia> @report_call foo(g, 1)
No errors !
(Int64, 0)

julia> @report_call foo(h, 1)
═════ 1 possible error found ═════
β”Œ @ /Users/bieganek/projects/misc_julia/discourse/jet.jl:6 Main.+(f(x), 1)
β”‚ no matching method found for call signature (Tuple{typeof(+), String, Int64}): Main.+(f::typeof(h)(x::Int64)::String, 1)
└──────────────────────────────────────────────────────────

I think there is work being done to integrate JET.jl into the Julia VS Code extension, but I’m not sure what the status of that is. Note that if you are using JET.jl, you can still write your Julia code as you normally would. In other words, you don’t need to annotate function return types in order to use JET.jl.

2 Likes

Thanks for all the input. It helped me tremendously in understanding Julia better!