Can one make a constant with a parametric type?

Is it possible to make a type-parameterized constant? An example of when this is useful is with immutable containers. If you have an ImmutableVector{T} class, ideally ImmutableVector{T}() === ImmutableVector{T}()—that is, there is exactly one instance of the empty immutable-vector of type T. This works if your immutable-vector type is a simple struct, but probably the ImmutableVector{T} contains a private-by-convention field that stores the actual array contents as an array:

struct ImmutableVector{T} <: AbstractArray{T,1}
   n::Int
   cells::Vector{T}
end
# then don't define setindex!

In cases like these, two different instances of the empty array won’t be identical because their mutable fields won’t be identical. It would be nice to be able to define:

const empty{T} = ImmutableVector{T}(0, T[]) where {T}
ImmutableVector{T}() = empty{T}

but this doesn’t work. Is there a clever way to do this?

I am not sure I fully understand the question, but when the vector is empty, wouldn’t the fields be identical (ie 0 and Vector{T}())?

Also, why store the n, isn’t that just length(cells)?

This is just a dummy implementation to demo the problem; in the real immutable vector class, the structure is a nested tree, so the n is necessary.

And Vector{Int}() === Vector{Int}() yields false. Two empty vectors are equal but not identical, so, no, when two vectors are empty, they are not identical. Without the constant empty{T}, every empty vector needs to be newly allocated and represents a unique object. However, there’s never any reason with immutable vectors for there to be more than 1 instance of the empty vector of a given type.

The reason to do this is because empty vectors are common and used all over the place to signify an empty (immutable) slot of some kind; ideally, having all of these empty vectors be a single object would be a performance improvement, and allow people to test for emptyness using identity instead of equality. I’m aware the performance difference may be small depending on how these are used, but I’m still interested in the answer.

I would suggest an API defining/using isempty.

That said, a very convoluted solution could store one of each ImmutableVector{T}(0, T[]) in a table (eg a Dict by T), generated on demand, and have an outer constructor deliver one of these. Something like (not tested):

const EMPTY_VECTORS = Dict{Any,Any}()
function make_immutable_vector(n, cells::Vector{T}) where T
    if n == 0
        get!(EMPTY_VECTORS, T) do
            ImmutableVector(0, Vector{T}())
        end
    end
    ImmutableVector(n, cells)
end

but it is kind of ugly and I doubt that the performance improvements (if any) justify it. YMMV.

Yeah, I considered this solution but it’s not performant for many obvious reasons. Many compilers with similar type-systems to Julia support these kinds of constants, so I thought Julia might also. Are there any plans to add this as a feature?

Actually, using a Dict with a type-assertion on the result works fine for this case, and the performance isn’t bad:

julia> struct ImmutableVector{T}
          n::Int
          cells::Vector{T}
       end

julia> const EMPTY_VECTORS = Dict{DataType, Any}()
Dict{DataType,Any} with 0 entries

julia> function empty_vector(::Type{T}) where {T}
         get!(EMPTY_VECTORS, T) do
           ImmutableVector{T}(0, Vector{T}())
         end::ImmutableVector{T}
       end
empty_vector (generic function with 1 method)

julia> empty_vector(Float64)
ImmutableVector{Float64}(0, Float64[])

julia> @btime empty_vector(Float64)
  25.144 ns (0 allocations: 0 bytes)
ImmutableVector{Float64}(0, Float64[])
2 Likes

Did you actually benchmark it?

I am not quite sure what “this” refers to here. What you are calling a “type-parameterized constant” is actually a function mapping from types to values that is guaranteed to be ===. You can already implement this with a pool.

Can you provide an example for a language with semantics close to Julia?

You can also abuse @generated functions to do this:

julia> @generated function empty_vector_gen(::Type{T}) where {T}
         v = ImmutableVector{T}(0, T[])
         quote
           $v
         end
       end
empty_vector_gen (generic function with 1 method)

julia> empty_vector_gen(Float64)
ImmutableVector{Float64}(0, Float64[])

julia> v1 = empty_vector_gen(Float64) 
ImmutableVector{Float64}(0, Float64[])

julia> v2 = empty_vector_gen(Float64)
ImmutableVector{Float64}(0, Float64[])

julia> v1 === v2
true

julia> @btime empty_vector_gen(Float64)
  1.210 ns (0 allocations: 0 bytes)
ImmutableVector{Float64}(0, Float64[])

However there are some downsides:

  • Changes to the ImmutableVector constructor called within the empty_vector_gen may not properly affect the results of empty_vector_gen, depending on when the generator is actually run (which isn’t guaranteed by anything).
  • The v1 === v2 property is also not guaranteed because the compiler can choose to re-run the generator whenever it wants.

You might be able to work around the second point by combining both approaches: using a global dict but doing its lookup inside the @generated function.

I’d suggest starting with the Dict–it’s simple, performs pretty well, and is easy to read.

1 Like

Can you provide an example for a language with semantics close to Julia?

How about C++?

cout << "Type int sign? " << numeric_limits<int>.is_signed << endl;
cout << "Type unsigned sign? " << numeric_limits<unsigned>.is_signed << endl;

template <T> const T* getbaseinst() {
   static const T* base = NULL;
   if (base == NULL) base = new T();
   return base;
}

Or Scala?

val emptyMap = Map[String,Int].empty

Or Java, in which this can be cleverly implemented with a few lines that are easy, thread-safe, robust (i.e., a user can’t screw it up via misuse), and performant:

public final class InitVal<T> {
   private static Map cache = Map();
   synchronized static Object get() {
      if (!cache.containsKey(T)) cache.put(T, MyType<T>())
      return cache.get(T);
   }
}
...
initInt = InitVal<Integer>.get();

This Java solution is basically what @rdeits has proposed (more below) but with privacy and thread-safety.

These are not identical type-systems to Julia by a long shot, but in the first two cases, the compiler tracks the separation of types for you, and in Java the compiler makes it easy to implement this in a performant way. (Admittedly I’m not sure offhand if Scala implements its empty as a lazy-val/function or not, but the syntax is clearly core to the language).

Did you actually benchmark it?

Yes: a solution similar to the Java version above and to that provided by @rdeits (the first solution, not the generator solution) and it was an order of magnitude slower than what he’s getting. That said, rdeits is using syntax I haven’t seen before (like end::Tag and the get! function), so I’ll try again. (My solution was designed to be thread-safe also, though, so we’ll see how much of the difference is that syntax versus thread-safety.) I certainly don’t mind keeping track of the values in a dictionary myself, but since this is a common underlying operation it would need to be fast.

You can already implement this with a pool.

(I assume that by pool you mean a dictionary that tracks the values as in rdeits’s solution.) This may be a solution, but I wouldn’t call it a straightforward working solution. For one, it is only a working solution if you are operating in a single-threaded environment or protect your dictionaries via mutexes or similar.

In fact, it worries me a little bit that other libraries could be using this solution given that it may not be thread-safe. As multi-threading starts to creep into Julia more, these kinds of things will make some very hard-to-track-down bugs.

All that said, I’ll try again with a thread-safe version of @rdeits’s proposals.

2 Likes

So the generator is an interesting solution, but I’m going to leave it aside for now. I went ahead and reimplemented the dictionary-lookup method using your basic suggestions:

julia> sLock = ReentrantLock()
julia> sDict = Dict{DataType, Vector}()
julia> function emptytest(::Type{T}) where {T}
          lock(sLock)
          sTmp = get!(sDict, T) do
              Vector{T}()
          end::Vector{T}
          unlock(sLock)
          return sTmp
      end

julia> @btime emptytest(String)
  86.560 ns (0 allocations: 0 bytes)
0-element Array{String,1}

So I’m still not getting the performance you seem to—does anything look off in my code? (For reference, @btime Vector{Symbol}() takes 15-16 ns on my system.) If I comment out the lock and unlock functions, the above timing drops to about 45 ns. Regardless, this is about 8 times faster than the implementation I referenced in earlier posts. Somewhat to my surprise, the difference that mattered most seems to be using f(::Type{T}) where {T} in the function signature instead of just f(T::Type).

Additionally, even if the performance were closer to a microsecond, I thought it would still be nice to have this ability for general use, so I wrote a macro:

# This helper function makes sure than a function-arg declaration has a name.
_memconst_fixarg(arg::Expr) = (arg.head == :(::) && length(arg.args) == 1
                               ? Expr(:(::), gensym(), arg.args[1])
                               : arg)
"""
    @memconst name(args...) = expr
    @memconst name(args...) where {...} = expr

@memconst is a macro for declaring parameterized constants whose values are
a function of some parameters but which can be memoized after their first
calculation and simply returned after that. To access the value, simply
treat name as a function.
"""
macro memconst(assgn::Expr)
    (assgn.head == :(=)) || throw(
        ArgumentError("memconst must be given an assignment expression"))
    # Parse the assignment statement.
    lhs  = assgn.args[1]
    expr = assgn.args[2]
    if lhs.head == :call
        fsym = lhs.args[1]
        args = [_memconst_fixarg(a) for a in lhs.args[2:end]]
        lhs = Expr(:call, fsym, args...)
    elseif lhs.head == :where
        fsig = lhs.args[1]
        fsym = fsig.args[1]
        args = [_memconst_fixarg(a) for a in fsig.args[2:end]]
        fsig = Expr(:call, fsym, args...)
        lhs = Expr(:where, fsig, lhs.args[2:end]...)
    else
        throw(ArgumentError("memconst assignment LHS must be a call or where expression"))
    end
    # Make an expression for the tuple of arguments.
    argtup = Expr(:tuple, args...)
    # Symbols we will need in the generated code.
    sDict = gensym()
    sLock = gensym()
    sTmp  = gensym()
    quote
        $sLock = ReentrantLock()
        $sDict = Dict{Tuple, Any}()
        $lhs = begin
            lock($sLock)
            $sTmp = get!($sDict, $argtup) do; $expr end
            unlock($sLock)
            return $sTmp
        end
    end |> esc
end

This implementation runs pretty fast:

julia> @memconst emptyVecTest(::Type{T}) where {T} = Vector{T}()
emptyVecTest (generic function with 1 method)

julia> @btime emptyVecTest(String)
  60.052 ns (1 allocation: 16 bytes)
0-element Array{Symbol,1}

Also, this macro can be used with multi-argument parameterized constants.

julia> @memconst emptyDictTest(::Type{K}, ::Type{V}) where {K,V} = Dict{K,V}()
emptyDictTest (generic function with 1 method)

julia> @btime emptyDictTest(Symbol, Int)
  68.722 ns (1 allocation: 32 bytes)
Dict{Symbol,Int64} with 0 entries

I think this is basically a win, but I’m curious if anyone sees opportunity to shave a few more nanoseconds off. :slight_smile:

julia> struct ImmutableVector{T} <: AbstractArray{T,1}
          n::Int
          cells::Vector{T}
       end

julia> @generated ImmutableVector{T}() where {T} = ImmutableVector{T}(0, T[])

julia> ImmutableVector{Int}() === ImmutableVector{Int}()
true

julia> empty(T) = ImmutableVector{T}()
empty (generic function with 1 method)

julia> empty(Int) === empty(Int)
true

julia> @btime empty(Int);
  1.302 ns (0 allocations: 0 bytes)

I should probably have read the thread more carefully before replying. My code above is essentially what @rdeits wrote, and the same caveats apply.

I am still not sure if you are concerned about the performance of creating or testing “empty” vectors. But for either, I would recommend trying small unions, which can yield a very idiomatic solution. Something like

struct ImmutableVector{T} <: AbstractArray{T,1}
    cells::Union{Nothing,Vector{T}}
    function ImmutableVector(cells::Vector{T}) where {T}
        new{T}(isempty(cells) ? nothing : cells)
    end
end
Base.isempty(v::ImmutableVector) = v.cells ≡ nothing
Base.size(v::ImmutableVector) = isempty(v) ? (0, ) : size(v.cells)
Base.getindex(v::ImmutableVector, i) = v.cells[i]

(I dropped n as it is redundant here). Then (on my aging laptop, Julia 1.4)

julia> using BenchmarkTools

julia> v = Vector{Int}()
0-element Array{Int64,1}

julia> V = @btime ImmutableVector($v)
  5.911 ns (1 allocation: 16 bytes)
0-element ImmutableVector{Int64}

julia> @btime isempty($V)
  0.027 ns (0 allocations: 0 bytes)
true
2 Likes