Defining a statically typed variable is too verbose

Let’s say I want to define a statically typed variable inside a function.
The data structure is fairly complex: a Vector of Pairs, each mapping a String to a Matrix.
Once it is defined, I want its type to stay fixed and never change.

function test1()
    complexobj = Vector{Pair{String, Matrix{Float64}}}()

    # do some work...

    complexobj = 1 # This is ok! complexobj will be "Any", Not good

    return complexobj
end

test1 is bad: if I accidentally rebind complexobj to a value of another type, the program will NOT alert me.


function test2_wont_compile()
    complexobj::Vector{Pair{String, Matrix{Float64}}} = Vector{Pair{String, Matrix{Float64}}}()

    # do some work...

    complexobj = 1 # This will fail

    return complexobj
end

test2 is exactly what I want! It follows the same logic as C++/C#/Java: once I declare the type of complexobj, the compiler gives me an error if I accidentally assign a value of another type to it.
That is already what I want.

However, this is extremely verbose, no? I have to write the (very long) type twice!

In C# or JavaScript, I guess I can replace the first occurrence of the type with “var”.

E.g. it would be great if Julia had something like this that behaved the same as test2:


function test3()
    var complexobj = Vector{Pair{String, Matrix{Float64}}}()

    # do some work...

    complexobj = 1 # This should also Fail

    return complexobj
end

Or, why can’t I define a const inside a function? This one fails:


function test4()
    const complexobj = Vector{Pair{String, Matrix{Float64}}}()

    # do some work...

    complexobj = 1 # This should also Fail

    return complexobj
end

Alternatively, I can create a “struct” to store this complicated data structure.
However, I have such complicated structures in many places, and the data structure is different each time.
Each time I get an ad hoc task, and each time it needs a different complicated data structure.
E.g. one day I may want a
Dict{String, Vector{Vector{}}}

another day I want to do something like:

Pair{String, Vector}

maybe another day, I want to have something like:

Vector{Pair{MyCustomStruct, Vector{MyCustomStruct}}}

etc etc.

My job is very ad hoc: each day I may have a new task with new, different requirements, so each time I need a new data structure (usually quite complicated).
So I need to stay very flexible (instead of creating a generic “struct”).

Is there a way to reduce the verbosity of

complexobj::Vector{Pair{String, Matrix{Float64}}} = Vector{Pair{String, Matrix{Float64}}}()

while still making sure that once the type of complexobj is defined, it will never change, as in test2?

Thank you

2 Likes

You are looking for statically typed language features in a dynamically typed language, that’s why you won’t find it.

That said, even though you won’t get the exact semantics of a static language, what you are asking for is exactly what macros are for. You just need to declare a variable of your choice based on the type of the RHS.

3 Likes

May I know how to write such a macro? It should be easy, but I tried many ways and none of them works…

I want to have something like:

@var complexobj Vector{Pair{String, Matrix}}

Then it would define a statically typed variable complexobj.

I tried something like this, but it does not work… I have never written a macro before:


macro var(varname, vartype)
   quote
      varname::vartype = vartype()
   end
end

Thank you
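
A likely reason the attempt above does not work: inside `quote`, `varname` and `vartype` are plain symbols, not the macro's arguments. They need `$` interpolation, and the result needs `esc` so that the variable is created in the caller's scope. Below is a minimal sketch under those assumptions. The name `@typedvar` is hypothetical (not from this thread), and the sketch assumes the type has a zero-argument constructor.

```julia
# Hypothetical helper: declare `varname` with a fixed type and initialise it
# by calling the type's zero-argument constructor.
macro typedvar(varname, vartype)
    esc(quote
        ($varname)::($vartype) = ($vartype)()
    end)
end

function demo_typedvar()
    @typedvar complexobj Vector{Pair{String, Matrix{Float64}}}
    push!(complexobj, "a" => zeros(2, 2))
    return complexobj
end

result = demo_typedvar()
```

The type is still written once at the call site, so this only removes the repetition, not the long type itself.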

1 Like

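A function for checking, maybe?

# Overwrite the contents of `x` in place, but only if `val` has exactly the same type.
# Note: `.=` mutates `x`, so this only works for mutable containers such as arrays.
@inline function static_assign!(x, val)
    @assert typeof(val) == typeof(x)
    x .= val
end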

I think you will be better off if you don’t try to program Julia like C++ & friends.

Specifically, while you can reduce the verbosity with macros, or just reusing T = Pair{String, Matrix{Float64}}, the right place for catching this kind of error in Julia is unit testing.
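
As a rough illustration of the unit-testing route, here is a sketch using the `Test` standard library. The function `build_pairs` is a hypothetical stand-in for the real code; `@inferred` fails the test if the inferred return type does not match the actual one.

```julia
using Test

const T = Pair{String, Matrix{Float64}}

# Hypothetical function under test.
function build_pairs()
    out = Vector{T}()
    push!(out, "a" => zeros(2, 2))
    return out
end

@testset "build_pairs keeps its type" begin
    result = @inferred build_pairs()   # errors if the return type is not inferable
    @test result isa Vector{T}
    @test length(result) == 1
end
```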

I find type assertions (::T) useful mostly for helping the compiler, when it cannot figure out types.
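
A small sketch of that use of type assertions, with hypothetical names: the dictionary holds values of type `Any`, so the compiler cannot know what a lookup returns unless the call site asserts it.

```julia
# `settings` holds values of unknown type, so the result of the lookup
# cannot be inferred without help.
function weighted_total(settings::Dict{String, Any})
    # The assertion throws a TypeError if the stored value has another type.
    weights = settings["weights"]::Vector{Float64}
    return sum(weights)
end

total = weighted_total(Dict{String, Any}("weights" => [1.0, 2.0, 3.0]))
```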

3 Likes

As mentioned above, you seem to be asking for the features of a static language in a dynamic one. Some time ago Stefan wrote a great explanation of the difference which is worth reading:

https://stackoverflow.com/questions/28078089/is-julia-dynamically-typed/28096079#28096079

Ultimately I think the right way to solve this kind of problem for julia code is a good linter which would integrate with the julia compiler’s static analysis capabilities and warn you about “odd looking” code patterns of various types. There are already a couple of linters, but I haven’t tried them extensively. Lots of unit tests are also good.


Having said all the above — and in the interests of showing how a macro would work which maybe kind of does what you want… it’s possible to make a macro which dynamically checks that an assignment is “type stable”:

"""
    @assign x = rhs

Assign `rhs` to `x`, checking that `typeof(rhs) === typeof(x)`.
"""
macro assign(ex)
    if !(ex isa Expr && ex.head == :(=))
        throw(ArgumentError("expression should be of the form `var = value`"))
    end
    name = ex.args[1]
    rhs = ex.args[2]
    tmp = gensym(name)
    esc(quote
        $tmp = $rhs
        if !(typeof($tmp) === typeof($name))
            error(string("Attempt to change the type of ", $(QuoteNode(name)), " to ", typeof($tmp)))
        end
        $name = $tmp
    end)
end

Hence:

julia> function bar()
           a = 1
           @assign a = 1.2
       end
bar (generic function with 1 method)

julia> bar()
ERROR: Attempt to change the type of a to Float64
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] macro expansion at /home/tcfoster/staticvars.jl:16 [inlined]
 [3] bar() at ./REPL[11]:3
 [4] top-level scope at none:0

julia> function foo()
           a = 1
           @assign a = 100
       end
foo (generic function with 1 method)

julia> foo()
100

julia> @code_native foo()
	.text
; ┌ @ REPL[13]:2 within `foo'
	movl	$100, %eax
	retq
	nopw	%cs:(%rax,%rax)
; └

Notice that

  • The error from bar() is a dynamically generated error, not a compiler error!
  • The compiler has eliminated the check from foo(), giving zero overhead in type stable cases

A more ergonomic and less intrusive version of such a macro could be written which applies the above rules to all assignments within a function (hah, we could call it @typestable) but it would need to understand julia’s variable scoping rules.

3 Likes

IMO this is a concern for optimizing code. Type changes per se within a function should not break anything.

Also, in the problem that presumably motivates the simplified example, if one is collecting into a vector (eg by push! — presumably the point of creating complexobj is not to return it empty), then push!(1, ...) would fail immediately without further checks.
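
A sketch of that situation, reusing the names from the original question: after the accidental rebinding, the very next `push!` raises a `MethodError`, so the mistake does not go unnoticed.

```julia
function collect_pairs()
    complexobj = Vector{Pair{String, Matrix{Float64}}}()
    complexobj = 1                          # accidental rebinding
    push!(complexobj, "a" => zeros(2, 2))   # no `push!` method for an Int
    return complexobj
end

err = try
    collect_pairs()
    nothing
catch e
    e
end
```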

I think doing it at the time when the variable was defined, like initially requested, is the easy solution:

"""
    @var x = rhs

Define `x` to be of fixed type, the type of `rhs`, and assign `rhs` to it.
"""
macro var(ex)
    if !(ex isa Expr && ex.head == :(=))
        throw(ArgumentError("expression should be of the form `var = value`"))
    end
    name = ex.args[1]
    rhs = ex.args[2]
    tmp = gensym(name)
    esc(quote
        $tmp = $rhs
        ($name)::typeof($tmp) = $tmp
    end)
end

Test

julia> function f(x)
           @var y = x
           y = 1.2
           return y
       end
f (generic function with 1 method)

julia> f(1)
ERROR: InexactError: Int64(1.2)
Stacktrace:
 [1] Type at ./float.jl:703 [inlined]
 [2] convert(::Type{Int64}, ::Float64) at ./number.jl:7
 [3] f(::Int64) at ./REPL[2]:3
 [4] top-level scope at REPL[3]:1

julia> f(1.3)
1.2
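
One caveat worth keeping in mind with this approach: a typed local variable converts on assignment rather than asserting. The `InexactError` above comes from `convert(Int64, 1.2)`; a value that converts cleanly is accepted silently. A minimal sketch with plain declarations (the function names are made up):

```julia
function typed_local_converts()
    y::Int = 1
    y = 2.0      # no error: stored as convert(Int, 2.0)
    return y
end

function typed_local_throws()
    y::Int = 1
    y = 1.2      # convert(Int, 1.2) throws InexactError
    return y
end

converted = typed_local_converts()
err = try
    typed_local_throws()
    nothing
catch e
    e
end
```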
3 Likes

Just define a short, easy-to-write type alias somewhere at global scope

const VPSM = Vector{Pair{String, Matrix{Float64}}}
complexobj::VPSM = VPSM(...)
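
For illustration, a sketch of the alias used inside functions. The alias name `PairList` and both functions are hypothetical. An accidental rebinding then fails at run time, because an `Int` cannot be converted to the declared type.

```julia
const PairList = Vector{Pair{String, Matrix{Float64}}}  # hypothetical alias name

function build_with_alias()
    complexobj::PairList = PairList()
    push!(complexobj, "a" => ones(2, 2))
    return complexobj
end

function rebind_with_alias()
    complexobj::PairList = PairList()
    complexobj = 1   # MethodError: an Int cannot be converted to PairList
    return complexobj
end

err = try
    rebind_with_alias()
    nothing
catch e
    e
end
```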
3 Likes

Hah you’re right! I didn’t know we already had this feature in lowering. I’ve never used it before.

To add a little bit more to what has been said,
re: Don’t try and write static language type code in a dynamic language.
In julia type constraints are for dispatch, not for letting the compiler tell you what error you’ve made.

So in general you just leave them off until you need another dispatch, then specify them to the minimal extent that is required for distinguishing between the different methods.
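
A tiny sketch of what “constraints for dispatch” means in practice. The function `describe` is made up for illustration; each annotation is only as specific as needed to select the right method.

```julia
describe(x::Number) = "a number"
describe(x::AbstractString) = "a string"
describe(x) = "something else"      # fallback, no constraint at all

labels = [describe(1.5), describe("hi"), describe([1, 2])]
```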

This is not to say static languages are bad, just that julia isn’t one of them.
Static languages rock, it is awesome that they can catch errors at compilation time.
But there is a trade-off, and it is one that julia made on the dynamic side.
Certain kinds of generic programming which are common, trivial and completely unnotable in julia
have fancy names and tons of work going into them to even be possible in static languages. (A recent discussion on slack was about Generic Paramorphisms, which is some kind of exceptional feature that some hardcore static languages can do, but most can’t. In a dynamic language it is so easy we don’t even have a word for it.)

I personally think I would enjoy a language that was a lot like julia but statically typed, but there are a bunch of things that make such a language really hard to implement.

2 Likes

Whilst Julia is dynamic, certain “static” features would definitely be nice to have. For instance, there is this related issue about type-stable blocks: “Type stable block”, Issue #10980 in JuliaLang/julia on GitHub.

I would go as far as saying that it is impossible (while keeping the language convenient enough for practical use). Counterfactual type calculations in a type system as rich as Julia’s are very difficult (recall Nullable).

The beauty of Julia is that it allows 99% of your code to be as fast as C by figuring out types itself, and in the remaining 1% you can make it behave dynamically and not lose anything in the practical sense. And all of this happens in one language.

3 Likes
> """
>     @var x = rhs
> 
> Define `x` to be of fixed type, the type of `rhs`, and assign `rhs` to it.
> """
> macro var(ex)
>     if !(ex isa Expr && ex.head == :(=))
>         throw(ArgumentError("expression should be of the form `var = value`"))
>     end
>     name = ex.args[1]
>     rhs = ex.args[2]
>     tmp = gensym(name)
>     esc(quote
>         $tmp = $rhs
>         ($name)::typeof($tmp) = $tmp
>     end)
> end

Thank you for the macro, it works!

Are macros always type-stable?

E.g. I ran the code below and it seems @var is type-stable, so that is good.
Can I assume that all macros are type-stable?

function testv()
    @var xx = Dict{Int, Vector{Float64}}()
    push!(xx, 1 => [3.0])    
    return xx
end

@code_warntype testv()

Body::Dict{Int64,Array{Float64,1}}
1 ─ %1 = invoke Dict{Int64,Array{Float64,1}}()::Dict{Int64,Array{Float64,1}}
│ %2 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), :(:ccall), 2, Array{Float64,1}, 1, 1))::Array{Float64,1}
│ (Base.arraysize)(%2, 1)
└── goto #3 if not true
2 ─ (Base.arrayset)(false, %2, 3.0, 1)
3 ┄ goto #4
4 ─ invoke Base.setindex!(%1::Dict{Int64,Array{Float64,1}}, %2::Array{Float64,1}, 1::Int64)
└── return %1

My Final Solution:

In my code, I have separated “Time-critical” tasks from “Non-critical” tasks.

“Time-critical” task:

  • Iterate over a huge Array of size (10m, 1000)
  • Compute some indicators, operations, etc. on the data
  • Output the data in aggregated / group-by style,
    so that I output only a very small table, say an Array of size (1000, 10)

E.g. my huge data contains 20 years of hourly temperatures from 10K places around the world.
Then my output will be only a monthly average temperature for each country.

“Non-critical” task:

  • Read that huge Array (10m, 1000) from disk into memory using Serialization
  • Initialize all parameters or mappings for the input (huge array) and the output (smaller array)
  • After obtaining the output as an Array of size (1000, 10),
    convert it to a DataFrame
  • Perform many other operations on this DataFrame
  • Generate a final report based on this smaller DataFrame

During the time-critical task, I make sure that most of my variables are “statically typed”.

However, during the non-critical task, I don’t bother with static types: the data load is small, so I really don’t care about type instability.
In the non-critical task, most of my operations use “DataFrame”, which involves a lot of type instability.
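
The split described above resembles what is often called a function barrier: the loosely typed setup code hands concretely typed arrays to a small kernel, and only the kernel needs to be type-stable. A hedged sketch of such a kernel follows. The name `group_means`, its arguments, and the tiny input are all hypothetical stand-ins for the real aggregation.

```julia
# Hypothetical time-critical kernel: average all entries of the columns that
# belong to each group (e.g. places grouped by country).
function group_means(data::Matrix{Float64}, groups::Vector{Int}, ngroups::Int)
    sums = zeros(Float64, ngroups)
    counts = zeros(Int, ngroups)
    for j in axes(data, 2)
        g = groups[j]                 # group index of column j
        for i in axes(data, 1)
            sums[g] += data[i, j]
        end
        counts[g] += size(data, 1)
    end
    return sums ./ counts
end

# Non-critical part: prepare inputs however is convenient, then call the kernel.
data = [1.0 2.0 3.0;
        3.0 4.0 5.0]
groups = [1, 1, 2]
means = group_means(data, groups, 2)
```

Because the argument types are concrete at the call, the kernel is compiled for those types no matter how loosely typed the surrounding script is.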

FYI, I am not trying to write a generic library/module/project for other people to use. I am writing scripts to process data and find patterns in it myself, and then I present a PowerPoint to other people.
So I usually write standalone, quick-and-dirty code in Julia, solely for my own use.

(I don’t share my code with colleagues, since they don’t care anyway… My job is more data analysis than project/software development.)

1 Like

Just for coding ergonomics, I quite often use a const type alias:

const MyFancyType = Vector{Pair{String, Matrix{Float64}}}
complexobj::MyFancyType = MyFancyType()
2 Likes

Technically they are, since they transform Expr to Expr, but that is not a relevant question for macros.

It is better to think of the type stability of the generated code in the context of the macro invocation.
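
To make that concrete, here is a sketch that reuses the `@var` macro posted earlier in this thread. The two functions are made up for illustration. The same macro yields a variable whose type is fixed in one call site and decided at run time in the other.

```julia
# `@var` as posted earlier in this thread.
macro var(ex)
    if !(ex isa Expr && ex.head == :(=))
        throw(ArgumentError("expression should be of the form `var = value`"))
    end
    name = ex.args[1]
    rhs = ex.args[2]
    tmp = gensym(name)
    esc(quote
        $tmp = $rhs
        ($name)::typeof($tmp) = $tmp
    end)
end

# The macro itself only maps an expression to an expression.
expanded = @macroexpand @var x = 1

function stable_case()
    @var x = 1                  # always an Int
    return x
end

function unstable_case(flag::Bool)
    @var x = flag ? 1 : 1.0     # Int or Float64, decided at run time
    return x
end
```

Running `@code_warntype unstable_case(true)` is one way to inspect what the compiler infers for the second function.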

1 Like