[ANN] JET.jl: the next generation of code checker for Julia

Julia’s extreme expressiveness and composability come from its dynamism — at the cost of that, a static type-level analysis of Julia code has been remained as a longstanding problem.

JET.jl is a fresh approach to static analysis of such a dynamic language; it can analyze a pure Julia script and detect type-level errors within a practical speed.

JET is powered by both abstract interpretation routine implemented within the Julia compiler as well as a concrete interpretation based on JuliaInterpreter.jl. The abstract interpreter enables a static analysis on a pure Julia code without any need for additional type annotations, while the concrete interpreter allows effective analysis no matter how heavily it depends on meta-programming or external configurations, which are common obstacles to static code analysis.

Demo

Here is a small demonstration of how JET can detect type-level errors.

Say we have this strange file:

demo.jl

# heavy computation
# -----------------

fib(n) = n ≤ 2 ? n : fib(n-1) + fib(n-2)

fib(1000)   # never terminates in ordinal execution
fib(m)      # undef var
fib("1000") # obvious type error

# language features
# -----------------

struct Ty{T}
    fld::T
end

foo(a) = bar(Ty(a))

macro myinline(funcdef) esc(:(@inline $funcdef)) end

@myinline bar(n::T)     where {T<:Number} = n < 0 ? zero(T) : one(T)
@myinline bar(v::Ty{T}) where {T<:Number} = bar(v.fdl) # typo "fdl"
@myinline bar(v::Ty)                      = bar(convert(Number, v.fld))

foo(1.2)
foo("1") # `String` can't be converted to `Number`

We can pass it to JET and find possible errors:

julia> using JET

julia> report_file("demo.jl")
[toplevel-info] entered into demo.jl
[toplevel-info]  exited from demo.jl (took 3.71 sec)
═════ 4 possible errors found ═════
┌ @ demo.jl:7 fib(m)
│ variable m is not defined: fib(m)
└─────────────
┌ @ demo.jl:8 fib("1000")
│┌ @ demo.jl:4 ≤(n, 2)
││┌ @ operators.jl:387 Base.<(x, y)
│││┌ @ operators.jl:338 Base.isless(x, y)
││││ no matching method found for call signature: Base.isless(x::String, y::Int64)
│││└────────────────────
┌ @ demo.jl:25 foo(1.2)
│┌ @ demo.jl:17 bar(Ty(a))
││┌ @ demo.jl:22 Base.getproperty(v, :fdl)
│││┌ @ Base.jl:33 Base.getfield(x, f)
││││ type Ty{Float64} has no field fdl
│││└──────────────
┌ @ demo.jl:26 foo("1")
│┌ @ demo.jl:17 bar(Ty(a))
││┌ @ demo.jl:23 convert(Number, Base.getproperty(v, :fld))
│││ no matching method found for call signature: convert(Number, Base.getproperty(v::Ty{String}, :fld::Symbol)::String)
││└──────────────
(included_files = Set(["demo.jl"]), any_reported = true)

Note that:

  • JET can analyze a pure Julia script, and we’re not required to spread out additional annotations just for the sake of type analysis
  • code executions were “abstracted” and the analysis terminated no matter how heavy the actual computations would be (in this case fib(1000) won’t terminate in practical time)
  • still Julia’s advanced language features like macros are fully supported, because JET can appropriately interprets top-level definitions

Usages

As shown above, JET is very simple to use.
Just install it and pass it whatever file you want to analyze.

julia> ] add JET # JET is now published at the General package registry

report_file can analyze an arbitrary Julia script:

julia> report_file("some/awesome/code.jl")

When analyzing a package, analyze_from_definitions configuration option allows us to analyze it even without any entry points:

julia> report_file(pathof(JET); analyze_from_definitions=true) # this way JET can analyze JET itself

report_and_watch_file will automatically re-trigger analysis on each code update detected:

julia> report_and_watch_file(pathof(JET); analyze_from_definitions=true) # re-trigger analysis when any code inside the JET module gets updated

And JET’s analysis can be configured very flexibly, e.g. you can tune various parameters of abstract interpretation, customize how JET “concretize” (i.e. interprets) top-level code, etc.

Head for the documentation and find more detailed usages and various configuration options.

Caveats

First of all, JET doesn’t guarantee that your code is type safe even when JET doesn’t report any errors within it. In another word, JET is so permissive that it ignores some cases that might be false positive and it reports an error only when it “likely” happens at runtime.
For example, there won’t be any objection to reporting “type Regex has no field non_existing” error for a call of getfield(::Regex, :non_existing), which actually leads to the error at runtime, but it’s not clear if we want to get a report of possible “type Regex has no field …” error for every call of getproperty(::Regex, sym::Symbol) where sym's actual value isn’t known until the runtime (and we know, this kind of situation is very common in Julia). JET ignores the second case, just because I didn’t want a sound, but unpractical analysis (i.e. such an analysis that could be slow or even never-terminate and produce low precise error reports).
This is one of JET’s core design choices, which I admit that is very opinionated, but I’m also planning to allow users to configure JET’s analysis strictness to the possible extent in the future.

The next thing to note is that JET deeply relies on Julia’s compiler implementation and JET’s analysis result can vary depending on your Julia version.
JET is only supported for Julia v1.6 and above, and in general, the newer your Julia version is, the more accurately and quickly you can expect JET to analyze your code, assuming Julia’s compiler keeps evolving all the time from now on.

Finally, JET is very experimental at this point. This means, you may still face bad analysis performance and end up with tons of false positive errors (especially if you try to analyze code that has lots of dependencies).
JET’s idea, abstract interpretation based type check, isn’t such an established technique, and there’s a lot of room for improvement. One beauty of JET’s design to borrow Julia compiler’s type inference implementation is that we can expect JET analysis to be more accurate and faster as Julia compiler evolves, but the inference quality might not yet be as ideal as we expect in the context of error check.

In any case, if you have any issue, enhancement request or critique, please post it at JET.jl’s repository or ping me here on the discourse or the Julia slack. Your feedback would be very valuable in order to make JET more practical and useful.

Roadmap

Lastly I’d share the roadmap of JET.
I’m very welcome to elaborate and discuss more details if you’re interested in contributing to JET:

  • editor/IDE integration: GUI would definitely be more appropriate for showing JET’s analysis result
    • smarter code dependency tracking: the watch mode currently re-analyzes the whole code on each update, which is the most robust and least efficient option. When integrated with an IDE, fancier incremental analysis based on smarter code dependency tracking like what Revise.jl does would be needed
    • LSP support: ideally I hope to extend JET to provide some of LSP features other than diagnostics, e.g. auto-completions, rename refactor, taking type-level information into account
  • more documentation: especially JET needs a specification of its error reports
  • more accurate and faster analysis: abstract interpretation can be improved yet more, e.g. with alias analysis
  • configurable strictness: there’re cases where more strict check is appropriate, and vice versa
  • performance linting: JET can be used to report performance pitfalls, e.g. report a dynamic dispatch found inside a heavy loop

Sounds exciting ? Hope you will like this project !

Shuhei,

101 Likes

Congrats. I look forward to playing with this.

If you have an abstract interpretation core that is generic in the sense that it supports swapping in different abstract domains, it would be cool to separate that out into its own package. This would allow people to write lots of static analyses on top of it, not necessarily related to bug checking.

4 Likes

Great package, I’m looking forward to try it out on my packages.

However, from a user interface perspective I can’t help feeling that this is unfortunately complicated and something you must look up every time you want to run it on a package:

It would be much nicer with e.g. report_package("JET") and defaulting report_package() to run on the package in the active project.

The watch option seems like something that would go a nicely as a keyword argument though (and maybe even default to true, unless the Revise overhead is too heavy or makes the analysis less robust).

5 Likes

I tried running it on my project, and it reported errors on a few lines of this form:

const MOI = JuMP.MathOptInterface

Here it reported the error cannot assign a value to variable JuMP.MOI from module CLR. I tried removing the lines and my code continued to work, so I guess it was reasonable to remove that.

Anyway, it was a bit difficult to find the above, because the line number it gave was always the line number of the include("file_name.jl") statement of the file containing the mistake, and not the line number in the actual file.

Anyway, it was a bit difficult to find the above, because the line number it gave was always the line number of the include("file_name.jl") statement of the file containing the mistake, and not the line number in the actual file.

Could you post it as an issue with MRE ?

1 Like

Problem is likely that MOI is already exported from JuMP as a constant.

1 Like

This is a great projet, I was asking for a tool like this for a number of years.

Jan

1 Like

Just tried it today with some short code examples. It works great. I’ll try it on some production code soon. :slightly_smiling_face:

Maybe it’s already in the README, but I noticed that JET.jl doesn’t report type errors in a function unless the function is actually called. For example:

# No type error from report_file("try.jl")
# filename: try.jl
f(a::Int64)::String = a^2

# Type error found from report_file("try1.jl")
# filename: try1.jl
f(a::Int64)::String = a^2
println(f(5))

I believe that is by design. The space of possible julia types is truly gigantic. For generic functions, there are formally an infinite number of possible type signatures due to parametric types.

JET could of course special case concrete methods like the one you show, but in general, the only sane way to handle arbitrary julia code is to investigate signatures that can possibly be called.

3 Likes

Maybe try

report_file("try.jl", analyze_from_definitions=true)
1 Like

Yeah, that’s by design, and @Mason already explained the reason.
But I thought this can be less useful in some situations, especially when developing a package where only “definitions” are available, and hence there is analyze_from_definitions configuration as @adamslc already suggested.
Julia’s type inference can go though with abstract types, and so you can analyze your method even if its signature is not fully concrete, but it of course leads to imprecise analysis.
So I recommend you use JET using something like test file, which contains top-level call sites that provides concrete method signatures, if available.

2 Likes