VS Code Julia language-server caching?

I currently have to use VS Code with a rather heavy/large Julia environment. The first time the language-server started, it needed ages until everything was “ready” (is indexed the right term?), even though I had already run pkg precompile before. However, on the next start, things went quickly.

I’d like to understand a bit better what the language server does, and where it caches the result (indices?). I assume it does cache somehow, since it was so quick to get “ready” the second time. Also, when one of the packages I use changes (quite a few were added via pkg dev), how much will the language server have to “re-cache”?

@davidanthoff, sorry to bother you with this - I haven’t used VS Code so heavily before (but I’d like to do it more and more, I’ve really grown to love what’s possible here).


Indexing really happens in two steps: we load each package in the active env, and then we extract signature and doc information. Both take time, but especially the package loading part if you have an environment that has a lot of packages.
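To give a feel for the second step, here is a minimal sketch of pulling method signatures and source locations out of an already-loaded module using only Base reflection. `DemoPkg` and `extract_signatures` are made-up names for illustration; this is not how SymbolServer.jl is implemented, and it leaves out the doc extraction.

```julia
# Hypothetical stand-in for a package that has already been loaded.
module DemoPkg
export area
area(r::Real) = pi * r^2
area(w::Real, h::Real) = w * h
end

# Collect signature and source location for every exported function.
function extract_signatures(mod::Module)
    index = Dict{Symbol,Any}()
    for name in names(mod)                 # exported names only
        isdefined(mod, name) || continue
        obj = getfield(mod, name)
        obj isa Function || continue       # skips the module binding itself
        index[name] = [(sig = m.sig, file = String(m.file), line = Int(m.line))
                       for m in methods(obj)]
    end
    return index
end

index = extract_signatures(DemoPkg)
```

The reflection calls themselves are cheap; the expensive part is that the package has to be loaded before `names` and `methods` can be asked anything.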

We do cache these results on disc, so once they are indexed, the next time it should take almost zero time to get at that info. If you update your env with new package versions, we’ll have to index those packages that have updated, unless that specific version has been indexed and cached before.
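As a rough illustration of that behaviour, the sketch below keys a cache file on package name and version, so a version that was indexed before is never indexed again. The file naming, `load_or_index`, `slow_indexer`, and the package name and version numbers are all assumptions made up for this example, not the real cache layout.

```julia
using Serialization   # stdlib

# Hypothetical file naming: one file per package name and version.
cache_file(dir, name, version) = joinpath(dir, "$(name)-$(version).jls")

# Return cached data if this exact version was indexed before; otherwise run
# the (slow) indexer once and store its result on disk.
function load_or_index(indexer, dir, name, version)
    path = cache_file(dir, name, version)
    isfile(path) && return deserialize(path)
    data = indexer(name, version)
    serialize(path, data)
    return data
end

dir = mktempdir()
calls = Ref(0)   # counts how often the slow path runs
slow_indexer(name, version) = (calls[] += 1; Dict(:package => name, :version => version))

load_or_index(slow_indexer, dir, "SomePkg", v"1.0.0")   # indexes
load_or_index(slow_indexer, dir, "SomePkg", v"1.0.0")   # cache hit
load_or_index(slow_indexer, dir, "SomePkg", v"1.0.1")   # new version, indexes again
```

Going back to `v"1.0.0"` afterwards would still be a cache hit, which corresponds to the "unless that specific version has been indexed and cached before" case.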

We recently changed where we store the cache. Previously we stored it in the extension folder, so anytime we shipped a new version of the extension, the cache was wiped out and everything needed to be indexed from scratch. We now store the index in a location that is managed by VS Code but stays in place across extension updates, so that should significantly reduce the indexing needs. You can see where the cache is located by looking at the Julia Language Server output pane; it should say something about the symbol server location.

We have more plans to improve on this whole situation. Our current favorite is to index every version of every package in the cloud, and then first try to download the cache file from the cloud before we index on your system. Nothing implemented yet, but right now we think that would have the potential to further reduce indexing on user machines a lot.


Thanks a lot for the detailed explanation!

Does the Julia extension notice when a package under development is recompiled (.ji file changes)? I have several packages open in my workspace, due to tight inter-dependencies, and use a shared environment in “.julia/environments/my-prj-dev” that has everything necessary.

Oh, and sorry, one more question: When Revise.jl revises code, does that get propagated to the index?

So in theory any code that is part of the VS Code workspace that you have open is not handled by the indexing story at all, but instead directly parsed by CSTParser and then analyzed. That step updates every time you press a key in the editor :slight_smile: We do have some open issues that we don’t fully use that information in some contexts in the way we should, but those are really bugs that we are working through right now.
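To illustrate the difference from the index, here is a small sketch that re-parses an editor buffer and lists its top-level function definitions without loading or running anything. It uses Base's `Meta.parseall` purely as a stand-in for CSTParser, and `toplevel_function_names` is a made-up helper, so take it as an illustration of the idea rather than what the language server does internally.

```julia
# Return the names of functions defined at the top level of `text`.
function toplevel_function_names(text::AbstractString)
    found = Symbol[]
    for ex in Meta.parseall(text).args
        ex isa Expr || continue                      # skip LineNumberNodes
        if ex.head === :function || (ex.head === :(=) && ex.args[1] isa Expr)
            sig = ex.args[1]
            # Peel off `where` clauses and return-type annotations.
            while sig isa Expr && sig.head in (:where, :(::))
                sig = sig.args[1]
            end
            if sig isa Expr && sig.head === :call && sig.args[1] isa Symbol
                push!(found, sig.args[1])
            end
        end
    end
    return found
end

buffer = """
f(x::Int) = x + 1
function g(y)
    return "hey there"
end
"""

names_before = toplevel_function_names(buffer)
# Simulate an edit: the user types one more definition.
names_after = toplevel_function_names(buffer * "h(z) = 2z\n")
```

Because this works on the text alone, it can be rerun after every edit, which is why workspace code does not need the on-disk index.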

Oh, and sorry, one more question: When Revise.jl revises code, does that get propagated to the index?

No. If you update code files that are inside the workspace, everything updates; if you update files outside of the workspace (say files in a deved package in the current env), we don’t get update notifications for that.


I’ve been toying with the idea of having eglot-jl manage the symbol server cache separately from the SymbolServer.jl package (like vscode currently does). Is the on-disc format of the symbol-server cache considered part of the stable api of LanguageServer.jl (will I be notified by seeing a major version bump to LanguageServer.jl if manual intervention in the symbol server cache is ever required by an update)?

Oh, great, so as long as I do all editing in VS Code (that was my intention anyhow), things should always be up to date (modulo bugs), even if I have several packages open (multi-root workspace)? That sounds awesome!

Good question :wink: We hope to break compat of the format rarely, but I’m sure we will. In particular, we are currently exploring a JSON based format that would be Julia version agnostic (the current serialization format is not Julia version agnostic).

The way we handle this in VS Code is that we create a folder for the cache that includes v1 in the folder name (see here), and if we break the format we plan to just increase that number. We haven’t figured out how we would delete the old cache in that case.

We can certainly agree that we will increase the major version of LanguageServer.jl if we break the format. Or maybe SymbolServer.jl, where the serialization code is actually located? Or maybe we should add a function get_cache_format_version() to SymbolServer.jl that one can call at runtime, and we would increase that version number if we break the format?
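For what it’s worth, a downstream client could use such a function roughly like this. This is only a sketch: `get_cache_format_version()` does not exist in SymbolServer.jl today (it is just the proposal above), so it is defined locally here, and `versioned_cache_dir` and `stale_cache_dirs` are made-up helper names.

```julia
# Hypothetical: defined locally because the function is only a proposal.
const CACHE_FORMAT_VERSION = 1
get_cache_format_version() = CACHE_FORMAT_VERSION

# Put each format version in its own subfolder ("v1", "v2", ...).
versioned_cache_dir(root::AbstractString) =
    joinpath(root, "v$(get_cache_format_version())")

# Folders left behind by other format versions, which a client could delete.
function stale_cache_dirs(root::AbstractString)
    current = "v$(get_cache_format_version())"
    return [joinpath(root, d) for d in readdir(root)
            if isdir(joinpath(root, d)) && occursin(r"^v\d+$", d) && d != current]
end

root = mktempdir()
mkpath(joinpath(root, "v0"))        # pretend an older format left this behind
mkpath(versioned_cache_dir(root))   # folder for the current format
```

A client managing its own cache could call `stale_cache_dirs(root)` at startup and remove whatever it returns, which would also cover the open question of cleaning up old caches after a format break.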


That is the theory. I’m pretty positive that is not working correctly right now, though :slight_smile: But we are trying to work through that scenario right now, so hopefully we’ll be able to fix it soonish.


Modulo an issue like https://github.com/julia-vscode/SymbolServer.jl/issues/54 cropping up again, it should be possible to use LanguageServer.jl without even knowing that SymbolServer.jl exists, so the major version bump probably applies to both packages. That said, I’ll make sure to ping you/Zac somewhere if I start managing the cache separately so y’all can be aware of a downstream target with potential for breakage there.

That is the theory. I’m pretty positive that is not working correctly right now, though

I think I’ve noticed some occasions, yes. :slight_smile: But I’ll happily be along for the ride, even if it’s a bit rough for now. I’ve never been a big fan of IDEs myself in the past, but this is different (and pretty awesome).


I wonder why you need caching in the first place; e.g. querying the doc dynamically is basically instantaneous:

julia> using Colors  
    
julia> @time Base.Docs.doc(Base.Docs.Binding(Main, :RGB))
  0.000075 seconds (19 allocations: 1.375 KiB)
  RGB is the standard Red-Green-Blue (sRGB) colorspace. Values of the individual color channels range from 0 (black)
  to 1 (saturated). If you want "Integer" storage types (e.g., 255 for full color), use N0f8(1) instead (see
  FixedPointNumbers).

But loading the package can take a long time, and by caching things we can then have everything much quicker available the next time because then we don’t have to load any package.


But loading the package can take a long time, and by caching things we can then have everything much quicker

Amen to that! Thanks a lot for that caching feature!

Maybe we don’t have the same workflow but I almost never look at the doc of packages I’m not currently using.

Unrelated question - is there currently a Julia profile viewer integrated with VS Code?

It is not just the docs we need; we also need this information for code navigation (go to definition and things like that), the linter etc. The language server essentially needs a lot of information about all the packages in your current environment the moment you open a Julia file.


No, but we’re getting a really nice one soon. VS Code is going to ship with a profile viewer (flame graphs, tree drill down, a query language for profiles and inline overlay of profile information in the code files themselves) in one of the next versions. All we need to do is export Julia profiles into the right format; that work is currently ongoing in https://github.com/davidanthoff/ChromeProfileFormat.jl.


Ohhhh … slight drool … I can’t wait! (But I will :-)). Please will you write an announcement when that’s available, pretty please?

Yes, that will get a minor version bump and an announcement!

For the initial latency, could you use Revise & CodeTracking if they are loaded? Demo:

f(x::Int) = x + 1

f(x::AbstractFloat) = x + 2

g(y) = "hey there"

computex(::String) = 0

computex(::Dict) = 0.0

function myfunction(d::AbstractDict, key)
    x = computex(d)
    y = f(x)
    return g(y)
end

function myfunction(d::AbstractString, key)
    x = computex(d)
    y = f(x)
    return g(y)
end

If the user’s cursor is on line 19, and you want the function signature, you can do this:

julia> using CodeTracking, Revise

julia> includet("/tmp/lsdemo.jl")     # Revise has to parse the file to cache signatures

julia> sigs = signatures_at("/tmp/lsdemo.jl", 19)
1-element Array{Any,1}:
 Tuple{typeof(myfunction),AbstractString,Any}

Once you have the signature you should be able to query the documentation.

Revise’s signatures test script has more tricks that might be interesting. The highlighted lines in this link extract the signatures from Base; the line at the bottom is proof that it failed on fewer than 40 signatures (out of the >11000 in Base). The ones it misses tend to be tricky, like +(::OrdinalRange, ::OrdinalRange) which is defined by an @eval inside a function body. If the @eval had been at top-level it would have figured it out, even for some fairly tricky cases:

julia> signatures_at("julia/base/atomics.jl", 422)
4-element Array{Any,1}:
 Tuple{typeof(Base.Threads.atomic_add!),Base.Threads.Atomic{T},T} where T<:Union{Float16, Float32, Float64}
 Tuple{typeof(Base.Threads.atomic_sub!),Base.Threads.Atomic{T},T} where T<:Union{Float16, Float32, Float64}
 Tuple{typeof(Base.Threads.atomic_max!),Base.Threads.Atomic{T},T} where T<:Union{Float16, Float32, Float64}
 Tuple{typeof(Base.Threads.atomic_min!),Base.Threads.Atomic{T},T} where T<:Union{Float16, Float32, Float64}

which is pretty fun when you see how arcanely these methods are defined.

But you can go farther than this. In this demo, F12 on f doesn’t take you to the correct definition of f—LanguageServer sometimes struggles (for understandable reasons) with multiple dispatch. I’m not saying it’s easy or will always be successful, but in some cases there is a way to figure it out. First we have to figure out what argument type f will have, for which we’ll use inference:

julia> sig = sigs[1];

julia> meth = Revise.JuliaInterpreter.whichtt(sig)
myfunction(d::AbstractString, key) in Main at /tmp/lsdemo.jl:18

julia> src = Core.Compiler.typeinf_code(meth, sig, Core.svec(), false, Core.Compiler.Params(typemax(UInt)))
(CodeInfo(
    @ /tmp/lsdemo.jl:18 within `myfunction'
1 ─      (x = Main.computex(d))::Core.Compiler.Const(0, false)
│   @ /tmp/lsdemo.jl:19 within `myfunction'
│        (y = Main.f(x::Core.Compiler.Const(0, false)))::Core.Compiler.Const(1, false)
│   @ /tmp/lsdemo.jl:20 within `myfunction'
│   %3 = Main.g(y::Core.Compiler.Const(1, false))::Core.Compiler.Const("hey there", false)
└──      return %3
), String)

julia> src = src[1];

julia> src.code[2]    # the second entry in `src.code` has the call to `f`
:(_5 = Main.f(_4::Core.Compiler.Const(0, false)))

julia> src.slottypes[4]    # _4 indicates a "slot" and so we can ask what type it has
Int64

OK, so in this case we can fortuitously determine that f will be called with an Int64. (This would not have been possible if the signature of this method of myfunction had not specified AbstractString, so there will be many cases where this kind of lookup will fail.) From there, problem solved:

julia> which(f, (Int64,))
f(x::Int64) in Main at /tmp/lsdemo.jl:1