The language server interaction with the code needs to be sorted out. There are too many files in a typical project where the LSP is no help whatsoever: no hover help, no symbol lookup.
I second this. In a typical research project I have my own source code, which I use in my scripts/notebooks together with packages from the Registry. I get practically no support for functions that come from my own source code (such as the features you mention). The worst offender is the missing “Ctrl+Click” support that takes you to the source code of a function, which is the thing I need most often for functions I wrote, not for ones that come from a package.
Part of the challenge as I see it is that a lot of those issues have pretty deep-rooted causes that may go all the way down to the core language. For example, anything that is limited by the current approach to lowering.
It could help to take a couple of these big issues and map out critical path(s) from them all the way upstream. E.g. if the problem is in the UI part of VS Code, that path is pretty short. If the problem is in the language server, then it could be tackled there. If the bug is with tools the LS depends on (syntax parsing, linting, etc.), then those projects should come up. And finally, if the roadblock is with the Julia compiler (I’m including native parsing and lowering here), then that would be good to identify too.
I’d advocate for this approach because I too often see people getting frustrated or feeling powerless because they’ve identified the wrong level for change. The VS Code extension and LS.jl are worst hit by this because they’re the most user-facing parts of the stack. Getting a more complete picture would help with finding and allocating resources towards solving some of these long-standing issues!
While I understand that it is an additional burden for the volunteers that currently maintain julia-vscode to document the existing packages, without better documentation we will not be able to make any progress.
As long as @davidanthoff and @pfitzseb do not find the time to improve the developer documentation (which is fine, you are volunteers and free to set your own priorities), the only actionable point is to reverse engineer and document the existing packages, starting with documenting “JuliaWorkspaces”.
Julia enthusiasts, please feel challenged to make it easier for new contributors to fix the open issues by improving the developer documentation of the current code base!
We are in the middle of two major transitions in the LS, so things are even more complicated than normal.
The first transition is that we want to adopt JuliaSyntax.jl for parsing and probably also its node types for representing code. Most of the LS at the moment is powered by CSTParser, which has its own parsing implementation and brings the main node type along that is used throughout the LS. At the same time, we have started to use JuliaSyntax in the LS (yes, at the moment everything gets parsed twice, once by CSTParser and once by JuliaSyntax) for some things, namely the test item detection stuff. The roadmap here is that I want to completely get rid of the CSTParser parser and exclusively use the JuliaSyntax parser. The medium term plan is that we will have one parsing pass that then generates trees for the old CSTParser node types and the JuliaSyntax node types. Once we are at that stage we’ll need to spend some more time thinking about node types and what exactly is the right fit for the LS.
The second transition is towards a more functional/immutable/incremental computational model for most of the logic in the LS. At the moment the LS uses mutable data structures throughout, and keeping track of where and when state is mutated is really, really tricky (well, at least for me). It also makes it completely hopeless to use multithreading at some point, for example. So this summer I started tackling that problem, and the strategy is to use Salsa.jl as the core underlying design for the LS. There is an awesome JuliaCon video about that package from a couple of years ago for anyone curious. That whole design is essentially inspired by the Rust language server. The outcome of that transition will be a data model that is much, much easier to reason about.
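To make the incremental model a bit more concrete, here is a minimal sketch of the general idea: inputs carry a revision stamp, derived queries are memoized together with the inputs they read, and a query only re-runs when one of those inputs has changed. It is written in Python purely for illustration. It is not Salsa.jl's API or anything from the LS, and the names `Database`, `line_count` and `total_lines` are hypothetical. A real Salsa-style engine does considerably more than this.

```python
class Database:
    """Toy incremental store: revision-stamped inputs plus memoized derived queries."""

    def __init__(self):
        self.revision = 0
        self._inputs = {}   # key -> (value, revision at which it was last set)
        self._memo = {}     # (query name, arg) -> (value, computed_at, input deps)
        self._active = []   # stack of dependency sets for queries being computed
        self.log = []       # queries that actually re-ran (for demonstration only)

    def set_input(self, key, value):
        self.revision += 1
        self._inputs[key] = (value, self.revision)

    def get_input(self, key):
        # Record the read so the enclosing derived query knows what it depends on.
        if self._active:
            self._active[-1].add(key)
        return self._inputs[key][0]

    def query(self, fn, arg):
        memo_key = (fn.__name__, arg)
        cached = self._memo.get(memo_key)
        if cached is not None:
            value, computed_at, deps = cached
            # Fresh if none of the inputs it read changed after it was computed.
            if all(self._inputs[d][1] <= computed_at for d in deps):
                if self._active:
                    self._active[-1].update(deps)
                return value
        self._active.append(set())
        try:
            value = fn(self, arg)
        finally:
            deps = self._active.pop()
        self._memo[memo_key] = (value, self.revision, frozenset(deps))
        if self._active:
            # Propagate input dependencies to the calling query.
            self._active[-1].update(deps)
        self.log.append(memo_key)
        return value


# Two hypothetical derived queries.
def line_count(db, path):
    return len(db.get_input(path).splitlines())


def total_lines(db, paths):
    return sum(db.query(line_count, p) for p in paths)


db = Database()
db.set_input("a.jl", "f(x) = 1\ng(x) = 2\n")
db.set_input("b.jl", "h(x) = 3\n")
paths = ("a.jl", "b.jl")

first = db.query(total_lines, paths)
db.log.clear()
db.query(total_lines, paths)            # served from the memo table
unchanged_log = list(db.log)

db.set_input("b.jl", "h(x) = 3\nk(x) = 4\n")
second = db.query(total_lines, paths)   # only b.jl's line count and the total re-run
changed_log = list(db.log)
```

Because queries never mutate shared state and only talk to the database, it is always clear what a result depends on, which is the property that makes this style easier to reason about than scattered mutation.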
Very roughly, StaticLint/CSTParser/SymbolServer has all the code from before these transitions, and JuliaWorkspaces has the code that is in this new world of the two transitions I mentioned above. So the division is by generation of when stuff was added to the LS, not by functionality. My expectation is that once the transition is finished, StaticLint and SymbolServer will be no more as individual packages but their code will have been incorporated into JuliaWorkspaces. The final design I have in mind is that the LanguageServer.jl package really only has the code that implements the LSP wire protocol, but not much functionality in it, and all the functionality lives in JuliaWorkspaces. The idea being that we can then create, for example, CI tools that use the functionality in JuliaWorkspaces directly (like julia-actions/julia-lint), or command line apps etc.
Right now I’m in the middle of moving the SymbolServer functionality into JuliaWorkspaces. I won’t give timelines, but generally I’ll have a non-teaching semester in the spring and my hope is to finish both transitions then.
@davidanthoff Does this mean we should not try to fix bugs or document the existing packages before the refactoring is finished? Or are there already areas where other developers can help out?
It may not be this simple, but could individual issues be opened for the community to port specific features from StaticLint/CSTParser/SymbolServer to JuliaWorkspaces?
I’d say it depends a bit. I don’t think there is much point in documenting anything around SymbolServer at the moment. StaticLint I’m not sure, my best guess is that a lot of that code will just be moved around a bit but otherwise won’t change too much. By far the biggest change that I hope to make is to move the semantic data that is currently stored in a field in the syntax tree into side tables instead, but I think there is a way to do that that is not too invasive.
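For readers unfamiliar with the side-table idea, here is a rough sketch of what it can look like. It is in Python purely for illustration and does not reflect the actual CSTParser or StaticLint data structures; the `Node` type, the path-based keys and the `resolve` pass are all hypothetical. The syntax tree stays immutable, and the analysis pass returns its results in a separate table keyed by each node's position in the tree.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Path = Tuple[int, ...]  # child indices from the root down to a node


@dataclass(frozen=True)
class Node:
    """Immutable syntax node: no semantic fields stored on the tree itself."""
    kind: str
    text: str = ""
    children: Tuple["Node", ...] = ()


def node_at(root: Node, path: Path) -> Node:
    node = root
    for index in path:
        node = node.children[index]
    return node


def resolve(root: Node) -> Dict[Path, Path]:
    """Return a side table mapping each identifier use to its definition."""
    bindings: Dict[str, Path] = {}   # name -> path of the defining identifier
    refs: Dict[Path, Path] = {}      # path of a use -> path of its definition

    def walk(node: Node, path: Path) -> None:
        if node.kind == "assign":
            target, value = node.children
            walk(value, path + (1,))             # right-hand side is visited first
            bindings[target.text] = path + (0,)  # then the target becomes visible
            return
        if node.kind == "ident" and node.text in bindings:
            refs[path] = bindings[node.text]
        for i, child in enumerate(node.children):
            walk(child, path + (i,))

    walk(root, ())
    return refs


# Tree for the two statements `x = 1` and `y = x`.
tree = Node("block", children=(
    Node("assign", children=(Node("ident", "x"), Node("literal", "1"))),
    Node("assign", children=(Node("ident", "y"), Node("ident", "x"))),
))

refs = resolve(tree)
use_path = (1, 1)                 # the `x` on the right-hand side of `y = x`
definition = node_at(tree, refs[use_path])
```

Since the tree is never written to, the same tree can be shared between passes (or threads), and a pass can be re-run by simply throwing its table away.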
Fixing bugs can always continue and is always great. Any bug fixed even in the old code is a bug fixed in the new stuff also, as this transition is really mostly a “move code around” type of exercise. The only exception to that is really bug fixes that require a major redesign in data structures or something like that, I think those will have to wait.
That is a great idea for a little later in the process, I think! But at the moment I need to first move SymbolServer and the analysis passes from StaticLint; until that is done, really nothing else can be moved. And moving those core parts is tricky. I don’t think anyone who doesn’t have a lot of work on the LS under their belt could do that.
But once that phase is done, there is a lot of functionality that I think can be moved in small discrete chunks, and then help would be great.
There are actually areas where folks could really help out that are not entangled in this transition process. Here are some ideas:
It would be great to have better error recovery in JuliaSyntax, see here.
JuliaWorkspaces handles project files natively now (which I can already tell will make handling of environments much, much easier). For that we need to parse TOML files. We currently use the base TOML parser, but really what we need is a JuliaSyntax-like TOML parser, i.e. one that carries detailed location information, recovers gracefully from errors etc. Someone started something like that a while ago (AbdulrhmnGhanem/TOMLCSTParser.jl, a concrete syntax tree parser for TOML), but I think at this point it would actually be nicer to have something that mimics the API and style of JuliaSyntax.
If anyone wants to start documenting JuliaWorkspaces, that would also be awesome. I don’t expect any major changes to anything that is in there right now, i.e. I think more stuff will be added, but the general structure is good, I think.
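On the TOML idea above, here is a tiny sketch of what “carries location information and recovers from errors” could mean in practice. It is Python for illustration only, handles just table headers and string-valued pairs, and is not the API of JuliaSyntax, TOMLCSTParser.jl or any existing parser; the `Item` type and `parse` function are hypothetical. Every item records its line and column span, and a malformed line becomes an `error` item instead of aborting the parse.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    kind: str        # "table", "pair", or "error"
    line: int        # 1-based line number
    start: int       # 0-based column where the item starts
    end: int         # column one past the item's last character
    key: str = ""
    value: str = ""


# Deliberately tiny subset of TOML: `[table]` headers and `key = "string"` pairs.
TABLE = re.compile(r'\[([A-Za-z0-9_.-]+)\]')
PAIR = re.compile(r'([A-Za-z0-9_-]+)\s*=\s*"([^"]*)"')


def parse(text: str) -> List[Item]:
    items = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        start = len(raw) - len(raw.lstrip())
        end = len(raw.rstrip())
        body = raw[start:end]
        if not body or body.startswith("#"):
            continue  # blank lines and comments carry no items in this sketch
        table = TABLE.fullmatch(body)
        pair = PAIR.fullmatch(body)
        if table:
            items.append(Item("table", lineno, start, end, key=table.group(1)))
        elif pair:
            items.append(Item("pair", lineno, start, end,
                              key=pair.group(1), value=pair.group(2)))
        else:
            # Error recovery: record where the problem is and keep going.
            items.append(Item("error", lineno, start, end))
    return items


source = "\n".join([
    'name = "Demo"',
    'version =',          # malformed: the value is missing
    '[compat]',
    'julia = "1.10"',
])

items = parse(source)
for item in items:
    print(item.kind, item.line, (item.start, item.end), item.key, item.value)
```

The point for an editor is that the broken `version` line still yields a precise span for a diagnostic, while the `[compat]` section after it remains usable.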
I don’t know how far requirement gathering has come, but maybe one thing that might give a little guidance is: try to navigate a large, unfamiliar codebase, spanning several packages. Try to figure out what is being called etc.
I have mixed success with things like “Go to Definition”. Just now, I tried it on my own package within a file in tests/ and it did not find anything for any of the exported names of the actual package.