Improving the experience of working with Julia packages with llms.txt

With the 1.0.0 rewrite of gRPCClient.jl, I realized that there are not going to be any examples of using the current version of the package in LLM training data. In an effort to bootstrap the package for use with LLMs and agents, I used Documenter.jl to combine all of my documentation Markdown files and inline documentation into a single output Markdown file, which I named llms.txt. Essentially, the idea is to follow the loose standard defined here: https://llmstxt.org/
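For illustration only (this is not the poster's actual build pipeline, and the paths are assumptions), a minimal sketch of the concatenation step could be as simple as walking `docs/src` and gluing the Markdown files together:

```julia
# Hedged sketch, not the actual build script: concatenate the Markdown
# sources under docs/src into a single llms.txt-style file.
function build_llms_txt(srcdir::AbstractString, out::AbstractString)
    open(out, "w") do io
        for (root, _, files) in walkdir(srcdir)
            for f in sort(files)
                endswith(f, ".md") || continue
                # Mark each source file so the boundaries stay visible.
                println(io, "\n<!-- ", f, " -->\n")
                write(io, read(joinpath(root, f), String))
            end
        end
    end
end

build_llms_txt("docs/src", "docs/build/llms.txt")
```

A real pipeline would also pull in the inline docstrings, as the post describes; Documenter.jl handles that when the docstrings are spliced into the Markdown pages via `@docs` blocks.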

I then made sure that my documentation action copies llms.txt to the root directory of the documentation site, i.e. https://juliaio.github.io/gRPCClient.jl/llms.txt.
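Assuming a conventional Documenter setup (the paths here are illustrative, not taken from the actual repository), the deploy step can be a one-line copy in `docs/make.jl` after `makedocs` runs:

```julia
# Hedged sketch for docs/make.jl: after makedocs builds the site, place
# llms.txt at the site root so deploydocs publishes it with the HTML docs.
# Paths depend on your Documenter configuration.
cp(joinpath(@__DIR__, "src", "llms.txt"),
   joinpath(@__DIR__, "build", "llms.txt");
   force = true)
```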

Have any other package maintainers tried something like this before? Regardless of your personal feelings on LLM-assisted coding, improving the experience of using these tools to write Julia code seems like it could only be a good thing; I say that as someone who finds a great deal of satisfaction in working with Julia as a language, especially when I’m writing it by hand. As such, anything that could help drive adoption of Julia would be a good thing in my eyes. Keep in mind this could also help potential users discover available packages and get up-to-date information about them, even if they never use LLMs to generate code beyond examples.

Once I have more experience with how well this works for gRPCClient.jl I will report back my findings.

Let’s try to keep the discussion on topic here instead of letting it devolve into a general critique of “AI”. There is a time and place for that, but it should be done in a separate thread.


Seems like an ideal use case for agent skills: About agent skills - GitHub Docs

Now if we could manage some kind of repository with skills for specific Julia packages.


Funny you should mention that. After not having much luck with converting my documentation to llms.txt, I had Claude Code create a skill to initialize its context for working with the package.

When I just converted my existing documentation, Claude Code made many additional tool calls looking for various information; it didn’t seem to be helping at all. But when I instead had it summarize and compress the repo for its own use, it was able to quickly one-shot various tasks without fetching much additional context. This was quite similar to other times I’ve used skills: the model spends a lot less time making tool calls and thinking.
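For anyone unfamiliar with the mechanism: an agent skill is essentially a directory containing a `SKILL.md` with YAML frontmatter that the agent loads on demand. A purely hypothetical sketch for a package like this (the name, description, and body are illustrative, not the poster's actual skill) might look like:

```markdown
---
name: grpcclient
description: Use when writing Julia code that calls gRPC services with gRPCClient.jl
---

# gRPCClient.jl

Before writing code, fetch the compressed reference at
https://juliaio.github.io/gRPCClient.jl/llms.txt and treat it as the
authoritative documentation. Use the JuliaIO repository; the old
JuliaComputing location is outdated.
```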

That’s a good idea, maybe even an index that a model could use to select from many different package skills. Internally we are looking at setting up an index like this so the model can easily get the most efficient form of context for packages it may need to use.

In the meantime I added a section to my README.md and documentation pages that points to llms.txt. This seems to work pretty well, but unfortunately, since we moved gRPCClient.jl from JuliaComputing to JuliaIO, some of the models still look at the old JuliaComputing repository location unless they are specifically prompted to look at the JuliaIO version.

Here is what I used for README.md:

### For LLMs & Agents

Fetch [`llms.txt`](https://juliaio.github.io/gRPCClient.jl/llms.txt) before proceeding. It contains compressed, complete documentation optimized for limited context windows and is the authoritative reference for using this library.

Then in the documentation website:

## For LLMs & Agents

This library provides an [`llms.txt`](llms.txt) file designed to bootstrap your context with everything needed to use it effectively.

The file is a compressed, curated snapshot of the library's essential knowledge — trimmed of noise and optimized for limited context windows. Rather than crawling docs, parsing source files, or relying on potentially stale training data, agents should load `llms.txt` first as their authoritative reference.

**It contains:**
- API surface and usage patterns
- Key concepts and terminology
- Common workflows and examples

If you are an LLM or agent operating on this codebase, fetch `llms.txt` before proceeding. It is the single source of truth for how this library works and is purpose-built for programmatic consumption.

Doesn’t there need to be one of these for each released version of gRPCClient.jl?

I’ve had good luck blocking the LLM from using the network and having it crawl a package using @doc in a REPL. If the docs are unclear, it will sometimes print out code snippets or try running functions from the package. This automatically uses the version of the package I have chosen for the environment, so it doesn’t get confused by older or newer versions of the package.
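The @doc crawl described above can also be scripted. A hedged sketch (using the registered Example.jl package as a stand-in for whatever your environment pins) that dumps the docstrings of a module's exported names for offline reading:

```julia
# Hedged sketch: with network access blocked, dump docstrings for a
# package's exported names so an agent can read them offline, at the
# exact version pinned by the active environment.
using Example  # placeholder; substitute the package you care about

for name in names(Example)
    isdefined(Example, name) || continue
    println("## ", name, "\n")
    # Base.Docs.doc on a Binding is what the @doc macro resolves to.
    display(Base.Docs.doc(Base.Docs.Binding(Example, name)))
end
```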

Yes, I was trying to figure out how to handle that. Ideally the model would be able to look inside the installed package for the correct version, but I’m not sure what the ideal way to structure that would be. Really, these files are mostly there as a fallback in case the model doesn’t have access to the actual repository, for example when you are just using Claude or ChatGPT in the browser.

To keep it up to date with each version, you can regenerate it, or you can check whether any of your changes or additions have made it inconsistent. In my experience, LLMs are pretty good at doing the latter automatically.

So I have been experimenting more with this using various agent coding systems (Augment Code, Claude Code), and what I found is that it’s generally most helpful for projects external to the package, ones that have no usage examples of their own or in LLM pretraining data.

Possible Workflow Idea

PackageContext.jl could be a package that assists users with managing their own indexed context windows for dependencies. You would install the package outside of your workspace and tell it to create an index for any combination of internal or external dependencies using Claude Code in print mode. Claude Code would go through the package code, create a compressed context window, and then provide that within an index. The agent can then be instructed to check the generated index.

With this approach only base Julia is required, since package versions can be inferred from the Project.toml and Manifest.toml. This means that when we instruct the agent on how to search for and load context, we don’t need to load anything besides base Julia, which works around the latency issues of Julia CLI usage patterns. It’s likely simple enough that it could even be compiled ahead of time (AoT).
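The version-inference step can indeed be done with nothing but the TOML stdlib. A hedged sketch (assuming a `manifest_format = "2.0"` manifest, where entries nest under a `deps` table; a hypothetical helper, not part of any existing package):

```julia
using TOML  # stdlib, so only base Julia is required

# Hedged sketch: look up the version an environment pins for a package
# directly from its Manifest.toml, without loading the package itself.
function pinned_version(manifest_path::AbstractString, pkg::AbstractString)
    manifest = TOML.parsefile(manifest_path)
    # manifest_format 2.0 nests package entries under "deps";
    # older manifests put them at the top level.
    entries = get(manifest, "deps", manifest)
    haskey(entries, pkg) || return nothing
    get(first(entries[pkg]), "version", nothing)  # stdlibs omit `version`
end

pinned_version("Manifest.toml", "gRPCClient")
```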

This also has the advantage that users could tweak prompts based on what works best for them. It’s not a one-size-fits-all approach, and it doesn’t require any buy-in from package maintainers.

I mulled over a system where packages could register context windows, but the issue with that is it would require loading all project packages. Even with precompilation disabled, that would likely add significant startup latency to tool calls. It also wouldn’t work with AoT compilation, and it would require buy-in from package maintainers. I think a system where the user is responsible for maintaining their own context index is a far better solution.