Various cloud and local LLM services allow users to upload documents before asking questions about them. If a package manual can be provided as a single webpage or PDF file (small enough to fit in the LLM's context window), then you can ask questions about using the package even when the up-to-date package is otherwise absent from the LLM's training data.
P.S. Maybe I just need a tool that crawls through doc pages and combines them into a single document.
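For illustration, here is a minimal sketch of such a crawler, assuming a hand-picked list of page URLs (the URLs and output file name are made up) and using only the Downloads standard library with naive regex-based tag stripping; a real tool would parse the HTML properly and discover pages from the site's navigation:

using Downloads

# Pages to combine; in practice these would be discovered by crawling.
pages = [
    "https://example.org/MyPackage/stable/",
    "https://example.org/MyPackage/stable/api/",
]

open("combined_docs.txt", "w") do out
    for url in pages
        html = read(Downloads.download(url), String)
        # Drop scripts and styles, then strip the remaining tags (naive).
        text = replace(html,
            r"<script.*?</script>"s => "",
            r"<style.*?</style>"s => "",
            r"<[^>]+>" => " ")
        println(out, "# Source: ", url, "\n")
        println(out, text, "\n")
    end
end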
I don’t think you are likely to find many, if any, authors willing to do it for you. I definitely wouldn’t, just so a user can use an LLM. If you want a PDF, you can generate one yourself by playing with the Other Output Formats page of the Documenter.jl manual and editing the package’s docs/make.jl file. How good that PDF will turn out without any extra effort is a different question.
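For the curious, a sketch of what that change to docs/make.jl looks like; MyPackage is a placeholder, and the real sitename, modules, and pages come from the package's existing make.jl. The LaTeX backend needs a local LaTeX toolchain (Documenter also accepts Documenter.LaTeX(platform = "docker") to avoid installing one):

using Documenter
using MyPackage

makedocs(
    sitename = "MyPackage",
    modules  = [MyPackage],
    format   = Documenter.LaTeX(),  # LaTeX/PDF output instead of the default HTML
)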
IMO this could be a pretty good idea (though of course package authors are under no obligation whatsoever if they don’t want to do it),
but I suspect some of the tone of the responses so far is driven by a generic hostility to AI-assisted coding workflows, and, like it or not, I don’t think these workflows are going to become any less popular anytime soon.
I've just been playing with LLMs for fun lately; I definitely didn’t mean to sound entitled. I tried running a downsized version of DeepSeek-R1 with 14B parameters on my laptop. It was able to produce correct Julia code for this simple question on the first try: “Write Julia code to print out the total size of the 10 largest files in the current directory.” Surely a very basic task, but still impressive that it can be solved by a local model on my meager GPU with 8GB of RAM. The code produced is here:
files = []
for f in readdir(".")
    if isfile(f)
        try
            size = stat(f).size
            push!(files, (f, size))
        catch e
            continue
        end
    end
end
sorted_files = sort(files, by = x -> -x[2])
top_10 = first(sorted_files, 10)
total_size = sum(x[2] for x in top_10)
println("Total size of the top 10 largest files: $(total_size) bytes")
I think that making documentation in a format that AI can easily digest, and accompanying it with data sets for fine-tuning AI models, will soon become good practice. Packages without this stuff will not survive the competition.
Something’s amiss here. If an artificial intelligence requires specially crafted data, it doesn’t sound as intelligent as it’s made out to be. Or put another way, AI with such requirements will not survive the competition.
Looks like I had missed the web search features available in both ChatGPT and local LLMs, which can easily descend into the subsections of doc websites. I just found out how to enable web searches in my local LLM deployment (Ollama + OpenWebUI). Here’s a screenshot, running Meta’s Llama3.2 3B model to inquire about a recently created Julia package. Both the doc site and the Discourse announcement page were read by the LLM before an answer was produced.
I don’t think it read beyond the top-level pages of the 6 websites listed explicitly in the screenshot. I can probably tweak the settings to increase the number of pages the LLM reads, which will help if the individual doc pages appear among the top results returned by the search engine. I can also ask specific questions, e.g. about a particular function in the package, to guide the LLM to search for the specific doc page.
The way to do this is not to have every package author create specialized documents to upload to LLMs, but to make a separate tool that takes the typical form in which package authors publish their documentation and generates a single file from it.
Exactly. Not only that, but this doesn’t sound too far off either. Documenter.jl is the standard tool most Julia packages use for their docs. The Documenter.jl team has put a lot of effort into making it modular so that it can accommodate various plug-ins, many of which are already available “on the market”, like DocumenterCitations.jl. As the whole infrastructure is well standardized, it really isn’t much of a stretch for someone to make a conversion tool.
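As a proof of concept, here is a minimal sketch of such a conversion tool, under the assumption that it runs on a checked-out package repository and that simply concatenating the Markdown sources under docs/src is good enough for an LLM; the function name and paths are made up, and a real tool would follow the page order defined in make.jl:

# Combine all Markdown sources of a Documenter-based docs folder into one file.
function combine_docs(pkgdir::AbstractString, outfile::AbstractString)
    srcdir = joinpath(pkgdir, "docs", "src")
    open(outfile, "w") do out
        for (root, _, files) in walkdir(srcdir)
            for f in sort(files)
                endswith(f, ".md") || continue
                path = joinpath(root, f)
                # Record which page the following text came from.
                println(out, "\n<!-- ", relpath(path, srcdir), " -->\n")
                write(out, read(path, String))
            end
        end
    end
end

combine_docs("MyPackage.jl", "MyPackage-docs.md")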
Most likely, the people who use AI the most can come together and make a nice package!