Improving LLM-generated Julia code, especially for Makie visualizations

garazha-ilya · May 22, 2026, 9:41am

Hi everyone,

I’d like to ask for advice on how to improve the quality of Julia code generated by LLMs, and to spark a discussion about what the community could do to make LLMs more useful for Julia tasks.

What I’ve tried so far
I wanted to use a local LLM to create plots with Makie. To give it better context, I downloaded the entire Makie documentation and set up a RAG system using AnythingLLM + Gemma 4 MoE. Unfortunately, that didn’t help much – the generated code still looked very “Python-like” and rarely worked out of the box.

The frustration
Many AI coding agents, when asked to visualize some abstraction, produce Python or JavaScript code that actually runs. It’s clear that these models have been trained (or fine-tuned) on enough examples that they can even self-correct when they hit a problem, resulting in a working solution right away. I’d love to see something similar happen for Julia – ideally a tool that can generate ready-to-use Pluto or Bonito notebooks, or at least produce working scripts for Makie visualizations.

Question for the community
What would it take to get there? Could we create a large, high-quality public dataset of Julia visualization code (and more generally, idiomatic Julia examples) and fine-tune a local LLM on it? Are there already efforts in this direction that I could contribute to? I’d really appreciate any suggestions, experiences, or pointers to existing projects.

I have already read similar topics, but maybe it’s time for an update?

Thanks in advance!

adienes · May 22, 2026, 12:03pm

this might be an annoying answer (because it is an expensive one), but the biggest problem for you is probably the underpowered choice of tool. The SOTA agents (Claude Opus 4.7, Codex 5.5) are much much better. especially with thinking turned to max, I already get quite good Julia code out of them.

but maybe the topic “[Help Wanted] Help contribute test cases to improve LLM performance on Julia code” will interest you, if you have particular workloads that agents have really struggled with?

technocrat · May 22, 2026, 3:09pm

I think you are right about the dominance of Python in the training data of most models. Getting rid of that bias would probably require training a model from scratch on a Julia-only corpus, say the entire registry. While that’s possible in theory, very few of us have the near-petabyte storage potentially required.

So the task becomes how to prompt so that the model recursively translates its initial output into idiomatic Julia. That’s something I’d be happy to work with you on using my own local models. Can we start with a few of your MWEs?
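
As a rough illustration of such a translate-and-repair loop, here is a minimal Python sketch. Both `generate` and `check` are hypothetical callables, not part of any library: `generate` stands in for a call to a local model, and `check` for whatever validation you prefer (for example, running the script under Julia). The prompt wording is only an assumption about what might work.

```python
from typing import Callable, Optional, Tuple


def refine(
    task: str,
    generate: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Ask for code, then feed each failure back until `check` passes.

    `generate(prompt)` returns candidate code (hypothetical model call).
    `check(code)` returns None on success or an error message on failure.
    Returns the last candidate and whether it passed.
    """
    prompt = task
    code = ""
    for _ in range(max_rounds):
        code = generate(prompt)
        error = check(code)
        if error is None:
            return code, True
        # Put the previous attempt and its error into the next prompt.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{code}\n\n"
            f"It failed with:\n{error}\n\n"
            "Rewrite it as idiomatic Julia and fix the error."
        )
    return code, False
```

Capping the number of rounds keeps a model that never converges from looping forever; the caller still gets the last attempt back for inspection.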

madppiper · May 22, 2026, 3:17pm

Out of curiosity, has anyone published some Claude Code skills yet? Looking at the MCP developers, this looks like a natural progression.

sdanisch · May 22, 2026, 4:35pm

I think the biggest thing you can do is to let it execute code and look up docs!
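
To make the "let it execute code" part concrete, here is a minimal sketch of a harness in Python. It writes generated code to a temporary file, runs it with a command you supply, and returns the error text so it can be handed back to the model. The function name `run_script` and its parameters are made up for this illustration; the doc-lookup side is not covered.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def run_script(code: str, command: List[str], suffix: str = ".jl",
               timeout: float = 300.0) -> Optional[str]:
    """Run `code` with `command`; return None on success, else the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"snippet{suffix}"
        path.write_text(code, encoding="utf-8")
        try:
            # The script path is appended as the last argument of the command.
            proc = subprocess.run(
                command + [str(path)],
                capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return f"timed out after {timeout} seconds"
    if proc.returncode == 0:
        return None
    # Prefer stderr (where stack traces usually go); fall back to the exit code.
    return proc.stderr.strip() or f"exit code {proc.returncode}"
```

Assuming `julia` is on the PATH, `run_script(code, ["julia"])` would run the snippet as a script and return the stack trace if it fails. Note that each call pays Julia's startup and compilation cost again.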

Satvik · May 22, 2026, 5:03pm

I’ve published a (not very reviewed, buyer beware) skill for letting Claude use a Julia REPL via tmux or zellij. This lets it iterate a lot faster, because it can avoid startup & compilation time. The repo is Satvik/julia-repl-skill on GitHub.

I use several more custom skills, but they’re pretty specific to my work, e.g. “how to convert Julia functions of vectors to vectors into our time series framework.” If I do find anything else that seems generalizable, I’ll toss it up on the repo as well.

madppiper · May 22, 2026, 5:05pm

That is awesome - I’ll check it out!

greatpet · May 23, 2026, 10:02am

I’ve always been curious about this, but I’m not aware of any such successful effort for a niche open source programming language (Julia or something else), so I suspect it must be hard and costly. From what little I’ve read, fine-tuning is mainly for adjusting tone and style. To inject new domain-specific knowledge into a local LLM, you need “continued pretraining”, a much heavier procedure involving over a billion training tokens.

