A new LLM benchmark for Julia programming

arjunguha · May 21, 2025, 1:38am

I know some people in this community are interested in seeing LLMs get better at Julia. But you can’t make progress in machine learning without a good benchmark.

We have started work on a new LLM benchmark that supports Julia. It is very early, but I think it is already much higher quality than prior efforts (including my own prior work on MultiPL-E). Moreover, I think the benchmarking methodology makes it particularly easy to add new problems. The latter is really important, because writing a good benchmark is painful!

There is more information in the repository readme, including some preliminary results. If others are interested, I’d be happy to work together. I’m hopeful this will be a useful community resource.
