[ANN] Julia LLM Leaderboard - Help us make it more relevant for everyday problems!

svilupp · February 23, 2024, 6:11pm

I can’t wait to try this one: Phind

alfaromartino · February 23, 2024, 6:51pm

Phind didn’t pass my usual test (in fact, no LLM has passed it so far). I always ask the question “in Julia, do slices behave as copies or views?”, and it gave the wrong answer.
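For reference, here is a minimal sketch of the behavior that question probes, using only Base Julia. The variable names are illustrative: a slice on the right-hand side makes a copy, `@view` shares memory with the parent, and a slice on the left-hand side of an assignment writes in place.

```julia
a = [1, 2, 3, 4]

# Indexing with a range on the right-hand side allocates a new array (a copy).
s = a[2:3]
s[1] = 99
@assert a == [1, 2, 3, 4]    # the parent array is unchanged

# A view shares memory with the parent array.
v = @view a[2:3]
v[1] = 99
@assert a == [1, 99, 3, 4]   # the write is visible through `a`

# A slice on the left-hand side of an assignment mutates `a` in place.
a[2:3] .= 0
@assert a == [1, 0, 0, 4]
```

So the answer depends on where the slice appears, which may be why a one-word "copies" or "views" reply tends to be wrong.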

svilupp · April 5, 2024, 8:48pm

It’s not yet a full release (there is a bit of work left still), but I wanted to share some preliminary findings about the new Anthropic models.

Turns out Claude-3 Haiku broke the value-for-money barrier previously owned by GPT-3.5-Turbo.

I’ve been really impressed by Claude Opus but the real star of the show here is Claude Haiku!
I paid $30 for the Opus evals, but only $0.40 for the Haiku evals!!!
(and Opus had crazy bad availability, I had to restart many times)

Btw, you can now also use Anthropic models for data extraction/function calling (released today). It’s available in v0.18 of PromptingTools, among other things.

What’s next? Exciting GSoC project (hopefully), new test cases, new models, and a new category of evals… A lot to look forward to.
