An LLM fine-tuned for Julia, call for comments + help

At a minimum you need to choose which model to fine-tune. Llama 3 is already outdated, and (I’ve only scanned the Slack thread up to that point) this one looked good:

Maybe this is useful? bigcode/starcoderbase · Hugging Face

Arctic LLM seems best now (its base model was updated 2 hours ago) and/or Phi-3 (for a small one, also new) would now be on my short-list. Also WaveCoder and its paper, for its “LLM-based Generator-Discriminator data process framework to generate diverse, high-quality instruction data from open source code”, and the hybrid Mamba/Transformer ajibawa-2023/Code-Jamba-v0.1 · Hugging Face, which is still the only “Julia”-tagged model on HF.
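If anyone wants to double-check that “only Julia-tagged model” claim, a quick Hub query should do it. Just a sketch, assuming the tag we care about is literally "julia" (adjust the filter if the tag differs):

```python
# Hypothetical check: list models carrying the "julia" tag on the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi()
for m in api.list_models(filter="julia", sort="downloads", direction=-1, limit=20):
    print(m.id, m.tags)
```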

It’s an open question whether it’s better to start with a larger model (presumably better, but not always), or whether larger models learn more slowly when fine-tuning simply because of their size. There’s also the risk of “catastrophic forgetting” with fine-tuning (for out-of-distribution data, so maybe start with a model that isn’t awful at Julia?), though forgetting the other languages, Python etc., may not be a bad thing…
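For what it’s worth, one common way to limit that forgetting (an assumption on my part, not something decided in this thread) is parameter-efficient fine-tuning such as LoRA, since the base weights stay frozen and only small adapter matrices are trained. A rough sketch, where the base model id and the julia_corpus/*.jl path are placeholders:

```python
# Sketch of LoRA fine-tuning on a Julia corpus; model id and dataset path are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "microsoft/Phi-3-mini-4k-instruct"   # example base model, not a decision
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:             # make sure batches can be padded
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Only the small LoRA adapter matrices are trained; the frozen base weights are
# what limit the forgetting.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules="all-linear",
                                         task_type="CAUSAL_LM"))

# Placeholder corpus: a directory of .jl files treated as plain text.
ds = load_dataset("text", data_files={"train": "julia_corpus/*.jl"})["train"]
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="phi3-julia-lora",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```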

One other thing that might rule out models is that the tokenizer is fixed; some recent ones have only about 30,000 possible tokens, and I’m wondering whether they cover the Unicode that Julia allows. We want all the math operators supported/supportable…
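A candidate’s tokenizer is easy to inspect before committing to it. A small sketch (Phi-3 used only as an example repo id, anything on the short-list works the same way) that shows how Julia’s Unicode operators get split:

```python
# How many tokens does a candidate tokenizer spend on Julia's Unicode operators?
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")  # example repo id

snippets = [
    "x ∈ Set([1, 2, 3])",   # \in
    "a ≈ b",                # isapprox
    "h = f ∘ g",            # function composition
    "θ = π / 4; √θ",        # Greek identifiers, sqrt
    "a ⊻ b",                # xor
]

for s in snippets:
    ids = tok.encode(s, add_special_tokens=False)
    print(f"{len(ids):2d} tokens  {s!r}  ->  {tok.convert_ids_to_tokens(ids)}")
```

Even when these operators aren’t single tokens, tokenizers with byte fallback can still represent them, just at several tokens per character, which matters for cost and quality on math-heavy Julia code.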