Are there efforts to improve ChatGPT for Julia code?

Is there a genuine effort to work with OpenAI on updating the training for ChatGPT? It’s working flawlessly for R and sed/awk bash scripts I’m testing. However, since Julia has developed so rapidly it’s giving me maybe 50-75% correct answers and 25-50% wrong answers. I just asked one on dealing with missing values, I copied and pasted, and it failed… I’ve had this a lot over the last few weeks and don’t know if the Julia devs are working with OpenAI to solve this or not.

If I ‘thumbs down’ in ChatGPT does that help us? Or if I provide the correct answer to it later?

Thanks in advance.

4 Likes

I don’t believe a formal relationship between Julia devs and OpenAI exists at this time, but @logankilpatrick works for OpenAI (separate from his volunteer capacity in the Julia community) and has asked for Julia users to provide feedback for correct/incorrect code snippets using thumbs up/down.

4 Likes

Thank you, this is good to know. It’s just been a little frustrating and would save us newbies a lot of time to at least develop some code initially as we learn.

I would be wary of copying, pasting and running code in whatever language before learning it, let alone for bash and things like sed!

Also, I would not like efforts of Julia devs to be used towards such a goal thinking about Julia beginners. How would someone guarantee things like good practices and performance tips are encouraged? I have not seen clear evidence that ChatGPT should be used to teach something or provide insights about technical knowledge.

Another point: One of the nicest things about this forum is that real julia experts and core developers share insight with users trying to learn. I would not want low-effort questions becoming widespread and making everything messier. Something like: How to make this [random thing generated by ChatGPT] faster? where the user didn’t really try and now requires other users to spend time understanding the reasoning behind some code generated by AI seems like a bad way of using everybody’s time.

19 Likes

A convention of putting a comment such as:

# last developed on Julia 1.8.3

or

# last tested on Julia 1.9.0-beta4

at top of code, might help the LLM context window understand the various versions and crucially learn to deprecate old code and plagiarize… eh… intuit newer working code.
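As a rough sketch of how such a header could look in practice, the comment can be paired with a run-time check against Julia's built-in `VERSION` constant. The name `MIN_JULIA`, the helper `version_ok`, and the chosen minimum are assumptions made for illustration, not an established convention; for packages, the `[compat]` section of `Project.toml` is the usual place to declare supported versions.

```julia
# last tested on Julia 1.9.0
# Hypothetical file header: the comment above documents the version for readers,
# and the check below warns at run time when an older Julia is used.

const MIN_JULIA = v"1.8.3"  # assumed minimum, borrowed from the example comment

# VERSION is a built-in VersionNumber, so versions can be compared directly.
version_ok(v::VersionNumber = VERSION) = v >= MIN_JULIA

version_ok() || @warn "Developed on Julia $MIN_JULIA or newer; running on $VERSION"
```

Pre-release versions order as expected here: `v"1.9.0-beta4"` compares greater than `v"1.8.3"` and less than `v"1.9.0"`.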

4 Likes

ahh, this is a great idea! I like this idea, hopefully that would help. I know others posted in the other post that it was struggling with packages (which I also observed) so hopefully packages can use this convention to improve performance in the future. I’ve just been struggling a lot with Julia vs other languages.

I guess I don’t understand this mindset. Why don’t you care about people learning Julia as beginners? We are all beginners at some point no matter who you are. If people come to Julia and ChatGPT doesn’t work, people move to other languages, feels like if you care about Julia you should care that ChatGPT (and likely other AI chatbots) is generating code that doesn’t work.

To your last point, it doesn’t make any sense. People wouldn’t have to ask questions on Slack and Discourse that are basic because there is no documentation for it in Julia yet. If ChatGPT can learn and spread this knowledge without, we can write our own documentation without having to bother devs and others on discourse, which leaves a ton more time for devs to work on their code and not answering very basic questions about coding… So seems counterintuitive to have them spending useless hours on things ChatGPT could replace easily. Also, I don’t know why you assume it would be terrible code, if it’s trained on good performing code, there would be no need to re-write all this code…

I guess we just disagree here.

2 Likes

Errr, this is not at all what I said. I am happy with disagreeing with you, but no need to put words in my mouth.

I do care about beginners, that is why I spend a lot of time here since I was one (maybe I am one still) and try to get people in my department to adopt it by offering help. What I am not sure about is that ChatGPT is a good tool for teaching. This other sentence is a very strong one that I think is not substantiated: “If people come to Julia and ChatGPT doesn’t work, people move to other languages, feels like if you care about Julia you should care that ChatGPT (and likely other AI chatbots) is generating code that doesn’t work.” But this is not a hill I am looking to die on, especially because it was a request for core devs I think?

4 Likes

ChatGPT is good at lying. Is that what we want in our code? Confabulations and gaslighting?
Example: ChatGPT Has All the Answers – But Not Always the Right Ones | Engineering.com

5 Likes

I’m sorry, I guess that’s how I read your post.

I don’t care that most or even many do, just that at least 1 (maybe core) developer is working with OpenAI to find solutions to this problem. I just find it fairly significant that I can get all sorts of correct answers in R, sed, awk, even PowerShell that are working fine, yet my Julia code I’m trying to utilize is maybe 50/50. It’s just quite frustrating for new users.

I don’t think it’s a good idea to learn a new language with ChatGPT, but if teaching students R has taught me anything, it’s that students will 100% take the path of least resistance and just ask ChatGPT first (as we gain brand new students). And if I had a choice, a student may leave Julia since it won’t be solved with much of the incorrect code. It’s just easier to stick with R or another language that may be generating correct code (for example). The barrier is greater for Julia without these AI models. I only thought it would help if someone pointed out to them all the errors we are getting as users. I’ve downgraded but still don’t get useful answers to many questions yet.

3 Likes

I think the motivation

“Let’s put effort in ChatGPT, so students who use it get results with less understanding of what they are doing” is at least the wrong motivation here.

Learning is sometimes frustrating, and I remember your thread about your frustration for sure. But a goal should definitely not be to improve how students can avoid learning something.

For me personally, I would not help in a forum post that starts with code from ChatGPT without a user’s own efforts, because to me that shows a huge lack of interest, asking then the forum (or me as a replying user) to just solve their overall problem without them spending effort. That is not how a forum should work (unless I get money for that and even then I would probably not take such a job).

Teaching – and this forum – is about understanding Julia and how to code. Learning is sometimes frustrating until something is understood for sure.

For me, the development of ChatGPT is far secondary, because first ChatGPT itself has to improve on quite a few things: copyright issues, providing sources where the information they present comes from to improve reliability. These are far larger problems in a new machine-based interaction and problem solving approach than “currently asking for X gives better answers than asking for Y”.

20 Likes

Again, I’m not sure why some of you think I’m suggesting everyone drops everything to work on this. It would simply be maybe one or few important people from Julia work closely or regularly to update whatever they are training on, or figure out why they are training on old solutions and not the newest code. I’m not sure if they just pull Julia 0.7 or old 1.0 manuals maybe… I only asked Stefan if they were aware as I presumed that you guys who are experts are not going to ChatGPT, while new students and people learning Julia will automatically go to google or now ChatGPT to learn and ask questions. It’s not the best solution but I’ve been around new students in our field and it’s still how 95% (in our field) of people learn to code (google or now AI options).

Many of us are embarrassed when we spend 3 hours trying to solve a ‘simple’ problem with no documentation to be found online or not knowing what to google. Julia is far far more difficult than even R. I’ve learned several different languages and it’s not even close. Nothing is compatible in Julia and nothing ‘just works’. Just the other day I had Union{Missing, Float64}, but nope UnicodePlots doesn’t recognize this as a thing lol… For a new user coming from R where these things are not really an issue (often, maybe not this example). Dealing with a million different types and getting things to work in Julia is very difficult coming from R where I can pretty much do as.data.frame() on anything lol. So our options are 1) give up Julia since R will work, just be slow (I’ve done this a lot myself…) 2) google 3) ask Discourse/slack which is a waste of your guys time if we can use #4 or 4) try ChatGPT or other AI.
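For the `Union{Missing, Float64}` situation described above, here is a minimal sketch using only functions from Base Julia. The vector `x` is made-up toy data, and whether a particular plotting package accepts the converted result is something to confirm in that package's documentation.

```julia
# Toy data standing in for a column with gaps (not taken from the original post).
x = [1.5, missing, 3.0, missing, 4.5]   # eltype is Union{Missing, Float64}

# Option 1: drop the missing entries; collect returns a plain Vector{Float64}.
x_dropped = collect(skipmissing(x))

# Option 2: replace each missing with a placeholder (NaN here), keeping the length.
x_filled = coalesce.(x, NaN)

# Reductions can skip missing values without making a copy.
total = sum(skipmissing(x))
```

Dropping changes the length of the vector, so it only suits cases where positions do not need to line up with another column; filling keeps the positions intact.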

So I just thought ChatGPT could really cut down this learning time for new users and wanted you all to be aware of how bad it was (assuming you guys haven’t used it). Hopefully OpenAI cares, but thought it wouldn’t hurt to ask here. I myself can reach out to them if no one else cares to, just wanted to know. It sounds like Julia, or at least Discourse, doesn’t seem to be aware or care much.

I’m the first one to suggest not learning by googling or ChatGPT, I only bring it up because I do get exhausted searching online and don’t find solutions quickly or at all so I now start with ChatGPT and move to docs if I can find them. Even if I read the docs it’s so difficult to remember a million different things when you start learning.

It really sounds like you guys are anti ChatGPT, which is fine. I can try to reach out to OpenAI but I’m not really sure what to tell them as I’m no expert, that’s why I posted here hoping someone who knew all the resources better could help with them. Or maybe there are other AI bots that are much better, I can try some of those.

1 Like

I’m not anti-gpt or code gen. I use Tabnine assistant in my code editor. I use ChatGPT frequently.

Technically though, it’s going to be very tough to get a GPT model to produce correct code for languages like Julia. Because a transformer model does not have a concept of symbolics (it just sees token patterns), it has to just rely on piles of examples. Julia’s code base is much smaller than R or Python, and it has tons of unique sublanguages (Turing, Flux, JuMP, special chaining macros, etc) which limit the cross applicability from one domain to another. My guess is that domain-specific sublanguages in other languages would suffer from similar suggestion problems. For example, asking for ChatGPT to write an RStan model for you is going to give you problems.

It’s unclear to me what OpenAI could really do here to improve suggestions. Since its approach to text generation relies completely on distilling semantic structure from closely related examples, not understanding the algebraic/abstract relationship between symbols, correct code generation is going to be pretty tough for any case where you are writing something new or in a less popular language. And trustworthy code generation is going to be impossible in any language (code can be frequently, yet coincidentally, correct, but not trustworthy).

Pretty great conversation about this on a recent Ezra Klein Podcast, if you’re interested.

2 Likes

The thing is, ChatGPT is only as good as its training data is. And, yes, one of the downsides to Julia is that there simply isn’t as much material online. With a cutoff in mid-2021, ChatGPT is missing nearly half of the time that Julia’s been 1.0+.

With a general language model, OpenAI has much bigger fish to fry than Julia’s accuracy. A more dedicated code model like Copilot is better in my experience (and significantly more likely to actually care).

So how do you make them better? The answer is simple: Ask and answer questions on SO. Ask and answer questions here. Write blogs. Write tutorials. Publish course notes.

5 Likes

Thanks Matt. I agree, I knew it was behind, I just was maybe thinking if they needed help with say a web scraping tool for Discourse that would help, or maybe they have already and just trained on old manuals.

I will try Copilot, thanks for that!

I was trying to write docs for animal breeders (my field) and animal scientists, but as I said, I was using ChatGPT to find a lot of good solutions, but then I got into missing values on a project and it gave me a bunch of bad results… So I hoped we could work on this if someone was interested to maybe help them or guide them or whatever is needed. I understand very little of how their models work or how their training process works, but someone knows a lot more than me. I sent them a bot request, we’ll see if they answer.

It is also worth noting that many popular packages in Julia only recently (if at all) reached a stable version 1.0. I don’t think syntax for the other languages you mentioned is changing as rapidly as Julia’s since Julia is so heavily dependent on the package ecosystem. Better AI answers will come as our packages mature and online content becomes dominated by post-1.0 solutions.

  • JuMP - 2022
  • DataFrames - 2021
  • Symbolics - 2021
  • Plots - 2020
  • Makie - not yet
  • Pluto - not yet

Yes, this is the main issue. I know Julia went through the 1.0 transition recently and yes packages develop rapidly after that. All fine, just bringing it to people’s attention after I got stuck in rabbit holes with it recently on some tasks for a project (work with large genotype files I need Julia for).

My only point in all of this was to see if OpenAI say needed help scraping code or access to other training data that the Julia community could provide to it. Perhaps maybe we need to document our code differently so their language model can pick up on answers better? I don’t know… I just threw it out there so people were aware and discussing as I assumed many of you here were not using ChatGPT and others because usually you guys are experts while new users like myself need to rely on docs, google, and now ChatGPT (among others). I don’t mind using discourse but sometimes it still takes a few hours to get a one line answer where ChatGPT has got me solutions in 10 seconds and I can move on and get results to my boss.

2 Likes

So I did get a hold of Logan (@ in one of the top posts). He said he would work on this update before JuliaCon this year and is just busy at the moment. Seems like they are aware and will get to a solution soon. It will include Slack and other channels I guess. This should hopefully drastically improve the predictions and looking forward to it.

Thanks all to those who commented below.

To be clear, my intent would be to put out a fine tuned model that is better at Julia than the base model. I would also likely take all the Julia code and articles I could find and embed them for similarity search and such.
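To illustrate the general idea of embedding documents for similarity search, here is a toy sketch. It is not the planned implementation: the three-dimensional vectors are hand-written stand-ins for real embeddings, and `docs`, `doc_vecs`, `cosine_similarity`, and `top_k` are hypothetical names chosen for this example.

```julia
using LinearAlgebra  # standard library: dot, norm

# Hand-written toy "embeddings"; real ones would come from an embedding model
# and have far more dimensions.
docs = ["reading CSV files", "handling missing values", "plotting a histogram"]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.2, 0.9]]

# Cosine similarity: 1.0 means same direction, 0.0 means orthogonal.
cosine_similarity(a, b) = dot(a, b) / (norm(a) * norm(b))

# Return the indices of the k stored vectors most similar to the query vector.
function top_k(query_vec, vecs, k)
    scores = [cosine_similarity(query_vec, v) for v in vecs]
    return sortperm(scores; rev = true)[1:k]
end

query_vec = [0.2, 0.8, 0.1]   # pretend embedding of a question about missing data
best = top_k(query_vec, doc_vecs, 2)
println(docs[best])
```

In a question-answering setup, the best-matching documents would typically be handed to the language model as context alongside the user's question.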

This likely won’t make it upstream to the base model (like Matt said, bigger fish to fry right now), but some website / feature here where people can ask questions about Julia and get really great answers seems worth a try. I have something submitted for JuliaCon so that will keep me on track with this. In the meantime, keep giving feedback via the thumbs up thumbs down feature!

And note, this likely won’t include Slack data. There’s great stuff there, but people think we made some promise that the data won’t be made public in any way (even though no one made that promise), so I would stay away from that.

15 Likes

Thanks Logan!

Any other suggestions for guidance for us in general? Someone suggested we put ‘last tested on Julia v1.8.5’ or something in the files we post on GitHub? Is that helpful? Or how about comments for each line of code? Does that help the training data? I’m quite ignorant on how these language models or whatever other models work. If we had better guidelines we could maybe help improve the predictions. (I’m posting linear model code for instance for statisticians and would like to know the best route to help others).

Or biggest picture, do we need more blog posts, help files, general documentation from packages, etc we can work on?