An article by Doug Eadline on a paper that discusses ChatGPT-generated code:
"ChatGTP performs the best on Julia, with 81.5% of generated code being successfully executed, and performs the worst on C++, with only 7.3% of the executions being successful. "
This may already have been discussed, so please forgive me if so.
Referring to @ChrisRackauckas:
"While working with a numerical differential package, he noted that the Python code had a larger training data set. Still, not everyone in that training data seemed to know enough of the details of numerical differential equations to create trustworthy code. "
I guess that is well worth saying: if the training data comes from experts of a certain level, you cannot expect the model to exceed that level of expertise.
Which leads to an interesting question: these models are really just linear algebra writ large.
Will we ever see them “intelligent” enough to reject training data that falls at the low end of the expertise curve?
Would even attempting that be “safe”? I guess this is straying far beyond the bounds of this discussion board.
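For what it's worth, here is a toy sketch of why people say "linear algebra writ large" (plain NumPy, made-up dimensions, not any real model's architecture): a transformer-style feed-forward block is just two matrix multiplications with a pointwise nonlinearity in between.

```python
# Toy illustration only: a single transformer-style feed-forward block,
# showing that the core computation is matrix multiplication.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 32                 # made-up dimensions for the sketch

x  = rng.normal(size=(1, d_model))        # one token embedding
W1 = rng.normal(size=(d_model, d_hidden)) # learned weights (here: random)
W2 = rng.normal(size=(d_hidden, d_model))

h = np.maximum(x @ W1, 0.0)  # linear map, then ReLU
y = h @ W2                   # linear map back to the model dimension
print(y.shape)               # (1, 8): a transformed token embedding
```

Everything "learned" lives in those weight matrices, which is what makes the question of filtering low-quality training data so hard to pose at the level of the model itself.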
Having worked briefly in deep learning applied to medical imaging, I can say the model there was rigorously trained on input from real, qualified radiologists. Training it on random members of the public would at best get you the same diagnostic accuracy as a random member of the public. Or maybe worse!
We’re definitely getting off topic, but with general text generation, you can already tell ChatGPT to respond to something “like an expert in [insert field]” and get better responses in return. I wonder if you can also prompt-engineer code generation…
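A minimal sketch of what that could look like through the API (the model name, prompts, and ODE task below are all placeholders of mine, not anything from the article): set the expert persona in the system message before asking for code.

```python
# Sketch only: the "expert persona" trick applied to code generation,
# using the OpenAI Python package's pre-1.0 ChatCompletion API
# (newer versions use a client object instead).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The system message sets the persona before any code is requested.
        {"role": "system",
         "content": "You are an expert in numerical differential equations "
                    "who writes idiomatic, well-tested Julia."},
        {"role": "user",
         "content": "Write a Julia function that integrates y' = -y with a "
                    "fixed-step RK4 method and returns the trajectory."},
    ],
    temperature=0.2,  # lower temperature tends to give more conservative code
)

print(response.choices[0].message.content)
```

Whether the persona actually raises code quality, as opposed to just changing the tone, is exactly the kind of thing the paper's execution-rate methodology could measure.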