I think that, overall, Julia's stateful approach might not fit LLMs so well. It causes a lot of problems, especially with environments and environment management.
I guess you’re at least partly referring to the issues discussed in that thread: Interactive prototyping workflows - #36 by kousu
Therefore, I keep the agent on quite a short leash: short tasks, frequent reviews within a session, and essentially just offloading tedious coding work but close to none of the software design. Is that the best way to use AI? Probably not.
As I mentioned, I chat with it, and gradually it's building my entire codebase. I also try to keep things under control, especially the performance-critical parts. What I find interesting about your package is that, as I understand it, it lets the agent be immediately aware of the results of its own advice. My initial hope was that instead of me having to test several iterations of its suggestions manually, the agent could do that in the background and present me with either a working solution or a few viable working options. Then I could refine the chosen solution further and, at the end, ask for performance improvements. At that stage the process would start again: it would test, iterate, and finally produce a version that I could accept.
Well, there's no need to register it; you can just do `julia> ] add/dev …` to try it out.
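For example, via the Pkg API (the URL and path below are just placeholders for wherever the package actually lives):

```julia
using Pkg

# Add the unregistered package straight from its repository
# (placeholder URL, substitute the real one):
Pkg.add(url = "https://github.com/username/ThePackage.jl")

# Or track a local checkout so your own edits are picked up immediately:
Pkg.develop(path = "/path/to/ThePackage.jl")
```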
Yes, I’m aware of that method. I just wanted to be polite. :- )
> Another feature that’s important to me is keeping a safe working version of a reference file, editing a copy, and updating the reference file only when I’m 100% satisfied with the changes.
Sounds like a problem that git fully solves. Put your files under version control and you can always safely revert any changes made by the AI.
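A minimal version of that workflow could look like this (the `src/` path is only illustrative):

```
# Snapshot the known-good state:
git init
git add src/
git commit -m "working reference version"

# ... let the agent edit ...

git diff           # inspect what it changed
git restore src/   # discard its edits if you're not happy with them
```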
Well, yes and no. For me, there's a difference between manually reverting incorrect changes and having a system that manages state and direction more intelligently.
EDIT: I just wanted to add that I wrote this as my best attempt at a generalization based on my experience with these models. Coding itself is much more complex, though. One thing I'd stand by is the importance of verifying the model's advice: manually testing incorrect suggestions (even though they're less common than a few months ago) is really a waste of time.