What's your AI programming tool stack?

Hey y’all! @tbeason suggested that I put up a little thread asking people about what misc AI tooling they actually use. I mostly just want to understand how people are using these tools in the ecosystem.

First – a note on general language model performance in Julia. @svilupp has a fantastic leaderboard for which LLMs tend to do better. At the moment Claude 3 Opus is the leader in Julia using the test specifications here. Test submissions are welcome!

My stack is a little clumsy right now because I haven’t quite dialed it in, but I use

  • Cursor as my editor. It’s a fork of VS Code with really good AI tooling. I do a lot of front-end stuff as well and it’s really nice to go to a new file, hit Ctrl + K, and ask for exactly the file I want. You can even provide links to docs, other files, etc. to use as context. Very good tool, I shelled out for the pro version. You get Claude Opus calls for code gen and chat.
  • Claude Pro for very long architectural discussions or lots of guided code discussions, mostly because I use a lot of calls and I like the UI. I also do lots of non-programming chats with Claude, though using Claude via Cursor is better. I just haven’t nailed down the workflow yet.
  • PromptingTools.jl for misc LLM calls, mostly using Ollama as a backend with OpenAI for paid calls. I would switch to Claude but I just haven’t set up the API key and I am lazy.
  • AIHelpMe.jl for Julia-specific stuff when I’m in the REPL. It’s good shit and we should use it more.

What’s yours? What are you noticing as pain points or as things you really like?

12 Likes

Thanks @cpfiffer !

My stack

  • I use ChatGPT Plus for lots and lots of things. It is really surprisingly good. I would consider Claude but do not expect the differences to be large enough to pay the “switching costs”.
  • I am trying out Cursor instead of VS Code. I have not yet found it to be entirely helpful and often jump back to ChatGPT to get AI help for something. It does offer nice autocomplete but it often misses the mark. Likely I just need to learn a bit more about all of its features and how to use them.

Curious to see what everyone else is doing.

2 Likes

IMO the interface for ChatGPT is much better than Claude’s. The difference is not terribly large.

This is my sense too. I think I need to learn to be more competent with the chat functionality in the editor, but I find the fact that the chat history gets wiped almost every time to be a little discomfiting.

I’d love to run this as a full-on community survey for my JuliaCon talk!

But for the time being, here is my response:

  • Cursor - if you haven’t tried it, you don’t know the power of autocompletion in the middle of a sentence! I can never go back to Copilot… And all the things Cameron mentioned!
  • ChatGPT Plus (just images for the in-image editing)
  • PromptingTools.jl with Groq Llama70b as the default (part of my startup.jl)
  • AIHelpMe.jl - only when I work with AlgebraOfGraphics/Makie, I get it running automatically, so it’s warmed up :smiley:
  • ProToPortal.jl - personal mini-project that I actually love! I missed a GUI with MY templates in my stack and I love to use it when I cannot be at my computer (deployed on fly.io) + it’s much faster than ChatGPT and I’m an impatient person.

ProToPortal is probably too niche/too personal, but if you’re interested, I just open-sourced it: [ANN] ProToPortal.jl: A Portal to the Efficient Coding On The Go and the Magic of PromptingTools.jl

9 Likes

For a free alternative for people (not companies), Codeium works well.

In general, though, I’m inclined to use chatbots rather than integrating the tool inside the code editor. I suspect that reading irrelevant suggestions slows down my work rather than increasing my productivity. I’m still trying to find an approach that works perfectly for me, but I’m not there yet.

5 Likes

Hi @cpfiffer, great to hear that. Do you have a .cursorrules file for this that you can share, especially for Julia?

I’m also curious to know what this looks like, almost a year later – whether more people are using AI coding assistance, and what the best usage patterns today look like. Especially because most chatter I hear online is about people using these tools for frontend dev, infra, etc – quite different from what I use Julia for, which is almost completely numerical programming, simulations, etc.

I’d appreciate hearing about experiences, or even just pointers to high-quality resources.

As for me: I was mostly only using VSCode + LSP previously. Only recently started testing the waters with Copilot and Aider. Might try out Cline/Cursor at some point.

1 Like

@svilupp first off, thank you so much for the great work you’ve been doing with the LLM-related packages :heart: I would use PromptingTools.jl more often if it had a CHAT REPL mode where users could type a special character like / and start chatting without the ai"message" boilerplate:

/ # typing / as the first character in the line activates CHAT REPL mode
chat> What is the capital of Brazil?
🤖: Brasília is the capital

You can create custom REPL modes in Julia with packages like ReplMaker.jl. The example they give with a LISP REPL gives an idea of the end-user experience. It is really nice.
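A minimal sketch of such a chat mode built on ReplMaker.jl might look like the following. This is only an illustration: `chat_parser` is a name I made up, and it assumes PromptingTools is already configured with a working backend (e.g. an API key or a local Ollama model).

```julia
using ReplMaker, PromptingTools

# Hypothetical sketch: every line typed in the custom mode is wrapped in an
# expression that sends it to the configured LLM and prints the reply.
# ReplMaker parsers return an expression for the REPL to evaluate.
function chat_parser(input)
    :(println("🤖: ", aigenerate($input).content))
end

# Pressing `/` at an empty julia> prompt switches into the chat mode.
initrepl(chat_parser;
         prompt_text  = "chat> ",
         prompt_color = :cyan,
         start_key    = '/',
         mode_name    = "chat_mode")
```

Since the parser just returns an expression, the same pattern could be extended to keep a running conversation history instead of firing off independent `aigenerate` calls.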

7 Likes

Thanks for the feedback! I’ve heard the same from many people.

There have been several attempts historically:

But they are a bit outdated now. It would be great if someone wanted to develop a quick & ergonomic REPL mode with PromptingTools!

Alternatively, it could be extended to use AIHelpMe.jl “RAG” which also leverages the available docstrings for your loaded packages or pre-computed knowledge packs to improve AI answers.

Any takers?

1 Like

Please correct me if I am wrong, but isn’t AIHelpMe.jl equivalent to PromptingTools.jl + a Julia documentation knowledge base?

If that is the case, I would focus on PromptingTools.jl and give it a single command PromptingTools.digest(url) to digest publicly-hosted knowledge bases. The digestion process could then use DataDeps.jl or similar package to store the knowledge base locally as an artifact. The REPL mode could then load these knowledge bases automatically, and users could configure their knowledge bases using environment variables or some other mechanism.
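Purely as a sketch of what this proposal could look like: `digest` is not an existing PromptingTools function, and the idea of backing it with DataDeps.jl is the suggestion from the paragraph above, not current behavior.

```julia
using DataDeps

# Hypothetical `digest(url)`: register a publicly-hosted knowledge pack so
# DataDeps downloads and caches it locally on first use. The function name
# and the idea of a downloadable "knowledge pack" are illustrative only.
function digest(url::AbstractString; name::AbstractString = "knowledge_pack")
    register(DataDep(
        name,
        "Pre-computed knowledge base fetched from $url",
        url,
    ))
    return name
end

# Later, `datadep"knowledge_pack"` would resolve to the local cached path,
# which a REPL mode could load automatically at startup.
```

Configuration via environment variables, as suggested above, would fit naturally here, since DataDeps already honors variables like `DATADEPS_LOAD_PATH` for where artifacts are stored.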

2 Likes

Thanks for reminding me of this package.

I would describe it slightly differently:

  • PromptingTools.jl → tooling to build a RAG-based system
  • AIHelpMe.jl → a stateful application for a specific use case (it manages processed knowledge via artifacts and configuration for the use case, plus extra utilities that let users process loaded packages, easily change the config, etc.). There are other packages supporting this use case (DocsScraper.jl)

I think it would make it quite bloated to move all the application-specific stuff into PT.

I see REPL mode as a front end that anyone can add easily, in the same way how Pere created a GUI with Stipple: GitHub - BuiltWithGenie/PkgAIHelp

1 Like

I have been coding with AI for the last three months, working on three main projects. The first involved Feed Forward (FF) and Long Short-Term Memory (LSTM) Neural Networks (NN), implemented with Flux, and totaled about 3k lines of code. The second focused on Symbolic Regression (SR) (around 2.5k lines). The last project was a relatively short but complex WebSocket (WS) Client. I created two versions of the WS Client, one with 1.5k lines (reliability oriented - pure Julia) and the other with 1k lines (performance oriented - including a significant amount of C). I have been using Julia for the last 3.5 years. I am not a professional coder. My previous (rather brief) coding experience was with Atari Basic and later Pascal on a 386/486.

For me, the first two projects (NN and SR) were basically a jaw-dropping experience. Everything was going very fast and rather smoothly. However, the WS Client is a different story and, surprisingly, was significantly more challenging. I aimed for an event-driven, thread-safe, asynchronous architecture with a concurrent queue using locks and atomics, full monitoring, reporting, and retry mechanisms, with two sequentially starting WS connections and a WS connection pool.

The AI help in basic simulations, analytics, and visualizations (the first two projects) was very good. For the WS project, infrastructure work that requires careful logic and, above all, precision, I found the AI help also very good, though time-consuming.

I started with Cursor, then went for Windsurf. Later, I briefly used Aide (they changed the focus recently, so as far as I know, this IDE is not available anymore, their current focus is on GitHub autonomous parallel agents). Then I got back to VS Code with Cline and Roo Code, and I am currently testing PearAI, which is VS Code with an open-source stack consisting of Roo Code/Cline, Supermaven, MemO, Perplexity, and Continue.

I am not an expert; however, I would say that Cursor’s context awareness was the best overall, and completions were very fast and mostly spot on. IMO, Windsurf has a great feature where you can preview proposed changes (something similar to VS Code’s “Select for Compare” and “Compare with Selected”), and I also somehow liked their free model more than the one provided by Cursor. PearAI is a beauty; I like it quite a lot. However, despite some engineering attempts, I was not able to make its agent work over remote SSH connections, nor within WSL (Windows Subsystem for Linux). So if one needs those features in combination with the agent and to be super reliable, I would say probably VS Code is the way to go in such a case.

As for the models, so far I have been using various OpenAI, Anthropic, Mistral, and DeepSeek models the most. Recently, I tried Gemini and Groq as well. Overall (obviously), I like Anthropic Sonnet 3.5 the most. DeepSeek R1’s reasoning capabilities are very useful for getting some in-depth explanations; however, I find the final advice it provides not as good as Sonnet’s. The recent versions of Mistral seem not to be as adventurous as other models, but I find Mistral to be very reliable; it’s unlikely to introduce changes that render the codebase unusable. I don’t have any experience with o1, but I find GPT-4o reliable and a great companion to Sonnet 3.5. I have relatively little experience with Gemini and Groq. Gemini is free (subject to the privacy statements) and offers a huge context window. Groq has a generous free tier and seems to be super fast. For model selection, I am using OpenRouter.

Again, I am not an expert and as for now, I don’t have any advice regarding the best practices. So far, I found that it all depends on the task. I try to be as generous and at the same time as selective as possible regarding the context. I like to plan first and execute later. What I would like this technology to have is for those models to test their code suggestions before providing the advice, and only offer solutions that actually work. I’m not sure if that’s currently possible. Another useful feature would be a more extended memory. I often found myself having to re-explain concepts that had been discussed shortly before.

I would like to emphasize that my primary interest, at this point, is in using AI to “understand the code”, rather than having it build projects completely autonomously from A to Z. I’ve written those paragraphs from that perspective.

How do you like Aider and Copilot, if I may ask?

5 Likes

I must have misunderstood something here, sorry.