I have been coding with AI for the last three months, working on three main projects. The first involved Feed Forward (FF) and Long Short-Term Memory (LSTM) Neural Networks (NN), implemented with Flux, and totaled about 3k lines of code. The second focused on Symbolic Regression (SR) (around 2.5k lines). The last project was a relatively short but complex WebSocket (WS) Client. I created two versions of the WS Client: one with 1.5k lines (reliability-oriented, pure Julia) and the other with 1k lines (performance-oriented, including a significant amount of C). I have been using Julia for the last 3.5 years. I am not a professional coder; my previous (rather brief) coding experience was with Atari Basic and later Pascal on a 386/486.
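For context, the first project's models were along these lines (a minimal Flux sketch; the layer sizes are made up for illustration and this is not my actual code):

```julia
using Flux

# Minimal feed-forward network: 4 inputs -> 8 hidden units -> 1 output
ff = Chain(Dense(4 => 8, relu), Dense(8 => 1))

# Recurrent variant with an LSTM layer; calling conventions for recurrent
# layers differ between Flux versions, so only construction is shown here
rnn = Chain(LSTM(4 => 8), Dense(8 => 1))

x = rand(Float32, 4, 16)   # 4 features, batch of 16
ff(x)                      # forward pass, returns a 1x16 matrix
```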
For me, the first two projects (NN and SR) were basically a jaw-dropping experience. Everything went very fast and rather smoothly. However, the WS Client is a different story and, surprisingly, was significantly more challenging. I aimed for an event-driven, thread-safe, asynchronous architecture with a concurrent queue built on locks and atomics, full monitoring, reporting, and retry mechanisms, two sequentially starting WS connections, and a WS connection pool.
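To give a flavor of what that involved, here is a stripped-down sketch of two of the ingredients: a lock-guarded queue with an atomic counter for monitoring, and a retry helper with exponential backoff. All names and defaults are mine for illustration; this is not the actual client code.

```julia
# Thread-safe FIFO queue: a plain Vector guarded by a ReentrantLock,
# plus an atomic counter that can be read without the lock (Julia >= 1.7)
mutable struct ConcurrentQueue{T}
    items::Vector{T}
    lk::ReentrantLock
    @atomic pushed::Int       # total items ever enqueued, for reporting
end

ConcurrentQueue{T}() where {T} = ConcurrentQueue{T}(T[], ReentrantLock(), 0)

function Base.push!(q::ConcurrentQueue{T}, x::T) where {T}
    lock(q.lk) do
        push!(q.items, x)
    end
    @atomic q.pushed += 1
    return q
end

# Returns the oldest item, or `nothing` if the queue is empty
function Base.popfirst!(q::ConcurrentQueue)
    lock(q.lk) do
        isempty(q.items) ? nothing : popfirst!(q.items)
    end
end

# Retry helper with exponential backoff, e.g. for (re)opening a WS connection
function with_retries(f; attempts = 5, base_delay = 0.5)
    for i in 1:attempts
        try
            return f()
        catch
            i == attempts && rethrow()
            sleep(base_delay * 2^(i - 1))
        end
    end
end
```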
The AI's help with basic simulations, analytics, and visualizations (the first two projects) was very good. For the WS project, infrastructure work that requires careful logic and, above all, precision, I found the AI's help also very good, though time-consuming.
I started with Cursor, then moved to Windsurf. Later, I briefly used Aide (they changed focus recently, so as far as I know this IDE is no longer available; their current focus is on autonomous parallel GitHub agents). Then I went back to VS Code with Cline and Roo Code, and I am currently testing PearAI, which is VS Code with an open-source stack consisting of Roo Code/Cline, Supermaven, MemO, Perplexity, and Continue.
I am not an expert; however, I would say that Cursor’s context awareness was the best overall, and its completions were very fast and mostly spot on. IMO, Windsurf has a great feature where you can preview proposed changes (something similar to VS Code’s “Select for Compare” and “Compare with Selected”), and I also somehow liked their free model more than the one provided by Cursor. PearAI is a beauty; I like it quite a lot. However, despite some engineering attempts, I was not able to make its agent work over remote SSH connections or within WSL (Windows Subsystem for Linux). So if you need the agent to work reliably in those setups, VS Code is probably the way to go.
As for the models, so far I have mostly used various OpenAI, Anthropic, Mistral, and DeepSeek models. Recently, I tried Gemini and Groq as well. Overall (obviously), I like Anthropic’s Sonnet 3.5 the most. DeepSeek R1’s reasoning capabilities are very useful for getting in-depth explanations; however, I find the final advice it provides not as good as Sonnet’s. The recent versions of Mistral seem less adventurous than other models, but I find Mistral very reliable; it’s unlikely to introduce changes that render the codebase unusable. I don’t have any experience with o1, but I find GPT-4o reliable and a great companion to Sonnet 3.5. I have relatively little experience with Gemini and Groq. Gemini is free (subject to the privacy statements) and offers a huge context window. Groq has a generous free tier and seems to be super fast. For model selection, I am using OpenRouter.
Again, I am not an expert, and for now I don’t have any advice regarding best practices. So far, I have found that it all depends on the task. I try to be as generous and, at the same time, as selective as possible regarding the context. I like to plan first and execute later. What I would like this technology to have is models that test their code suggestions before offering advice, and propose only solutions that actually work. I’m not sure if that’s currently possible. Another useful feature would be a more extended memory; I often found myself having to re-explain concepts that had been discussed shortly before.
I would like to emphasize that my primary interest, at this point, is in using AI to “understand the code”, rather than having it build projects completely autonomously from A to Z. I’ve written these paragraphs from that perspective.
How do you like Aider and Copilot, if I may ask?