I can’t wait to try this one: Phind
Phind didn’t pass my usual test (so far, no LLM has). I always ask: “In Julia, do slices behave as copies or views?” Phind got it wrong.
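For the record, the answer I’m looking for: slicing an `Array` with a range makes a copy, and you only get a view if you ask for one explicitly with `@view`/`view`. A quick demo:

```julia
# Slicing copies: mutating the slice leaves the parent untouched.
x = [1, 2, 3, 4]

s = x[1:2]          # slice -> independent copy
s[1] = 99
@show x[1]          # 1 (parent unchanged)

v = @view x[1:2]    # explicit view -> aliases the parent
v[1] = 99
@show x[1]          # 99 (parent mutated through the view)
```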
It’s not a full release yet (there’s still a bit of work left), but I wanted to share some preliminary findings about the new Anthropic models.
Turns out Claude-3 Haiku broke the value-for-money barrier previously held by GPT-3.5-Turbo:
I’ve been really impressed by Claude Opus, but the real star of the show here is Claude Haiku!
I paid $30 for the Opus evals, but only $0.40 for the Haiku evals!!!
(Opus availability was also terrible; I had to restart many times.)
Btw, you can now also use Anthropic models for data extraction/function calling (released today) - it’s available in v0.18 of PromptingTools, among other things. See the sketch below.
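Roughly, it looks like this (a minimal sketch: `CurrentWeather` is just an example schema I made up, and I’m assuming the `"claudeh"` model alias for Claude 3 Haiku; you’ll also need `ANTHROPIC_API_KEY` set in your environment):

```julia
using PromptingTools

# Example schema we want the model to fill in.
struct CurrentWeather
    location::String
    unit::Union{Nothing, String}
end

# aiextract asks the model to return data matching return_type;
# "claudeh" is assumed to alias Claude 3 Haiku (swap in your preferred model).
msg = aiextract("What's the weather in Prague? Please report in Celsius.";
    return_type = CurrentWeather, model = "claudeh")

msg.content  # e.g. CurrentWeather("Prague", "Celsius")
```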
What’s next? Exciting GSoC project (hopefully), new test cases, new models, and a new category of evals… A lot to look forward to.