The reason Python is used is that it's a good interface to high-speed code written in other, faster languages. Julia can also be that interface, and the faster language as well. Training massive AI/ML models relies on libraries like Microsoft's DeepSpeed for distributed training ("Python 69.0%, C++ 20.3%, Cuda 9.8%"; those numbers are for source code, while at runtime roughly 0% of the work runs in Python, and Julia could in theory have replaced all of those languages). Such massive training runs, costing millions of dollars, are only of interest to big companies that can afford them.
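To make the "interface" point concrete, here is a minimal sketch of Python calling compiled C code directly through the standard-library ctypes module (my illustration; it assumes a Unix-like system where the C math library libm is available):

```python
import ctypes
import ctypes.util

# Locate and load the C math library (assumes a Unix-like system;
# the "libm.so.6" fallback is Linux-specific).
libm_name = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(libm_name)

# Declare the C signature: double cos(double)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

# This call runs compiled C code; Python only marshals the argument.
print(libm.cos(0.0))  # 1.0
```

Libraries like NumPy do the same thing at scale: the Python layer is a thin wrapper, and the numeric kernels run as compiled C.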
AI is divided into many subfields: machine learning (ML), natural language processing (NLP), neural networks/deep learning, large language models (LLMs, what I had in mind as the mainstream, fuelling so-called chatbots like ChatGPT), text-to-image and text-to-video models (which, like LLMs, are based on transformer models), computer vision, reinforcement learning, chess- and poker-playing AI, and so on.
Mainstream AI means transformer models, one type of machine learning. The mainstream is already moving to Mamba and other variants, and transformer models are being upgraded to KAN-based ones:
From yesterday:
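For intuition, here is a toy sketch of the KAN idea (my illustration, not the KAN repository's code, assuming NumPy is installed): where a standard layer multiplies an input by a scalar weight and applies a fixed activation, a KAN edge learns a whole 1-D function, here crudely parameterized as piecewise-linear interpolation over a fixed grid (the actual KAN work uses B-splines):

```python
import numpy as np

# Toy KAN-style edge: the "weight" is a learnable 1-D function, not a scalar.
# Real KANs parameterize it with B-splines; here it is a piecewise-linear
# interpolant, and the knot values would be trained by gradient descent.
grid = np.linspace(-1.0, 1.0, 5)               # fixed knot positions
values = np.array([0.3, -0.1, 0.0, 0.4, 1.2])  # learnable values at the knots

def edge_fn(x):
    # Evaluate the learned function: linear interpolation between the knots.
    return np.interp(x, grid, values)

print(edge_fn(np.array([-1.0, 0.0, 1.0])))
```

The point is only the shape of the idea: the network's learnable parameters live inside these per-edge functions rather than in weight matrices.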
While you already see KAN listed as "100.0% Python", that's something of a white lie: if you look at the dependencies in its requirements.txt file, you'll see the heavy lifting is always done by non-Python, faster languages like C, or increasingly Julia.
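A quick way to see this in practice (a toy illustration, assuming NumPy is installed): sum ten million numbers with a pure-Python loop and with NumPy, whose sum runs in a compiled C kernel despite NumPy being "a Python library":

```python
import time
import numpy as np

data = list(range(10_000_000))
arr = np.arange(10_000_000, dtype=np.int64)

t0 = time.perf_counter()
py_total = sum(data)        # pure Python: the loop runs in the interpreter
t1 = time.perf_counter()
np_total = int(arr.sum())   # NumPy: the loop runs in compiled C
t2 = time.perf_counter()

print(py_total == np_total)           # True: identical result
print((t1 - t0) / (t2 - t1))          # speedup factor; typically large
```

Both compute the same sum; the gap in the timing ratio is the "faster language" doing the heavy lifting.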
If you're a user of existing AI models or code, then Python is a good option; if you want to develop the future's best AI, then Julia is the better option.
For beginners I recommend this book (written by a then 15-year-old genius):
I bought it, and this one also seems good: