Why is Python, not Julia, still used for most state-of-the-art AI research?

AI, or even ML, these days mostly means deep learning (DL) / neural networks, and I would like to know where we're behind in ways that really matter. So, nr. 1: we're behind infrastructure-wise (only?) in scaling to really large [NLP] neural models, but I'm not sure that matters too much, since the trend in NLP and other neural networks is working smarter, not harder.

The goal should be one-shot or few-shot learning, and probably smaller/different models.

DL may be a dead end; more likely, I think, something else is needed in addition, with DL remaining one component of an AI system.

What I think we should be replicating (or would like to know whether Julia is more ideal for) are, e.g., non-DL approaches:

A.
Human-level concept learning through probabilistic program induction:
https://science.sciencemag.org/content/350/6266/1332

The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches.
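To make the idea concrete (this is not the paper's actual model, just a toy illustration in Julia): represent candidate "programs" as simple generative functions, score each on a single observed example by log-prior + log-likelihood, and pick the best explanation. The program space, the prior, and all names below are made up for the sketch.

```julia
# Toy Bayesian program induction: pick the candidate "program" that best
# explains one observed example under prior * likelihood.

# Candidate programs: each maps an input to a prediction, with a prior
# favoring simpler programs (fewer "parts").
programs = [
    (name = "identity", f = x -> x,      nparts = 1),
    (name = "double",   f = x -> 2x,     nparts = 2),
    (name = "square",   f = x -> x^2,    nparts = 2),
    (name = "affine",   f = x -> 3x + 1, nparts = 3),
]

# Log-prior: exponential penalty on program size (a simplicity bias).
logprior(p) = -p.nparts * log(2)

# Log-likelihood of the observed (x, y) pair under Gaussian noise.
gauss_loglik(y, ŷ; σ = 0.5) = -0.5 * ((y - ŷ) / σ)^2 - log(σ * sqrt(2π))

# One-shot "classification": score every program on one example.
function best_program(x, y)
    scores = [logprior(p) + gauss_loglik(y, p.f(x)) for p in programs]
    return programs[argmax(scores)].name
end

println("Best explanation: ", best_program(3.0, 9.1))  # expect "square"
```

The simplicity prior is what lets a single example suffice: among programs that explain the observation about equally well, the shortest one wins.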

B.
@DrChainsaw, besides the existing NAS-related work, there's also something newer: Neural Architecture Transfer (NAT).
From the video: “NAT is consistently more efficient (3x-9x) than EfficientNet, across various datasets without losing accuracy.”

Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. […]
we propose Neural Architecture Transfer (NAT) to overcome this limitation […]
A pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark image classification tasks ranging from large-scale multi-class to small-scale fine-grained datasets. In all cases, including ImageNet, NATNets improve upon the state-of-the-art under mobile settings.
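The search component is easy to sketch in Julia. Below is a toy evolutionary subnet search in the spirit of NAS/NAT, with a stand-in fitness function instead of a trained supernet; the encoding and the accuracy/FLOPs proxies are invented for illustration.

```julia
# Toy evolutionary subnet search: architectures are encoded as integer
# vectors (layers per stage), evolved against a fitness that trades off a
# fake accuracy proxy against a fake FLOPs cost. Real NAT adapts a
# pre-trained supernet; only the search loop is shown here.

const NSTAGES = 4
const MAXDEPTH = 6

random_arch() = rand(1:MAXDEPTH, NSTAGES)

# Stand-in fitness: pretend accuracy saturates with depth while FLOPs grow
# linearly, so the search has a real trade-off to navigate.
function fitness(arch)
    acc   = sum(1 .- exp.(-0.5 .* arch))  # fake accuracy proxy
    flops = sum(arch)                     # fake cost proxy
    return acc - 0.05 * flops             # mobile-style trade-off
end

function mutate(arch)
    child = copy(arch)
    i = rand(1:NSTAGES)
    child[i] = clamp(child[i] + rand([-1, 1]), 1, MAXDEPTH)
    return child
end

function evolve(; pop = 20, gens = 50)
    population = [random_arch() for _ in 1:pop]
    for _ in 1:gens
        sort!(population; by = fitness, rev = true)  # keep fittest half
        parents = population[1:pop ÷ 2]
        children = [mutate(rand(parents)) for _ in 1:(pop - length(parents))]
        population = vcat(parents, children)
    end
    return population[argmax(fitness.(population))]
end

best = evolve()
println("best layers per stage: ", best, ", fitness ≈ ", fitness(best))
```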

C.
Generalized Few-Shot Object Detection without Forgetting (Retentive R-CNN):
https://arxiv.org/pdf/2105.09491.pdf

benchmarks show that Retentive R-CNN significantly outperforms state-of-the-art methods on overall performance among all settings as it can achieve competitive results on few-shot classes and does not degrade the base class performance at all. Our approach has demonstrated that the long desired never-forgetting learner is available in object detection.
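The "never forgetting" idea can be caricatured in a few lines of Julia: keep the frozen base-class head next to the few-shot-adapted head and consolidate their outputs so base classes can't get worse. This is a hypothetical scoring scheme, not Retentive R-CNN's actual architecture, which intervenes at the RPN and detector-head level.

```julia
# Toy consolidation of a frozen base detector with a few-shot-adapted one.

struct Detector
    classes::Vector{Symbol}
    score::Function              # features -> Dict(class => confidence)
end

# Frozen detector trained on abundant base classes (weights never touched).
base = Detector([:person, :car],
    feats -> Dict(:person => 0.9 * feats[1], :car => 0.8 * feats[2]))

# Detector fine-tuned with a few-shot novel class added; base-class scores
# may have drifted (here: dropped) during fine-tuning.
adapted = Detector([:person, :car, :owl],
    feats -> Dict(:person => 0.7 * feats[1], :car => 0.6 * feats[2],
                  :owl => 0.8 * feats[3]))

# Consolidation: base classes take max(frozen, adapted), so they can never
# score worse than the frozen head; novel classes use the adapted head only.
function detect(feats)
    merged = copy(adapted.score(feats))
    for (cls, s) in base.score(feats)
        merged[cls] = max(s, get(merged, cls, 0.0))
    end
    return merged
end

println(detect([1.0, 0.5, 0.9]))  # base classes keep the frozen head's scores
```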

D.
Differentiable Augmentation for Data-Efficient GAN Training (DiffAugment):
https://arxiv.org/pdf/2006.10738.pdf

Table 5: Low-shot generation results. With only 100 (Obama, Grumpy cat, Panda), 160 (Cat), or 389 (Dog) training images, our method is on par with the transfer learning algorithms that are pre-trained with 70,000 images.
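The recipe behind those numbers is to pass both real and generated images through the same differentiable augmentation T, in both the discriminator and generator updates. A minimal Julia sketch with trivial stand-ins for D and G (real code would use Flux/Lux models and an AD-friendly T):

```julia
# Toy "differentiable" augmentation: brightness jitter + random translation.
# Both are differentiable w.r.t. the pixel values.
function augment(x::AbstractMatrix)
    x = x .+ 0.2 * (rand() - 0.5)                  # brightness jitter
    return circshift(x, (rand(-2:2), rand(-2:2)))  # random translation
end

D(x) = tanh(sum(x) / length(x))  # stand-in discriminator score (a logit)
G(z) = z * z'                    # stand-in generator: outer-product "image"

# Standard logistic GAN losses: -log σ(s) for real, -log(1 - σ(s)) for fake.
logistic_loss(s, real::Bool) = real ? log1p(exp(-s)) : log1p(exp(s))

# Discriminator sees only augmented images, real and fake alike.
loss_D(x_real, z) = logistic_loss(D(augment(x_real)), true) +
                    logistic_loss(D(augment(G(z))), false)

# The generator is also trained through the augmentation, which is why T
# must be differentiable in the real method.
loss_G(z) = logistic_loss(D(augment(G(z))), true)

x_real = rand(8, 8); z = rand(8)
println("D loss: ", loss_D(x_real, z), "  G loss: ", loss_G(z))
```

Because the discriminator never sees a clean image, it can't memorize the tiny training set, which is what makes the 100-image regime workable.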
