Reinforcement learning and e.g. deep kernel learning, and status of Julia for such AI

I was scanning the paper “Reward is enough” (ScienceDirect):

  1. Reward is enough
    3.3. Reward is enough for social intelligence
    3.4. Reward is enough for language
    3.5. Reward is enough for generalisation
    3.7. Reward is enough for general intelligence

[Note, while the paper doesn’t mention “neural network” or “deep learning”, both are in the references, so I assume that, as with AlphaGo, RL plus a neural network is implied.]

So while I’m not sure I believe Google DeepMind’s paper (it’s a forward-looking statement), I’m curious what Julia’s status is vs. e.g. Python for reinforcement learning. I know DeepMind’s AlphaGo has been reimplemented, and while the Julia ecosystem is playing catch-up with Python for neural networks, maybe for RL it’s faster?

Also, for other nontraditional AI such as deep kernel learning (and Gaussian processes), are we ahead?

Purely symbolic AI was claimed to be a dead end years ago, and it seems neural networks (sub-symbolic) are too, at least traditional ones, but not a hybrid of the two. Python has “Logic Tensor Networks”, which seem interesting, and:

the following Neural Process variants:

  • Conditional Neural Processes (CNPs)
  • Neural Processes (NPs)
  • Attentive Neural Processes (ANPs)

The code for CNPs can be found in conditional_neural_process.ipynb while the code for both NPs and ANPs is located in attentive_neural_process.ipynb.
[…] further details can be found in the CNP paper, the NP paper and the ANP paper.

The Promises and Pitfalls of Deep Kernel Learning

Deep kernel learning and related techniques promise to combine the representational power of neural networks with the reliable uncertainty estimates of Gaussian processes. […]
we find that a fully Bayesian treatment of deep kernel learning can rectify this overfitting and obtain the desired performance improvements over standard neural networks and Gaussian processes.
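
To make the idea concrete, here is a minimal sketch of what a "deep kernel" is: an ordinary kernel (an RBF here) applied to the outputs of a neural network rather than to the raw inputs. Everything in it is an assumption for illustration: the tiny one-hidden-layer network, the fixed weights `W1` and `W2`, and the helper names are hypothetical and not taken from the paper. In real deep kernel learning the network weights are trained together with the GP hyperparameters, and the paper above argues that a fully Bayesian treatment is what avoids overfitting; this sketch only builds the Gram matrix.

```python
import math

def feature_map(x, w1, w2):
    """Tiny fixed network g(x): one tanh hidden layer, then a linear layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def rbf(u, v, lengthscale=1.0):
    """Standard RBF (squared-exponential) kernel on two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

def deep_kernel(x, y, w1, w2, lengthscale=1.0):
    """Deep kernel: k(x, y) = rbf(g(x), g(y))."""
    return rbf(feature_map(x, w1, w2), feature_map(y, w1, w2), lengthscale)

# Hypothetical, untrained weights: 2 inputs -> 3 hidden units -> 2 features.
W1 = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
W2 = [[1.0, -1.0, 0.5], [0.3, 0.3, 0.3]]

points = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
# Gram matrix that a Gaussian process would use as its covariance.
gram = [[deep_kernel(a, b, W1, W2) for b in points] for a in points]
```

The resulting `gram` is symmetric with ones on the diagonal, like any RBF Gram matrix; the only difference from a plain GP is that distances are measured in the learned feature space.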

Neurosymbolic AI: The 3rd Wave

Keywords: Neurosymbolic Computing; Machine Learning and Reasoning; Explainable AI; AI Fast and Slow; Deep Learning.
Despite the impressive results, deep learning has been criticised for brittleness (being susceptible to adversarial attacks), lack of explainability (not having a formally defined computational semantics or even intuitive explanation, leading to questions around the trustworthiness of AI systems), and lack of parsimony (requiring far too much data, computational power at training time or unacceptable levels of energy consumption) [52]
The need for a better understanding of the underlying principles of AI has become generally accepted. A key question however is that of identifying the necessary and sufficient building blocks of AI, and how systems that evolve automatically based on machine learning can be developed and analysed in effective ways that make AI trustworthy.
The current limits of neural networks as essentially a propositional system are also evaluated. In a nutshell, current neural networks are capable of representing propositional logic, nonmonotonic logic programming, propositional modal logic and fragments of first-order logic, but not full first-order or higher-order logic. This limitation has prompted the recent work in the area of Logic Tensor Networks (LTN) [79, 53, 95] which, in order to use the language of full first-order logic with deep learning, translates logical statements into the loss function rather than into the network architecture
In a nutshell, we claim that neurosymbolic AI is well placed to address concerns of computational efficiency, modularity, KR + ML and even causal inference. More researchers than ever on both sides of the connectionist-symbolic AI divide are now open to studying and learning about each others’ tools and techniques. This was not the case until very recently.
Symbolism has been expected to provide additional knowledge in the form of constraints for learning [23, 32], which ameliorate neural network’s well-known catastrophic forgetting or difficulty with extrapolation in unbounded domains or with out-of-distribution data. The integration of neural models with logic-based symbolism is expected therefore to provide an AI system capable of explainability, transfer learning and a bridge between lower-level information processing (for efficient perception and pattern recognition) and higher-level abstract knowledge (for reasoning, extrapolation and planning).
Henry Kautz’s taxonomy for neurosymbolic AI [42], which was introduced at AAAI 2020: In Kautz’s taxonomy, a Type 1 neural-symbolic integration is standard deep learning, which some may argue is a stretch
Type 2 are hybrid systems such as DeepMind’s AlphaGo and other systems where the core neural network is loosely-coupled with a symbolic problem solver such as Monte Carlo tree search.
Type 5 are those tightly-coupled but distributed neural-symbolic systems where a symbolic logic rule is mapped onto an embedding which acts as a soft-constraint (a regularizer) on the network’s loss function. Examples of these include Logic Tensor Networks [79] and Tensor Product Representations [39], referred to in [13] as tensorization methods. Finally, a Type 6 system should be capable, according to Kautz, of true symbolic reasoning inside a neural engine. This is what one could refer to as a fully-integrated system. Early work in neural-symbolic computing has achieved this (see [20] for a historical overview).
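
To illustrate the "logic rule as a soft constraint on the loss" idea from the excerpts above, here is a minimal sketch in plain Python. It is not the Logic Tensor Networks library or its API; the function names, the choice of the Reichenbach fuzzy implication, the use of a mean as the soft "for all", and the `weight` parameter are all assumptions made for illustration. The rule encoded is "for all x: A(x) implies B(x)", where `pred_a` and `pred_b` stand in for a network's predicted truth values in [0, 1].

```python
def implies(a, b):
    """Reichenbach fuzzy implication: truth of 'a -> b' for a, b in [0, 1]."""
    return 1.0 - a + a * b

def forall(truths):
    """Soft universal quantifier, taken here as the mean truth value."""
    return sum(truths) / len(truths)

def rule_loss(pred_a, pred_b):
    """Penalty for violating 'for all x: A(x) -> B(x)' (0 = fully satisfied)."""
    satisfaction = forall([implies(a, b) for a, b in zip(pred_a, pred_b)])
    return 1.0 - satisfaction

def total_loss(data_loss, pred_a, pred_b, weight=0.5):
    """Ordinary training loss plus the logic rule as a weighted regularizer."""
    return data_loss + weight * rule_loss(pred_a, pred_b)

# Predictions consistent with the rule incur no penalty ...
consistent = rule_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
# ... while predictions that assert A but deny B are penalised maximally.
violating = rule_loss([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
combined = total_loss(0.2, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

Because every operation is differentiable in the truth values, the same penalty could be minimised by gradient descent alongside the data loss, which is the sense in which the logic ends up in the loss function rather than in the network architecture.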

The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence

The problem is, even with massive amounts of data, and new architectures, such as the Transformer (Vaswani et al., 2017), which underlies GPT-2 (Radford et al., 2019), the knowledge gathered by contemporary neural networks remains spotty and pointillistic, arguably useful and certainly impressive, but never reliable (Marcus, 2020).

We show how Real Logic can be implemented in deep Tensor Neural Networks with the use of Google’s tensorflow primitives.


Thank you for the nice writeup. I’m definitely of the persuasion that neural networks need to fundamentally change to get us closer to anything resembling human-level intelligence. I also don’t share DeepMind’s optimistic view that reward is enough, but the way they wrote the paper also makes it impossible to falsify their statements.

I would be interested in hearing your point of view on how Julia comes into play here? Obviously the big frameworks are the ones spearheading the current developments but maybe there is a place for Julia in this space?

Yes, TensorFlow and PyTorch are used for neural networks. I thought that was all they were used for, but at least TF is also used for RL, though via what seems to be a separate subcomponent, TF-Agents: Introduction to RL and Deep Q Networks | TensorFlow Agents

I’m not sure whether plain TF (without TF-Agents) or PyTorch is ever used for the RL part itself, or only for the neural network you use with it?

August 2020 blog:

As deep reinforcement learning continues to become one of the most hyped strategies to achieve AGI (aka Artificial General Intelligence) more and more libraries are being developed.

That blog listed 6 libraries, and I then found “20 libraries you should know” (I don’t want that many choices; all are Python, although I didn’t search for Python and it isn’t in the title, so it seems that’s simply where the mindshare is):

RLlib is a reinforcement learning library that provides high scalability and a unified API for a variety of RL applications. It supports both PyTorch and Tensorflow natively but most of its internal frameworks are agnostic.

RLlib was listed 13th, and being framework-agnostic seems to be the exception.

I also found a fork of a fork of OpenAI’s “Baselines” code (listed 1st), which uses TF (since before TF-Agents existed?); the maintained fork uses PyTorch (so maybe that’s better):

And June 2021 paper: [2005.05719] Smooth Exploration for Robotic Reinforcement Learning
and Feb 2021 blog post:
Stable-Baselines3: Reliable Reinforcement Learning Implementations | Antonin Raffin | Homepage

It provides a clean and simple interface, giving you access to off-the-shelf state-of-the-art model-free RL algorithms.

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

27/07/2020: Dopamine now runs on TensorFlow 2. However, Dopamine is still written as TensorFlow 1.X code.

April 21 post:

and the “trajectory view API”. RLlib is a popular reinforcement learning library that is part of the open-source Ray project.

In reinforcement learning (RL), like in supervised learning (SL), we use neural network (NN) models as trainable function approximators. The inputs to these models are observation tensors from the environment that we would like to master

I wasn’t sure that’s the case, i.e. that neural networks are always used with RL (maybe not historically?). Also note the caption under the picture: “Trajectory of a SpaceX Falcon9 rocket, launching from Cape Canaveral, Florida. RLlib’s Trajectory View API allows […]”

I’m not sure what Julia has for RL, or whether it’s comparable feature- or speed-wise (I would expect the speed to be at least as good, or at least that it could be):
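
On the question of whether a neural network is always involved: classic tabular Q-learning needs none, since the "model" is just a table of values indexed by state and action. Below is a minimal sketch under stated assumptions: the 5-state chain environment, the hyperparameters, and all names are made up for illustration and come from none of the libraries mentioned above. The agent acts uniformly at random (Q-learning is off-policy, so it can still learn the greedy-optimal values) and is rewarded only for reaching the rightmost state.

```python
import random

N_STATES = 5          # states 0..4; state 4 is terminal (the goal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Deterministic chain: reward 1.0 only on entering the goal state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # the whole "model" is a table
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = rng.choice(ACTIONS)        # uniform random exploration
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            if done:
                break
            state = nxt
    return q

q_table = train()
# Greedy action in each non-terminal state after training.
greedy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

A neural network only becomes necessary when the state space is too large to enumerate, at which point the table `q` is replaced by a trainable function approximator, as the quoted RLlib post describes.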

There’s a web page:

In ReinforcementLearningAnIntroduction.jl, we reproduced most figures in the famous book: Reinforcement Learning: An Introduction (Second Edition). You can try those examples interactively online […]
In ReinforcementLearningZoo.jl, many deep reinforcement learning algorithms are implemented, including DQN, C51, Rainbow, IQN, A2C, PPO, DDPG, etc.

Currently, this is being used to test the refactored version of JuliaRL.

Julia has AlphaZero.jl available, a reimplementation of the AlphaZero algorithm from Google’s DeepMind. I understand it’s good, competitive with a C++ reimplementation.

[Note, it also uses a neural network.]

That said, DeepMind’s latest is MuZero, and I don’t know of a reimplementation of it in Julia.

I see it for Python: GitHub - werner-duvaud/muzero-general: MuZero

MuZero is a state of the art RL algorithm for board games

I was curious how it’s implemented. It depends on ray (and also e.g. torch, gym, Facebook Research’s nevergrad, and tensorboard); while that’s not RLlib itself, Ray still seems to be what’s behind it, since I see in the docs for RLlib:

pip install 'ray[rllib]'

I see at ray/ at master · ray-project/ray · GitHub

This file defines the distributed Trainer class for the Deep Q-Networks
algorithm. See dqn_[tf|torch] for the definition of the policies.

I’m not sure what to read into that; I guess there are two implementations, rather than TF and PyTorch being used together (are they ever?).

AlphaZero is also available in Python as contrib here: RLlib Algorithms — Ray v2.0.0.dev0

linked from Intel’s RL library above which is in version 1.0.0 since 2019 (still updated days ago):

Now, Intel Optimization for Tensorflow is also available as part of [Intel® AI Analytics Toolkit] […]

The oneAPI Deep Neural Network Library (oneDNN) optimizations are also now available in the official x86-64 TensorFlow after v2.5. Users can enable those CPU optimizations by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1 for the official x86-64 TensorFlow after v2.5. There is a comparison table between those two releases in the additional information (Intel® Optimization for TensorFlow* Installation Guide) section.
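
As a small sketch of how that variable might be set from Python rather than from the shell: the variable name and value come from the excerpt above, while the ordering is an assumption on my part (as far as I understand, it has to be in the environment before TensorFlow is imported, so it goes at the very top of the script). The TensorFlow import is left commented out so the snippet runs even without TensorFlow installed.

```python
import os

# Assumption: the flag is read when TensorFlow is loaded, so set it first.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

# import tensorflow as tf   # import only after the variable is set

onednn_flag = os.environ.get("TF_ENABLE_ONEDNN_OPTS")
```

Setting it in the shell before launching Python (an `export TF_ENABLE_ONEDNN_OPTS=1` on Linux) should have the same effect and avoids any import-order concerns.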

Q-Learning in Continuous State and Action Spaces

Advantage Learning, a variation of Q-learning, is shown to enhance learning speed and reliability for this task. […]
There are two prevalent approaches to reinforcement learning: Q-learning and actor-critic learning.

Is that outdated since that 1999 paper?

Hi, nice overview.
Certainly another Julia possibility to consider would be all the JuliaPOMDP packages.
The way I see it (and please correct me if I am wrong), the POMDPs packages together with the ReinforcementLearning packages are the ones gaining the most traction. (?)
Of course I don’t want to undermine other packages which could be equally interesting for someone to investigate.

While I am only just starting to investigate these packages, I find it quite confusing to tell them apart. Would someone care to mention the main differences between the POMDPs and ReinforcementLearning packages, and also how they overlap?
Must one choose between these packages, or is the suggested workflow to use both together (since they are both compatible with CommonRLInterface)? If so, how would one combine them?