ANN: QuickPOMDPs and POMDPs v0.8

Hi All,

I’d like to announce some enhancements to the POMDPs ecosystem. First, we have introduced the QuickPOMDPs.jl package, which makes it much easier to define Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs) without sacrificing much flexibility. Instead of the object-oriented approach of the main POMDPs.jl interface, it uses a composition-oriented approach in which each element of the problem can be specified as a function, object, or simple value.

The classic mountain car problem, with visualization courtesy of Compose.jl, looks like this:

using QuickPOMDPs
using Compose: context, circle, line, star, rectangle, fill, stroke, compose, Mirror

mountaincar = QuickMDP(
    # generative dynamics: map (state, action, rng) to the next state and reward
    function (s, a, rng)
        x, v = s
        vp = clamp(v + a*0.001 + cos(3*x)*-0.0025, -0.07, 0.07)
        xp = x + vp
        if xp > 0.5
            r = 100.0
        else
            r = -1.0
        end
        return (sp=(xp, vp), r=r)
    end,
    actions = [-1., 0., 1.],
    initialstate = (-0.5, 0.0),
    discount = 0.95,
    isterminal = s -> s[1] > 0.5,

    # optional visualization hook, used by POMDPGifs below
    render = function (step)
        cx = step.s[1]
        cy = 0.45*sin(3*cx)+0.5
        car = (context(), circle(cx, cy+0.035, 0.035), fill("blue"))
        track = (context(), line([(x, 0.45*sin(3*x)+0.5) for x in -1.2:0.01:0.6]), stroke("black"))
        goal = (context(), star(0.5, 1.0, -0.035, 5), fill("gold"), stroke("black"))
        bg = (context(), rectangle(), fill("white"))
        ctx = context(0.7, 0.05, 0.6, 0.9, mirror=Mirror(0, 0, 0.5))
        return compose(context(), (ctx, car, track, goal), bg)
    end
)

For comparison, the equivalent Python implementation in OpenAI Gym is much slower. Moreover, the POMDPSimulators and POMDPGifs packages can be used to generate animations in the REPL, Juno, or Jupyter notebooks that look like this:

[animated gif of the mountain car simulation]
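
An animation like this might be produced along the following lines (a sketch rather than a definitive recipe: `energize` is a hypothetical hand-written policy, and I am assuming `FunctionPolicy` from POMDPPolicies, `stepthrough` from POMDPSimulators, and `makegif` from POMDPGifs):

using POMDPSimulators
using POMDPPolicies
using POMDPGifs
import Cairo  # rendering backend for Compose

# a simple hand-written policy: accelerate in the direction of the current velocity
energize = FunctionPolicy(s -> s[2] < 0.0 ? -1.0 : 1.0)

# step through a simulation interactively in the REPL...
for (s, a, r) in stepthrough(mountaincar, energize, "s,a,r", max_steps=200)
    @show s, a, r
end

# ...or record the same policy as an animated gif
makegif(mountaincar, energize; filename="mountaincar.gif", fps=20)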

This represents a significant improvement over other frameworks like OpenAI Gym, especially in a classroom setting for a Reinforcement Learning, Decision Making under Uncertainty, or AI course. Specifically, it allows problems to be specified quickly and clearly with much more flexibility (e.g. with explicit transition probabilities rather than just a simulation model, as sketched below), without sacrificing speed compared to compiled languages, let alone interpreted ones. All QuickPOMDP models are compatible with the solvers written for POMDPs.jl and with RLInterface.jl, and there is also a bridge to ReinforcementLearning.jl that will hopefully be expanded in the future.
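
For a flavor of the explicit style, a hypothetical tiger-style problem might be written roughly like this (a sketch, assuming the `Deterministic`, `Uniform`, and `SparseCat` distribution helpers from POMDPModelTools and the keyword conventions shown in the QuickPOMDPs documentation):

using QuickPOMDPs
using POMDPModelTools  # Deterministic, Uniform, SparseCat

tiger = QuickPOMDP(
    states = [:tiger_left, :tiger_right],
    actions = [:open_left, :open_right, :listen],
    observations = [:heard_left, :heard_right],
    initialstate = Uniform([:tiger_left, :tiger_right]),
    discount = 0.95,

    # explicit transition distributions instead of a simulator
    transition = function (s, a)
        if a == :listen
            return Deterministic(s)  # listening leaves the tiger where it is
        else
            return Uniform([:tiger_left, :tiger_right])  # the problem resets after a door is opened
        end
    end,

    # noisy observation of the tiger's location while listening
    observation = function (a, sp)
        if a == :listen
            heard = sp == :tiger_left ? [:heard_left, :heard_right] : [:heard_right, :heard_left]
            return SparseCat(heard, [0.85, 0.15])  # 85% chance of hearing correctly
        else
            return Uniform([:heard_left, :heard_right])
        end
    end,

    reward = function (s, a)
        if a == :listen
            return -1.0
        elseif (a == :open_left) == (s == :tiger_left)  # opened the tiger’s door
            return -100.0
        else
            return 10.0
        end
    end
)

# any POMDPs.jl solver can then consume the model, e.g. (assuming QMDP.jl):
# using QMDP
# policy = solve(QMDPSolver(), tiger)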

We have also made major flexibility enhancements to the main POMDPs.jl v0.8 interface (thanks to @MaximeBouton, @lassepe, @rejuvyesh, @shushman), adding the ability to specify a custom dynamic decision network (DDN) structure. This paves the way for new problem classes, such as constrained POMDPs, and new solution methods that involve factorization.
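
For a small taste of what this enables, a solver can now ask the model’s dynamic decision network for exactly the nodes it needs. A minimal sketch, assuming the v0.8 `gen`/`DDNOut` generative interface and the `mountaincar` model defined above:

using POMDPs
using Random: MersenneTwister

# sample only the :sp (next state) and :r (reward) nodes of the DDN
sp, r = gen(DDNOut(:sp, :r), mountaincar, (-0.5, 0.0), 1.0, MersenneTwister(1))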

All of this has been made possible by the hard work of the Julia developers and community, and we are very grateful for everyone’s efforts. In my opinion, Julia is definitely the best tool for research and teaching in this area, and it has been extremely helpful to my own understanding!
