Delusional drone in RxInfer example

Hi

I am recreating the drone example from https://github.com/ReactiveBayes/RxInferExamples.jl/blob/main/drone/drone_simulation.ipynb. It occurred to me that the plot/animation is based on the model's posteriors. Does this actually match reality?

begin
  # roll the *actual* dynamics forward using the planned (posterior mean) controls
  traj = [collect(get_state(start))]

  for i in 1:40
    action = mean(results.posteriors[:u][i])
    new_state = state_transition(traj[i], action, drone, env, dt)
    push!(traj, new_state)
  end

  # animate the resulting trajectory
  @gif for k in 1:length(traj)-1
    # plot target
    p = scatter([target[1]], [target[2]]; label = "target", color = :red)

    # plot drone
    plot_drone!(p, drone, State(traj[k]...))

    xlims!(-2.5, 1.5)
    ylims!(-1.5, 1.5)
  end
end

[animation: drone_wrong]

Uh oh, that’s unexpected. Does anyone have any idea what’s going on? Since the model has full visibility of the state, my understanding is that its prediction should match reality pretty closely.

My analysis

First, a side note: it seems to me that the line τ = (Fl - Fr) * r in the function state_transition should have been τ = (Fr - Fl) * r to match the sign convention of sin and cos. This doesn’t affect the results, as it simply swaps the labels of the left and right engines; I just want to flag it to remove a source of confusion. A rough sketch of the convention I mean is below.
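To make the convention concrete, here is a minimal sketch of the dynamics I have in mind (this is not the notebook’s exact code; the state layout and parameter names are my assumptions):

function sketch_transition(s, Fl, Fr, m, J, r, g, dt)
  # s = [x, y, vx, vy, θ, ω]; Fl/Fr are the left/right engine forces
  x, y, vx, vy, θ, ω = s
  F = Fl + Fr               # total thrust along the body axis
  τ = (Fr - Fl) * r         # a stronger right engine rotates the drone counter-clockwise
  ax = -F * sin(θ) / m      # horizontal acceleration
  ay =  F * cos(θ) / m - g  # vertical acceleration minus gravity
  [x + vx*dt, y + vy*dt, vx + ax*dt, vy + ay*dt, θ + ω*dt, ω + (τ / J)*dt]
end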

Looking into the posteriors of the angle and angular velocity, it seems the model approximates the angular velocity pretty well:

# the last state component is the angular velocity
plot((d -> d[end]).(traj), label = "Actual")
plot!((d -> mean(d)[end]).(results.posteriors[:s]), label = "Posterior")

[plot: plot_ang_vel]

And the resulting integral (the cumulative sum, which up to a factor of dt should be the angle of the drone) looks very similar:

plot(cumsum((d -> d[end]).(traj)), label = "Actual")
plot!(cumsum((d -> mean(d)[end]).(results.posteriors[:s])), label = "Posterior")

[plot: plot_ang_vel_int]

However, the model's prediction of the angle is wildly off:

# the second-to-last state component is the angle
plot((d -> d[end-1]).(traj), label = "Actual")
plot!((d -> mean(d)[end-1]).(results.posteriors[:s]), label = "Posterior")

[plot: plot_ang]

Note that the integral of the posterior of the angular velocity is far away from the posterior of the drone angle.

The model seems to underestimate the size of the peak of the angular velocity. Since the angular velocity depends linearly on the forces, but the drone angle depends non-linearly on the forces, this leads me to wonder if some (linear?) approximation somewhere is messing things up.
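A quick way to check the mismatch numerically (assuming, as above, that the last state component is the angular velocity, the second-to-last is the angle, and dt is the model's time step):

# integrate the posterior mean angular velocity and compare with the posterior mean angle
ω_post = (d -> mean(d)[end]).(results.posteriors[:s])
θ_post = (d -> mean(d)[end-1]).(results.posteriors[:s])
plot(cumsum(ω_post) .* dt, label = "integrated angular velocity (posterior)")
plot!(θ_post, label = "angle (posterior)")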

Any thoughts are most welcome.


Hi @Andreas_Poulsen!

Sorry we missed the message.

There’s indeed something wrong.
We will look into this!

Cheers!


Hey @Andreas_Poulsen,

you correctly identified that the animation is based on the predicted states instead of the actual states (given the estimated controls).
The example in the RxInferExamples.jl repository only plans T = 40 steps ahead, without any real agent-environment interaction: no real state transition is executed, everything happens “in the mind” of the agent/generative model.
You can have a look at the original code in @bvdmitri’s PhD thesis while we update and debug the example in this repository.

I highly suspect that the model’s inability to capture system dynamics is the reason why the predicted states differ from the actual states given the estimated controls.
The model uses the unscented transform (UT) for nonlinear state estimation, which in my experience so far has not fared well.
UT has multiple parameters which require some tuning.
If you increase the time step size (dt) or change the parameters of the unscented transform, then you should be able to match prediction with reality.
I’ve created a small Pluto notebook in a separate branch where you can interactively change these (and other) parameters and see the effect of each parameter on the state estimation ability of the given model.
Just run Pluto (with run-pluto.sh) and try it out :slight_smile:
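For reference, passing custom UT parameters to the nonlinear state-transition node looks roughly like this (just a sketch; keyword names and defaults may differ slightly between RxInfer/ReactiveMP versions):

using RxInfer

# alpha, beta and kappa control the spread and weighting of the sigma points
drone_meta = @meta begin
    state_transition() -> DeltaMeta(method = Unscented(alpha = 1.0, beta = 0.0, kappa = 1.0))
end

# then pass meta = drone_meta to the infer(...) call alongside the model and data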


Thank you for taking the time to look into this and reply :smiley:

Indeed, tuning the unscented transform seems to bring the model into closer alignment with reality, but it still deviates somewhat, at least in my quick experiments. I looked at @bvdmitri’s PhD thesis, but I must admit I found it a bit too complex; I’ll give it another read. If incorporating direct environment feedback into the example is necessary to get accurate control estimation, I’ll also try having another look at that.

Thanks again :smiley:

Thanks for looking into this, @timn!

It seems that the modified demo in RxInferExamples has a different focus: it’s more about planning than executing the plan. I agree it can be a bit confusing, especially since the drone infers a plan that doesn’t really align with reality (the reason might indeed be approximation errors). So the drone is indeed a bit “delusional” :smiley:

We’ve also observed this as a broader issue with Active Inference agents biased toward their goals (as in this example). They “tend to assume” they can achieve the goal regardless of the actual circumstances. It’s good that you’ve caught it here.

In my original demo, I ran the simulation forward, got new observations, and re-inferred the plan at each time step to ensure better alignment with reality.
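Roughly, that loop looks like this (a sketch; infer_plan is a hypothetical wrapper around the example's infer call, re-run from the current state):

traj = [collect(get_state(start))]
for t in 1:40
    results = infer_plan(traj[end])                              # hypothetical: re-infer the plan from the current real state
    u = mean(results.posteriors[:u][1])                          # execute only the first planned control
    push!(traj, state_transition(traj[end], u, drone, env, dt))  # step the real environment forward
end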

To avoid confusion, we should either clarify the purpose of the modified example or adjust it to use the correct simulation.


So, I’ve had some time to work on this. Here, I’ve updated the Jupyter notebook to incorporate real-world feedback. If it’s okay with you, I’d like to add a little more explanatory text and then open a pull request to the original repo.

The agent exhibits a weird behaviour: it drops down before zooming up above the target, and finally descends to the target. Increasing the number of time steps makes this effect smaller. I don’t understand where the “dipping down” comes from, as it’s not part of the plan without real-world feedback. Maybe some of you have an idea here?

Another interesting note: when real-world feedback is included, tuning the hyperparameters of the unscented transform doesn’t seem to change much. I think it’s a good thing that hyperparameter tuning becomes less relevant in a more realistic setting.

Best, Andreas


Hey @Andreas_Poulsen, thanks for revisiting the Drone example.

I also ran a few tests and noticed that adjusting both the parameters of the unscented transform and the time step sizes between state transitions significantly improved prediction accuracy. From what I can tell, the root issue seems to be the agent’s poor approximation of the state transition function, which may stem from the unscented transform’s sensitivity to hyperparameter tuning (at least in my case). This could explain the “dipping down” behavior you observed.

I’ll take a deeper look when I get a chance and try to pinpoint the exact cause.
