Replies: 5 comments 11 replies
-
It has been a long time since I worked on the mountain car example, but as far as I remember the agent simply underestimates its own momentum, which leads to it overshooting the goal and then slowly moving back towards it. To verify whether that is true I would have to spend some time on the example again (which I can do in a few days at the earliest), or you can plot its predictions (mean ± std from the posterior predictive variance) and see how well it actually estimates.
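A minimal sketch of that check, assuming the per-step predictions have been collected as `Normal` distributions (the `predictions` and `observed` names below are stand-ins for whatever your experiment logs, not names from the example):

```julia
using Distributions, Plots

# Stand-in data; in practice these would come from the agent's one-step-ahead
# posterior predictive distributions and the environment's logged positions.
predictions = [Normal(sin(t / 10), 0.1) for t in 1:100]
observed    = [sin(t / 10) + 0.05 * randn() for t in 1:100]

μ = mean.(predictions)   # posterior predictive means
σ = std.(predictions)    # posterior predictive standard deviations

plot(1:100, μ; ribbon = σ, label = "predicted mean ± std", xlabel = "time step")
scatter!(1:100, observed; label = "observed position", markersize = 2)
```

If the observed trajectory leaves the ribbon around the goal, the agent's predictive model is indeed underestimating the momentum there.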
-
Summoning @ThijsvdLaar to this thread
-
This can be as simple as an approximation error introduced by the
-
I've been summoned. This is a good question, and examples like these can be notoriously hard to debug. The main reason is that the dynamic behaviour of the agent results from an interplay between all kinds of precisions in convoluted ways. I'll write some thoughts below that may give some direction.
[1] Van de Laar, T. W., & De Vries, B. (2019). Simulating active inference processes by message passing. Frontiers in Robotics and AI, 6, 20.
-
Alright, I did the hard part and debugged the example. I spent some time creating a Pluto notebook that loads the results from running an experiment of the "active inference" agent on the mountain car environment, and then plots the posteriors for every random variable in the model at every time step.
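The core of such a notebook is small. Roughly (a sketch, with a hypothetical `experiment.jld2` file and the assumption that each variable's posteriors were saved as a vector of `Normal`s):

```julia
using JLD2, Distributions, Plots

# Load a Dict mapping variable names to their per-step posteriors
# (hypothetical file name and layout; adjust to however the results were saved).
results = load("experiment.jld2")

# One panel per random variable: posterior mean over time, ± std as a ribbon.
panels = [plot(mean.(post); ribbon = std.(post), title = name, label = nothing)
          for (name, post) in results]
plot(panels...; layout = length(panels))
```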
We plot the control at the first step, and we also plot the estimated engine forces stored in the model. If we plot the estimates of the state transitions and look at the first dimension (the change in position, i.e. the velocity caused by the control), the estimates are good in the beginning, but there is also a spike of bad estimates later on.

This becomes apparent when we look at the plots of the estimates for the (future) observations: the estimates for our position are OFF when we stop moving!

This is due to bad estimates of the dynamics, and especially the bad estimates of the change in position caused by the controls. This means that when we plot the landscape of the agent at that point in time, the agent plans from a position estimate that does not match reality.
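One way to surface exactly this failure (a sketch with stand-in data; `true_positions` and `Δx_posteriors` are hypothetical names for the logged environment states and the agent's posteriors over the control-induced change in position):

```julia
using Distributions, Plots

# Stand-in data; replace with the logged trajectory and the corresponding
# posteriors from the experiment.
true_positions = cumsum(0.01 .* randn(100))
Δx_posteriors  = [Normal(0.01 * randn(), 0.02) for _ in 1:99]

true_Δx = diff(true_positions)     # actual per-step change in position
est_Δx  = mean.(Δx_posteriors)     # agent's estimated change in position

plot(true_Δx; label = "actual Δposition", xlabel = "time step")
plot!(est_Δx; ribbon = std.(Δx_posteriors), label = "estimated Δposition ± std")
```

Wherever the two curves diverge and the ribbon fails to cover the actual values, the agent's dynamics model is confidently wrong, which is exactly the region where the behavior breaks down.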
The actual position of the car is not where the agent estimates it to be.

Note that there is another UI element to enable/disable showing the variances of each posterior (disabled by default, because the large variances dominate the plot and flatten the means). Also, just to advertise our own work here.¹

¹ Just to rant a little bit: more than 3 years ago I spent a long time implementing a framework out of my frustrations with debugging the mountain car agent-environment interaction. My framework allowed you to run any predefined (active inference) agent on any predefined environment, and it then automatically provided you with plots such as agent-environment interaction plots and, for probabilistic agents, their beliefs over time, including, at every step, the beliefs over the horizon. This was meant to help people new to active inference understand the decision-making of the agent, but also to satisfy curiosity, e.g. by playing around with hyperparameters and seeing their effects on the behavior of the agents. The idea for this framework came from RL Baselines3 Zoo. Something else came up, and a year later I could no longer publish the idea as a paper without additional work. However, since this problem is still not solved, I might spend some time working on a solution again.
-
Hi, I've been looking at the Active Inference Mountain Car example and I was wondering about the final state of the car after it reaches the goal. From my understanding of the goal prior, the agent will try to find a course of action to get there by time t = T and then maintain that position. However, in the example results, the car travels past the target position and slowly drifts back towards it. Why does the agent do this? Is there a way to modify the goal prior or another part of the model so that the car stays at the target position indefinitely?

I'm new to active inference and would appreciate any insight :)
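For concreteness, here is my rough mental model of the goal prior (a sketch of what I understand, not the example's actual code; `x_target`, `T`, and the tiny variance are placeholders). Would replacing the single end-of-horizon goal with a goal held at every future step be the right kind of change?

```julia
using Distributions

x_target = 0.5    # placeholder target position
T        = 20     # placeholder planning horizon

# My understanding of the current setup: a tight goal prior on the final step only.
goal_at_T = Normal(x_target, 1e-4)

# The kind of change I have in mind: hold the goal at every step of the horizon,
# so the agent is also penalized for drifting away after arriving.
goals_held = [Normal(x_target, 1e-4) for t in 1:T]
```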
Thank you!