So there already seem to be some problems with your actions function, where
function A(s::State)
    return [(Action(a)) for a in Iterators.product(0:12.-s.s2[1],0:12-s.s2[2]) if sum(a)<=s.s1]
end
does not work, and might not do what you expect. First, Julia complains about 0:12.-s.s2[1] because it is not clear whether you mean 0:12.0 - s.s2[1] or 0:12 .- s.s2[1]. Once you add a space to resolve that, the next question is whether you want 0:(12 .- s.s2[1]) or (0:12) .- s.s2[1]. Since subtraction binds tighter than the range colon, it is the first one that is actually computed.
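To make the precedence point concrete, here is a small sketch. The State and Action structs are hypothetical stand-ins, since I don't know how yours are defined; I am only assuming that s1 is an integer budget and s2 is a pair of integers, and that you meant the upper bound of each range to be 12 minus the current value.

```julia
# Subtraction binds tighter than `:`, so it only affects the upper bound.
upper_only = 0:12 .- 2      # parsed as 0:(12 .- 2), i.e. 0:10
shifted    = (0:12) .- 2    # shifts the whole range, i.e. -2:10

# Hypothetical stand-ins for the types in the question (assumed field layout).
struct State
    s1::Int
    s2::Tuple{Int,Int}
end

struct Action
    a::Tuple{Int,Int}
end

# Explicit parentheses remove both the parse error and the ambiguity.
function A(s::State)
    r1 = 0:(12 - s.s2[1])
    r2 = 0:(12 - s.s2[2])
    return [Action(a) for a in Iterators.product(r1, r2) if sum(a) <= s.s1]
end

acts = A(State(1, (11, 10)))   # candidate pairs from 0:1 and 0:2 with sum <= 1
```

If you actually wanted the shifted range, wrap 0:12 in parentheses instead.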
For the actual question, I'm not quite sure exactly what you want. My understanding is that you want to loop over all states and actions, and if an action is not valid for a certain state, update some table with a large negative value? If that is it, you could probably do it somewhat like this, though it might not be the most efficient way:
for s in states(mdp)
    valid_actions = A(s)
    for a in actions(mdp)
        if !(a in valid_actions)
            solver.Q_vals[state2idx(s), act2idx(a)] = -9999
        end
    end
end
Here I assume that you have functions (state2idx and act2idx above) that map a state and an action to their indices in the table.
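If you don't have such functions yet, one possible way to build them is a pair of lookup dictionaries. This is only a sketch on a made-up toy problem: all_states, all_actions, valid_actions and the validity rule are invented for illustration, and Q is a plain matrix standing in for whatever table your solver uses.

```julia
# Hypothetical toy problem: a state is an integer budget, an action a pair.
all_states  = [0, 1, 2]
all_actions = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Lookup tables from state/action to row/column index.
state_index  = Dict(s => i for (i, s) in enumerate(all_states))
action_index = Dict(a => j for (j, a) in enumerate(all_actions))

state2idx(s) = state_index[s]
act2idx(a)   = action_index[a]

# Made-up validity rule: an action is valid if it fits within the budget.
valid_actions(s) = [a for a in all_actions if sum(a) <= s]

# Plain matrix standing in for the solver's Q table.
Q = zeros(length(all_states), length(all_actions))

for s in all_states
    valid = valid_actions(s)
    for a in all_actions
        if !(a in valid)
            Q[state2idx(s), act2idx(a)] = -9999   # mask the invalid pair
        end
    end
end
```

With your own struct types as keys this should work the same way as long as they are immutable; for mutable structs you would have to define == and hash yourself.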
That was what I meant by assuming you had a function doing the conversion: you somehow have to choose a mapping (or maybe one already exists in the POMDPs package) that says which state/action gets mapped to which index.
for s in states(mdp)
    valid_actions = A(s)
    for a in actions(mdp)
        if !(a in valid_actions)
            solver.Q_vals[POMDPs.stateindex(mdp, s), POMDPs.actionindex(mdp, a)] = -9999
        end
    end
end
But the error was:
ERROR: MethodError: no method matching setindex!(::Nothing, ::Int64, ::Int64, ::Int64)
That seems to say that solver.Q_vals is nothing, so there is no table to write into yet. Are you sure that this is the field you are supposed to use? I'm on my phone right now, so I haven't looked it up.
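For what it's worth, here is a minimal sketch of how that kind of error comes about and one way around it. ToySolver is a made-up struct, not the real solver type; I am only assuming the real one has a Q_vals field that starts out as nothing, and the 3x4 table size is arbitrary.

```julia
# Hypothetical solver whose table is not allocated at construction time.
mutable struct ToySolver
    Q_vals::Union{Nothing, Matrix{Float64}}
end

solver = ToySolver(nothing)

# Indexed assignment lowers to setindex!(solver.Q_vals, -9999, 1, 2),
# which has no method when the first argument is `nothing`.
err = try
    solver.Q_vals[1, 2] = -9999
    nothing
catch e
    e
end

# Allocate the table first, then the same assignment goes through.
if isnothing(solver.Q_vals)
    solver.Q_vals = zeros(3, 4)
end
solver.Q_vals[1, 2] = -9999
```

Whether preallocating is the right thing for your solver depends on how it treats an existing table when solving, so I would check that before relying on it.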