Description
Currently, `_vectorized_initialstate` takes the `pomdp` and `ordered_states`, then uses `initialstate(::POMDP)` to initialize `b0`; however, SARSOP is able to begin from any point in the belief space, so it makes sense to allow users to provide an initial belief (perhaps in sparse form?) from which to commence SARSOP.
I'd be happy to write/submit a PR for this, but wanted to open an issue to see if it was something y'all would be willing to accept. 🙂
Below are the changes that I think would be necessary to support this. I could probably tackle this in the next week or two, if the PR would be acceptable.
# solver.jl
+ function POMDPTools.solve_info(solver::SARSOPSolver, pomdp::POMDP; b0=initialstate(pomdp))
+ tree = SARSOPTree(solver, pomdp; b0)
- function POMDPTools.solve_info(solver::SARSOPSolver, pomdp::POMDP)
- tree = SARSOPTree(solver, pomdp)
# the rest of the code ...
return pol, (; ...)
end
+ POMDPs.solve(solver::SARSOPSolver, pomdp::POMDP; b0=initialstate(pomdp)) =
- POMDPs.solve(solver::SARSOPSolver, pomdp::POMDP) =
first(solve_info(solver, pomdp; b0))
# tree.jl
+ function SARSOPTree(solver, pomdp::POMDP; b0=initialstate(pomdp))
- function SARSOPTree(solver, pomdp::POMDP)
+ sparse_pomdp = ModifiedSparseTabular(pomdp, b0)
- sparse_pomdp = ModifiedSparseTabular(pomdp)
# the rest of the codebase ...
return insert_root!(...)
end
# sparse_tabular.jl
+ function ModifiedSparseTabular(pomdp::POMDP, b0)
- function ModifiedSparseTabular(pomdp::POMDP)
S = ordered_states(pomdp)
# the rest of the codebase ...
+ b0 = _vectorized_initialstate(pomdp, S, b0)
- b0 = _vectorized_initialstate(pomdp, S)
return ModifiedSparseTabular(T, R, O, terminal, b0, discount(pomdp))
end
+ function _vectorized_initialstate(pomdp, S, b0)
- function _vectorized_initialstate(pomdp, S)
b0_vec = Vector{Float64}(undef, length(S))
@inbounds for i ∈ eachindex(S, b0_vec)
b0_vec[i] = pdf(b0, S[i])
end
return sparse(b0_vec)
end
So long as `b0` is guaranteed to be compatible with `pdf(b0, S[i])`, these are all the changes necessary. Otherwise, some handling would be needed, either to ensure that `b0` can be vectorized or to let folks specify how to achieve the equivalent of `pdf(b0, S[i])`.
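To make the `pdf` requirement concrete, here is a minimal sketch of the vectorization step on its own, using only the `SparseArrays` standard library. `DictBelief`, the local `pdf` method, and `vectorize_belief` are hypothetical stand-ins for illustration; in the actual package, `pdf` would be the one from the POMDPs.jl ecosystem, and the function would be `_vectorized_initialstate` as proposed above.

```julia
using SparseArrays

# Hypothetical stand-in for a user-supplied belief: a dictionary from states
# to probabilities. States that are absent have probability zero.
struct DictBelief{S}
    probs::Dict{S,Float64}
end

# Local stand-in for the `pdf(b0, s)` call used in the proposal (assumption:
# any belief type that supports such a call would work the same way).
pdf(b::DictBelief, s) = get(b.probs, s, 0.0)

# Mirrors the proposed `_vectorized_initialstate`, minus the `pomdp` argument,
# which the loop itself does not use.
function vectorize_belief(S, b0)
    b0_vec = Vector{Float64}(undef, length(S))
    for i in eachindex(S, b0_vec)
        b0_vec[i] = pdf(b0, S[i])
    end
    return sparse(b0_vec)
end

S = [:left, :middle, :right]                          # stand-in for ordered_states(pomdp)
b0 = DictBelief(Dict(:left => 0.25, :right => 0.75))  # user-supplied initial belief
b0_sparse = vectorize_belief(S, b0)
```

Any belief object with a suitable `pdf` method would pass through this loop unchanged, which is why the `pdf` compatibility of `b0` is the only real constraint here.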
Perhaps if there were functions in `POMDPTools` that support this kinda interface, that could be interesting but is way out of scope for this issue/PR (e.g. going from marginal beliefs to the relevant `SparseCat` and the like).