Support for custom initial belief (e.g., solving many POMDPs with varying beliefs) #20

@jmuchovej

Currently, _vectorized_initialstate takes pomdp and ordered_states, then uses initialstate(::POMDP) to initialize b0. However, SARSOP can begin from any point in the belief space, so it makes sense to let users provide an initial belief (perhaps in sparse form?) from which to commence SARSOP.

I'd be happy to write/submit a PR for this, but wanted to open an issue to see if it was something y'all would be willing to accept. 🙂

Below are the changes I think would be necessary to support this. If such a PR would be acceptable, I could probably tackle it in the next week or two.
# solver.jl
+ function POMDPTools.solve_info(solver::SARSOPSolver, pomdp::POMDP; b0=initialstate(pomdp))
+     tree = SARSOPTree(solver, pomdp; b0)
- function POMDPTools.solve_info(solver::SARSOPSolver, pomdp::POMDP)
-     tree = SARSOPTree(solver, pomdp)
    # the rest of the code ...
    return pol, (; ...)
end

+ POMDPs.solve(solver::SARSOPSolver, pomdp::POMDP; b0=initialstate(pomdp)) =
+     first(solve_info(solver, pomdp; b0))
- POMDPs.solve(solver::SARSOPSolver, pomdp::POMDP) =
-     first(solve_info(solver, pomdp))

# tree.jl
+ function SARSOPTree(solver, pomdp::POMDP; b0=initialstate(pomdp))
- function SARSOPTree(solver, pomdp::POMDP)
+    sparse_pomdp = ModifiedSparseTabular(pomdp, b0)
-    sparse_pomdp = ModifiedSparseTabular(pomdp)
    # the rest of the codebase ...
    return insert_root!(...)
end

# sparse_tabular.jl
+ function ModifiedSparseTabular(pomdp::POMDP, b0)
- function ModifiedSparseTabular(pomdp::POMDP)
    S = ordered_states(pomdp)
    # the rest of the codebase ...
+     b0 = _vectorized_initialstate(pomdp, S, b0)
-     b0 = _vectorized_initialstate(pomdp, S)
    return ModifiedSparseTabular(T, R, O, terminal, b0, discount(pomdp))
end

+ function _vectorized_initialstate(pomdp, S, b0)
- function _vectorized_initialstate(pomdp, S)
    b0_vec = Vector{Float64}(undef, length(S))
    @inbounds for i ∈ eachindex(S, b0_vec)
        b0_vec[i] = pdf(b0, S[i])
    end
    return sparse(b0_vec)
end

So long as b0 is guaranteed to support pdf(b0, S[i]), these are all the changes necessary. Otherwise, there would need to be some handling that either ensures b0 can be vectorized or lets folks specify how to compute the equivalent of pdf(b0, S[i]).
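
To make the second option concrete, here is a minimal sketch of a vectorizer that accepts a user-supplied density function. It uses only the SparseArrays standard library so it runs on its own. The names vectorize_belief, belief_pdf, and the pdf_fn keyword are hypothetical and not part of NativeSARSOP or POMDPTools, and the Dict-based belief is just a stand-in for whatever belief type a user might pass.

```julia
using SparseArrays: sparse, SparseVector, nnz

# Hypothetical stand-in for a user-supplied belief: a Dict from state to probability.
# Assumption: states missing from the Dict have probability zero.
belief_pdf(b0::AbstractDict, s) = get(b0, s, 0.0)

# Hypothetical helper: vectorize an arbitrary belief against a fixed state ordering.
# `pdf_fn` plays the role of `pdf(b0, S[i])` for belief types that do not support `pdf`.
function vectorize_belief(S, b0; pdf_fn=belief_pdf)
    b0_vec = Vector{Float64}(undef, length(S))
    for i in eachindex(S, b0_vec)
        b0_vec[i] = pdf_fn(b0, S[i])
    end
    # Reject inputs that are not (approximately) a probability distribution over S.
    total = sum(b0_vec)
    isapprox(total, 1.0; atol=1e-8) ||
        throw(ArgumentError("belief sums to $total, expected 1"))
    return sparse(b0_vec)
end

# Usage: three ordered states, with all belief mass on the first two.
S = [:left, :right, :middle]
b0 = Dict(:left => 0.85, :right => 0.15)
b0_sparse = vectorize_belief(S, b0)
```

Whether the normalization check belongs in the solver or should be left to the caller is a design choice; it is included here only to show where such handling could live.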


Perhaps functions in POMDPTools that support this kind of interface would be interesting (e.g., going from marginal beliefs to the relevant SparseCat and the like), but that is way out of scope for this issue/PR.
