Description
Problem: The current implementation only solves the full multistage problem in one go. The existing code contains a simplified, manual `rrule` for a per-stage `get_next_state` function, but it does not compute accurate sensitivities. The new DiffOpt API (PR #281 for implicit differentiation and PR #303 for objective sensitivity) allows us to obtain exact gradients with respect to parameter variables.
Proposed solution:

- Only allow parameters to be `MOI.Parameter`s: i.e., the only option should be what is currently under `:Param` in `variable_to_parameter` (line 1 at 5f413f5):

  ```julia
  function variable_to_parameter(model::JuMP.Model, variable::JuMP.VariableRef; initial_value=0.0, deficit=nothing, param_type=:Param)
  ```
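  One plausible reduction of that function, keeping only the `MOI.Parameter` path, could look like this sketch (the `deficit` handling is elided, and the way the variable is tied to the parameter is an assumption, not necessarily what the current code does):

  ```julia
  using JuMP

  # Sketch: every parameter becomes a JuMP variable constrained to an
  # MOI.Parameter set (the current `:Param` branch); the other
  # `param_type`s and the `deficit` keyword are dropped here.
  function variable_to_parameter(model::JuMP.Model, variable::JuMP.VariableRef;
          initial_value = 0.0)
      p = @variable(model, set = MOI.Parameter(initial_value))
      # Fix the original decision variable to the parameter's value via a
      # linear constraint, so DiffOpt can differentiate through it.
      @constraint(model, variable == p)
      return p
  end
  ```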
- Per-stage solve: make sure the function `simulate_stage` is well implemented and that it solves a single stage, with parameters (incoming state, realized uncertainty, target state) exposed via DiffOpt's parameter interface.
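  For example, `simulate_stage` could look like the sketch below; the argument names and the assumption that the parameter references are passed in alongside the stage model are illustrative, not the package's actual API:

  ```julia
  using JuMP

  # Sketch of a per-stage solve. `state_in_p`, `uncertainty_p`, and
  # `target_p` are assumed to be vectors of variables constrained to
  # MOI.Parameter sets; the real signature may differ.
  function simulate_stage(stage::JuMP.Model,
          state_in_p, uncertainty_p, target_p,
          state_in::Vector{Float64}, uncertainty::Vector{Float64},
          target::Vector{Float64})
      # Load this stage's realization into the parameters.
      set_parameter_value.(state_in_p, state_in)
      set_parameter_value.(uncertainty_p, uncertainty)
      set_parameter_value.(target_p, target)
      optimize!(stage)
      return objective_value(stage)
  end
  ```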
- Fix the `get_next_state` `rrule`: use DiffOpt's reverse-mode differentiation API inside the `_pullback`. This ensures the gradient of the realized state with respect to the target and incoming state is exact.
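  Assuming the stage model is built with `DiffOpt.diff_optimizer`, the pullback could be sketched roughly as follows (the `get_next_state` signature and parameter bookkeeping are illustrative, and the exact DiffOpt attribute names should be checked against the merged PRs):

  ```julia
  using ChainRulesCore, DiffOpt, JuMP

  # Sketch of the corrected pullback. `next_state_vars` are the JuMP
  # variables holding the realized state; `params` are the MOI.Parameter
  # variables (incoming state, target state).
  function ChainRulesCore.rrule(::typeof(get_next_state), model, params, next_state_vars)
      y = get_next_state(model, params, next_state_vars)
      function pullback(ȳ)
          # Seed the reverse pass with the cotangent of the realized state.
          for (v, dv) in zip(next_state_vars, ȳ)
              MOI.set(model, DiffOpt.ReverseVariablePrimal(), v, dv)
          end
          DiffOpt.reverse_differentiate!(model)
          # Exact sensitivities w.r.t. each parameter.
          dp = [MOI.get(model, DiffOpt.ReverseConstraintSet(),
                        ParameterRef(p)).value for p in params]
          return (NoTangent(), NoTangent(), dp, NoTangent())
      end
      return y, pullback
  end
  ```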
- Use objective sensitivity: with DiffOpt's forthcoming objective-sensitivity API, compute the derivative of the stage objective with respect to the parameter variables (target state, penalty weight) directly; i.e., change `DecisionRules.jl/src/simulate_multistage.jl` line 203 (at 5f413f5) from

  ```julia
  return MOI.get(JuMP.owner_model(v), POI.ParameterDual(), v)
  ```

  to just:

  ```julia
  dual(v)
  ```
- Testing & examples: Provide a test comparing the new stagewise gradient computation to the existing dual-based method on small instances (e.g., the battery example). Include a demonstration of training a policy using the stagewise approach.
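Such a test could be sketched as follows; `build_battery_stage`, `stage_gradient_diffopt`, and `stage_gradient_dual` are hypothetical helpers standing in for the battery example and the two gradient paths:

```julia
using Test

# Sketch: compare the new stagewise DiffOpt gradient against the
# existing dual-based computation on a tiny battery instance.
@testset "stagewise gradient matches dual-based gradient" begin
    stage = build_battery_stage()                  # hypothetical small instance
    target = [0.5]                                 # target-state parameter value
    g_stagewise = stage_gradient_diffopt(stage, target)  # new stagewise path
    g_dual = stage_gradient_dual(stage, target)          # existing dual-based path
    @test g_stagewise ≈ g_dual atol = 1e-6
end
```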