Semidefinite programs are challenging to solve, so it's crucial to exploit as much structure as possible when solving them.
This makes writing a generic SDP solver challenging, since it seems a tailored implementation would always be faster.
That's where the extensibility of MOI/JuMP can actually be really helpful.
One particular structure is that the matrices `A_i` in the SDPA format are actually low-rank.
This is the case for instance in the MAX-CUT problem, where they have rank 1, and in Sum-of-Squares (in particular with specific bases such as the interpolation basis).
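To make the rank-1 structure concrete, here is a minimal Python sketch (plain lists, no dependencies, made-up data) showing that the MAX-CUT constraint matrices `e_i e_i^T` have rank 1 and that `<e_i e_i^T, X>` just picks out a diagonal entry of `X`:

```python
# MAX-CUT SDP constraints: <e_i e_i^T, X> = 1 pins the diagonal of X.
# The constraint matrix A_i = e_i e_i^T is rank 1, so storing the
# factor e_i instead of the full n x n matrix saves O(n^2) memory.

def outer(u, v):
    """Outer product u * v^T as a list of lists."""
    return [[ui * vj for vj in v] for ui in u]

def frobenius(A, B):
    """Frobenius inner product <A, B> = sum_ij A_ij * B_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

e1 = [1.0, 0.0, 0.0]
A1 = outer(e1, e1)          # rank-1 constraint matrix

X = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]       # some symmetric X

# <A_1, X> picks out the (1, 1) entry of X.
print(frobenius(A1, X))     # 1.0
```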
Some solvers already support this:
- https://github.com/jump-dev/DSDP.jl
- https://github.com/davidsd/sdpb
- https://github.com/jump-dev/SDPT3.jl (this https://www.diva-portal.org/smash/get/diva2:471330/FULLTEXT01.pdf mentions that you might have to use a STRUL extension with it, not sure what's the status now)
- https://github.com/nanleij/ClusteredLowRankSolver.jl/ (@nanleij)
- https://github.com/kocvara/Loraine.m for which the pure-Julia implementation is soon to be open sourced (@kocvara)
- https://github.com/alpyurtsever/SketchyCGAL supports constraints on the trace, which can be expressed by using `LinearAlgebra.I` as `A[i]`
If we do make it part of the interface, support might also appear in other solvers (@mariohsouto and @joaquimg told me it would be easy to add in ProxSDP.jl, for instance).
This is actually supported by YALMIP, as detailed in https://www.diva-portal.org/smash/get/diva2:471330/FULLTEXT01.pdf. YALMIP does not even compute the matrix products but keeps them in symbolic form, so that the user does not even have to specify the structure. But let's assume that the user gives it explicitly for now.
I guess two important questions to see if it's worth it are:
- Does it improve performance a lot?
- Is it easy to add support for it?

I'll try to convince you that the answer is "Yes and Yes".
About 1., it improves performance quite a lot, and this has been reported in numerous places. For instance, https://www.diva-portal.org/smash/get/diva2:471330/FULLTEXT01.pdf says:

> To be fair, it should be pointed out that the theoretical idea exploited in this paper was incorporated already in LMI Lab (Gahinet et al. 1995), which was one of the first public implementations of an SDP solver, tailored specifically for control problems. However, many years of active research in the SDP field has led to much more efficient algorithms, and it is our goal to leverage on this. A major reason for pursuing the idea in this paper is that it is slightly embarrassing for people coming from the SDP field, that the now over 15 year old LMI Lab solver actually outperforms state-of-the-art general purpose SDP solvers on some small- to medium-scale control problems.
About 2., tl;dr: I detail below different natural ideas with their pros and cons; the short version is that #2198 is the one I think we should go for.
We have two options: defining a new set or defining a new function. We want to avoid defining new functions as much as possible, so let's first try defining a new set.
A first possibility is to merge the PSD set and the low-rank constraints together.
So we could have the constraint that `sum_i <A_i, X_i> = b`, where `<., .>` is the Frobenius inner product of symmetric matrices and the `A_i` are in LDLT form:
```julia
struct LDLT{T}
    L::Matrix{T}
    diagonal::Vector{T}
end

struct LowRankFrobeniusEqualTo{T}
    matrices::Vector{LDLT{T}}
    constant::T
end
```
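To spell out what the LDLT encoding buys us: with `A = L * Diagonal(diagonal) * L'`, the Frobenius product `<A, X>` reduces to a sum of quadratic forms and never needs the dense `A`. A hedged numeric sketch in Python (plain lists, made-up data, hypothetical helper names):

```python
# A rank-r matrix in "LDLT" form: A = L * diag(d) * L^T, with L an n x r
# factor and d a length-r diagonal. <A, X> then reduces to
# sum_k d[k] * (l_k^T X l_k), which only touches the factors.

def quad_form(x, M):
    """x^T M x for a vector x and matrix M (lists of lists)."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def ldlt_frobenius(L_cols, d, X):
    """<L diag(d) L^T, X> = sum_k d[k] * (l_k^T X l_k)."""
    return sum(dk * quad_form(lk, X) for lk, dk in zip(L_cols, d))

def dense(L_cols, d):
    """Reconstruct A = L diag(d) L^T, only for comparison."""
    n = len(L_cols[0])
    return [[sum(dk * lk[i] * lk[j] for lk, dk in zip(L_cols, d))
             for j in range(n)] for i in range(n)]

L_cols = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]   # columns of L (rank 2)
d = [1.0, -3.0]
X = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

A = dense(L_cols, d)
frob_dense = sum(a * x for ra, rx in zip(A, X) for a, x in zip(ra, rx))
print(ldlt_frobenius(L_cols, d, X), frob_dense)  # both print -4.0
```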
And then you would do:

```julia
X = Vector{MOI.VariableIndex}[]
for ...
    Xi, cXi = MOI.add_constrained_variables(model, MOI.PositiveSemidefiniteConeTriangle(...))
    push!(X, Xi)
end
MOI.add_constraint(model, MOI.VectorOfVariables(reduce(vcat, X)), MOI.LowRankFrobeniusEqualTo(A, 1.0))
```
And then of course, there would be a bridge from `F`-in-`LowRankFrobeniusEqualTo` into `F`-in-`EqualTo`.
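For intuition, here is a rough Python sketch (made-up data, hypothetical helper names) of the expansion such a bridge would perform: each low-rank `A` is densified into coefficients on the upper-triangle vectorization of `X`, with off-diagonal coefficients doubled so the resulting scalar affine expression equals `<A, X>`:

```python
# Sketch of what a LowRankFrobeniusEqualTo -> EqualTo bridge would do:
# expand A = L diag(d) L^T into dense coefficients on the column-wise
# upper-triangle vectorization of X, doubling off-diagonal entries
# (each appears twice in <A, X> since A and X are symmetric).

def dense(L_cols, d):
    n = len(L_cols[0])
    return [[sum(dk * lk[i] * lk[j] for lk, dk in zip(L_cols, d))
             for j in range(n)] for i in range(n)]

def triangle_coefficients(L_cols, d):
    """Coefficients c such that dot(c, vec_tri(X)) == <L diag(d) L^T, X>."""
    A = dense(L_cols, d)
    n = len(A)
    coeffs = []
    for j in range(n):
        for i in range(j + 1):
            coeffs.append(A[i][j] if i == j else 2 * A[i][j])
    return coeffs

def vec_tri(X):
    """Column-wise upper-triangle vectorization of a symmetric matrix."""
    n = len(X)
    return [X[i][j] for j in range(n) for i in range(j + 1)]

L_cols, d = [[1.0, 1.0]], [2.0]              # rank-1: A = 2 * ones(2, 2)
X = [[3.0, 1.0], [1.0, 5.0]]
c = triangle_coefficients(L_cols, d)
lhs = sum(ci * xi for ci, xi in zip(c, vec_tri(X)))
print(c, lhs)   # [2.0, 4.0, 2.0] 20.0, and indeed <A, X> = 2*(3+1+1+5) = 20
```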
There are a few things to think about here.

What if the user gives an invalid vector of variables? The vector of variables needs to be a concatenation of PSD matrices of the right sizes for solvers to work. If a user gives a vector of variables that isn't like that, we should bridge it into `EqualTo` so that the solver still works, but there is no way to see that from the constraint type, so we won't be able to detect it when bridging and we will pass the constraint as-is to the solver. I think we should simply not complicate things for a use case that does not exist: the user is not allowed to use non-SDP variables in this set, similarly to the `Indicator` case where the first variable has to be binary. If they don't follow this, they should expect an error.
What about variable bridges? If a variable of the `VectorOfVariables` gets variable-bridged, then it will turn into a `VectorAffineFunction`, which won't be supported anymore by the solver, so it will then be bridged into `ScalarAffineFunction`-in-`EqualTo`. This is actually a good thing, because it means the vector of variables wasn't a concatenation of PSD variables, so there is no worry on this side.
What if we dualize? We of course want this to work with Dualization, so what's the dual of this set? Just like `EqualTo`, this is not a cone, so it does not really have a dual. However, we can make it work by implementing `Dualization._dual_set` so that one variable is added to the dual, and then, in `MOI.Utilities.dot_coefficients`, we can multiply this variable by the matrices `A_i`, expressed in symmetric-vectorized form (without the low-rank structure), so that it would be added to the `VectorAffineFunction`-in-`PSDCone` along with the dual of the other constraints of each PSD variable. The sad part of this story is that we lose the low-rank structure.
In the dual, to keep the low-rank structure (assuming the PSD matrix only appears in `LowRankFrobeniusEqualTo` constraints and not in any `EqualTo` constraint), we would need something like

```
sum y_i A_i is PSD
```
where the `A_i` are low-rank. But how do we represent the `sum y_i A_i` part?
Maybe with a set:

```julia
struct SumLowRankPSD{T}
    side_dimension::Int
    matrices::Vector{LDLT{T}}
end
```
Wait, isn't that a convex cone? So why don't we use the dual of that in standard conic form so that it's easy to dualize?
Ok, so according to [Rockafellar, "Convex Analysis", Corollary 16.3.2], `dual(adjoint(M), S) = preimage(M, dual(S))`.
So here we have `M(y) = sum y_i A_i` and `<M(y), Q> = sum y_i <A_i, Q> = <y, (<A_1, Q>, ..., <A_m, Q>)>`, so the dual set is
```julia
struct LowRankFrobeniusOfPSD{T}
    side_dimension::Int
    matrices::Vector{LDLT{T}}
end
```
which is defined as the set of vectors of the form `(<matrices[1], Q>, ..., <matrices[m], Q>)` where `Q` is PSD.
Here, that's preventing `Q` from being used in a scalar product with anything other than low-rank equality constraints, because `Q` is not part of the set itself!
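The adjoint identity behind this derivation is easy to check numerically: for `M(y) = sum y_i A_i` we have `<M(y), Q> = <y, (<A_1, Q>, ..., <A_m, Q>)>`. A small Python sketch with made-up data:

```python
# Numeric check of the adjoint used in the duality argument:
# for M(y) = sum_i y_i * A_i, the adjoint is M*(Q) = (<A_1, Q>, ..., <A_m, Q>),
# i.e. <M(y), Q> == <y, M*(Q)> for every y and Q.

def frobenius(A, B):
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def M(y, mats):
    """M(y) = sum_i y_i * A_i."""
    n = len(mats[0])
    return [[sum(yi * Ai[r][c] for yi, Ai in zip(y, mats))
             for c in range(n)] for r in range(n)]

def M_adjoint(Q, mats):
    """M*(Q) = (<A_1, Q>, ..., <A_m, Q>)."""
    return [frobenius(Ai, Q) for Ai in mats]

A1 = [[1.0, 2.0], [2.0, 4.0]]    # rank-1: (1, 2)(1, 2)^T
A2 = [[1.0, 0.0], [0.0, 0.0]]    # rank-1: e1 e1^T
y = [0.5, -1.5]
Q = [[2.0, 1.0], [1.0, 3.0]]     # some symmetric Q

lhs = frobenius(M(y, [A1, A2]), Q)
rhs = sum(yi * zi for yi, zi in zip(y, M_adjoint(Q, [A1, A2])))
print(lhs, rhs)   # both sides equal 6.0
```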
If we want to allow `Q` to be used in classical `ScalarAffineFunction`-in-`EqualTo` constraints, then we need to consider that the dual will become

```
sum y_i A_i + sum z_i B_i is PSD
```

where the `A_i` are low-rank and the `B_i` are not low-rank.
The `sum z_i B_i` part would be represented by a `VectorAffineFunction`, but how do we represent the `sum y_i A_i` part?
Maybe with a set:

```julia
struct PlusSumLowRankPSD{T}
    side_dimension::Int
    matrices::Vector{LDLT{T}}
end
```
where the first `length(matrices)` dimensions would be `y` and the last dimensions would be `sum z_i B_i`.
Wait, so the first dimensions should be `VectorOfVariables` and the last ones should be a `VectorAffineFunction`, isn't that an issue? No, we are already doing that for `IndicatorSet`.
Again, according to [Rockafellar, "Convex Analysis", Corollary 16.3.2], `dual(adjoint(M), S) = preimage(M, dual(S))`.
So here we have `M(y, x) = x + sum y_i A_i` and `<M(y, x), Q> = sum y_i <A_i, Q> + <x, Q> = <(y, x), ((<A_i, Q>)_i, Q)>`,
so the set is

```julia
struct LowRankFrobeniusAndPSD{T}
    side_dimension::Int
    matrices::Vector{LDLT{T}}
end
```
which is defined as the set of vectors of the form `(<matrices[1], Q>, ..., <matrices[m], Q>, Q)` where `Q` is PSD. Note that this time we include `Q` at the end.
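This extended adjoint can also be checked numerically: `<M(y, x), Q> = sum y_i <A_i, Q> + <x, Q>`, which is exactly why `Q` itself is appended at the end of the vector. A small Python sketch with made-up data:

```python
# Numeric check of the extended adjoint: for M(y, x) = x + sum_i y_i * A_i
# (x a free symmetric matrix), <M(y, x), Q> splits into the low-rank
# pairings (<A_i, Q>)_i plus the plain pairing <x, Q>.

def frobenius(A, B):
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A1 = [[0.0, 1.0], [1.0, 0.0]]
A2 = [[1.0, 0.0], [0.0, -1.0]]
y = [2.0, 3.0]
x = [[1.0, 0.5], [0.5, 1.0]]
Q = [[4.0, 1.0], [1.0, 2.0]]

# M(y, x) = x + sum_i y_i * A_i
Myx = [[x[r][c] + sum(yi * Ai[r][c] for yi, Ai in zip(y, [A1, A2]))
        for c in range(2)] for r in range(2)]

lhs = frobenius(Myx, Q)
rhs = sum(yi * frobenius(Ai, Q) for yi, Ai in zip(y, [A1, A2])) + frobenius(x, Q)
print(lhs, rhs)   # both sides equal 17.0
```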
So I think there are two options here.

- The first one is the pair of dual cones `PlusSumLowRankPSD` and `LowRankFrobeniusAndPSD`. It's good because it also allows `Q` to be used outside, but if a solver only supports matrices with pure low-rank structure, or only matrices with no low-rank structure, then it won't know how to communicate that with `supports_constraint`.
- The second one, `LowRankFrobeniusOfPSD` and `SumLowRankPSD`, is simpler and allows solvers that only support that case to communicate it, but we should check which solvers are in this case, and it does not really allow `Q` to be used in the objective. So even the MAX-CUT problem, which has `n` rank-1 constraints on one `n x n` PSD matrix and a linear objective, wouldn't fall into that category because of the objective. The dual version of this is that the symmetric matrix of the objective needs to be in the second part of the `constants` field of the `VectorAffineFunction` in the `PlusSumLowRankPSD` constraint.
So I would probably start experimenting with the first option and see what happens.