-
Notifications
You must be signed in to change notification settings - Fork 6
Refactor objective/gradient/hessian evaluation #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor objective/gradient/hessian evaluation #214
Conversation
does not seem to be used anywhere; also the method signature does not match Julia conventions
* replace has_meanstrcture and approximate_hessian fields with trait-like typeparams * remove methods for has_meanstructure-based dispatch
the intent of this commit is to refactor the API for objective, gradient and hessian evaluation, such that the evaluation code does not have to be duplicates across functions that calculate different combinations of those functions * introduce EvaluationTargets class that handles selection of what to evaluate * add evaluate!(EvalTargets, ...) methods for loss and imply objs that evaluate only what is required * objective!(), obj_grad!() etc calls are just a wrapper of evaluate!() with proper targets
for clarity
* explicitly use Cholesky factorization
remove unnecesary arguments
by dispatching over optimizer
by dispatching over optimizer
no wrapper required
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #214 +/- ##
==========================================
+ Coverage 70.52% 71.69% +1.17%
==========================================
Files 52 52
Lines 2446 2120 -326
==========================================
- Hits 1725 1520 -205
+ Misses 721 600 -121 ☔ View full report in Codecov by Sentry. |
to reduce allocations
to reduce allocations
to avoid extra allocation
Co-authored-by: Maximilian Ernst <[email protected]>
Thanks a lot for those changes, and sorry it took that long to review. I think this greatly simplifies the codebase (and also improves some of the linear algebra computations)! Something that is still to do is adding these changes to the online documentation on how to implement new loss functions and imply types - but I will do this later when I make a new release and will just add it to the according issue for now. Regarding the |
Thanks for bringing it up! My two main motivations to use the type parameters were:
To achieve just the 2nd goal, one can potentially avoid extra type parameters in the base abstract type and keep it at the level of individual implementations (e.g. replace That said, I don't know how critical is it to have a hessian-related trait. Potentially, it is a mechanism to query whether all loss terms support exact Hessian evaluation, which might be important for methods like Also, the design decision depends on your vision about the future SemImply implementations and their new features. I am not too much worried about the steep learning curve for the new users in this particular case. |
Okay, some thoughts on this: Some thoughts on hessian computation and standard errors: |
Just to clarify -- how do you see the type and declaration of the For
I agree that it is inconvenient, when the type of Hessian evaluation is fixed at the loss function creation, and it may result in the A far-fetched solution is to separate the constant part of |
That's exactly what I was thinking - we pay an extra struct field, but we buy that if we add another trait in the future, we don't break all users's code that implements a custom imply type. (Sorry, my code was wrong.)
It is... a workaround for now would be to construct a new loss function with all the fields from the old one except swapping the |
Maybe one final clarification -- do you mean that I can change the PR and add |
That's exactly what I meant.
Yes.
Thanks a lot!
This would also be nice. |
@Maximilian-Stefan-Ernst I've switched from type parameters to the fields, as we discussed. You can take a look at the last commit to confirm whether this is how you imagined it and whether it makes it easier for R users. I still think it's not "idiomatic" Julia :), but I am fine with this approach. I've also renamed |
a073af5
into
StructuralEquationModels:devel
Perfect, merged! |
Another iteration of cherry-picking the changes from #193.
The main part of this PR is the simplification of the evaluation API.
Currently, SemImply and SemLoss subtypes declare a separate method for each combination of objective, gradient or hessian requested to be evaluated (objective!(), objective_gradient!() asf).
This introduces a lot of code duplication, and complicates maintenance as any fix should be applied at multiple places.
The PR replaces it with a single evaluate!(objective, gradient, hessian, ...) call for the in-place evaluation, where the evaluation of objective, gradient, or hessian is skipped if nothing is supplied in the corresponding argument.
Since Julia compiles a separate version of the function for each combination of the input parameters, the new API does not incur any performance penalty and effectively achieves what the old code did.
For updating the internal state of SemImply subtypes, the equivalent update!(targets::EvaluationTargets, imply::SemImply, ...) method is introduced,
where EvaluationTargets is a simple structure that specifies (via its type parameters, i.e. at compile time), whether the SemImply internal fields required for objective, gradient and/or hessian needed to be updated.
Also, the PR converts meanstructure and approximatehessian from the Val{T} struct fields of SemImply into just type parameters (without the corresponding field).
That seems to be a more "idiomatic" way of declaring traits in Julia.
For some SEM loss types, ApproximateHessian trait is fixed to either NoHessian or ExactHessian; for others, like SemWLS, the type of Hessian (approximate or exact) could be specified at construction.
As another immediate benefit of evaluation API refactoring, the Optim.jl and NLopt.jl wrappers are simplified, since the new API evaluate!() call directly implements the semantics these packages expect.