-
Notifications
You must be signed in to change notification settings - Fork 36
Implement more consistent tracking of logp components via LogJacobianAccumulator
#998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Benchmark Report for Commit 049b3d3Computer Information
Benchmark Results
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## breaking #998 +/- ##
============================================
- Coverage 82.11% 82.07% -0.04%
============================================
Files 38 38
Lines 4031 4067 +36
============================================
+ Hits 3310 3338 +28
- Misses 721 729 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
LogJacobianAccumulator
LogJacobianAccumulator
+ more consistent tracking of logp components
LogJacobianAccumulator
+ more consistent tracking of logp componentsLogJacobianAccumulator
DynamicPPL.jl documentation for PR #998 is available at: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be reviewable. Truthfully, most of the changes are either accumulator boilerplate (I'll open a separate issue to try to reduce this), or finicky bits about where and what sign to accumulate logjac.
I don't recommend checking every one of them in review but I will say that the test suite did catch a lot of errors and force me to fix them carefully, so we can have some confidence that this does behave as expected.
The main design decision is probably the sign convention to use. This is covered in the top comment.
retval, vi_linked_result = DynamicPPL.evaluate!!(model, deepcopy(vi_linked)) | ||
# NOTE: Evaluating a linked VarInfo, **specifically when the transformation | ||
# is static**, will result in an invlinked VarInfo. This is because of | ||
# `maybe_invlink_before_eval!`, which only invlinks if the transformation | ||
# is static. (src/abstract_varinfo.jl) | ||
retval, vi_unlinked_again = DynamicPPL.evaluate!!(model, deepcopy(vi_linked)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say I fully appreciate the rationale for this slightly counterintuitive behaviour. I would be quite happy to remove it. At least changing the variable name hopefully makes it clearer what's going on (I spent ages trying to find out what was going wrong with the accumulators...).
istrans(vi::ThreadSafeVarInfo{<:SimpleVarInfo}, vn::VarName) = istrans(vi.varinfo, vn) | ||
istrans(vi::ThreadSafeVarInfo{<:SimpleVarInfo}) = istrans(vi.varinfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line happily fixes the following bug:
julia> using DynamicPPL
julia> svi = SimpleVarInfo()
SimpleVarInfo(NamedTuple(), 0.0)
julia> svi_linked = DynamicPPL.settrans!!(svi, true)
Transformed SimpleVarInfo(NamedTuple(), 0.0)
julia> istrans(svi_linked)
true
julia> istrans(DynamicPPL.ThreadSafeVarInfo(svi_linked))
false
After this PR, the last line will return true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very nice, some comments and questions.
src/abstract_varinfo.jl
Outdated
@@ -306,6 +361,19 @@ function acclogprior!!(vi::AbstractVarInfo, logp) | |||
return map_accumulator!!(acc -> acc + LogPriorAccumulator(logp), vi, Val(:LogPrior)) | |||
end | |||
|
|||
""" | |||
acclogjac!!(vi::AbstractVarInfo, logJ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit unsure about the name logJ
, which isn't snake_case. Do you have a reason to prefer it over logjac
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It matches the maths notation, and is consistent with Bijectors.jl:
pysm@ati:~/ppl/bi (main) $ rg logJ | wc -l
43
I don't know to what extent snake_case matters for things that are mathematical variables.
If all the float-accumulators are unified (and presumably the field will be called logp
or similar), will this still be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the unification I'm introducing a function called logp
, so the field name would remain. Happy to be consistent with Bijectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do realise that in the rest of DynamicPPL we use logjac
though, so for library-internal-consistency's sake we should change it. I'll do a big sed
later.
src/abstract_varinfo.jl
Outdated
vi_new = unflatten(vi, x) | ||
# Reset logjac to 0. | ||
if hasacc(vi_new, Val(:LogJacobian)) | ||
vi_new = map_accumulator!!(zero, vi_new, Val(:LogJacobian)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a chance that some mix-up of using a different invlink transform than what was originally used for linking would cause the logjac to actually be non-zero? Or would that always imply that quite a serious error has been made and we have no need to have well-defined behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, true, I guess somebody could do something horrific and use different StaticTransformations, in which case we should sum the logjac terms and add / subtract them as necessary.
src/accumulators.jl
Outdated
- `Base.copy(acc::T)` | ||
|
||
In these functions: | ||
- `val` is the new value of the random variable sampled from a new distribution (always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the distribution new?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of a typo 😅
src/simple_varinfo.jl
Outdated
vi_new = acclogprior!!(vi_new, logjac) | ||
# logjac should be zero for an unlinked VarInfo. | ||
if hasacc(vi_new, Val(:LogJacobian)) | ||
vi_new = map_accumulator!!(zero, vi_new, Val(:LogJacobian)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yay!
As discussed in today's meeting, we want to separate logjac from logprior so that we can more correctly calculate log probs in unlinked / model space.
This PR therefore introduces:
LogJacobianAccumulator
, which accumulates the log-jacobian of the link transform, if the variable is linked. That is, logjac is zero for unlinked variables and nonzero for linked variables (if the link transform is notidentity
).Warning
This represents a sign change with respect to how the Jacobian was previously treated. See below.
The main reason for this is because I primarily think of linking as an 'active' choice, that is to say, logjac should be nonzero when linked (i.e. a correction is needed to change the value of logp from the default, which is unlinked). This means that "including logjac" gets you
logprior_internal
andlogjoint_internal
, whereas "excluding logjac" meanslogprior
andlogjoint
.Previously, the logjac term was stuffed into$\log(p(\mathbf{x}))$ below.
LogPriorAccumulator
. Now,LogPriorAccumulator
only calculates the prior of the unlinked variables i.e.New logp-extracting functions to get the following quantities:
getlogjac
:getlogprior
:getlogprior_internal
:getlogjoint
:getlogjoint_internal
:Warning
Prior to this PR,$\log(q(\mathbf{y}))$ and $\log(q(\mathbf{y})) + \text{log-likelihood}$ . @mhauru and I agree that this behaviour is slightly unprincipled because the log prior (and log joint) should have a well-defined meaning independent of whether the VarInfo has been linked or not (which is an implementation detail that most users should not see). See TuringLang/Turing.jl#2617 (comment)
getlogprior
would returngetlogjoint
would returnConvention: Link transform
For a constrained variable$\mathbf{x} \sim p(\cdot)$ that is transformed to an unconstrained variable $\mathbf{y} \sim q(\cdot)$ , we have that
where
In Bijectors this is calculated as
In this PR,
logjac
always refers to this link transform Jacobian. The new accumulator always tracks this quantity.There are one or two occasions where instead the invlink transform Jacobian is calculated. This PR renames all such occurrences to
inv_logjac
:Once this is merged I will open a PR to the docs page to explain this. The notation used here follows directly on from what I previously wrote at https://turinglang.org/docs/developers/transforms/dynamicppl/.
Closes #992.