Skip to content

Conversation

ferrine
Copy link
Member

@ferrine ferrine commented Nov 30, 2022

Implementing graph replace that preserves references from the inherited graph (#19)

Here are a few important guidelines and requirements to check before your PR can be merged:

  • There is an informative high-level description of the changes.
  • The description and/or commit message(s) references the relevant GitHub issue(s).
  • pre-commit is installed and set up.
  • The commit messages follow these guidelines.
  • The commits correspond to relevant logical changes, and there are no commits that fix changes introduced by other commits in the same branch/BR.
  • There are tests covering the changes introduced in the PR.

Don't worry, your PR doesn't need to be in perfect order to submit it. As development progresses and/or reviewers request changes, you can always rewrite the history of your feature/PR branches.

If your PR is an ongoing effort and you would like to involve us in the process, simply make it a draft PR.

Copy link
Member

@aseyboldt aseyboldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot, but maybe we can improve the condation_subset function a bit, or at least make it easier to understand what it actually does?

@ferrine
Copy link
Member Author

ferrine commented Dec 1, 2022

I would also slightly refactor is_in_ancestors though, my implementation for is_dependent_on does exactly the same thing but for nodes. Adding singledispatch would make API more consistent, But it can go separately

@ferrine ferrine force-pushed the graph-replace branch 2 times, most recently from 6de138d to f009a76 Compare December 1, 2022 12:06
@ferrine
Copy link
Member Author

ferrine commented Dec 1, 2022

I've found that there are realistic cases where one replacement depends on another one. So far fg does these replacements inplace #74 and it is unsafe to apply subsequent replaces. This is the reason I have several passes in the graph_replace .

The question what should user expect if some of the replacements were not applied?

@ferrine
Copy link
Member Author

ferrine commented Dec 1, 2022

Ready for review

@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2022

Codecov Report

Merging #66 (aa51011) into main (491f93e) will increase coverage by 0.05%.
The diff coverage is 87.15%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   74.22%   74.28%   +0.05%     
==========================================
  Files         174      176       +2     
  Lines       48734    48991     +257     
  Branches    10367    10414      +47     
==========================================
+ Hits        36175    36392     +217     
- Misses      10272    10293      +21     
- Partials     2287     2306      +19     
Impacted Files Coverage Δ
pytensor/misc/ordered_set.py 79.80% <ø> (ø)
pytensor/tensor/nnet/corr.py 16.81% <0.00%> (ø)
pytensor/ifelse.py 51.42% <66.66%> (+0.13%) ⬆️
pytensor/link/numba/dispatch/extra_ops.py 92.24% <70.21%> (-5.77%) ⬇️
pytensor/link/numba/dispatch/scalar.py 94.44% <75.00%> (+7.02%) ⬆️
pytensor/graph/replace.py 84.61% <84.61%> (ø)
pytensor/link/numba/dispatch/cython_support.py 86.95% <86.95%> (ø)
pytensor/link/numba/dispatch/basic.py 90.06% <95.83%> (-2.62%) ⬇️
pytensor/graph/basic.py 88.10% <97.05%> (+0.43%) ⬆️
pytensor/compile/builders.py 77.11% <100.00%> (+0.05%) ⬆️
... and 14 more

@ferrine
Copy link
Member Author

ferrine commented Dec 5, 2022

@ricardoV94 @aseyboldt anything else left on the PR?

@aseyboldt
Copy link
Member

@ricardoV94 @aseyboldt anything else left on the PR?

I'm still not 100% sure about the helper function. Can we at least make it private, in case we want to further specify/change its exact behavior?

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like out this is shaping. There are a bunch of places in pymc and pytensor I can see the use for this already.

Left some small suggestions and questions.

@ricardoV94
Copy link
Member

ricardoV94 commented Dec 6, 2022

@ricardoV94 @aseyboldt anything else left on the PR?

I'm still not 100% sure about the helper function. Can we at least make it private, in case we want to further specify/change its exact behavior?

I don't see the reason for that. It's an alternative to clone_replace without the clone, I think it's a safe starting point to share the same external API. The internal helper functions used by it can be made private.

Comment on lines 1777 to 1795
def variable_is_in_ancestors(
node: Variable, check: Union[Variable, Collection[Variable]]
) -> bool:
"""Determine if `node` is in the graph given by any in check.
Parameters
----------
node: Variable
Node to check
check: Collection[Variable]
Nodes to check dependency on
Returns
-------
bool
"""
if not isinstance(check, Collection):
check = {check}
else:
check = set(check)
return any(interim in check for interim in ancestors([node], blockers=check))
Copy link
Member

@ricardoV94 ricardoV94 Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would perhaps inline this function, I think it's simple enough for users to figure out. We can add it later if there's a demand for it.

I am not totally set, so feel free to insist in keeping it

Copy link
Member Author

@ferrine ferrine Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to merge this with is_in_ancestors so it is more polymorphic

NodeT can be either Variable or Apply 
is_in_ancestors(node[NodeT], search: Union[List[NodeT], NodeT]) -> bool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was saying is I don't think we need this helper function at all. Ancestors gives you an iterable and you check if it is in it. I don't see why we need a user-facing function for the one-liner

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, we already had one for nodes, so that opens a valid precedent for variables. I think we should keep separate. There's a subtle difference between checking for nodes and variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the is_in_ancestors to apply_in_ancestors and unified the api with variable_in_ancestors so they are semantically the same. Should we bother with deprecation warnings and add a fallback?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think we should bother with deprecation warning for this case. It's not too much trouble either.

@aseyboldt
Copy link
Member

I don't see the reason for that. It's an alternative to clone_replace without the clone, I think it's a safe starting point to share the same external API. The internal helper functions used by it can be made private.

I only meant the truncated_graph_inputs function, not graph_replace.

@ferrine
Copy link
Member Author

ferrine commented Dec 6, 2022

I don't see the reason for that. It's an alternative to clone_replace without the clone, I think it's a safe starting point to share the same external API. The internal helper functions used by it can be made private.

I only meant the truncated_graph_inputs function, not graph_replace.

Don't see any reason to make the function private, there are many ways to use it outside pytensor

Just a quick example I could imagine

candidates = set(model.basic_RVs)
conditioned_on = {
    rv: candidates & truncated_graph_inputs([rv], candidates - {rv}) 
    for rv in cmodel.basic_RVs
}

Another thing that there is no private function in the basic.py, so any function is assumed to be freely accessible

@ferrine ferrine force-pushed the graph-replace branch 5 times, most recently from 5cd9cd3 to c9fd99b Compare December 7, 2022 12:11
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferrine I am happy with the code! Just some corrections/nitpicks below. Also, can you clean the git history + add deprecation for is_in_ancestors (if you didn't already)?

)


def test_graph_replace():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you group these in a class like clone_replace

@ferrine ferrine force-pushed the graph-replace branch 2 times, most recently from 73007a6 to 17e189f Compare December 9, 2022 14:47
Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @ferrine. Only one nitpick: you have a typo in the first commit move cclone_replace to a separate file (double c in cclone_...) 9822a56

@ferrine
Copy link
Member Author

ferrine commented Dec 9, 2022

Who will press the green button?

@ferrine
Copy link
Member Author

ferrine commented Dec 9, 2022

Thanks @ricardoV94 for your tremendous help and support with this pr. Also thanks @aseyboldt for pointing out algorithmic issues when it started out

@aseyboldt
Copy link
Member

Looking forward to using this!
Good work, I guess the honor is yours :-)

@ricardoV94 ricardoV94 merged commit befc177 into main Dec 9, 2022
@ferrine ferrine deleted the graph-replace branch December 9, 2022 21:17
dependent = variable_depends_on(node, blockers - {node})
# ancestors to include that are present in the graph (not disconnected)
# should be added to truncated_inputs
truncated_inputs.append(node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferrine I think we can end up with duplicated inputs here. If another node has duplicated truncated inputs?

@ricardoV94
Copy link
Member

Good work, I guess the honor is yours :-)

Oops, I didn't see these messages!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants