-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
DAS-2067 - Migrate datatree io.py and common.py #9011
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2067 - Migrate datatree io.py and common.py #9011
Conversation
| return list(items) | ||
|
|
||
|
|
||
| class TreeAttrAccessMixin(AttrAccessMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here's the biggest question I have... is this okay for now? (Maybe this is directed at @TomNicholas)
The use of __slots__ is a little new to me (but makes sense from reading up on it). I think to adopt __slots__ and remove __dict__ I would need to slightly rework DataTree, TreeNode and NamedNode. I wanted to check that the scope mainly seems to be self.children and self.parent attributes, and understand why those classes were implemented as they currently are before trying to alter things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to embarrass me, but why did you stop calling init_subclass on the Parent object? everything else with this merge seems reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I accidentally commented out that last line. Will fix!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, my last comment was true, but... uncommenting out that line means that the same method in the parent class (AttrAccessMixin) gets called by super in addition to this method, right? That would mean we end up calling the validation logic that this child method is trying to avoid for now (checking that __dict__ is not present on the class).
Now if only I'd said that the first time around 😉
flamingbear
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just comments for now
| return list(items) | ||
|
|
||
|
|
||
| class TreeAttrAccessMixin(AttrAccessMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to embarrass me, but why did you stop calling init_subclass on the Parent object? everything else with this merge seems reasonable to me.
| return list(items) | ||
|
|
||
|
|
||
| class TreeAttrAccessMixin(AttrAccessMixin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this class necessary? Can't DataTree inherit directly from AttrAccessMixin? And override _attr_sources and so on?
I think the reason this class existed separately was just to do with hacking around limitations of datatree being in a separate repository...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially tried to use AttrAccessMixin. Just dropping it in as a replacement for TreeAttrAccessMixin doesn't work because of the checking done inside the AttrAccessMixin.__init_subclass__ method. The check for if __dict__ exists fails. DataTree, TreeNode and NamedNode do all define __slots__, but I think there are attributes being defined in places like DataTree.__init__, for example DataTree.parent and DataTree.children (versus the DataTree._parent and DataTree._children attributes defined in DataTree.__slots__).
I think we talked about this a little bit in one of our calls - things like DataTree.parent and DataTree.children seemed like tricky things to change over.
|
@TomNicholas @owenlittlejohns I think we decided at the Tuesday meeting we are leaving the slots/dict as is and this is ready for approvals. Which I am about to do. |
flamingbear
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has some comments, but most of it was discussed in the special Xarray/Datatree meeting and I think this can be merged.
|
@TomNicholas - just checking in over this PR. I captured the outstanding work to make |
TomNicholas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Merge when you like.
This reverts commit cb3663d.
* upstream/main: [skip-ci] Try fixing hypothesis CI trigger (pydata#9112) Undo custom padding-top. (pydata#9107) add remaining core-dev citations [skip-ci][skip-rtd] (pydata#9110) Add user survey announcement to docs (pydata#9101) skip the `pandas` datetime roundtrip test with `pandas=3.0` (pydata#9104) Adds Matt Savoie to CITATION.cff (pydata#9103) [skip-ci] Fix skip-ci for hypothesis (pydata#9102) open_datatree performance improvement on NetCDF, H5, and Zarr files (pydata#9014) Migrate datatree io.py and common.py into xarray/core (pydata#9011) Micro optimizations to improve indexing (pydata#9002) (fix): don't handle time-dtypes as extension arrays in `from_dataframe` (pydata#9042)
This PR migrates what I believe is the last actual source code out of the
xarray/datatree_/datatreedirectory intoxarray/core. This is another piece of #8572.I made a couple of decisions that I'd love some feedback/extra eyes on:
xarray/datatree_/datatree/io.pyinto their own module, rather thanxarray/backends/api.py. It felt like these were a little separate, but I can definitely be convinced otherwise.TreeAttrAccessMixin. I really just wanted to swap over to usingAttrAccessMixin, but I ran into issues with__slots__. Looking at it, I think I would need to work on theDataTree,TreeNodeandNamedNodeclasses. I can go down that path, but thought I'd put up this PR for discussion before beginning that effort.Tests addedwhats-new.rstNew functions/methods are listed inapi.rst