@ArturU043
Collaborator

New function: get_structure

  1. Creates a ServiceX (SX) Python function query that reads one file of the requested dataset (DS) and returns, inside an ak.array, a str encoding the file structure
  2. Builds a deliver spec with support for multiple samples and user-defined names
  3. Gets the encoded result str from the servicex.deliver call
  4. (opt) Prints the encoded str in a user-friendly format
  5. (opt) Returns the re-formatted str
  6. (opt) Saves the re-formatted str to samples-structure.txt
  7. (opt) Reconstructs a dummy ak.array from the encoded str and returns the type constructor
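
Step 2 above could look roughly like the sketch below. It is an illustration only, not the code in this PR: the function name `sketch_deliver_spec`, the default sample names, and the example dataset identifiers are hypothetical. The keys `Sample`, `Name`, `Dataset`, `Query`, and `NFiles` are assumed to follow the usual ServiceX deliver-spec layout, and the dataset is left as a plain string here so the sketch runs without ServiceX installed.

```python
from typing import Dict, List, Union


def sketch_deliver_spec(
    datasets: Union[str, List[str], Dict[str, str]],
    query: object,
) -> dict:
    """Build a deliver-style spec with one sample per dataset (hypothetical helper)."""
    if isinstance(datasets, str):
        # A single dataset gets a default name.
        named = {"Dataset-0": datasets}
    elif isinstance(datasets, dict):
        # User-defined names: {sample name: dataset identifier}.
        named = dict(datasets)
    else:
        # A list of datasets gets numbered default names.
        named = {f"Dataset-{i}": did for i, did in enumerate(datasets)}

    samples = [
        {
            "NFiles": 1,      # only one file is needed to read the structure
            "Name": name,
            "Dataset": did,   # assumption: real code would wrap this in a ServiceX dataset object
            "Query": query,
        }
        for name, did in named.items()
    ]
    return {"Sample": samples}


spec = sketch_deliver_spec({"ttbar": "scope:dataset_a", "zee": "scope:dataset_b"}, query="QUERY")
print([sample["Name"] for sample in spec["Sample"]])
```

Accepting a str, a list, or a dict keeps the single-DS call short while still allowing user-defined names for multiple samples.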

The function can be called from the terminal: servicex-get-structure
Options are available to save the output to .txt, to load a single DS or multiple DS, and to list all DS in a .json file that the command then loads.

Several helpers were added for this feature: run_query, build_deliver_spec, print_structure_from_str, parse_jagged_depth_and_dtype, str_to_array, and run_from_command.

@ArturU043 ArturU043 self-assigned this Mar 25, 2025
@ArturU043
Collaborator Author

This is my first attempt at building this feature. Initially, I didn't expect to reconstruct ak.arrays from the encoded string, and after stepping back, I wonder whether it has become overly complex.

For example, should I use regex matching instead of positional methods to extract information from the encoded str?

Should I write a simpler encoded str using the awkward type constructor directly?
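
To make the regex question concrete, here is a minimal sketch of a regex-based parse. It assumes, purely for illustration, that each branch is described by an awkward-style type string such as `var * var * float32`; the actual encoded str in this PR may be laid out differently. The name `parse_type_str` is hypothetical and is not the `parse_jagged_depth_and_dtype` helper from the PR.

```python
import re
from typing import Tuple

# Zero or more "var * " prefixes followed by a single dtype token.
_TYPE_RE = re.compile(r"^((?:var \* )*)(\w+)$")


def parse_type_str(type_str: str) -> Tuple[int, str]:
    """Return (jagged depth, dtype) for an awkward-style type string (hypothetical helper)."""
    match = _TYPE_RE.match(type_str.strip())
    if match is None:
        raise ValueError(f"Unrecognized type string: {type_str!r}")
    var_prefix, dtype = match.groups()
    # Each "var" in the prefix adds one level of jaggedness.
    return var_prefix.count("var"), dtype


print(parse_type_str("var * var * float32"))
print(parse_type_str("int64"))
```

Compared with positional slicing, the pattern fails loudly on an unexpected string instead of silently returning a wrong slice.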

@gordonwatts gordonwatts left a comment

Ok - nice! I like this and this is going to be very useful. I agree with your comment about simplifying things. Here is what I think should be done:

  1. Use json (with the built-in json module) to generate the output on ServiceX
  2. Use the json module to parse it on the client.

This should significantly simplify the code - the built-in json parser is basically bulletproof. Once that is done, the downstream code can probably be simplified significantly as well.
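
A minimal sketch of the JSON round trip suggested above, using only the standard library. The layout `{tree name: {branch name: type string}}`, the function names, and the example branches are assumptions made for illustration; they are not taken from the merged code.

```python
import json
from typing import Dict

Structure = Dict[str, Dict[str, str]]  # assumed layout: {tree: {branch: type string}}


def encode_structure(trees: Structure) -> str:
    """ServiceX side: serialize the file structure to a JSON str."""
    return json.dumps(trees, sort_keys=True)


def format_structure(encoded: str, sample_name: str) -> str:
    """Client side: parse the JSON str and build a readable report."""
    trees = json.loads(encoded)
    lines = [f"Sample: {sample_name}"]
    for tree, branches in trees.items():
        lines.append(f"  Tree: {tree}")
        for branch, type_str in branches.items():
            lines.append(f"    {branch} ; {type_str}")
    return "\n".join(lines)


# Hypothetical structure for a single file.
example = {"reco": {"el_pt": "var * float32", "event_number": "int64"}}
encoded = encode_structure(example)
report = format_structure(encoded, "ttbar")
print(report)
```

Because `json.loads` returns plain dicts and strings, the printing, saving, and array-reconstruction steps can all work from the same parsed object rather than re-parsing a custom string format.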

@ArturU043
Collaborator Author

Ready to be merged. Please add any further comments if you have them.

@ArturU043 ArturU043 requested a review from gordonwatts April 14, 2025 15:16
@gordonwatts gordonwatts left a comment

Looks great!

@ArturU043 ArturU043 merged commit c87f48c into main Apr 23, 2025
4 checks passed

3 participants