Skip to content

Conversation

@guerler
Copy link
Contributor

@guerler guerler commented Oct 5, 2025

Resolves #21183. Exploring options to resolve a failure caused by injecting an empty name column into the data table. Because only the column name was added without corresponding field data, the column list becomes longer than the field list. As a result, when searching for a column such as path, the returned index is shifted by one, leading to an out-of-range error when accessing the corresponding field value. twobit data tables do not include a name field, and since the path field is the last column, any attempt to access it for downloading data from IGV results in a failure. In my view, the correct approach is to avoid injecting an empty name column, since a value column is already required and guaranteed to exist. Consumers of this API endpoint should instead fall back to the value column or whichever column or index they prefer.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@guerler guerler added this to the 25.1 milestone Oct 5, 2025
@guerler guerler force-pushed the fix_data_table_handling branch from 5931704 to e6d1516 Compare October 29, 2025 04:55
@guerler guerler changed the base branch from dev to release_25.1 October 29, 2025 07:29
@github-actions github-actions bot changed the title Avoid missing field value failure in data tables [25.1] Avoid missing field value failure in data tables Oct 29, 2025
@guerler guerler changed the base branch from release_25.1 to dev October 29, 2025 07:30
@github-actions github-actions bot changed the title [25.1] Avoid missing field value failure in data tables Avoid missing field value failure in data tables Oct 29, 2025
@guerler guerler force-pushed the fix_data_table_handling branch from e6d1516 to b1946d8 Compare October 29, 2025 07:33
@guerler guerler changed the base branch from dev to release_25.1 October 29, 2025 07:33
@github-actions github-actions bot changed the title Avoid missing field value failure in data tables [25.1] Avoid missing field value failure in data tables Oct 29, 2025
@guerler guerler requested a review from mvdbeek October 29, 2025 07:35
if empty_field_value is not None:
self.empty_field_values[name] = empty_field_value
assert "value" in self.columns, "Required 'value' column missing from column def"
if "name" not in self.columns:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit says Attempt to not require name column but what you're removing here seems to be the fallback definition for name. Can you start with a test case (automated or not) that breaks what you're fixing ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, thanks for looking into it. I am adding a test case right now and will update PR comment with more details.

@guerler guerler marked this pull request as ready for review October 29, 2025 08:41
@mvdbeek
Copy link
Member

mvdbeek commented Oct 29, 2025

As a result, when searching for a column such as path, the returned index is shifted by one,

Shifted by one relative to what ? I see that https://test.galaxyproject.org/api/tool_data/twobit is missing one column relative to the fields the API claims to return, isn't that the correct bug fix ? Here you're altering all data table definitions, I don't think that's wise ?

twobit data tables do not include a name field, and since the path field is the last column, any attempt to access it for downloading data from IGV results in a failure.

I would suggest that the IGV viz throws an error there ?

@guerler
Copy link
Contributor Author

guerler commented Oct 30, 2025

Relative to: https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/tool_data_table_conf.xml.sample#L75C9-L75C39.

This does not affect all tables, only those missing the name field, specifically sift_db and twobit.

The issue is in the API layer, which inserts a pseudo column that does not exist. Alternatively, the API should throw an error and require a name column in addition to a value column, which I don't think is necessary.

@mvdbeek
Copy link
Member

mvdbeek commented Oct 30, 2025

parse_column_spec is not restricted to the API layer though so you might be breaking tools. i'll have a look

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IGV visualization not loading reference Genomes

2 participants