Skip to content

Conversation

KoalaQin
Copy link
Contributor

Update highly structured annotations such as gnomAD sample meta.

@KoalaQin
Copy link
Contributor Author

@jkgoodrich I get this error:
ImportError: cannot import name 'add_filters_expr' from partially initialized module 'gnomad.utils.filtering' (most likely due to a circular import)
I searched around, it seems that I can't import functions from filtering.py because filtering.py is importing something from annotations.py
Where should I put the update_sample_annotations?

@KoalaQin
Copy link
Contributor Author

I put this function in release.py, passed checks, is it Okay?

return index_dict


def update_sample_annotations(expr: hl.expr, sample_annotations: Dict[str, hl.expr]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should go in annotations, it makes a lot more sense there. You can handle the circular import using import as for example import gnomad.utils.filtering as filter_utils

Also, I modified it a little:

def update_structured_annotations(
    ht: hl.Table,
    annotation_update_exprs: Dict[str, hl.Expression],
    annotation_update_label: Optional[str] = None,
) -> hl.Table:
    """
    Update highly structured annotations on a Table.

    This function recursively updates annotations defined by `annotation_update_exprs`
    and if `annotation_update_label` is supplied, it checks if the sample annotations
    are different from the input and adds a flag to the Table, indicating which
    annotations have been updated for each sample.

    :param ht: Input Table with structured annotations to update.
    :param annotation_update_exprs: Dictionary of annotations to update, structured as
        they are structured on the input `ht`.
    :param annotation_update_label: Optional string of the label to use for an
        annotation indicating which annotations have been updated. Default is None, so
        no annotation is added.
    :return: Table with updated annotations and optionally a flag indicating which
        annotations were changed.
    """
    def _update_struct(
        struct_expr: hl.expr.StructExpression,
        update_exprs: Union[Dict[str, hl.Expression], hl.expr.Expression]
    ) -> Tuple[Dict[str, hl.expr.BooleanExpression], hl.expr.StructExpression]:
        """
        Update a StructExpression.

        :param struct_expr: StructExpression to update.
        :param update_exprs: Dictionary of annotations to update.
        :return: Tuple of the updated annotations and the updated flag.
        """
        if isinstance(update_exprs, dict):
            updated_struct_expr = {}
            updated_flag_expr = {}
            for ann, expr in update_exprs.items():
                updated_flag, updated_ann = _update_struct(struct_expr[ann], expr)
                updated_flag_expr.update(
                    {ann + ("." + k if k else ""): v for k, v in updated_flag.items()}
                )
                updated_struct_expr[ann] = updated_ann
            return updated_flag_expr, struct_expr.annotate(**updated_struct_expr)
        else:
            return {"": update_exprs != struct_expr}, update_exprs

    annotation_update_flag, updated_rows = _update_struct(
        ht.row_value, annotation_update_exprs
    )
    if annotation_update_label is not None:
        updated_rows = updated_rows.annotate(
            **{
                annotation_update_label: filter_utils.add_filters_expr(
                    filters=annotation_update_flag
                )
            }
        )

    return ht.annotate(**updated_rows)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I put it back to annotations.py, and modified the return type in _updata_struct().

@KoalaQin KoalaQin requested a review from jkgoodrich August 21, 2023 19:11
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple questions about types


def _update_struct(
struct_expr: hl.expr.StructExpression,
update_exprs: Union[Dict[str, hl.Expression], hl.expr.Expression],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
update_exprs: Union[Dict[str, hl.Expression], hl.expr.Expression],
update_exprs: Union[Dict[str, hl.expr.Expression], hl.expr.Expression],

def _update_struct(
struct_expr: hl.expr.StructExpression,
update_exprs: Union[Dict[str, hl.Expression], hl.expr.Expression],
) -> Tuple[Dict[str, Any], hl.expr]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the first one not always Dict[str, hl.expr.BooleanExpression]? and why hl.expr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this warning for the return in else:

Expected type 'tuple[dict[str, BooleanExpression], StructExpression]', got 'tuple[dict[str, Any], Expression]' instead: 1846. 

Copy link
Contributor

@jkgoodrich jkgoodrich Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try this Tuple[Dict[str, hl.expr.BooleanExpression], Any] I think the first should be a BooleanExpression, the suppose the second can be multiple different things

@KoalaQin KoalaQin requested a review from jkgoodrich August 21, 2023 19:41
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@KoalaQin KoalaQin merged commit 0b355d1 into main Aug 21, 2023
@jkgoodrich jkgoodrich changed the title function to update sample annotations Add function update_structured_annotations to update structured annotations on a Table Aug 21, 2023
@KoalaQin KoalaQin deleted the qh/updata_annotations branch August 23, 2023 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants