-
Notifications
You must be signed in to change notification settings - Fork 0
feat: add rxn helpers #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's Guide by SourceryThis pull request introduces new functionalities to the chemenv modal app, focusing on reaction SMILES manipulation and atom mapping. It includes functions for unifying and canonicalizing reaction SMILES, obtaining atom mappings using RxnMapper, and extracting functional groups using Exomol. The changes involve adding new Modal apps and functions, integrating external tools like DECIMER, MolScribe, RxnScribe, and RxnMapper, and defining new images for these tools. Sequence diagram for get_rxn_mappingsequenceDiagram
participant User
participant chemenv_modal_app
participant rxn_mapper_image
User->>chemenv_modal_app: get_rxn_mapping(rxns: list[str])
chemenv_modal_app->>rxn_mapper_image: _get_rxn_mapper_confidence(rxns)
rxn_mapper_image-->>chemenv_modal_app: list[dict] (atom mappings)
chemenv_modal_app-->>User: list[dict] (atom mappings)
Sequence diagram for unify_rxn_smilessequenceDiagram
participant User
participant chemenv_modal_app
participant rxn_utils_image
User->>chemenv_modal_app: unify_rxn_smiles(rxn_smiles: str)
chemenv_modal_app->>rxn_utils_image: _unify_rxn_smiles(rxn_smiles)
rxn_utils_image-->>chemenv_modal_app: str (unified SMILES)
chemenv_modal_app-->>User: str (unified SMILES)
Updated class diagram for rxn_schema_processingclassDiagram
class rxn_schema_processing {
<<image>> decimer_image
<<image>> rxnscribe_image
+ _decimer_extractor(image: bytes) : str
+ _molscribe_extractor(image: bytes) : dict[str, Any]
+ _rxnscribe_extractor(image: bytes) : list[dict]
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds new helper functions for reaction SMILES processing and extraction of chemical data using RxnMapper, Exomol, DECIMER, MolScribe, and RxnScribe.
- Introduces functions to unify, canonicalize, and map reaction SMILES.
- Adds functionality to extract functional groups and predict molecular/reaction data from images.
- Integrates the new functions into the modal app with dedicated endpoints and volumes.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
src/chemenv/tools/util_tool.py | Updated image initialization and added Exomol helper. |
src/chemenv/tools/rxn_utils.py | Added functions for RxnMapper confidence, and SMILES helpers. |
src/chemenv/tools/rxn_schema_processing.py | Added functions for reaction schema extraction using DECIMER, MolScribe, and RxnScribe. |
src/chemenv/modal_app.py | Integrated new extraction and reaction mapping endpoints into the app. |
Comments suppressed due to low confidence (1)
src/chemenv/tools/rxn_schema_processing.py:4
- For consistency with other pip_install calls, consider passing the package name as a list (e.g., ["decimer"]) to avoid potential type issues.
decimer_image = Image.debian_slim(python_version="3.10.0").pip_install("decimer")
from rxn.chemutils.miscellaneous import canonicalize_any | ||
|
||
|
||
def _get_rxn_mapper_confidence(rxns: list[str]) -> list[dict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function's docstring is inconsistent: it describes a single reaction string and a float return value, whereas the signature indicates it accepts a list of reaction strings and returns a list of dictionaries. Please update the docstring to accurately reflect the parameter type (list[str]) and return type (list[dict]).
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @MrtinoRG - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a unit test suite to verify the functionality of the newly added functions.
- It would be helpful to add docstrings to the internal helper functions (those starting with an underscore).
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
for result in results: | ||
for key, value in result.items(): | ||
for v in value: | ||
if "molfile" in v: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (complexity): Consider refactoring the nested loops in _rxnscribe_extractor
to use a helper function for removing 'molfile', reducing complexity and improving readability
You can reduce the three levels of nesting by extracting the "molfile" removal into a small helper that operates on each result. For example, instead of:
for result in results:
for key, value in result.items():
for v in value:
if "molfile" in v:
v.pop("molfile")
you might define a helper function:
def remove_molfile_from_list(dicts: list[dict], key: str = "molfile") -> None:
for d in dicts:
d.pop(key, None)
and then simplify the loop in _rxnscribe_extractor
to:
for result in results:
for value in result.values():
remove_molfile_from_list(value)
This refactoring preserves functionality and makes the code clearer and easier to maintain.
Summary by Sourcery
Add reaction-related utility functions and image processing capabilities to the chemenv project, introducing new tools for molecule and reaction schema extraction, mapping, and SMILES manipulation.
New Features:
Enhancements: