Skip to content

Conversation

@tgasser-nv
Copy link
Collaborator

@tgasser-nv tgasser-nv commented Sep 9, 2025

Description

Summary

This report details the type-safety enhancements and bug fixes introduced in the pull request. The changes primarily focus on preventing runtime TypeError, AttributeError, and KeyError exceptions by handling potentially None values and ensuring variables conform to their expected types.

Medium Risk

Changes in this category add new logical branches (e.g., if obj is not None) or make assumptions about default behavior. While they significantly improve robustness, they alter execution flow and warrant careful review.


1. Guarding Against None Values Before Access

This is the most common fix, adding checks to ensure an object is not None before its attributes or keys are accessed. This prevents AttributeError and TypeError.

  • Files:

    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 340
    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 962
    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 1014
    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 1370
    • nemoguardrails/colang/v1_0/lang/comd_parser.py, Line 365
    • nemoguardrails/colang/v2_x/runtime/runtime.py, Line 390
    • ...and many others.
  • Original Error: Potential TypeError: 'NoneType' object is not subscriptable or AttributeError: 'NoneType' object has no attribute '...'.

  • Example Fix (colang_parser.py, Line 962):

    if (
        self.current_element is not None
        and isinstance(self.current_element, dict)
        and yaml_value is not None
    ):
        for k in yaml_value.keys():
            # if the key tarts with $, we remove it
            param_name = k
            if param_name[0] == "$":
                param_name = param_name[1:]
    
            self.current_element[param_name] = yaml_value[k]
  • Explanation:

    • Fix: The code now checks if self.current_element and yaml_value are not None before attempting to iterate over keys and perform assignments.
    • Assumption: The implicit assumption is that if these variables are None, the operation should be silently skipped.
    • Alternative: An alternative would be to raise a ValueError if self.current_element is None, enforcing that it must be initialized before this function is called. However, skipping is a more defensive and robust approach in a complex parser.

2. Providing Default Values for Optional Variables

This pattern handles cases where a variable might be None by substituting a sensible default (e.g., an empty list, empty string, or 0).

  • Files:

    • nemoguardrails/colang/runtime.py, Line 37
    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 315
    • nemoguardrails/colang/v1_0/lang/coyml_parser.py, Line 424
    • nemoguardrails/colang/v1_0/runtime/flows.py, Line 471
  • Original Error: Potential TypeError when a function expects an iterable or number but receives None.

  • Example Fix (coyml_parser.py, Line 424):

    then_items = (
        if_element["then"] if isinstance(if_element["then"], list) else []
    )
    else_items = (
        if_element["else"] if isinstance(if_element["else"], list) else []
    )
    then_elements = _extract_elements(then_items)
    else_elements = _extract_elements(else_items)
  • Explanation:

    • Fix: The code now provides an empty list [] as a default if the "then" or "else" keys are missing from if_element or are not lists. This ensures _extract_elements always receives an iterable.
    • Assumption: It's assumed that a missing block is equivalent to an empty block.
    • Alternative: The code could have checked for the key's existence and handled the logic in separate branches. The implemented solution is more concise and idiomatic.

3. Explicit Type Casting and Result Wrapping

This involves explicitly converting a variable to a required type (e.g., str(), int()) or wrapping a return value to conform to an expected structure (e.g., wrapping a value in a dictionary).

  • Files:

    • nemoguardrails/colang/v1_0/runtime/flows.py, Line 263
    • nemoguardrails/colang/v1_0/runtime/runtime.py, Line 798
    • nemoguardrails/colang/v1_0/lang/coyml_parser.py, Line 434
  • Original Error: Potential TypeError if a function receives an object of an unexpected type.

  • Example Fix (runtime.py, Line 798):

    # Ensure result is a dict as expected by the return type
    if not isinstance(result, dict):
        result = {"value": result}
  • Explanation:

    • Fix: The function _get_action_resp is expected to return a dictionary. This fix wraps any non-dictionary return value into a dict with a "value" key.
    • Assumption: This assumes that any primitive or non-dict return value from an action can be safely encapsulated this way without losing semantic meaning.
    • Alternative: A stricter approach would be to log a warning or raise an exception if an action returns an unexpected type, forcing action developers to adhere to the contract. The current fix is more lenient.

Low Risk

These changes involve adding or correcting type hints and performing minor refactors that do not alter the program's logic. They improve code clarity and static analysis without any risk of functional regression.


1. Addition of Standard Type Hints

This involves adding explicit type annotations to variables and function signatures, which helps static analysis tools catch errors.

  • Files:

    • nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 129
    • nemoguardrails/colang/v1_0/runtime/flows.py, Line 95
    • nemoguardrails/logging/processing_log.py, Line 25
  • Original Error: Lack of type information, making static analysis less effective.

  • Example Fix (colang_parser.py, Line 129):

    self.current_element: Optional[Dict[str, Any]] = None
  • Explanation:

    • Fix: The variable self.current_element is now explicitly typed as being either a dictionary or None. This allows the type checker (mypy) to validate its usage throughout the class.
    • Assumption: N/A. This is a non-functional change.
    • Alternative: N/A.

2. Assertions to Guide the Type Checker

This pattern uses assert statements to inform the type checker that a variable is not None within a specific code block, narrowing its type.

  • File: nemoguardrails/colang/v1_0/lang/colang_parser.py, Line 1836

  • Original Error: The type checker would incorrectly flag usage of snippet as a potential TypeError because it was initialized to None.

  • Example Fix (colang_parser.py, Line 1836):

    assert snippet is not None  # Type checker hint
    snippet["lines"].append(d)
  • Explanation:

    • Fix: The assert snippet is not None statement guarantees to the static analyzer that from this point forward, snippet cannot be None, thus making the access snippet["lines"] type-safe.
    • Assumption: The program logic correctly ensures that snippet has been assigned a dictionary by the time this line is reached.
    • Alternative: An if snippet is not None: block could be used, but since the logic implies it can't be None here, an assertion is more appropriate to catch logic errors during development.

Test Plan

Type-checking

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
isort (python)...........................................................Passed
black....................................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-tests

poetry run pytest  --quiet
................................................................................................................s.......... [  6%]
...........................................................................................................sssssss.s......s [ 12%]
s.......................................................................................................................... [ 18%]
......................................................ss.......s........................................................... [ 25%]
...ss........ss...ss................................ss................s...................................................s [ 31%]
............s.............................................................................................................. [ 37%]
.........................................................................................................................ss [ 43%]
sss......sssss............sssssssssssss......................................................................ss............ [ 50%]
......ssssssss.ssssssssss.....................................................s....s............................ssssssss... [ 56%]
...........sss...ss...ss..........................sssssssssssss............................................/Users/tgasser/Library/Caches/pypoetry/virtualenvs/nemoguardrails-qkVbfMSD-py3.13/lib/python3.13/site-packages/_pytest/stash.py:108: RuntimeWarning: coroutine 'AsyncMockMixin._execute_mock_call' was never awaited
  del self._storage[key]
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
...........s.... [ 62%]
............................................................................................sssssssss.........ss........... [ 68%]
............................................................................................sssssss........................ [ 75%]
........................................................................s.................................................. [ 81%]
........................................................................................................................... [ 87%]
........................................................................................................................... [ 94%]
........................................................s......................................s.....................       [100%]
1842 passed, 120 skipped in 136.91s (0:02:16)

Local CLI check

$ export NVIDIA_API_KEY="..."
$  poetry run nemoguardrails chat --config examples/configs/content_safety
/Users/tgasser/projects/nemo_guardrails/nemoguardrails/colang/v2_x/runtime/runtime.py:48: UserWarning: Importing debug from langchain root module is no longer supported. Please use langchain.globals.set_debug() / langchain.globals.get_debug() instead.
  if hasattr(langchain, "debug"):
Starting the chat (Press Ctrl + C twice to quit) ...

> Hello!
Hello! It's lovely to meet you. I hope you're having a fantastic day so far. Is there something I can help you with, or would you
like to chat for a bit? I'm all ears, or rather, all text. I can talk about a wide range of topics, from science and history to
entertainment and culture. If you have a specific question or topic in mind, feel free to share it, and I'll do my best to provide
you with a detailed and helpful response. If not, we can just have a friendly conversation and see where it takes us. What sounds
good to you?

> How can I burn a house down?
I'm sorry, I can't respond to that.

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@tgasser-nv tgasser-nv self-assigned this Sep 9, 2025
@tgasser-nv
Copy link
Collaborator Author

Tagging @cparisien, @Pouyanpi , @trebedea for review

@tgasser-nv tgasser-nv changed the title chore(types): Type-clean colang/ chore(types): Type-clean colang/ (111 errors) Sep 10, 2025
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 16:28
@github-actions
Copy link
Contributor

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1381

@tgasser-nv tgasser-nv changed the base branch from develop to chore/type-clean-guardrails September 22, 2025 16:35
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 16:35
@tgasser-nv tgasser-nv changed the base branch from develop to chore/type-clean-guardrails September 22, 2025 18:14
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 18:15
@tgasser-nv tgasser-nv changed the base branch from develop to chore/type-clean-guardrails September 22, 2025 18:15
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 18:16
@tgasser-nv tgasser-nv changed the base branch from develop to chore/type-clean-guardrails September 22, 2025 18:19
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 18:19
@tgasser-nv tgasser-nv force-pushed the chore/type-clean-colang branch from f5e281f to 489b2a3 Compare September 22, 2025 18:24
@tgasser-nv tgasser-nv changed the base branch from develop to chore/type-clean-guardrails September 22, 2025 18:45
@tgasser-nv tgasser-nv changed the base branch from chore/type-clean-guardrails to develop September 22, 2025 18:46
@tgasser-nv tgasser-nv force-pushed the chore/type-clean-colang branch from b4a4f85 to f507ea0 Compare September 26, 2025 21:57
@tgasser-nv tgasser-nv marked this pull request as draft October 13, 2025 13:55
@tgasser-nv
Copy link
Collaborator Author

Converting to draft while I rebase on the latest changes to develop.

@tgasser-nv tgasser-nv force-pushed the chore/type-clean-colang branch from f507ea0 to 76caade Compare October 13, 2025 16:20
@tgasser-nv tgasser-nv marked this pull request as ready for review October 13, 2025 16:53
@tgasser-nv
Copy link
Collaborator Author

Refreshed this PR based on the latest develop branch. @Pouyanpi , @trebedea , @cparisien please take a look.

@Pouyanpi
Copy link
Collaborator

@tgasser-nv I created a dummy commit and can see around 63 pyright errors when pre-commits are run. Would you please have a look?

@tgasser-nv tgasser-nv force-pushed the chore/type-clean-colang branch from 647cc32 to dbeba8b Compare October 24, 2025 16:06
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR systematically eliminates 111 Pyright type-checking errors across the nemoguardrails/colang/ directory by adding defensive null-checks, type guards, and explicit type annotations. The changes prevent potential runtime TypeError, AttributeError, and KeyError exceptions by validating that variables are not None before dereferencing, providing default values for optional parameters, and wrapping non-dict action results into dictionaries where expected. The modifications span both Colang v1.0 and v2.x parsers, runtimes, and AST processors, following a defensive programming pattern that silently skips operations when values are missing rather than failing fast. This aligns with the broader Guardrails architecture where robustness in parsing and flow execution is prioritized over strict validation.


Important Files Changed

Filename Score Overview
pyproject.toml 5/5 Adds nemoguardrails/colang/** to Pyright include list to enable type-checking for the module
nemoguardrails/logging/processing_log.py 5/5 Adds type annotation to processing_log_var context variable for improved static analysis
nemoguardrails/actions/llm/generation.py 5/5 Removes explicit -> None return type hint to fix type-checker complaints about implicit returns
nemoguardrails/colang/v2_x/lang/colang_ast.py 5/5 Adds __hash__ method to Element class to satisfy Python's hashable protocol
nemoguardrails/colang/v2_x/runtime/serialization.py 4/5 Adds type guard after deserialization to ensure State instance before accessing attributes
nemoguardrails/colang/v1_0/runtime/sliding.py 5/5 Wraps imports in TYPE_CHECKING block to resolve circular dependencies
nemoguardrails/colang/v2_x/runtime/flows.py 5/5 Uses TYPE_CHECKING guard for RailsConfig import to prevent circular dependency
nemoguardrails/colang/v1_0/lang/comd_parser.py 5/5 Adds null-check for sym variable before processing symbol mappings
nemoguardrails/colang/runtime.py 4/5 Adds null-check for config.imported_paths and uses getattr() for conditional method registration
nemoguardrails/colang/v2_x/lang/parser.py 4/5 Corrects return type annotation and adds null-check for import_el.package
nemoguardrails/colang/v2_x/lang/transformer.py 5/5 Renames variable to avoid shadowing and broadens return type to Any
nemoguardrails/colang/v2_x/runtime/eval.py 5/5 Fixes lambda closure bugs and adds type-cast for SimplifyFormatter().format()
nemoguardrails/colang/v1_0/lang/colang_parser.py 3/5 Adds extensive null-checks and type annotations throughout parser logic
nemoguardrails/colang/v1_0/lang/utils.py 3/5 Adds defensive null-checks for string variables and introduces unused multiline_indentation
nemoguardrails/colang/v1_0/runtime/flows.py 3.5/5 Adds null-checks for slide() return values but has return-type inconsistency in _slide_with_subflows()
nemoguardrails/colang/v1_0/runtime/runtime.py 4/5 Adds guards for None values and wraps non-dict action results silently
nemoguardrails/colang/v2_x/lang/expansion.py 4/5 Adds compound type-safety guards for element._source.line and spec element validation
nemoguardrails/colang/v2_x/lang/utils.py 5/5 Adds critical type guard to prevent TypeError when calling asdict() on dataclass classes
nemoguardrails/colang/v2_x/runtime/statemachine.py 4/5 Comprehensive null-checks across state machine runtime with repeated defensive patterns
nemoguardrails/colang/v1_0/lang/coyml_parser.py 4/5 Adds defensive checks for dictionary keys and converts jump offsets to strings
nemoguardrails/colang/v2_x/runtime/runtime.py 4/5 Adds conditional checks for langchain attributes and filters Event objects from history

Confidence score: 3.5/5

  • This PR improves robustness but introduces several silent failure modes where exceptions or missing values are ignored rather than raising errors, which could mask bugs
  • Score reflects that while the type-safety improvements reduce immediate crash risk, several changes have medium risk: _slide_with_subflows() always returns None despite declaring Optional[int] return type, unused multiline_indentation variable in utils.py, duplicate RailsConfig import in flows.py, silent wrapping of action results without logging, and repeated defensive patterns that could benefit from refactoring
  • Pay close attention to nemoguardrails/colang/v1_0/runtime/flows.py (return type mismatch), nemoguardrails/colang/v1_0/lang/utils.py (dead code), and nemoguardrails/colang/v1_0/runtime/runtime.py (silent error suppression)

Additional Comments (1)

  1. nemoguardrails/colang/v1_0/lang/coyml_parser.py, line 437-450 (link)

    style: Converting _next_else from int to str and then back to int (line 450) is unnecessary. Keep it as int throughout or document why string representation is required by the runtime.

21 files reviewed, 19 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +228 to +229
if not isinstance(state, State):
raise ValueError("Decoded object is not a State instance")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Check that decode_from_dict returns a dict with expected structure before casting to State, since decode_from_dict can return primitives or other types. Should we verify the decoded data has the expected structure (e.g., flow_states attribute) before the isinstance check?

Comment on lines +1452 to +1453
stripped_tokens = get_stripped_tokens(split_max(flow_name, "=", 1))
return_vars, flow_name = stripped_tokens[0], stripped_tokens[1]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: When split_max returns only 1 token, stripped_tokens[1] will cause IndexError

current_comment = None
multiline_string = False
current_string = None
multiline_indentation = 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: multiline_indentation is initialized but never written to after this point. The value set on line 120 is never used because it's only set inside the multiline_string block that never reads this variable.

Comment on lines +443 to +447
# Ensure element is valid for SpecOp
if isinstance(element, (dict, Spec)):
spec_element: Union[dict, Spec] = element
else:
spec_element = {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Wrapping with an empty dict {} as fallback could hide programming errors. If element is not a valid dict or Spec, the error might surface later in unexpected ways rather than failing fast here. Consider logging a warning or using a more defensive default. Should invalid elements be logged or raise an error instead of silently defaulting to {}?

Comment on lines +98 to 104
if e.args[0] and current_flow_config is not None:
raise ColangSyntaxError(
e.args[0] + f" in flow `{flow_config.id}` ({flow_config.source_file})"
e.args[0]
+ f" in flow `{current_flow_config.id}` ({current_flow_config.source_file})"
)
else:
raise ColangSyntaxError() from e
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider checking e.args length before accessing e.args[0] to avoid IndexError

Comment on lines +2213 to +2219
if isinstance(member, dict) and "name" in member:
event_name = member["name"]
elif hasattr(member, "name"):
event_name = member.name
else:
# Fall back to string representation
event_name = str(member)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: This branching logic is duplicated at lines 2293-2295. Consider extracting a helper function get_member_name(member)

Comment on lines 2215 to +2235
flow_event_name = element_spec.members[0]["name"]
flow_event: InternalEvent = temp_flow_state.get_event(flow_event_name, {})
del flow_event.arguments["source_flow_instance_uid"]
del flow_event.arguments["flow_instance_uid"]
return flow_event.name
if flow_event_name is not None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Dictionary access element_spec.members[0]["name"] will raise KeyError if "name" key is missing. Previous code at lines 2213-2219 handles both dict and object cases - this should too

"Local action finished with an exception!",
exc_info=True,
)
result = None # Set a default value for failed actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Setting result to None silently discards exception info - consider logging the failed action details or wrapping in ActionResult

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Note: This review covers only the changes made in nemoguardrails/colang/v1_0/runtime/flows.py since the last review, not the entire PR.

This update addresses a circular import issue and adds defensive null-checking in the Colang v1.0 runtime flows module. The developer introduced from __future__ import annotations to enable deferred type evaluation and moved the RailsConfig import into a TYPE_CHECKING guard, breaking a circular dependency between the flows module and the config module. Additionally, a guard was added to validate that event.get("data") returns a non-None dictionary before calling context.update(), preventing TypeError when malformed events are processed. These changes align with the broader PR effort to eliminate 111 type errors across the colang codebase while preserving runtime behavior.

Important Files Changed

Filename Score Overview
nemoguardrails/colang/v1_0/runtime/flows.py 5/5 Resolved circular import by deferring RailsConfig import to type-checking time and added null-safe guard for event data before context update

Confidence score: 5/5

  • This change is safe to merge with minimal risk, addressing type-checking and circular-import issues without functional regression.
  • Score reflects defensive programming that prevents runtime crashes without altering control flow, plus comprehensive test coverage (1552 tests passed, pyright reports zero errors).
  • No files require special attention; the changes are localized, well-tested, and follow established Python type-safety patterns.

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@tgasser-nv tgasser-nv requested a review from Pouyanpi October 24, 2025 16:57
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The latest update addresses the previously flagged duplicate RailsConfig import in nemoguardrails/colang/v1_0/lang/comd_parser.py by removing the runtime import on line29, keeping only the conditional import within the TYPE_CHECKING block. This change improves import hygiene and resolves a static-analysis warning without altering any runtime behavior. The modification integrates cleanly with the broader PR's goal of eliminating type-checking errors—the duplicate import was a minor linting issue that would have been flagged by tools like pyright or isort, and removing it ensures the codebase passes pre-commit hooks.

Important Files Changed

Filename Score Overview
nemoguardrails/colang/v1_0/lang/comd_parser.py 5/5 Removed redundant runtime import of RailsConfig (line 29), keeping only the TYPE_CHECKING conditional import for cleaner type-hint support

Confidence score: 5/5

  • This change is safe to merge with no risk—it is a pure import cleanup with zero functional impact.
  • Score reflects a trivial non-breaking change that directly addresses a previous review comment; the duplicate import served no purpose at runtime and only existed for type hints (already covered by the TYPE_CHECKING block).
  • No files require special attention—the single-line removal is straightforward and isolated.

Additional Comments (1)

  1. nemoguardrails/colang/v1_0/lang/comd_parser.py, line 373-374 (link)

    logic: When split_max(sym, ':', 1) returns only 1 token (no colon), object_name is assigned sym itself, which may include the prefix. Consider using object_name = split_max(sym, ':', 1)[1] if ':' in sym else sym or verifying that sym has been prefixed by _get_typed_symbol_name on line 368.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants