fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue #12991

colesmcintosh · 2025-07-25T18:57:01Z

Title

fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue

Relevant issues

Fixes #12818

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
I have added a screenshot of my new test passing locally
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

This PR fixes issue #12818 where GCP Model Armor was always returning "Success" even when it should block harmful content (like bomb-making instructions). The logs showed "standard_logging_guardrail_information": "CircularReference Detected".

Root Cause

The Model Armor implementation was checking for incorrect fields in the API response (blocked, action) that don't exist in the actual GCP Model Armor API
The guardrail logging was creating circular references by storing the entire request data dict

Changes Made

Fixed _should_block_content to check the correct API response fields:
- Now checks filterMatchState and individual filter results instead of non-existent fields
- Properly detects when content should be blocked based on RAI filters, prompt injection detection, etc.
Fixed _get_sanitized_content to extract sanitized text from the correct response location
Added _process_response override to prevent circular references:
- Stores only the Model Armor API response instead of the entire data dict
- Prevents the "CircularReference Detected" error in logging
Updated all test cases to use the actual GCP Model Armor API response format
Added specific tests:
- test_model_armor_bomb_content_blocked: Tests that harmful content is correctly blocked
- test_model_armor_no_circular_reference_in_logging: Verifies no circular references in logging
- test_model_armor_success_case_serializable: Ensures success cases are properly serializable

Testing

All Model Armor tests pass:

poetry run pytest tests/test_litellm/proxy/guardrails/guardrail_hooks/test_model_armor.py -v
============================== 24 passed in 1.70s ==============================

The fix ensures that harmful content like bomb-making instructions will be correctly detected and blocked by Model Armor, and that the guardrail information is properly logged without circular references.

…rence issue Fixes BerriAI#12818 where GCP Model Armor was always returning success even for harmful content. Changes: - Fix _should_block_content to check correct API response fields (filterMatchState instead of non-existent blocked/action fields) - Fix _get_sanitized_content to extract sanitized text from correct response location - Override _process_response to store only Model Armor API response, preventing circular references in logging - Update test cases to use actual GCP Model Armor API response format - Add specific tests for harmful content detection and circular reference prevention The issue was that the implementation was checking for fields that don't exist in the actual GCP Model Armor API response, causing harmful content to never be detected as blocked.

vercel · 2025-07-25T18:57:06Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
litellm	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jul 26, 2025 10:13pm

colesmcintosh · 2025-07-26T21:14:02Z

bugbot run

cursor

Bugbot free trial expires on July 29, 2025
Learn more in the Cursor dashboard.

litellm/proxy/guardrails/guardrail_hooks/model_armor/model_armor.py

…date logging mechanism - Updated `guardrail_status` to support a new "blocked" state in `CustomGuardrail` and `StandardLoggingGuardrailInformation`. - Modified `ModelArmorGuardrail` to store the Model Armor response and status in request metadata, preventing race conditions and ensuring accurate logging for concurrent requests. - Enhanced the logic for determining the guardrail status based on the Model Armor response.

- Added type ignore comment to `guardrail_status` assignment in `ModelArmorGuardrail` to suppress mypy warnings regarding the use of `metadata.get`. - Ensured that the guardrail status logic remains intact while maintaining type safety.

- Adjusted the placement of the type ignore comment for `guardrail_status` in `ModelArmorGuardrail` to improve clarity while maintaining mypy compatibility. - Ensured that the logic for determining guardrail status remains consistent with previous implementations.

- Removed the unnecessary comment about literal extension at runtime for `guardrail_status` in `ModelArmorGuardrail` to enhance code clarity. - Maintained mypy compatibility while ensuring the logic for guardrail status remains unchanged.

krrishdholakia · 2025-07-28T23:43:25Z

@colesmcintosh let me know when this has been manually qa'ed + ready for review

vercel bot deployed to Preview July 25, 2025 18:58 View deployment

colesmcintosh marked this pull request as ready for review July 25, 2025 23:33

cursor bot reviewed Jul 26, 2025

View reviewed changes

litellm/proxy/guardrails/guardrail_hooks/model_armor/model_armor.py Outdated Show resolved Hide resolved

vercel bot deployed to Preview July 26, 2025 21:43 View deployment

vercel bot deployed to Preview July 26, 2025 21:47 View deployment

vercel bot deployed to Preview July 26, 2025 22:09 View deployment

vercel bot deployed to Preview July 26, 2025 22:13 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue #12991

fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue #12991

Uh oh!

colesmcintosh commented Jul 25, 2025

Uh oh!

vercel bot commented Jul 25, 2025 •

edited

Loading

Uh oh!

colesmcintosh commented Jul 26, 2025

Uh oh!

cursor bot left a comment

Uh oh!

Uh oh!

krrishdholakia commented Jul 28, 2025

Uh oh!

Uh oh!

Uh oh!

fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue #12991

Are you sure you want to change the base?

fix(proxy): fix GCP Model Armor guardrail detection and circular reference issue #12991

Uh oh!

Conversation

colesmcintosh commented Jul 25, 2025

Title

Relevant issues

Pre-Submission checklist

Type

Changes

Root Cause

Changes Made

Testing

Uh oh!

vercel bot commented Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

colesmcintosh commented Jul 26, 2025

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

krrishdholakia commented Jul 28, 2025

Uh oh!

Uh oh!

vercel bot commented Jul 25, 2025 •

edited

Loading