
Conversation

@yuvmen (Member) commented Sep 18, 2025

In preparation for switching to considering token length instead of error frame count, we take metrics of the token length of stacktraces being sent, to be able to map out the statistics and the impact that change would make. Instrumented get_token_count to monitor how long it takes.

Introduces usage of the `tokenizers` library for token counting. Adds the local tokenization model to Sentry so tokenization can run without external dependencies.
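
A rough sketch of the shape this takes (not the exact implementation; get_stacktrace_string, get_grouping_info_from_variants, and metrics are existing Sentry helpers referenced in the diff, the metric name is illustrative, and the tokenizer mirrors the jinaai/jina-embeddings-v2-base-en model added here):

import time

import sentry_sdk
from tokenizers import Tokenizer

# get_stacktrace_string, get_grouping_info_from_variants, and metrics come from
# Sentry's existing grouping/metrics modules.
_TOKENIZER = Tokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-en")


def get_token_count(event, variants) -> int:
    """Return the token length of the event's stacktrace string (0 on failure)."""
    start = time.monotonic()
    try:
        # Prefer the stacktrace string cached on the event during ingest.
        stacktrace_text = event.data.get("stacktrace_string")
        if stacktrace_text is None:
            stacktrace_text = get_stacktrace_string(get_grouping_info_from_variants(variants))
        if not stacktrace_text:
            return 0
        return len(_TOKENIZER.encode(stacktrace_text).ids)
    except Exception:
        # Token counting is best-effort; never fail ingestion over it.
        sentry_sdk.capture_exception()
        return 0
    finally:
        metrics.timing("grouping.similarity.get_token_count", time.monotonic() - start)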


Note

Add token-count metrics for Seer stacktraces using a transformers tokenizer, gate via option, adjust ingest checks, and update tests.

  • Seer Similarity (backend):
    • Add token counting for stacktraces: get_token_count and report_token_count_metric using transformers.AutoTokenizer with jinaai/jina-embeddings-v2-base-en.
    • Emit grouping.similarity.token_count distribution and timing via metrics.timer; capture exceptions with sentry_sdk.
    • Reorder ingest checks in should_call_seer_for_grouping to run _has_too_many_contributing_frames after _has_empty_stacktrace_string.
  • Options:
    • Register seer.similarity.token_count_metrics_enabled (Bool, default True).
  • Dependencies:
    • Add transformers>=4.21.0.
  • Tests:
    • Add token count tests and adjust assertions to accommodate additional metric calls.

Written by Cursor Bugbot for commit d079587.

…Seer

In preparation for switching to considering token length instead of error frame count,
we take metrics of the token length of stacktraces being sent, to be able to map out the statistics
and the impact that change would make. Instrumented get_token_count to monitor how long it takes.
We use the existing tiktoken library, which was already in use in Sentry.
@yuvmen yuvmen requested a review from a team as a code owner September 18, 2025 22:22
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Sep 18, 2025
@yuvmen yuvmen requested a review from lobsterkatie September 18, 2025 22:22
@yuvmen yuvmen changed the title feat(ai_grouping) - Send token length metrics on stacktraces sent to Seer feat(ai_grouping): Send token length metrics on stacktraces sent to Seer Sep 18, 2025
cursor[bot]

This comment was marked as outdated.

Comment on lines 338 to 368
            sample_rate=options.get("seer.similarity.metrics_sample_rate"),
            tags={**shared_tags, "outcome": "block"},
        )
        metrics.distribution(
            "grouping.similarity.token_count",
            get_token_count(event, variants),
            sample_rate=options.get("seer.similarity.metrics_sample_rate"),
            tags={
                "platform": platform,
                "frame_check_outcome": "block",
            },
        )
        return True

    metrics.incr(
        "grouping.similarity.frame_count_filter",
        sample_rate=options.get("seer.similarity.metrics_sample_rate"),
        tags={**shared_tags, "outcome": "pass"},
    )
    metrics.distribution(
        "grouping.similarity.token_count",
        get_token_count(event, variants),
        sample_rate=options.get("seer.similarity.metrics_sample_rate"),
        tags={
            "platform": platform,
            "frame_check_outcome": "pass",
        },
    )
    return False


Contributor

Potential bug: The has_too_many_contributing_frames function calls get_token_count twice in its main execution path, instead of reusing the result from the first call.
  • Description: In the has_too_many_contributing_frames function, the token count is calculated multiple times unnecessarily. After an initial call to get_token_count whose result is stored in the token_count variable, the function is called again for metrics reporting in both the "block" and "pass" branches, instead of reusing the value from the token_count variable. This leads to a second, redundant execution of an expensive operation in a critical event ingestion path, impacting performance for every event processed by Seer.

  • Suggested fix: In the "block" and "pass" branches of has_too_many_contributing_frames, reuse the token_count variable for the metrics.timing call instead of calling get_token_count a second time. The "bypass" branch should also be refactored to avoid a separate call, perhaps by calculating the token count once at the top of the function.
    severity: 0.6, confidence: 0.95
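
A minimal sketch of that refactor, assuming the function shape shown in the excerpt above (options, metrics, and get_token_count are the same names as in the excerpt; _exceeds_frame_limit is a hypothetical stand-in for the existing frame-count check):

def _has_too_many_contributing_frames(event, variants, platform, shared_tags) -> bool:
    # Compute once, reuse in whichever branch we take.
    token_count = get_token_count(event, variants)
    sample_rate = options.get("seer.similarity.metrics_sample_rate")

    def _report(outcome: str) -> None:
        metrics.incr(
            "grouping.similarity.frame_count_filter",
            sample_rate=sample_rate,
            tags={**shared_tags, "outcome": outcome},
        )
        metrics.distribution(
            "grouping.similarity.token_count",
            token_count,
            sample_rate=sample_rate,
            tags={"platform": platform, "frame_check_outcome": outcome},
        )

    if _exceeds_frame_limit(event, variants):  # hypothetical stand-in for the existing check
        _report("block")
        return True

    _report("pass")
    return False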


@codecov

codecov bot commented Sep 18, 2025

Codecov Report

❌ Patch coverage is 96.42857% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/sentry/seer/similarity/utils.py 96.36% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #99873      +/-   ##
==========================================
+ Coverage   78.81%   81.30%   +2.49%     
==========================================
  Files        8699     8593     -106     
  Lines      385940   383015    -2925     
  Branches    24413    23858     -555     
==========================================
+ Hits       304162   311429    +7267     
+ Misses      81427    71243   -10184     
+ Partials      351      343       -8     

    try:
        timer_tags["has_content"] = False

        # Get the encoding for cl100k_base (used by GPT-3.5-turbo/GPT-4)
Member

we should double check that this is the same/similar enough to the tokenizer our embeddings model uses

Member Author

hmm, that's actually a good point. I used what we use in Sentry; in Seer it's actually something else. I looked and it seems like the actual models we use in Seer are environment dependent: locally it uses a dummy and in production they come from Google storage. I'll ask the AI team, let's see.

Member

Oh, I thought it was the same one! Yeah, I def think we should try to align if possible.

        timer_tags["has_content"] = False

        # Get the encoding for cl100k_base (used by GPT-3.5-turbo/GPT-4)
        encoding = tiktoken.get_encoding("cl100k_base")
Member

does this line have to be done every function call, or could it be static?
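
It could, for example, be loaded once and memoized; a sketch, assuming tiktoken stays the tokenizer (the helper name is hypothetical):

from functools import lru_cache

import tiktoken


@lru_cache(maxsize=1)
def _get_encoding() -> tiktoken.Encoding:
    # Loading the BPE ranks is the expensive part; do it once per process.
    return tiktoken.get_encoding("cl100k_base")


# Inside get_token_count:
# token_count = len(_get_encoding().encode(stacktrace_text))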

            timer_tags["source"] = "no_stacktrace_string"
            return 0

    except Exception:
Member

should we report the exception to sentry?
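
For example, the except branch could capture it (a sketch; per the Note above, the final version does capture exceptions with sentry_sdk):

    except Exception:
        # Report to Sentry but keep token counting best-effort: a tokenizer
        # failure should never break event ingestion.
        sentry_sdk.capture_exception()
        return 0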

)


def get_token_count(
Member

i'm wondering if we should put this behind an option for the initial deploy just to be safe?

Member Author

a part of me feels like since I am wrapping it with a try it should be fine, but I don't mind being extra safe here. You mean like internally inside this get_token_count, to not even perform the try and just return a zero? Or around even sending the metrics?

Member

just around being able to debug any exceptions / know why they're happening

Member

oh whoops, I replied to the wrong comment. I mean just around running the function to report the metric, in case for whatever reason there's a problem when we deploy and we need to roll back (thinking perf/memory/whatever problem).
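
A sketch of that gate, using the seer.similarity.token_count_metrics_enabled option this PR registers (the function body is illustrative; options, metrics, and get_token_count are the existing names from the diff):

def report_token_count_metric(event, variants, platform) -> None:
    # Kill switch for the whole code path, so a bad deploy can be rolled back
    # by flipping the option instead of reverting the release.
    if not options.get("seer.similarity.token_count_metrics_enabled"):
        return

    metrics.distribution(
        "grouping.similarity.token_count",
        get_token_count(event, variants),
        sample_rate=options.get("seer.similarity.metrics_sample_rate"),
        tags={"platform": platform},
    )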

Member

@JoshFerge JoshFerge left a comment

makes sense to me.



def get_token_count(
    event: Event | GroupEvent, variants: dict[str, BaseVariant] | None = None
Member

@lobsterkatie lobsterkatie Sep 19, 2025

Is there a reason variants is typed as being nullable? AFAIK that shouldn't be possible.

            data={"title": "Broken event"},
        )

        # Should not raise an exception, even with bad data
Member

@lobsterkatie lobsterkatie Sep 19, 2025

Does this in fact raise an exception inside the function? Can we assert on that?

Member Author

I will be adding a report to Sentry there, and I will assert it gets called

@yuvmen (Member Author) commented Sep 19, 2025

refactored a bunch, could use another look - still up in the air whether tiktoken is going to be a good gauge for our Seer token count, so I will need to see about that

cursor[bot]

This comment was marked as outdated.


def report_token_count_metric(
    event: Event | GroupEvent,
    variants: dict[str, BaseVariant] | None,
Member

Same as before - this shouldn't be nullable.

Member Author

bah cursor keeps adding it for some reason 🤦

    try:
        timer_tags["has_content"] = False
        timer_tags["cached"] = event.data.get("stacktrace_string") is not None
        timer_tags["source"] = "stacktrace_string"
Member

Suggested change
timer_tags["source"] = "stacktrace_string"
timer_tags["source"] = "cached_string"

(To better differentiate from the get_stacktrace_string source.)

Member Author

but there's a "cached" tag, though I guess I can just lose that and unify it

Member Author

actually I prefer it like this, I don't want to have to write "non_cached_string" on the other, and I don't want it to just say "stacktrace_string" and have someone need to know "cached_string" also exists.

        stacktrace_text = event.data.get("stacktrace_string")

        if stacktrace_text is None:
            stacktrace_text = get_stacktrace_string(get_grouping_info_from_variants(variants))
Member

Suggested change
stacktrace_text = get_stacktrace_string(get_grouping_info_from_variants(variants))
stacktrace_text = get_stacktrace_string(get_grouping_info_from_variants(variants))
timer_tags["source"] = "get_stacktrace_string"

(To address cursor's complaint, which I think is valid.)

Member Author

not sure I understand this one, adding the source tag here? I set it above always, and I don't see any cursor comment about this

Member

This is the comment I'm talking about:

(screenshot of the Cursor comment referenced above)

So yeah, if you have the cached boolean separately (which I wasn't thinking about), then how you had it originally is fine. If you go with unifying them, though, having cached_stacktrace and get_stacktrace_string as two different values tells you whether or not it was there, just like cached would.


    def test_handles_exception_gracefully(self) -> None:
        """Test that get_token_count handles exceptions gracefully and returns 0."""
        # Create an event with cached stacktrace that will cause tiktoken to fail
Member

I feel like this is a misleading comment - the only reason tiktoken fails is because we later force it to, not because of the cached stacktrace.

        mock_encode.side_effect = ValueError("Tiktoken encoding failed")

        with patch("sentry.seer.similarity.utils.logger.error") as mock_logger_error:
            token_count = get_token_count(broken_event, variants=None, platform="python")
Member

Suggested change
token_count = get_token_count(broken_event, variants=None, platform="python")
token_count = get_token_count(broken_event, variants={}, platform="python")

(Since variants should be non-nullable.)
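
A sketch of how the test could assert on the report, under the assumptions from this thread: the except branch ends up calling sentry_sdk.capture_exception, get_tokenizer is the patchable entry point in sentry.seer.similarity.utils, and the event fixture is a simple stand-in whose data carries a cached stacktrace string so the tokenizer is actually reached:

from unittest.mock import Mock, patch

from sentry.seer.similarity.utils import get_token_count


    def test_handles_exception_gracefully(self) -> None:
        # Stand-in event with a cached stacktrace string, so get_token_count
        # reaches the tokenizer instead of returning early.
        broken_event = Mock(data={"stacktrace_string": "some cached stacktrace"})

        with (
            patch("sentry.seer.similarity.utils.get_tokenizer") as mock_get_tokenizer,
            patch("sentry.seer.similarity.utils.sentry_sdk.capture_exception") as mock_capture,
        ):
            mock_get_tokenizer.return_value.encode.side_effect = ValueError("encoding failed")
            token_count = get_token_count(broken_event, variants={}, platform="python")

        assert token_count == 0
        mock_capture.assert_called_once()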

cursor[bot]

This comment was marked as outdated.

]

IGNORED_FILENAMES = ["<compiler-generated>"]
TOKENIZER = Tokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-en")
Contributor

Bug: Global Tokenizer Initialization Causes Startup Delays

The TOKENIZER is initialized globally at module import time via Tokenizer.from_pretrained(). This can significantly slow down application startup, make unnecessary network requests, and potentially crash the module import if the model fails to load, even when the token counting feature is disabled.
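
One common way to avoid that is to defer loading until first use; a sketch, where TOKENIZER_MODEL_PATH points at the local model file a later commit adds under data/models (the exact path here is illustrative), and the pinned revision comes from the review suggestion further down:

import os
from functools import lru_cache

from tokenizers import Tokenizer

TOKENIZER_MODEL_PATH = "data/models/jina-embeddings-v2-base-en/tokenizer.json"  # illustrative path


@lru_cache(maxsize=1)
def get_tokenizer() -> Tokenizer:
    # Nothing is loaded at import time; the first caller pays the cost once.
    if os.path.exists(TOKENIZER_MODEL_PATH):
        return Tokenizer.from_file(TOKENIZER_MODEL_PATH)
    # Remote fallback, pinned to a known revision (see the review discussion below).
    return Tokenizer.from_pretrained(
        "jinaai/jina-embeddings-v2-base-en",
        revision="322d4d7e2f35e84137961a65af894fda0385eb7a",
    )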


…cal file

Saved the model locally under data/models and added a README for downloading it again.

        if stacktrace_text:
            timer_tags["has_content"] = True
            return len(get_tokenizer().encode(stacktrace_text))
Contributor

Bug: Token Count Calculation and Stacktrace Metrics Issues

The get_token_count function incorrectly calculates token counts by using len() on the tokenizer's Encoding object instead of its .ids attribute. It also reports misleading timer_tags["source"] metrics, showing "stacktrace_string" even when the stacktrace is generated rather than cached. Furthermore, calling get_stacktrace_string with empty variants may lead to a crash.
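
If the length semantics of the returned Encoding object are in doubt, counting the .ids list is unambiguous; a two-line sketch using the same names as the excerpt above:

encoding = get_tokenizer().encode(stacktrace_text)
token_count = len(encoding.ids)  # number of token ids, independent of how Encoding defines its length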


            token_count = get_token_count(broken_event, variants=variants, platform="python")
            mock_logger_exception.assert_called()

        assert token_count == 0
Contributor

Bug: Test Fails to Validate Tokenizer Exception Handling

The test_handles_exception_gracefully test doesn't properly verify tokenizer exception handling. It attempts to mock a non-existent TOKENIZER global, and get_token_count returns early when variants is empty, preventing the mocked exception from ever being reached.


        # Double-check pattern to avoid race conditions
        if self._tokenizer is None:
            # Try to load from local model first, fallback to remote
            if os.path.exists(TOKENIZER_MODEL_PATH):
Contributor

this should always be true, no? the remote fallback might not be completely secure or successful (e.g., jinaai makes the huggingface repo private)

if for some reason we do want the remote fallback, then it should supply a pinned revision:

self._tokenizer = Tokenizer.from_pretrained(
    "jinaai/jina-embeddings-v2-base-en",
    revision="322d4d7e2f35e84137961a65af894fda0385eb7a",
)

Member Author

@yuvmen yuvmen Oct 2, 2025

hmm yeah, I was conflicted on the remote fallback. Looking at it again I am not sure it makes sense, I think I will just remove it. If this stays in our source code it should not get broken, and tests will fail if it does. I will consider it if I end up having to host the model file.

@yuvmen yuvmen merged commit 7d8584b into master Oct 14, 2025
68 checks passed
@yuvmen yuvmen deleted the yuvmen/token-count-stacktraces-poc branch October 14, 2025 20:01
@yuvmen yuvmen added the Trigger: Revert Add to a merged PR to revert it (skips CI) label Oct 14, 2025
@getsentry-bot
Contributor

PR reverted: 982c529

getsentry-bot added a commit that referenced this pull request Oct 14, 2025
yuvmen added a commit that referenced this pull request Oct 15, 2025
…eer (#101477)

In preparation for switching to considering token length instead of
frame count of errors, we take metrics of the token length of
stacktraces being sent, to be able to map out the statistics and the
impact that change would make. Instrumented get_token_count to monitor
how long it takes.

Introduces usage of tokenizers library for token count. Added the local
tokenization model to Sentry to be used for tokenization without
external dependencies.

Redo of #99873 which removed `tiktoken` dep by mistake. It is still used
in `getsentry` and causes build errors if removed.
chromy pushed a commit that referenced this pull request Oct 17, 2025
…eer (#99873)

In preparation for switching to considering token length instead of
frame count of errors, we take metrics of the token length of
stacktraces being sent, to be able to map out the statistics and the
impact that change would make. Instrumented get_token_count to monitor
how long it takes.

Introduces usage of `tokenizers` library for token count. Added the
local tokenization model to Sentry to be used for tokenization without
external dependencies.
chromy pushed a commit that referenced this pull request Oct 17, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Oct 30, 2025