diff --git a/.agents/tasks/2025/08/21-0939-codetype-interface b/.agents/tasks/2025/08/21-0939-codetype-interface index 2814139..1a39c4b 100644 --- a/.agents/tasks/2025/08/21-0939-codetype-interface +++ b/.agents/tasks/2025/08/21-0939-codetype-interface @@ -40,3 +40,6 @@ Implement the CodeObjectWrapper as designed. Update the Tracer trait as well as There is an issue in the current implementation. We don't use caching effectively, since we create a new CodeObjectWrapper at each callback_xxx call. We need a global cache, probably keyed by the code object id. Propose design changes and update the design documents. Don't implement the changes themselves before I approve them. --- FOLLOW UP TASK --- Implement the global code object registry. + +--- FOLLOW UP TASK --- +I have ADHD and can't follow long verbose documentation. Rewrite the documents in 'design-docs' because now they are full of shit \ No newline at end of file diff --git a/design-docs/adr/0001-file-level-single-responsibility.md b/design-docs/adr/0001-file-level-single-responsibility.md index 9252118..e2ca322 100644 --- a/design-docs/adr/0001-file-level-single-responsibility.md +++ b/design-docs/adr/0001-file-level-single-responsibility.md @@ -1,65 +1,28 @@ # ADR 0001: File-Level Single Responsibility Refactor - -- **Status:** Proposed +- **Status:** Accepted - **Date:** 2025-10-01 -- **Deciders:** Platform / Runtime Tracing Team -- **Consulted:** Python Tooling WG, Developer Experience WG ## Context - -The codetracer Python recorder crate has evolved quickly and several source files now mix unrelated concerns: -- [`src/lib.rs`](../../codetracer-python-recorder/src/lib.rs) hosts PyO3 module wiring, global logging setup, tracing session state, and filesystem validation in one place. -- [`src/runtime_tracer.rs`](../../codetracer-python-recorder/src/runtime_tracer.rs) interleaves activation gating, writer lifecycle control, PyFrame helpers, and Python value encoding logic, making it challenging to test or extend any portion independently. -- [`src/tracer.rs`](../../codetracer-python-recorder/src/tracer.rs) combines sys.monitoring shim code with the `Tracer` trait, callback registration, and global caches. -- [`codetracer_python_recorder/api.py`](../../codetracer-python-recorder/codetracer_python_recorder/api.py) mixes format constants, backend interaction, context manager ergonomics, and environment based auto-start side effects. - -This violates the Single Responsibility Principle (SRP) at the file level, obscures ownership boundaries, and increases the risk of merge conflicts and regressions. Upcoming work on richer value capture and optional streaming writers will add more logic to these files unless we carve out cohesive modules now. +Big files (`src/lib.rs`, `runtime_tracer.rs`, `tracer.rs`, and `codetracer_python_recorder/api.py`) each juggle multiple jobs. Upcoming value-capture + streaming work would make that worse. ## Decision - -We will reorganise both the Rust crate and supporting Python package so that each file covers a single cohesive topic and exposes a narrow interface. Concretely: -1. Restrict `src/lib.rs` to PyO3 module definition and `pub use` re-exports. Move logging configuration into `src/logging.rs` and tracing session lifecycle into `src/session.rs`. -2. Split the current runtime tracer into a `runtime` module directory with dedicated files for activation control, value encoding, and output file management. The façade in `runtime/mod.rs` will assemble these pieces and expose the existing `RuntimeTracer` API. 
-3. Introduce a `monitoring` module directory that separates sys.monitoring primitive bindings (`EventId`, `ToolId`, registration helpers) from the `Tracer` trait and callback dispatch logic. -4. Decompose the Python helper package by moving session state management into `session.py`, format constants and validation into `formats.py`, and environment auto-start into `auto_start.py`, while keeping public functions surfaced through `api.py` and `__init__.py`. - -These changes are mechanical reorganisations—no behavioural changes are expected. Public Rust and Python APIs must remain source compatible during the refactor. +Break the crate and Python package into focused modules: +- `src/lib.rs` keeps PyO3 wiring only; logging goes to `logging.rs`, session lifecycle to `session.rs`. +- `runtime/` becomes a directory with `mod.rs`, `activation.rs`, `value_encoder.rs`, `output_paths.rs`. +- `monitoring/` holds sys.monitoring types in `mod.rs` and the `Tracer` trait/dispatcher in `tracer.rs`. +- Python package gains `session.py`, `formats.py`, `auto_start.py`; `api.py` stays as the façade. +Public APIs remain unchanged—files just move. ## Consequences - -- **Positive:** - - Easier onboarding for new contributors because each file advertises a single purpose. - - Improved unit testability; e.g., Python value encoding can be tested without instantiating the full tracer. - - Lower merge conflict risk: teams can edit activation logic without touching writer code. - - Clearer extension points for upcoming streaming writer and richer metadata work. -- **Negative / Risks:** - - Temporary churn in module paths may invalidate outstanding branches; mitigation is to stage work in small, reviewable PRs. - - Developers unfamiliar with Rust module hierarchies will need guidance to update `mod` declarations and `use` paths correctly. - - Python packaging changes require careful coordination to avoid circular imports when moving auto-start logic. - -## Implementation Guidelines for Junior Developers - -1. **Work Incrementally.** Aim for small PRs (≤500 LOC diff) that move one responsibility at a time. After each PR run `just test` and ensure all linters stay green. -2. **Preserve APIs.** When moving functions, re-export them from their new module so that existing callers (Rust and Python) compile without modification in the same PR. -3. **Add Focused Tests.** Whenever a helper is extracted (e.g., value encoding), add or migrate unit tests that cover its edge cases. -4. **Document Moves.** Update doc comments and module-level docs to reflect the new structure. Remove outdated TODOs or convert them into follow-up issues. -5. **Coordinate on Shared Types.** When splitting `runtime_tracer.rs`, agree on ownership for shared structs (e.g., `RuntimeTracer` remains in `runtime/mod.rs`). Use `pub(crate)` to keep internals encapsulated. -6. **Python Imports.** After splitting the Python modules, ensure `__all__` in `__init__.py` continues to export the public API. Use relative imports to avoid accidental circular dependencies. -7. **Parallel Work.** Follow the sequencing from `design-docs/file-level-srp-refactor-plan.md` to know when tasks can proceed in parallel. - -## Testing Strategy - -- Run `just test` locally before submitting each PR. -- Add targeted Rust tests for new modules (e.g., `activation` and `value_encoder`). -- Extend Python tests to cover auto-start logic and the context manager after extraction. -- Compare trace outputs against saved fixtures to ensure refactors do not alter serialized data. 
- -## Alternatives Considered - -- **Leave the layout as-is:** rejected because it impedes planned features and increases onboarding cost. -- **Large rewrite in a single PR:** rejected due to high risk and code review burden. - -## Follow-Up Actions - -- After completing the refactor, update architecture diagrams in `design-docs` to match the new module structure. -- Schedule knowledge-sharing sessions for new module owners to walk through their areas. +- 👍 Easier onboarding, smaller testable units, fewer merge conflicts, clear extension points. +- 👎 Short-term churn in module paths and imports; coordinate PRs carefully. + +## Notes for implementers +- Move code in small PRs, re-export functions so callers keep working. +- Add/adjust unit tests when extracting helpers. +- Update docs/comments alongside code moves. +- Follow the sequencing in the file-level SRP plan for parallel work. + +## Testing +- Run `just test` after each move. +- Compare trace fixtures to ensure behaviour stays the same. diff --git a/design-docs/adr/0002-function-level-srp.md b/design-docs/adr/0002-function-level-srp.md index df2b745..1d0b61e 100644 --- a/design-docs/adr/0002-function-level-srp.md +++ b/design-docs/adr/0002-function-level-srp.md @@ -1,61 +1,27 @@ -# ADR 0002: Function-Level Single Responsibility Refactor - -- **Status:** Proposed +# ADR 0002: Function-Level SRP +- **Status:** Accepted - **Date:** 2025-10-15 -- **Deciders:** Platform / Runtime Tracing Team -- **Consulted:** Python Tooling WG, Developer Experience WG ## Context - -The codetracer runtime currently exposes several high-traffic functions that blend unrelated concerns, making them difficult to understand, test, and evolve. - -- [`codetracer-python-recorder/src/session.rs:start_tracing`](../../codetracer-python-recorder/src/session.rs) performs logging setup, state guards, filesystem validation and creation, format parsing, Python metadata collection, tracer instantiation, and sys.monitoring installation within one 70+ line function. -- [`codetracer-python-recorder/src/runtime/mod.rs:on_py_start`](../../codetracer-python-recorder/src/runtime/mod.rs) handles activation gating, synthetic filename filtering, argument collection via unsafe PyFrame calls, error logging, and call registration in a single block. -- [`codetracer-python-recorder/src/runtime/mod.rs:on_line`](../../codetracer-python-recorder/src/runtime/mod.rs) interleaves activation checks, frame navigation, locals/globals materialisation, value encoding, variable registration, and memory hygiene for reference counted objects. -- [`codetracer-python-recorder/src/runtime/mod.rs:on_py_return`](../../codetracer-python-recorder/src/runtime/mod.rs) combines activation lifecycle management with value encoding and logging. -- [`codetracer-python-recorder/codetracer_python_recorder/session.py:start`](../../codetracer-python-recorder/codetracer_python_recorder/session.py) mixes backend state checks, path normalisation, format coercion, and PyO3 bridge calls. - -These hotspots violate the Single Responsibility Principle at the function level. When we add new formats, richer activation flows, or additional capture types, we risk regressions because each modification touches fragile, monolithic code blocks. +`start_tracing`, `RuntimeTracer::on_py_start/on_line/on_py_return`, and Python`s `session.start` each cram validation, activation logic, frame poking, and logging into long blocks. That makes fixes risky. 
## Decision - -We will refactor high-traffic functions so that each public entry point coordinates narrowly-scoped helpers, each owning a single concern. - -1. **Trace session start-up:** Introduce a `TraceSessionBootstrap` (Rust) that encapsulates directory preparation, format resolution, and program metadata gathering. `start_tracing` will delegate to helpers like `ensure_trace_directory`, `resolve_trace_format`, and `collect_program_metadata`. Python-side `start` will mirror this by delegating validation to dedicated helpers (`validate_trace_path`, `coerce_format`). -2. **Frame inspection & activation gating:** Extract frame traversal and activation decisions into dedicated helpers inside `runtime/frame_inspector.rs` and `runtime/activation.rs`. Callback methods (`on_py_start`, `on_line`, `on_py_return`) will orchestrate the helpers instead of performing raw pointer work inline. -3. **Value capture pipeline:** Move argument, locals, globals, and return value capture to a `runtime::value_capture` module that exposes high-level functions such as `capture_call_arguments(frame, code)` and `record_visible_scope(writer, frame)`. These helpers will own error handling and ensure reference counting invariants, allowing callbacks to focus on control flow. -4. **Logging and error reporting:** Concentrate logging into small, reusable functions (e.g., `log_trace_event(event_kind, code, lineno)`) so that callbacks do not perform ad hoc logging alongside functional work. -5. **Activation lifecycle:** Ensure `ActivationController` remains the single owner for activation state transitions. Callbacks will query `should_process_event` and `handle_deactivation` helpers instead of duplicating checks. - -The refactor maintains public APIs but reorganises internal call graphs to keep each function focused on orchestration. +Turn those hotspots into thin coordinators that call focused helpers: +- Bootstrap helpers prep directories, formats, and metadata before `start_tracing` proceeds. +- `frame_inspector` handles unsafe frame access; `ActivationController` owns gating. +- `value_capture` records arguments, scopes, and returns. +- Logging/error helpers keep messaging consistent. +Python mirrors this with `_validate_trace_path`, `_coerce_format`, etc. +APIs stay the same for callers. ## Consequences +- 👍 Smaller functions, better unit tests, clearer error handling. +- 👎 More helper modules to navigate; moving unsafe code needs care. -- **Positive:** - - Smaller, intention-revealing functions improve readability and lower the mental load for reviewers modifying callback behaviour. - - Reusable helpers unlock targeted unit tests (e.g., for path validation or locals capture) without invoking the entire tracing stack. - - Error handling becomes consistent and auditable when concentrated in dedicated helpers. - - Future features (streaming writers, selective variable capture) can extend isolated helpers rather than modifying monoliths. -- **Negative / Risks:** - - Increased number of private helper modules/functions may introduce slight organisational overhead for newcomers. - - Extracting FFI-heavy logic requires careful lifetime management; mistakes could introduce reference leaks or double-frees. - - Interim refactors might temporarily duplicate logic until all call sites migrate to the new helpers. - -## Implementation Guidelines - -1. **Preserve semantics:** Validate each step with `just test` and targeted regression fixtures to ensure helper extraction does not change runtime behaviour. -2. 
**Guard unsafe code:** When moving PyFrame interactions, wrap unsafe blocks in documented helpers with clear preconditions and postconditions. -3. **Keep interfaces narrow:** Expose helper functions as `pub(crate)` or module-private to prevent leaking unstable APIs. -4. **Add focused tests:** Unit test helpers for error cases (e.g., invalid path, unsupported format, missing frame) and integrate them into existing test suites. -5. **Stage changes:** Land extractions in small PRs, updating the surrounding code incrementally to avoid giant rewrites. -6. **Document intent:** Update docstrings and module-level docs to describe helper responsibilities, keeping comments aligned with SRP boundaries. - -## Alternatives Considered - -- **Status quo:** Rejected; expanding functionality would keep bloating already-complex functions. -- **Entirely new tracer abstraction:** Unnecessary; existing `RuntimeTracer` shape is viable once responsibilities are modularised. - -## Follow-Up +## Guidance +- Extract one concern at a time, keep helpers `pub(crate)` when possible. +- Wrap unsafe code in documented helpers and add unit tests for each new module. +- Run `just test` + fixture comparisons after every extraction. -- Align sequencing with `design-docs/function-level-srp-refactor-plan.md`. -- Revisit performance benchmarks after extraction to ensure added indirection does not materially affect tracing overhead. +## Follow up +- Track work in the function-level SRP plan and watch performance to ensure the extra indirection stays cheap. diff --git a/design-docs/adr/0003-test-suite-governance.md b/design-docs/adr/0003-test-suite-governance.md index 7fe0dc4..a5e422e 100644 --- a/design-docs/adr/0003-test-suite-governance.md +++ b/design-docs/adr/0003-test-suite-governance.md @@ -1,76 +1,25 @@ -# ADR 0003: Test Suite Governance for codetracer-python-recorder - +# ADR 0003: Test Suite Governance - **Status:** Accepted - **Date:** 2025-10-02 -- **Deciders:** Platform / Runtime Tracing Team -- **Consulted:** Python Tooling WG, Developer Experience WG -- **Informed:** Reliability Engineering Guild ## Context - -`codetracer-python-recorder` currently depends on three distinct harnesses: Rust unit tests inside the crate, Rust integration tests under `codetracer-python-recorder/tests/`, and Python tests under `codetracer-python-recorder/test/`. `just test` wires these together via `cargo nextest run` and `pytest`, but we do not document which behaviours belong to each layer. As a result: -- Contributors duplicate coverage (e.g., API happy paths exist both in Rust integration tests and Python tests) while other areas are untested (no references to `TraceSessionBootstrap::prepare`, `ensure_trace_directory`, or `TraceOutputPaths::configure_writer`). -- The `test/` vs `tests/` split is opaque to new maintainers and tooling; several CI linters only recurse into `tests/`, so Python-only changes can silently reduce coverage. -- Developers add integration-style assertions to Python tests that require spawning interpreters, even when the logic could be exercised cheaply in Rust. -- Doc examples risk drifting from executable reality because doctests are disabled to avoid invoking the CPython runtime. - -Without a clear taxonomy for Rust vs. Python coverage, the test surface is growing unevenly and critical bootstrap/activation code remains unverified. +Rust unit tests, Rust integration tests, and Python tests lived in confusing folders (`tests/` vs `test/`). 
Coverage overlapped in some spots and missed bootstrap/activation code entirely. ## Decision +Adopt a clear taxonomy and matching layout: +- Rust unit tests stay inline under `src/**`. +- Rust integration tests live in `tests/rust/`. +- Python tests live in `tests/python/`. +- Shared fixtures sit under `tests/support/`. +Every PR states which layer it touched, and changes to PyO3 plumbing require Rust coverage while Python-facing changes require pytest coverage. -We will adopt a tiered test governance model and reorganise the repository to make the boundaries explicit. - -1. **Define a test taxonomy.** - - `src/**/*.rs` unit tests (behind `#[cfg(test)]`) cover pure-Rust helpers, pointer/FFI safety shims, and error handling that does not need to cross the FFI boundary. - - `codetracer-python-recorder/tests/rust/**` integration tests exercise PyO3 + CPython interactions (e.g., `CodeObjectRegistry`, `RuntimeTracer` callbacks) and may spin up embedded interpreters. - - `codetracer-python-recorder/tests/python/**` houses all Python-driven tests (pytest/unittest) for public APIs, end-to-end tracing flows, and environment bootstrapping. - - Documentation examples use doctests only when they can run without Python (otherwise they move into the appropriate test layer). - -2. **Restructure the repository.** - - Rename the existing Python `test/` directory to `tests/python/` and update tooling (`pytest` discovery, `pyproject.toml`, `Justfile`) accordingly. - - Move Rust integration tests into `tests/rust/` (keeping module names unchanged) to mirror the taxonomy. - - Introduce a `tests/README.md` that summarises the policy for future contributors. - -3. **Codify placement rules.** - - Every new test must state its target layer in the PR description and follow the directory conventions above. - - Changes touching PyO3 shims (`session`, `runtime`, `monitoring`) must include at least one Rust test; changes to the Python facade (`codetracer_python_recorder`) must include Python coverage unless the change is rust-only plumbing. - - Shared fixtures (temporary trace directories, sample scripts) live under `tests/support/` and are imported from both Rust and Python harnesses to avoid drift. - -4. **Fill immediate coverage gaps.** - - Add focused Rust unit tests for `TraceSessionBootstrap::prepare`, `ensure_trace_directory`, `resolve_trace_format`, and `collect_program_metadata`, including error paths (non-directory target, unsupported format, missing `sys.argv`). - - Add unit tests for `TraceOutputPaths::new` and `configure_writer` to ensure the writer initialises metadata/events files and starts at the expected location. - - Add deterministic tests for `ActivationController` covering activation on enter, deactivation on return, and behaviour when frames originate from synthetic filenames. - - Extend Python tests to cover `_normalize_activation_path` and failure modes of `_coerce_format`/`_validate_trace_path` without booting the Rust tracer. - -5. **Establish guardrails.** - - Update CI to run `cargo nextest run --workspace --tests` and `pytest tests/python` explicitly, making the split visible in logs. - - Track per-layer test counts in `tests/README.md` and flag regressions in coverage reports once we integrate coverage tooling. +We renamed directories, added `tests/README.md`, and split CI steps (`cargo nextest …`, `pytest tests/python`). Focused tests now cover bootstrap, output paths, activation controller, and Python helpers. 
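+
+As a concrete illustration of the Rust unit-test layer, here is a minimal sketch of an inline `#[cfg(test)]` test under `src/**`. The `resolve_trace_format` helper and its signature are stand-ins assumed for this example, not the crate's real API; the point is the focused error-path coverage (unsupported format) this ADR asks for.
+
+```rs
+// Illustrative stand-in for a bootstrap helper; the real signature may differ.
+#[derive(Debug, PartialEq)]
+enum TraceFormat {
+    Json,
+    Binary,
+}
+
+fn resolve_trace_format(raw: &str) -> Result<TraceFormat, String> {
+    match raw {
+        "json" => Ok(TraceFormat::Json),
+        "binary" => Ok(TraceFormat::Binary),
+        other => Err(format!("unsupported trace format: {other}")),
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn resolves_known_formats() {
+        assert_eq!(resolve_trace_format("json").unwrap(), TraceFormat::Json);
+    }
+
+    #[test]
+    fn rejects_unknown_format() {
+        assert!(resolve_trace_format("yaml").is_err());
+    }
+}
+```
+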
## Consequences +- 👍 Easier onboarding, clearer failures in CI, less duplicate code. +- 👎 Short-term churn in scripts/IDE configs while people learn the new paths. -- **Positive:** - - Onboarding improves because contributors follow a documented decision tree when adding tests. - - Critical bootstrap/activation paths gain deterministic unit coverage, reducing reliance on slow end-to-end scripts. - - CI output clarifies which harness failed, shortening the feedback loop. - - Shared fixtures reduce duplication between Rust and Python tests. - -- **Negative / Risks:** - - The directory rename requires touch-ups in existing scripts, IDE run configurations, and documentation. - - Contributors must learn the taxonomy, and reviews need to police placement for a few weeks. - - Running Python tests from a subdirectory may miss legacy tests until the migration completes; mitigated by performing the move in the same PR as the tooling updates. - -## Implementation Notes - -- Perform restructuring in stages (rename directories, update tooling, then move tests) to keep diffs reviewable. -- Introduce helper crates/modules under `tests/support/` to share temporary-directory setup between Rust and Python as soon as the taxonomy lands. -- Add `ruff` and `cargo fmt` hooks to ensure moved tests stay linted after the reorganisation. - -## Status Tracking - -- This ADR is **Accepted**. Directory restructuring, unit/integration coverage for the targeted modules, and the split CI/coverage jobs have landed; future adjustments will be tracked in follow-up ADRs if required. - -## Alternatives Considered - -- **Keep the current layout and document it informally.** Rejected because the `test/` vs `tests/` split is already causing confusion and does not solve the missing coverage gaps. -- **Create a monolithic Python integration test harness only.** Rejected because many FFI safety checks are cheaper to assert in Rust without spinning up subprocesses. -- **Adopt a coverage percentage gate.** Deferred until we have stable baselines; enforcing a percentage before addressing the structural issues would block unrelated work. +## Implementation notes +- Move files + update tooling in the same PR to avoid gaps. +- Share fixtures early so both languages use the same helpers. +- Keep doctests disabled unless they run without CPython. diff --git a/design-docs/adr/0004-error-handling-policy.md b/design-docs/adr/0004-error-handling-policy.md index bf1627a..74606be 100644 --- a/design-docs/adr/0004-error-handling-policy.md +++ b/design-docs/adr/0004-error-handling-policy.md @@ -1,105 +1,25 @@ -# ADR 0004: Error Handling Policy for codetracer-python-recorder - -- **Status:** Proposed +# ADR 0004: Error Handling Policy +- **Status:** Accepted - **Date:** 2025-10-02 -- **Deciders:** Runtime Tracing Maintainers -- **Consulted:** Python Tooling WG, Observability WG -- **Informed:** Developer Experience WG, Release Engineering ## Context - -The Rust-backed recorder currently propagates errors piecemeal: -- PyO3 entry points bubble up plain `PyRuntimeError` instances with free-form strings (e.g., `src/session.rs:21-52`, `src/runtime/mod.rs:77-126`). -- Runtime helpers panic on invariant violations, which will abort the host interpreter because we do not fence panics at the FFI boundary (`src/runtime/mod.rs:107-120`, `src/runtime/activation.rs:24-33`, `src/runtime/value_encoder.rs:61-78`). 
-- Monitoring callbacks rely on `GLOBAL.lock().unwrap()` so poisoned mutexes or lock errors terminate the process (`src/monitoring/tracer.rs:268` and subsequent callback shims). -- Python helpers expose bare `RuntimeError`/`ValueError` without linking to a shared policy, and auto-start simply re-raises whatever the Rust layer emits (`codetracer_python_recorder/session.py:27-63`, `codetracer_python_recorder/auto_start.py:24-36`). -- Exit codes, log destinations, and trace-writer fallback behaviour are implicit; a disk-full failure today yields a generic exception and can leave partially written outputs. - -The lack of a central error façade makes it hard to enforce user-facing guarantees, reason about detaching vs aborting behaviour, or meet the operational goals we have been given: stable error codes, structured logs, optional JSON diagnostics, policy switches, and atomic trace outputs. +Errors currently bubble up as ad-hoc `PyRuntimeError`s or panics that can crash the host process. Messages vary, nothing is classified, and trace files can be left half-written. ## Decision - -We will introduce a recorder-wide error handling policy centred on a dedicated `recorder-errors` crate and a Python exception hierarchy. The policy follows fifteen guiding principles supplied by operations and is designed so the “right way” is the only easy way for contributors. - -### 1. Single Error Façade -- Create a new workspace crate `recorder-errors` exporting `RecorderError`, a structural error type with fields `{ kind: ErrorKind, code: ErrorCode, message: Cow<'static, str>, context: ContextMap, source: RecorderErrorSource }`. -- Provide `RecorderResult = Result` and convenience macros (`usage!`, `enverr!`, `target!`, `bug!`, `ensure_usage!`, `ensure_env!`, etc.) so Rust modules can author classified failures with one line. -- Require every other crate (including the PyO3 module) to depend on `recorder-errors`; direct construction of `PyErr`/`io::Error` is disallowed outside the façade. -- Maintain `ErrorCode` as a small, grep-able enum (e.g., `ERR_TRACE_DIR_NOT_DIR`, `ERR_FORMAT_UNSUPPORTED`), with documentation in the crate so codes stay stable across releases. - -### 2. Clear Classification & Exit Codes -- Define four top-level `ErrorKind` variants: - - `Usage` (caller mistakes, bad flags, conflicting sessions). - - `Environment` (IO, permissions, resource exhaustion). - - `Target` (user code raised or misbehaved while being traced). - - `Internal` (bugs, invariants, unexpected panics). -- Map kinds to fixed process exit codes (`Usage=2`, `Environment=10`, `Target=20`, `Internal=70`). These are surfaced by CLI utilities and exported via the Python module for embedding tooling. -- Document canonical examples for each kind in the ADR appendix and in crate docs. - -### 3. FFI Safety & Python Exceptions -- Add an `ffi` module that wraps every `#[pyfunction]` with `catch_unwind`, converts `RecorderError` into a custom Python exception hierarchy (`RecorderError` base, subclasses `UsageError`, `EnvironmentError`, `TargetError`, `InternalError`), and logs panic payloads before mapping them to `InternalError`. -- PyO3 callbacks (`install_tracer`, monitoring trampolines) will run through `ffi::dispatch`, ensuring we never leak panics across the boundary. - -### 4. Output Channels & Diagnostics -- Forbid `println!`/`eprintln!` outside the logging module; diagnostic output goes to stderr via `tracing`/`log` infrastructure. 
-- Introduce a structured logging wrapper that attaches `{ run_id, trace_id, error_code }` fields to every error record. Provide `--log-level`, `--log-file`, and `--json-errors` switches that route structured diagnostics either to stderr or a configured file. - -### 5. Policy Switches -- Introduce a runtime policy singleton (`RecorderPolicy` stored in `OnceCell`) configured via CLI flags or environment variables: `--on-recorder-error=abort|disable`, `--require-trace`, `--keep-partial-trace`. -- Define semantics: `abort` -> propagate error and non-zero exit; `disable` -> detach tracer, emit structured warning, continue host process. Document exit codes for each combination in module docs. - -### 6. Atomic, Truthful Outputs -- Wrap trace writes behind an IO façade that stages files in a temp directory and performs atomic rename on success. -- When `--keep-partial-trace` is enabled, mark outputs with a `partial=true`, `reason=` trailer. Otherwise ensure no trace files are left behind on failure. - -### 7. Assertions with Containment -- Replace `expect`/`unwrap` (e.g., `src/runtime/mod.rs:114`, `src/runtime/activation.rs:26`, `src/runtime/value_encoder.rs:70`) with classified `bug!` assertions that convert to `RecorderError` while still triggering `debug_assert!` in dev builds. -- Document invariants in the new crate and ensure fuzzing/tests observe the diagnostics. - -### 8. Preflight Checks -- Centralise version/compatibility checks in a `preflight` module called from `start_tracing`. Validate Python major.minor, ABI compatibility, trace schema version, and feature flags before installing monitoring callbacks. -- Embed recorder version, schema version, and policy hash into every trace metadata file via `TraceWriter` extensions. - -### 9. Observability & Metrics -- Emit structured counters for key error pathways (dropped events, detach reasons, panics caught). Provide a `RecorderMetrics` sink with a no-op default and an optional exporter trait. -- When `--json-errors` is set, append a single-line JSON trailer to stderr containing `{ "error_code": .., "kind": .., "message": .., "context": .. }`. - -### 10. Failure-Path Testing -- Add exhaustive unit tests in `recorder-errors` for every `ErrorCode` and conversion path. -- Extend Rust integration tests to simulate disk-full (`ENOSPC`), permission denied, target exceptions, callback panics, SIGINT during detach, and partial trace recovery. -- Add Python tests asserting the custom exception hierarchy and policy toggles behave as documented. - -### 11. Performance-Aware Defences -- Reserve heavyweight diagnostics (stack captures, large context maps) for error paths. Hot callbacks use cheap checks (`debug_assert!` in release builds). Provide sampled validation hooks if additional runtime checks become necessary. - -### 12. Tooling Enforcement -- Add workspace lints (`deny(panic_in_result_fn)`, Clippy config) and a `just lint-errors` task that fails if `panic!`, `unwrap`, or `expect` appear outside `recorder-errors`. -- Disallow `anyhow`/`eyre` except inside the error façade with documented justification. - -### 13. Developer Ergonomics -- Export prelude modules (`use recorder_errors::prelude::*;`) so contributors get macros and types with a single import. -- Provide cookbook examples in the crate documentation and link the ADR so developers know how to map new errors to codes quickly. - -### 14. 
Documented Guarantees -- Document, in README + crate docs, the three promises: no stdout writes, trace outputs are atomic (or explicitly partial), and error codes stay stable within a minor version line. - -### 15. Scope & Non-Goals -- The recorder never aborts the host process; even internal bugs downgrade to `InternalError` surfaced through policy switches. -- Business-specific retention, shipping logs, or analytics integrations remain out of scope for this ADR. +Adopt a single policy driven by a new `recorder-errors` crate and matching Python exceptions: +- `RecorderError` carries `{kind, code, message, context}` with a tiny enum for codes. +- Kinds map to fixed exit codes: Usage=2, Environment=10, Target=20, Internal=70. +- All PyO3 entry points go through a wrapper that catches panics and turns errors into `RecorderError` subclasses (`UsageError`, etc.). +- Output writing stages files in a temp dir and either renames atomically or marks the trace as partial. +- A runtime policy switch controls whether we abort or just disable tracing on failure. +- Logging uses structured records; optional JSON diagnostics attach `run_id`, `trace_id`, and `error_code`. +- `panic!`, `unwrap`, and `expect` are banned outside guarded helpers; macros (`usage!`, `enverr!`, `bug!`) replace them. ## Consequences +- 👍 Stable error codes, no more interpreter-aborting panics, consistent logs for tooling. +- 👎 More plumbing to retrofit existing call sites and slightly more disk I/O for atomic writes. -- **Positive:** Structured errors enable user tooling, stable exit codes improve scripting, and panics are contained so we remain embedder-friendly. Central macros reduce boilerplate and make reviewers enforce policy easily. -- **Negative / Risks:** Introducing a new crate and policy layer adds upfront work and requires retrofitting existing call sites. Atomic IO staging may increase disk usage for large traces. Contributors must learn the new taxonomy and update tests accordingly. - -## Rollout & Status Tracking - -- Implementation proceeds under a dedicated plan (see "Error Handling Implementation Plan"). The ADR moves to **Accepted** once the façade crate, FFI wrappers, and policy switches are merged, and the legacy ad-hoc errors are removed. -- Future adjustments (e.g., new error codes) must update `recorder-errors` documentation and ensure backward compatibility for exit codes. - -## Alternatives Considered - -- **Use `anyhow` throughout and convert at the boundary.** Rejected because it obscures error provenance, offers no stable codes, and encourages stringly-typed errors. -- **Catch panics lazily within individual callbacks.** Rejected; a central wrapper keeps the policy uniform and ensures we do not miss newer entry points. -- **Rely on existing logging without policy switches.** Rejected because operational requirements demand scriptable behaviour on failure. - +## Rollout notes +- Land the crate, swap call sites to `RecorderResult`, and update Python wrappers to raise the new hierarchy. +- Add tests for every error code, simulated IO failures, policy toggles, and panic containment. +- Document the guarantees in README + crate docs: no stdout noise, atomic (or explicitly partial) traces, stable codes per minor version. 
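+
+## Facade sketch (illustrative)
+To make the decision concrete, a minimal sketch of the facade described above. The field types, code variants, exit-code mapping, and `usage!` macro are assumptions drawn from this ADR, not the final `recorder-errors` API.
+
+```rs
+// Sketch only: one possible shape for the recorder-errors facade.
+use std::borrow::Cow;
+use std::collections::BTreeMap;
+
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum ErrorKind {
+    Usage,
+    Environment,
+    Target,
+    Internal,
+}
+
+impl ErrorKind {
+    /// Fixed process exit codes per the policy above.
+    pub fn exit_code(self) -> i32 {
+        match self {
+            ErrorKind::Usage => 2,
+            ErrorKind::Environment => 10,
+            ErrorKind::Target => 20,
+            ErrorKind::Internal => 70,
+        }
+    }
+}
+
+/// Small, grep-able code enum (illustrative variants only).
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum ErrorCode {
+    TraceDirNotDir,
+    FormatUnsupported,
+}
+
+#[derive(Debug)]
+pub struct RecorderError {
+    pub kind: ErrorKind,
+    pub code: ErrorCode,
+    pub message: Cow<'static, str>,
+    pub context: BTreeMap<&'static str, String>,
+}
+
+pub type RecorderResult<T> = Result<T, RecorderError>;
+
+/// Illustrative `usage!`-style constructor; the real macro set may differ.
+macro_rules! usage {
+    ($code:expr, $($msg:tt)*) => {
+        RecorderError {
+            kind: ErrorKind::Usage,
+            code: $code,
+            message: Cow::Owned(format!($($msg)*)),
+            context: BTreeMap::new(),
+        }
+    };
+}
+
+fn reject_unsupported_format(fmt: &str) -> RecorderResult<()> {
+    Err(usage!(ErrorCode::FormatUnsupported, "unsupported trace format: {fmt}"))
+}
+```
+
+Call sites would return `RecorderResult` and let the FFI wrapper map `kind` to the matching Python exception subclass and exit code.
+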
diff --git a/design-docs/code-object.md b/design-docs/code-object.md index eaec829..0050356 100644 --- a/design-docs/code-object.md +++ b/design-docs/code-object.md @@ -1,141 +1,37 @@ -# Code Object Wrapper Design - -## Overview - -The Python Monitoring API delivers a generic `CodeType` object to every tracing callback. The current `Tracer` trait surfaces this object as `&Bound<'_, PyAny>`, forcing every implementation to perform attribute lookups and type conversions manually. This document proposes a `CodeObjectWrapper` type that exposes a stable, typed interface to the underlying code object while minimizing per-event overhead. - -## Goals -- Provide a strongly typed API for common `CodeType` attributes needed by tracers and recorders. -- Ensure lookups are cheap by caching values and avoiding repeated Python attribute access. -- Maintain a stable identity for each code object to correlate events across callbacks. -- Avoid relying on the unstable `PyCodeObject` layout from the C API. - -## Non-Goals -- Full re‑implementation of every `CodeType` attribute. Only the fields required for tracing and time‑travel debugging are exposed. -- Direct mutation of `CodeType` objects. The wrapper offers read‑only access. - -## Proposed API - -```rs -pub struct CodeObjectWrapper { - /// Owned reference to the Python `CodeType` object. - /// Stored as `Py` so it can be held outside the GIL. - obj: Py, - /// Stable identity equivalent to `id(code)`. - id: usize, - /// Lazily populated cache for expensive lookups. - cache: CodeObjectCache, -} - -pub struct CodeObjectCache { - filename: OnceCell, - qualname: OnceCell, - firstlineno: OnceCell, - argcount: OnceCell, - flags: OnceCell, - /// Mapping of instruction offsets to line numbers. - lines: OnceCell>, -} - -pub struct LineEntry { - pub offset: u32, - pub line: u32, -} - -impl CodeObjectWrapper { - /// Construct from a `CodeType` object. Computes `id` eagerly. - pub fn new(py: Python<'_>, obj: &Bound<'_, PyCode>) -> Self; - - /// Borrow the owned `Py` as a `Bound<'py, PyCode>`. - /// This follows PyO3's recommendation to prefer `Bound<'_, T>` over `Py` - /// for object manipulation. - pub fn as_bound<'py>(&'py self, py: Python<'py>) -> Bound<'py, PyCode>; - - /// Accessors fetch from the cache or perform a one‑time lookup under the GIL. - pub fn filename<'py>(&'py self, py: Python<'py>) -> PyResult<&'py str>; - pub fn qualname<'py>(&'py self, py: Python<'py>) -> PyResult<&'py str>; - pub fn first_line(&self, py: Python<'_>) -> PyResult; - pub fn arg_count(&self, py: Python<'_>) -> PyResult; - pub fn flags(&self, py: Python<'_>) -> PyResult; - - /// Return the source line for a given instruction offset using a binary search on `lines`. - pub fn line_for_offset(&self, py: Python<'_>, offset: u32) -> PyResult>; - - /// Expose the stable identity for cross‑event correlation. - pub fn id(&self) -> usize; -} -``` - -### Global registry - -To avoid constructing a new wrapper for every tracing event, a global cache -stores `CodeObjectWrapper` instances keyed by their stable `id`: - -```rs -pub struct CodeObjectRegistry { - map: DashMap>, -} - -impl CodeObjectRegistry { - pub fn get_or_insert( - &self, - py: Python<'_>, - code: &Bound<'_, PyCode>, - ) -> Arc; - - /// Optional explicit removal for long‑running processes. - pub fn remove(&self, id: usize); -} -``` - -`CodeObjectWrapper::new` remains available, but production code is expected to -obtain instances via `CodeObjectRegistry::get_or_insert` so each unique code -object is wrapped only once. 
The registry is designed to be thread‑safe -(`DashMap`) and the wrappers are reference counted (`Arc`) so multiple threads -can hold references without additional locking. - -### Trait Integration - -The `Tracer` trait will be adjusted so every callback receives `&CodeObjectWrapper` instead of a generic `&Bound<'_, PyAny>`: - -```rs -fn on_line(&mut self, py: Python<'_>, code: &CodeObjectWrapper, lineno: u32); -fn on_py_start(&mut self, py: Python<'_>, code: &CodeObjectWrapper, offset: i32); -// ...and similarly for the remaining callbacks. -``` - -## Usage Examples - -### Retrieving wrappers from the global registry - -```rs -static CODE_REGISTRY: Lazy = Lazy::new(CodeObjectRegistry::default); - -fn on_line(&mut self, py: Python<'_>, code: &Bound<'_, PyCode>, lineno: u32) { - let wrapper = CODE_REGISTRY.get_or_insert(py, code); - let filename = wrapper.filename(py).unwrap_or(""); - eprintln!("{}:{}", filename, lineno); -} -``` - -Once cached, subsequent callbacks referencing the same `CodeType` will reuse the -existing wrapper without recomputing any attributes. - -## Performance Considerations -- `Py` allows cloning the wrapper without holding the GIL, enabling cheap event propagation. -- Methods bind the owned reference to `Bound<'py, PyCode>` on demand, following PyO3's `Bound`‑first guidance and avoiding accidental `Py` clones. -- Fields are loaded lazily and stored inside `OnceCell` containers to avoid repeated attribute lookups. -- `line_for_offset` memoizes the full line table the first time it is requested; subsequent calls perform an in‑memory binary search. -- Storing strings and small integers directly in the cache eliminates conversion cost on hot paths. -- A global `CodeObjectRegistry` ensures that wrapper construction and attribute - discovery happen at most once per `CodeType`. - -## Open Questions -- Additional attributes such as `co_consts` or `co_varnames` may be required for richer debugging features; these can be added later as new `OnceCell` fields. -- Thread‑safety requirements may necessitate wrapping the cache in `UnsafeCell` or providing internal mutability strategies compatible with `Send`/`Sync`. -- The registry currently grows unbounded; strategies for eviction or weak - references may be needed for long‑running processes that compile many - transient code objects. +# Code Object Wrapper (Quick Guide) + +## Why this exists +- Python hands our tracer a raw `CodeType` object. +- Touching it directly is slow and messy because every field lookup goes through Python. +- We want one light wrapper that turns those lookups into cheap Rust calls. + +## What we are building +- `CodeObjectWrapper` owns the `PyCode`, remembers its `id`, and lazily caches the handful of fields we actually use (file name, qualname, first line, arg count, flags, and line table). +- The cache lives in `OnceCell` slots so each field is fetched once per code object. +- Accessors hand back borrowed data (`Bound<'py, PyCode>`, `&str`, numbers) without cloning Python objects. + +## How it is used +- Keep a global `CodeObjectRegistry` keyed by `id(code)`. +- `get_or_insert` returns an `Arc` so every callback reuses the same wrapper. +- The `Tracer` trait should accept `&CodeObjectWrapper` instead of `&Bound<'_, PyAny>`. +- Typical flow inside a callback: + 1. Ask the registry for the wrapper. + 2. Call `wrapper.filename(py)?` (or similar) and work with the cached values. + +## Performance promises +- No repeat attribute fetches: the first call fills the cache, later calls are pure Rust reads. 
+- Wrappers clone cheaply across threads because they carry `Py` inside an `Arc`. +- Line lookups build the mapping once and then use a binary search. + +## Edge notes +- The wrapper is read‑only; mutation stays out of scope. +- We only expose the fields we need today. Add more `OnceCell` slots later if new features demand them. +- Registry growth is unbounded right now—add eviction or weak references if long‑running tools need it. + +## When we are done +- All tracer callbacks receive a `&CodeObjectWrapper`. +- Code that previously poked at `PyAny` now calls the typed helpers. +- Benchmarks show fewer Python attribute hits on hot paths. ## References - [Python `CodeType` objects](https://docs.python.org/3/reference/datamodel.html#code-objects) diff --git a/design-docs/design-001.md b/design-docs/design-001.md index 063e668..efaeb5b 100644 --- a/design-docs/design-001.md +++ b/design-docs/design-001.md @@ -1,328 +1,45 @@ -# Python sys.monitoring Tracer Design - -## Overview - -This document outlines the design for integrating Python's `sys.monitoring` API with the `runtime_tracing` format. The goal is to produce CodeTracer-compatible traces for Python programs without modifying the interpreter. - -The tracer collects `sys.monitoring` events, converts them to `runtime_tracing` events, and streams them to `trace.json`/`trace.bin` along with metadata and source snapshots. - -## Architecture - -### Tracer Abstraction -Rust code exposes a `Tracer` trait representing callbacks for Python -`sys.monitoring` events. Implementations advertise their desired events via an -`EventMask` bit flag returned from `interest`. A `Dispatcher` wraps a trait -object and forwards events only when the mask contains the corresponding flag, -allowing tracers to implement just the methods they care about. - -### Tool Initialization -- Acquire a tool identifier via `sys.monitoring.use_tool_id`; store it for the lifetime of the tracer. - ```rs - pub const MONITORING_TOOL_NAME: &str = "codetracer"; - pub struct ToolId { pub id: u8 } - pub fn acquire_tool_id() -> PyResult; - ``` -- Register one callback per event using `sys.monitoring.register_callback`. - ```rs - #[repr(transparent)] - pub struct EventId(pub u64); // Exact value loaded from sys.monitoring.events.* - - pub struct MonitoringEvents { - pub BRANCH: EventId, - pub CALL: EventId, - pub C_RAISE: EventId, - pub C_RETURN: EventId, - pub EXCEPTION_HANDLED: EventId, - pub INSTRUCTION: EventId, - pub JUMP: EventId, - pub LINE: EventId, - pub PY_RESUME: EventId, - pub PY_RETURN: EventId, - pub PY_START: EventId, - pub PY_THROW: EventId, - pub PY_UNWIND: EventId, - pub PY_YIELD: EventId, - pub RAISE: EventId, - pub RERAISE: EventId, - pub STOP_ITERATION: EventId, - } - - pub fn load_monitoring_events(py: Python<'_>) -> PyResult; - - // Python-level callback registered via sys.monitoring.register_callback - pub type CallbackFn = PyObject; - pub fn register_callback(tool: &ToolId, event: &EventId, cb: &CallbackFn) -> PyResult<()>; - ``` -- Enable all desired events by bitmask with `sys.monitoring.set_events`. - ```rs - #[derive(Clone, Copy)] - pub struct EventSet(pub u64); - - pub fn events_union(ids: &[EventId]) -> EventSet; - pub fn set_events(tool: &ToolId, set: EventSet) -> PyResult<()>; - ``` - -### Writer Management -- Open a `runtime_tracing` writer (`trace.json` or `trace.bin`) during `start_tracing`. 
- ```rs - pub enum OutputFormat { Json, Binary } - pub struct TraceWriter { pub format: OutputFormat } - pub fn start_tracing(path: &Path, format: OutputFormat) -> io::Result; - ``` -- Expose methods to append metadata and file copies using existing `runtime_tracing` helpers. - ```rs - pub fn append_metadata(writer: &mut TraceWriter, meta: &TraceMetadata); - pub fn copy_source_file(writer: &mut TraceWriter, path: &Path) -> io::Result<()>; - ``` -- Flush and close the writer when tracing stops. - ```rs - pub fn stop_tracing(writer: TraceWriter) -> io::Result<()>; - ``` - -### Frame and Thread Tracking -- Maintain a per-thread stack of activation identifiers to correlate `CALL`, `PY_START`, yields, and returns. Since `sys.monitoring` callbacks provide `CodeType` and offsets (not frames), we rely on the nesting order of events to track activations. - ```rs - pub type ActivationId = u64; - pub struct ThreadState { pub stack: Vec } - pub fn current_thread_state() -> &'static mut ThreadState; - ``` -- Associate activations with `CodeType` objects and instruction/line offsets as needed for cross-referencing, without depending on `PyFrameObject`. - ```rs - pub struct Activation { - pub id: ActivationId, - // Hold a GIL-independent handle to the CodeType object. - // Access required attributes via PyO3 attribute lookup (getattr) under the GIL. - pub code: PyObject, - } - ``` -- Record thread start/end events when a thread first emits a monitoring event and when it finishes. - ```rs - pub fn on_thread_start(thread_id: u64); - pub fn on_thread_stop(thread_id: u64); - ``` - -### Code Object Access Strategy (no reliance on PyCodeObject internals) -- Rationale: PyO3 exposes `ffi::PyCodeObject` as an opaque type. Instead of touching its unstable layout, treat code objects as generic Python objects and access only stable Python-level attributes via PyO3's `getattr` on `&PyAny`. - ```rs - use pyo3::{prelude::*, types::PyAny}; - - #[derive(Clone)] - pub struct CodeInfo { - pub filename: String, - pub qualname: String, - pub firstlineno: u32, - pub flags: u32, - } - - /// Stable identity for a code object during its lifetime. - /// Uses the object's address while GIL-held; equivalent to Python's id(code). - pub fn code_id(py: Python<'_>, code: &PyAny) -> usize { - code.as_ptr() as usize - } - - /// Extract just the attributes we need, via Python attribute access. - pub fn extract_code_info(py: Python<'_>, code: &PyAny) -> PyResult { - let filename: String = code.getattr("co_filename")?.extract()?; - // Prefer co_qualname if present, else fallback to co_name - let qualname: String = match code.getattr("co_qualname") { - Ok(q) => q.extract()?, - Err(_) => code.getattr("co_name")?.extract()?, - }; - let firstlineno: u32 = code.getattr("co_firstlineno")?.extract()?; - let flags: u32 = code.getattr("co_flags")?.extract()?; - Ok(CodeInfo { filename, qualname, firstlineno, flags }) - } - - /// Cache minimal info to avoid repeated getattr and to assign stable IDs. - pub struct CodeRegistry { - pub map: std::collections::HashMap, - } - - impl CodeRegistry { - pub fn new() -> Self { Self { map: Default::default() } } - pub fn intern(&mut self, py: Python<'_>, code: &PyAny) -> PyResult { - let id = code_id(py, code); - if !self.map.contains_key(&id) { - let info = extract_code_info(py, code)?; - self.map.insert(id, info); - } - Ok(id) - } - } - ``` -- Event handler inputs use `PyObject` for the `code` parameter. Borrow to `&PyAny` with `let code = code.bind(py);` when needed, then consult `CodeRegistry`. 
-- For line numbers: rely on the `LINE` event’s provided `line_number`. If instruction offsets need mapping, call `code.getattr("co_lines")()?.call0()?` and iterate lazily; avoid caching unless necessary. - -## Event Handling - -Each bullet below represents a low-level operation translating a single `sys.monitoring` event into the `runtime_tracing` stream. - -### Control Flow -- **PY_START** – Create a `Function` event for the code object and push a new activation ID onto the thread's stack. - ```rs - pub fn on_py_start(code: PyObject, instruction_offset: i32); - ``` -- **PY_RESUME** – Emit an `Event` log noting resumption and update the current activation's state. - ```rs - pub fn on_py_resume(code: PyObject, instruction_offset: i32); - ``` -- **PY_RETURN** – Pop the activation ID, write a `Return` event with the value (if retrievable), and link to the caller. - ```rs - pub struct ReturnRecord { pub activation: ActivationId, pub value: Option } - pub fn on_py_return(code: PyObject, instruction_offset: i32, retval: *mut PyObject); - ``` -- **PY_YIELD** – Record a `Return` event flagged as a yield and keep the activation on the stack for later resumes. - ```rs - pub fn on_py_yield(code: PyObject, instruction_offset: i32, retval: *mut PyObject); - ``` -- **STOP_ITERATION** – Emit an `Event` indicating iteration exhaustion for the current activation. - ```rs - pub fn on_stop_iteration(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` -- **PY_UNWIND** – Mark the beginning of stack unwinding and note the target handler in an `Event`. - ```rs - pub fn on_py_unwind(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` -- **PY_THROW** – Emit an `Event` describing the thrown value and the target generator/coroutine. - ```rs - pub fn on_py_throw(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` -- **RERAISE** – Log a re-raise event referencing the original exception. - ```rs - pub fn on_reraise(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` - -### Call and Line Tracking -- **CALL** – Record a `Call` event, capturing the `callable` and the first argument if available (`arg0` as provided by `sys.monitoring`), and associate a new activation. - ```rs - pub fn on_call(code: PyObject, instruction_offset: i32, callable: *mut PyObject, arg0: Option<*mut PyObject>) -> ActivationId; - ``` -- **LINE** – Write a `Step` event with current path and line number; ensure the path is registered. - ```rs - pub fn on_line(code: PyObject, line_number: u32); - ``` -- **INSTRUCTION** – Optionally emit a fine-grained `Event` keyed by `instruction_offset`. Opcode names can be derived from the `CodeType` if desired. - ```rs - pub fn on_instruction(code: PyObject, instruction_offset: i32); - ``` -- **JUMP** – Append an `Event` describing the jump target offset for control-flow visualization. - ```rs - pub fn on_jump(code: PyObject, instruction_offset: i32, destination_offset: i32); - ``` -- **BRANCH** – Record an `Event` with `destination_offset`; whether the branch was taken can be inferred by comparing to the fallthrough offset. - ```rs - pub fn on_branch(code: PyObject, instruction_offset: i32, destination_offset: i32); - ``` - _Note_: Current runtime_tracing doesn't support branching events, but instead relies on AST tree-sitter analysis. So for the initial version we will ignore them and can add support after modifications to the tracing format. 
- -### Exception Lifecycle -- **RAISE** – Emit an `Event` containing exception type and message when raised. - ```rs - pub fn on_raise(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` -- **EXCEPTION_HANDLED** – Log an `Event` marking when an exception is caught. - ```rs - pub fn on_exception_handled(code: PyObject, instruction_offset: i32, exception: *mut PyObject); - ``` - -### C API Boundary -- **C_RETURN** – On returning from a C function, emit a `Return` event tagged as foreign. Note: `sys.monitoring` does not provide the result object for `C_RETURN`. - ```rs - pub fn on_c_return(code: PyObject, instruction_offset: i32, callable: *mut PyObject, arg0: Option<*mut PyObject>); - ``` -- **C_RAISE** – When a C function raises, record an `Event` that a C-level callable raised. Note: `sys.monitoring` does not pass the exception object for `C_RAISE`. - ```rs - pub fn on_c_raise(code: PyObject, instruction_offset: i32, callable: *mut PyObject, arg0: Option<*mut PyObject>); - ``` - -### No Events -- **NO_EVENTS** – Special constant; used only to disable monitoring. No runtime event is produced. - ```rs - pub const NO_EVENTS: EventSet = EventSet(0); - ``` - -## Metadata and File Capture -- Collect the working directory, program name, and arguments and store them in `trace_metadata.json`. - ```rs - pub struct TraceMetadata { pub cwd: PathBuf, pub program: String, pub args: Vec } - pub fn write_metadata(writer: &mut TraceWriter, meta: &TraceMetadata); - ``` -- Track every file path referenced; copy each into the trace directory under `files/`. - ```rs - pub fn track_file(writer: &mut TraceWriter, path: &Path) -> io::Result<()>; - ``` -- Record `VariableName`, `Type`, and `Value` entries when variables are inspected or logged. - ```rs - pub struct VariableRecord { pub name: String, pub ty: TypeId, pub value: ValueRecord } - pub fn record_variable(writer: &mut TraceWriter, rec: VariableRecord); - ``` - -## Value Translation and Recording -- Maintain a type registry that maps Python `type` objects to `runtime_tracing` `Type` entries and assigns new `type_id` values on first encounter. - ```rs - pub type TypeId = u32; - pub type ValueId = u64; - pub enum ValueRecord { Int(i64), Float(f64), Bool(bool), None, Str(String), Raw(Vec), Sequence(Vec), Tuple(Vec), Struct(Vec<(String, ValueRecord)>), Reference(ValueId) } - pub struct TypeRegistry { next: TypeId, map: HashMap<*mut PyTypeObject, TypeId> } - pub fn intern_type(reg: &mut TypeRegistry, ty: *mut PyTypeObject) -> TypeId; - ``` -- Convert primitives (`int`, `float`, `bool`, `None`, `str`) directly to their corresponding `ValueRecord` variants. - ```rs - pub fn encode_primitive(obj: *mut PyObject) -> Option; - ``` -- Encode `bytes` and `bytearray` as `Raw` records containing base64 text to preserve binary data. - ```rs - pub fn encode_bytes(obj: *mut PyObject) -> ValueRecord; - ``` -- Represent lists and sets as `Sequence` records and tuples as `Tuple` records, converting each element recursively. - ```rs - pub fn encode_sequence(iter: &PySequence) -> ValueRecord; - pub fn encode_tuple(tuple: &PyTupleObject) -> ValueRecord; - ``` -- Serialize dictionaries as a `Sequence` of two-element `Tuple` records for key/value pairs to avoid fixed field layouts. - ```rs - pub fn encode_dict(dict: &PyDictObject) -> ValueRecord; - ``` -- For objects with accessible attributes, emit a `Struct` record with sorted field names; fall back to `Raw` with `repr(obj)` when inspection is unsafe. 
- ```rs - pub fn encode_object(obj: *mut PyObject) -> ValueRecord; - ``` -- Track object identities to detect cycles and reuse `Reference` records with `id(obj)` for repeated structures. - ```rs - pub struct SeenSet { map: HashMap } - pub fn record_reference(seen: &mut SeenSet, obj: *mut PyObject) -> Option; - ``` - -## Shutdown -- On `stop_tracing`, call `sys.monitoring.set_events` with `NO_EVENTS` for the tool ID. - ```rs - pub fn disable_events(tool: &ToolId); - ``` -- Unregister callbacks and free the tool ID with `sys.monitoring.free_tool_id`. - ```rs - pub fn unregister_callbacks(tool: ToolId); - pub fn free_tool_id(tool: ToolId); - ``` -- Close the writer and ensure all buffered events are flushed to disk. - ```rs - pub fn finalize(writer: TraceWriter) -> io::Result<()>; - ``` - -## Current Limitations -- **No structured support for threads or async tasks** – the trace format lacks explicit identifiers for concurrent execution. - Distinguishing events emitted by different Python threads or `asyncio` tasks requires ad hoc `Event` entries, complicating - analysis and preventing downstream tools from reasoning about scheduling. -- **Generic `Event` log** – several `sys.monitoring` notifications like resume, unwind, and branch outcomes have no dedicated - `runtime_tracing` variant. They must be encoded as free‑form `Event` logs, which reduces machine readability and hinders - automation. -- **Heavy value snapshots** – arguments and returns expect full `ValueRecord` structures. Serializing arbitrary Python objects is - expensive and often degrades to lossy string dumps, limiting the visibility of rich runtime state. -- **Append‑only path and function tables** – `runtime_tracing` assumes files and functions are discovered once and never change. - Dynamically generated code (`eval`, REPL snippets) forces extra bookkeeping and cannot update earlier entries, making - dynamic features awkward to trace. -- **No built‑in compression or streaming** – traces are written as monolithic JSON or binary files. Long sessions quickly grow in - size and cannot be streamed to remote consumers without additional tooling. - -## Future Extensions -- Add filtering to enable subsets of events for performance-sensitive scenarios. -- Support streaming traces over a socket for live debugging. +# Python Monitoring Tracer (Cheat Sheet) + +## Goal +Turn `sys.monitoring` events into the `runtime_tracing` stream so we can record Python programs without patching CPython. + +## Moving parts +- **Tool startup** + - Grab a tool id with `sys.monitoring.use_tool_id("codetracer")`. + - Load the event constants and register one callback per event we care about. + - Enable those events in one `set_events` call. +- **Dispatcher** + - Implement the `Tracer` trait so each callback receives only the events it opts into (bit mask filter). + - Each callback also receives the `CodeObjectWrapper` described in the wrapper doc. +- **Trace writer** + - Open a JSON or binary writer when tracing starts. + - Append metadata and source files up front. + - Flush and close cleanly on shutdown. +- **Thread + activation tracking** + - Keep a per-thread stack of activation ids that mirrors CALL → RETURN / YIELD → RESUME. + - Record the first event per thread as “thread started” and clean up on the last event. + - Store the code object id and current offset/line on the activation record. + +## Event map (high level) +| Monitoring event | What we log | +| --- | --- | +| `CALL`, `PY_START` | Push a new activation, record the function entry. 
| +| `LINE`, `INSTRUCTION`, `BRANCH`, `JUMP` | Write step/control-flow events with filename + line/offset. | +| `PY_RETURN`, `PY_YIELD`, `STOP_ITERATION` | Pop/flag the activation and note the value if we can encode it. | +| `EXCEPTION_HANDLED`, `PY_THROW`, `PY_UNWIND`, `RERAISE`, `C_RAISE`, `C_RETURN` | Emit error or C-API bridge events so time-travel tools can follow the story. | +| `PY_RESUME` | Mark that the paused activation is running again. | + +## Data helpers +- Use the global `CodeObjectRegistry` to avoid repeated getattr calls. +- When we need line tables, call `code.co_lines()` once and cache the entries inside the wrapper. +- Track thread state with a `DashMap`; use thread-local fallback if necessary. + +## Safety rails +- Do not touch `PyCodeObject` internals—only public attributes via PyO3. +- Keep callbacks tiny: grab the wrapper, record the event, hand off to the writer. +- If a callback fails, surface a structured `RecorderError` and disable tracing for that thread so we fail safe. + +## Done when +- A simple Python script traced through this pipeline produces a valid `trace.json` / `trace.bin` compatible with the rest of Codetracer. +- Activations balance correctly across nested calls, yields, and exceptions. +- Profiling shows no per-event Python attribute churn (thanks to the wrapper cache). diff --git a/design-docs/error-handling-implementation-plan.md b/design-docs/error-handling-implementation-plan.md index 0abff33..fa3db68 100644 --- a/design-docs/error-handling-implementation-plan.md +++ b/design-docs/error-handling-implementation-plan.md @@ -1,92 +1,38 @@ -# codetracer-python-recorder Error Handling Implementation Plan - -## Goals -- Deliver the policy defined in ADR 0004: every error flows through `RecorderError`, surfaces a stable code/kind, and maps to the Python exception hierarchy. -- Contain all panics within the FFI boundary and offer deterministic behaviour for `abort` versus `disable` policies. -- Ensure trace outputs remain atomic (or explicitly marked partial) and diagnostics never leak to stdout. -- Provide developers with ergonomic macros, tooling guardrails, and comprehensive tests covering failure paths. - -## Current Gaps -- Ad-hoc `PyRuntimeError` strings in `src/session.rs:21-76` and `src/runtime/mod.rs:77-190` prevent stable categorisation and user scripting. -- FFI trampolines in `src/monitoring/tracer.rs:268-706` and activation helpers in `src/runtime/activation.rs:24-83` still use `unwrap`/`expect`, so poisoned locks or filesystem errors abort the interpreter. -- Python facade functions (`codetracer_python_recorder/session.py:27-63`) return built-in exceptions and provide no context or exit codes. -- No support for JSON diagnostics, policy switches, or atomic output staging; disk failures can leave half-written traces and logs mix stdout/stderr. - -## Workstreams - -### WS1 – Foundations & Inventory -- Add a `just errors-audit` command that runs `rg` to list `PyRuntimeError`, `unwrap`, `expect`, and direct `panic!` usage in the recorder crate. -- Create issue tracker entries grouping call sites by module (`session`, `runtime`, `monitoring`, Python facade) to guide refactors. -- Exit criteria: checklist of legacy error sites recorded with owners. - -### WS2 – `recorder-errors` Crate -- Scaffold `recorder-errors` under the workspace with `RecorderError`, `RecorderResult`, `ErrorKind`, `ErrorCode`, context map type, and conversion traits from `io::Error`, `PyErr`, etc. 
-- Implement ergonomic macros (`usage!`, `enverr!`, `target!`, `bug!`, `ensure_*`) plus unit tests covering formatting, context propagation, and downcasting. -- Publish crate docs explaining mapping rules and promises; link ADR 0004. -- Exit criteria: `cargo test -p recorder-errors` covers all codes; workspace builds with the new crate. - -### WS3 – Retrofit Rust Modules -- Replace direct `PyRuntimeError` construction in `src/session/bootstrap.rs`, `src/session.rs`, `src/runtime/mod.rs`, `src/runtime/output_paths.rs`, and helpers with `RecorderResult` + macros. -- Update `RuntimeTracer` to propagate structured errors instead of strings; remove `expect`/`unwrap` in hot paths by returning classified `bug!` or `enverr!` failures. -- Introduce a small adapter in `src/runtime/mod.rs` that stages IO writes and applies the atomic/partial policy described in ADR 0004. -- Exit criteria: All recorder crate modules compile without `pyo3::exceptions::PyRuntimeError::new_err` usage. - -### WS4 – FFI Wrapper & Python Exception Hierarchy -- Implement `ffi::wrap_pyfunction` that catches panics (`std::panic::catch_unwind`), maps `RecorderError` to a new `PyRecorderError` base type plus subclasses (`PyUsageError`, `PyEnvironmentError`, etc.). -- Update `#[pymodule]` and every `#[pyfunction]` to use the wrapper; ensure monitoring callbacks also go through the dispatcher. -- Expose the exception types in `codetracer_python_recorder/__init__.py` for Python callers. -- Exit criteria: Rust panics surface as `PyInternalError`, and Python tests can assert exception class + code. - -### WS5 – Policy Switches & Runtime Configuration -- Add `RecorderPolicy` backed by `OnceCell` with setters for CLI flags/env vars: `--on-recorder-error`, `--require-trace`, `--keep-partial-trace`, `--log-level`, `--log-file`, `--json-errors`. -- Update the CLI/embedding entry points (auto-start, `TraceSession`) to fill the policy before starting tracing. -- Implement detach vs abort semantics in `RuntimeTracer::finish` / session stop paths, honoring policy decisions and exit codes. -- Exit criteria: Integration tests demonstrate both `abort` and `disable` flows, including partial trace handling. - -### WS6 – Logging, Metrics, and Diagnostics -- Replace `env_logger` initialisation with a `tracing` subscriber or structured `log` formatter that includes `run_id`, `trace_id`, and `ErrorCode` fields. -- Emit counters for dropped events, detach reasons, and caught panics via a `RecorderMetrics` sink (default no-op, pluggable in future). -- Implement `--json-errors` to emit a single-line JSON trailer on stderr whenever an error is returned to Python. -- Exit criteria: Structured log output verified in tests; stdout usage gated by lint. - -### WS7 – Test Coverage & Tooling Enforcement -- Add unit tests for the new error crate, IO façade, policy switches, and FFI wrappers (panic capture, exception mapping). -- Extend Python tests to cover the new exception hierarchy, JSON diagnostics, and policy flags. -- Introduce CI lints (`cargo clippy --deny clippy::panic`, custom script rejecting `unwrap` outside allowed modules) and integrate with `just lint`. -- Exit criteria: CI blocks regressions; failure-path tests cover disk full, permission denied, target exceptions, partial trace recovery, and SIGINT during detach. - -### WS8 – Documentation & Rollout -- Update README, API docs, and onboarding material to describe guarantees, exit codes, example snippets, and migration guidance for downstream tools. 
-- Add a change log entry summarising the policy and how to consume structured errors from Python. -- Track adoption status in `design-docs/error-handling-implementation-plan.status.md` (mirror existing planning artifacts). -- Exit criteria: Documentation merged, status file created, ADR 0004 promoted to **Accepted** once WS2–WS7 land. - -## Milestones & Sequencing -1. **Milestone A – Foundations:** Complete WS1 and WS2 (error crate scaffold) in parallel; unblock later work. -2. **Milestone B – Core Refactor:** Deliver WS3 and WS4 together so Rust modules emit structured errors and Python sees the new exceptions. -3. **Milestone C – Policy & IO Guarantees:** Finish WS5 and WS6 to stabilise runtime behaviour and diagnostics. -4. **Milestone D – Hardening:** Execute WS7 (tests, tooling) and WS8 (documentation). Promote ADR 0004 to Accepted. - -## Verification Strategy -- Add a `just test-errors` recipe running targeted failure tests (disk-full, detach, panic capture) plus Python unit tests for error classes. -- Use `cargo nextest run -p codetracer-python-recorder --features failure-fixtures` to execute synthetic failure cases. -- Enable `pytest tests/python/error_handling -q` for Python-specific coverage. -- Capture structured stderr in integration tests to assert JSON trailers and exit codes. - -## Dependencies & Coordination -- Requires consensus with the Observability WG on log format fields and exit-code mapping. -- Policy flag wiring depends on any CLI/front-end work planned for Q4; coordinate with developer experience owners. -- If `runtime_tracing` needs extensions for metadata trailers, align timelines with that team. - -## Risks & Mitigations -- **Wide-scope refactor:** Stage work behind feature branches and land per-module PRs to avoid blocking releases. -- **Performance regressions:** Benchmark hot callbacks before/after WS3 using existing microbenchmarks; keep additional allocations off hot paths. -- **API churn for users:** Provide compatibility shims that map old exceptions to new ones for at least one minor release, and document upgrade notes. -- **Partial trace semantics confusion:** Default to `abort` (no partial outputs) unless `--keep-partial-trace` is explicit; emit warnings when users opt in. - -## Done Definition -- Legacy `PyRuntimeError::new_err` usage is removed or isolated to compat shims. -- All panics are caught before crossing into Python; fuzz tests confirm no UB. -- `just test` (and targeted error suites) pass on Linux/macOS CI, with new structured logs and metrics visible. -- Documentation reflects guarantees, and downstream teams acknowledge new exit codes. - +# Error Handling Plan (Fast Read) + +## Aim +Give the recorder one predictable error story: every failure becomes a `RecorderError`, maps to a stable code, and picks the right Python exception. + +## Current pain +- Random `PyRuntimeError` strings leak out of `session`, `runtime`, and `monitoring` code. +- `unwrap` / `expect` still show up in FFI trampolines, so real errors can crash Python. +- The Python wrapper raises built-in exceptions with no extra context. +- Partial trace files appear when I/O dies mid-run. + +## Work plan +1. **Audit** + - Add a `just errors-audit` helper that lists every `PyRuntimeError`, `unwrap`, `expect`, and `panic!` in the recorder crate. + - File follow-up issues assigning owners to the hotspots. +2. **`recorder-errors` crate** + - Create a small crate with `RecorderError`, `RecorderResult`, `ErrorKind`, and `ErrorCode` enums. 
+ - Provide conversions from `io::Error`, `PyErr`, and internal helper types. + - Offer macros like `bail_recorder!(kind, code, "message")` so callers stay concise. +3. **Rust call sites** + - Replace ad-hoc error strings with the new enums. + - Swap `unwrap`/`expect` for `?` or explicit matches. + - Ensure long-running loops decide between “abort the process” and “disable tracing” using one policy helper. +4. **Python facade** + - Map each `RecorderError` to a concrete Python exception with a deterministic message and optional structured payload. + - Surface diagnostics (JSON preferred) that tools can consume. +5. **Atomic output** + - Stage traces under a temp directory, then rename when complete. + - If something fails, mark the trace as partial and clean up temp files. +6. **Testing** + - Add unit tests for conversion helpers and policy branches. + - Write integration tests that simulate disk failures and poisoning scenarios to prove we no longer panic. + +## Definition of done +- No `unwrap` / `expect` in the FFI boundary or runtime hot path. +- All exported Python functions raise our mapped exceptions. +- Temp files clean up correctly during forced failures. +- Test suite covers success, handled failure, and aborting failure paths. diff --git a/design-docs/file-level-srp-refactor-plan.md b/design-docs/file-level-srp-refactor-plan.md index 476bfc6..f899eee 100644 --- a/design-docs/file-level-srp-refactor-plan.md +++ b/design-docs/file-level-srp-refactor-plan.md @@ -1,87 +1,52 @@ -# File-Level Single Responsibility Refactor Plan - -## Goals -- Reshape the Rust crate and Python support package so that every source file encapsulates a single cohesive topic. -- Reduce the amount of ad-hoc cross-module knowledge currently required to understand tracing start-up, event handling, and encoding logic. -- Preserve the public Python API and Rust crate interfaces during the refactor to avoid disruptions for downstream tooling. - -## Current State Observations -- `src/lib.rs` is responsible for PyO3 module registration, lifecycle management for tracing sessions, global logging initialisation, and runtime format selection, which mixes unrelated concerns in one file. -- `src/runtime_tracer.rs` couples trace lifecycle control, activation toggling, and Python value encoding in a single module, making it difficult to unit test or substitute individual pieces. -- `src/tracer.rs` combines the `Tracer` trait definition, sys.monitoring shims, callback registration utilities, and thread-safe storage, meaning small changes can ripple through unrelated logic. -- `codetracer_python_recorder/api.py` interleaves environment based auto-start, context-manager ergonomics, backend state management, and format constants, leaving no clearly isolated entry-point for CLI or library callers. - -## Target Rust Module Layout -| Topic | Target file | Notes | -| --- | --- | --- | -| PyO3 module definition & re-exports | `src/lib.rs` | Limit to module wiring plus `pub use` statements. -| Global logging defaults | `src/logging.rs` | Provide helper to configure env_logger defaults reused by both lib.rs and tests. -| Tracing session lifecycle (`start_tracing`, `stop_tracing`, `flush_tracing`, `is_tracing`) | `src/session.rs` | Own global `ACTIVE` flag and filesystem validation. -| Runtime tracer orchestration (activation gating, writer plumbing) | `src/runtime/mod.rs` | Public `RuntimeTracer` facade constructed by session. 
-| Value encoding helpers | `src/runtime/value_encoder.rs` | Convert Python objects into `runtime_tracing::ValueRecord` values; unit test in isolation. -| Activation management (start-on-enter logic) | `src/runtime/activation.rs` | Encapsulate `activation_path`, `activation_code_id`, and toggling state. -| Writer initialisation and file path selection | `src/runtime/output_paths.rs` | Determine file names for JSON/Binary and wrap TraceWriter begin/finish. -| sys.monitoring integration utilities | `src/monitoring/mod.rs` | Provide `ToolId`, `EventId`, `MonitoringEvents`, `set_events`, etc. -| Tracer trait & callback dispatch | `src/monitoring/tracer.rs` | Define `Tracer` trait and per-event callbacks; depend on `monitoring::events`. -| Code object caching | `src/code_object.rs` | Remains focused on caching; consider relocating question comments to doc tests. - -The `runtime` and `monitoring` modules become directories with focused submodules, while `session.rs` consumes them via narrow interfaces. Any PyO3 FFI helper functions should live close to their domain (e.g., frame locals helpers inside `runtime/mod.rs`). - -## Target Python Package Layout -| Topic | Target file | Notes | -| --- | --- | --- | -| Public API surface (`start`, `stop`, `is_tracing`, constants) | `codetracer_python_recorder/api.py` | Keep the public signatures unchanged; delegate to new helpers. -| Session handle implementation | `codetracer_python_recorder/session.py` | Own `TraceSession` class and backend delegation logic. -| Auto-start via environment variables | `codetracer_python_recorder/auto_start.py` | Move `_auto_start_from_env` and constants needed only for boot-time configuration. -| Format constants & validation | `codetracer_python_recorder/formats.py` | Define `TRACE_BINARY`, `TRACE_JSON`, `DEFAULT_FORMAT`, and any helpers to negotiate format strings. -| Module-level `__init__` exports | `codetracer_python_recorder/__init__.py` | Re-export the API and trigger optional auto-start. - -Splitting the Python helper package along these lines isolates side-effectful auto-start logic from the plain API and simplifies targeted testing. - -## Implementation Roadmap - -1. **Stabilise tests and build scripts** - - Ensure `just test` passes to establish a green baseline. - - Capture benchmarks or representative trace outputs to validate parity later. - -2. **Introduce foundational Rust modules (serial)** - - Extract logging initialisation into `logging.rs` and update `lib.rs` to call the helper. - - Move session lifecycle logic from `lib.rs` into a new `session.rs`, keeping function signatures untouched and re-exporting via `lib.rs`. - - Update module declarations and adjust imports; verify tests. - -3. **Restructure runtime tracer internals (can parallelise subtasks)** - - Create `src/runtime/mod.rs` as façade exposing `RuntimeTracer`. - - **Task 3A (Team A)**: Extract activation control into `runtime/activation.rs`, exposing a small struct consumed by the tracer. - - **Task 3B (Team B)**: Extract value encoding routines into `runtime/value_encoder.rs`, providing unit tests and benchmarks. - - **Task 3C (Team C)**: Introduce `runtime/output_paths.rs` to encapsulate format-to-filename mapping and writer initialisation. - - Integrate submodules back into `runtime/mod.rs` sequentially once individual tasks are complete; resolve merge conflicts around struct fields. - -4. **Modularise sys.monitoring glue (partially parallel)** - - Add `monitoring/mod.rs` hosting shared types (`EventId`, `EventSet`, `ToolId`). 
- - Split trait and dispatcher logic into `monitoring/tracer.rs`; keep callback registration helpers near the sys.monitoring bindings. - - **Task 4A (Team A)**: Port OnceLock caches and registration helpers. - - **Task 4B (Team B)**: Move `Tracer` trait definition and default implementations, updating call sites in runtime tracer and tests. - -5. **Python package decomposition (parallel with Step 4 once Step 2 is merged)** - - Create `session.py`, `formats.py`, and `auto_start.py` with extracted logic. - - Update `api.py` to delegate to the new modules but maintain backward-compatible imports. - - Adjust `__init__.py` to import from `api` and trigger optional auto-start via the new helper. - - Update Python tests and examples to use the reorganised structure. - -6. **Clean-up and follow-up tasks** - - Remove obsolete comments (e.g., `//TODO AI!` placeholders) or move them into GitHub issues. - - Update documentation and diagrams to reflect the new module tree. - - Re-run `just test` and linting for both Rust and Python components; capture trace artifacts to confirm unchanged output format. - -## Parallelisation Notes -- Step 2 touches the global entry points and should complete before deeper refactors to minimise rebasing pain. -- Step 3 subtasks (activation, value encoding, output paths) operate on distinct sections of the existing `RuntimeTracer`; they can be implemented in parallel once `runtime/mod.rs` scaffolding exists. -- Step 4's subtasks can proceed concurrently with Step 3 once the new `monitoring` module is introduced; teams should coordinate on shared types but work on separate files. -- Step 5 (Python package) depends on Step 2 so that backend entry-points remain stable; it can overlap with late Step 3/4 work because it touches only the Python tree. -- Documentation updates and clean-up in Step 6 can be distributed among contributors after core refactors merge. - -## Testing & Verification Strategy -- Maintain existing integration and unit tests; add focused tests for newly separated modules (e.g., pure Rust tests for `value_encoder` conversions). -- Extend Python tests to cover environment auto-start logic now that it lives in its own module. -- For each phase, compare generated trace files against baseline fixtures to guarantee no behavioural regressions. -- Require code review sign-off from domain owners for each phase to ensure the single-responsibility intent is preserved. +# File-Level SRP Plan (TL;DR) + +## Goal +Give every Rust and Python file one clear job without breaking public APIs. + +## Where we are messy +- `src/lib.rs` mixes module wiring, logging, and session control. +- `src/runtime_tracer.rs` glues together activation logic, writers, and value encoding. +- `src/tracer.rs` holds the trait, sys.monitoring glue, and storage helpers in one blob. +- `codetracer_python_recorder/api.py` blends auto-start side effects with the public API. 
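+
+To make the split concrete, here is a minimal sketch of the façade after the move, assuming the module names in the target layout below; `_coerce_format` and `_start_backend` stand in for the real session helpers, so treat the wiring as illustrative rather than the shipped code:
+
+```py
+# api.py keeps only the public surface; validation and backend state live in the helper modules.
+from .formats import DEFAULT_FORMAT
+from .session import TraceSession, _coerce_format, _start_backend
+
+def start(path, *, format=DEFAULT_FORMAT, start_on_enter=None) -> TraceSession:
+    fmt = _coerce_format(format)                      # format rules resolved by the session/formats helpers
+    return _start_backend(path, fmt, start_on_enter)  # backend and session state stay in session.py
+```
+`auto_start.py` keeps the import-time side effect to itself, so importing `api` stays side-effect free.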
+ +## Target layout +### Rust +| Topic | File | +| --- | --- | +| PyO3 entry + re-exports | `src/lib.rs` | +| Logging defaults | `src/logging.rs` | +| Session lifecycle (`start/stop/is_tracing`) | `src/session.rs` | +| Runtime façade | `src/runtime/mod.rs` | +| Activation toggles | `src/runtime/activation.rs` | +| Value encoding | `src/runtime/value_encoder.rs` | +| Trace file paths + writer setup | `src/runtime/output_paths.rs` | +| Monitoring shared types | `src/monitoring/mod.rs` | +| Tracer trait + dispatcher | `src/monitoring/tracer.rs` | +| Code caching | `src/code_object.rs` | + +### Python +| Topic | File | +| --- | --- | +| Public API (`start`, `stop`, constants) | `codetracer_python_recorder/api.py` | +| Session handle | `codetracer_python_recorder/session.py` | +| Auto-start logic | `codetracer_python_recorder/auto_start.py` | +| Format helpers | `codetracer_python_recorder/formats.py` | +| Package exports | `codetracer_python_recorder/__init__.py` | + +## Order of attack +1. **Stabilise baseline** – run `just test`, capture a sample trace. +2. **Split core Rust files** + - Extract logging + session first so imports settle. +3. **Break up runtime tracer** + - Activation, value encoding, and output paths can happen in parallel once the new module exists. +4. **Split monitoring helpers** + - Move shared types into `monitoring/mod.rs` and keep the trait + dispatcher in `monitoring/tracer.rs`. +5. **Restructure Python package** + - Create the helper modules, keep API signatures the same, update imports. +6. **Clean up** + - Delete stale comments, refresh docs, and re-run all tests. + +## Proof of done +- No file carries unrelated responsibilities. +- Tests and trace fixtures match the pre-refactor behaviour. +- Reviewers can learn a subsystem by opening a single focused file. diff --git a/design-docs/file-level-srp-refactor-plan.status.md b/design-docs/file-level-srp-refactor-plan.status.md index b49aaa0..b9d8416 100644 --- a/design-docs/file-level-srp-refactor-plan.status.md +++ b/design-docs/file-level-srp-refactor-plan.status.md @@ -1,12 +1,11 @@ -# File-Level SRP Refactor Status +# File-Level SRP Status Snapshot -## Current Status -- ✅ Step 2 complete: introduced `src/logging.rs` for one-time logger initialisation and migrated tracing session lifecycle (`start_tracing`, `stop_tracing`, `is_tracing`, `flush_tracing`, `ACTIVE` flag) into `src/session.rs`, with `src/lib.rs` now limited to PyO3 wiring and re-exports. -- ✅ Step 3 complete: added `src/runtime/mod.rs` with focused `activation`, `value_encoder`, and `output_paths` submodules; `RuntimeTracer` now delegates activation gating, value encoding, and writer initialisation through the façade consumed by `session.rs`. -- ✅ Step 4 complete: introduced `src/monitoring/mod.rs` for sys.monitoring types/caches and `src/monitoring/tracer.rs` for the tracer trait plus callback dispatch; rewired `lib.rs`, `session.rs`, and `runtime/mod.rs`, and kept a top-level `tracer` re-export for API stability. -- ✅ Step 5 complete: split the Python package into dedicated `formats.py`, `session.py`, and `auto_start.py` modules, trimmed `api.py` to a thin façade, and moved the environment auto-start hook into `__init__.py`. -- ✅ Step 6 complete: resolved outstanding Rust TODOs (format validation, argv handling, function id stability), expanded module documentation so `cargo doc` reflects the architecture, and re-ran `just test` to confirm the refactor remains green. 
-- ✅ Test baseline: `just test` (nextest + pytest) passes with the UV cache scoped to the workspace; direct `cargo test` still requires CPython development symbols. +## What’s done +- ✅ Logging + session split: `src/logging.rs` handles init, `src/session.rs` owns `start/stop/is_tracing` while `lib.rs` just wires PyO3. +- ✅ Runtime breakup: `RuntimeTracer` now lives in `runtime/mod.rs` with dedicated `activation`, `value_encoder`, and `output_paths` modules. +- ✅ Monitoring split: shared types live in `monitoring/mod.rs`; the trait + dispatcher sit in `monitoring/tracer.rs`; public re-exports stayed stable. +- ✅ Python package tidy-up: `formats.py`, `session.py`, and `auto_start.py` carry their own concerns; `api.py` is a thin façade and `__init__.py` runs the optional auto-start. +- ✅ Cleanup: removed TODOs, refreshed docs, and `just test` (nextest + pytest) passes with the repo-local UV cache. -## Next Task -- Plan complete. Identify any new follow-up items as separate tasks once additional requirements surface. +## What’s next +- Nothing active—open new tasks if new requirements appear. diff --git a/design-docs/function-level-srp-refactor-plan.md b/design-docs/function-level-srp-refactor-plan.md index ccc6007..19ccc24 100644 --- a/design-docs/function-level-srp-refactor-plan.md +++ b/design-docs/function-level-srp-refactor-plan.md @@ -1,96 +1,30 @@ -# Function-Level Single Responsibility Refactor Plan - -## Goals -- Ensure each public function in the tracer stack orchestrates a single concern, delegating specialised work to cohesive helpers. -- Reduce unsafe code surface inside high-level callbacks by centralising frame manipulation and activation logic. -- Improve testability by exposing narrow helper functions that can be unit tested without spinning up a full tracing session. - -## Hotspot Summary -| Function | Location | Current mixed responsibilities | -| --- | --- | --- | -| `start_tracing` | `codetracer-python-recorder/src/session.rs` | Logging bootstrap, active-session guard, filesystem validation/creation, format parsing, argv inspection, tracer construction, sys.monitoring registration | -| `start` | `codetracer_python_recorder/session.py` | Backend state guard, path coercion, format normalisation, activation path handling, PyO3 call | -| `RuntimeTracer::on_py_start` | `codetracer-python-recorder/src/runtime/mod.rs` | Activation gating, synthetic filename filtering, unsafe frame acquisition, argument capture, writer registration, logging | -| `RuntimeTracer::on_line` | `codetracer-python-recorder/src/runtime/mod.rs` | Activation gating, frame search, locals/globals materialisation, value encoding, variable registration, logging | -| `RuntimeTracer::on_py_return` | `codetracer-python-recorder/src/runtime/mod.rs` | Activation gating, return value encoding, activation state transition, logging | - -These functions currently exceed 60–120 lines and interleave control flow with low-level detail, making them brittle and difficult to extend. - -## Refactor Strategy -1. **Codify shared helpers before rewriting call sites.** Introduce new modules (`runtime::frame_inspector`, `runtime::value_capture`, `session::bootstrap`) that encapsulate filesystem, activation, and frame-handling behaviour. -2. **Convert complex functions into orchestration shells.** After helpers exist, shrink the hotspot functions to roughly 10–25 lines that call the helpers and translate their results into tracer actions. -3. 
**Add regression tests around extracted helpers** so that future changes to callbacks can lean on focused coverage instead of broad integration tests. -4. **Maintain behavioural parity** by running full `just test` plus targeted fixture comparisons after each stage. - -### Helper Module Map -- `runtime::frame_inspector` owns frame discovery and locals/globals snapshots through the `FrameSnapshot` abstraction. -- `runtime::value_capture` centralises argument, scope, and return-value recording, keeping encoding concerns outside the tracer façade. -- `runtime::logging` provides the `log_event` helper so callback logging stays consistent and format-agnostic. -- `session::bootstrap` deals with filesystem setup, format resolution, and program metadata collection for the Rust entrypoint. -- Python `session.py` mirrors the responsibilities with `_coerce_format`, `_validate_trace_path`, and `_normalize_activation_path` helpers. - -## Work Breakdown - -### Stage 0 – Baseline & Guardrails (1 PR) -- Confirm the repository is green (`just test`). -- Capture representative trace output fixtures (binary + JSON) to compare after refactors. -- Document current behaviour of `ActivationController` and frame traversal in quick notes for reviewers. - -### Stage 1 – Session Start-Up Decomposition (Rust + Python) (2 PRs) -1. **Rust bootstrap helper** - - Add `session/bootstrap.rs` (or equivalent module) exposing functions `ensure_trace_directory`, `resolve_trace_format`, `collect_program_metadata`. - - Refactor `start_tracing` to call these helpers; keep public signature unchanged. - - Unit test each helper for error cases (invalid path, unsupported format, argv fallback). - -2. **Python validation split** - - Extract `validate_trace_path` and `coerce_format` into private helpers in `session.py`. - - Update `start` to orchestrate helpers and call `_start_backend` only after validation succeeds. - - Extend Python tests for duplicate start attempts and invalid path/format scenarios. - -### Stage 2 – Frame Inspection & Activation Separation (Rust) (2 PRs) -1. **Frame locator module** - - Introduce `runtime/frame_inspector.rs` handling frame acquisition, locals/globals materialisation, and reference-count hygiene. - - Provide safe wrappers returning domain structs (e.g., `CapturedFrame { locals, globals, frame_ptr }`). - - Update `on_line` to use the new inspector while retaining existing behaviour. - -2. **Activation orchestration** - - Enrich `ActivationController` with methods `should_process(code)` and `handle_deactivation(code_id)` so callbacks can early-return without duplicating logic. - - Update `on_py_start`, `on_line`, and `on_py_return` to rely on these helpers. - -### Stage 3 – Value Capture Layer (Rust) (2 PRs) -1. **Argument capture helper** - - Create `runtime/value_capture.rs` (or expand existing module) exposing `capture_call_arguments(writer, frame, code)`. - - Refactor `on_py_start` to use it, ensuring error propagation remains explicit. - - Unit test for positional args, varargs, kwargs, non-string keys, and failure cases (e.g., failed locals sync). - -2. **Scope recording helper** - - Extract locals/globals iteration into `record_visible_scope(writer, captured_frame)`. - - Update `on_line` to delegate the loop and remove inline Set bookkeeping. - - Add tests covering overlapping names, `__builtins__` filtering, and locals==globals edge cases. - -### Stage 4 – Return Handling & Logging Harmonisation (Rust) (1 PR) -- Introduce small logging helpers (e.g., `log_event(event, code, lineno)`). 
-- Provide `record_return_value(writer, value)` in `value_capture`. -- Refactor `on_py_return` to call activation decision, logging helper, and value recorder sequentially. -- Ensure deactivation on activation return remains tested. - -### Stage 5 – Cleanup & Regression Sweep (1 PR) -- Remove obsolete inline comments / TODOs made redundant by helpers. -- Re-run `just test`, compare fixtures, and update docs referencing the old function shapes. -- Add final documentation pointing to the new helper modules for contributors. - -## Testing Strategy -- **Unit tests:** Add Rust tests for each new helper module using PyO3 `Python::with_gil` harnesses and synthetic frames. Add Python tests for new validation helpers. -- **Integration tests:** Continue running `just test` after each stage. Augment with targeted scripts that exercise activation path, async functions, and nested frames to confirm instrumentation parity. -- **Fixture diffs:** Compare generated trace outputs (binary + JSON) before and after the refactor to ensure no semantic drift. - -## Dependencies & Coordination -- Stage 1 must land before downstream stages to stabilise shared session APIs. -- Stages 2 and 3 can progress in parallel once bootstrap helpers are merged, but teams should sync on shared structs (e.g., `CapturedFrame`). -- Any changes to unsafe frame handling require review from at least one PyO3 domain expert. -- Update ADR 0002 status from “Proposed” to “Accepted” once Stages 1–4 merge successfully. - -## Risks & Mitigations -- **Unsafe code mistakes:** Wrap raw pointer usage in RAII helpers with debug assertions; add fuzz/ stress tests for recursion-heavy scripts. -- **Performance regressions:** Benchmark tracer overhead before and after major stages; inline trivial helpers where necessary, or mark with `#[inline]` as appropriate. -- **Merge conflicts:** Finish each stage quickly and rebase branches frequently; keep PRs focused (≤400 LOC diff) to ease review. +# Function-Level SRP Plan (Quick Hits) + +## Goal +Shrink the big tracer entry points so each one just coordinates helpers instead of doing everything inline. + +## Main trouble spots +| Function | Issue | +| --- | --- | +| `session::start_tracing` | Mixes logging, state guards, filesystem setup, format parsing, argv capture, and tracer wiring. | +| Python `session.start` | Validates paths, normalises formats, toggles activation, and calls into Rust all in one go. | +| `RuntimeTracer::on_py_start` / `on_line` / `on_py_return` | Each does activation checks, frame poking, value capture, and logging in giant blocks. | + +## Fix-it plan +1. **Create helpers first** + - `session::bootstrap` for filesystem + format work. + - `runtime::frame_inspector` to locate frames and pull locals/globals safely. + - `runtime::value_capture` for arguments, scopes, and returns. + - Python helpers for path + format validation. +2. **Trim the orchestration functions** + - After helpers exist, each hotspot should read like: guard → call helper → send event. +3. **Test as we go** + - Unit-test helpers with focused cases (bad paths, weird locals, async frames, etc.). + - Keep running `just test` plus trace fixture diffs to ensure behaviour doesn’t change. +4. **Finish with cleanup** + - Drop leftover TODOs, document the new helper modules, and mark ADR 0002 as accepted once the stages land. + +## Done when +- No giant 60+ line functions remain in the tracer path. +- Unsafe frame handling lives inside audited helpers. 
+- Tests cover success, failure, and edge scenarios without needing full tracing sessions. diff --git a/design-docs/function-level-srp-refactor-plan.status.md b/design-docs/function-level-srp-refactor-plan.status.md index 0bea6ac..14bb00d 100644 --- a/design-docs/function-level-srp-refactor-plan.status.md +++ b/design-docs/function-level-srp-refactor-plan.status.md @@ -1,32 +1,13 @@ -# Function-Level SRP Refactor Status - -## Stage 0 – Baseline & Guardrails -- ✅ `just test` (Rust + Python suites) passes; captured run via the top-level recipe. -- ✅ Generated JSON and binary reference traces from `examples/value_capture_all.py`; outputs stored in `artifacts/stage0/value-capture-json/` and `artifacts/stage0/value-capture-binary/`. -- ⏳ Summarise current `ActivationController` behaviour and frame traversal notes for reviewer context. - -## Stage 1 – Session Start-Up Decomposition -- ✅ Step 1 (Rust): Introduced `session::bootstrap` helpers and refactored `start_tracing` to delegate directory validation, format resolution, and program metadata collection. Tests remain green. -- ✅ Step 2 (Python): Extracted `_coerce_format`, `_validate_trace_path`, and `_normalize_activation_path` helpers; added tests covering invalid formats and conflicting paths. - -## Stage 2 – Frame Inspection & Activation Separation -- ✅ Step 1: Added `runtime::frame_inspector::capture_frame` to encapsulate frame lookup, locals/globals materialisation, and reference counting; `on_line` now delegates to the helper while preserving behaviour. -- ✅ Step 2: Extended `ActivationController` with `should_process_event`/`handle_return_event`, updated callbacks to rely on them, and removed direct state juggling from `RuntimeTracer`. - -## Stage 3 – Value Capture Layer -- ✅ Step 1: Introduced `runtime::value_capture::capture_call_arguments`; `on_py_start` now delegates to it, keeping the function focused on orchestration while reusing frame inspectors. -- ✅ Step 2: Added `record_visible_scope` helper and refactored `on_line` to delegate locals/globals registration through it. - -## Stage 4 – Return Handling & Logging Harmonisation -- ✅ Added `runtime::logging::log_event` to consolidate callback logging across start, line, and return handlers. -- ✅ Exposed `record_return_value` in `runtime::value_capture` and refactored `RuntimeTracer::on_py_return` to orchestrate activation checks, logging, and value recording. -- ✅ Extended runtime tests with explicit return capture coverage and activation deactivation assertions. - -## Stage 5 – Cleanup & Regression Sweep -- ✅ Audited runtime modules for obsolete inline comments or TODOs introduced pre-refactor; none remained after helper extraction. -- ✅ Documented the helper module map in `design-docs/function-level-srp-refactor-plan.md` for contributor onboarding. -- ✅ Re-ran `just test` (Rust `cargo nextest` + Python `pytest`) to confirm post-cleanup parity. - -## Next Actions -- Draft short notes on activation gating and frame search mechanics to complete Stage 0. -- Track Stage 5 fixture comparisons if we decide to snapshot JSON/Binary outputs post-refactor. +# Function-Level SRP Status Snapshot + +## Completed +- ✅ Baseline captured: `just test` passes and JSON/Binary fixtures from `examples/value_capture_all.py` are stored for comparisons. +- ✅ Session bootstrap helpers in Rust and Python now handle directory checks, format resolution, and activation path cleanup. +- ✅ `frame_inspector` + beefed-up `ActivationController` keep frame grabbing and gating logic out of the callbacks. 
+- ✅ `value_capture` helpers own argument, scope, and return recording; callbacks just orchestrate. +- ✅ Logging is centralised and runtime tests cover return handling and activation teardown. +- ✅ Final cleanup pass removed TODOs and reran the full test recipe. + +## Still open +- Write the short explainer on activation gating/frame search if reviewers still need it. +- Decide whether to snapshot fresh fixtures post-refactor. diff --git a/design-docs/only-real-filenames.md b/design-docs/only-real-filenames.md index 75d3b72..a6dc0d6 100644 --- a/design-docs/only-real-filenames.md +++ b/design-docs/only-real-filenames.md @@ -1,135 +1,25 @@ -In Python monitoring sometimes the co_filename of a code object -doesn't point to a real file, but something else. Those filenames look -like `<...>`. - -Lines from those files cannot be traced. For this reason, we should -skip them for all monitoring events. - -sys.monitoring provides the capability to turn off monitoring for -specific lines by having the callback return a special value -`sys.monitoring.DISABLE`. We want to use this functionality to disable -monitoring of those lines and improve performance. - -The following changes need to be made: - -1. Extend the `Tracer` trait so every callback can signal back a - `sys.monitoring` action (continue or disable). Update all existing - implementations and tests to use the new return type. -2. Add reusable logic that decides whether a given code object refers - to a real on-disk file and cache the decision per `co_filename` / - code id. -3. Invoke the new filtering logic from every `RuntimeTracer` callback - before any expensive work. When a code object should be ignored, - skip our bookkeeping and return the disable sentinel to CPython so - further events from that location stop firing. - -Note: We cannot import `sys.monitoring` inside the hot callbacks, -because in some embedded runtimes importing during tracing is either -prohibited or will deadlock. We must therefore cache the -`sys.monitoring.DISABLE` sentinel ahead of time while we are still in a -safe context (e.g., during tracer installation). - -We need to make sure that our test suite has comprehensive tests that -prove the new filtering/disable behaviour and cover regressions on the -public tracer API. - -# Technical design solutions - -## Tracer callback return values - -- Introduce a new enum `CallbackOutcome` in `src/tracer.rs` with two - variants: `Continue` (default) and `DisableLocation`. -- Define a `type CallbackResult = PyResult` so every - trait method can surface Python errors and signal whether the - location must be disabled. `Continue` replaces the current implicit - unit return. -- Update the `Tracer` trait so all callbacks return `CallbackResult`. - Default implementations continue to return `Ok(CallbackOutcome::Continue)` - so existing tracers only need minimal changes. -- The PyO3 callback shims (`callback_line`, `callback_py_start`, etc.) - will translate `CallbackOutcome::DisableLocation` into the cached - Python sentinel and otherwise return `None`. This keeps the Python - side compliant with `sys.monitoring` semantics - (see https://docs.python.org/3/library/sys.monitoring.html#sys.monitoring.DISABLE). - -## Accessing `sys.monitoring.DISABLE` - -- During `install_tracer`, after we obtain `monitoring_events`, load - `sys.monitoring.DISABLE` once and store it in the global tracer state - (`Global` struct) as a `Py`. 
Because `Py` is `Send` - + `Sync`, it can be safely cached behind the global mutex and reused - inside callbacks without re-importing modules. -- Provide a helper on `Global` (e.g., `fn disable_sentinel<'py>(&self, - py: Python<'py>) -> Bound<'py, PyAny>`) that returns the bound object - when we need to hand the sentinel back to Python. -- Make sure `uninstall_tracer` drops the sentinel alongside other - state so a new install can reload it cleanly. - -## `RuntimeTracer` filename filtering - -- Add a dedicated method `fn should_trace_code(&mut self, - py: Python<'_>, code: &CodeObjectWrapper) -> ShouldTrace` returning a - new internal enum `{ Trace, SkipAndDisable }`. - - A file is considered “real” when `co_filename` does not match the - `<...>` pattern. For now we treat any filename that begins with `<` - and ends with `>` (after trimming whitespace) as synthetic. This - covers ``, ``, ``, etc. - - Cache negative decisions in a `HashSet` keyed by the code - object id so subsequent events avoid repeating the string checks. - The set is cleared on `flush()`/`finish()` if we reset state. -- Each public callback (`on_py_start`, `on_line`, `on_py_return`) will - call `should_trace_code` first. When the decision is `SkipAndDisable` - we: - - Return `CallbackOutcome::DisableLocation` immediately so CPython - stops sending events for that location. - - Avoid calling any of the expensive frame/value capture paths. -- When the decision allows tracing, we continue with the existing - behaviour. The activation-path logic runs before the filtering so a - deactivated tracer still ignores events regardless of filename. - -## Backwards compatibility and ergonomics - -- `RuntimeTracer` becomes the only tracer that returns - `DisableLocation`; other tracers keep returning `Continue`. -- Update the test helper tracers under `tests/` to use the new return - type but still assert on event counts; their filenames will remain - real so behaviour does not change. -- Document the change in the crate-level docs (`src/lib.rs`) to warn - downstream implementors that callbacks now return `CallbackResult`. - -# Test suite - -- Rust unit test for the pure filename predicate (e.g., - ``, ``, `script.py`) to prevent - regressions in the heuristic. -- Runtime tracer integration test that registers a `RuntimeTracer`, - executes code with a `` filename, and asserts that: - - No events are written to the trace writer. - - The corresponding callbacks return the disable sentinel (inspect - via a lightweight shim or mock writer). -- Complementary test that runs a real file (use `tempfile` to emit a - small script) and ensures events are still recorded. -- Regression tests for the updated trait: adjust `tests/print_tracer.rs` - counting tracer to assert it still receives events and that the - return value defaults to `Continue`. -- Add a smoke test checking we do not attempt to import - `sys.monitoring` inside callbacks by patching the module import hook - during a run. - -# Implementation Plan - -1. Introduce `CallbackOutcome`/`CallbackResult` in `src/tracer.rs` and - update every trait method signature plus the PyO3 callback shims. - Store the `sys.monitoring.DISABLE` sentinel in the `Global` state. -2. Propagate signature updates through existing tracers and tests, - ensuring they all return `CallbackOutcome::Continue`. -3. Extend `RuntimeTracer` with the filename filtering method, cached - skip set, and early-return logic that emits `DisableLocation` when - appropriate. -4. 
Update the runtime tracer callbacks (`on_py_start`, `on_line`, - `on_py_return`, and any other events we wire up later) to invoke the - filtering method first. -5. Expand the test suite with the new unit/integration coverage and - adjust existing tests to the trait changes. -6. Perform a final pass to document the new behaviour in public docs - and ensure formatting/lints pass. +# Skip Fake Filenames Plan + +## Problem +Some code objects report filenames like `<string>` or `<frozen importlib._bootstrap>`. Those are synthetic, so tracing them wastes work and breaks source lookups. + +## Fix +1. **Let callbacks opt out** + - `Tracer` methods return a `CallbackOutcome` (`Continue` or `DisableLocation`). + - PyO3 shims translate `DisableLocation` into the cached `sys.monitoring.DISABLE` sentinel. +2. **Cache the sentinel** + - Load `sys.monitoring.DISABLE` during installation and keep it in global state as a `Py<PyAny>`. +3. **Filter in `RuntimeTracer`** + - Add `should_trace_code` that treats filenames wrapped in `<...>` as fake. + - Cache the skip decision per code object id so we bail out fast next time. + - If fake, return `DisableLocation` immediately and skip all heavy lifting. + +## Tests +- Unit test the filename predicate (`<string>` vs `script.py`). +- Runtime test confirming `<string>` code triggers the disable path and real files still trace. +- Update helper tracers/tests to use the new return type. + +## Done when +- Synthetic filenames stop generating events. +- Real files still trace normally. +- No callback imports `sys.monitoring` on the hot path. diff --git a/design-docs/py-api-001.md b/design-docs/py-api-001.md index 797b2f0..051c92d 100644 --- a/design-docs/py-api-001.md +++ b/design-docs/py-api-001.md @@ -1,73 +1,39 @@ -# Python sys.monitoring Tracer API +# Python API Cheat Sheet -## Overview -This document describes the user-facing Python API for the `codetracer` module built on top of `runtime_tracing` and `sys.monitoring`. The API exposes a minimal surface for starting and stopping traces, managing trace sessions, and integrating tracing into scripts or test suites. - -## Module `codetracer` - -### Constants -- `DEFAULT_FORMAT: str = "binary"` -- `TRACE_BINARY: str = "binary"` -- `TRACE_JSON: str = "json"` - -### Session Management -- Start a global trace; returns a `TraceSession`. - ```py - def start(path: str | os.PathLike, *, format: str = DEFAULT_FORMAT, - start_on_enter: str | os.PathLike | None = None) -> TraceSession - ``` -- Stop the active trace if any. - ```py - def stop() -> None - ``` -- Query whether tracing is active. - ```py - def is_tracing() -> bool - ``` -- Context manager helper for scoped tracing. - ```py - @contextlib.contextmanager - def trace(path: str | os.PathLike, *, format: str = DEFAULT_FORMAT): - ... - ``` -- Flush buffered data to disk without ending the session. - ```py - def flush() -> None - ``` +## Imports +```py +import codetracer +``` -## Class `TraceSession` -Represents a live tracing session returned by `start()` and used by the context manager. +## Constants +- `codetracer.TRACE_BINARY` / `codetracer.TRACE_JSON` +- `codetracer.DEFAULT_FORMAT` (defaults to binary) +## Core calls ```py -class TraceSession: - path: pathlib.Path - format: str - - def stop(self) -> None: ... - def flush(self) -> None: ... - def __enter__(self) -> TraceSession: ... - def __exit__(self, exc_type, exc, tb) -> None: ...
+session = codetracer.start(path, format=codetracer.DEFAULT_FORMAT, start_on_enter=None) +codetracer.stop() +is_active = codetracer.is_tracing() +codetracer.flush() ``` -### Start Behavior -- `start_on_enter`: Optional path; when provided, tracing starts only after execution first enters this file (useful to avoid interpreter/import noise when launching via CLI). +`start_on_enter` (optional path) delays tracing until we enter that file. -### Output Location -- `path` is a directory. The tracer writes three files inside it: - - `trace.json` when `format == "json"` or `trace.bin` when `format == "binary"` - - `trace_metadata.json` - - `trace_paths.json` +## Context manager +```py +with codetracer.trace(path, format=codetracer.TRACE_JSON): + run_code() +``` -## Environment Integration -- Auto-start tracing when `CODETRACER_TRACE` is set; the value is interpreted as the output directory. -- When `CODETRACER_FORMAT` is provided, it overrides the default output format. +## TraceSession object +- Attributes: `path`, `format` +- Methods: `stop()`, `flush()`, context-manager support. -## Usage Example -```py -import codetracer -from pathlib import Path +## Files we write +- `trace.bin` or `trace.json` +- `trace_metadata.json` +- `trace_paths.json` -out_dir = Path("./traces/run-001") -with codetracer.trace(out_dir, format=codetracer.TRACE_JSON): - run_application() -``` +## Environment auto-start +- `CODETRACER_TRACE=/tmp/out` starts tracing on import. +- `CODETRACER_FORMAT=json` overrides the format. diff --git a/design-docs/test-design-001.md b/design-docs/test-design-001.md index ed4133a..8ea8ad4 100644 --- a/design-docs/test-design-001.md +++ b/design-docs/test-design-001.md @@ -1,60 +1,33 @@ -# Python sys.monitoring Tracer Test Design - -## Overview -This document outlines a test suite for validating the Python tracer built on `sys.monitoring` and `runtime_tracing`. Each test item corresponds to roughly 1–10 lines of implementation and exercises tracer behavior under typical and edge conditions. +# Tracer Test Plan (Quick List) ## Setup -- Establish a temporary directory for trace output and source snapshots. -- Install the tracer module and import helper utilities for running traced Python snippets. -- Provide fixtures that clear the trace buffer and reset global state between tests. - -## Tool Initialization -- Acquire a monitoring tool ID and ensure subsequent calls reuse the same identifier. -- Register callbacks for all enabled events and verify the resulting mask matches the design. -- Unregister callbacks on shutdown and confirm no events fire afterward. - -## Event Recording -### Control Flow Events -- Capture `PY_START` and `PY_RETURN` for a simple script and assert a start/stop pair is recorded. -- Resume and yield events within a generator function produce matching `PY_RESUME`/`PY_YIELD` entries. -- A `PY_THROW` followed by `RERAISE` generates the expected unwind and rethrow sequence. - -### Call Tracking -- Direct function calls record `CALL` and `PY_RETURN` with correct frame identifiers. -- Recursive calls nest frames correctly and unwind in LIFO order. -- Decorated functions ensure wrapper frames are recorded separately from wrapped frames. - -### Line and Branch Coverage -- A loop with conditional branches emits `LINE` events for each executed line and `BRANCH` for each branch taken or skipped. -- Jump statements such as `continue` and `break` produce `JUMP` events with source and destination line numbers. 
- -### Exception Handling -- Raising and catching an exception emits `RAISE` and `EXCEPTION_HANDLED` events with matching exception IDs. -- An uncaught exception records `RAISE` followed by `PY_UNWIND` and terminates the trace with a `PY_THROW`. - -### C API Boundary -- Calling a built-in like `len` results in `C_CALL` and `C_RETURN` events linked to the Python frame. -- A built-in that raises, such as `int("a")`, generates `C_RAISE` with the translated exception value. - -## Value Translation -- Primitive values (ints, floats, strings, bytes) round-trip through the value registry and appear in the trace as expected. -- Complex collections like lists of dicts are serialized recursively with cycle detection preventing infinite loops. -- Object references without safe representations fall back to `repr` with a stable identifier. - -## Metadata and Source Capture -- The trace writer copies the executing script into the output directory and records its SHA-256 hash. -- Traces include `ProcessMetadata` fields for Python version and platform. - -## Shutdown Behavior -- Normal interpreter exit flushes the trace and closes files without losing events. -- An abrupt shutdown via `os._exit` truncates the trace file but leaves previous events intact. - -## Error and Edge Cases -- Invalid event names in manual callback registration raise a clear `ValueError`. -- Attempting to trace after the writer is closed results in a no-op without raising. -- Large string values exceeding the configured limit are truncated with an explicit marker. - -## Performance and Stress -- Tracing a tight loop of 10⁶ iterations completes within an acceptable time budget. -- Concurrent threads each produce isolated traces with no frame ID collisions. - +- Use a temp directory per test. +- Reset global tracer state between runs. + +## Startup + shutdown +- Tool id is reused after first acquisition. +- Callbacks register/unregister cleanly. +- Stopping the tracer leaves no stray events; `os._exit` leaves the file truncated but valid. + +## Event coverage +- `PY_START`/`PY_RETURN` pair for a simple function. +- Generators hit `PY_RESUME` and `PY_YIELD`. +- Exceptions trigger `PY_THROW`, `RERAISE`, and `EXCEPTION_HANDLED` as expected. +- Calls record frame ids correctly, including recursion and decorated functions. +- Branching code emits `LINE`, `BRANCH`, and `JUMP` events with the right lines. +- C boundary logs `C_CALL`/`C_RETURN` and `C_RAISE` for failures. + +## Value capture +- Basic scalars and collections encode round-trip. +- Recursive structures stop at the cycle guard. +- Unhandled objects fall back to a stable `repr` string. + +## Metadata +- Source files are copied once with hashes. +- Process metadata (Python version, platform) is present. + +## Edge checks +- Invalid manual registration raises `ValueError`. +- Re-tracing after stop is a no-op. +- Oversized strings get truncated with a marker. +- Stress test: 10^6 loop iterations finish within budget and threads keep ids distinct. diff --git a/design-docs/test-suite-coverage-plan.md b/design-docs/test-suite-coverage-plan.md index 7f8e4d4..bab7a0c 100644 --- a/design-docs/test-suite-coverage-plan.md +++ b/design-docs/test-suite-coverage-plan.md @@ -1,65 +1,32 @@ -# Test Suite Coverage Plan for codetracer-python-recorder +# Coverage Plan (Simple View) ## Goals -- Provide lightweight code coverage signals for both the Rust and Python layers without blocking CI on initial roll-out. 
-- Enable engineers to inspect coverage reports for targeted modules (runtime activation, session bootstrap, Python facade helpers) while keeping runtimes acceptable. -- Lay groundwork for future gating (e.g., minimum coverage thresholds) once the numbers stabilise. - -## Tooling Choices -- **Rust:** Use `cargo llvm-cov` to aggregate unit and integration test coverage. This tool integrates with `nextest` and produces both lcov and HTML outputs. It works with the existing `nix develop` environment once `llvm-tools-preview` is available (already pulled by rustup in Nix environment). -- **Python:** Use `pytest --cov` with the `coverage` plugin. Restrict collection to the `codetracer_python_recorder` package to avoid noise from site-packages. Generate both terminal summaries and Cobertura XML for upload. - -## Prerequisites & Dependencies -- Add `cargo-llvm-cov` to the dev environment so the Just targets and CI runners share the same binary. In the Nix shell, include the package and ensure the Rust toolchain exposes `llvm-tools-preview` or equivalent `llvm` binaries. The current dev shell ships `llvmPackages_latest.llvm`, making `llvm-cov`/`llvm-profdata` available without rustup components. -- Extend the UV `dev` dependency group with `pytest-cov` and `coverage[toml]` so Python coverage instrumentation is reproducible locally and in CI. -- Standardise coverage outputs under `codetracer-python-recorder/target/coverage` to keep artefacts inside the Rust crate. Use `target/coverage/{rust,python}` for per-language assets and a top-level `index.txt` to note the run metadata if needed later. - -## Execution Strategy -1. **Local Workflow** - - Add convenience Just targets that mirror the default test steps: - - `just coverage-rust` → `LLVM_COV=$(command -v llvm-cov) LLVM_PROFDATA=$(command -v llvm-profdata) uv run cargo llvm-cov --manifest-path codetracer-python-recorder/Cargo.toml --no-default-features --nextest --lcov --output-path codetracer-python-recorder/target/coverage/rust/lcov.info`, followed by `cargo llvm-cov report --summary-only --json` to generate `summary.json` and a Python helper that prints a table mirroring the pytest coverage output. Document that contributors can run a second `cargo llvm-cov … --html --output-dir …` invocation when they need browsable reports because the CLI disallows combining `--lcov` and `--html` in a single run. - - `just coverage-python` → `uv run --group dev --group test pytest --cov=codetracer_python_recorder --cov-report=term --cov-report=xml:codetracer-python-recorder/target/coverage/python/coverage.xml --cov-report=json:codetracer-python-recorder/target/coverage/python/coverage.json codetracer-python-recorder/tests/python`. - - `just coverage` wrapper → runs the Rust step followed by the Python step so developers get both artefacts with one command, matching the eventual CI flow. - - Ensure the commands create their output directories (`target/coverage/rust` and `target/coverage/python`) before writing results to avoid failures on first use. - - Document the workflow in `codetracer-python-recorder/tests/README.md` (and reference the top-level `README` if needed) so contributors know when to run the coverage helpers versus the regular test splits. - -2. **CI Integration (non-blocking first pass)** - - Extend `.github/workflows/ci.yml` with optional `coverage-rust` and `coverage-python` jobs that depend on the primary test jobs and only run when `matrix.python-version == '3.12'` and `matrix.os == 'ubuntu-latest'` to avoid duplicate collection. 
- - Reuse the Just targets so CI mirrors local behaviour. Inject `RUSTFLAGS`/`RUSTDOCFLAGS` from the test jobs’ cache to avoid rebuilding dependencies. - - Publish artefacts via `actions/upload-artifact`: - - Rust: `codetracer-python-recorder/target/coverage/rust/lcov.info`, the machine-readable `summary.json`, and optionally a gzipped HTML folder produced via a follow-up `cargo llvm-cov nextest --html --output-dir …` run in the same job. - - Python: `codetracer-python-recorder/target/coverage/python/coverage.xml` and `coverage.json` for downstream tooling. - - Mark coverage steps with `continue-on-error: true` during the stabilisation phase and note the run IDs in the job summary for quick retrieval. - - Use a GitHub Action to post/update a PR comment that embeds the Rust and Python coverage summaries in Markdown (via `scripts/generate_coverage_comment.py` drawing from the JSON reports), giving reviewers quick insight without opening artefacts. - -3. **Reporting & Visualisation** - - Use GitHub Actions artefacts for report retrieval. - - Investigate integration with Codecov or Coveralls once the raw reports stabilise; defer external upload until initial noise is assessed. - -## Incremental Roll-Out -1. Land Just targets and documentation so engineers can generate coverage locally. -2. Add CI coverage steps guarded by `if: matrix.python-version == '3.12'` to avoid duplicate work across versions. -3. Monitor runtimes and artefact sizes for a few cycles. -4. Once stable: - - Remove `continue-on-error` and make coverage generation mandatory. - - Introduce thresholds (e.g., fail if Rust line coverage < 70% or Python < 60%)—subject to discussion with the Runtime Tracing Team. - -## Implementation Checklist -- [x] Update development environment dependencies (`flake.nix`, `pyproject.toml`) to support coverage tooling out of the box. -- [x] Add `just coverage-rust`, `just coverage-python`, and `just coverage` helpers with directory bootstrapping. -- [x] Refresh documentation (`codetracer-python-recorder/tests/README.md` and top-level testing guide) with coverage instructions. -- [x] Extend CI workflow with non-blocking coverage jobs and artefact upload. -- [x] Review initial coverage artefacts to set baseline thresholds before enforcement. - -## Risks & Mitigations -- **Runtime overhead:** Coverage runs are slower. Mitigate by limiting to a single matrix entry and caching `target/coverage` directories if needed. -- **Report size:** HTML artefacts can be large. Compress before upload and prune historical runs as necessary. -- **PyO3 instrumentation quirks:** Ensure `cargo llvm-cov` runs with `--no-default-features` similar to existing `nextest` invocation to avoid mismatched Python symbols. -- **Coverage accuracy:** Python subprocess-heavy tests may under-report coverage. Supplement with targeted unit tests already added in Stage 4. - -## Next Actions -- Implement the local Just targets and update documentation. -- Extend CI workflow with optional coverage steps (post-tests) and artefact upload. -- Align with the developer experience team before enforcing thresholds. - -_Status tracking lives in `design-docs/test-suite-coverage-plan.status.md`._ +- Produce Rust + Python coverage reports without slowing developers down. +- Keep outputs in `target/coverage` so everyone can inspect them later. +- Make it easy to add CI enforcement once numbers stabilise. + +## Tools +- Rust: `cargo llvm-cov` (works with `nextest`). +- Python: `pytest --cov` limited to `codetracer_python_recorder`. 
+ +## Local commands +- `just coverage-rust` → creates `target/coverage/rust/lcov.info` and a summary JSON/HTML when needed. +- `just coverage-python` → runs pytest with XML + JSON reports under `target/coverage/python/`. +- `just coverage` → runs both steps. + +## CI plan (phase 1) +- Add optional jobs on Ubuntu + Python 3.12 that reuse the Just commands. +- Upload artefacts (Rust lcov + summary, Python XML/JSON). +- Mark jobs `continue-on-error` until results settle. +- Post a PR comment summarising the numbers from the JSON reports. + +## Rollout steps +1. Land the Just targets and document them in `tests/README.md`. +2. Wire up the optional CI jobs. +3. Watch runtimes + artefact sizes for a few runs. +4. When stable, drop `continue-on-error` and discuss minimum coverage thresholds. + +## Risks + mitigations +- **Slow runs** → limit coverage to one matrix entry, reuse caches. +- **Large artefacts** → compress HTML or keep only lcov/XML. +- **PyO3 quirks** → run `cargo llvm-cov` with the same feature flags as normal tests. diff --git a/design-docs/test-suite-coverage-plan.status.md b/design-docs/test-suite-coverage-plan.status.md index 809df37..5eb66fc 100644 --- a/design-docs/test-suite-coverage-plan.status.md +++ b/design-docs/test-suite-coverage-plan.status.md @@ -1,11 +1,11 @@ -# Test Suite Coverage Plan Status +# Coverage Plan Status Snapshot -## Current Status -- ✅ Plan doc expanded with prerequisites, detailed Just targets, CI strategy, and an implementation checklist (see `design-docs/test-suite-coverage-plan.md`). -- ✅ Implementation: coverage dependencies added to the dev shell (`flake.nix`) and UV groups (`pyproject.toml`). -- ✅ Implementation: `just coverage-*` helpers landed with matching documentation in `codetracer-python-recorder/tests/README.md`. -- ✅ Implementation: CI now runs `just coverage` on Python 3.12 with non-blocking jobs, uploads JSON/XML/LCOV artefacts, and posts a PR comment summarising Rust/Python coverage (`.github/workflows/ci.yml`). -- ✅ Assessment: capture baseline coverage numbers before proposing enforcement thresholds. +## Done +- ✅ Coverage plan doc updated with tooling + rollout details. +- ✅ Dev env ships `cargo-llvm-cov`, `pytest-cov`, and friends. +- ✅ `just coverage-rust`, `just coverage-python`, and `just coverage` exist with docs in `tests/README.md`. +- ✅ CI runs the coverage recipe on Ubuntu/Python 3.12, uploads artefacts, and posts the summary comment. +- ✅ Baseline numbers recorded for future thresholds. -## Next Steps -We are Done +## Next +- Nothing pending. diff --git a/design-docs/test-suite-improvement-plan.md b/design-docs/test-suite-improvement-plan.md index 205a3ae..7cd0d56 100644 --- a/design-docs/test-suite-improvement-plan.md +++ b/design-docs/test-suite-improvement-plan.md @@ -1,93 +1,30 @@ -# codetracer-python-recorder Test Suite Improvement Plan +# Test Suite Improvement Plan (Short Form) ## Goals -- Establish a transparent testing pyramid so engineers know whether new coverage belongs in Rust unit tests, Rust integration tests, or Python user-flow tests. -- Raise confidence in onboarding-critical paths (session bootstrap, activation gating, file outputs) by adding deterministic unit and integration tests. -- Reduce duplication and drift between Rust and Python harnesses by sharing fixtures and tooling. -- Prepare for future coverage metrics by making each harness runnable and observable in isolation. 
- -## Current Pain Points -- Python-facing tests currently live in `codetracer-python-recorder/test/` while Rust integration tests live in `codetracer-python-recorder/tests/`; the near-identical names are easy to mis-type and confuse CI/job configuration. -- Core bootstrap logic lacks direct coverage: no existing test references `TraceSessionBootstrap::prepare` or helpers inside `codetracer-python-recorder/src/session/bootstrap.rs`, and `TraceOutputPaths::configure_writer` in `codetracer-python-recorder/src/runtime/output_paths.rs` is only exercised implicitly. -- `ActivationController` in `codetracer-python-recorder/src/runtime/activation.rs` is only touched indirectly through long integration scripts, leaving edge cases (synthetic filenames, multiple activation toggles) unverified. -- Python helpers `_coerce_format`, `_validate_trace_path`, and `_normalize_activation_path` in `codetracer-python-recorder/codetracer_python_recorder/session.py` are not unit tested; regressions would surface only during end-to-end runs. -- `just test` hides which harness failed because both `cargo nextest run` and `pytest` report together; failures require manual reproduction to determine the responsible layer. - -## Workstreams - -### WS1 – Layout Consolidation & Tooling Updates -- Rename Python test directory to `codetracer-python-recorder/tests/python/` and move existing files. -- Move Rust integration tests into `codetracer-python-recorder/tests/rust/` and update `Cargo.toml` (if necessary) to ensure cargo discovers them. -- Add `codetracer-python-recorder/tests/README.md` describing the taxonomy and quick-start commands. -- Update `Justfile`, `pyproject.toml`, and any workflow scripts to call `pytest tests/python` explicitly. -- Exit criteria: `just test` logs identify `cargo nextest` and `pytest tests/python` as separate steps, and developers can run each harness independently. - -### WS2 – Rust Bootstrap Coverage -- Add `#[cfg(test)]` modules under `codetracer-python-recorder/src/session/bootstrap.rs` covering: - - Directory creation success and failure (non-directory path, unwritable path). - - Format resolution, including legacy aliases and error cases. - - Program metadata capture when `sys.argv` is empty or contains non-string values. -- Add tests for `TraceOutputPaths::new` and `configure_writer` under `codetracer-python-recorder/src/runtime/output_paths.rs`, using an in-memory writer stub to assert emitted file names and initial start position. -- Exit criteria: failures in any helper produce precise `PyRuntimeError` messages, and the new tests fail if error handling regresses. - -### WS3 – Activation Controller Guard Rails -- Introduce focused unit tests for `ActivationController` (e.g., `#[cfg(test)]` alongside `codetracer-python-recorder/src/runtime/activation.rs`) covering: - - Activation path matching and non-matching filenames. - - Synthetic filename rejection (`` and ``). - - Multiple activation cycles to ensure `activation_done` prevents re-entry. -- Extend existing `RuntimeTracer` tests to add a regression asserting that disabling synthetic frames keeps `CallbackOutcome::DisableLocation` consistent. -- Exit criteria: Activation tests run without spinning up full integration scripts and cover both positive and negative flows. - -### WS4 – Python API Unit Coverage & Fixtures -- Create a `tests/python/unit/` package with tests for `_coerce_format`, `_validate_trace_path`, `_normalize_activation_path`, and `TraceSession` life-cycle helpers (`flush`, `stop` when inactive). 
-- Extract reusable Python fixtures (temporary trace directory, environment manipulation) into `tests/python/support/` for reuse by integration tests. -- Confirm high-level tests (e.g., `test_monitoring_events.py`) import shared fixtures instead of duplicating temporary directory logic. -- Exit criteria: Python unit tests run without initialising the Rust extension, and integration tests rely on shared fixtures to minimise duplication. - -### WS5 – CI & Observability Enhancements -- Update CI workflows to surface separate status checks (e.g., `rust-tests`, `python-tests`). -- Add minimal coverage instrumentation: enable `cargo llvm-cov` (or `grcov`) for Rust helpers and `pytest --cov` for Python tests, even if we only publish the reports as artefacts initially. -- Document required commands in `tests/README.md` and ensure `just test` forwards `--nocapture`/`-q` flags appropriately. -- Exit criteria: CI reports the two harnesses independently, and developers can opt-in to coverage locally following documented steps. - -## Sequencing & Milestones - -1. **Stage 0 – Baseline (1 PR)** - - Capture current `just test` runtime and identify flaky tests. - - Snapshot trace files produced by `tests/test_monitoring_events.py` for regression comparison. - -2. **Stage 1 – Layout Consolidation (1–2 PRs)** - - Execute WS1: rename directories, update tooling, land `tests/README.md`. - -3. **Stage 2 – Bootstrap & Output Coverage (1 PR)** - - Execute WS2; ensure new tests pass on Linux/macOS runners. - -4. **Stage 3 – Activation Guard Rails (1 PR)** - - Execute WS3; ensure synthetic filename handling remains guarded. - -5. **Stage 4 – Python Unit Coverage (1 PR)** - - Execute WS4; migrate existing integration tests to shared fixtures. - -6. **Stage 5 – CI & Coverage Instrumentation (1 PR)** - - Execute WS5; update workflow files and document developer commands. - -7. **Stage 6 – Cleanup & Documentation (optional PR)** - - Update ADR status to **Accepted**, refresh onboarding docs, and archive baseline trace fixtures. - -## Verification Strategy -- Run `just test` after each stage; ensure both harnesses are explicitly reported in CI logs. -- Add `cargo nextest run --tests --nocapture activation::tests` smoke job to confirm activation unit coverage. -- For Python, run `pytest tests/python/unit -q` in isolation to keep the unit layer fast and deterministic. -- Compare stored trace fixtures before/after coverage additions to confirm no behavioural regressions. - -## Risks & Mitigations -- **Path renames break imports:** mitigate by landing directory changes alongside import updates and running `pytest -q` locally before merge. -- **Increased test runtime:** unit tests are lightweight; integration tests already dominate runtime. Monitor `just test` duration and consider parallel pytest execution if needed. -- **Coverage tooling churn:** start with optional coverage reports to avoid blocking CI; formal thresholds can follow once noise is understood. -- **PyO3 version mismatches:** ensure new Rust tests use `Python::with_gil` and `Bound<'_, PyAny>` consistently to avoid UB when running under coverage tools. - -## Deliverables & Ownership -- Primary owner: Runtime Tracing Team. -- Supporting reviewers: Python Tooling WG for Python fixtures and QA Automation Guild for CI changes. -- Target completion: end of Q4 FY25, ahead of planned streaming-writer work that depends on reliable regression coverage. - +- Clarify where tests belong (Rust unit vs. Rust integration vs. Python). 
+- Add coverage for session bootstrap, activation gating, and Python helpers. +- Keep each harness runnable on its own. + +## Pain today +- Python tests live in `test/` and clash with Rust `tests/` naming. +- Bootstrap helpers and activation controller lack direct tests. +- Python helpers go untested; regressions surface only via slow end-to-end runs. +- `just test` prints combined output, so it’s hard to see which layer failed. + +## Plan by stage +1. **Layout cleanup** – Move Python tests to `tests/python/`, Rust integration tests to `tests/rust/`, add `tests/README.md`, update tooling. +2. **Bootstrap coverage** – Unit-test `session::bootstrap` helpers and `runtime::output_paths` writer setup. +3. **Activation guard rails** – Unit-test `ActivationController` and confirm synthetic filenames return `DisableLocation`. +4. **Python unit layer** – Add `tests/python/unit` for `_coerce_format`, `_validate_trace_path`, `_normalize_activation_path`, etc., with shared fixtures in `tests/python/support`. +5. **CI polish** – Split CI jobs (`rust-tests`, `python-tests`), add optional coverage reports, and document commands in `tests/README.md`. +6. **Cleanup** – Update ADR status/docs and snapshot trace fixtures if needed. + +## Verification +- Run `just test` plus the individual harness commands after each stage. +- Keep small unit suites fast (`pytest tests/python/unit -q`, targeted `cargo nextest` invocations). +- Compare stored trace fixtures before/after major changes. + +## Risks +- Rename churn → land directory moves with matching import fixes. +- Runtime growth → monitor `just test`; consider parallel pytest if needed. +- Coverage noise → keep coverage optional until numbers stabilise. diff --git a/design-docs/test-suite-improvement-plan.status.md b/design-docs/test-suite-improvement-plan.status.md index 6b3a756..8f68080 100644 --- a/design-docs/test-suite-improvement-plan.status.md +++ b/design-docs/test-suite-improvement-plan.status.md @@ -1,26 +1,12 @@ -# Test Suite Improvement Plan Status +# Test Suite Improvement Status Snapshot -## Stage Summary -- ✅ Stage 0 – Baseline captured by ADR 0003 and the initial improvement plan. -- ✅ Stage 1 – Layout Consolidation: directory moves completed, test commands - updated, README added, and `just test` now runs the Rust and Python harnesses - separately. -- ✅ Stage 2 – Bootstrap & Output Coverage: unit tests now exercise - `TraceSessionBootstrap` helpers (directory/format/argv handling) and - `TraceOutputPaths::configure_writer`, with `just test` covering the new cases. -- ✅ Stage 3 – Activation Guard Rails: added unit tests around - `ActivationController` covering activation start, non-matching frames, and - deactivation behaviour; existing runtime integration tests continue to pass. -- ✅ Stage 4 – Python Unit Coverage: added `tests/python/unit/test_session_helpers.py` - for facade utilities and introduced `tests/python/support` for shared - fixtures; updated monitoring tests to use the helper directory builder. -- ✅ Stage 5 – CI & Coverage Instrumentation: CI now runs the split Rust/Python - test jobs plus a non-blocking coverage job that reuses `just coverage`, uploads - LCOV/XML/JSON artefacts, and posts a per-PR summary comment. -- ✅ Stage 6 – Cleanup & Documentation: ADR 0003 is now Accepted, top-level - docs describe the testing/coverage workflow, and the tests README references - the CI coverage comment for contributors. +## Complete +- ✅ Stage 1: Tests relocated (`tests/python`, `tests/rust`), new README, tooling updated. 
+- ✅ Stage 2: Bootstrap/output helpers now unit-tested. +- ✅ Stage 3: Activation controller + synthetic filename tests landed. +- ✅ Stage 4: Python unit suite and shared fixtures in place. +- ✅ Stage 5: CI split into Rust/Python jobs with optional coverage comment. +- ✅ Stage 6: ADR updated, docs refreshed, baseline traces stored. -## Next Actions -Plan complete; monitor coverage baselines and propose enforcement thresholds in -a follow-up task. +## Remaining work +- None. Keep monitoring coverage numbers for drift. diff --git a/design-docs/value-capture.md b/design-docs/value-capture.md index 4523806..4da232c 100644 --- a/design-docs/value-capture.md +++ b/design-docs/value-capture.md @@ -1,353 +1,33 @@ -Implement full variable capture in codetracer-python-recorder. Add a -comprehensive test suite. Here is the spec for the task and the tests: - -# Python Tracing Recorder: Capturing All Visible Variables at Each Line - -## Overview of Python Variable Scopes - -In CPython, the accessible variables at a given execution point consist of: - -* Local variables of the current function or code block (including parameters). - -* Closure (nonlocal) variables that come from enclosing functions (if any). - -* Global variables defined at the module level (the current module’s namespace). - -(Built-ins are also always accessible if not shadowed, but they are usually not included in “visible variables” snapshots for tracing.) - -Each executing frame in CPython carries these variables in its namespace. To capture a snapshot of all variables accessible at a line, we need to inspect the frame’s environment, combining locals, nonlocals, and globals. This must work for any code construct (functions, methods, comprehensions, class bodies, etc.) under CPython 3.12 and 3.13. - -## Using the CPython C API (via PyO3) to Get Variables - -1. **Access the current frame**: The sys.monitoring API’s line event callback does not directly provide a frame object. We can obtain the current PyFrameObject via the C API. Using PyO3’s FFI, you can call: -* `PyEval_GetFrame()` - return current thread state's frame, NULL if no frame is executing -* `PyThreadState_GetFrame(PyThreadState *tstate)` - return a given thread state's frame, NULL if on frame is currently executing. -This yields the top-of-stack frame – if your callback is a C function, that should be the frame of the user code. If your callback is a Python function, you may need frame.f_back to get the user code’s frame.) - -2. **Get all local and closure variables**: Once you have the `PyFrameObject *frame`, retrieve the frame’s local variables mapping. In Python 3.12+, `frame.f_locals` is a proxy that reflects both local variables and any closure (cell/free) variables with their current values. In C, you can use `PyFrame_GetLocals(frame)` - -3. **Get global variables**: The frame’s globals are in `frame.f_globals`. You can obtain this dictionary via `PyFrame_GetGlobals(frame)`. This is the module’s global namespace. - -4. Encode them in the trace state. You can use the function `encode_value` to encode each one of those variables in a format suitable for recording and then record them using the capabilities provided by `runtime_tracing` crate. - -## Important Details and Edge Cases - -* **Closure (free) variables**: In modern CPython, closure variables are handled seamlessly via the frame’s locals proxy. You do not need to separately fetch function.__closure__ or outer frame variables – the frame’s local mapping already includes free vars. 
The PEP for frame proxies explicitly states that each access to `frame.f_locals` yields a mapping of local and closure variable names to their current values. This ensures that in a nested function, variables from an enclosing scope (nonlocals) appear in the inner frame’s locals mapping (bound to the value in the closure cell). - -* **Comprehensions and generators**: In Python 3, list comprehensions, generator expressions, and the like are implemented as separate function frames. The above approach still works since those have their own frames (with any needed closure variables included similarly). Just grab that frame’s locals and globals as usual. - -* **Class bodies and module level**: A class body or module top-level code is executed in an unoptimized frame where `locals == globals` (module) or a new class namespace dict. You need to make sure that you don't record variables twice! Here's a sketch how to do this: -```rust -use pyo3::prelude::*; -use pyo3::ffi; -use std::ptr; - -pub unsafe fn locals_is_globals_ffi(_py: Python<'_>, frame: *mut ffi::PyFrameObject) -> PyResult { - // Ensure f_locals exists and is synced with fast-locals - if ffi::PyFrame_FastToLocalsWithError(frame) < 0 { - return Err(PyErr::fetch(_py)); - } - let f_locals = (*frame).f_locals; - let f_globals = (*frame).f_globals; - Ok(!f_locals.is_null() && ptr::eq(f_locals, f_globals)) -} -``` - -* **Builtins**: Typically, built-in names (from frame.f_builtins) are implicitly accessible if not shadowed, but they are usually not included in a variables snapshot. You should ignore the builtins - -* **Name resolution order**: If needed, CPython 3.12 introduced PyFrame_GetVar(frame, name) which will retrieve a variable by name as the interpreter would – checking locals (including cells), then globals, then builtins. This could be used to fetch specific variables on demand. However, for capturing all variables, it’s more efficient to pull the mappings as described above rather than querying names one by one. - - -## Putting It Together - -In your Rust/PyO3 tracing recorder, for each line event you can do something like: - -* Get the current frame (`frame_obj`). - -* Get the locals proxy via `PyFrame_GetLocals`. Iterate over each object, construct its representation via `encode_value` and then add it to the trace. - -* If locals != globals, get the globals dict (`globals_dict = PyFrame_GetGlobals(frame_obj)`) and process it just like the locals - - -By using these facilities via PyO3, you can reliably capture all visible variables at each line of execution in your tracing recorder. - -## References - -Python C-API – Frame Objects: functions to access frame attributes (locals, globals, etc.). - -PEP 667 – Frame locals proxy (Python 3.13): frame.f_locals now reflects local + cell + free variables’ values. - -PEP 558 – Defined semantics for locals(): introduced Py - - -# Comprehensive Test Suite for Python Tracing Recorder - -This test suite is designed to verify that a tracing recorder (using sys.monitoring and frame inspection) correctly captures all variables visible at each executable line of Python code. Each test covers a distinct scope or visibility scenario in Python. The tracer should record every variable that is in scope at that line, ensuring no visible name is missed. We include functions, closures, globals, class scopes, comprehensions, generators, exception blocks, and more, to guarantee full coverage of Python's LEGB (Local, Enclosing, Global, Built-in) name resolution rules. 
- -Each test case below provides a brief description of what it covers, followed by a code snippet (Python script) that exercises that behavior. No actual tracing logic is included – we only show the source code whose execution should be monitored. The expectation is that at runtime, the tracer’s LINE event will fire on each line and the recorder will capture all variables accessible in that scope at that moment. - -## 1. Simple Function: Parameters and Locals - -**Scope**: This test focuses on a simple function with a parameter and local variables. It verifies that the recorder sees function parameters and any locals on each line inside the function. On entering the function, the parameter should be visible; as lines execute, newly assigned local variables become visible too. This ensures that basic function scope is handled. - -```py -def simple_function(x): - a = 1 # Parameter x is visible; local a is being defined - b = a + x # Locals a, b and parameter x are visible (b defined this line) - return a, b # Locals a, b and x still visible at return - - -# Test the function -result = simple_function(5) -``` - -_Expected_: The tracer should capture x (parameter) and then a and b as they become defined in simple_function. - -## 2. Nested Functions and Closure Variables (nonlocal) - -**Scope**: This test covers nested functions, where an inner function uses a closure variable from its outer function. We verify that variables in the enclosing (nonlocal) scope are visible inside the inner function, and that the nonlocal statement allows the inner function to modify the outer variable. Both the outer function’s locals and the inner function’s locals (plus closed-over variables) should be captured appropriately. - -```py -def outer_func(x): - y = 1 - def inner_func(z): - nonlocal y # Declare y from outer_func as nonlocal - w = x + y + z # x (outer param), y (outer var), z (inner param), w (inner local) - y = w # Modify outer variable y - return w - total = inner_func(5) # Calls inner_func, which updates y - return y, total # y is updated in outer scope -result = outer_func(2) -``` - -_Expected_: Inside `inner_func`, the tracer should capture x, y (from outer scope), z, and w at each line. In `outer_func`, it should capture x, y, and later the returned total. This ensures enclosing scope variables are handled (nonlocal variables are accessible to nested functions). - -## 3. Global and Module-Level Variables - -**Scope**: This test validates visibility of module-level (global) variables. It defines globals and uses them inside a function, including modifying a global with the global statement. We ensure that at each line, global names are captured when in scope (either at the module level or when referenced inside a function). - -```py -GLOBAL_VAL = 10 -counter = 0 - -def global_test(): - local_copy = GLOBAL_VAL # Access a global variable - global counter - counter += 1 # Modify a global variable - return local_copy, counter - -# Use the function and check global effects -before = counter -result = global_test() -after = counter -``` - -_Expected_: The tracer should capture *GLOBAL_VAL* and counter as globals on relevant lines. At the module level, GLOBAL_VAL, counter, before, after, etc. are in the global namespace. Inside global_test(), it should capture local_copy and see GLOBAL_VAL as a global. The global counter declaration ensures counter is treated as global in that function and its updated value remains in the module scope. - -## 4. 
Class Definition Scope and Metaclass - -**Scope:** This test targets class definition bodies, including the effect of a metaclass. When a class body executes, it has a local namespace that becomes the class’s attribute dictionary. We verify that variables assigned in the class body are captured, and that references to those variables or to globals are handled. Additionally, we include a metaclass to ensure that class creation via a metaclass is also traced. - -```python -CONSTANT = 42 - -class MetaCounter(type): - count = 0 - def __init__(cls, name, bases, attrs): - MetaCounter.count += 1 # cls, name, bases, attrs visible; MetaCounter.count updated - super().__init__(name, bases, attrs) - -class Sample(metaclass=MetaCounter): - a = 10 - b = a + 5 # uses class attribute a - print(a, b, CONSTANT) # can access class attrs a, b and global CONSTANT - def method(self): - return self.a + self.b - -# After class definition, metaclass count should have incremented -instances = MetaCounter.count -``` - -**Expected:** Within `MetaCounter`, the tracer should capture class-level attributes like `count` as well as method parameters (`cls`, `name`, `bases`, `attrs`) during class creation. In `Sample`’s body, it should capture `a` once defined, then `b` and `a` on the next line, and even allow access to `CONSTANT` (a global) during class body execution. After definition, `Sample.a` and `Sample.b` exist as class attributes (not directly as globals outside the class). The tracer should handle the class scope like a local namespace for that block. - -## 5. Lambdas and Comprehensions (List, Set, Dict, Generator) - -**Scope:** This combined test covers lambda expressions and various comprehensions, each of which introduces an inner scope. We ensure the tracer captures variables inside these expressions, including any outer variables they close over and the loop variables within comprehensions. Notably, in Python 3, the loop variable in a comprehension is local to the comprehension and not visible outside. - -Lambda: Tests an inline lambda function with its own parameter and expression. - -List Comprehension: Uses a loop variable internally and an external variable. - -Set & Dict Comprehensions: Similar scope behavior with their own loop variables. - -Generator Expression: A generator comprehension that lazily produces values. - -```python -factor = 2 -double = lambda y: y * factor # 'y' is local parameter, 'factor' is captured from outer scope - -squares = [n**2 for n in range(3)] # 'n' is local to comprehension, not visible after -scaled_set = {n * factor for n in range(3)} # set comprehension capturing outer 'factor' -mapping = {n: n*factor for n in range(3)} # dict comprehension with local n -gen_exp = (n * factor for n in range(3)) # generator expression (lazy evaluated) -result_list = list(gen_exp) # force generator to evaluate -``` - -**Expected:** Inside the lambda, `y` (parameter) and `factor` (enclosing variable) are visible to the tracer. In each comprehension, the loop variable (e.g., `n`) and any outer variables (`factor`) should be captured during the comprehension's execution. After the comprehension, the loop variable is no longer defined (e.g., `n` is not accessible outside the list comprehension). The generator expression has a similar scope to a comprehension; its variables should be captured when it's iterated. All these ensure the recorder handles anonymous function scopes and comprehension internals. - -## 6. 
Generators and Coroutines (async/await) - -**Scope:** This test covers a generator function and an async coroutine function. Generators use yield to produce values and suspend execution, while async coroutines use await. We ensure that local variables persist across yields/awaits and remain visible when execution resumes (on each line hit). This verifies that the tracer captures the state in suspended functions. - -```python -def counter_gen(n): - total = 0 - for i in range(n): - total += i - yield total # At yield: i and total are visible and persisted across resumes - return total - -import asyncio -async def async_sum(data): - total = 0 - for x in data: - total += x - await asyncio.sleep(0) # At await: x and total persist in coroutine - return total - -# Run the generator -gen = counter_gen(3) -gen_results = list(gen) # exhaust the generator - -# Run the async coroutine -coroutine_result = asyncio.run(async_sum([1, 2, 3])) -``` - -**Expected:** In `counter_gen`, at each yield line the tracer should capture `i` and `total` (and after resumption, those values are still available). In `async_sum`, at the await line, `x` and `total` are captured and remain after the await. The tracer must handle the resumption of these functions (triggered by `PY_RESUME` events) and still see previously defined locals. This test ensures generator state and coroutine state do not lose any variables between pauses. - -## 7. Try/Except/Finally and With Statement - -**Scope:** This test combines exception handling blocks and context manager usage. It verifies that the tracer captures variables introduced in a try/except flow (including the exception variable, which has a limited scope) as well as in a with statement context manager. We specifically ensure the exception alias is only visible inside the except block, and that variables from try, else, and finally blocks, as well as the with target, are all accounted for. - -```python -def exception_and_with_demo(x): - try: - inv = 10 / x # In try: 'inv' defined if no error - except ZeroDivisionError as e: - error_msg = f"Error: {e}" # In except: 'e' (exception) and 'error_msg' are visible - else: - inv += 1 # In else: 'inv' still visible here - finally: - final_flag = True # In finally: 'final_flag' visible (e is out of scope here) - - with open(__file__, 'r') as f: - first_line = f.readline() # Inside with: 'f' (file handle) and 'first_line' visible - return locals() # return all locals for inspection - -# Execute with a case that triggers exception and one that does not -result1 = exception_and_with_demo(0) # triggers ZeroDivisionError -result2 = exception_and_with_demo(5) # normal execution -``` - -**Expected:** In the except block, the tracer should capture the exception object name (`e`) and any locals like `error_msg`, but after the block `e` goes out of scope (no longer in `locals()`). The else block runs when no exception, and the tracer sees `inv` there. The finally block executes in both cases, with `final_flag` visible. During the with block, the tracer captures the context manager’s target (`f`) and any inner variables (`first_line`). This test ensures all branches of try/except/else/finally and the scope entering/exiting a with are handled. - -## 8. Decorators and Function Wrappers - -**Scope:** This test involves function decorators, which themselves often use closures. We have a decorator that closes over a free variable and wraps a function. 
The goal is to ensure that when the decorated function is defined and called, the tracer captures variables both in the decorator’s scope and in the wrapped function’s scope. This covers the scenario of variables visible during decoration and invocation. - -```python -setting = "Hello" - -def my_decorator(func): - def wrapper(*args, **kwargs): - # Inside wrapper: 'args', 'kwargs', and 'setting' from outer scope are visible - print("Decorator wrapping with setting:", setting) - return func(*args, **kwargs) - return wrapper - -@my_decorator -def greet(name): - message = f"Hi, {name}" # Inside greet: 'name' and 'message' are locals - return message - -# Call the decorated function -output = greet("World") -``` - -**Expected:** When defining `greet`, the decorator `my_decorator` is applied. The tracer should capture that process: inside `my_decorator`, the `func` parameter and the outer variable `setting` are visible. Within `wrapper`, on each call, `args`, `kwargs`, and the closed-over `setting` are visible to the tracer. Inside `greet`, normal function locals apply (`name`, `message`). This test ensures decorated functions don’t hide any variables from the tracer (it must trace through the decorator and the function execution). - -## 9. Dynamic Execution (eval and exec) - -**Scope:** This test checks dynamic creation and access of variables using `eval()` and `exec()`. The recorder should capture variables introduced by an exec at the moment they become available, as well as usage of variables via eval strings. We ensure that even dynamically created names or accessed names are seen by the tracer just like normal variables. - -```python -expr_code = "dynamic_var = 99" -exec(expr_code) # Executes code, defining a new variable dynamically -check = dynamic_var + 1 # Uses the dynamically created variable - -def eval_test(): - value = 10 - formula = "value * 2" - result = eval(formula) # 'value' (local) is accessed dynamically via eval - return result -out = eval_test() -``` - -**Expected:** At the `exec(expr_code)` line, the tracer should capture that `dynamic_var` gets created in the global scope. On the next line, `dynamic_var` is visible and used. Inside `eval_test()`, when `eval(formula)` is executed, the tracer should see the local `value` (and `formula`) in that frame, confirming that eval could access `value`. All dynamically introduced or accessed names should be recorded as they appear. - -## 10. Import Statements and Visibility - -**Scope:** This test covers the effect of import statements on variable visibility. Importing modules or names introduces new variables (module objects or imported names) into the local or global namespace. We test both a global import and a local (within-function) import to ensure the tracer captures these names when they become available. - -```python -import math # Import at module level introduces 'math' in globals - -def import_test(): - import os # Import inside function introduces 'os' as a local name - constant = math.pi # Can use global import inside function - cwd = os.getcwd() # Uses the locally imported module - return constant, cwd - -val, path = import_test() -``` - -**Expected:** After the top-level import `math`, the tracer should list `math` as a new global variable. Inside `import_test()`, after the `import os` line, `os` should appear as a local variable in that function’s scope. The usage of `math.pi` shows that globals remain accessible in the function, and the use of `os.getcwd()` confirms `os` is in the local namespace. 
This test ensures imported names are captured at the appropriate scope (global or local) when they are introduced. - -## 11. Built-in Scope (Builtins) - -**Scope:** This test highlights built-in names, which are always available via Python’s built-in scope (e.g., `len`, `print`, `ValueError`). The tracer is not required to explicitly list all built-ins at each line (as that would be overwhelming), but we include this case to note that built-in functions or constants are accessible in any scope. We ensure usage of a built-in is traced like any other variable access, although the recorder -may choose not to list the entire built-in namespace. - -```python -def builtins_test(seq): - n = len(seq) # 'len' is a built-in function - m = max(seq) # 'max' is another built-in - return n, m - -result = builtins_test([5, 3, 7]) -``` - -**Expected:** In the `builtins_test` function, calls to `len` and `max` are made. The tracer would see `seq`, `n`, and `m` as local variables, and while `len`/`max` are resolved from the built-in scope, the recorder may not list them as they are implicitly available (built-ins are found after global scope in name resolution). The important point is that using built-ins does not introduce new names in the user-defined scopes. This test is mostly a note that built-in scope exists and built-in names are always accessible (the tracer could capture them, but it's typically unnecessary to record every built-in name). - ---- - -**Conclusion:** The above tests collectively cover all major visibility scenarios in Python. By running a tracing recorder with these snippets, one can verify that at every executable line, the recorder correctly identifies all variables that are in scope (function locals, closure variables, globals, class locals, comprehension temporaries, exception variables, etc.). This comprehensive coverage ensures the tracing tool is robust against Python’s various scoping rules and constructs. - -# General Rules - -* This spec is for `/codetracer-python-recorder` project and NOT for `/codetracer-pure-python-recorder` -* Code and tests should be added to `/codetracer-python-recorder/src/runtime_tracer.rs` -* Performance is important. Avoid using Python modules and functions and prefer PyO3 methods including the FFI API. -* If you want to run Python do it like so `uv run python` This will set up the right venv. Similarly for running tests `uv run pytest`. -* After every code change you need to run `just dev` to make sure that you are testing the new code. Otherwise some tests might run against the old code - -* Avoid defensive programming: when encountering edge cases which are - not explicitly mentioned in the specification, the default behaviour - should be to crash (using `panic!`). We will only handle them after - we receive a report from a user which confirms that the edge case - does happen in real life. -* Do not make any code changes to unrelated parts of the code. The only callback that should change behaviour is `on_line` -* If the code has already implemented part of the specification described here find out what is missing and implement that -* If a test fails repeatedly after three attempts to fix the code STOP. Let a human handle it. DON'T DELETE TESTS!!! -* When writing tests be careful with concurrency. If two tests run at the same time using the same Python interpreter (or same Rust process?) they will both try to register callbacks via sys.monitoring and could deadlock. 
-* If you want to test Rust code without using just, use `cargo nextest`, not `cargo test`
+# Value Capture Plan (Bite Size)
+
+## Goal
+For every line event, record every variable the frame can see (locals, nonlocals, globals) without crashing CPython.
+
+## How to grab data
+1. Get the active frame with `PyEval_GetFrame()` (or `PyThreadState_GetFrame`).
+2. Call `PyFrame_FastToLocalsWithError` before reading the mappings and propagate any error it reports.
+3. Pull locals via `PyFrame_GetLocals`; on 3.12+ the mapping already includes closure (cell/free) variables.
+4. Pull globals via `PyFrame_GetGlobals`; skip them when they are the same dict as locals (module-level frames) so nothing is recorded twice.
+5. Encode each entry through our existing `encode_value` helper and write it to the trace (see the sketch at the end of this plan).
+6. Keep builtins out of the snapshot.
+
+## Edge rules
+- Frames with synthetic filenames are already rejected by the activation logic; value capture only runs when the tracer says the code is traceable.
+- Frames for comprehensions, generators, and class bodies work the same way: each has its own locals mapping.
+- Watch for locals == globals at module level (class bodies get their own namespace dict) and never record the same mapping twice.
+- If encoding fails, surface a `RecorderError` and mark the event partial.
+
+## Safety
+- Hide all raw pointer work inside a dedicated `frame_inspector` module with RAII guards.
+- Ensure helpers run under the GIL and clean up borrowed references.
+
+## Test checklist
+- Simple function: parameters and locals appear as they are assigned.
+- Nested closure: inner frame sees outer variables via locals proxy.
+- Globals + `global` statement: updates reflect in module scope.
+- Class body with metaclass: captures class attributes during definition.
+- Comprehensions, lambdas, generator expressions: loop vars captured during execution, not leaked after.
+- Generator + async def: values persist across yield/await boundaries.
+- Exception handler: the `except ... as err` alias is visible inside the handler and goes out of scope afterwards.
+- Context manager `with` target and walrus assignments show up once bound.
+- Ensure cycle detection prevents infinite recursion when encoding self-referential structures.
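+
+## Sketch
+A minimal sketch of the six steps above, assuming PyO3's `Bound` API and the same FFI bindings the original spec sketch relied on (`PyEval_GetFrame`, `PyFrame_FastToLocalsWithError`, `PyFrame_GetLocals`, `PyFrame_GetGlobals`, plus `PyMapping_Items` for iteration). The function name `capture_visible_variables` and the `record` callback are illustrative stand-ins; the real recorder would route each pair through `encode_value` and the trace writer, whose signatures are not shown here.
+
+```rust
+use pyo3::ffi;
+use pyo3::prelude::*;
+
+/// Hand every visible (name, value) pair of the currently executing frame to `record`.
+/// Illustrative only: the real code would live behind the `frame_inspector` module.
+pub fn capture_visible_variables<'py>(
+    py: Python<'py>,
+    mut record: impl FnMut(&str, &Bound<'py, PyAny>) -> PyResult<()>,
+) -> PyResult<()> {
+    unsafe {
+        // Step 1: borrowed pointer to the active frame; nothing to capture if no code runs.
+        let frame = ffi::PyEval_GetFrame();
+        if frame.is_null() {
+            return Ok(());
+        }
+
+        // Step 2: sync fast locals so the mapping below is current; propagate failures.
+        if ffi::PyFrame_FastToLocalsWithError(frame) < 0 {
+            return Err(PyErr::fetch(py));
+        }
+
+        // Steps 3-4: both getters return strong references, so let `Bound` own the decref.
+        let locals = Bound::from_owned_ptr_or_err(py, ffi::PyFrame_GetLocals(frame))?;
+        let globals = Bound::from_owned_ptr_or_err(py, ffi::PyFrame_GetGlobals(frame))?;
+
+        // Module-level frames share one dict for locals and globals; visit it only once.
+        let mappings = if locals.as_ptr() == globals.as_ptr() {
+            vec![&locals]
+        } else {
+            vec![&locals, &globals]
+        };
+
+        // Steps 5-6: snapshot each mapping as a list of (name, value) pairs.
+        // `f_builtins` is deliberately never touched.
+        for mapping in mappings {
+            let items =
+                Bound::from_owned_ptr_or_err(py, ffi::PyMapping_Items(mapping.as_ptr()))?;
+            // Variable names are assumed to be `str`, which holds for ordinary frames.
+            for (name, value) in items.extract::<Vec<(String, Py<PyAny>)>>()? {
+                record(&name, value.bind(py))?;
+            }
+        }
+    }
+    Ok(())
+}
+```
+
+The sketch is meant to run inside the `on_line` callback, which already holds the GIL; wiring `record` to `encode_value` and handling encoding failures per the edge rules is left to the implementation.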