Copilot AI commented Sep 18, 2025

Summary

This PR introduces __slots__ to 12 frequently-instantiated lightweight classes in cassandra/connection.py to reduce per-instance memory overhead and improve performance. This is the first phase of memory optimization improvements as discussed previously.

Motivation

Many small objects in connection.py are created per connection or per request cycle. Each Python object without __slots__ maintains a per-instance __dict__ (and sometimes a __weakref__) that adds significant overhead (~56+ bytes depending on Python build). For frequently instantiated objects like endpoints, frames, and utility classes, this overhead can accumulate to substantial memory usage.
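The overhead described above can be observed directly. The following is a minimal sketch using two hypothetical stand-in classes (`PlainTimer` and `SlottedTimer` are illustrative names, not the driver's code); it only checks which per-instance structures exist and makes no claim about exact byte counts.

```python
import weakref


class PlainTimer:
    """Hypothetical stand-in without __slots__ (not the driver's Timer)."""

    def __init__(self, end, callback):
        self.end = end
        self.callback = callback
        self.canceled = False


class SlottedTimer:
    """Hypothetical stand-in with the same attributes declared as slots."""

    __slots__ = ('end', 'callback', 'canceled')

    def __init__(self, end, callback):
        self.end = end
        self.callback = callback
        self.canceled = False


plain = PlainTimer(1.0, None)
slotted = SlottedTimer(1.0, None)

# A regular instance carries a __dict__; a slotted one does not.
plain_has_dict = hasattr(plain, '__dict__')
slotted_has_dict = hasattr(slotted, '__dict__')

# Regular instances support weak references out of the box.
plain_ref = weakref.ref(plain)

# Without '__weakref__' in __slots__, weak references are rejected.
try:
    weakref.ref(slotted)
    slotted_supports_weakref = True
except TypeError:
    slotted_supports_weakref = False
```

One consequence worth keeping in mind: if any of the slotted classes is weakly referenced elsewhere, `'__weakref__'` would have to be listed in its `__slots__`.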

Changes

Added __slots__ to the following classes with their respective slot definitions:

  • EndPoint: () - Base class with empty slots for consistency
  • DefaultEndPoint: ('_address', '_port')
  • SniEndPoint: ('_proxy_address', '_index', '_resolved_address', '_port', '_server_name', '_ssl_options')
  • UnixSocketEndPoint: ('_unix_socket_path',)
  • _Frame: ('version', 'flags', 'stream', 'opcode', 'body_offset', 'end_pos')
  • ContinuousPagingSession: ('stream_id', 'decoder', 'row_factory', 'connection', '_condition', '_stop', '_page_queue', '_state', 'released')
  • ShardAwarePortGenerator: () - Utility class with classmethods only
  • _ConnectionIOBuffer: ('_io_buffer', '_cql_frame_buffer', '_connection', '_segment_consumed')
  • ResponseWaiter: ('connection', 'pending', 'fail_on_error', 'error', 'responses', 'event')
  • HeartbeatFuture: ('_exception', '_event', 'connection', 'owner')
  • Timer: ('end', 'callback', 'canceled')
  • TimerManager: ('_queue', '_new_timers')

Technical Details

  • Fixed Timer class initialization: Moved canceled = False from class variable to proper instance initialization to work correctly with __slots__
  • Fixed _ConnectionIOBuffer: Removed conflicting class variables and ensured all slot attributes are properly initialized in __init__
  • Maintained inheritance: EndPoint base class uses empty slots, allowing subclasses to define their own slots without conflicts
  • Preserved all functionality: No API changes, all existing methods and properties work identically
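The Timer fix above is needed because Python rejects a class that declares a slot and a class variable with the same name. The sketch below illustrates that rule with hypothetical stand-in classes (`ConflictingTimer` and `FixedTimer` are assumptions for illustration, not the actual driver code).

```python
def define_conflicting_timer():
    # The class is defined inside a function so the error can be caught
    # when the function is called rather than at import time.
    class ConflictingTimer:
        __slots__ = ('end', 'callback', 'canceled')
        canceled = False  # same name as a slot: rejected at class creation

    return ConflictingTimer


try:
    define_conflicting_timer()
    conflict_message = None
except ValueError as exc:
    conflict_message = str(exc)


class FixedTimer:
    """Hypothetical corrected version: the default moves into __init__."""

    __slots__ = ('end', 'callback', 'canceled')

    def __init__(self, end, callback):
        self.end = end
        self.callback = callback
        self.canceled = False  # per-instance default instead of class variable


timer = FixedTimer(10.0, lambda: None)
```

Moving the default into `__init__` keeps the observable behavior (a new timer starts out not canceled) while satisfying the slot rules.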

Benefits

  • Memory Reduction: Eliminates per-instance __dict__ overhead, reducing memory footprint by ~56+ bytes per object
  • Performance Improvement: Better cache locality and reduced garbage collection pressure
  • Safety: Prevents accidental dynamic attribute creation, making attribute errors surface early
  • Scalability: Memory savings multiply with the number of connections and requests

Testing

  • Added comprehensive test suite (tests/unit/test_connection_slots.py) to verify __slots__ implementation
  • Verified all classes instantiate correctly and have no __dict__ attributes
  • Confirmed dynamic attribute assignment properly raises AttributeError
  • Validated inheritance and polymorphism still work correctly
  • Tested all existing functionality remains intact
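The checks listed above could look roughly like the following. This is a sketch only: `StandInEndPoint` is a hypothetical class used so the example runs without the driver installed, and the contents of tests/unit/test_connection_slots.py are not shown in this PR description. A real test would import the classes from cassandra.connection instead.

```python
import unittest


class StandInEndPoint:
    """Hypothetical slotted class standing in for a driver endpoint."""

    __slots__ = ('_address', '_port')

    def __init__(self, address, port=9042):
        self._address = address
        self._port = port


class SlotsRegressionTest(unittest.TestCase):
    def test_no_instance_dict(self):
        endpoint = StandInEndPoint('127.0.0.1')
        self.assertFalse(hasattr(endpoint, '__dict__'))

    def test_dynamic_attribute_rejected(self):
        endpoint = StandInEndPoint('127.0.0.1')
        with self.assertRaises(AttributeError):
            endpoint.unexpected = 1


def run_suite():
    # Build and run the suite programmatically so the caller gets the result.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlotsRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```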

Compatibility

This is a purely internal memory optimization with no breaking changes. All existing code will continue to work exactly as before, but with reduced memory overhead for these lightweight objects.

Future Work

This sets the foundation for subsequent memory optimization PRs that can:

  • Refactor buffer management to eliminate BytesIO copies
  • Replace request bookkeeping dict with indexed array structures
  • Introduce memoryview-based frame parsing

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • proxy.example.com
    • Triggering command: `python -c

# Test inheritance behavior with slots
from cassandra.connection import EndPoint, DefaultEndPoint, SniEndPoint, UnixSocketEndPoint

print('=== Inheritance and Polymorphism Tests ===')

# Create instances of all endpoint types
endpoints = [
    DefaultEndPoint('127.0.0.1', 9042),
    SniEndPoint('proxy.example.com', 'server.example.com', 9042),
    UnixSocketEndPoint('/tmp/cassandra.sock')
]

# Test that they are all instances of EndPoint
for ep in endpoints:
    print(f'{ep.__class__.__name__}: isinstance(EndPoint)={isinstance(ep, EndPoint)}, address={ep.address}')
    print(f' - has __dict__: {hasattr(ep, "__dict__")}')
    print(f' - __slots__: {getattr(ep.__class__, "__slots__", "NOT FOUND")}')

# Test EndPoint base class
print(f'EndPoint.__slots__: {getattr(EndPoint, "__slots__", "NOT FOUND")}')

# Test that resolve() method works (polymorphism)
for ep in endpoints:
    try:
        resolved = ep.resolve()
        print(f'{ep.__class__.__name__}.resolve() -> {resolved}')
    except Exception as e:
        print(f'{ep.__class__.__name__}.resolve() failed: {e}')

print('\nInheritance tests completed successfully!')` (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

This pull request was created as a result of the following prompt from Copilot chat.

Summary

Introduce __slots__ to several lightweight, frequently-instantiated classes to reduce per-instance memory overhead and object dictionary allocations. This is an incremental memory optimization step discussed previously (adding __slots__ as the first phase of memory utilization improvements).

Rationale

Many small objects in connection.py are created per connection or per request cycle. Each Python object without __slots__ maintains a per-instance __dict__ (and sometimes a __weakref__) that adds overhead (~56+ bytes depending on Python build). Adding __slots__ for classes with a fixed, known attribute set reduces memory pressure, improves cache locality, and can marginally reduce GC overhead.

Classes targeted are ones that:

  • Have a stable, limited attribute set
  • Are not designed for dynamic attribute injection
  • Are not subclasses of threading.Thread (Thread relies on dynamic attributes)
  • Do not appear to rely on pickling with dynamic attributes

Excluded: Connection, ConnectionHeartbeat (subclasses Thread), and other complex or dynamic classes to avoid risk.
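The Thread exclusion can be illustrated with a short experiment. `threading.Thread` does not define `__slots__`, so a subclass still inherits a per-instance `__dict__` even if it declares slots of its own. The sketch below uses a hypothetical `SlottedHeartbeat` class (not the driver's ConnectionHeartbeat) and never starts the thread.

```python
import threading


class SlottedHeartbeat(threading.Thread):
    """Hypothetical Thread subclass that declares a slot of its own."""

    __slots__ = ('interval',)

    def __init__(self, interval):
        super().__init__(daemon=True)
        self.interval = interval


heartbeat = SlottedHeartbeat(30)  # constructed only, never started

# The instance keeps the __dict__ inherited from Thread.
still_has_dict = hasattr(heartbeat, '__dict__')

# Only the newly declared attribute lives in a slot.
interval_in_dict = 'interval' in heartbeat.__dict__
```

Since the dictionary remains either way, slotting such a class would save very little, which fits the decision to leave it out.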

Changes Proposed

Add __slots__ to the following classes in cassandra/connection.py:

  1. EndPoint
  2. DefaultEndPoint
  3. SniEndPoint
  4. UnixSocketEndPoint
  5. _Frame
  6. ContinuousPagingSession
  7. ShardAwarePortGenerator
  8. _ConnectionIOBuffer
  9. ResponseWaiter
  10. HeartbeatFuture
  11. Timer
  12. TimerManager

Slot Definitions (proposed)

  • EndPoint: no instance attributes currently set directly; define empty __slots__ = () for consistency and to prevent accidental __dict__ creation in subclasses that also use __slots__.
  • DefaultEndPoint: ('_address', '_port')
  • SniEndPoint: ('_proxy_address', '_index', '_resolved_address', '_port', '_server_name', '_ssl_options')
  • UnixSocketEndPoint: ('_unix_socket_path',)
  • _Frame: ('version', 'flags', 'stream', 'opcode', 'body_offset', 'end_pos')
  • ContinuousPagingSession: ('stream_id','decoder','row_factory','connection','_condition','_stop','_page_queue','_state','released')
  • ShardAwarePortGenerator: ('start_port','end_port')
  • _ConnectionIOBuffer: ('_io_buffer','_cql_frame_buffer','_connection','_segment_consumed') (Drops implicit ability to add new attrs)
  • ResponseWaiter: ('connection','pending','fail_on_error','error','responses','event')
  • HeartbeatFuture: ('_exception','_event','connection','owner')
  • Timer: ('end','callback','canceled')
  • TimerManager: ('_queue','_new_timers')
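The reason for giving the EndPoint base class an empty slot tuple can be shown with a small comparison. In this sketch all four class names are hypothetical stand-ins chosen for illustration: a slotted subclass only avoids the instance dictionary when every class in its hierarchy is slotted too.

```python
class UnslottedBase:
    """Hypothetical base class without __slots__."""


class SlottedBase:
    """Hypothetical base class with empty __slots__."""

    __slots__ = ()


class ChildOfUnslotted(UnslottedBase):
    __slots__ = ('_address', '_port')

    def __init__(self, address, port):
        self._address = address
        self._port = port


class ChildOfSlotted(SlottedBase):
    __slots__ = ('_address', '_port')

    def __init__(self, address, port):
        self._address = address
        self._port = port


# The unslotted base reintroduces a per-instance __dict__.
leaky = ChildOfUnslotted('127.0.0.1', 9042)
leaky_has_dict = hasattr(leaky, '__dict__')

# With an empty-slots base, the subclass stays dictionary-free.
tight = ChildOfSlotted('127.0.0.1', 9042)
tight_has_dict = hasattr(tight, '__dict__')
```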

Safety Notes

  • Verified none of these classes assign attributes outside the proposed slot sets.
  • Thread subclass (ConnectionHeartbeat) intentionally excluded.
  • If later code attempts to add new attributes dynamically, an AttributeError will surface early, making it safer to catch design drift.

Testing Guidance

  1. Run existing unit/integration test suite; no functional behavior should change.
  2. (Optional) Add a lightweight regression test ensuring instances of these classes do not expose __dict__ (e.g., assert not hasattr(obj, '__dict__')).
  3. Use a memory profiling script (optional, not included) to confirm reduced per-instance size (e.g., via sys.getsizeof and tracemalloc deltas across many instantiations).
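A profiling script along the lines of step 3 might look like the sketch below. It is an assumption-laden illustration, not the script used for this PR: `PlainFrame`, `SlottedFrame`, and `bytes_per_instance` are hypothetical names, and the measured figures include the list slot and any integer objects created per instance, so only the difference between the two classes is meaningful.

```python
import tracemalloc

FIELDS = ('version', 'flags', 'stream', 'opcode', 'body_offset', 'end_pos')


class PlainFrame:
    """Hypothetical stand-in without __slots__."""

    def __init__(self, version, flags, stream, opcode, body_offset, end_pos):
        self.version = version
        self.flags = flags
        self.stream = stream
        self.opcode = opcode
        self.body_offset = body_offset
        self.end_pos = end_pos


class SlottedFrame:
    """Hypothetical stand-in with the same fields declared as slots."""

    __slots__ = FIELDS

    def __init__(self, version, flags, stream, opcode, body_offset, end_pos):
        self.version = version
        self.flags = flags
        self.stream = stream
        self.opcode = opcode
        self.body_offset = body_offset
        self.end_pos = end_pos


def bytes_per_instance(cls, count=10_000):
    """Return the average traced allocation per instance of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    instances = [cls(4, 0, i, 8, 9, 100) for i in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances  # release the objects once measured
    return (after - before) / count


plain_bytes = bytes_per_instance(PlainFrame)
slotted_bytes = bytes_per_instance(SlottedFrame)
print(f'plain: {plain_bytes:.1f} B/instance, slotted: {slotted_bytes:.1f} B/instance')
```

Absolute numbers vary with the Python version and build, so comparing the two results on the target interpreter is more informative than relying on a fixed figure.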

Future Work

Subsequent PRs can:

  • Refactor buffer management to eliminate BytesIO copies
  • Replace request bookkeeping dict with indexed array structure
  • Introduce memoryview-based frame parsing

Implementation

Modify cassandra/connection.py accordingly. No API changes.

Please create a new branch and apply these changes.


Copilot AI changed the title [WIP] Add __slots__ to lightweight classes to reduce memory usage Add __slots__ to lightweight connection classes for memory optimization Sep 18, 2025
Copilot AI requested a review from mykaul September 18, 2025 10:40
Copilot finished work on behalf of mykaul September 18, 2025 10:40