@blink1073
Created May 27, 2026 12:58
PYTHON-5745: Consolidate logging and monitoring into a single internal API

Plan: PYTHON-5745 — Consolidate logging and monitoring into a single internal API

Context

The driver currently duplicates code at every telemetry call site: each event (command started/succeeded/failed, connection pool lifecycle, server selection, heartbeats) has two parallel blocks, one calling _debug_log(...) for structured logging and one calling listeners.publish_*() for APM event publishing. This duplication clutters the codebase and would make adding OpenTelemetry spans in PYTHON-5052 painful, since every call site would need a third block.

The solution is a unified telemetry API where a single call handles all telemetry channels simultaneously. A PoC exists in PR #2720 (command events only) using a context manager pattern.

Constraints:

  • No external behavior changes (logging output and APM events must remain identical)
  • No performance regressions (guard checks like isEnabledFor(logging.DEBUG) must be preserved)
  • Async files at pymongo/asynchronous/ are auto-generated from pymongo/synchronous/ via tools/synchro.py — edit sync files only, then regenerate async

Implementation Plan

Step 1: Create pymongo/_telemetry.py

New module with unified telemetry helpers. _CommandTelemetry is a context manager that publishes to both the logging and APM event channels; the pool and server selection helpers below may be plain functions.

_CommandTelemetry (primary focus, per PR #2720 PoC):

  • __init__: captures command_name, database_name, spec, driver_connection_id, server_connection_id, service_id, address, listeners, request_id, operation_id, client, publish_event; sets _handled = False
  • __enter__: logs STARTED and calls listeners.publish_command_start(); returns self
  • handle_succeeded(reply, speculative_hello=None): calculates duration, logs SUCCEEDED and calls listeners.publish_command_success(); sets _handled = True
  • __exit__: if an exception is propagating and _handled is False, calls _handle_failed(exc_val), which logs FAILED and calls listeners.publish_command_failure(); callers do not need to handle failure explicitly
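
The lifecycle above could look roughly like the following sketch. It is not the PoC from PR #2720: RecordingListeners, CommandTelemetrySketch, the reduced constructor arguments, and the simplified publish_* signatures are all assumptions made to keep the example self-contained.

```python
import logging
import time

# Hypothetical logger; the real module would reuse the loggers in pymongo/logger.py.
_SKETCH_COMMAND_LOGGER = logging.getLogger("sketch.command")


class RecordingListeners:
    """Hypothetical stand-in for the driver's listeners; signatures are simplified."""

    def __init__(self):
        self.events = []

    def publish_command_start(self, command_name, request_id):
        self.events.append(("started", command_name))

    def publish_command_success(self, duration, reply, command_name, request_id):
        self.events.append(("succeeded", command_name))

    def publish_command_failure(self, duration, failure, command_name, request_id):
        self.events.append(("failed", command_name))


class CommandTelemetrySketch:
    """One object drives both channels: logging and event publishing."""

    def __init__(self, command_name, request_id, listeners, publish_event=True):
        self.command_name = command_name
        self.request_id = request_id
        self.listeners = listeners
        self.publish_event = publish_event
        self._handled = False
        self._start = 0.0

    def __enter__(self):
        self._start = time.monotonic()
        if _SKETCH_COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
            _SKETCH_COMMAND_LOGGER.debug("Command started: %s", self.command_name)
        if self.publish_event:
            self.listeners.publish_command_start(self.command_name, self.request_id)
        return self

    def handle_succeeded(self, reply):
        duration = time.monotonic() - self._start
        if _SKETCH_COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
            _SKETCH_COMMAND_LOGGER.debug("Command succeeded: %s", self.command_name)
        if self.publish_event:
            self.listeners.publish_command_success(
                duration, reply, self.command_name, self.request_id
            )
        self._handled = True

    def _handle_failed(self, exc):
        duration = time.monotonic() - self._start
        if _SKETCH_COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
            _SKETCH_COMMAND_LOGGER.debug("Command failed: %s", self.command_name)
        if self.publish_event:
            self.listeners.publish_command_failure(
                duration, exc, self.command_name, self.request_id
            )

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Report failure only if an exception is propagating and success was not recorded.
        if exc_val is not None and not self._handled:
            self._handle_failed(exc_val)
        return False  # never swallow the caller's exception


listeners = RecordingListeners()
with CommandTelemetrySketch("ping", 1, listeners) as t:
    reply = {"ok": 1}  # stands in for conn.write_command(...)
    t.handle_succeeded(reply)
```

Returning False from __exit__ keeps the original exception propagating, which matches the `raise` in the existing call sites.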

_PoolTelemetry (function-based helpers or thin class):

  • Functions _publish_pool_created, _publish_pool_ready, _publish_pool_cleared, _publish_pool_closed, _publish_conn_created, _publish_conn_ready, _publish_conn_closed, _publish_checkout_started, _publish_checkout_succeeded, _publish_checkout_failed, _publish_checkin — each calls both _debug_log() and listeners.publish_*()
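
A function-based pool helper might be shaped like the sketch below. The helper name, its parameter list, and RecordingPoolListeners are hypothetical; the two flags mirror the enabled_for_cmap and enabled_for_logging checks at the existing call sites, and the log message text is illustrative only.

```python
import logging

# Hypothetical logger; the real helper would reuse the connection logger from pymongo/logger.py.
_SKETCH_CONNECTION_LOGGER = logging.getLogger("sketch.connection")


class RecordingPoolListeners:
    """Hypothetical stand-in that records published CMAP events."""

    def __init__(self):
        self.events = []

    def publish_pool_created(self, address, options):
        self.events.append(("pool_created", address))


def publish_pool_created_sketch(
    listeners, address, options, enabled_for_cmap, enabled_for_logging
):
    """Emit the pool-created event to both channels, keeping each guard."""
    if enabled_for_cmap:
        listeners.publish_pool_created(address, options)
    if enabled_for_logging and _SKETCH_CONNECTION_LOGGER.isEnabledFor(logging.DEBUG):
        _SKETCH_CONNECTION_LOGGER.debug("Connection pool created: %s", address)


pool_listeners = RecordingPoolListeners()
publish_pool_created_sketch(
    pool_listeners,
    ("localhost", 27017),
    {},
    enabled_for_cmap=True,
    enabled_for_logging=True,
)
```

Passing the guards as arguments keeps the helper free of pool state; a thin class holding the pool reference would be the alternative mentioned above.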

_ServerSelectionTelemetry (logging-only currently, but unified for future APM):

  • Functions _log_server_selection_started, _log_server_selection_succeeded, _log_server_selection_failed, _log_server_selection_waiting

Step 2: Refactor command events

Replace paired log+publish blocks in these sync files (async auto-generated):

File                                 Events                       Lines (approx.)
pymongo/synchronous/network.py       STARTED, SUCCEEDED, FAILED   163–290
pymongo/synchronous/bulk.py          STARTED, SUCCEEDED, FAILED   255–310
pymongo/synchronous/client_bulk.py   STARTED, SUCCEEDED, FAILED   similar

Pattern before:

if _COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
    _debug_log(_COMMAND_LOGGER, message=_CommandStatusMessage.STARTED, ...)
if publish:
    listeners.publish_command_start(...)
try:
    reply = conn.write_command(...)
    duration = ...
    if _COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
        _debug_log(_COMMAND_LOGGER, message=_CommandStatusMessage.SUCCEEDED, ...)
    if publish:
        listeners.publish_command_success(...)
except Exception as exc:
    ...
    if _COMMAND_LOGGER.isEnabledFor(logging.DEBUG):
        _debug_log(_COMMAND_LOGGER, message=_CommandStatusMessage.FAILED, ...)
    if publish:
        listeners.publish_command_failure(...)
    raise

Pattern after:

with _CommandTelemetry(...) as t:
    reply = conn.write_command(...)
    t.handle_succeeded(reply)
# __exit__ automatically calls _handle_failed if an exception propagates

Step 3: Refactor connection pool events

Replace paired log+CMAP blocks in pymongo/synchronous/pool.py (lines ~510–1430).

Each pair like:

if self.enabled_for_cmap:
    listeners.publish_pool_created(...)
if self.enabled_for_logging and _CONNECTION_LOGGER.isEnabledFor(logging.DEBUG):
    _debug_log(_CONNECTION_LOGGER, ...)

Becomes a single call to the unified pool telemetry helper.

Step 4: Refactor server selection events

Replace _debug_log calls in pymongo/synchronous/topology.py (lines ~325–449) with calls to _ServerSelectionTelemetry helpers.

Step 5: Regenerate async files

python tools/synchro.py

This regenerates pymongo/asynchronous/ from the sync equivalents.


Critical Files

  • New: pymongo/_telemetry.py
  • Modified (sync): pymongo/synchronous/network.py, pymongo/synchronous/bulk.py, pymongo/synchronous/client_bulk.py, pymongo/synchronous/pool.py, pymongo/synchronous/topology.py
  • Auto-generated (async): pymongo/asynchronous/network.py, pymongo/asynchronous/bulk.py, pymongo/asynchronous/client_bulk.py, pymongo/asynchronous/pool.py, pymongo/asynchronous/topology.py
  • Reference: pymongo/logger.py (reuse _debug_log, _CommandStatusMessage, _ConnectionStatusMessage, _ServerSelectionStatusMessage, all loggers)
  • Reference: pymongo/monitoring.py (_EventListeners.publish_* methods)

Verification

# Run full test suite
python -m pytest test/ -x

# Targeted: command monitoring + logging
python -m pytest test/test_command_monitoring.py test/test_command_logging.py -v

# Targeted: connection pool
python -m pytest test/test_connection_logging.py test/test_monitoring.py -v

# Targeted: server selection logging
python -m pytest test/test_server_selection_logging.py -v

# Type checking + lint
just typing
just pre-commit