Conversation

morgan-wowk commented Jan 12, 2026

Issue: #67

Changes:

  • Adds trace context helpers (a minimal sketch follows this list)
  • Adds trace middleware to apply trace context to API requests
    • This lets logging formatters pull trace ids out of the context and include them in logs
    • Sets the trace id on the response for client consumption
  • Wraps API requests with trace context via the middleware
  • Passes the trace id to pipeline executions via the extras parameter
  • Wraps pipeline execution with trace context (using the trace id from the extras parameter, or creating a new one if it is None)
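
For illustration, a minimal sketch of what the trace context helpers could look like, assuming a contextvars-based design; the helper names here are hypothetical, not necessarily the ones in the diff:

import contextvars
import uuid

# Holds the trace id for the current task/request, if any.
_trace_id = contextvars.ContextVar("trace_id", default=None)


def new_trace_id() -> str:
    """Generate a fresh trace id (hypothetical helper)."""
    return uuid.uuid4().hex


def set_trace_id(trace_id: str) -> contextvars.Token:
    """Attach a trace id to the current context; returns a token for resetting."""
    return _trace_id.set(trace_id)


def get_trace_id():
    """Read the trace id for the current context (what a log formatter would call)."""
    return _trace_id.get()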

Test pipeline notes:

  • Requires pytest, starlette, httpx (e.g. pip3 install ...)

Ark-kun self-assigned this Jan 13, 2026

Ark-kun (Contributor) commented Jan 13, 2026

Thank you for this PR.

I'm reviewing the changes and I think we might want to design this slightly differently.
I think that trace IDs are mostly relevant to the API Server and API requests.
The Orchestrator works autonomously; it is not triggered by API requests, so it neither has nor needs trace IDs. The ExecutionNode.id and ContainerExecution.id already serve the same role there.

On the other hand, API requests might benefit from trace IDs so that all logging messages that are generated when processing a single API call can be filtered and grouped together.

morgan-wowk (Author) commented Jan 13, 2026

> Thank you for this PR.
>
> I'm reviewing the changes and I think we might want to design this slightly differently. I think that trace IDs are mostly relevant to the API Server and API requests. The Orchestrator works autonomously; it is not triggered by API requests, so it neither has nor needs trace IDs. The ExecutionNode.id and ContainerExecution.id already serve the same role there.
>
> On the other hand, API requests might benefit from trace IDs so that all logging messages that are generated when processing a single API call can be filtered and grouped together.

This sounds good to me.

I will make some adjustments. Thanks!

morgan-wowk changed the title from "feat: Introduce trace ids to Tangle" to "feat: Introduce logging context to Tangle" on Jan 19, 2026
morgan-wowk force-pushed the trace-id-support branch 7 times, most recently from 6245345 to 2bfcc23 on January 19, 2026 at 22:53
**Changes:**

* Adds logging context helpers
* Adds request middleware that generates a unique request id and sets it in the logging context around API requests (see the sketch below)
    * Sets the x-tangle-request-id header on the response for client consumption
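
For illustration, a minimal sketch of such a middleware, assuming Starlette's BaseHTTPMiddleware and the logging_context module this PR adds; the class name and id format are illustrative, and only the x-tangle-request-id header comes from the PR:

import uuid

from starlette.middleware.base import BaseHTTPMiddleware

from cloud_pipelines_backend.instrumentation import logging_context


class RequestIdMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Generate a unique id for this request and keep it in the logging
        # context so every log line emitted while handling it is tagged.
        request_id = uuid.uuid4().hex
        with logging_context.logging_context(request_id=request_id):
            response = await call_next(request)
        # Surface the id to the client for debugging and support.
        response.headers["x-tangle-request-id"] = request_id
        return response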

morgan-wowk (Author) commented:

> Thank you for this PR.
>
> I'm reviewing the changes and I think we might want to design this slightly differently. I think that trace IDs are mostly relevant to the API Server and API requests. The Orchestrator works autonomously; it is not triggered by API requests, so it neither has nor needs trace IDs. The ExecutionNode.id and ContainerExecution.id already serve the same role there.
>
> On the other hand, API requests might benefit from trace IDs so that all logging messages that are generated when processing a single API call can be filtered and grouped together.

The work has been redesigned. Here is a summary:

Logging Context Improvements

Summary

Enhanced the Tangle orchestrator logging to automatically include execution context (execution_id, container_execution_id) in all log messages, making it easier to trace and filter logs for specific executions.

Changes Made

1. Added Logging Context to Orchestrator (orchestrator_sql.py)

Wrapped execution processing with logging_context.logging_context() to automatically attach execution IDs to all logs within that scope.

For Queued Executions:

# Set execution context for logging
with logging_context.logging_context(
    execution_id=queued_execution.id
):
    _logger.info("Before processing queued execution")
    # ... process execution ...
    _logger.info("After processing queued execution")

For Running Container Executions:

with logging_context.logging_context(
    execution_id=execution_id,
    container_execution_id=running_container_execution.id
):
    _logger.info("Before processing running container execution")
    # ... process container execution ...
    _logger.info("After processing running container execution")

2. Added Context-Aware Logging Configuration (start_local.py)

Created a custom formatter and filter that dynamically includes context fields in logs only when they're set.

import logging

from cloud_pipelines_backend.instrumentation.logging_context import get_all_context_metadata


class LoggingContextFilter(logging.Filter):
    """Adds contextual metadata to log records."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Copy each non-empty context value onto the record so formatters
        # and handlers can read it as a regular record attribute.
        for key, value in get_all_context_metadata().items():
            if value is not None:
                setattr(record, key, value)
        return True


class ContextAwareFormatter(logging.Formatter):
    """Formatter that dynamically includes context fields only when they're set."""
    
    def format(self, record: logging.LogRecord) -> str:
        base_format = "%(asctime)s [%(levelname)s] %(name)s"
        
        # Collect context fields that are present
        context_parts = []
        context_metadata = get_all_context_metadata()
        for key, value in context_metadata.items():
            if value is not None and hasattr(record, key):
                context_parts.append(f"{key}={value}")
        
        # Add context to format if any exists
        if context_parts:
            base_format += " [" + " ".join(context_parts) + "]"
        
        base_format += ": %(message)s"
        
        formatter = logging.Formatter(base_format)
        return formatter.format(record)
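
Both classes depend on get_all_context_metadata() from the logging_context module. For reference, a minimal sketch of what that module might look like, assuming a contextvars-based implementation (the actual code in the PR may differ):

import contextlib
import contextvars

# A single context variable holding a dict of all metadata for the current scope.
_context_metadata = contextvars.ContextVar("logging_context_metadata", default={})


def get_all_context_metadata() -> dict:
    """Return a copy of the metadata attached to the current context."""
    return dict(_context_metadata.get())


@contextlib.contextmanager
def logging_context(**metadata):
    """Attach key/value metadata to all log records emitted within this scope."""
    merged = {**_context_metadata.get(), **metadata}
    token = _context_metadata.set(merged)
    try:
        yield
    finally:
        _context_metadata.reset(token)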

3. Cleaned Up Redundant IDs in Log Messages

Removed execution IDs from log messages where they're already present in the logging context, eliminating duplication.

Examples of cleaned up logs:

# Before: ID appears twice (in context and message)
_logger.info(f"Before processing {queued_execution.id=}")

# After: ID only in context
_logger.info("Before processing queued execution")
# Before
_logger.info(f"Container execution {container_execution.id} is now in state {new_status}")

# After
_logger.info(f"Container execution is now in state {new_status}")
# Before
_logger.info(f"Terminating container execution {container_execution.id}.")

# After
_logger.info("Terminating container execution.")

Log Output Examples

Without Context (non-orchestrator logs)

2026-01-19 14:40:52,611 [INFO] start_local: Starting the orchestrator
2026-01-19 14:40:52,620 [INFO] uvicorn.error: Application startup complete.

With Execution Context Only

2026-01-19 14:30:13,650 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb80ea96aba]: Before processing queued execution
2026-01-19 14:30:13,674 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb80ea96aba]: Execution will reuse the old_execution.id='019b3428a0f1e29c82d4'
2026-01-19 14:30:13,681 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb80ea96aba]: After processing queued execution

With Both Execution and Container Execution Context

2026-01-19 14:30:14,687 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb888cc61d4 container_execution_id=019b342ab120726fdee2]: Before processing running container execution
2026-01-19 14:30:14,701 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb888cc61d4 container_execution_id=019b342ab120726fdee2]: Container execution is now in state RUNNING (was PENDING)
2026-01-19 14:30:14,713 [INFO] cloud_pipelines_backend.orchestrator_sql [execution_id=019bd8612bb888cc61d4 container_execution_id=019b342ab120726fdee2]: After processing running container execution

Benefits

  1. Easy Log Filtering: Filter logs by execution_id or container_execution_id in log aggregation tools
  2. Better Traceability: Follow an execution through its entire lifecycle
  3. No Duplication: IDs appear once in context, not repeated in every message
  4. Dynamic Context: Formatter automatically includes any context fields without code changes
  5. Cleaner Messages: Log messages are more readable without ID clutter

Extensibility

The context system is fully extensible. You can add any arbitrary metadata:

with logging_context.logging_context(
    execution_id=execution.id,
    container_execution_id=container.id,
    pipeline_run_id=pipeline.id,  # Would work if added
    user_id=user.id,               # Would work if added
    custom_field="value"           # Any custom field works
):
    _logger.info("Processing with custom context")

All fields automatically appear in logs:

[INFO] module [execution_id=123 container_execution_id=456 pipeline_run_id=789 user_id=admin custom_field=value]: Processing with custom context

Files Modified

  • tangle/cloud_pipelines_backend/orchestrator_sql.py - Added logging context to execution processing
  • tangle/start_local.py - Added LoggingContextFilter and ContextAwareFormatter (see the wiring sketch below)
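
As a usage sketch, here is one way start_local.py could attach the two classes to the root logger; this wiring is assumed, and the actual configuration in the diff may differ:

import logging


def configure_logging() -> None:
    handler = logging.StreamHandler()
    # The filter stamps context metadata onto each record; the formatter
    # then renders whichever of those fields are present.
    handler.addFilter(LoggingContextFilter())
    handler.setFormatter(ContextAwareFormatter())
    logging.basicConfig(level=logging.INFO, handlers=[handler], force=True)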

Backward Compatibility

  • ✅ All changes are backward compatible
  • ✅ Logs without context still work (no context fields shown)
  • ✅ Existing log parsing tools continue to work
  • ✅ All tests pass

Review thread on orchestrator_sql.py:

with logging_context.logging_context(
    execution_id=queued_execution.id
):
    _logger.info("Before processing queued execution")

Contributor comment:

Will the execution ID information still be present in non-JSON logs?

Author reply:

It will be, yes. This is achieved through the following change:

class ContextAwareFormatter(logging.Formatter):
    """Formatter that dynamically includes context fields only when they're set."""
    
    def format(self, record: logging.LogRecord) -> str:
        """Format log record with dynamic context fields."""
        # Base format
        base_format = "%(asctime)s [%(levelname)s] %(name)s"
        
        # Collect context fields that are present
        context_parts = []
        context_metadata = get_all_context_metadata()
        for key, value in context_metadata.items():
            if value is not None and hasattr(record, key):
                context_parts.append(f"{key}={value}")
        
        # Add context to format if any exists
        if context_parts:
            base_format += " [" + " ".join(context_parts) + "]"
        
        base_format += ": %(message)s"
        
        # Create formatter with the dynamic format
        formatter = logging.Formatter(base_format)
        return formatter.format(record)

Review thread on start_local.py:

import logging.config
from cloud_pipelines_backend.instrumentation.logging_context import get_all_context_metadata

class LoggingContextFilter(logging.Filter):

Author comment:

Move these two classes to a dedicated, well-named file/module rather than being coupled with an entrypoint script.
