The Durable Task Framework provides robust error handling capabilities for orchestrations and activities. This guide covers exception handling, compensation, and recovery patterns.
When an activity throws an exception, it becomes a TaskFailedException in the orchestration:
public override async Task<Result> RunTask(OrchestrationContext context, Input input)
{
try
{
var result = await context.ScheduleTask<string>(typeof(RiskyActivity), input);
return new Result { Success = true, Data = result };
}
catch (TaskFailedException ex)
{
// ex.InnerException contains the original exception
return new Result { Success = false, Error = ex.InnerException?.Message };
}
}catch (TaskFailedException ex)
{
var originalException = ex.InnerException;
var activityName = ex.Name; // "RiskyActivity"
var scheduledEventId = ex.ScheduleId; // Event ID in history
_logger.LogError(originalException,
"Activity {Activity} failed", activityName);
}The TaskHubWorker.ErrorPropagationMode property controls how exception information is propagated from failed activities and sub-orchestrations.
The default mode serializes the original exception and makes it available via InnerException:
worker.ErrorPropagationMode = ErrorPropagationMode.SerializeExceptions;
// In orchestration:
catch (TaskFailedException ex)
{
// Original exception is deserialized and available
var originalException = ex.InnerException;
if (originalException is InvalidOperationException invalidOp)
{
// Can catch specific exception types
}
}Limitations:
- Not all exception types can be serialized/deserialized correctly
- Custom exceptions may lose data if not properly serializable
- Doesn't work across language boundaries (e.g., polyglot scenarios)
The UseFailureDetails mode provides consistent, structured error information via FailureDetails:
worker.ErrorPropagationMode = ErrorPropagationMode.UseFailureDetails;With this mode:
InnerExceptionis always null- Error details are available via
FailureDetailsproperty - Works consistently across all exception types and language runtimes
catch (TaskFailedException ex)
{
// InnerException is null in UseFailureDetails mode
// Use FailureDetails instead
FailureDetails details = ex.FailureDetails;
string errorType = details.ErrorType; // e.g., "System.InvalidOperationException"
string errorMessage = details.ErrorMessage; // The exception message
string stackTrace = details.StackTrace; // Full stack trace
bool isNonRetriable = details.IsNonRetriable;
// Check for inner failures (nested exceptions)
FailureDetails innerFailure = details.InnerFailure;
}Use IsCausedBy<T>() to check exception types without deserializing:
catch (TaskFailedException ex) when (ex.FailureDetails?.IsCausedBy<InvalidOperationException>() == true)
{
// Handle InvalidOperationException
}
catch (TaskFailedException ex) when (ex.FailureDetails?.IsCausedBy<TimeoutException>() == true)
{
// Handle TimeoutException
}The same pattern applies to SubOrchestrationFailedException:
try
{
await context.CreateSubOrchestrationInstance<Result>(
typeof(ChildOrchestration),
input);
}
catch (SubOrchestrationFailedException ex)
{
FailureDetails details = ex.FailureDetails;
_logger.LogError(
"Child orchestration failed: {ErrorType}: {Message}",
details.ErrorType,
details.ErrorMessage);
}Use UseFailureDetails when:
- You need consistent error handling across all exception types
- Running orchestrations/activities out-of-process or in other language runtimes
- Custom exceptions may not serialize correctly
- You want to avoid deserialization issues with
InnerException
Warning
Changing ErrorPropagationMode on an existing deployment can break in-flight orchestrations if they contain exception handling logic that depends on InnerException. Plan changes carefully and consider using versioning strategies.
When using UseFailureDetails, you can include custom properties from your exceptions in the FailureDetails.Properties dictionary by implementing IExceptionPropertiesProvider:
public class CustomExceptionPropertiesProvider : IExceptionPropertiesProvider
{
public IDictionary<string, object?>? GetExceptionProperties(Exception exception)
{
// Extract custom properties from known exception types
if (exception is OrderProcessingException orderEx)
{
return new Dictionary<string, object?>
{
["OrderId"] = orderEx.OrderId,
["FailureStage"] = orderEx.Stage,
["RetryCount"] = orderEx.RetryCount
};
}
if (exception is ValidationException validationEx)
{
return new Dictionary<string, object?>
{
["FieldName"] = validationEx.FieldName,
["ValidationRule"] = validationEx.Rule
};
}
// Return null for exceptions without custom properties
return null;
}
}Register the provider with the TaskHubWorker:
var worker = new TaskHubWorker(orchestrationService, loggerFactory);
worker.ErrorPropagationMode = ErrorPropagationMode.UseFailureDetails;
worker.ExceptionPropertiesProvider = new CustomExceptionPropertiesProvider();Access the custom properties in your orchestration's error handling:
catch (TaskFailedException ex)
{
FailureDetails details = ex.FailureDetails;
if (details.Properties != null)
{
if (details.Properties.TryGetValue("OrderId", out var orderId))
{
_logger.LogError("Order {OrderId} failed: {Message}", orderId, details.ErrorMessage);
}
if (details.Properties.TryGetValue("RetryCount", out var retryCount) &&
retryCount is int count && count >= 3)
{
// Too many retries, escalate
await context.ScheduleTask<bool>(typeof(EscalateFailureActivity), details);
}
}
}Note
Property values should be simple, serializable types (strings, numbers, booleans). Complex objects may not serialize correctly across process boundaries.
try
{
await context.ScheduleTask<string>(typeof(PaymentActivity), payment);
}
catch (TaskFailedException ex) when (ex.InnerException is InsufficientFundsException)
{
// Handle specific business error
await context.ScheduleTask<bool>(typeof(NotifyCustomerActivity),
"Payment failed: Insufficient funds");
}
catch (TaskFailedException ex) when (ex.InnerException is PaymentGatewayException)
{
// Retry with different gateway
await context.ScheduleTask<string>(typeof(BackupPaymentActivity), payment);
}
catch (TaskFailedException)
{
// Handle all other failures
throw;
}Use ScheduleWithRetry for transient failures:
var retryOptions = new RetryOptions(
firstRetryInterval: TimeSpan.FromSeconds(5),
maxNumberOfAttempts: 3)
{
BackoffCoefficient = 2.0,
Handle = ex => ex is TimeoutException || ex is HttpRequestException
};
try
{
await context.ScheduleWithRetry<string>(
typeof(UnreliableActivity),
retryOptions,
input);
}
catch (TaskFailedException ex)
{
// All retries exhausted
}See Retries for detailed retry configuration.
try
{
await context.CreateSubOrchestrationInstance<Result>(
typeof(ChildOrchestration),
input);
}
catch (SubOrchestrationFailedException ex)
{
// Child orchestration threw an unhandled exception
var failureReason = ex.InnerException?.Message;
}Compensate previous steps when a later step fails:
public override async Task<OrderResult> RunTask(
OrchestrationContext context,
OrderInput input)
{
// Track completed steps for compensation
var completedSteps = new List<string>();
try
{
// Step 1: Reserve inventory
await context.ScheduleTask<bool>(typeof(ReserveInventoryActivity), input);
completedSteps.Add("inventory");
// Step 2: Charge payment
await context.ScheduleTask<bool>(typeof(ChargePaymentActivity), input);
completedSteps.Add("payment");
// Step 3: Ship order (might fail)
await context.ScheduleTask<bool>(typeof(ShipOrderActivity), input);
return new OrderResult { Success = true };
}
catch (TaskFailedException ex)
{
// Compensate in reverse order
if (completedSteps.Contains("payment"))
{
await context.ScheduleTask<bool>(typeof(RefundPaymentActivity), input);
}
if (completedSteps.Contains("inventory"))
{
await context.ScheduleTask<bool>(typeof(ReleaseInventoryActivity), input);
}
return new OrderResult
{
Success = false,
Error = ex.InnerException?.Message
};
}
}public override async Task<Result> RunTask(OrchestrationContext context, Input input)
{
try
{
await context.CreateSubOrchestrationInstance<Result>(
typeof(ProcessOrderOrchestration),
input);
}
catch (SubOrchestrationFailedException)
{
// Run compensation orchestration
await context.CreateSubOrchestrationInstance<CompensationResult>(
typeof(CompensateOrderOrchestration),
new CompensationInput { OriginalInput = input });
}
}Return errors as results instead of throwing:
// Activity returns result with error info
public class ProcessOrderActivity : AsyncTaskActivity<Order, OrderResult>
{
protected override async Task<OrderResult> ExecuteAsync(
TaskContext context,
Order order)
{
if (!await ValidateOrderAsync(order))
{
// Return error instead of throwing
return new OrderResult
{
Success = false,
ErrorCode = "VALIDATION_FAILED",
ErrorMessage = "Order validation failed"
};
}
// Process order...
return new OrderResult { Success = true, OrderId = newOrderId };
}
}
// Orchestration checks result
public override async Task<Result> RunTask(OrchestrationContext context, Order input)
{
var result = await context.ScheduleTask<OrderResult>(typeof(ProcessOrderActivity), input);
if (!result.Success)
{
// Handle error without exception
await context.ScheduleTask<bool>(typeof(NotifyErrorActivity), result);
return new Result { Success = false, Error = result.ErrorMessage };
}
return new Result { Success = true, OrderId = result.OrderId };
}Activities don't have built-in timeout. Handle in orchestration:
public override async Task<Result> RunTask(OrchestrationContext context, Input input)
{
using var cts = new CancellationTokenSource();
var activityTask = context.ScheduleTask<string>(typeof(LongRunningActivity), input);
var timeoutTask = context.CreateTimer(
context.CurrentUtcDateTime.AddMinutes(30),
true,
cts.Token);
var winner = await Task.WhenAny(activityTask, timeoutTask);
if (winner == activityTask)
{
cts.Cancel();
return new Result { Success = true, Data = await activityTask };
}
else
{
// Activity is still running but we've timed out
// Note: Activity will complete eventually, but result is ignored
return new Result { Success = false, TimedOut = true };
}
}public override async Task<Result> RunTask(OrchestrationContext context, Input input)
{
var deadline = context.CurrentUtcDateTime.AddHours(4);
while (context.CurrentUtcDateTime < deadline)
{
var status = await context.ScheduleTask<Status>(typeof(CheckStatusActivity), input);
if (status.IsComplete)
return new Result { Success = true };
await context.CreateTimer(context.CurrentUtcDateTime.AddMinutes(5), true);
}
return new Result { Success = false, TimedOut = true };
}Prevent repeated failures:
public override async Task<Result> RunTask(OrchestrationContext context, State state)
{
state ??= new State();
// Circuit breaker check
if (state.ConsecutiveFailures >= 5)
{
var cooldownEnd = state.LastFailure.AddMinutes(15);
if (context.CurrentUtcDateTime < cooldownEnd)
{
// Circuit is open - wait before retry
await context.CreateTimer(cooldownEnd, true);
}
state.ConsecutiveFailures = 0; // Reset after cooldown
}
try
{
var result = await context.ScheduleTask<string>(typeof(ExternalServiceActivity), state.Input);
state.ConsecutiveFailures = 0;
return new Result { Success = true, Data = result };
}
catch (TaskFailedException)
{
state.ConsecutiveFailures++;
state.LastFailure = context.CurrentUtcDateTime;
// Continue to retry with backoff
context.ContinueAsNew(state);
return null;
}
}| Practice | Description |
|---|---|
Use UseFailureDetails mode |
Prefer ErrorPropagationMode.UseFailureDetails for consistent error handling |
| Use retries for transient failures | Configure RetryOptions for HTTP, timeout errors |
| Return errors for expected failures | Use result types instead of exceptions for business errors |
| Implement compensation | Use Saga pattern for multi-step transactions |
| Set timeouts | Don't let orchestrations wait indefinitely |
| Log with context | Include instance ID, activity name, error details |
| Test failure scenarios | Verify compensation and recovery logic |
- Retries — Configuring automatic retries
- Replay and Durability — Understanding exception persistence
- Testing — Testing error handling