Auto-Remediation Workflows

Gen2 had limited auto-remediation (custom webhooks triggered from problem notifications). Gen3 Workflows enable full auto-remediation pipelines: detect, analyze, decide, act, notify.

Auto-Remediation Pattern

Davis Problem Trigger
  → DQL: Query problem context (what entity, what metrics)
  → DQL: Check entity health history (is this recurring?)
  → DQL: Get related problems (blast radius)
  → JavaScript: Decision engine (monitor/investigate/remediate/escalate)
  → Action: Email notification OR HTTP call to remediation API

Decision Engine (JavaScript Task)

// Analyze problem context and decide on an action
export default async function ({ execution_id }) {
  // fetch() resolves to a Response, so parse the JSON body before reading `result`
  const getTask = async (name) => {
    const res = await fetch(`/platform/automation/v1/executions/${execution_id}/tasks/${name}`);
    if (!res.ok) throw new Error(`Could not load task "${name}": HTTP ${res.status}`);
    return res.json();
  };

  const problem = await getTask('get_problem');
  const health = await getTask('check_health'); // loaded for context; not used by the rules below
  const history = await getTask('get_history');

  // Guard against empty or missing DQL results
  const problemData = problem.result?.records?.[0] ?? {};
  const historyCount = history.result?.records?.length ?? 0;

  let action = "MONITOR";
  let reason = "New problem, monitoring";

  if (historyCount > 3) {
    action = "ESCALATE";
    reason = `Recurring problem (${historyCount} times in 24h)`;
  } else if (problemData['event.category'] === 'RESOURCE') {
    action = "INVESTIGATE";
    reason = "Resource problem: check capacity";
  }

  return { action, reason, problem_name: problemData['event.name'] };
}
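
The decision rules in the task above can also be pulled out into a pure function, which makes them testable outside a workflow run. This is a minimal sketch under that assumption: `decideAction` and its parameter shape are hypothetical names, not part of any Dynatrace SDK, and the sample records are made up for illustration.

```javascript
// Hypothetical helper: same thresholds as the workflow task, but with no I/O,
// so it can be exercised with plain objects.
function decideAction(problemData, historyCount) {
  if (historyCount > 3) {
    return { action: "ESCALATE", reason: `Recurring problem (${historyCount} times in 24h)` };
  }
  if (problemData["event.category"] === "RESOURCE") {
    return { action: "INVESTIGATE", reason: "Resource problem: check capacity" };
  }
  return { action: "MONITOR", reason: "New problem, monitoring" };
}

// Illustrative inputs (not real problem records)
const recurring = decideAction({ "event.category": "ERROR" }, 5);
const resource = decideAction({ "event.category": "RESOURCE" }, 1);
const fresh = decideAction({ "event.category": "ERROR" }, 0);

console.log(recurring.action, resource.action, fresh.action);
// prints: ESCALATE INVESTIGATE MONITOR
```

Keeping the rules free of `fetch` calls means the workflow task only has to load the DQL results and pass them in.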

Available Workflow Actions

Action Type                             Use Case
──────────────────────────────────────  ──────────────────────────────────────
dynatrace.email:send-email              Email notification
dynatrace.slack:slack-send-message      Slack notification
dynatrace.jira:jira-create-issue        Create Jira ticket
dynatrace.servicenow:create-incident    Create ServiceNow incident
HTTP action (any URL)                   Custom API calls, webhooks
DQL query                               Enrich context, check conditions
JavaScript                              Custom logic, formatting, decisions
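
One way to connect a decision to these action types is a small lookup inside a JavaScript task. The sketch below is illustrative only: the mapping of decisions to actions is an assumption chosen for the example, `routeDecision` is a hypothetical helper, and `"http"` is a placeholder label for the HTTP action rather than a real action identifier.

```javascript
// Hypothetical routing table: decision -> action type from the table above.
// The pairing is an example choice, not a recommendation from the platform.
const ROUTES = new Map([
  ["MONITOR", "dynatrace.email:send-email"],
  ["INVESTIGATE", "dynatrace.jira:jira-create-issue"],
  ["ESCALATE", "dynatrace.servicenow:create-incident"],
  ["REMEDIATE", "http"], // placeholder for the generic HTTP action
]);

// Takes the object returned by the decision engine and picks an action type.
function routeDecision(decision) {
  const actionType = ROUTES.get(decision.action);
  if (!actionType) {
    throw new Error(`No route for decision "${decision.action}"`);
  }
  return {
    actionType,
    summary: `[${decision.action}] ${decision.problem_name}: ${decision.reason}`,
  };
}

const routed = routeDecision({
  action: "ESCALATE",
  reason: "Recurring problem (5 times in 24h)",
  problem_name: "High CPU", // made-up problem name
});
console.log(routed.actionType);
console.log(routed.summary);
```

Failing loudly on an unknown decision keeps a typo in the decision engine from silently skipping the notification step.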

Service User for Workflows

Workflows run under a service user, a non-human identity with its own permissions. This is critical for production workflows.

Setup:
1. Create service user via IAM API
2. Create IAM policy with required scopes
3. Bind policy to service user's auto-created group
4. Set service user as workflow "actor" (NOT "owner"!)

⚠️ NEVER set owner = service user: this permanently locks you out of the workflow
   Only set actor = service user
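
The actor/owner rule is easy to get wrong when workflow definitions are edited by hand, so a small pre-deployment check can help. This is a minimal sketch, assuming the workflow definition is available as an object with `owner` and `actor` fields holding identity IDs; `checkWorkflowIdentity` and the sample IDs are hypothetical.

```javascript
// Hypothetical guard: returns a list of problems with the identity settings
// of a workflow definition. An empty list means the rule above is satisfied.
function checkWorkflowIdentity(workflow, serviceUserId) {
  const errors = [];
  if (workflow.owner === serviceUserId) {
    errors.push("owner must not be the service user");
  }
  if (workflow.actor !== serviceUserId) {
    errors.push("actor should be the service user");
  }
  return errors;
}

// Made-up IDs for illustration
const SERVICE_USER = "svc-user-id";
const good = checkWorkflowIdentity({ owner: "human-user-id", actor: SERVICE_USER }, SERVICE_USER);
const bad = checkWorkflowIdentity({ owner: SERVICE_USER, actor: SERVICE_USER }, SERVICE_USER);

console.log(good.length); // prints: 0
console.log(bad);         // prints: [ 'owner must not be the service user' ]
```

Running a check like this before saving or importing a workflow catches the owner mistake while it can still be corrected.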