Agent Topology
LangGraph investigation pipeline: the planner dispatches subagents, and the synthesizer combines their findings.
Investigation Flow
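The overall flow (planner, subagents, synthesizer, writeup) can be sketched in plain Python as follows. This is a dependency-free illustration, not the LangGraph wiring itself; the function names, the stub subagents, and the alert fields (`name`, `service`) are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Finding = Dict[str, str]
Subagent = Callable[[dict], Finding]


def make_stub(name: str) -> Subagent:
    """Build a placeholder subagent that only records what it looked at."""
    def run(alert: dict) -> Finding:
        return {"agent": name, "summary": f"{name} checked {alert['service']}"}
    return run


# Hypothetical registry of the four investigation subagents.
SUBAGENTS: Dict[str, Subagent] = {
    name: make_stub(name) for name in ("kubernetes", "logs", "metrics", "traces")
}


def plan(alert: dict) -> List[str]:
    # Placeholder planner: dispatch every subagent in registry order.
    return list(SUBAGENTS)


def synthesize(findings: List[Finding]) -> str:
    # Placeholder synthesizer: concatenate the subagent summaries.
    return "; ".join(f["summary"] for f in findings)


def write_up(alert: dict, root_cause: str) -> str:
    return f"Investigation of {alert['name']}: {root_cause}"


def investigate(alert: dict) -> str:
    findings = [SUBAGENTS[name](alert) for name in plan(alert)]
    return write_up(alert, synthesize(findings))
```

Calling `investigate({"name": "HighErrorRate", "service": "checkout"})` runs each stage in order and returns a one-line report string.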
Agent Configuration
Planner
Analyzes incoming alerts and creates an investigation plan. Selects which subagents to dispatch and in what order based on alert type, affected services, and past episodes.
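One way the selection step might look is a rule table keyed by alert type, reordered using past episodes. The alert type names, the `decisive_agent` field, and the promotion rule below are hypothetical; the actual planner may rely on an LLM or different heuristics.

```python
from typing import Dict, List

# Hypothetical mapping from alert type to an ordered list of subagents.
DISPATCH_RULES: Dict[str, List[str]] = {
    "pod_crashloop": ["kubernetes", "logs"],
    "high_latency": ["metrics", "traces", "logs"],
    "high_error_rate": ["metrics", "logs", "traces"],
}
DEFAULT_ORDER = ["kubernetes", "logs", "metrics", "traces"]


def select_subagents(alert_type: str, past_episodes: List[dict]) -> List[str]:
    """Return subagents to dispatch, in order.

    past_episodes is assumed to be ordered oldest first.
    """
    order = list(DISPATCH_RULES.get(alert_type, DEFAULT_ORDER))
    # Promote the subagent that was decisive in the most recent matching
    # episode, provided it is part of the plan for this alert type.
    for episode in reversed(past_episodes):
        agent = episode.get("decisive_agent")
        if episode.get("alert_type") == alert_type and agent in order:
            order.remove(agent)
            order.insert(0, agent)
            break
    return order
```

Unknown alert types fall back to dispatching every subagent, which trades speed for coverage.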
Kubernetes
Investigates Kubernetes cluster state: pod health, resource limits, events, deployments, and node conditions. Uses kubectl and the Kubernetes API for deep cluster inspection.
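As an illustration of a pod health check, the sketch below scans a pod list shaped like the output of `kubectl get pods -o json` and flags pods that are not running, containers stuck in a waiting state, and containers with many restarts. The function name and the restart threshold are assumptions; the fields read (`status.phase`, `containerStatuses[].restartCount`, `state.waiting.reason`) are standard pod status fields.

```python
from typing import List


def unhealthy_pods(pod_list: dict, max_restarts: int = 3) -> List[str]:
    """Return human-readable problems found in a Kubernetes pod list."""
    problems: List[str] = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase")
        if phase not in ("Running", "Succeeded"):
            problems.append(f"{name}: phase={phase}")
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting:
                problems.append(
                    f"{name}/{cs['name']}: waiting ({waiting.get('reason')})"
                )
            # max_restarts is an arbitrary threshold chosen for this example.
            if cs.get("restartCount", 0) > max_restarts:
                problems.append(
                    f"{name}/{cs['name']}: {cs['restartCount']} restarts"
                )
    return problems
```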
Log Analysis
Searches and analyzes application logs from OpenSearch/Elasticsearch. Identifies error patterns, correlates log entries across services, and extracts stack traces.
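Identifying error patterns usually means collapsing messages that differ only in variable parts. A minimal sketch, assuming log entries have already been fetched as dictionaries with hypothetical `level` and `message` keys:

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

_HEX_ID = re.compile(r"\b[0-9a-f]{8,}\b")
_NUMBER = re.compile(r"\d+")


def normalize(message: str) -> str:
    """Replace long hex identifiers and numbers with placeholders."""
    message = _HEX_ID.sub("<id>", message)
    return _NUMBER.sub("<n>", message)


def top_error_patterns(
    entries: Iterable[dict], limit: int = 5
) -> List[Tuple[str, int]]:
    """Count ERROR-level messages by normalized pattern, most frequent first."""
    counts = Counter(
        normalize(e["message"]) for e in entries if e.get("level") == "ERROR"
    )
    return counts.most_common(limit)
```

The two regular expressions are deliberately crude; a real deployment would likely tune them to the log formats of the services involved.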
Metrics
Queries Prometheus and Grafana for metrics analysis. Examines error rates, latency percentiles, resource utilization, and custom application metrics.
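For example, a latency percentile check could be expressed as a PromQL `histogram_quantile` query sent to the Prometheus instant query endpoint (`/api/v1/query`). The metric name `http_request_duration_seconds_bucket`, the `service` label, and the helper names are assumptions; the sketch only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode


def p99_latency_query(service: str, window: str = "5m") -> str:
    """Build a PromQL expression for p99 latency of one service.

    Assumes a histogram metric named http_request_duration_seconds
    with a `service` label.
    """
    return (
        "histogram_quantile(0.99, sum by (le) ("
        f'rate(http_request_duration_seconds_bucket{{service="{service}"}}[{window}])))'
    )


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
```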
Traces
Analyzes distributed traces from Jaeger/Zipkin. Identifies slow spans, error propagation paths, and service dependency bottlenecks.
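One simple way to locate a bottleneck is to compute each span's self time, meaning its duration minus the time spent in its direct children. The span fields used here (`span_id`, `parent_id`, `service`, `duration_ms`) are a simplified, hypothetical shape and not the Jaeger or Zipkin wire format.

```python
from collections import defaultdict
from typing import Dict, List


def self_times(spans: List[dict]) -> Dict[str, float]:
    """Map span_id to duration minus the total duration of direct children.

    Assumes children run sequentially; with parallel children the
    self time is understated and is clamped at zero.
    """
    child_total: Dict[str, float] = defaultdict(float)
    for span in spans:
        if span.get("parent_id"):
            child_total[span["parent_id"]] += span["duration_ms"]
    return {
        s["span_id"]: max(s["duration_ms"] - child_total[s["span_id"]], 0.0)
        for s in spans
    }


def slowest_span(spans: List[dict]) -> dict:
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s["span_id"]])
```

Ranking by self time rather than raw duration avoids blaming a gateway span for time actually spent in a downstream database call.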
Synthesizer
Combines findings from all subagents into a coherent root cause analysis. Weighs evidence, resolves conflicting hypotheses, and determines confidence levels.
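The source does not say how evidence is weighed. As one possible approach, the sketch below merges per-subagent confidences for the same hypothesis with a noisy-OR rule, so independent agreement raises confidence, and then picks the highest-scoring hypothesis. The finding fields and the choice of noisy-OR are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def combine(findings: List[dict]) -> Dict[str, float]:
    """Noisy-OR combination: 1 - product of (1 - confidence) per hypothesis."""
    remaining_doubt: Dict[str, float] = defaultdict(lambda: 1.0)
    for f in findings:
        remaining_doubt[f["hypothesis"]] *= 1.0 - f["confidence"]
    return {h: 1.0 - doubt for h, doubt in remaining_doubt.items()}


def root_cause(findings: List[dict]) -> Tuple[str, float]:
    """Return the best-supported hypothesis and its combined confidence."""
    scores = combine(findings)
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Noisy-OR treats each subagent's evidence as independent, which overstates confidence when two subagents observe the same underlying signal.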
Writeup
Generates the final investigation report with executive summary, timeline, root cause, action items, and lessons learned. Formats output for Slack and web UI.
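A minimal sketch of the Slack formatting step, assuming the report arrives as a mapping from section name to bullet items. The function name and input shape are hypothetical; the output uses Slack's mrkdwn convention of `*text*` for bold.

```python
from typing import Dict, List

# Section order taken from the report description above.
SECTION_ORDER = [
    "Executive summary",
    "Timeline",
    "Root cause",
    "Action items",
    "Lessons learned",
]


def format_for_slack(title: str, sections: Dict[str, List[str]]) -> str:
    """Render report sections as Slack mrkdwn text, skipping empty sections."""
    lines = [f"*{title}*"]
    for heading in SECTION_ORDER:
        items = sections.get(heading)
        if not items:
            continue
        lines.append("")
        lines.append(f"*{heading}*")
        lines.extend(f"• {item}" for item in items)
    return "\n".join(lines)
```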
