OODA Loop × AI

Applying the OODA loop to network operations — and rethinking each phase with AI. From "monitor → alert → human responds" to "observe → understand → decide → act → verify" as an autonomous loop.

The OODA Loop

A decision-making framework by Colonel John Boyd: Observe → Orient → Decide → Act, cycled rapidly to adapt to changing conditions.

Traditional monitoring is an open loop: an alert fires, a human investigates and manually remediates. OODA closes the loop by feeding the results of each action back into the next observation cycle.
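The closed loop can be sketched in a few lines. This is a minimal illustration, not code from any of the tools discussed below: the phase functions, the 5% error-rate rule, and the restart/escalate policy are all invented for the example.

```python
# Minimal closed-loop sketch. All names and the 5% error-rate threshold
# are invented for illustration; they do not come from a specific tool.

def observe(telemetry):
    # Flag links whose error rate exceeds a (toy) 5% threshold.
    return [link for link, err in telemetry.items() if err > 0.05]

def orient(observation, history):
    # Correlate current anomalies with what earlier cycles already acted on.
    recurring = [l for l in observation if any(l in r for r in history)]
    return {"anomalous": observation, "recurring": recurring}

def decide(context):
    # Recurring faults escalate to a human; first-time faults get a restart.
    return [("escalate" if l in context["recurring"] else "restart", l)
            for l in context["anomalous"]]

def act(actions):
    return {link: verb for verb, link in actions}

def run_ooda_cycle(telemetry, history):
    """One pass through the loop; the result is appended to history so the
    next Observe cycle sees it -- the feedback edge that closes the loop."""
    result = act(decide(orient(observe(telemetry), history)))
    history.append(result)
    return result

history = []
first = run_ooda_cycle({"link-a": 0.12, "link-b": 0.01}, history)   # restart
second = run_ooda_cycle({"link-a": 0.12, "link-b": 0.01}, history)  # escalate
```

The point of the sketch is the `history.append`: because each action's outcome is visible to the next cycle, the same fault is handled differently the second time around, which an open alert-then-page pipeline cannot do.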

AI at Each Phase

AI here is not limited to LLMs. Each phase benefits from different techniques — rule engines, statistical ML, vector search, and language models — applied where they are most effective.

| Phase | Traditional | AI-Driven |
|---|---|---|
| Observe | Static dashboards, threshold-based alerts, manual log review | Continuous multi-source ingestion. NLP for parsing unstructured data (syslogs, vendor docs). Anomaly detection via statistical models on time-series data. |
| Orient | Operators mentally correlate alerts, check runbooks | Vector similarity search over past incidents (Qdrant). Graph-based topology correlation. LLM-powered causal reasoning with RAG over runbooks and documentation. |
| Decide | Humans decide remediation based on experience | Rule engines for known patterns. ML risk scoring for blast radius. LLM evaluation of options against policies. Human-in-the-loop for high-risk decisions. |
| Act | Manual config changes, script execution | Declarative workflow execution (Keep). Automated verification confirms recovery. Results feed back into the next Observe cycle, closing the loop. |
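The "statistical models on time-series data" in the Observe row can be as simple as a rolling z-score. A minimal sketch, with an arbitrary window size and threshold and fabricated latency data:

```python
import statistics

def zscore_anomalies(series, window=20, threshold=3.0):
    """Indices of points more than `threshold` standard deviations away
    from the rolling mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        mu = statistics.fmean(past)
        sigma = statistics.pstdev(past)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Latency hovering around 10 ms, with an injected spike at index 25.
latency = [10.0, 10.2, 9.9, 10.1, 10.0] * 6
latency[25] = 40.0
print(zscore_anomalies(latency))   # [25]
```

Because the baseline is computed from each point's own trailing window, this catches deviations that a fixed threshold alert would need per-metric hand tuning to find.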

Traditional vs. AI-Driven

| Dimension | Traditional Monitoring | OODA × AI |
|---|---|---|
| Loop | Open (alert → human → action) | Closed (automated feedback) |
| Correlation | Human memory and experience | Vector search + causal inference |
| Interaction | PromQL / SQL queries | Natural language ("Which clusters have the most failures?") |
| Posture | Reactive (symptom detection) | Proactive (precursor detection) |
| Knowledge | Siloed in individuals | Structured via RAG across the organization |
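The Decide phase combines a rule engine for known patterns, a risk score, and human-in-the-loop above a risk budget. A hypothetical sketch of that triage logic; the rule names, actions, and risk numbers are invented, not taken from Keep or any other tool:

```python
# Hypothetical decide-phase triage: known patterns come from a rule table,
# high-risk actions page a human, unknown patterns escalate to an LLM.
RULES = {
    "disk_full":    {"action": "expand_volume", "risk": 0.2},
    "bgp_flap":     {"action": "dampen_peer",   "risk": 0.5},
    "kernel_panic": {"action": "failover_node", "risk": 0.9},
}

def decide(incident_type, risk_budget=0.7):
    rule = RULES.get(incident_type)
    if rule is None:
        return ("escalate_to_llm", incident_type)  # no known pattern
    if rule["risk"] > risk_budget:
        return ("page_human", rule["action"])      # human-in-the-loop
    return ("execute", rule["action"])             # safe to automate
```

The ordering matters: cheap deterministic rules run first, and the expensive, less predictable LLM path is reserved for incidents nothing else matched.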

Mapping to the a10y Stack

How each a10y component maps to a phase of the OODA loop.

| Phase | Component | Role |
|---|---|---|
| Observe | OpenObserve + Vector | Telemetry collection, normalization, and storage. Vector shapes the data; OpenObserve stores it. |
| Orient | Keep + Qdrant | Keep aggregates and correlates alerts. Qdrant provides similarity search over historical patterns. |
| Decide | correlation-engine | Causal reasoning, risk assessment, and remediation planning — using rules, ML, and LLMs as appropriate. |
| Act | Keep workflows | Executes remediation actions via declarative workflows. |
| Verify | OpenObserve + engine | Re-observes telemetry after the action to confirm recovery. |
| All | NATS | Event transport between all phases. The nervous system that connects the loop. |
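The Orient-phase "similarity search over historical patterns" boils down to nearest-neighbor lookup over incident embeddings. A toy stand-in for the Qdrant query: a real deployment would use a text-embedding model and the qdrant-client library, whereas the 3-dimensional vectors and incident names here are fabricated.

```python
import math

# Fabricated incident embeddings; real ones would come from an embedding model.
past_incidents = {
    "2024-01 leaf switch CRC errors": [0.9, 0.1, 0.0],
    "2024-03 BGP session flap":       [0.1, 0.9, 0.2],
    "2024-06 optics degradation":     [0.8, 0.2, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def most_similar(query_vec, k=2):
    """Past incidents ranked by cosine similarity to the query embedding."""
    ranked = sorted(past_incidents.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

This is the "we've seen something like this before" lookup: a new incident embedded near the CRC-error cluster retrieves both physical-layer incidents even though none of the words match exactly.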

The AI Inflection Point

Traditional rule-based automation can only handle known patterns. AI changes this in three ways — and not just through LLMs.

1. Understanding unstructured data — Syslog messages, vendor-specific CLI output, natural language ticket descriptions. NLP and LLMs turn previously unparseable data into structured, actionable signals.

2. Reasoning beyond rules — Statistical anomaly detection catches patterns that no human would write rules for. Vector similarity finds "we've seen something like this before" without exact matches. LLMs reason about novel failures using general knowledge and retrieved context.

3. Organizational knowledge at every decision — RAG over runbooks, design documents, and postmortems brings institutional knowledge to bear on every decision. Tribal knowledge becomes shared infrastructure.
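The retrieval half of point 3 can be illustrated without any model at all: score runbook snippets against an incident description and attach the best match to the decision context. A real pipeline would use embeddings and a vector store rather than word overlap, and the runbook snippets here are invented:

```python
# Toy retrieval over runbook snippets (invented text). A production RAG
# pipeline would embed these and query a vector store instead.
RUNBOOK = [
    "If a BGP peer flaps repeatedly, enable route dampening and check the optic.",
    "For disk pressure on OpenObserve nodes, expand the volume before purging.",
    "Postmortem 2023-11: NATS consumer lag caused delayed alert delivery.",
]

def retrieve(query, k=1):
    """Snippets ranked by simple word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(RUNBOOK,
                    key=lambda doc: len(q & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]
```

Whatever the retrieval mechanism, the effect is the same: the snippet an experienced operator would have remembered is placed in front of the model (or the human) at decision time.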

The goal is not a network that pages a human faster. It's a network that heals itself.