1.0 SONAR Low-Level Architecture

Low-level architecture requirements for the SONAR (SIEM-Oriented Neural Anomaly Recognition) subsystem.

1.1 SONAR training pipeline sequence LARC-019

Training workflow sequence diagram

The training pipeline retrieves historical alerts from Wazuh Indexer, extracts features, and trains the MVAD model.

UML Diagram

Key operations

Component          Operation          Input                  Output
Scenario Loader    Parse YAML         Scenario file path     UseCase object
Data Provider      Fetch alerts       Time range, filters    Raw alerts (JSON)
Feature Engineer   Extract features   Raw alerts             Time-series DataFrame
MVAD Engine        Train model        Time-series data       Trained model object
File System        Persist model      Model object           Model file path

Error handling

  • Insufficient data: Warns if sample count < minimum threshold
  • Missing fields: Uses default values or raises validation error
  • API failures: Retries with exponential backoff
  • Model persistence: Validates write permissions before training
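
The retry behaviour listed under "API failures" can be illustrated with a minimal sketch. The function name, the attempt count, the base delay, and the choice of ConnectionError as the retryable error are all assumptions for illustration; the source does not specify them.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,          # assumed value, not from the source
    base_delay_s: float = 0.5,      # assumed value, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            sleep(base_delay_s * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.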

Related documentation

  • Training sequence diagram: docs/manual/sonar_docs/uml-diagrams.md#training-workflow

Parent links: SRS-038 Joint Host-Network Training, SRS-048 Default Detector Training

Child links: SWD-022 SONAR class structure and relationships, SWD-023 SONAR feature engineering design, SWD-024 SONAR data shipping design, SWD-025 SONAR debug mode design

1.2 SONAR detection pipeline sequence LARC-020

Detection workflow sequence diagram

The detection pipeline loads a trained model, processes recent alerts, and generates anomaly scores with optional shipping to data streams.

UML Diagram

Detection modes

Mode         Behavior                   Use Case
historical   Process fixed time range   Batch analysis, validation
realtime     Continuous monitoring      Production deployment
batch        Scheduled execution        Periodic scans

Post-processing steps

  1. Thresholding: Keep only scores above the configured threshold
  2. Consecutive filtering: Require N consecutive anomalies
  3. Enrichment: Add metadata (timestamp, scenario ID, severity)
  4. Formatting: Convert to OpenSearch document format
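
A minimal sketch of the first two post-processing steps follows. It assumes that "consecutive filtering" flags every point in a run of at least N above-threshold scores; the source does not say whether the whole run or only its tail is reported, and the function name is hypothetical.

```python
from typing import List


def flag_anomalies(scores: List[float], threshold: float, min_consecutive: int) -> List[bool]:
    """Flag points inside runs of >= min_consecutive scores above threshold."""
    above = [s > threshold for s in scores]
    flags = [False] * len(scores)
    run_start = None
    # A trailing False sentinel closes a run that reaches the end of the series.
    for i, hit in enumerate(above + [False]):
        if hit and run_start is None:
            run_start = i
        elif not hit and run_start is not None:
            if i - run_start >= min_consecutive:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags
```

With a threshold of 0.5 and N = 2, an isolated spike is dropped while a run of three high scores is kept.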

Related documentation

  • Detection sequence diagram: docs/manual/sonar_docs/uml-diagrams.md#detection-workflow

Parent links: SRS-027 ML-Based Anomaly Detection, SRS-035 Offline Anomaly Detection, SRS-042 Prediction Shipping Feature

Child links: SWD-022 SONAR class structure and relationships, SWD-023 SONAR feature engineering design, SWD-024 SONAR data shipping design, SWD-025 SONAR debug mode design

2.0 RADAR Low-Level Architecture

Low-level architecture requirements for the RADAR (Real-time Alert Detection and Automated Response) subsystem.

2.1 RADAR scenario setup flow LARC-015

The diagram below depicts the RADAR scenario setup flow.

RADAR scenario setup

Parent links: HARC-004 RADAR architecture, HARC-005 RADAR Automated Test Framework architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing

Child links: SWD-018 RATF: ingestion phase, SWD-019 RATF: setup phase

2.2 RADAR active response flow LARC-016

The diagram below depicts the RADAR active response flow.

RADAR active response

Parent links: HARC-004 RADAR architecture, HARC-005 RADAR Automated Test Framework architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing

Child links: SWD-019 RATF: setup phase, SWD-020 RATF: simulation phase, SWD-021 RATF: evaluation phase

2.3 RADAR integration with OpenSearch modules LARC-017

The diagram below depicts how RADAR integrates with Wazuh OpenSearch modules.

RADAR integration

Parent links: HARC-004 RADAR architecture, HARC-005 RADAR Automated Test Framework architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing, SRS-054 RADAR automated test framework

2.4 RADAR logical flow LARC-018

The diagram below depicts the logical flow of RADAR.

RADAR logical flow

Parent links: HARC-004 RADAR architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing

2.5 RADAR risk engine calculation flow LARC-021

The diagram below depicts the risk calculation flow implemented in the RADAR active response system. This flow combines three detection paradigms into a unified, normalized risk score that drives automated response actions.

Mathematical foundation

The risk calculation follows the formula:

$$R = w_A \cdot A + w_S \cdot S + w_T \cdot T$$

Where:

  • A (Anomaly intensity) = $G \cdot C$, where G is anomaly grade and C is confidence from OpenSearch RCF or SONAR MVAD
  • S (Signature risk) = $L \cdot I$, where L is likelihood and I is impact from rule-based detection
  • T (CTI score) = $1 - \prod_{i=1}^{n}(1 - \omega_i)$, aggregated over the n CTI indicator weights $\omega_i$

Default weights (configurable in ar.yaml):

  • $w_A = 0.4$ (behavioral → high information value)
  • $w_S = 0.4$ (signature → high precision)
  • $w_T = 0.2$ (CTI → confirmatory)

Tier determination

Risk scores map to response tiers:

  • Low (0.0 ≤ R < 0.33): Email notification only
  • Medium (0.33 ≤ R < 0.66): Email + case creation + light mitigation
  • High (0.66 ≤ R ≤ 1.0): Full notification + case + strong containment

Flow sequence

  1. Input collection: Extract AD outputs (G, C), signature values (L, I), CTI flags
  2. Component calculation: Compute A, S, T from inputs
  3. Weighted combination: Apply weights to compute R
  4. Tier assignment: Map R to Low/Medium/High based on thresholds
  5. Action selection: Determine response actions based on tier and scenario configuration
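
The calculation and tier mapping above can be sketched in a few lines. Function and parameter names are hypothetical; the default weights and the 0.33 / 0.66 boundaries are the ones stated in this section, and all inputs are assumed to be already normalized to [0, 1].

```python
from math import prod
from typing import Iterable


def cti_score(indicator_weights: Iterable[float]) -> float:
    """T = 1 - prod(1 - w_i); with no indicators the score is 0."""
    return 1.0 - prod(1.0 - w for w in indicator_weights)


def risk_score(grade: float, confidence: float, likelihood: float, impact: float,
               indicator_weights: Iterable[float],
               w_a: float = 0.4, w_s: float = 0.4, w_t: float = 0.2) -> float:
    a = grade * confidence            # anomaly intensity A = G * C
    s = likelihood * impact           # signature risk S = L * I
    t = cti_score(indicator_weights)  # CTI score T
    return w_a * a + w_s * s + w_t * t


def tier(r: float) -> str:
    """Map R to the Low/Medium/High tiers listed above."""
    if r < 0.33:
        return "Low"
    if r < 0.66:
        return "Medium"
    return "High"
```

For example, G = 0.8, C = 0.9, L = 0.8, I = 0.6 and two CTI indicators of weight 0.5 give A = 0.72, S = 0.48, T = 0.75, so R = 0.288 + 0.192 + 0.15 = 0.63, which falls in the Medium tier.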

See also: /radar/scenarios/active_responses/ar.yaml for configuration schema and /docs/manual/radar_docs/radar-risk-math.md for detailed mathematical specification.

Parent links: HARC-012 RADAR risk engine architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing, SRS-054 RADAR automated test framework, SRS-055 RADAR scenario: Geo-IP AC via whitelisting, SRS-056 RADAR scenario: log size change, SRS-057 RADAR scenario: ransomware, SRS-058 RADAR scenario: DLP2 - network data exfiltration

Child links: SWD-026 RADAR risk engine implementation design

2.6 RADAR detector creation workflow LARC-022

The diagram below depicts the sequence of operations for creating and starting OpenSearch anomaly detectors via the detector.py module.

Workflow stages

1. Configuration loading

  • Read scenario definition from config.yaml
  • Load environment variables from .env (OS_URL, OS_USER, OS_PASS, OS_VERIFY_SSL)
  • Validate scenario exists in configuration

2. Detector existence check

  • Query OpenSearch AD API: GET /_plugins/_anomaly_detection/detectors/_search
  • Search by name pattern: {scenario}_DETECTOR
  • If found, return existing detector ID (idempotent operation)

3. Detector specification building

Construct detector JSON specification including:

  • indices: Index pattern to monitor (e.g., wazuh-ad-log-volume-*)
  • time_field: Timestamp field for time-series analysis (@timestamp)
  • feature_attributes: Aggregation queries from config (e.g., max(data.log_bytes))
  • detection_interval: How often to run detection (minutes)
  • window_delay: Buffer time for late-arriving data (minutes)
  • category_field: Field for high-cardinality detection (e.g., agent.name for per-endpoint baselines)
  • shingle_size: Temporal sequence window size for RCF algorithm
  • result_index: Custom index for storing detection results

4. Detector creation

  • POST detector specification to OpenSearch AD plugin
  • Endpoint: POST /_plugins/_anomaly_detection/detectors
  • Receive detector ID in response

5. Detector activation

  • Start the detector to begin analysis
  • Endpoint: POST /_plugins/_anomaly_detection/detectors/{detector_id}/_start
  • Detector begins processing data at configured intervals

6. Output

  • Return detector ID to stdout for pipeline chaining
  • Used by monitor.py in subsequent workflow stage

Key functions

find_detector_id(detector_name: str) -> str | None
detector_spec(scenario_config: dict) -> dict
create_detector(spec: dict) -> str
start_detector(detector_id: str) -> None

Detector Creation Sequence

UML Diagram

Implementation reference

See radar/anomaly_detector/detector.py for implementation details.

Parent links: HARC-004 RADAR architecture, HARC-012 RADAR risk engine architecture, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-056 RADAR scenario: log size change

Child links: SWD-028 RADAR detector module design

2.7 RADAR monitor and webhook workflow LARC-023

The diagram below depicts the sequence for creating OpenSearch monitors and webhook notification channels via monitor.py and webhook.py modules.

Workflow overview

The monitor workflow ensures anomaly detection results trigger automated responses when thresholds are exceeded. Monitors continuously evaluate detector outputs and send structured notifications to a webhook endpoint, which integrates with Wazuh's rule engine.

Workflow stages

1. Webhook destination setup

Execute ensure_webhook() from webhook.py:

  • Query existing notification destinations: GET /_plugins/_notifications/configs
  • Search for webhook by name pattern
  • If not found, create new webhook destination:

    • Endpoint: POST /_plugins/_notifications/configs
    • Configuration: Custom webhook type, POST method, webhook URL from environment
    • Return webhook destination ID

2. Monitor existence check

  • Query existing monitors: GET /_plugins/_alerting/monitors/_search
  • Search by name pattern: {scenario}_Monitor
  • If found, return existing monitor ID (idempotent operation)

3. Monitor specification building

Construct monitor JSON including:

  • Schedule: Evaluation frequency (defaults to detector_interval if monitor_interval not specified)
  • Inputs: Query detector's result index for recent anomalies
  • Triggers: Condition evaluating anomaly scores
  • Actions: Webhook notification when triggered

4. Trigger condition configuration

Default trigger logic:

anomaly_grade > threshold AND confidence > threshold

Where thresholds are defined in scenario configuration (typical values: 0.3-0.5 for balanced sensitivity/precision).

5. Webhook action specification

Notification payload includes:

{
  "monitor": {"name": "{{ctx.monitor.name}}"},
  "trigger": {"name": "{{ctx.trigger.name}}"},
  "entity": "{{ctx.results.0.hits.hits.0._source.entity.0.value}}",
  "periodStart": "{{ctx.periodStart}}",
  "periodEnd": "{{ctx.periodEnd}}"
}

6. Monitor creation

  • Create monitor via OpenSearch Alerting API
  • Endpoint: POST /_plugins/_alerting/monitors
  • Monitor begins evaluating detector results at configured intervals

7. Output

  • Return monitor ID to stdout
  • Monitor continuously watches detector and triggers webhook on anomalies

Key functions

# webhook.py
notif_find_id(webhook_name: str) -> str | None
notif_create(webhook_url: str) -> str
ensure_webhook() -> str

# monitor.py
find_monitor_id(monitor_name: str) -> str | None
monitor_payload(detector_id: str, webhook_id: str, config: dict) -> dict
create_monitor(payload: dict) -> str

Integration flow

Monitor → Detector Results → Evaluate Threshold → Webhook POST → /var/log/ad_alerts.log → Wazuh Rules → Active Response

Monitor and Webhook Sequence

UML Diagram

Implementation reference

See implementation details:

Parent links: HARC-004 RADAR architecture, HARC-012 RADAR risk engine architecture, LARC-022 RADAR detector creation workflow, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-056 RADAR scenario: log size change

Child links: SWD-029 RADAR monitor and webhook module design, SWD-035 RADAR webhook service design

2.8 RADAR Ansible deployment pipeline flow LARC-024

The diagram below depicts the end-to-end deployment flow orchestrated by build-radar.sh and Ansible playbooks. The pipeline automates infrastructure setup, scenario configuration, and service initialization across multiple deployment modes.

Pipeline stages

1. Mode selection and validation

  • Parse command-line arguments: ./build-radar.sh <scenario> --manager <local|remote> --agent <local|remote>
  • Validate deployment mode combination
  • Set Ansible variables: manager_mode, agent_mode, scenario_name
  • Load inventory configuration from inventory.yaml

2. Volume resolution

The volume-first architecture maps Wazuh configuration directories to host paths:

  • Parse volumes.yml Docker Compose configuration
  • Extract bind mount mappings (container path → host path)
  • Key volumes:

    • /var/ossec/etc → /radar-srv/wazuh/manager/etc
    • /var/ossec/active-response/bin → /radar-srv/wazuh/manager/active-response/bin
    • /etc/filebeat → /radar-srv/wazuh/manager/filebeat/etc
  • Derive host-side file paths for direct manipulation

3. Infrastructure deployment

  • Deploy Wazuh core stack (if manager_mode != existing):

    • Wazuh Manager container
    • Wazuh Indexer (OpenSearch)
    • Wazuh Dashboard
  • Deploy Wazuh agents (if agent_mode == local):

    • Agent Docker containers
    • Register with manager
  • Install OpenSearch AD plugin in indexer

4. Scenario configuration injection

For the specified scenario, inject configurations via Ansible roles:

Decoders (/var/ossec/etc/decoders/):

  • Copy scenario-specific XML decoders
  • Marker-based appending to avoid duplicates
  • Set ownership: root:wazuh, permissions: 640

Rules (/var/ossec/etc/rules/):

  • Copy scenario-specific XML rules
  • Use unique markers: <!-- BEGIN RADAR {scenario} --> / <!-- END RADAR {scenario} -->
  • Idempotent insertion (skip if marker exists)

Active Response (/var/ossec/active-response/bin/):

  • Copy radar_ar.py and dependencies
  • Set executable permissions: 750
  • Copy ar.yaml configuration

OSSEC Configuration (/var/ossec/etc/ossec.conf):

  • Inject localfile, command, and active_response blocks
  • Use marker-based insertion for idempotency
  • Configure log monitoring and response commands

Filebeat Pipelines (if applicable):

  • Configure ingest pipelines for data enrichment
  • Set up index templates

5. RADAR Helper deployment (if scenario requires)

For geographic-enrichment scenarios (geoip_detection, suspicious_login):

  • Copy radar-helper.py to /opt/radar/ on agent hosts
  • Install Python dependencies (maxminddb)
  • Copy MaxMind GeoLite2 databases to /usr/share/GeoIP/
  • Deploy systemd service: radar-helper.service
  • Start and enable service

6. Service management

  • Restart Wazuh Manager: /var/ossec/bin/wazuh-control restart
  • Reload Filebeat configuration
  • Verify service health

7. Build radar-cli container

  • Build Docker image with detector.py, monitor.py, webhook.py
  • Load environment variables from .env
  • Image used by run-radar.sh for detector/monitor setup

Idempotency mechanisms

  • File checksums: Compare content before copying to avoid unnecessary operations
  • Marker-based injection: Scenario-specific markers prevent duplicate configuration
  • Conditional logic: Check for existing resources before creation
  • State tracking: Ansible facts maintain deployment state
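
The marker-based mechanism can be illustrated outside Ansible with a short Python sketch. The pipeline itself performs this step through Ansible roles; the function below is a hypothetical stand-in that only shows the skip-if-marker-exists logic, and it appends the block at the end of the text rather than at a specific position inside an XML document.

```python
def inject_block(config_text: str, scenario: str, block: str) -> str:
    """Append `block` wrapped in scenario markers unless the marker already exists."""
    begin = f"<!-- BEGIN RADAR {scenario} -->"
    end = f"<!-- END RADAR {scenario} -->"
    if begin in config_text:
        return config_text  # already injected: leave the file untouched
    separator = "" if not config_text or config_text.endswith("\n") else "\n"
    return f"{config_text}{separator}{begin}\n{block.rstrip()}\n{end}\n"
```

Running the function twice with the same scenario returns identical text, which is the property the deployment relies on when a playbook is re-run.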

Diagram: Ansible deployment pipeline showing mode selection, infrastructure setup, scenario injection, and service management. See assets/RADAR-ansible-deployment-flow.md for detailed documentation.

Deployment modes

manager_mode    agent_mode   Execution context
docker_local    N/A          Docker on controller host
docker_remote   N/A          Docker on remote host via SSH
host_remote     N/A          Bare metal installation via SSH

All modes accept local or remote agents independently.

Note: Diagram placeholder - to be created in Phase 7 showing flowchart: mode selection → volume resolution → infrastructure → config injection → helper deploy → service restart.

See also:

  • /radar/roles/wazuh_manager/tasks/main.yml for playbook implementation
  • /radar/build-radar.sh for orchestration script
  • /docs/manual/radar_docs/radar-manager-ansible-playbook.md for detailed documentation

Parent links: HARC-006 RADAR deployment: Remote Agent and Remote Manager mode, HARC-007 RADAR deployment: Remote Agent and Local Manager mode, HARC-008 RADAR deployment: Local Agent and Local Manager mode, HARC-013 RADAR Ansible automation architecture

Child links: SWD-032 RADAR configuration management design, SWD-031 RADAR Ansible role architecture

2.9 RADAR helper enrichment pipeline LARC-025

The diagram below depicts the real-time log enrichment pipeline implemented by the RADAR Helper service running on Wazuh agents.

Pipeline overview

The RADAR Helper is a multi-threaded Python daemon that monitors authentication logs, enriches them with geographic and behavioral context, and writes enhanced logs for Wazuh to ingest. This enrichment enables signature-based detection of geographic anomalies and impossible travel scenarios.

Pipeline stages

1. Log monitoring

AuthLogWatcher continuously monitors /var/log/auth.log:

  • Implements tail-like following with rotation handling
  • Detects SSH authentication events (success and failure)
  • Extracts: username, source IP, timestamp (parsed from the log line via parse_event_ts()), outcome

Note: The timestamp used for all subsequent behavioral calculations is parsed directly from the log line header, not taken from the wall-clock time at processing. This ensures accurate velocity estimates for replayed or delayed log streams.

2. Geographic enrichment

GeoLookup Service queries MaxMind GeoLite2 databases:

  • City database: /usr/share/GeoIP/GeoLite2-City.mmdb
  • ASN database: /usr/share/GeoIP/GeoLite2-ASN.mmdb

Extracted fields:

  • country: ISO 3166-1 alpha-2 country code
  • region: State/province name
  • city: City name
  • asn: Autonomous System Number
  • asn_placeholder_flag: True if ASN lookup failed
  • Geographic coordinates: latitude, longitude (for velocity calculation)

3. User state management

UserState maintains per-user historical data:

  • Last login location (latitude, longitude)
  • Last login timestamp (epoch seconds, sourced from the parsed event timestamp)
  • ASN history (90-day sliding window)

State enables temporal and behavioral analysis:

  • Velocity between consecutive logins
  • ASN novelty detection
  • Country change tracking

4. Behavioral calculations

Geographic velocity:

Algorithm: Calculate velocity between consecutive logins
  Input: previous_location (lat, lon, event_timestamp),
         current_location  (lat, lon, event_timestamp)

  distance_km ← haversine_distance(prev_lat, prev_lon, curr_lat, curr_lon)
  dt_h        ← abs(curr_event_ts - prev_event_ts) / 3600
  if abs(dt_h) < DT_EPS_H (1e-9):
      dt_h ← DT_EPS_H          // guard against near-simultaneous events
  velocity_kmh ← distance_km / dt_h

  Output: velocity_kmh

Note: Velocity is no longer capped at a fixed maximum. The previous 2000 km/h ceiling has been removed. The DT_EPS_H guard (1×10⁻⁹ hours) prevents division by zero for events with identical or near-identical timestamps without distorting physically plausible velocities.

Country change indicator:

Algorithm: Detect country transition
  country_change_i ← (curr_country ≠ prev_country) ? 1 : 0

ASN novelty indicator:

Algorithm: Check if ASN is novel within retention window
  asn_novelty_i ← (curr_asn ∉ user_asn_history[−90d]) ? 1 : 0

5. Enriched log writing

Formatted log line written to /var/log/suspicious_login.log:

timestamp hostname sshd[PID]: radar_outcome="success" radar_user="alice"
  radar_src_ip="203.0.113.42" radar_country="US" radar_region="California"
  radar_city="San Francisco" radar_asn="15169" radar_asn_placeholder_flag="false"
  radar_geo_velocity_kmh="450.23" radar_country_change_i="1"
  radar_asn_novelty_i="0"

6. Wazuh ingestion

Wazuh agent monitors /var/log/suspicious_login.log:

  • Custom decoders extract radar_* fields
  • Rules evaluate conditions (velocity > 900 km/h, country not in whitelist)
  • Active responses trigger on rule matches
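
To make the field extraction concrete, the sketch below parses the radar_* key/value pairs from an enriched line and applies the velocity condition mentioned above. In the pipeline this work is done by Wazuh XML decoders and rules; the Python functions and the regular expression are illustrative assumptions only.

```python
import re
from typing import Dict

# Matches radar_<name>="<value>" pairs as written by the helper.
_FIELD_RE = re.compile(r'(radar_\w+)="([^"]*)"')


def parse_radar_fields(line: str) -> Dict[str, str]:
    """Return all radar_* fields found in an enriched log line."""
    return dict(_FIELD_RE.findall(line))


def exceeds_velocity(fields: Dict[str, str], limit_kmh: float = 900.0) -> bool:
    """Mirror the 'velocity > 900 km/h' rule condition."""
    return float(fields.get("radar_geo_velocity_kmh", "0")) > limit_kmh
```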

Algorithms

Haversine distance (great-circle distance):

Algorithm: Calculate great-circle distance between two geographic points
  Input: lat1, lon1, lat2, lon2 (in degrees)

  R    ← 6371.0088  // Earth mean radius in km
  Δlat ← radians(lat2 - lat1)
  Δlon ← radians(lon2 - lon1)

  a ← sin²(Δlat/2) + cos(radians(lat1)) × cos(radians(lat2)) × sin²(Δlon/2)
  c ← 2 × atan2(√a, √(1-a))
  distance ← R × c

  Output: distance (in kilometers)
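
A direct Python rendering of the haversine and velocity algorithms might look as follows. The constants are the ones given in this section; the function names and the tuple layout are assumptions, and the clamp on `a` is an added safeguard against floating-point rounding.

```python
from math import atan2, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # Earth mean radius
DT_EPS_H = 1e-9              # guard against near-simultaneous events


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    a = min(1.0, max(0.0, a))  # keep sqrt arguments valid despite rounding
    return EARTH_RADIUS_KM * 2 * atan2(sqrt(a), sqrt(1 - a))


def velocity_kmh(prev: tuple, curr: tuple) -> float:
    """prev and curr are (lat, lon, event_ts) tuples; event_ts is in epoch seconds."""
    distance_km = haversine_km(prev[0], prev[1], curr[0], curr[1])
    dt_h = max(abs(curr[2] - prev[2]) / 3600.0, DT_EPS_H)
    return distance_km / dt_h
```

Two logins a quarter of the equator apart and ten hours apart yield roughly 1000 km/h, which would exceed the 900 km/h rule condition.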

ASN history maintenance:

Algorithm: Maintain time-windowed ASN history for user
  Input: user_state, retention_window_sec (default: 90 × 24 × 3600 = 7,776,000 s)

  cutoff_timestamp ← current_event_ts - retention_window_sec

  for each (asn, timestamp) in user_state.asn_history:
    if timestamp < cutoff_timestamp:
      remove (asn, timestamp) from user_state.asn_history
      if asn not referenced by any remaining entry:
        remove asn from user_state.asn_set

  Output: updated user_state with pruned history and set
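
The pruning and novelty check can be sketched with a deque. This is a simplified stand-in for UserState: the class name is hypothetical, it assumes events arrive in non-decreasing timestamp order, and it derives membership from the entries directly instead of keeping a separate asn_set.

```python
from collections import deque
from typing import Deque, Tuple

RETENTION_S = 90 * 24 * 3600  # 7,776,000 s


class AsnHistory:
    def __init__(self, retention_s: int = RETENTION_S) -> None:
        self.retention_s = retention_s
        self.entries: Deque[Tuple[str, float]] = deque()  # (asn, event_ts), oldest first

    def prune(self, current_event_ts: float) -> None:
        """Drop entries older than the retention window."""
        cutoff = current_event_ts - self.retention_s
        while self.entries and self.entries[0][1] < cutoff:
            self.entries.popleft()

    def observe(self, asn: str, event_ts: float) -> int:
        """Return asn_novelty_i (1 if unseen in the window), then record the login."""
        self.prune(event_ts)
        novel = 0 if any(seen == asn for seen, _ in self.entries) else 1
        self.entries.append((asn, event_ts))
        return novel
```

An ASN seen once becomes novel again after it ages out of the 90-day window.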

Timestamp parsing (parse_event_ts):

Algorithm: Parse log line timestamp to Unix epoch float
  Input: ts_str (string)

  if ts_str matches ISO 8601 pattern (contains "T" and timezone offset):
    return datetime.fromisoformat(ts_str).timestamp()

  if ts_str matches syslog pattern (e.g., "Jan 15 10:30:00"):
    reconstruct datetime using current year and local timezone
    return datetime(...).timestamp()

  fallback:
    return time.time()

  Output: float (Unix epoch seconds)
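
One possible Python form of this parser is shown below. It is a sketch rather than the radar-helper.py code: the regular expression, the handling of a trailing "Z", and the reliance on English month abbreviations (the `%b` directive is locale-dependent) are assumptions.

```python
import re
import time
from datetime import datetime

_SYSLOG_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}$")


def parse_event_ts(ts_str: str) -> float:
    """Parse an ISO 8601 or syslog-style timestamp to Unix epoch seconds."""
    ts_str = ts_str.strip()
    if "T" in ts_str:
        try:
            # Older Python versions do not accept a trailing "Z" in fromisoformat.
            parsed = datetime.fromisoformat(ts_str.replace("Z", "+00:00"))
            if parsed.tzinfo is not None:
                return parsed.timestamp()
        except ValueError:
            pass
    if _SYSLOG_RE.match(ts_str):
        try:
            year = datetime.now().year  # syslog headers carry no year
            parsed = datetime.strptime(f"{year} {ts_str}", "%Y %b %d %H:%M:%S")
            return parsed.timestamp()   # naive datetime is read as local time
        except ValueError:
            pass
    return time.time()  # fallback: processing time
```

Because the syslog branch assumes the current year, lines replayed across a year boundary would be misdated; a production parser may need an explicit year hint.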

Multi-threading design

  • Main thread: Creates a RadarLogger singleton (RADAR_LOG) at startup, instantiates AuthLogWatcher with it, and manages the watcher lifecycle
  • RadarLogger: Encapsulates all logger setup; provides a shared debug logger and a factory method (build_out_logger) for per-watcher output loggers
  • Watcher threads: One per log file; AuthLogWatcher is active in production; AuditLogWatcher is defined as a stub for future use
  • Graceful shutdown: Stop event signaling and thread join with 5-second timeout on exit
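
The stop-event and join pattern can be sketched as follows. The Watcher class here is a placeholder rather than the actual AuthLogWatcher, and the polling interval is an assumed value; only the 5-second join timeout comes from the description above.

```python
import threading
from typing import List


class Watcher(threading.Thread):
    """Placeholder watcher that loops until the shared stop event is set."""

    def __init__(self, stop_event: threading.Event, poll_s: float = 0.05) -> None:
        super().__init__(daemon=True)
        self.stop_event = stop_event
        self.poll_s = poll_s
        self.iterations = 0

    def run(self) -> None:
        # Event.wait returns True as soon as the stop event is set.
        while not self.stop_event.wait(self.poll_s):
            self.iterations += 1  # stands in for "read new log lines"


def shutdown(watchers: List[Watcher], stop_event: threading.Event,
             timeout_s: float = 5.0) -> bool:
    """Signal all watchers to stop and join each with a timeout."""
    stop_event.set()
    for watcher in watchers:
        watcher.join(timeout=timeout_s)
    return all(not watcher.is_alive() for watcher in watchers)
```

Using `Event.wait` as the loop delay lets a watcher react to shutdown immediately instead of finishing a full sleep first.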

Implementation reference

See radar/radar-helper/radar-helper.py for complete implementation.

See also

  • /docs/manual/radar_docs/radar-scenarios/geoip_detection_explained.md for usage in GeoIP scenario
  • /docs/manual/radar_docs/radar-scenarios/suspicious_login_explained.md for usage in suspicious login scenario

Parent links: HARC-014 RADAR helper architecture, SRS-051 RADAR scenario: suspicious login, SRS-055 RADAR scenario: Geo-IP AC via whitelisting

Child links: SWD-030 RADAR helper module class design, SWD-032 RADAR configuration management design

2.10 RADAR active response decision pipeline LARC-026

The diagram below depicts the comprehensive decision-making pipeline implemented in radar_ar.py, which orchestrates automated threat response based on risk-aware analysis.

This expands on LARC-016 (RADAR active response flow) with detailed implementation logic, scenario identification, context collection, and tiered action planning.

Pipeline stages

1. Alert intake

  • Read Wazuh alert JSON from stdin
  • Parse alert structure: rule ID, level, groups, agent info, timestamp, data fields
  • Validate alert completeness
  • Log alert reception

2. Scenario identification

ScenarioIdentifier maps rule IDs to scenarios:

  • Load ar.yaml configuration
  • Iterate through scenario definitions
  • Match alert rule ID to configured rule mappings
  • Determine detection type: signature or ad (anomaly detection)
  • Return scenario context: {name, detection, alert, config}

Exit if no scenario matches (no-op, log warning).

3. Scenario-specific behavior resolution

BaseScenario and subclasses determine:

Time window resolution:

  • AD-based: Use delta_ad_minutes (default: 10) from config
  • Signature-based: Use delta_signature_minutes (default: 1) from config
  • Window: [alert_timestamp - delta, alert_timestamp]

Effective agent resolution:

  • AD-based: Return None (query all agents for per-entity baselines)
  • Signature-based: Return alert.agent.name (scope to triggering agent)

AD score extraction:

  • Extract anomaly grade and confidence from alert data fields
  • Handle scenario-specific field names

4. Context collection

Query OpenSearch for correlated events within time window:

Query specification:
  time_range: [alert_timestamp - window_delta, alert_timestamp]
  agent_filter: effective_agent (if signature-based) OR all agents (if AD-based)

Algorithm: Extract IOCs from alert and correlated events
  For each event:
    Extract IP addresses from: srcip, dstip, data.srcip, data.dstip
    Extract usernames from: srcuser, dstuser, data.srcuser, data.dstuser
    Extract hashes from: data.md5, data.sha256
    Extract domains from: data.url, data.hostname

  Return: deduplicated IOC collection
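
The extraction step might be written as below. The field paths are the ones listed above; the function names and the grouping keys are hypothetical, and the sketch stores raw field values without parsing hostnames out of URLs.

```python
from typing import Any, Dict, Iterable, Optional, Set

IOC_FIELDS = {
    "ips": ("srcip", "dstip", "data.srcip", "data.dstip"),
    "users": ("srcuser", "dstuser", "data.srcuser", "data.dstuser"),
    "hashes": ("data.md5", "data.sha256"),
    "domains": ("data.url", "data.hostname"),
}


def _get(event: Dict[str, Any], dotted: str) -> Optional[Any]:
    """Resolve a dotted path such as 'data.srcip' in a nested dict."""
    node: Any = event
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_iocs(events: Iterable[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Collect deduplicated IOCs from the alert and its correlated events."""
    iocs: Dict[str, Set[str]] = {kind: set() for kind in IOC_FIELDS}
    for event in events:
        for kind, paths in IOC_FIELDS.items():
            for path in paths:
                value = _get(event, path)
                if isinstance(value, str) and value:
                    iocs[kind].add(value)
    return iocs
```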

5. CTI enrichment (SATRAP integration)

For each extracted IOC:

  • Query SATRAP CTI database via REST API
  • Check threat intelligence feeds:
    • IP reputation (blacklists, malicious ASNs)
    • Domain reputation (phishing, malware delivery)
    • Hash reputation (known malware signatures)
    • User compromise flags
  • Aggregate CTI indicators with weights
  • Calculate T score: T = 1 - ∏(1 - wᵢ) over n indicators

6. Risk calculation

Apply risk engine formula (see LARC-021):

Algorithm: Calculate composite risk score
  A ← anomaly_grade × confidence  // AD intensity
  S ← likelihood × impact         // Signature risk (from ar.yaml)
  T ← cti_score                   // CTI aggregation

  R ← w_A × A + w_S × S + w_T × T  // Weighted combination

  Where weights are loaded from the scenario configuration in ar.yaml

7. Decision ID generation

Create unique identifier for idempotency tracking:

Algorithm: Generate deterministic decision ID
  decision_id ← SHA256(alert.id + ":" + alert.timestamp + ":" + scenario_name)
  decision_id ← first_16_hex_chars(decision_id)

Check if decision already processed (prevents duplicate actions on alert re-ingestion).
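
A minimal sketch of the ID generation and duplicate check follows. The hashing matches the algorithm above; the in-memory set is an assumption used only to show the idempotency check, since the source does not describe how processed decisions are stored.

```python
import hashlib
from typing import Set


def decision_id(alert_id: str, alert_timestamp: str, scenario_name: str) -> str:
    """First 16 hex characters of SHA-256 over 'id:timestamp:scenario'."""
    material = f"{alert_id}:{alert_timestamp}:{scenario_name}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


def is_new_decision(seen: Set[str], new_id: str) -> bool:
    """Return True the first time an ID is seen (assumed in-memory store)."""
    if new_id in seen:
        return False
    seen.add(new_id)
    return True
```

Re-ingesting the same alert produces the same ID, so the second pass is recognized and skipped.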

8. Tier assignment and action planning

Map risk score to tier using per-scenario boundaries from ar.yaml:

  • Tier 0 (R < tier1_min): No actions; audit log entry only
  • Tier 1 (tier1_min <= R < tier1_max): email + DECIPHER incident creation
  • Tier 2 (tier1_max <= R < tier2_max): Tier 1 actions + mitigations_tier2 (if allow_mitigation)
  • Tier 3 (R >= tier2_max): Tier 1 actions + mitigations_tier3 (if allow_mitigation)

DECIPHER incident creation (FlowIntel case) runs in run() before action planning, gated on tier >= 1 and DECIPHER health check.

Apply configuration flag from ar.yaml:

  • allow_mitigation: Enable mitigation execution at Tier 2 and Tier 3

Apply safety gates:

  • Production environment check
  • Whitelist verification
  • Rate limiting (max actions per time window)
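
The tier mapping and the allow_mitigation flag can be sketched as shown here. Function names and the "email" / "case" action labels are assumptions (the labels mirror the audit log example in stage 10), and the safety gates are deliberately left out.

```python
from typing import Any, Dict, List, Tuple


def assign_tier(r: float, tiers: Dict[str, float]) -> int:
    """Map a risk score to tier 0-3 using per-scenario boundaries."""
    if r < tiers["tier1_min"]:
        return 0
    if r < tiers["tier1_max"]:
        return 1
    if r < tiers["tier2_max"]:
        return 2
    return 3


def plan_actions(r: float, scenario_cfg: Dict[str, Any]) -> Tuple[int, List[str]]:
    """Return (tier, planned actions); safety gates are not modelled here."""
    tier = assign_tier(r, scenario_cfg["tiers"])
    if tier == 0:
        return tier, []  # audit log entry only
    actions = ["email", "case"]
    if tier >= 2 and scenario_cfg.get("allow_mitigation", False):
        actions += scenario_cfg.get(f"mitigations_tier{tier}", [])
    return tier, actions
```

With boundaries of 0.0, 0.33, and 0.66, a score of 0.75 lands in Tier 3 and adds the configured Tier 3 mitigations to the Tier 1 actions.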

9. Action execution

Email notification:

  • Format alert summary with risk score and tier
  • Include IOCs and CTI hits
  • Send via SMTP (credentials from .env)

DECIPHER incident creation (tier >= 1, if DECIPHER reachable):

  • Call DecipherClient.create_incident(decision)
  • DECIPHER creates FlowIntel case and returns case_id and case_url
  • Case URL included in email notification

Mitigation actions (via Wazuh API):

  • Firewall drop: PUT /active-response?agents_list={agent_id} with command firewall_drop and IP argument
  • Account disable: Active response command to disable compromised user
  • Service termination: Terminate the process connected to the malicious IP, or stop the malicious service

10. Audit logging

Write structured JSON to /var/ossec/logs/active-responses.log:

{
  "timestamp": "...",
  "decision_id": "...",
  "scenario": "...",
  "rule_id": "...",
  "risk_score": 0.75,
  "tier": 2,
  "actions_planned": ["email", "case", "firewall_drop"],
  "actions_executed": ["email", "case", "firewall_drop"],
  "execution_results": {...},
  "iocs": [...],
  "cti_hits": [...]
}

11. Exit

Return exit code:

  • 0: Success (scenario processed)
  • 1: No scenario match or alert invalid
  • 2: Critical exception

Error handling

  • Transient failures: Retry with exponential backoff (OpenSearch, CTI queries)
  • Non-critical failures: Log and continue (e.g., email send failure doesn't block case creation)
  • Critical failures: Abort pipeline, log, return error code

Configuration schema (ar.yaml)

scenarios:
  geoip_detection:
    ad:
      rule_ids: []
    signature:
      rule_ids: ["100900", "100901"]
    w_ad: 0.0
    w_sig: 0.6
    w_cti: 0.4
    delta_signature_minutes: 1
    signature_impact: 0.6
    signature_likelihood: 0.8
    tiers:
      tier1_min: 0.0
      tier1_max: 0.33
      tier2_max: 0.66
    allow_mitigation: true
    mitigations_tier2:
      - firewall-drop
    mitigations_tier3:
      - firewall-drop

Active Response Sequence

UML Diagram

Implementation reference

See radar/scenarios/active_responses/radar_ar.py for complete implementation.

See also

  • /docs/manual/radar_docs/radar-active-response.md for comprehensive documentation
  • LARC-016 for simplified flow diagram

Parent links: HARC-004 RADAR architecture, HARC-012 RADAR risk engine architecture, HARC-013 RADAR Ansible automation architecture, LARC-021 RADAR risk engine calculation flow, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-053 RADAR scenario: malware C2 beaconing, SRS-054 RADAR automated test framework, SRS-055 RADAR scenario: Geo-IP AC via whitelisting, SRS-056 RADAR scenario: log size change, SRS-057 RADAR scenario: ransomware, SRS-058 RADAR scenario: DLP2 - network data exfiltration

Child links: SWD-026 RADAR risk engine implementation design, SWD-027 RADAR active response script design, SWD-037 RADAR-SONAR integration design, SWD-038 RADAR-DECIPHER FlowIntel integration design

2.11 RADAR data ingestion pipeline LARC-027

The diagram below depicts the data ingestion workflow orchestrated by run-radar.sh and implemented by scenario-specific wazuh_ingest.py scripts.

Pipeline purpose

Data ingestion generates synthetic time-series data for training OpenSearch RCF anomaly detectors. Each behavior-based scenario requires historical baseline data to establish normal patterns before real-time detection begins.

Workflow stages

1. Script invocation

Execute from run-radar.sh:

Command: Execute scenario ingestion script in containerized environment
  Container: radar-cli:latest
  Environment: Load OS_URL, OS_USER, OS_PASS, OS_VERIFY_SSL from .env
  Volumes: Mount scenarios directory
  Script: /app/scenarios/ingest_scripts/{scenario}/wazuh_ingest.py

2. Baseline query (scenario-dependent)

Log volume scenario:

  • Query last 10 minutes of existing data from wazuh-ad-log-volume-*
  • Retrieve 2 most recent documents
  • Calculate delta between log byte values
  • Fallback: 20,000 bytes if insufficient data

Suspicious login scenario:

  • Query user authentication patterns
  • Analyze login frequency distribution
  • Determine normal session intervals

3. Time series generation

Log volume algorithm:

Algorithm: Generate realistic time series baseline
  Input: baseline_value, delta, lookback_minutes, sampling_interval_seconds

  total_points ← (lookback_minutes × 60) / sampling_interval_seconds
  start_time ← current_time - (lookback_minutes × 60)  // lookback converted to seconds

  for i from 0 to total_points - 1:
    timestamp ← start_time + (i × sampling_interval_seconds)
    value ← baseline_value + (delta × i / total_points)  // Linear progression

    document ← {
      "@timestamp": timestamp (ISO8601 format),
      "agent": {"name": agent_name, "id": agent_id},
      "data": {"log_path": path, "log_bytes": value},
      "predecoder": {"program_name": metric_name}
    }

    append document to bulk_data

  Output: bulk_data collection

Example generated data characteristics:

  • Default parameters: 240 minutes lookback, 20-second sampling = 720 points
  • Monotonic trend: Realistic growth from baseline to baseline+delta
  • Proper timestamps: ISO8601 with consistent spacing
  • Scenario-specific fields: Match detector configuration exactly
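
A minimal Python sketch of the log volume algorithm is given here for illustration. It produces exactly total_points documents (720 with the default parameters). The function name `generate_baseline` and the default values for `agent_name`, `agent_id`, `log_path`, and `metric_name` are placeholders chosen for the example, not values confirmed by the ingestion scripts.

```python
from datetime import datetime, timedelta, timezone


def generate_baseline(baseline_value, delta, lookback_minutes=240,
                      sampling_interval_seconds=20, now=None,
                      agent_name="edge.vm", agent_id="001",
                      log_path="/var/log", metric_name="log_volume_metric"):
    """Build a linearly growing series of log volume documents."""
    # Normalise to UTC so the trailing "Z" in the timestamp is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    total_points = (lookback_minutes * 60) // sampling_interval_seconds
    start_time = now - timedelta(minutes=lookback_minutes)

    bulk_data = []
    for i in range(total_points):
        timestamp = start_time + timedelta(seconds=i * sampling_interval_seconds)
        # Linear progression from baseline_value towards baseline_value + delta.
        value = baseline_value + delta * i / total_points
        bulk_data.append({
            "@timestamp": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "agent": {"name": agent_name, "id": agent_id},
            "data": {"log_path": log_path, "log_bytes": value},
            "predecoder": {"program_name": metric_name},
        })
    return bulk_data
```

Passing a fixed `now` makes the output reproducible, which is convenient when the same series has to be checked during verification.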

4. Bulk indexing

OpenSearch Bulk API specification:

  • Endpoint: POST /{index}/_bulk
  • Content-Type: application/x-ndjson
  • Format: Newline-delimited JSON (NDJSON)

Example NDJSON structure:

{"index": {"_index": "wazuh-ad-log-volume-2026-02-16"}}
{"@timestamp": "2026-02-16T10:00:00Z", "agent": {...}, "data": {...}}
{"index": {"_index": "wazuh-ad-log-volume-2026-02-16"}}
{"@timestamp": "2026-02-16T10:00:20Z", "agent": {...}, "data": {...}}

Batch processing:

  • Default batch size: 500-1000 documents per bulk request
  • Error handling: Retry on transient failures (network, temporary unavailability)
  • Progress logging: Document count, timestamp range after each batch
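
The NDJSON layout and batching described above could be produced along the following lines. This is a sketch only: the helper names `batched` and `to_ndjson` and the batch size of 500 are assumptions, and the HTTP call to the Bulk API is deliberately left out.

```python
import json
from typing import Iterable, Iterator, List


def batched(docs: List[dict], batch_size: int = 500) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def to_ndjson(docs: Iterable[dict], index_name: str) -> str:
    """Serialise documents as action/source line pairs for the Bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"
```

Each batch returned by `batched` would be passed to `to_ndjson` and sent as one request with Content-Type application/x-ndjson.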

5. Verification

Algorithm: Verify successful ingestion
  1. Query document count in target index for time range
  2. Compare actual_count with expected_count
  3. Verify earliest and latest timestamps match expected range
  4. Validate field mappings (numeric fields are numbers, not strings)

  if all checks pass:
    return SUCCESS
  else:
    return FAILURE with diagnostic information
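
The four verification checks can be expressed as a function that collects diagnostics instead of stopping at the first failure. The sketch below assumes the counts, timestamps, and a sample of indexed documents have already been fetched from OpenSearch; `verify_ingestion` and `_lookup` are hypothetical names.

```python
from numbers import Number


def _lookup(doc: dict, dotted_path: str):
    """Follow a dotted path such as 'data.log_bytes' through nested dicts."""
    value = doc
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def verify_ingestion(actual_count, expected_count, earliest, latest,
                     expected_start, expected_end, sample_docs,
                     numeric_field="data.log_bytes"):
    """Return (success, problems) for the documented ingestion checks."""
    problems = []
    if actual_count != expected_count:
        problems.append(f"count: expected {expected_count}, found {actual_count}")
    if earliest != expected_start:
        problems.append(f"earliest timestamp: expected {expected_start}, found {earliest}")
    if latest != expected_end:
        problems.append(f"latest timestamp: expected {expected_end}, found {latest}")
    for doc in sample_docs:
        value = _lookup(doc, numeric_field)
        # bool is a Number subclass in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, Number):
            problems.append(f"{numeric_field} is not numeric: {value!r}")
            break
    return (not problems), problems
```

The returned list of problems plays the role of the "diagnostic information" in the FAILURE branch.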

6. Output

  • Log ingestion summary: document count, time range, index name
  • Return success/failure exit code
  • Output used by run-radar.sh to proceed to detector creation

Scenario-specific ingestion patterns

Log volume

  • Index: wazuh-ad-log-volume-*
  • Key field: data.log_bytes (numeric)
  • Pattern: Monotonic increase with realistic deltas
  • Volume: 720 points over 240 minutes

Suspicious login

  • Index: wazuh-archives-* or custom index
  • Key fields: srcuser, srcip, @timestamp, enriched RADAR fields
  • Pattern: Normal login distribution with occasional geographic diversity
  • Volume: Variable based on user count and session patterns

Insider threat (archived)

  • Index: wazuh-archives-*
  • Key fields: File access patterns, data transfer volumes
  • Pattern: Baseline file activity with gradual increase
  • Volume: Per-user baselines

Data quality considerations

  • Timestamp accuracy: Proper ISO8601 format with timezone
  • Field types: Numeric fields as numbers, not strings (critical for aggregations)
  • Index patterns: Match detector configuration exactly
  • Agent consistency: Use consistent agent.name for high-cardinality detection
  • Realistic patterns: Avoid synthetic steps or unrealistic spikes that confuse training

Error handling

  • Connection failures: Retry with exponential backoff
  • Index creation: Auto-create if not exists (OpenSearch default)
  • Mapping conflicts: Log error, fail fast
  • Bulk API errors: Parse response, retry failed documents
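
Two of the error handling points above, parsing the Bulk API response and backing off exponentially, can be illustrated without any network code. The helper names and the retry parameters (1 second base, 30 second cap) are assumptions for the sketch; the response layout (`errors` flag plus one `items` entry per action) follows the Bulk API.

```python
def failed_items(bulk_response: dict) -> list:
    """Return (position, status, error) for every failed bulk action."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is keyed by its action type, for example "index".
        result = next(iter(item.values()), {})
        if result.get("status", 200) >= 300:
            failures.append((position, result.get("status"), result.get("error")))
    return failures


def backoff_delays(max_retries: int = 5, base_seconds: float = 1.0,
                   cap_seconds: float = 30.0) -> list:
    """Delays in seconds for successive retries: base * 2^attempt, capped."""
    return [min(cap_seconds, base_seconds * 2 ** attempt)
            for attempt in range(max_retries)]
```

The positions returned by `failed_items` index into the submitted batch, so only those documents need to be resent on the next attempt.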

Data Ingestion Sequence

UML Diagram

Implementation reference

Scenario-specific implementations:

See also

  • /docs/manual/radar_docs/radar-run-ad.md for workflow documentation

Parent links: HARC-004 RADAR architecture, HARC-014 RADAR helper architecture, LARC-015 RADAR scenario setup flow, SRS-050 RADAR scenario: DLP1 - insider data exfiltration, SRS-051 RADAR scenario: suspicious login, SRS-052 RADAR scenario: DDoS detection, SRS-056 RADAR scenario: log size change

Child links: SWD-033 RADAR data ingestion module design, SWD-037 RADAR-SONAR integration design

2.12 RADAR GeoIP detection scenario flow LARC-028

The diagram below depicts the end-to-end flow for the GeoIP detection scenario, which uses signature-based detection to identify and block authentication attempts from non-whitelisted geographic locations.

Scenario overview

GeoIP detection is a signature-based scenario that does not use anomaly detection. It relies on:

  • Real-time log enrichment via RADAR Helper
  • Custom decoders for field extraction
  • Rules with country whitelist matching
  • Active response for automated notification and optional mitigation

Flow stages

1. Authentication event

  • SSH authentication attempt recorded in /var/log/auth.log on monitored endpoint
  • Event includes: timestamp, outcome (success/failure), username, source IP

2. RADAR Helper enrichment (see LARC-025)

  • AuthLogWatcher detects new authentication event
  • GeoLookup queries MaxMind databases for source IP
  • Enrichment adds: country, region, city, ASN, geo_velocity_kmh, country_change_i, asn_novelty_i
  • Writes enriched log to /var/log/suspicious_login.log

3. Wazuh agent ingestion

  • Wazuh agent monitors /var/log/suspicious_login.log
  • Reads enriched log line
  • Forwards to Wazuh Manager

4. Decoder extraction

  • Custom decoder (0310-ssh.xml) parses enriched log
  • Extracts structured fields: radar_outcome, radar_country, radar_src_ip, radar_user, etc.
  • Populates alert data structure

5. Rule evaluation

Rule 100900: Connection from non-whitelist country (list-based)

  • Condition: radar_outcome="success" AND radar_country NOT IN /var/ossec/etc/lists/whitelist_countries
  • Level: 10
  • Groups: authentication_success, geoip_detection

Rule 100901: Connection from non-whitelist country (hardcoded fallback)

  • Condition: authentication_success AND srcgeoip NOT IN {predefined EU countries}
  • Level: 10
  • Fallback if list-based check unavailable
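
The condition of rule 100900 can be mirrored in a few lines of Python, which is useful for testing whitelist changes offline. This is only an illustration of the logic; in RADAR the check is performed by the Wazuh rule engine, and the function names here are hypothetical. Treating a missing country as non-whitelisted is an assumption of the sketch.

```python
def load_whitelist(text: str) -> set:
    """Parse one country code per line; tolerates 'KEY:value' list entries."""
    codes = set()
    for line in text.splitlines():
        entry = line.split(":", 1)[0].strip().upper()
        if entry:
            codes.add(entry)
    return codes


def rule_100900_matches(event: dict, whitelist: set) -> bool:
    """True for a successful login from a country outside the whitelist."""
    country = event.get("radar_country", "").upper()
    # Assumption: an event without a country is treated as non-whitelisted.
    return event.get("radar_outcome") == "success" and country not in whitelist
```

Failed logins never match, regardless of country, because the rule condition requires radar_outcome="success".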

6. Active response trigger

  • Rule match triggers active response command: radar-ar
  • Wazuh executes: /var/ossec/active-response/bin/radar_ar.py
  • Alert JSON passed via stdin

7. Active response processing (see LARC-026)

  • Scenario identification: Maps rule ID 100900/100901 → geoip_detection scenario
  • Context collection: Query recent authentication events for user
  • CTI enrichment: Check source IP against threat intelligence
  • Risk calculation: Primarily signature-based (w_sig = 0.7, w_cti = 0.3)
  • Tier determination: Based on risk score and configuration
  • Action execution:

    • Low/Medium tier: Email notification with alert details
    • High tier (if mitigation enabled): Email + Flowintel case + firewall block

8. Audit logging

  • Decision recorded in /var/ossec/logs/active-responses.log
  • Includes: scenario, rule ID, risk score, tier, actions executed, IOCs, CTI hits

Configuration

Whitelist (/var/ossec/etc/lists/whitelist_countries):

US
CA
GB
DE
FR
...

ar.yaml configuration:

geoip_detection:
  rules:
    signature: ["100900", "100901"]
  detection_params:
    delta_signature_minutes: 1
  risk_params:
    likelihood: 0.4
    impact: 0.9
    weights: {w_ad: 0.0, w_sig: 0.7, w_cti: 0.3}
  tier_thresholds: {low: 0.33, high: 0.66}
  actions:
    email_enabled: true
    case_creation_enabled: true
    mitigation_enabled: false

Key characteristics

  • Real-time: No training phase required
  • Deterministic: Rule-based matching, no probabilistic scoring
  • Low false positives: The whitelist approach ensures that logins from legitimate geographic regions are allowed
  • Operational flexibility: Whitelist easily updated without retraining
  • Fast response: No detector delays, immediate rule evaluation

See also:

  • /docs/manual/radar_docs/radar-scenarios/geoip_detection_explained.md for detailed documentation
  • /radar/scenarios/decoders/geoip_detection/ for decoder implementations
  • /radar/scenarios/rules/geoip_detection/ for rule definitions

Parent links: HARC-004 RADAR architecture, HARC-014 RADAR helper architecture, LARC-025 RADAR helper enrichment pipeline, LARC-026 RADAR active response decision pipeline, SRS-053 RADAR scenario: malware C2 beaconing, SRS-055 RADAR scenario: Geo-IP AC via whitelisting

Child links: SWD-034 RADAR custom rule and decoder patterns

2.13 RADAR log volume detection scenario flow LARC-029

The diagram below depicts the end-to-end flow for the log volume detection scenario, which uses behavior-based anomaly detection (OpenSearch RCF) to identify abnormal increases in log generation that may indicate attacks, system issues, or data exfiltration.

Scenario overview

Log volume detection is a behavior-based scenario using OpenSearch Anomaly Detection:

  • Monitors log file size growth on endpoints
  • Uses RRCF (Robust Random Cut Forest) algorithm for anomaly detection
  • Per-endpoint baselines via high-cardinality detection
  • Webhook integration routes anomalies to Wazuh rule engine
  • Active response executes tiered actions based on risk score

Flow stages

1. Log collection on agent

  • Wazuh agent runs a localfile command every 20 seconds:

    <localfile>
      <log_format>command</log_format>
      <command>du -sb /var/log | awk '{print $1}'</command>
      <alias>log_volume_metric</alias>
      <frequency>20</frequency>
    </localfile>

  • Command output: log size in bytes for /var/log

2. Log forwarding

  • Agent forwards log to Wazuh Manager
  • Manager applies decoder: local_decoder.xml extracts data.log_bytes field
  • Log indexed to OpenSearch: wazuh-ad-log-volume-* index

3. Historical data ingestion (initial setup, see LARC-027)

  • run-radar.sh executes wazuh_ingest.py
  • Generates 240 minutes of baseline data (720 points)
  • Realistic time series with monotonic growth
  • Bulk indexed to OpenSearch

4. Detector creation (see LARC-022)

  • run-radar.sh executes detector.py
  • Creates OpenSearch AD detector with:

    • Feature: max(data.log_bytes)
    • Category field: agent.name (per-endpoint baselines)
    • Shingle size: 8 (temporal sequence)
    • Detection interval: 5 minutes
    • Window delay: 1 minute

  • Starts detector

5. Real-time anomaly detection

OpenSearch RCF detector runs every 5 minutes:

  • Query index for recent data points per agent
  • Extract feature values: max log bytes in interval
  • Apply RCF model to detect outliers
  • Compute anomaly grade (0-1) and confidence (0-1)
  • Write results to opensearch-ad-plugin-result-log-volume index

6. Monitor evaluation (see LARC-023)

OpenSearch monitor runs every 5 minutes:

  • Query detector result index
  • Evaluate trigger condition: anomaly_grade > 0.3 AND confidence > 0.3
  • If condition met, trigger webhook action

7. Webhook notification

  • Monitor POSTs JSON payload to webhook endpoint: http://manager:8080/notify
  • Payload includes: monitor name, trigger name, entity (agent name), period start/end times

8. Webhook service processing

  • Flask service (ad_alerts_webhook.py) receives POST request
  • Extracts alert details from payload
  • Formats as syslog message
  • Writes to /var/log/ad_alerts.log:

    Feb 16 10:30:15 wazuh-manager opensearch_ad: LogVolume-Growth-Detected entity=edge.vm grade=0.85 confidence=0.92
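
The formatting step of the webhook service could look roughly like the function below. It is a sketch, not the code of ad_alerts_webhook.py: the function name and parameters are hypothetical, and the Flask request handling is omitted. Note that strftime zero-pads the day of month, whereas classic syslog pads single-digit days with a space.

```python
from datetime import datetime


def format_ad_alert(trigger_name: str, entity: str, grade: float,
                    confidence: float, when: datetime,
                    hostname: str = "wazuh-manager") -> str:
    """Render an anomaly alert as a syslog-style line for the opensearch_ad decoder."""
    timestamp = when.strftime("%b %d %H:%M:%S")
    return (f"{timestamp} {hostname} opensearch_ad: {trigger_name} "
            f"entity={entity} grade={grade:.2f} confidence={confidence:.2f}")
```

Keeping the line layout in one function makes it easier to keep the writer and the Wazuh decoder in agreement when fields are added.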

9. Wazuh rule matching

Rule 100300: Generic OpenSearch AD alert

  • Decoder: opensearch_ad extracts fields
  • Rule level: 5
  • Matches any AD alert

Rule 100309: Log Volume Growth specific

  • Parent: Rule 100300
  • Condition: trigger.name = "LogVolume-Growth-Detected"
  • Level: 12
  • Groups: log_volume, anomaly

10. Active response trigger

  • Rule 100309 match triggers radar-ar active response
  • Alert JSON passed to radar_ar.py via stdin

11. Active response processing (see LARC-026)

  • Scenario identification: Maps rule ID 100309 → log_volume scenario
  • Time window: Last 10 minutes (delta_ad_minutes)
  • Context collection: Query correlated events from all agents (high-cardinality)
  • Extract AD scores: anomaly_grade, confidence from alert data
  • CTI enrichment: Check for malicious activity indicators
  • Risk calculation: Primarily AD-based (w_ad = 0.6, w_sig = 0.2, w_cti = 0.2)

    A = grade × confidence = 0.85 × 0.92 = 0.782
    S = likelihood × impact = 0.5 × 0.6 = 0.3
    T = CTI aggregation (varies)
    R = 0.6 × 0.782 + 0.2 × 0.3 + 0.2 × T = 0.5292 + 0.2 × T

  • Tier determination: Based on R value
  • Action execution:
    • Low tier (R < 0.3): Email only
    • Medium tier (0.3 ≤ R < 0.7): Email + Flowintel case
    • High tier (R ≥ 0.7): Email + case + investigation escalation
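
The risk calculation and tier mapping for this scenario can be checked numerically with a short sketch. The function names are hypothetical; the weights, thresholds, and example values are the ones given for log_volume in this section, and the CTI term T is passed in because its aggregation is defined elsewhere.

```python
def risk_score(grade: float, confidence: float, likelihood: float,
               impact: float, cti: float,
               w_ad: float = 0.6, w_sig: float = 0.2, w_cti: float = 0.2) -> float:
    """R = w_ad * A + w_sig * S + w_cti * T with A and S as defined above."""
    a = grade * confidence    # anomaly component A
    s = likelihood * impact   # signature component S
    return w_ad * a + w_sig * s + w_cti * cti


def tier(r: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a risk score to low, medium, or high using the tier thresholds."""
    if r < low:
        return "low"
    if r < high:
        return "medium"
    return "high"
```

For the example alert (grade 0.85, confidence 0.92) the score is 0.5292 when T = 0, which falls in the medium tier; it reaches the high tier only when T is at least about 0.85.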

12. Audit logging

  • Decision recorded with full context: scenario, risk score, tier, actions, anomaly details

Configuration

config.yaml:

log_volume:
  index_prefix: "wazuh-ad-log-volume-*"
  result_index: "opensearch-ad-plugin-result-log-volume"
  time_field: "@timestamp"
  categorical_field: "agent.name"
  detector_interval: 5
  delay_minutes: 1
  shingle_size: 8
  anomaly_grade_threshold: 0.3
  confidence_threshold: 0.3
  features:
    - feature_name: "log_volume_max"
      feature_enabled: true
      aggregation_query:
        log_volume_max:
          max:
            field: "data.log_bytes"

ar.yaml:

log_volume:
  rules:
    ad: ["100309"]
  detection_params:
    delta_ad_minutes: 10
  risk_params:
    likelihood: 0.5
    impact: 0.6
    weights: {w_ad: 0.6, w_sig: 0.2, w_cti: 0.2}
  tier_thresholds: {low: 0.3, high: 0.7}
  actions:
    email_enabled: true
    case_creation_enabled: true
    mitigation_enabled: false

Key characteristics

  • Adaptive: Learns normal patterns per endpoint
  • High-cardinality: Separate baselines per agent prevent statistical masking
  • Streaming: Real-time detection without batch retraining
  • Configurable sensitivity: Threshold tuning balances false positives vs. detection rate
  • Multi-stage pipeline: Decouples detection (OpenSearch) from response (Wazuh)

Implementation reference

See implementation details:

Parent links: HARC-004 RADAR architecture, LARC-022 RADAR detector creation workflow, LARC-023 RADAR monitor and webhook workflow, LARC-026 RADAR active response decision pipeline, SRS-054 RADAR automated test framework, SRS-056 RADAR scenario: log size change

Child links: SWD-034 RADAR custom rule and decoder patterns, SWD-035 RADAR webhook service design

2.15 RADAR adversarial defense implementation flow LARC-031

The diagram below depicts the implementation flow for adversarial ML defense mechanisms in RADAR, protecting anomaly detection systems against data poisoning, evasion attacks, and model tampering.

Defense layers

Layer 1: Baseline initialization with clean data

Implementation:

  • Clean period identification: Analyze historical data for known-clean periods (pre-incident, honeypot-free)
  • Gold-standard datasets: Use verified attack-free data for initial training
  • Digital clean room: Temporary system lockdown to capture pristine baselines
  • Exclusion filtering: Remove time segments with suspected attacker presence

Configuration:

baseline_init:
  use_gold_standard: true
  clean_period_start: "2026-01-01T00:00:00Z"
  clean_period_end: "2026-01-15T00:00:00Z"
  excluded_hosts: ["suspected-compromised-01"]

Layer 2: Concept drift detection

Implementation:

  • Baseline shift monitoring: Track mean, variance, and distribution shape changes
  • Velocity thresholds: Alert when baseline shift rate exceeds historical norms
  • Correlation analysis: Flag simultaneous shifts across multiple features
  • Automated gating: Freeze model updates when anomalous drift detected

Algorithm:

Algorithm: Detect abnormal baseline shifts
  Input: current_stats, historical_stats, threshold (default: 0.2)

  // Calculate relative changes
  mean_shift ← |current_stats.mean - historical_stats.mean| / historical_stats.mean
  std_shift ← |current_stats.std - historical_stats.std| / historical_stats.std

  // Check correlation across features
  correlated_shifts ← count_features_with_simultaneous_shifts(current_stats, historical_stats)

  // Determine if drift is anomalous
  drift_detected ← (mean_shift > threshold) OR
                     (std_shift > threshold) OR
                     (correlated_shifts ≥ 3)

  drift_metrics ← {
    mean_shift: mean_shift,
    std_shift: std_shift,
    correlated_features: correlated_shifts
  }

  Output: drift_detected (boolean), drift_metrics (dictionary)
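
A direct Python rendering of the drift check is sketched below. The count of correlated feature shifts is taken as an input because the source does not define count_features_with_simultaneous_shifts; returning an infinite shift for a zero historical value is an added guard against division by zero, not something the algorithm above specifies.

```python
def relative_shift(current: float, historical: float) -> float:
    """Relative change |current - historical| / |historical|."""
    if historical == 0:
        # Guard (assumption): no change gives 0, any change counts as unbounded.
        return 0.0 if current == 0 else float("inf")
    return abs(current - historical) / abs(historical)


def detect_drift(current_mean: float, current_std: float,
                 historical_mean: float, historical_std: float,
                 correlated_shifts: int = 0, threshold: float = 0.2,
                 correlation_threshold: int = 3):
    """Return (drift_detected, drift_metrics) for one monitored feature."""
    mean_shift = relative_shift(current_mean, historical_mean)
    std_shift = relative_shift(current_std, historical_std)
    drift_detected = (mean_shift > threshold
                      or std_shift > threshold
                      or correlated_shifts >= correlation_threshold)
    drift_metrics = {
        "mean_shift": mean_shift,
        "std_shift": std_shift,
        "correlated_features": correlated_shifts,
    }
    return drift_detected, drift_metrics
```

A mean moving from 100 to 130 is a 30% shift and trips the default 20% threshold, while a 5% shift does not unless at least three features shift together.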

Configuration:

drift_detection:
  enabled: true
  check_interval_hours: 24
  threshold_percent: 20
  correlation_threshold: 3
  freeze_on_drift: true
  require_analyst_approval: true

Layer 3: Multi-layer validation (defense in depth)

Implementation: RADAR already implements this via hybrid detection:

  • Signature-based (Wazuh, Suricata): Fast, deterministic, resistant to ML poisoning
  • Multivariate AD (SONAR MVAD): Complex patterns, correlation-aware
  • Streaming AD (OpenSearch RRCF): Real-time, distinct algorithm
  • Cross-layer correlation: Flag events detected by multiple layers as high confidence

Validation algorithm:

Algorithm: Validate alert across detection layers
  Input: alert

  layers_triggered ← empty_list

  if signature_detection_fired(alert):
    append 'signature' to layers_triggered

  if mvad_detection_fired(alert):
    append 'mvad' to layers_triggered

  if rrcf_detection_fired(alert):
    append 'rrcf' to layers_triggered

  // Multi-layer agreement increases confidence
  confidence_boost ← count(layers_triggered) × 0.15
  high_confidence ← count(layers_triggered) ≥ 2

  Output: {
    layers: layers_triggered,
    confidence_boost: confidence_boost,
    high_confidence: high_confidence
  }
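
The validation algorithm translates almost line for line into Python. In this sketch the three detection checks are passed in as booleans, since the source does not specify how signature_detection_fired and the other predicates are evaluated; the function name is hypothetical.

```python
def validate_layers(signature_fired: bool, mvad_fired: bool, rrcf_fired: bool,
                    boost_per_layer: float = 0.15, require_layers: int = 2) -> dict:
    """Summarise which detection layers agree on an alert."""
    checks = (("signature", signature_fired),
              ("mvad", mvad_fired),
              ("rrcf", rrcf_fired))
    layers_triggered = [name for name, fired in checks if fired]
    return {
        "layers": layers_triggered,
        # Multi-layer agreement increases confidence.
        "confidence_boost": len(layers_triggered) * boost_per_layer,
        "high_confidence": len(layers_triggered) >= require_layers,
    }
```

The defaults mirror confidence_boost_per_layer and require_layers from adversarial_defense.yaml, so both can be overridden from configuration.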

Layer 4: Human-in-the-loop (HITL) oversight

Implementation:

  • Transparent reasoning: Expose model decisions (which points anomalous, why)
  • Analyst review workflows: Dashboard for reviewing flagged baseline changes
  • Feedback loops: Analysts flag incorrect classifications
  • Approval gates: Model updates require manual approval when drift detected

Workflow:

  1. System detects concept drift or baseline shift
  2. Generate alert to SOC dashboard
  3. Analyst reviews:

    • Shift magnitude and velocity
    • Affected features and entities
    • Timeline correlation with known events

  4. Analyst decision:

    • Approve: Legitimate change (new application, infrastructure update)
    • Reject: Suspected poisoning, freeze baseline
    • Investigate: Escalate to incident response

Layer 5: System hardening

Implementation:

Log integrity:

Process: Cryptographic hashing of log files
  1. Generate SHA-256 hash of log file
     Command: sha256sum /var/log/wazuh/alerts.json > /var/log/wazuh/alerts.json.sha256

  2. Enable append-only logging (immutable storage)
     Command: chattr +a /var/log/wazuh/alerts.json

  3. Forward-secure audit logs
     Mechanism: Time-stamped cryptographic signatures preventing retroactive tampering

Model security:

Process: Secure model file storage and versioning
  1. Set restrictive permissions
     Permissions: read-only for radar-ml group (mode 440)
     Owner: root:radar-ml

  2. Model versioning and integrity
     Version control: Git repository for model files
     Integrity: SHA-256 checksums for all model files

  3. Audit logging
     Log all model updates with: timestamp, user, model version
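
The SHA-256 checks used for log files and model files can also be performed from Python, for example inside a periodic integrity job. The sketch below uses only the standard library; the helper names and the `<file>.sha256` sidecar convention (matching the sha256sum command shown above) are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Hash a file incrementally so large logs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(path) -> Path:
    """Write '<hex>  <name>' to '<path>.sha256', as sha256sum would."""
    sidecar = Path(str(path) + ".sha256")
    sidecar.write_text(f"{sha256_of_file(path)}  {Path(path).name}\n")
    return sidecar


def verify_against_sidecar(path) -> bool:
    """Compare the current file hash with the recorded one."""
    recorded = Path(str(path) + ".sha256").read_text().split()[0]
    return recorded == sha256_of_file(path)
```

For an append-only log the recorded hash becomes stale as soon as new entries arrive, so a check of this kind fits rotated log segments and model files better than a live log.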

Access control specification: See RBAC configuration in YAML section below for role definitions and approval requirements.

Integration with RADAR workflows

Detector creation (LARC-022)

  • Before training: Validate data cleanliness
  • Contamination parameter: Configure RCF contamination tolerance
  • Baseline documentation: Record training period and data sources

Monitor evaluation (LARC-023)

  • Drift detection: Monitor checks for baseline shift before evaluating anomalies
  • Multi-layer correlation: Cross-reference with signature-based rules
  • Confidence adjustment: Boost detection confidence when multiple layers agree

Active response (LARC-026)

  • Risk calculation: Include drift detection status in context
  • Action planning: Require higher confidence when drift suspected
  • Audit logging: Record all defense layer activations

Configuration (adversarial_defense.yaml)

adversarial_defense:
  baseline_initialization:
    use_verified_clean_data: true
    gold_standard_dataset: "/opt/radar/baseline/gold_standard.json"
    clean_period:
      start: "2026-01-01T00:00:00Z"
      end: "2026-01-15T00:00:00Z"

  drift_detection:
    enabled: true
    check_interval_hours: 24
    mean_shift_threshold: 0.20
    std_shift_threshold: 0.25
    correlation_threshold: 3
    freeze_on_drift: true

  multi_layer_validation:
    enabled: true
    require_layers: 2  # Minimum layers for high confidence
    confidence_boost_per_layer: 0.15

  hitl_oversight:
    enabled: true
    require_approval_for:
      - model_retraining
      - baseline_reset
      - significant_drift
    dashboard_url: "https://radar.example.com/oversight"

  system_hardening:
    log_integrity:
      enable_hashing: true
      hash_algorithm: "sha256"
      append_only_logs: true
    model_security:
      enable_versioning: true
      enable_checksums: true
      restrict_access: true
    access_control:
      enable_rbac: true
      roles:
        radar_viewer:
          permissions: [view_alerts, view_detectors]
        radar_operator:
          permissions: [view_alerts, view_detectors, create_detectors, create_monitors]
        radar_admin:
          permissions: ["*", model_training, model_deployment]
      approval_required:
        - model_training       # Requires radar_admin role
        - baseline_reset       # Requires radar_admin role + security_approval
        - model_deployment     # Requires radar_admin role + change_control
      audit_all_operations: true
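
How the access_control block might be evaluated is sketched below. The role and permission names are copied from the configuration above; the functions `is_permitted` and `needs_approval`, and the reading of "*" as a wildcard that grants every operation, are assumptions made for the example.

```python
ROLES = {
    "radar_viewer": {"view_alerts", "view_detectors"},
    "radar_operator": {"view_alerts", "view_detectors",
                       "create_detectors", "create_monitors"},
    "radar_admin": {"*", "model_training", "model_deployment"},
}

APPROVAL_REQUIRED = {"model_training", "baseline_reset", "model_deployment"}


def is_permitted(role: str, operation: str) -> bool:
    """True when the role lists the operation or holds the '*' wildcard."""
    permissions = ROLES.get(role, set())
    return "*" in permissions or operation in permissions


def needs_approval(operation: str) -> bool:
    """True when the operation is listed under approval_required."""
    return operation in APPROVAL_REQUIRED
```

Permission and approval are kept as separate checks: a radar_admin is permitted to start model training, but the operation would still pass through the approval gate.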

Monitoring and alerting

Drift detection alerts:

  • Email to SOC when drift exceeds threshold
  • Dashboard visualization of baseline trends
  • Automated freeze of model updates

Access control alerts:

  • Failed authentication attempts to model files
  • Unauthorized model update attempts
  • Suspicious baseline reset requests

Adversarial Defense Implementation Flow

UML Diagram

Implementation reference

See docs/manual/radar_docs/adversarial-ml-guidance.md for comprehensive defense guidance.

See also

  • HARC-015 for overall adversarial defense architecture

Parent links: HARC-015 RADAR adversarial ML defense architecture, LARC-022 RADAR detector creation workflow

Child links: SWD-036 RADAR model security and adversarial defense implementation

3.0 ADBox v1 Low-Level Architecture (Maintenance)

Low-level architecture for ADBox v1 (MTAD-GAT legacy system) - maintenance mode only.

3.1 ADBox training pipeline flow LARC-001

Training pipeline flow diagram

The diagram summarizes the flow of the training pipeline orchestrated by the ADBox Engine.

ADBox training pipeline flow diagram

Parent links: SRS-038 Joint Host-Network Training

Child links: SWD-001 ADBox training pipeline

3.1 ADBox Shipper LARC-014

ADBox Shipper context diagram

The diagram below depicts the ADBox Shipper subpackage.

ADBox Shipper

Parent links: SRS-042 Prediction Shipping Feature

Child links: SWD-015 ADBox Shipper and Template Handler, SWD-016 ADBox shipping of prediction data, SWD-017 ADBox creation of a detector stream

3.2 ADBox historical data prediction pipeline flow LARC-002

Prediction pipeline flow diagram for historical (offline) run mode

The diagram summarizes the flow of the prediction pipeline for the historical (offline) run mode orchestrated by the ADBox Engine.

ADBox predict pipeline flow - historical diagram

Parent links: SRS-035 Offline Anomaly Detection

Child links: SWD-002 ADBox prediction pipeline, SWD-013 ADBox Prediction pipeline's inner body

3.3 ADBox preprocessing flow LARC-003

Preprocessing flow diagram of ADBox data transformer

The diagram summarizes the flow of the Preprocessor.preprocessing method of the ADBox Data Transformer.

ADBox Preprocessor.preprocessing flow diagram

Parent links: SRS-029 Host & Network Ingestion

Child links: SWD-010 ADBox data transformer, SWD-011 ADBox preprocessing

3.4 ADBox batch and real-time prediction flow LARC-008

Batch and real-time ADBox run modes prediction flow diagrams

The diagram summarizes the flow of the prediction pipeline for online run modes orchestrated by the ADBox Engine.

Specifically,

  • batch mode runs the loop every batch interval,

  • real-time mode runs the loop every granularity interval.

ADBox predict pipeline flow - online diagram

Parent links: SRS-027 ML-Based Anomaly Detection

Child links: SWD-002 ADBox prediction pipeline, SWD-013 ADBox Prediction pipeline's inner body

3.5 ADBox machine learning package LARC-009

ADBox machine learning package diagram

The ADBox ML-packages folder contains the machine learning packages called by the AD pipelines.

ADBox ML-packages

Parent links: SRS-039 Algorithm Selection Option

Child links: SWD-003 MTAD-GAT training, SWD-004 MTAD-GAT prediction, SWD-005 Peak-over-threshold (POT), SWD-006 ADBox Predictor score computation, SWD-007 ADBox MTAD-GAT anomaly prediction, SWD-008 ADBox MTAD-GAT Predictor

3.6 ADBox data manager LARC-010

ADBox data manager diagram

The diagram below depicts the ADBox Data Manager.

ADBox Data Manager

Parent links: SRS-040 Data Management Subpackage

Child links: SWD-009 ADBox data managers

3.7 ADBox TimeManager LARC-011

ADBox TimeManager context diagram

The diagram below depicts the ADBox TimeManager.

ADBox Time Manager

Parent links: SRS-041 Time Management Package

Child links: SWD-012 ADBox TimeManager

3.8 ADBox ConfigManager LARC-012

ADBox ConfigManager context diagram

The diagram below depicts the ADBox ConfigManager.

ADBox ConfigManager

Parent links: SRS-018 ML Hyperparameter Tuning, SRS-021 Default Use Case Update

Child links: SWD-014 ADBox config managers

3.9 ADBox RequestResponseHandler LARC-013

ADBox RequestResponseHandler context diagram

The diagram below depicts the ADBox RequestResponse Handler subpackage.

ADBox RequestResponseHandler

Parent links: SRS-042 Prediction Shipping Feature

4.0 Infrastructure Low-Level Architecture

Low-level architecture for deployment, integration, and system-wide infrastructure.

4.1 IDPS-ESCAPE end-point integrated arch. LARC-004

IDPS-ESCAPE end-point integrated architecture diagram

The diagram illustrates the architecture of the IDPS-ESCAPE end-point integrated model.

IDPS-ESCAPE end-point integrated model

Parent links: SRS-033 Remote Endpoint Deployment

4.2 IDPS-ESCAPE end-point hybrid arch. LARC-005

IDPS-ESCAPE end-point hybrid model architecture diagram

The diagram illustrates the architecture of the IDPS-ESCAPE end-point hybrid model.

IDPS-ESCAPE end-point hybrid model

Parent links: SRS-033 Remote Endpoint Deployment

4.3 IDPS-ESCAPE end-point host-only IDS arch. LARC-006

IDPS-ESCAPE end-point host-only IDS model architecture diagram

The diagram illustrates the architecture of the IDPS-ESCAPE end-point HIDS-only model.

IDPS-ESCAPE end-point HIDS only model

Parent links: SRS-033 Remote Endpoint Deployment

4.4 IDPS-ESCAPE end-point capture-only arch. LARC-007

IDPS-ESCAPE end-point capture-only model architecture diagram

The diagram illustrates the architecture of the IDPS-ESCAPE end-point capture-only model.

IDPS-ESCAPE end-point capture only model

Parent links: SRS-033 Remote Endpoint Deployment