Complete guide to scenario-based anomaly detection with SONAR (SIEM-Oriented Neural Anomaly Recognition).
SONAR uses YAML-based scenarios to define complete anomaly detection workflows. Each scenario specifies what data to analyze, how to train models, and how to detect anomalies.
name: "Scenario Name"
description: "What this scenario detects"
enabled: true
# Optional: Custom model name for saving/loading
model_name: "my_scenario_baseline_v1" # Auto-generated if omitted
# Optional: OpenSearch query filter to limit processed alerts
# NOTE: This feature is planned but not yet implemented
# query_filter:
# bool:
# should:
# - match: {"rule.groups": "authentication"}
# - match: {"rule.groups": "sudo"}
# # Also supports: must, must_not, filter for complex queries
# Training phase (optional)
training:
lookback_hours: 168
numeric_fields:
- "rule.level"
- "data.cpu_usage_%"
categorical_fields:
- "agent.id"
- "rule.groups"
categorical_top_k: 10
bucket_minutes: 5
sliding_window: 200
device: "cpu"
derived_features: true # Enable computed security features
extra_params: {}
# Optional: Data shipping configuration (for production)
shipping:
enabled: false # Set true to ship anomalies to dedicated data streams
install_templates: true # Install index templates on first run
scenario_id: "custom_id" # Optional: custom scenario ID
# Detection phase (optional)
detection:
mode: "batch" # or "historical" or "realtime"
lookback_minutes: 60
threshold: 0.7
min_consecutive: 2
Fields:
- name: Scenario identifier
- description: What the scenario detects
- At least one training or detection section (required)
- enabled: Enable/disable scenario (default: true)
- model_name: Custom model filename for saving/loading (auto-generated if omitted)
- query_filter: OpenSearch query DSL to filter alerts before processing (planned feature, not yet implemented)
- training: Training phase configuration
- detection: Detection phase configuration
- shipping: Data shipping configuration for production deployments

Custom identifier for saving and loading trained models:
model_name: "brute_force_baseline_v2_20260125"
Behavior:
- The model is saved to and loaded from ./models/{model_name}.pkl

Example naming conventions:
- {scenario}_{version}_{date}: brute_force_v2_20260125
- {scenario}_{environment}: auth_anomaly_production
- {scenario}_{datasource}: lateral_movement_dc1
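For example, a production authentication baseline could pin its model explicitly (illustrative values, following the conventions above):

name: "Auth Anomaly Baseline"
description: "Weekly authentication baseline with a pinned model"
model_name: "auth_anomaly_production"
training:
  lookback_hours: 168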
Status: Planned feature - not yet implemented in the current version.
OpenSearch query DSL to pre-filter alerts before feature extraction:
# query_filter: # Commented out until implemented
# bool:
# must:
# - match: {"rule.groups": "authentication"}
# should:
# - match: {"rule.groups": "sudo"}
# - match: {"rule.groups": "ssh"}
# must_not:
# - match: {"agent.name": "test-agent"}
Planned use cases:
{"range": {"rule.level": {"gte": 7}}}{"match": {"rule.mitre.technique": "T1078"}}Note: Currently, alerts are processed from all indices matching the pattern. Use scenario-specific models for focused detection.
Configure data stream shipping for production deployments:
shipping:
enabled: true
install_templates: true
scenario_id: "auth_monitoring"
Fields:
- enabled: Enable data shipping to dedicated streams
- install_templates: Install index templates (disable after the first run)
- scenario_id: Custom stream identifier (auto-generated if omitted)

For complete shipping configuration, RADAR integration, and troubleshooting, see data-shipping-guide.md.
The scenario system automatically determines execution based on YAML sections:
| YAML Sections | Execution Behavior | Use Case |
|---|---|---|
| training + detection | Train → Detect (batch) | Full workflow with fresh model |
| training only | Train (save model) | Establish baseline, no detection |
| detection only | Detect (load model) | Ad-hoc investigation with existing model |
name: "Brute Force Detection"
description: "Detect authentication attack patterns"
enabled: true
training:
lookback_hours: 168 # 1 week baseline
numeric_fields: ["rule.level"]
categorical_fields: ["agent.id", "rule.groups"]
bucket_minutes: 5
sliding_window: 200
detection:
mode: "batch"
lookback_minutes: 60
threshold: 0.7
Execution:
poetry run sonar scenario --use-case brute_force_detection.yaml
Output:
Phase 1: Training model on 168 hours of data...
✓ Training complete. Model saved to ./model/mvad_model.pkl
Phase 2: Running detection on last 60 minutes...
✓ Detection complete. Found 3 anomalies.
name: "Weekly Baseline"
description: "Establish detection baseline (no immediate detection)"
enabled: true
training:
lookback_hours: 336 # 2 weeks
numeric_fields: ["rule.level"]
categorical_fields: ["agent.id"]
bucket_minutes: 10
sliding_window: 250
Execution:
poetry run sonar scenario --use-case weekly_baseline.yaml
Use case: Scheduled weekly retraining via cron without immediate detection.
name: "Ad-hoc Investigation"
description: "Investigate recent activity with existing model"
enabled: true
detection:
mode: "historical"
lookback_minutes: 120
threshold: 0.75
Execution:
poetry run sonar scenario --use-case adhoc_investigation.yaml
Use case: Quick investigation using pre-trained model from baseline.
One-shot detection on recent data:
detection:
mode: "historical"
lookback_minutes: 60
threshold: 0.7
Detection immediately after training:
training:
lookback_hours: 24
detection:
mode: "batch"
lookback_minutes: 10
Continuous monitoring:
detection:
mode: "realtime"
lookback_minutes: 10
polling_interval_seconds: 60
threshold: 0.8
How much historical data to use for training:
training:
lookback_hours: 168 # 1 week
Guidelines:
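As a rough illustration, the bundled scenarios pair shorter lookbacks with fast-changing metrics and longer ones with slow-moving baselines (values taken from the examples later in this guide):

training:
  lookback_hours: 24    # resource monitoring: fast-changing system metrics
  # lookback_hours: 168 # brute force detection: one-week authentication baseline
  # lookback_hours: 336 # lateral movement: two-week cross-host baseline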
Numeric features to analyze:
training:
numeric_fields:
- "rule.level"
- "data.cpu_usage_%"
- "data.memory_usage_%"
Common fields:
- rule.level: Alert severity
- rule.firedtimes: Alert frequency
- data.cpu_usage_%: CPU metrics
- data.memory_usage_%: Memory metrics
- data.login_attempts: Authentication attempts

Categorical features (one-hot encoded):
training:
categorical_fields:
- "agent.id"
- "agent.name"
- "rule.groups"
categorical_top_k: 10
Common fields:
- agent.id: Agent identifier
- agent.name: Agent hostname
- rule.groups: Rule category
- data.srcip: Source IP address
- data.dstip: Destination IP address

Note: Limit categorical fields to avoid feature explosion. Use categorical_top_k to keep only the top N values.
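Putting the two together, a typical training block selects a few numeric fields plus a small, capped set of categorical fields (a sketch using standard Wazuh alert fields):

training:
  numeric_fields:
    - "rule.level"
    - "data.login_attempts"
  categorical_fields:
    - "agent.id"
    - "data.srcip"
  categorical_top_k: 10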
Time bucket size for aggregation:
training:
bucket_minutes: 5
Guidelines:
Trade-off: Smaller buckets = more features = longer training
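To make the trade-off concrete, shrinking the bucket size multiplies the number of time buckets produced from the same lookback (hypothetical values):

training:
  lookback_hours: 24
  bucket_minutes: 5   # 24 h × 60 / 5 = 288 buckets
  # bucket_minutes: 1 # 24 h × 60 / 1 = 1440 buckets: finer resolution, longer training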
MVAD algorithm parameter:
training:
sliding_window: 200
Guidelines:
Must be ≤ number of time buckets in training data.
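A quick sanity check using the brute force example values from this guide:

training:
  lookback_hours: 168   # 168 h × 60 min / 5 min buckets = 2016 time buckets
  bucket_minutes: 5
  sliding_window: 200   # 200 ≤ 2016, so the window fits the training data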
Minimum samples required for training:
training:
min_samples: 500
Status: This field is currently ignored by the training engine.
This parameter was intended to enforce a minimum number of alert samples before training, but validation is not yet implemented. SONAR currently validates only that enough time buckets exist for the sliding window size (≥ sliding_window + 1).
Planned behavior:
Current workaround: Use lookback_hours to ensure sufficient training data. Monitor logs for time bucket count warnings.
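A minimal sketch of the workaround, reusing the bucket arithmetic above (min_samples is shown for future compatibility but is currently ignored):

training:
  lookback_hours: 48    # 48 h × 60 / 10 = 288 time buckets
  bucket_minutes: 10
  sliding_window: 250   # needs at least 251 buckets; 288 is enough
  min_samples: 500      # currently ignored by the training engine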
Enable automatic computation of security-relevant derived features:
training:
derived_features: true # Default
What are derived features?
Instead of just raw alert fields (e.g., rule.level), SONAR computes additional features such as per-minute failure rates, unique source-IP counts, and time-of-day encodings.
When to use:
Example: For authentication monitoring:
- Raw field only: rule.level = 5
- With derived features: auth_failures_per_minute = 15, unique_srcip_count = 8, hour_of_day_encoded = 0.75

Impact: Adds ~5-10 computed columns to the feature set and improves detection of complex attack patterns.
Execution mode (see Detection modes):
detection:
mode: "historical" # or "batch" or "realtime"
How much recent data to analyze:
detection:
lookback_minutes: 60
Guidelines:
Anomaly score threshold (0-1):
detection:
threshold: 0.7
Guidelines:
Minimum consecutive anomalous buckets:
detection:
min_consecutive: 2
Reduces false positives by requiring sustained anomalies.
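Putting the detection knobs together, a stricter profile might look like this (illustrative values; tune against your own baseline):

detection:
  mode: "batch"
  lookback_minutes: 60
  threshold: 0.8        # higher threshold = fewer, higher-confidence anomalies
  min_consecutive: 3    # require three anomalous buckets in a row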
Filter which alerts to analyze using OpenSearch query DSL:
query_filter:
bool:
should:
- match: {"rule.groups": "authentication"}
- match: {"rule.groups": "sudo"}
query_filter:
bool:
must:
- terms:
agent.id: ["001", "002", "003"]
query_filter:
range:
rule.level:
gte: 10
query_filter:
bool:
must:
- range:
rule.level:
gte: 5
should:
- match: {"rule.groups": "web"}
- match: {"rule.groups": "attack"}
must_not:
- match: {"agent.name": "test-agent"}
SONAR includes ready-to-use scenarios in sonar/scenarios/ (from project root):
File: sonar/scenarios/brute_force_detection.yaml
name: "Brute Force Detection"
description: "Detect authentication attack patterns"
enabled: true
query_filter:
bool:
should:
- match: {"rule.groups": "authentication"}
training:
lookback_hours: 168
numeric_fields: ["rule.level"]
categorical_fields: ["agent.id", "rule.groups"]
bucket_minutes: 5
sliding_window: 200
detection:
mode: "historical"
lookback_minutes: 60
threshold: 0.7
Use case: Detect unusual authentication patterns indicating brute force attacks.
File: sonar/scenarios/lateral_movement_detection.yaml
name: "Lateral Movement Detection"
description: "Detect lateral movement via authentication patterns"
enabled: true
query_filter:
bool:
should:
- match: {"rule.groups": "authentication"}
- match: {"rule.groups": "ssh"}
training:
lookback_hours: 336 # 2 weeks
numeric_fields: ["rule.level"]
categorical_fields: ["agent.id", "data.srcip"]
bucket_minutes: 10
sliding_window: 250
detection:
mode: "batch"
lookback_minutes: 120
threshold: 0.75
Use case: Identify unusual cross-host authentication patterns.
File: sonar/scenarios/privilege_escalation_detection.yaml
name: "Privilege Escalation Detection"
description: "Monitor privilege escalation attempts"
enabled: true
query_filter:
bool:
should:
- match: {"rule.groups": "sudo"}
- match: {"rule.groups": "privilege_escalation"}
training:
lookback_hours: 168
numeric_fields: ["rule.level"]
categorical_fields: ["agent.id", "data.command"]
categorical_top_k: 15
bucket_minutes: 15
sliding_window: 150
detection:
mode: "historical"
lookback_minutes: 60
threshold: 0.8
Use case: Detect unusual privilege escalation attempts via sudo/su.
File: sonar/scenarios/linux_resource_monitoring.yaml
name: "Linux Resource Monitoring"
description: "Detect resource exhaustion and abnormal usage"
enabled: true
query_filter:
bool:
should:
- match: {"rule.groups": "syslog"}
- match: {"rule.groups": "performance"}
training:
lookback_hours: 24
numeric_fields:
- "data.cpu_usage_%"
- "data.memory_usage_%"
- "rule.level"
categorical_fields:
- "agent.name"
- "rule.groups"
categorical_top_k: 10
bucket_minutes: 1
sliding_window: 300
device: "cpu"
detection:
mode: "batch"
lookback_minutes: 10
threshold: 0.85
Use case: Identify CPU spikes, memory leaks, fork bombs, resource exhaustion.
Maintain fresh baseline with weekly retraining:
# Crontab entry: Every Sunday at 2 AM
0 2 * * 0 cd /home/alab/soar && poetry run sonar scenario --use-case scenarios/brute_force_detection.yaml
The scenario should include only a training: section (as in the training-only example above) so the model is updated without immediate detection.
Run detection continuously:
detection:
mode: "realtime"
lookback_minutes: 10
polling_interval_seconds: 60
Deploy as systemd service:
[Unit]
Description=SONAR Continuous Monitoring
After=network.target
[Service]
Type=simple
User=alab
WorkingDirectory=/home/alab/soar
ExecStart=/home/alab/.local/bin/poetry run sonar scenario --use-case scenarios/realtime_monitor.yaml
Restart=always
[Install]
WantedBy=multi-user.target
Test new scenarios safely:
# 1. Training only (validate data quality)
name: "New Scenario - Training Test"
training:
lookback_hours: 24
# ... parameters
# 2. Detection with high threshold (reduce noise)
name: "New Scenario - Detection Test"
detection:
mode: "historical"
threshold: 0.9 # Very strict
# 3. Production with tuned threshold
name: "New Scenario - Production"
training:
lookback_hours: 168
detection:
mode: "batch"
threshold: 0.75 # Tuned based on testing
Run multiple scenarios for comprehensive coverage:
#!/bin/bash
# monitor_all.sh
poetry run sonar scenario --use-case scenarios/brute_force_detection.yaml
poetry run sonar scenario --use-case scenarios/lateral_movement_detection.yaml
poetry run sonar scenario --use-case scenarios/privilege_escalation_detection.yaml
poetry run sonar scenario --use-case scenarios/linux_resource_monitoring.yaml
Schedule via cron:
*/30 * * * * /home/alab/soar/monitor_all.sh >> /var/log/sonar/monitor.log 2>&1
Test scenarios without Wazuh:
# Test with local data
poetry run sonar scenario --use-case scenarios/example_scenario.yaml --debug
Ensure debug configuration points to appropriate test data:
debug:
enabled: true
data_dir: "./test_data/resource_monitoring"
training_data_file: "resource_monitoring_training.json"
detection_data_file: "resource_monitoring_detection.json"
Before deploying scenarios:
Feature mismatch errors:
- Retrain the model so detection uses the same field configuration as training
Empty training data:
- Increase lookback_hours or adjust query_filter

Too many features:
- Reduce categorical_top_k or limit categorical_fields