Example Scenario: A privileged employee starts accessing an abnormal number of confidential files outside business hours and copying data to an external drive. This deviates from their normal behavior and may signal malicious intent or account compromise. For instance, a database administrator who typically queries customer records during the day is now running massive exports at midnight, or an engineer is downloading an unusual volume of sensitive documents not related to their project.
Categorical features: To enhance detection, categorize anomalies by user or user attributes. Enabling a category field for the username (or user ID) ensures the anomaly model learns a separate baseline for each user. This way, one user’s normal activity does not mask another’s outliers. Additional categorical dimensions could include user department or role (if such metadata can be appended to log events), as insider threats often stand out when compared to peers. Essentially, we want to “slice” the data per user for UEBA, so that anomalies are detected relative to each user’s own typical behavior.
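To illustrate why a per-user baseline matters, the sketch below scores the same event against a pooled baseline and against the user's own history. It uses a plain z-score, which is not how the Anomaly Detection plugin scores events (the plugin is based on Random Cut Forest). The user names, counts, and function name are assumptions made up for this illustration.

```python
from statistics import mean, pstdev


def zscore(value, history):
    """Return how many standard deviations `value` lies from the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)
    return 0.0 if sigma == 0 else (value - mu) / sigma


# Hypothetical file-access counts per detector interval (illustrative only).
history = {
    "alice": [200, 210, 190, 205, 195],  # heavy but steady user
    "bob": [5, 6, 4, 5, 5],              # light user
}

bob_spike = 60  # far above Bob's norm, yet below Alice's routine volume

# Global baseline: pool every user's history together.
pooled = [count for counts in history.values() for count in counts]
global_z = zscore(bob_spike, pooled)

# Per-user baseline: compare Bob only against Bob.
per_user_z = zscore(bob_spike, history["bob"])

print(f"global z-score:   {global_z:.2f}")
print(f"per-user z-score: {per_user_z:.2f}")
```

Against the pooled baseline Bob's spike is unremarkable, while against his own history it is a clear outlier. This is the effect the category field is meant to achieve.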
To simulate today’s data and feed it to the Wazuh AD plugin, we shift each file’s dates into the last five days. The script is located in insider-threat/wazuh_ingest.py. Run it from the soar-radar folder:
python3 ./insider-threat/wazuh_ingest.py
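The date shift can be sketched as follows. This is not the code of wazuh_ingest.py: the mapping from a source day to an offset, the function names, and the sample values are assumptions. Only the five-day window and the three output fields come from this README.

```python
from datetime import date, datetime, timedelta

WINDOW_DAYS = 5  # events are moved into the last five days


def shift_into_window(original: datetime, today: date) -> datetime:
    """Move `original` to a day in [today - 4, today], keeping its time of day."""
    # Hypothetical mapping: spread source days across the window by ordinal.
    offset = (original.toordinal() % WINDOW_DAYS) - (WINDOW_DAYS - 1)  # -4..0
    return datetime.combine(today + timedelta(days=offset), original.time())


def to_event(original: datetime, content_bytes: int, today: date) -> dict:
    """Build the fields the detector relies on from one source record."""
    shifted = shift_into_window(original, today)
    return {
        "@timestamp": shifted.isoformat(),
        "event_hour": shifted.hour,
        "content_bytes": content_bytes,
    }


# Sample record with made-up values.
event = to_event(datetime(2010, 1, 4, 23, 45, 0), 1_048_576, date(2025, 6, 7))
print(event)
```

Keeping the original time of day matters here, because the scenario depends on activity outside business hours.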
- Date shift: today + offset
- Fields written: @timestamp (ISO), event_hour, content_bytes
- Example index: wazuh-ad-insider-threat-2025.06.07
- Index pattern: wazuh-ad-insider-threat-*
- Field mappings: @timestamp (date), user (string), pc, filename, content_bytes, etc.

Detector settings:

- Name: insider-threat-detector
- Index: wazuh-ad-insider-threat-*
- Timestamp field: @timestamp
- Interval: 5m (with 1m window delay)

| Feature name | Method | Field | Notes |
|---|---|---|---|
| user_file_access_count | count() | user.keyword | Counts file events per user |
| user_data_volume_bytes | sum() | content_bytes | Total bytes read or written |
| user_distinct_files_touched | Custom expression | - | Workaround via ingest pipeline (see below) |
The OpenSearch UI does not support cardinality() directly, which is why we use a custom expression:
{
  "distinct_files_cardinality": {
    "cardinality": {
      "field": "filename.keyword"
    }
  }
}
Under Categorical field, select the user identifier user.keyword.
This ensures each user gets its own statistical model, preventing Alice’s behavior from obscuring Bob’s anomalies.
Click Next to Review.
In the insider-threat-detector anomaly overview, set up an alert:
- insider-threat-detector monitor, which will create an alert when an anomaly is detected.
- Insider-Threat-Detected

When choosing thresholds for firing alerts, you must balance sensitivity (catching real threats) against precision (avoiding false positives). A balanced strategy is to require:
Starting here helps minimize alerts on transient spikes, which is particularly important in high-cardinality, per-user detectors where data volume per user can vary widely. The thresholds can then be tuned up or down based on the false-positive rates observed during analysis.
RADAR
Message (must be JSON):
{
"monitor": {
"name": ""
},
"trigger": {
"name": ""
},
"entity": "",
"periodStart": "",
"periodEnd": ""
}
When the condition is met, this monitor will send structured JSON to the webhook.
This AD alerts webhook is a simple Flask application that receives the monitor’s payload and appends a single line to /var/log/ad_alerts.log. To deploy the webhook in the Wazuh manager:
Copy the webhook file from this repository to a custom wazuh_webhook directory on the Wazuh manager.
Ensure execution permissions: chmod +x
Run it with python3:
python3 ad_alerts_webhook.py
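The core of such a webhook is turning the monitor's JSON payload into exactly one log line. The sketch below shows only that step, without the Flask routing. The line format (a fixed prefix followed by compact JSON) and the function names are assumptions. The real format must match whatever local_decoder.xml expects.

```python
import json
from pathlib import Path

LOG_PATH = Path("/var/log/ad_alerts.log")  # log file named in this README


def format_alert_line(payload: dict) -> str:
    """Serialize the monitor payload as a single newline-free line."""
    # json.dumps escapes embedded newlines, so the result stays on one line.
    compact = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    return f"ad_alert: {compact}"  # the "ad_alert:" prefix is hypothetical


def append_alert(payload: dict, log_path: Path = LOG_PATH) -> None:
    """Append exactly one line per received alert."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(format_alert_line(payload) + "\n")
```

In a Flask handler, the parsed request body (for example the result of `request.get_json()`) would be passed to `append_alert`.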
/var/ossec/etc/ossec.conf needs to be configured:
<localfile>
  <log_format>syslog</log_format>
  <location>/var/log/ad_alerts.log</location>
</localfile>
Add the content of the file local_decoder.xml in this repository into the file /var/ossec/etc/decoders/local_decoder.xml in the Wazuh manager.
Add the content of the file local_rules.xml in this repository into the file /var/ossec/etc/rules/local_rules.xml in the Wazuh manager.
Restart the Wazuh manager (/var/ossec/bin/wazuh-control restart in Docker or systemctl restart wazuh-manager).
/var/log/ad_alerts.log.
In ossec.conf on the manager, register and bind only the ad_context_insider_active_response.py script. The script can be found in the Active Response directory.
<ossec_config>
  <!-- 1) Command declaration -->
  <command>
    <name>ad_enrich</name>
    <executable>ad_context_insider_active_response.py</executable>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <!-- 2) Active-response binding -->
  <active-response>
    <disabled>no</disabled>
    <command>ad_enrich</command>
    <location>server</location>
    <rules_id>100301</rules_id>
    <timeout>120</timeout>
  </active-response>
</ossec_config>
The script is invoked as ad_context_insider_active_response.py user_keyword period_start period_end.
Copy the script from the active_responses directory to /var/ossec/active-response/bin in the Wazuh manager. Note: remember to update the Wazuh access credentials (username, password) in the script based on your setup.
chmod 750 /var/ossec/active-response/bin/ad_context_insider_active_response.py
chown root:wazuh /var/ossec/active-response/bin/ad_context_insider_active_response.py
python3 -m pip install requests
Install jq (used by the scripts for JSON parsing):
sudo apt update
sudo apt install -y jq
Create your log files (for enrichment):
sudo touch /var/ossec/logs/ad_pc_enriched.log
sudo chown root:wazuh /var/ossec/logs/ad_pc_enriched.log
sudo chmod 664 /var/ossec/logs/ad_pc_enriched.log
Create your blocked-users log (for lock/unlock):
sudo touch /var/ossec/logs/blocked_users.log
sudo chown root:wazuh /var/ossec/logs/blocked_users.log
sudo chmod 664 /var/ossec/logs/blocked_users.log
Deploy the Scripts
Copy the two scripts into the agent’s Active Response (AR) directory. The scripts can be found in the Active Response directory of this repository.
sudo cp write_contextual_logs_insider_active_response.sh \
/var/ossec/active-response/bin/
sudo cp lock_user_linux_active_response.sh \
/var/ossec/active-response/bin/
Set ownership and permissions so Wazuh can execute them:
sudo chown root:wazuh /var/ossec/active-response/bin/write_contextual_logs_insider_active_response.sh
sudo chmod 750 /var/ossec/active-response/bin/write_contextual_logs_insider_active_response.sh
sudo chown root:wazuh /var/ossec/active-response/bin/lock_user_linux_active_response.sh
sudo chmod 750 /var/ossec/active-response/bin/lock_user_linux_active_response.sh
In order for the manager to invoke these scripts via the API or <active-response> blocks, the agent must accept remote commands.
Create or update:
/var/ossec/etc/local_internal_options.conf
and include:
wazuh_command.remote_commands=1
ossec.conf
Edit:
/var/ossec/etc/ossec.conf
Under the top-level <ossec_config> element, add both <command> entries:
<ossec_config>
  …
  <command>
    <name>write_contextual_logs_insider_active_response.sh</name>
    <executable>write_contextual_logs_insider_active_response.sh</executable>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  <command>
    <name>lock_user_linux_active_response.sh</name>
    <executable>lock_user_linux_active_response.sh</executable>
    <timeout_allowed>yes</timeout_allowed>
  </command>
  …
</ossec_config>
sudo systemctl restart wazuh-agent
For most environments, it is prudent to implement a two-tier response:
By reserving lockouts for only the most extreme, high-confidence events, you mitigate the risk of inadvertently locking legitimate users during benign but unusual patterns (such as quarterly bulk exports).
Receive Trigger Parameters
The script is called with three pieces of information extracted by the Wazuh decoder and rule:
Authenticate to the Wazuh API
It requests a JWT token from the manager’s security endpoint, using the configured API credentials. This token is needed for all subsequent calls to query agents or dispatch further Active Responses.
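A minimal sketch of the token request is shown below. It only builds the request rather than sending it, and it uses the standard library instead of the `requests` package that the actual script depends on, so that the example has no third-party dependencies. The endpoint path and the default port 55000 follow the Wazuh 4.x API as far as we know, and should be checked against the API reference for your version. The host name, credentials, and function names are placeholders.

```python
import base64
import urllib.request


def build_auth_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the token request for the Wazuh API."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/security/user/authenticate",
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


def bearer_headers(token: str) -> dict:
    """Headers to attach to every subsequent API call."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


# Placeholder values; replace with your manager address and API credentials.
request = build_auth_request("https://wazuh-manager:55000", "api-user", "api-password")
print(request.get_method(), request.full_url)
```

The returned token would then be passed to `bearer_headers` for the agent queries and Active Response calls that follow.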
Query Anomaly Events
Using the OpenSearch client, the script fetches every logged event for that user within the anomaly window from the test index. This returns raw details: timestamps, file names, bytes processed, host names, etc.
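The query for this step could look roughly like the sketch below, which only builds the request body. The field names `user.keyword` and `@timestamp` come from the detector configuration above. The page size, the sort order, the sample values, and the function name are assumptions.

```python
import json


def build_user_window_query(user: str, period_start: str, period_end: str, size: int = 1000) -> dict:
    """Query body selecting one user's events inside the anomaly window."""
    return {
        "size": size,  # hypothetical page size; large windows need pagination
        "sort": [{"@timestamp": {"order": "asc"}}],
        "query": {
            "bool": {
                # Filters do not affect scoring and can be cached by the engine.
                "filter": [
                    {"term": {"user.keyword": user}},
                    {"range": {"@timestamp": {"gte": period_start, "lte": period_end}}},
                ]
            }
        },
    }


# Sample values for illustration only.
body = build_user_window_query("alice", "2025-06-07T00:00:00Z", "2025-06-07T00:05:00Z")
print(json.dumps(body, indent=2))
```

With the opensearch-py client, a body like this would typically be passed as `client.search(index="wazuh-ad-insider-threat-*", body=body)`.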
Group Events by Host (Agent)
The returned events are regrouped in memory by their pc field. Each group corresponds to one Wazuh agent where the suspicious activity occurred.
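The regrouping can be sketched with a dictionary keyed by the pc field. The sample events and the function name are made up for illustration. Resolving a host name to a Wazuh agent ID is a separate step that is not shown.

```python
from collections import defaultdict


def group_events_by_host(events: list[dict]) -> dict[str, list[dict]]:
    """Group events by their `pc` field, preserving the original order per host."""
    grouped = defaultdict(list)
    for event in events:
        host = event.get("pc")
        if host is None:
            continue  # an event without a host cannot be routed to an agent
        grouped[host].append(event)
    return dict(grouped)


# Hypothetical search hits, reduced to the relevant fields.
events = [
    {"pc": "PC-1001", "filename": "a.doc", "content_bytes": 100},
    {"pc": "PC-2002", "filename": "b.pdf", "content_bytes": 250},
    {"pc": "PC-1001", "filename": "c.xls", "content_bytes": 75},
    {"filename": "orphan.txt", "content_bytes": 10},
]

by_host = group_events_by_host(events)
for host, host_events in by_host.items():
    print(host, [e["filename"] for e in host_events])
```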
Enrichment Log Dispatch
For each agent:
ad_pc_enriched.log), giving analysts full context of what happened on each machine.
Optional Account Lockout
Immediately after logging, the script can invoke the lock_user command on the same agent to disable the account locally. By default this step is commented out to prevent accidental lockouts during testing (simply uncomment it once you are satisfied with the detection quality).
Result Handling and Logging
Each Active Response call is made with a “wait for completion” flag. The script parses the API response to confirm which agents successfully received the command, which (if any) failed, and logs that outcome back on the manager for audit and troubleshooting.
Roll-Back Mechanism
If lockouts are enabled, you can configure a timeout (in Wazuh <active-response> settings) so that accounts are automatically re-enabled after a safe period.
In the scenario of a privileged user anomalously exporting sensitive data outside of business hours, the risk associated with this behavior is computed using the classical formulation:
R = C × I
where C is the model’s confidence that the behavior is malicious, and I is the impact severity derived from the Common Vulnerability Scoring System (CVSS). Although CVSS was originally designed for assessing software vulnerabilities, we adopt it here as an impact measure for insider threat scenarios in UEBA (User and Entity Behavior Analytics) by mapping the observed behavior to Confidentiality, Integrity, and Availability (CIA) impacts.
In this particular case, the user is accessing and exfiltrating large amounts of sensitive data, implying a complete loss of confidentiality and partial compromise of data integrity, but with minimal availability impact. These characteristics align with the “High” impact range in CVSS v3 scoring (7.0–8.9). Therefore, we conservatively assign an impact score of:
I = 7.5
This score sits in the lower half of the high-severity band (7.0–8.9) and reflects the severity of potential data loss and misuse of privilege. The resulting risk score becomes:
R = C × 7.5
Depending on how the thresholds are chosen (for example, values obtained empirically), a tiered system can map the risk score to different types of automated response actions:
This thresholding strategy ensures a data-driven, explainable escalation path where only sufficiently confident and impactful insider threats are prioritized, while low-confidence anomalies are deprioritized. The formulation remains transparent, consistent, and interpretable across operational environments.
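As a worked example of the formula, the sketch below computes R = C × 7.5 and maps it to a response tier. The confidence range of 0 to 1, the threshold values 3.0 and 6.0, and the tier names are assumptions chosen for illustration. They are not values prescribed by this scenario and should be replaced by empirically tuned ones.

```python
IMPACT = 7.5  # impact score assigned to this scenario


def risk_score(confidence: float, impact: float = IMPACT) -> float:
    """Compute R = C x I, assuming the confidence C lies in [0, 1]."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return confidence * impact


def response_tier(risk: float, enrich_threshold: float = 3.0, lock_threshold: float = 6.0) -> str:
    """Map a risk score to a tier; both thresholds are hypothetical."""
    if risk >= lock_threshold:
        return "lockout"   # reserved for the most extreme, high-confidence events
    if risk >= enrich_threshold:
        return "enrich"    # log context and alert analysts, no lockout
    return "log-only"      # low-confidence anomaly, deprioritized


for confidence in (0.2, 0.5, 0.9):
    risk = risk_score(confidence)
    print(f"C={confidence:.1f}  R={risk:.2f}  tier={response_tier(risk)}")
```

For instance, C = 0.9 gives R = 0.9 × 7.5 = 6.75, which falls in the lockout tier under these assumed thresholds, while C = 0.5 gives R = 3.75 and triggers enrichment only.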
For contextual enrichment and threat intelligence, a corresponding Active Response can be triggered on every anomaly detection. The instructions can be found in the Automated OpenCTI enrichment README.
The dataset stored in the dataset subfolder of this RADAR scenario was obtained from the KiltHub repository of Carnegie Mellon University.