Here we provide a complete tutorial, ranging from use case definition to building a custom Wazuh dashboard that offers a nicely formatted, dynamically generated visualization of the ADBox prediction results for easier and more efficient analysis.
Our (informal) objectives are as follows:
(1) train a detector on aggregated memory usage, CPU usage, and alert counts;
(2) run a historical prediction with the trained detector over a chosen time range;
(3) run a real-time prediction with the trained detector.
Note that the chosen setting is similar to the one in the example using notebooks.
Outline
To achieve (1), we make the following selections: the features data.memory_usage_% (avg), data.cpu_usage_% (avg), and rule.firedtimes (count); the display_name detector_test_mem_cpu_alert_count; a granularity of 30s; and a window_size of 10.
For (2), we have to choose a HISTORICAL prediction with start_time 2024-11-01T00:00:00Z. Indeed, if not specified, the end time is determined using the request’s timestamp.
We define a use case:
training:
  aggregation: true
  aggregation_config:
    features:
      data.cpu_usage_%:
        - average
      data.memory_usage_%:
        - average
      rule.firedtimes:
        - count
    fill_na_method: Zero
    granularity: 30s
    padding_value: 0
  categorical_features: false
  columns:
    - data.memory_usage_%
    - data.cpu_usage_%
    - rule.firedtimes
  display_name: detector_test_mem_cpu_alert_count
  index_date: '2024-10-*'
  train_config:
    epochs: 30
    window_size: 10
prediction:
  run_mode: HISTORICAL
  start_time: "2024-11-01T00:00:00Z"
Let’s assume we have cloned the repository into /home/user/SIEM-MTAD-GAT and that there are already 12 use cases defined; we must then save our newly created one at /home/user/SIEM-MTAD-GAT/siem_mtad_gat/assets/drivers/uc_13.yaml.
We have chosen to unify the training and the historical prediction into a single use case. However, by creating two separate use cases (one for training and one for historical prediction) and running them in sequence, we would obtain the same result, as sketched below.
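The split variant could look as follows; the file names uc_13a.yaml and uc_13b.yaml are hypothetical, and, since a prediction-only use case falls back to the last trained detector (see below), uc_13b must be run right after uc_13a:

# uc_13a.yaml (hypothetical name) – training only
training:
  aggregation: true
  aggregation_config:
    features:
      data.cpu_usage_%:
        - average
      data.memory_usage_%:
        - average
      rule.firedtimes:
        - count
    fill_na_method: Zero
    granularity: 30s
    padding_value: 0
  categorical_features: false
  columns:
    - data.memory_usage_%
    - data.cpu_usage_%
    - rule.firedtimes
  display_name: detector_test_mem_cpu_alert_count
  index_date: '2024-10-*'
  train_config:
    epochs: 30
    window_size: 10

# uc_13b.yaml (hypothetical name) – historical prediction only;
# without a detector_id, the engine uses the last trained detector
prediction:
  run_mode: HISTORICAL
  start_time: "2024-11-01T00:00:00Z"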
(Re)build the ADBox container:
$ ./build-adbox.sh
Run ADBox on use case 13 with the shipping flag (-s):
$ ./adbox.sh -u 13 -s
The effects of this command are:
the creation of a detector folder
../siem_mtad_gat/assets/detector_models/baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
├── input
│ ├── detector_input_parameters.json
│ └── training_config.json
├── prediction
│ ├── uc-13_predicted_anomalies_data-1_2024-11-08T09:43:41Z.json
│ └── uc-13_predicted_data-1_2024-11-08T09:43:41Z.json
└── training
├── losses_train_data.json
├── model.pt
├── scaler.pkl
├── spot
│ ├── spot_feature-0.pkl
│ ├── spot_feature-1.pkl
│ ├── spot_feature-2.pkl
│ └── spot_feature-global.pkl
├── test_output.pkl
├── train_losses.png
├── train_output.pkl
└── validation_losses.png
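To take a quick look at the generated predictions from the host, we can pretty-print one of the JSON files above (a minimal sketch; it assumes we run it from the repository root and that python3 is available):

$ python3 -m json.tool \
    "siem_mtad_gat/assets/detector_models/baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15/prediction/uc-13_predicted_anomalies_data-1_2024-11-08T09:43:41Z.json" \
  | head -n 40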
the creation of a detector data stream adbox_detector_mtad_gat_baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
in the Wazuh indexer and its corresponding template and template component.
This can be verified by accessing the Wazuh dashboard and navigating to Indexer Management > Index Management (see figures below).
the addition of the historical prediction results. This can be verified either by following the steps explained in dashboard integration or by using the Wazuh indexer API (OpenSearch API) at Indexer Management > Dev Tools; a query sketch follows.
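For instance, the following sketch retrieves the most recently indexed predictions; the indexer address https://localhost:9200, the admin:admin credentials, and the timestamp field name are assumptions to adapt to your deployment:

$ curl -sk -u admin:admin \
    "https://localhost:9200/adbox_detector_mtad_gat_baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15/_search?size=3&sort=timestamp:desc"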
To achieve (3), we need a different use case choosing the realtime run mode.
If not specified, the engine runs the prediction pipeline with the last trained detector.
To specify a detector explicitly, save a new use case ../siem_mtad_gat/assets/drivers/uc_14.yaml:
prediction:
  run_mode: "realtime"
  detector_id: "baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15"
Otherwise, we can use the already defined use case ../siem_mtad_gat/assets/drivers/uc_7.yaml:
prediction:
  run_mode: "realtime"
Tip 2: If you plan to close the terminal, use GNU screen to be able to detach while keeping the detector running; a minimal workflow is sketched below.
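A minimal screen workflow (the session name adbox is arbitrary):

$ screen -S adbox       # start a named session
$ ./adbox.sh -u 7 -s    # launch the realtime detector inside it
# detach with Ctrl-a d; the detector keeps running in the background
$ screen -r adbox       # reattach to the session later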
Run ./adbox.sh -u 7 -s
The effect of this command is the addition of the uc-7 prediction files (and a prediction-side spot folder) to the detector folder:
baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
├── input
│ ├── detector_input_parameters.json
│ └── training_config.json
├── prediction
│ ├── spot
│ │ ├── spot_feature-0.pkl
│ │ ├── spot_feature-1.pkl
│ │ ├── spot_feature-2.pkl
│ │ └── spot_feature-global.pkl
│ ├── uc-13_predicted_anomalies_data-1_2024-11-08T09:43:41Z.json
│ ├── uc-13_predicted_data-1_2024-11-08T09:43:41Z.json
│ ├── uc-7_predicted_anomalies_data-1_2024-11-08T09:45:38Z.json
│ └── uc-7_predicted_data-1_2024-11-08T09:45:38Z.json
└── training
├── losses_train_data.json
├── model.pt
├── scaler.pkl
├── spot
│ ├── spot_feature-0.pkl
│ ├── spot_feature-1.pkl
│ ├── spot_feature-2.pkl
│ └── spot_feature-global.pkl
├── test_output.pkl
├── train_losses.png
├── train_output.pkl
└── validation_losses.png
Add the detector data stream pattern to the Wazuh dashboard as explained in dashboard integration.
In the Wazuh dashboard (ref. version 4.8.1):
A dedicated dashboard can include different visualizations that provide a complete overview. We provide some examples regarding both global and feature-wise statistics. Indeed, even though feature-wise scores and thresholds do not contribute to determining whether a window is anomalous, they can help us understand trends in the time series. See also Example.
Since the detector data stream adbox_detector_mtad_gat_baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
was defined using a dedicated template,
we can manipulate all the numerical fields.
Eventually, the dashboard should look as shown below:
The MTAD-GAT algorithm marks as anomalous the windows whose anomaly score is higher than the local threshold. Therefore, as a base graphic, we add visualizations showing both these parameters in our time series.
Remarks:
We used VisBuilder, but other options are also viable.
Remember to select the correct pattern. Moreover, for aggregating score and threshold we suggest using max, to be sure to visualize anomalies.
Using the same visualization type for every feature, we add a visualization for feature-wise score and threshold to the dashboard.
Among the feature-wise fields of the adbox_detector_mtad_gat_baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15
document,
we find, for every feature, the (preprocessed) true value and its forecasted and reconstructed univariate time series.
Therefore, we can build a visualization comparing them.
We use a table to display numerical and boolean values explicitly.
Gauges provide a fairly simple visualization.
By simply adding a filter on the boolean field is_anomaly combined with the count metric, we get an anomaly counter; the equivalent indexer query is sketched below.
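For reference, the counter corresponds to the following count query (a sketch: the indexer address and the admin:admin credentials are assumptions to adapt; is_anomaly is the boolean field mentioned above):

$ curl -sk -u admin:admin \
    -H 'Content-Type: application/json' \
    "https://localhost:9200/adbox_detector_mtad_gat_baa9b7bf-e05d-4ce9-a1c1-3e82ff4c9f15/_count" \
    -d '{"query": {"term": {"is_anomaly": true}}}'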
To complement the data produced by the detector, we add to the dashboard some visualizations using the index pattern wazuh-alerts-*, namely information from the original data (an example query is sketched after the list below):
A table displaying the values of the original data.
A time series line plot displaying the original memory and CPU usage in percentage.
A time series line plot displaying the alert counts.
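For reference, such plots correspond to aggregations like the following (a sketch: the indexer address, the admin:admin credentials, and the 30s bucket interval are assumptions; timestamp is the standard time field of wazuh-alerts-* indices):

$ curl -sk -u admin:admin \
    -H 'Content-Type: application/json' \
    "https://localhost:9200/wazuh-alerts-*/_search?size=0" \
    -d '{
      "aggs": {
        "per_bucket": {
          "date_histogram": { "field": "timestamp", "fixed_interval": "30s" },
          "aggs": {
            "avg_mem": { "avg": { "field": "data.memory_usage_%" } },
            "avg_cpu": { "avg": { "field": "data.cpu_usage_%" } }
          }
        }
      }
    }'

Here the per-bucket doc_count backs the alert-count plot, while avg_mem and avg_cpu back the memory and CPU usage plot.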
By combining Discover and our detector dashboard, we can investigate anomalies.