# Broker Benchmarks

Broker ships with benchmarking tools that allow developers and users to
investigate system performance in various deployment and configuration setups.

## Clustering: `broker-cluster-benchmark`

This is the primary benchmark suite that runs Broker in a full end-to-end
deployment. Unlike real deployments, this tool runs all Broker endpoints in a
single OS process.

### Setup and Configuration

Running `broker-cluster-benchmark` requires a cluster configuration file using
CAF's config syntax:

```sh
; comments start with a semicolon
foo = "bar" ; strings use double quotes
homepage = <https://zeek.org> ; URIs use angle brackets
list = [1, 2, 3] ; lists use square brackets
```

The cluster config contains all participating Broker endpoints under `nodes`.
Each node must have at least an `id` (URI) and `topics` (list of strings). The
`id` is the network-wide identifier for peering. Use `local:$name` if a node
does not accept incoming connections and `tcp://$ip:$port` otherwise.

Nodes that publish data must have a `generator-file`. Nodes that wait for data
must set `num-inputs`. A minimal example file might look like this:

```sh
nodes {
  earth {
    id = <local:earth>
    peers = ["mars"]
    topics = ["/benchmark/events"]
    num-inputs = 100000
  }
  mars {
    id = <tcp://[::1]:8001>
    topics = ["/benchmark/events"]
    generator-file = "mars.dat"
    num-outputs = 100000
  }
}
```

This config file starts two nodes: `earth` and `mars`. On startup, `mars`
opens port 8001 and waits for its peers to connect, while `earth` does not open
any port since it has a `local:` ID. The `peers` entry for `earth` causes this
node to connect to `mars` at `tcp://[::1]:8001`.

The generator file `mars.dat` contains previously recorded meta data from a
live system. Setting `num-outputs` causes `broker-cluster-benchmark` to emit
exactly that many messages. If the generator file contains more than
`num-outputs` entries, the node ignores the additional messages. If it contains
fewer entries, the node loops through the file.

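
The truncate-or-loop behavior of `num-outputs` can be illustrated with a small Python sketch. This is not the tool's implementation: the function name `select_messages` and the in-memory list standing in for a generator file are assumptions made for illustration only.

```python
from itertools import cycle, islice


def select_messages(entries, num_outputs):
    """Return exactly num_outputs messages drawn from entries.

    Hypothetical helper: 'entries' stands in for the contents of a
    generator file. Surplus entries are ignored, and a shorter list is
    replayed from the start until enough messages were produced.
    """
    if num_outputs > 0 and not entries:
        raise ValueError("cannot loop over an empty generator file")
    return list(islice(cycle(entries), num_outputs))


recorded = ["a", "b", "c"]
print(select_messages(recorded, 2))  # fewer outputs than entries: truncates
print(select_messages(recorded, 7))  # more outputs than entries: loops
```

With three recorded entries, asking for two messages yields the first two entries, and asking for seven replays the file twice before taking the first entry once more.
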
### Recording Meta Data

Setting the configuration parameter `broker.recording-directory` (or the
environment variable `BROKER_RECORDING_DIRECTORY`) to a non-empty path causes
Broker to record meta data such as subscriptions, peerings, and published data
at this endpoint. The meta data takes up about 2 MB per 1M recorded messages
(depending on the structure of the data).

Setting the configuration parameter `broker.output-generator-file-cap` (or the
environment variable `BROKER_OUTPUT_GENERATOR_FILE_CAP`) to an unsigned integer
limits recording to that many published messages.

To record data from a Zeek cluster, add a line like the following for each node
in `/usr/local/zeek/etc/node.cfg`:

```
env_vars=BROKER_RECORDING_DIRECTORY=/your/desired/path/zeek-recording-<node>
```

Here, replace `<node>` with the name of the specific node so that nodes do not
overwrite each other's data.

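
When a cluster has many nodes, the per-node lines can be produced with a few lines of Python. The sketch below is only a convenience under stated assumptions: the node names `manager`, `proxy-1`, and `worker-1` are hypothetical, both helper functions are made up for this example, and the size estimate simply applies the rough figure of 2 MB per 1M messages quoted above.

```python
def recording_env_line(base_dir, node):
    """Build the node.cfg line that enables recording for one node.

    Hypothetical helper: each node gets its own directory so that nodes
    do not overwrite each other's data.
    """
    return f"env_vars=BROKER_RECORDING_DIRECTORY={base_dir}/zeek-recording-{node}"


def estimated_size_mb(num_messages, mb_per_million=2.0):
    """Rough size estimate; the real size depends on the data's structure."""
    return num_messages / 1_000_000 * mb_per_million


# Hypothetical node names; use the names from your own node.cfg.
for node in ("manager", "proxy-1", "worker-1"):
    print(recording_env_line("/your/desired/path", node))

# Approximate disk usage in MB for five million recorded messages.
print(estimated_size_mb(5_000_000))
```
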
### Generating Config Files from Recorded Meta Data

After recording meta data for *all* Broker nodes, `broker-cluster-benchmark`
can automatically generate a cluster configuration by analyzing the recorded
files. The generated config file uses the directory names as node names and
establishes the recorded peering relations.

When passing the `--mode=generate-config` option, the tool scans all specified
directories and generates a config file. For example, the following command
prints a configuration for a recorded Broker session with two endpoints:

```sh
broker-cluster-benchmark --mode=generate-config recordings/server recordings/client
```

The tool expects the directories `server` and `client` to contain the following
files:

```
recordings/
├── client
│   ├── id.txt
│   ├── messages.dat
│   ├── peers.txt
│   └── topics.txt
└── server
    ├── id.txt
    ├── messages.dat
    ├── peers.txt
    └── topics.txt
```

The produced configuration will contain two nodes: `client` and `server`. All
other fields and peering relations are generated automatically from the file
contents. Note that the tool does a linear scan over all `messages.dat` files
to compute the number of expected messages in the system, so this step may take
some time.

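
Since the scan over `messages.dat` can be slow, it may be worth checking the directory layout before invoking the tool. The following Python sketch does this under the assumption that each recording directory must contain exactly the four files listed above. The helper `missing_recording_files` is hypothetical and not part of Broker.

```python
import sys
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("id.txt", "messages.dat", "peers.txt", "topics.txt")


def missing_recording_files(directory):
    """Return the expected file names that are absent from 'directory'."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Usage: python check_recordings.py recordings/server recordings/client
    for directory in sys.argv[1:]:
        missing = missing_recording_files(directory)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{directory}: {status}")
```
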
### Running the Benchmark

The tool `broker-cluster-benchmark` expects at least `-c $configFile`. Passing
`-v` additionally enables verbose output that gives a glimpse into the program
state at runtime. When running a configuration for the first time, we strongly
recommend running in verbose mode:

```sh
broker-cluster-benchmark -c cluster.conf -v
```

Running in verbose mode prints various state messages to the console:

```sh
Peering tree (multiple roots are allowed):
mars, topics: ["/benchmark/events"]
└── earth, topics: ["/benchmark/events"]

mars starts listening at [::1]:8001
mars up and running
earth starts peering to [::1]:8001 (mars)
earth up and running
all nodes are up and running, run benchmark
earth waits for messages
mars starts publishing
... snip ...
```

Before the tool spins up all Broker endpoints, it makes sure that the
configured topology is safe to deploy:

- The peering topology must not contain loops.
- Each node must set the mandatory fields `id` and `topics`.

Broker's source distribution includes a working setup to get started at
`tests/benchmark/cluster-example.zip`.

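
The two safety checks listed above can be approximated outside the tool. The Python sketch below assumes that the cluster config has already been parsed into a dictionary and that peerings count as undirected edges. Both are assumptions, and the function `validate_topology` is hypothetical rather than Broker's actual check. Loops are detected with a union-find structure: a peering that joins two nodes which are already connected closes a loop.

```python
def validate_topology(nodes):
    """Return a list of problems found in a parsed 'nodes' dictionary."""
    errors = []
    # Check 1: every node must set the mandatory fields.
    for name, cfg in nodes.items():
        for field in ("id", "topics"):
            if field not in cfg:
                errors.append(f"{name}: missing mandatory field '{field}'")
    # Collect undirected peering edges, ignoring duplicates.
    edges = set()
    for name, cfg in nodes.items():
        for peer in cfg.get("peers", []):
            if peer not in nodes:
                errors.append(f"{name}: unknown peer '{peer}'")
            elif peer == name:
                errors.append(f"{name}: node peers with itself")
            else:
                edges.add(tuple(sorted((name, peer))))
    # Check 2: union-find over the edges to detect loops.
    parent = {name: name for name in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            errors.append(f"peering {a} <-> {b} closes a loop")
        else:
            parent[root_a] = root_b
    return errors


# The minimal two-node example from the setup section.
example = {
    "earth": {"id": "local:earth", "peers": ["mars"],
              "topics": ["/benchmark/events"]},
    "mars": {"id": "tcp://[::1]:8001", "topics": ["/benchmark/events"]},
}
print(validate_topology(example))
```

For the two-node example, the function returns an empty list, meaning that neither check found a problem.
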
### Inspecting Generator Files

If you're unsure which topics appear in a generator file or how many messages
it contains, you can run the tool in `dump-stats` mode:

```sh
broker-cluster-benchmark -c cluster.conf -v --mode=dump-stats
```

In this mode, the tool only prints a summary of each generator file and then
exits. The output lists every generator file along with the topics it contains
and the number of messages it produces:

```sh
mars.dat
├── entries: 1000
|   ├── data-entries: 1000
|   └── command-entries: 0
└── topics:
    └── /benchmark/events
```

Note that the tool has to linearly scan each generator file, which may take
some time.