Broker Benchmarks
Broker ships with benchmarking tools that allow developers and users to investigate system performance in various deployment and configuration setups.
Clustering: broker-cluster-benchmark
This is the primary benchmark suite that runs Broker in a full end-to-end deployment. Unlike real deployments, this tool runs all Broker endpoints in a single OS process.
Setup and Configuration
Running broker-cluster-benchmark requires a cluster configuration file using
CAF's config syntax:
; comments start with a semicolon
foo = "bar" ; strings use double quotes
homepage = <https://zeek.org> ; URIs use angle brackets
list = [1, 2, 3] ; Lists use square brackets
The cluster config contains all participating Broker endpoints under nodes.
Each node must have at least an id (URI) and topics (list of strings). The
id is the network-wide identifier for peering. Use local:$name if a node
does not accept incoming connections and tcp://$ip:$port otherwise.
Nodes that publish data must have a generator-file. Nodes that wait for data
must set num-inputs. A minimal example file might look like this:
nodes {
  earth {
    id = <local:earth>
    peers = ["mars"]
    topics = ["/benchmark/events"]
    num-inputs = 100000
  }
  mars {
    id = <tcp://[::1]:8001>
    topics = ["/benchmark/events"]
    generator-file = "mars.dat"
    num-outputs = 100000
  }
}
This config file starts two nodes: earth and mars. On startup, mars opens
port 8001 and waits for its peers to connect, while earth does not open any
port since it has a local: ID. The peers entry for earth causes this node to
connect to mars at tcp://[::1]:8001.
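The id convention described above (local:$name for nodes that do not accept
incoming connections, tcp://$ip:$port otherwise) can be sketched in a few
lines of Python. This is only an illustration: the helper name node_id is
hypothetical and not part of Broker, and bracketing IPv6 addresses is an
assumption based on the tcp://[::1]:8001 example.

```python
import ipaddress


def node_id(name, ip=None, port=None):
    """Build a node ID string following the convention in this document.

    Hypothetical helper, not part of Broker.
    """
    if ip is None or port is None:
        # The node does not accept incoming connections.
        return f"local:{name}"
    # Assumption: IPv6 addresses are bracketed, as in tcp://[::1]:8001.
    host = f"[{ip}]" if ipaddress.ip_address(ip).version == 6 else ip
    return f"tcp://{host}:{port}"


print(node_id("earth"))
print(node_id("mars", "::1", 8001))
```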
The generator file mars.dat contains previously recorded meta data from a
live system. Setting num-outputs causes broker-cluster-benchmark to emit
exactly that number of messages. The node ignores additional messages in
the generator file if it contains more than num-outputs entries, and loops
through the file if it contains fewer entries.
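The truncate-or-loop behavior of num-outputs can be modeled with a short
Python sketch. It only illustrates the semantics described above and is not
how the tool is implemented; select_outputs and the sample entries are
hypothetical, and rejecting an empty file is an assumption made here.

```python
from itertools import cycle, islice


def select_outputs(entries, num_outputs):
    """Return exactly num_outputs entries, truncating or looping as needed."""
    if num_outputs > 0 and not entries:
        # Assumption: an empty generator file cannot be looped.
        raise ValueError("cannot loop over an empty generator file")
    # cycle() restarts from the first entry once the input is exhausted;
    # islice() stops after num_outputs items.
    return list(islice(cycle(entries), num_outputs))


recorded = ["a", "b", "c"]  # stand-in for recorded messages
print(select_outputs(recorded, 2))  # more entries than needed: truncated
print(select_outputs(recorded, 7))  # fewer entries than needed: looped
```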
Recording Meta Data
Setting the configuration parameter broker.recording-directory (or setting
the environment variable BROKER_RECORDING_DIRECTORY) to a non-empty path
triggers Broker to record meta data such as subscriptions, peerings, and
published data at this endpoint. The meta data is about 2MB for each 1M
recorded messages (depending on the structure of the data).
Setting the configuration parameter broker.output-generator-file-cap (or
setting the environment variable BROKER_OUTPUT_GENERATOR_FILE_CAP) to an
unsigned integer limits recording to that many published messages.
To record data from a Zeek cluster, add a line like the following for each
node in /usr/local/zeek/etc/node.cfg:
env_vars=BROKER_RECORDING_DIRECTORY=/your/desired/path/zeek-recording-<node>
Replace <node> with the specific node name so that nodes do not overwrite
each other's data.
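For clusters with many nodes, the per-node lines can be generated rather
than typed by hand. The following Python sketch prints one env_vars line per
node. The node names and the helper env_vars_line are placeholders chosen
for illustration; substitute the names from your own node.cfg.

```python
def env_vars_line(base_dir, node):
    """Format the env_vars line for one node (hypothetical helper)."""
    return (
        "env_vars=BROKER_RECORDING_DIRECTORY="
        f"{base_dir}/zeek-recording-{node}"
    )


# Placeholder node names; use the section names from your node.cfg.
for node in ["manager", "proxy-1", "worker-1"]:
    print(env_vars_line("/your/desired/path", node))
```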
Generating Config Files from Recorded Meta Data
After recording meta data for all Broker nodes, the tool
broker-cluster-benchmark can automatically generate a cluster configuration
by analyzing the recorded files. The generated config file uses the directory
names as node names and establishes the recorded peering relations.
The tool generates a config file when invoked with --mode=generate-config
by scanning all specified directories. For example, the following command
prints a configuration for a recorded Broker session with two endpoints:
broker-cluster-benchmark --mode=generate-config recordings/server recordings/client
The tool assumes the directories server and client to contain the following
files:
recordings/
├── client
│   ├── id.txt
│   ├── messages.dat
│   ├── peers.txt
│   └── topics.txt
└── server
    ├── id.txt
    ├── messages.dat
    ├── peers.txt
    └── topics.txt
The produced configuration contains two nodes: client and server. All
other fields and peering relations are generated automatically from the file
contents. Note that the tool performs a linear scan over all messages.dat
files to compute the number of expected messages in the system, which may
take some time.
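Before generating a config, it can help to confirm that each recording
directory has the four files listed above. The Python sketch below is a
hypothetical pre-flight check and not part of broker-cluster-benchmark. It
only tests that the files exist and makes no assumption about their
contents.

```python
from pathlib import Path
import tempfile

# File names taken from the directory listing in this document.
EXPECTED_FILES = ("id.txt", "messages.dat", "peers.txt", "topics.txt")


def missing_files(recording_dir):
    """Return the expected file names that are absent from recording_dir."""
    directory = Path(recording_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


# Demonstration on a throwaway directory that lacks topics.txt.
with tempfile.TemporaryDirectory() as root:
    client = Path(root) / "client"
    client.mkdir()
    for name in ("id.txt", "messages.dat", "peers.txt"):
        (client / name).touch()
    print(missing_files(client))
```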
Running the Benchmark
The tool broker-cluster-benchmark expects at least -c $configFile. Passing
-v also enables verbose output to get a glimpse into the program state at
runtime. When running a configuration for the first time, we strongly recommend
running in verbose mode:
broker-cluster-benchmark -c cluster.conf -v
Running in verbose mode prints various state messages to the console:
Peering tree (multiple roots are allowed):
mars, topics: ["/benchmark/events"]
└── earth, topics: ["/benchmark/events"]
mars starts listening at [::1]:8001
mars up and running
earth starts peering to [::1]:8001 (mars)
earth up and running
all nodes are up and running, run benchmark
earth waits for messages
mars starts publishing
... snip ...
Before the tool spins up all Broker endpoints, it makes sure that the configured topology is safe to deploy:
- No loops allowed.
- Each node must set the mandatory fields id and topics.
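The two checks above can be approximated in Python to vet a topology before
writing the config file. This is a sketch under stated assumptions, not the
tool's actual validation: nodes are modeled as a plain dict, peerings are
treated as undirected edges, and a loop is any cycle in that graph. The
function validate_topology is hypothetical.

```python
def validate_topology(nodes):
    """Return a list of error strings; an empty list means no issue found."""
    errors = []
    for name, fields in nodes.items():
        for key in ("id", "topics"):
            if key not in fields:
                errors.append(f"{name}: missing mandatory field '{key}'")

    # Union-find over node names: an edge whose endpoints are already
    # connected closes a loop.
    parent = {name: name for name in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    seen = set()
    for name, fields in nodes.items():
        for peer in fields.get("peers", []):
            if peer not in nodes:
                errors.append(f"{name}: unknown peer '{peer}'")
                continue
            edge = frozenset((name, peer))
            if edge in seen:
                continue  # the same peering listed from both sides
            seen.add(edge)
            a, b = find(name), find(peer)
            if a == b:
                errors.append(f"loop detected at peering {name} -> {peer}")
            else:
                parent[a] = b
    return errors


cluster = {
    "earth": {"id": "local:earth", "peers": ["mars"],
              "topics": ["/benchmark/events"]},
    "mars": {"id": "tcp://[::1]:8001", "topics": ["/benchmark/events"]},
}
print(validate_topology(cluster))
```

Union-find keeps the loop check close to linear in the number of peerings and
handles topologies with multiple roots without extra bookkeeping.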
Broker's source distribution includes a working setup to get started at
tests/benchmark/cluster-example.zip.
Inspecting Generator Files
If you're unsure which topics appear in a generator file or how many messages
it contains, you can add the dump-stats mode:
broker-cluster-benchmark -c cluster.conf -v --mode=dump-stats
In this mode, the tool only prints statistics for all generator files and then exits. The output lists each generator file, the topics it contains, and how many messages it produces:
mars.dat
├── entries: 1000
|   ├── data-entries: 1000
|   └── command-entries: 0
└── topics:
    └── /benchmark/events
Note that the tool has to linearly scan each generator file, which may take some time.