.. -*- mode: rst-mode -*-
..
.. Note: This file includes further autogenerated ones.
..
.. Version number is filled in automatically.
.. |version| replace:: 2.6.0-11
===========
ZeekControl
===========
.. rst-class:: opening
This document summarizes installation and use of *ZeekControl*,
a tool for operating Zeek installations. *ZeekControl*
has two modes of operation: a *stand-alone* mode for
managing a traditional, single-system Zeek setup; and a *cluster*
mode for maintaining a multi-system setup of coordinated Zeek
instances load-balancing the work across a set of independent
machines. Once installed, operation is quite similar in both modes;
just keep in mind that if this document refers to "nodes" and you're
in a stand-alone setup, there is only a single node and no workers or
proxies.
.. contents::
Download
--------
You can find the latest ZeekControl release for download at
https://www.zeek.org/download.
ZeekControl's git repository is located at https://github.com/zeek/zeekctl.
This document describes ZeekControl |version|. See the ``CHANGES``
file for version history.
Prerequisites
-------------
Running ZeekControl requires the following prerequisites:
- A Unix system. FreeBSD, Linux, and Mac OS X are supported and
should work out of the box. Other Unix systems will quite likely
require some tweaking.
- A version of *Python* >= 3.9 (on FreeBSD, the package "pyXY-sqlite3" must
also be installed, where X.Y matches your Python version).
- The *bash* shell (note that on FreeBSD, *bash* is not installed by default).
- If *sendmail* is installed, then ZeekControl can send mail (for a cluster
setup, it would be needed on the manager only). Otherwise, ZeekControl
will not attempt to send mail.
- If *gdb* (*lldb* on Mac OS X, which is included with Xcode) is installed
and if Zeek crashes with a core dump, then ZeekControl can include
a backtrace in its crash report (that can be helpful for debugging
problems with Zeek). Otherwise, crash reports will not include a backtrace.
For a cluster setup that spans more than one machine, there are
additional requirements:
- All machines in the cluster must be running exactly the *same* operating
system (even the version must be the same).
- Every host in the cluster must have *rsync* installed.
- The manager host must have *ssh* installed, and every other host in the
cluster must have *sshd* installed and running.
- Decide which user account will be running ZeekControl, and then make sure
this user account is set up on all hosts in your cluster.
Note that if you plan to run zeekctl using sudo (i.e., "sudo zeekctl"), then
the user running zeekctl will be "root" (and in that case the user running
sudo does not need to exist on the other hosts in your cluster).
- Make sure the user running ZeekControl can ``ssh`` from the manager host
to each of the other hosts in your cluster, and this must work without
being prompted for anything (one way to accomplish this is to use ssh
public key authentication). You will need to try this manually before
attempting to run zeekctl, because zeekctl uses ssh to connect to other
hosts in your cluster (a key-based setup example follows this list).
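One way to set up key-based authentication is sketched below (the host names
are placeholders for your own cluster hosts; run this as the user that will
run ZeekControl)::

  # On the manager host: create a key pair and copy the public key
  # to every other host in the cluster.
  ssh-keygen -t ed25519
  ssh-copy-id proxy-1.example.com
  ssh-copy-id worker-1.example.com

  # Verify that no password or other prompt appears:
  ssh worker-1.example.com true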
If you're using a load-balancing method (such as PF_RING), then there is
additional software to install (for details, see the
:doc:`Cluster Configuration <../../configuration/index>` documentation).
Installation
------------
Follow the directions to install Zeek and ZeekControl
in the :doc:`Installing Zeek <../../install/install>`
documentation. Note that if you are planning to run Zeek in a cluster
configuration, then you need to install Zeek and ZeekControl only on the
manager host (the ZeekControl install_ or deploy_ commands will install Zeek
and all required scripts to the other hosts in your cluster).
Configuration
-------------
Before attempting to run ZeekControl, you first need to edit the ``zeekctl.cfg``,
``node.cfg``, and ``networks.cfg`` files. All three of these configuration
files contain a valid configuration by default, but you might need to
customize a few things.
First, edit the ``node.cfg`` file and specify the nodes that you will be
running. You need to decide whether you will be running Zeek standalone or
in a cluster. For a standalone configuration, there must be only one Zeek node
defined in this file. For a cluster configuration, at a minimum there
must be a manager node, a proxy node, and one or more worker nodes.
There is a :doc:`Cluster Configuration <../../configuration/index>`
guide that provides examples and additional information.
Each node defined in the ``node.cfg`` file has a set of options. A few options
are required to be specified on every node, and some options are allowed only
on certain node types (zeekctl will issue an error if you make a mistake).
By default, the ``node.cfg`` file contains a valid configuration for
a standalone setup and has a valid cluster configuration commented-out.
If you want to use the default configuration, then at least check if
the "interface" option is set correctly for your system. For a
description of every option available for nodes, see the `Node`_ section below.
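As a sketch of what a small cluster definition could look like (the host
addresses and interface name are placeholders; see the Cluster Configuration
guide for all available options), the ``node.cfg`` entries might resemble::

  [manager]
  type=manager
  host=10.0.0.10

  [proxy-1]
  type=proxy
  host=10.0.0.10

  [worker-1]
  type=worker
  host=10.0.0.11
  interface=eth0

  [worker-2]
  type=worker
  host=10.0.0.12
  interface=eth0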
In the ``zeekctl.cfg`` file, you should review the ZeekControl options and
check if any are not set correctly for your environment. The options have
default values that are reasonable for most users (the MailTo_ option is
probably the one that you will most likely want to change), but for a
description of every ZeekControl option, see the `Option Reference`_ section
below.
ZeekControl options are used in three different ways: some options
override the value of a Zeek script constant (these are noted in the
documentation), some affect only ZeekControl itself, and others affect Zeek.
Finally, edit the ``networks.cfg`` file and add each network (using standard
CIDR notation) that is considered local to the monitored environment (by
default, the ``networks.cfg`` file just lists the private IPv4 address spaces).
The information in the ``networks.cfg`` file is used when creating connection
summary reports. Also, ZeekControl takes the information in the
``networks.cfg`` file and puts it in the global Zeek script constant
``Site::local_nets``, and this global constant is used by several
standard Zeek scripts.
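For example, a ``networks.cfg`` that adds one public block to the default
private ranges might look like this (``192.0.2.0/24`` is a documentation-only
example prefix; each line is a CIDR network followed by an optional
description)::

  10.0.0.0/8          Private IP space
  172.16.0.0/12       Private IP space
  192.168.0.0/16      Private IP space
  192.0.2.0/24        Public addresses of this site (example)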
Basic Usage
-----------
There are two ways to run ZeekControl commands: by specifying a ZeekControl
command on the command-line (e.g. "zeekctl deploy"), or by entering
ZeekControl's interactive shell by running the zeekctl script without
any arguments (e.g. "zeekctl"). The interactive shell expects
commands on its command-line::
> zeekctl
Welcome to ZeekControl x.y
Type "help" for help.
[ZeekControl] >
As the message says, type help_ to see a list of
all commands. We will now briefly summarize the most important
commands. A full reference can be found in the `Command Reference`_ section below.
If this is the first time you are running ZeekControl, then the first command
you must run is the ZeekControl deploy_ command. The "deploy" command
will make sure all of the files needed by ZeekControl and Zeek are brought
up-to-date based on the configuration specified in the ``zeekctl.cfg``,
``node.cfg``, and ``networks.cfg`` files. It will also check if there
are any syntax errors in your Zeek policy scripts. For a cluster setup it will
copy all of the required scripts and executables to all the other hosts
in your cluster. Then it will successively start the logger, manager,
proxies, and workers (for a standalone configuration, only one Zeek instance
will be started).
The status_ command can be used to check that all nodes are "running".
If any nodes have a status of "crashed", then use the diag_ command to
see diagnostic information (you can specify the name of a crashed node
as an argument to the diag command to show diagnostics for only that one
node).
If you want to stop the monitoring, issue the stop_ command. After all
nodes have stopped, the status_ command should show all nodes as "stopped".
The exit_ command leaves the shell (you can exit ZeekControl while Zeek
is running).
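Putting these commands together, a typical first session in the interactive
shell might look like the following (output omitted; the node name passed to
``diag`` depends on your ``node.cfg`` and is needed only if a node crashed)::

  [ZeekControl] > deploy
  [ZeekControl] > status
  [ZeekControl] > diag worker-1
  [ZeekControl] > stop
  [ZeekControl] > exit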
Whenever the ZeekControl or Zeek configuration is modified in any way,
including changes to configuration files and site-specific policy
scripts or upgrading to a new version of Zeek, deploy_ must
be run (deploy will check all policy scripts, install all needed files, and
restart Zeek). No changes will take effect until deploy_ is run.
ZeekControl cron command
------------------------
The main purpose of the ZeekControl cron_ command is to check for Zeek nodes
that have crashed, and to restart them. The command also performs other
housekeeping tasks, such as removing expired log files, checking if there is
sufficient free disk space, etc. Although this command can be run directly
by a user, it is intended to be run from a cron job so that crashed nodes
will be restarted automatically.
For example, to setup a cron job that runs once every
five minutes, insert the following entry into the crontab of the
user running ZeekControl (change the path to the actual location of zeekctl
on your system) by running the ``crontab -e`` command::
*/5 * * * * /usr/local/zeek/bin/zeekctl cron
It is important to make sure that the cron job runs as the same user that
normally runs zeekctl on your system. For a cluster configuration, this
should be run only on the manager host.
Note that on some systems, the default PATH for cron jobs might not include
the directory where python or bash are installed (the symptoms of this
problem would be that "zeekctl cron" works when run directly by the user,
but does not work from a cron job). The simplest fix for this problem
would be to redefine PATH on a line immediately before the line that
runs zeekctl in your crontab.
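For example, the crontab might then look like this (the PATH value is only an
illustration; use whatever directories contain python3 and bash on your
system)::

  PATH=/usr/local/bin:/usr/bin:/bin
  */5 * * * * /usr/local/zeek/bin/zeekctl cron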
If the ``"zeekctl cron disable"`` command is run, then zeekctl cron will be
disabled (i.e., zeekctl cron won't do anything) until the
``"zeekctl cron enable"`` command is run. To check the status at any
time, run ``"zeekctl cron ?"``.
Log Files
---------
Log rotation and archival
~~~~~~~~~~~~~~~~~~~~~~~~~
While Zeek is running you can find the current set of (aggregated) logs
in ``logs/current`` (which is a symlink to the corresponding spool directory).
In a cluster setup, logs are written on the logger host (however, if there
is no logger defined in your node.cfg, then the manager writes logs).
Zeek logs are automatically rotated once per hour by default, or whenever Zeek
is stopped. A rotated log is renamed to contain a timestamp in the filename.
For example, the ``conn.log`` might be renamed to
``conn.2015-01-20-15-23-42.log``.
Immediately after a log is rotated, it is archived automatically. When a log
is archived, it is moved to a subdirectory of ``logs/`` named by date (such
as ``logs/2015-01-20``), then it is renamed again, and gzipped. For example,
a rotated log file named ``conn.2015-01-20-15-23-42.log`` might be archived
to ``logs/2015-01-20/conn.15:48:23-16:00:00.log.gz``. If the archival was
successful, then the original (rotated) log file is removed.
If, for some reason, a rotated log file cannot be archived then it will be
left in the node's working directory. Next time when ZeekControl either stops
Zeek or tries to restart a crashed Zeek, it will try to archive such log files
again. If this attempt fails, then an email is sent which contains the
name of a directory where any such unarchived logs can be found.
Log files created only when using ZeekControl
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are several log files that are not created by Zeek, but rather are
created only when using ZeekControl to run Zeek.
When ZeekControl starts Zeek it creates two files "stdout.log" and "stderr.log",
which just capture stdout and stderr from Zeek. Although these are not
actually Zeek logs, they might contain useful error or diagnostic information.
The contents of these files are included in crash reports and also
in the output of the "zeekctl diag" command.
Also, whenever logs are rotated, a connection summary report is generated if the
`trace-summary <https://github.com/zeek/trace-summary>`_ tool, included in the
Zeek distribution by default, is available. Although these are not actually
Zeek logs, they follow the same filename convention as other Zeek logs and they
have the filename prefix "conn-summary". If you don't want these connection
summary files to be created, then you can set the value of the TraceSummary_
option to an empty string.
Zeek Scripts
------------
Site-specific Customization
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you want to adapt the Zeek policy to the local environment, then
you will most likely need to write local policy scripts.
Sample local policy scripts (which you can edit)
are located in ``share/zeek/site``. The file called ``local.zeek`` gets
loaded automatically.
The recommended way to modify the policy is to use only "@load" directives
in the ``local.zeek`` script. For example, you can add a "@load" directive
to load a Zeek policy script that is included with Zeek but is not loaded
by default. You can also create custom site-specific
policy scripts in the same directory as the ``local.zeek`` script, and "@load"
them from the ``local.zeek`` script. For example, you could create
your own Zeek script ``mypolicy.zeek`` in the ``share/zeek/site`` directory,
and then add a line "@load mypolicy" (without the quotes) to the ``local.zeek``
script.
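As a sketch of that scenario, the two files might look like this (the
``Notice::mail_dest`` redef is just an illustration of site-specific
content)::

  # share/zeek/site/mypolicy.zeek -- your site-specific script
  redef Notice::mail_dest = "security@example.com";

  # added at the end of share/zeek/site/local.zeek
  @load mypolicy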
After creating or modifying your local policy scripts, you must install them
by using the ZeekControl "install" or "deploy" command. Next, you can use the
ZeekControl "scripts" command to verify that your new scripts will be loaded
when you start Zeek.
Load Order of Scripts
~~~~~~~~~~~~~~~~~~~~~
When writing custom site-specific policy scripts, it can be useful
to know in which order the scripts are loaded. For example, if more than
one script sets a value for the same global variable, then the value that
takes effect is the one set by the last such script loaded. The
ZeekControl "scripts" command shows the load order of every script
loaded by Zeek.
When Zeek starts up, the first script it loads is init-bare.zeek, followed
by init-default.zeek (keep in mind that each of these scripts loads many
other scripts). Note that these are the only scripts that are automatically
loaded when running Zeek directly (instead of using ZeekControl to run Zeek).
The next script loaded is the local.zeek script. By default, this script
loads a variety of other scripts. You can edit local.zeek and comment-out
anything that your site doesn't need (or add new "@load" directives).
Next, the "zeekctl" script package is loaded. This consists of some standard
settings that ZeekControl needs.
The next scripts loaded are ``local-networks.zeek`` and ``zeekctl-config.zeek``.
These scripts are automatically generated by ZeekControl based on the
contents of the ``networks.cfg`` and ``zeekctl.cfg`` files. Also, some
ZeekControl plugins might generate script code that will be automatically
inserted into the ``zeekctl-config.zeek`` script.
The last scripts loaded are any node-specific scripts specified with the
option ``aux_scripts`` in ``node.cfg``. This option is seldom
needed, but can be used to load additional scripts on individual nodes only.
For example, one could add a script ``experimental.zeek`` to a single worker
for trying out new experimental code.
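In ``node.cfg``, that might look roughly like the following sketch (host,
interface, and script name are placeholders; see the node options reference
for the exact semantics of ``aux_scripts``)::

  [worker-1]
  type=worker
  host=10.0.0.11
  interface=eth0
  aux_scripts=experimental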
Mails
-----
There are several situations when ZeekControl sends mail to the address given in
MailTo_ (note that ZeekControl will not be able to send any mail when the
value of the SendMail_ option is an empty string):
1. When the "zeekctl cron" command runs it performs various tasks (such as
checking available disk space, expiring old log files, etc.). If
any problems occur, a mail will be sent containing a list of those issues.
In order to reduce the amount of mail, the value of the following options
can be changed (see documentation of each option): MailHostUpDown_,
MinDiskSpace_, StatsLogEnable_, MailReceivingPackets_.
2. When ZeekControl tries to start or stop (via any of these commands:
start, stop, restart, deploy, or cron) a node that has crashed,
a crash report is mailed (one for each crashed node). The crash report
is essentially just the output of the "zeekctl diag" command.
3. When ZeekControl stops Zeek or restarts a crashed Zeek, if any log files
could not be archived, then mail will be sent to warn about this problem.
This mail can be disabled by setting ``MailArchiveLogFail=0``.
4. If `trace-summary <https://github.com/zeek/trace-summary>`_
is installed, a traffic summary is mailed each rotation interval. To
disable this mail, set ``MailConnectionSummary=0`` (however, the
connection summary file will still be created and archived along with
all other log files).
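To tune or disable these mails, the corresponding options can be set in
``zeekctl.cfg``, for example (the address is a placeholder)::

  MailTo = zeek-admin@example.com
  MailHostUpDown = 0
  MailArchiveLogFail = 0
  MailConnectionSummary = 0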
Using ZeekControl as an unprivileged user
-----------------------------------------
If you decide to run ZeekControl as an unprivileged user, there are a
few issues that you may encounter.
If you installed Zeek and ZeekControl as the "root" user, then you will need
to adjust the ownership or permissions of the "logs" and "spool" directories
(and everything in those directories) so that the user running ZeekControl
has write permission.
If you're using a cluster setup that spans multiple machines, and if
your ZeekControl ``install`` or ``deploy`` commands fail with a permission
denied error, then it's most likely due to the user running ZeekControl
not having permission to create the install prefix directory
(by default, this is ``/usr/local/zeek``) on each remote machine.
A simple workaround is to login to each machine in your cluster and
manually create the install prefix directory and then set ownership
or permissions of this directory so that the user who will run ZeekControl
has write access to it.
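A sketch of both adjustments, assuming an unprivileged user named ``zeek``
and the default install prefix, might be::

  # On the manager host: give the ZeekControl user write access
  # to the state directories.
  sudo chown -R zeek /usr/local/zeek/logs /usr/local/zeek/spool

  # On each remote cluster host: create the install prefix up front.
  sudo mkdir -p /usr/local/zeek
  sudo chown zeek /usr/local/zeek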
Finally, on the worker nodes (or the standalone node), Zeek must have access
to the target network interface in promiscuous mode. If Zeek doesn't have
the necessary permissions, then it will fail almost immediately upon
startup.
Zeek communication
------------------
This section summarizes the network communication between Zeek and ZeekControl,
which is useful to understand if you need to reconfigure your firewall. If
your firewall is preventing Zeek communication, then either the "deploy"
command or the "peerstatus" command will fail.
For a cluster setup, ZeekControl uses ssh to run commands on other hosts in
the cluster, so the manager host needs to connect to TCP port 22 on each
of the other hosts in the cluster. Note that ZeekControl never attempts
to ssh to the localhost, so in a standalone setup ZeekControl does not use ssh.
Each instance of Zeek in a cluster needs to communicate directly with other
instances of Zeek regardless of whether these instances are running on the same
host or not. Each proxy and worker needs to connect to the manager, and each
worker needs to connect to each proxy. If one or more logger nodes are
defined, then each of the other nodes needs to connect to each of the loggers.
Note that you can change the port that Zeek listens on by changing the value
of the "ZeekPort" option in your ``zeekctl.cfg`` file (this should be needed
only if your system has another process that listens on the same port). By
default, a standalone Zeek listens on TCP port 27760. For a cluster setup,
the logger listens on TCP port 27761, and the manager listens on TCP port 27762
(or 27761 if no logger is defined). Each proxy is assigned its own port
number, starting with one number greater than the manager's port. Likewise,
each worker is assigned its own port starting one number greater than the
highest port number assigned to a proxy.
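As a worked example of that scheme, a cluster with one logger, one manager,
two proxies, and three workers (all using the default ports) would listen as
follows::

  logger    27761/tcp
  manager   27762/tcp
  proxy-1   27763/tcp
  proxy-2   27764/tcp
  worker-1  27765/tcp
  worker-2  27766/tcp
  worker-3  27767/tcp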
Finally, a few ZeekControl commands (such as "print" and "peerstatus") rely
on Broker to communicate with Zeek. This means that for those commands to
function, ZeekControl needs to connect to each Zeek instance.
Command Reference
-----------------
The following summary lists all commands supported by ZeekControl.
If not specified otherwise, commands taking
*[<nodes>]* as arguments apply their action either to the given set of
nodes, to the manager node if "manager" is given, to all proxy nodes if
"proxies" is given, to all worker nodes if "workers" is given, or to all
nodes if none are given.
.. include:: commands.rst
Option Reference
----------------
This section summarizes the options that can be set in ``zeekctl.cfg``
for customizing the behavior of ZeekControl (the option names are not
case-sensitive). Usually, one only needs
to change the "user options", which are listed first. The "internal
options" are, as the name suggests, primarily used internally and set
automatically. They are documented here only for reference.
.. include:: options.rst
Plugins
-------
ZeekControl provides a plugin interface to extend its functionality. A
plugin is written in Python and can do any, or all, of the following:
* Perform actions before or after any of the standard ZeekControl
commands is executed. When running before the actual command, it
can filter which nodes to operate on, or stop the execution
altogether. When running after the command, it gets access to
the command's success status on a per-node basis (where applicable).
* Add custom commands to ZeekControl.
* Add custom options to ZeekControl defined in ``zeekctl.cfg``.
* Add custom keys to nodes defined in ``node.cfg``.
A plugin is written by deriving a new class from the ZeekControl
`Plugin`_ class. The Python script with the new plugin is then copied into a
plugin directory searched by ZeekControl at startup. By default,
ZeekControl searches ``<prefix>/lib/zeek/python/zeekctl/plugins``; additional directories
may be configured by setting the SitePluginPath_ option. Note that any plugin
script must end in ``*.py`` to be found. ZeekControl comes with some
example plugins that can be used as a starting point; see
the ``<prefix>/lib/zeek/python/zeekctl/plugins`` directory.
In the following, we document the API that is available to plugins. A
plugin must be derived from the `Plugin`_ class, and can use its
methods as well as those of the `Node`_ class.
.. include:: plugins.rst
.. _FAQ:
Questions and Answers
---------------------
*Can I use an NFS-mounted partition as the cluster's base directory to avoid the ``rsync``'ing?*
Yes. ZeekBase_ can be on an NFS partition.
Configure and install the shell as usual with
``--prefix=<ZeekBase>``. Then add ``HaveNFS=1`` and
``SpoolDir=<spath>`` to ``zeekctl.cfg``, where ``<spath>`` is a
path on the local disks of the nodes; ``<spath>`` will be used for
all non-shared data (make sure that the parent directory exists
and is writable on all nodes!). Then run ``make install`` again.
Finally, you can remove ``<ZeekBase>/spool`` (or link it to <spath>).
In addition, you might want to keep the log files locally on the nodes
as well by setting LogDir_ to a non-NFS directory. (Only
the manager's logs will be kept permanently; the logs of
workers/proxies are discarded upon rotation.)
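The ``zeekctl.cfg`` additions described above might therefore look like this
(``/data/zeek`` stands in for a path on each node's local disk)::

  HaveNFS = 1
  SpoolDir = /data/zeek/spool
  # optional: keep logs off NFS as well
  LogDir = /data/zeek/logs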
*What do I need to do when something in the Zeek distribution changes?*
After pulling from the main Zeek git repository, just re-run ``make
install`` inside your build directory. It will reinstall all the
files from the distribution that are not up-to-date. Then do
``zeekctl deploy`` to make sure everything gets pushed out.
*Can I change the naming scheme that ZeekControl uses for archived log files?*
Yes, set MakeArchiveName_ to a
script that outputs the desired destination file name for an
archived log file. The default script for that task is
``<ZeekBase>/share/zeekctl/scripts/make-archive-name``, which you
can use as a template for creating your own version. See
the beginning of that script for instructions.
*Can ZeekControl manage a cluster of nodes over non-global IPv6 scope (e.g. link-local)?*
This used to be supported through a ``ZoneID`` option in
``zeekctl.cfg``, but no longer works in later versions
of Zeek which use Broker as the communication framework. Please
file a feature request if this is important to you.