671 lines
24 KiB
ReStructuredText
671 lines
24 KiB
ReStructuredText
|
|
.. _tutorial:
|
|
|
|
=========================
|
|
Tutorial: A Real Analyzer
|
|
=========================
|
|
|
|
In this chapter we will develop a simple protocol analyzer from
|
|
scratch. Our analyzer will parse the
|
|
*Trivial File Transfer Protocol (TFTP)* in its original incarnation,
|
|
as described in `RFC 1350 <https://tools.ietf.org/html/rfc1350>`_.
|
|
TFTP provides a small protocol for copying files from a server to a
|
|
client system. It is most commonly used these days for providing boot
|
|
images to devices during initialization. The protocol is sufficiently
|
|
simple that we can walk through it end to end. See its `Wikipedia page
|
|
<https://en.wikipedia.org/wiki/Trivial_File_Transfer_Protocol>`_ for
|
|
more background.
|
|
|
|
.. rubric:: Contents
|
|
|
|
.. contents::
|
|
:local:
|
|
|
|
Creating a Spicy Grammar
|
|
========================
|
|
|
|
We start by developing Spicy grammar for TFTP. The protocol is
|
|
packet-based, and our grammar will parse the content of one TFTP
|
|
packet at a time. While TFTP is running on top of UDP, we will
|
|
Spicy parse just the actual UDP
|
|
application-layer payload, as described in `Section 5
|
|
<https://tools.ietf.org/html/rfc1350#section-5>`_ of the protocol
|
|
standard.
|
|
|
|
Parsing One Packet Type
|
|
-----------------------
|
|
|
|
TFTP is a binary protocol that uses a set of standardized, numerical
|
|
opcodes to distinguish between different types of packets---a common
|
|
idiom with such protocols. Each packet contains the opcode inside the
|
|
first two bytes of the UDP payload, followed by further fields that
|
|
then differ by type. For example, the following is the format of a
|
|
TFTP "Read Request" (RRQ) that initiates a download from a server::
|
|
|
|
2 bytes string 1 byte string 1 byte (from RFC 1350)
|
|
------------------------------------------------
|
|
| Opcode | Filename | 0 | Mode | 0 |
|
|
------------------------------------------------
|
|
|
|
A Read Request uses an opcode of 1. The *filename* is a sequence of
|
|
ASCII bytes terminated by a null byte. The *mode* is another
|
|
null-terminated byte sequence that usually is either ``netascii``,
|
|
``octet``, or ``mail``, describing the desired encoding for data that
|
|
will be received.
|
|
|
|
Let's stay with the Read Request for a little bit and write a Spicy
|
|
parser just for this one packet type. The following is a minimal Spicy
|
|
unit to parse the three fields:
|
|
|
|
.. spicy-code:: rrq.spicy
|
|
|
|
module TFTP; # [1]
|
|
|
|
public type ReadRequest = unit { # [2]
|
|
opcode: uint16; # [3]
|
|
filename: bytes &until=b"\x00"; # [4]
|
|
mode: bytes &until=b"\x00"; # [5]
|
|
|
|
on %done { print self; } # [6]
|
|
};
|
|
|
|
Let's walk through:
|
|
|
|
- ``[1]`` All Spicy source files must start with a ``module`` line
|
|
defining a namespace for their content. By convention, the
|
|
namespace should match what is being parsed, so we call ours
|
|
``TFTP``. Naming our module ``TFTP`` also implies saving it
|
|
under the name ``tftp.spicy``, so that other modules can find it
|
|
through ``import TFTP;``. See :ref:`modules` for more on all of
|
|
this.
|
|
|
|
- ``[2]`` In Spicy, one will typically create a ``unit`` type for
|
|
each of the main data units that a protocol defines. We want to
|
|
parse a Read Request, so we call our type accordingly. We
|
|
declare it as public because we want to use this unit as the
|
|
starting point for parsing data. The following lines then lay
|
|
out the elements of such a request in the same order as the
|
|
protocol defines them.
|
|
|
|
- ``[3]`` Per the TFTP specification, the first field contains the
|
|
``opcode`` as an integer value encoded over two bytes. For
|
|
multi-byte integer values, it is important to consider the byte
|
|
order for parsing. TFTP uses `network byte order
|
|
<https://en.wikipedia.org/wiki/Endianness#Networking>`_ which
|
|
matches Spicy's default, so there is nothing else for us to do
|
|
here. (If we had to specify the order, we would add the
|
|
:ref:`&byte-order <attribute_order>` attribute).
|
|
|
|
- ``[4]`` The filename is a null-terminated byte sequence, which
|
|
we can express directly as such in Spicy: the ``filename`` field
|
|
will accumulate bytes until a null byte is encountered. Note
|
|
that even though the specification of a Read Request shows the
|
|
``0`` as separate element inside the packet, we don't create a
|
|
field for it, but rather exploit it as a terminator for the file
|
|
name (which will not be included into the ``filename`` stored).
|
|
|
|
- ``[5]`` The ``mode`` operates just the same as the
|
|
``filename``.
|
|
|
|
- ``[6]`` Once we are done parsing a Read Request, we print out
|
|
the result for debugging.
|
|
|
|
We should now be able to parse a Read Request. To try it, we need the
|
|
actual payload of a corresponding packet. With TFTP, the format is
|
|
simple enough that we can start by faking data with ``printf``
|
|
and pipe that into the Spicy tool :ref:`spicy-driver <spicy-driver>`:
|
|
|
|
.. spicy-output:: rrq.spicy 1
|
|
:exec: printf '\000\001rfc1350.txt\000octet\000' | spicy-driver %INPUT
|
|
:show-with: tftp.spicy
|
|
|
|
Here, ``spicy-driver`` compiles our ``ReadRequest`` unit into an
|
|
executable parser and then feeds it with the data it is receiving on
|
|
standard input. The output of ``spicy-driver`` is the result of our
|
|
``print`` statement executing at the end.
|
|
|
|
.. _testing-with-batch-mode:
|
|
|
|
What would we do with a more complex protocol where we cannot easily
|
|
use ``printf`` to create some dummy payload? We would probably have
|
|
access to some protocol traffic in pcap traces, however we can't just
|
|
feed those into ``spicy-driver`` directly as they will contain all the
|
|
other network layers as well that our grammar does not handle (e.g.,
|
|
IP and UDP). One way to test with a trace would be proceeding with
|
|
Zeek integration at this point, so that we could let Zeek strip off
|
|
the lower layers and then feed our parser only the TFTP application
|
|
payload. However, during development it is often easier to avoid
|
|
Zeek's additional complexity at first, and stay with ``spicy-driver``
|
|
until the protocol parsing is mostly in place.
|
|
|
|
To facilitate that, ``spicy-driver`` offers a :ref:`batch mode
|
|
<spicy-driver-batch>`, which allows feeding connection-based,
|
|
bi-directional packet payloads into a parser, just as Zeek (or any
|
|
other network application) would do after stripping off the lower
|
|
layers. In this mode, ``spicy-driver`` reads input from a
|
|
specially-crafted batch file that retains the packet structure of the
|
|
underlying network communication as well as (just) the payload data
|
|
that we want parse.
|
|
|
|
To create such a batch input file, we can leverage Zeek itself: it
|
|
comes with a corresponding script that turns any PCAP trace into a
|
|
``spicy-driver`` batch file. Let's use that script with a tiny TFTP
|
|
trace, ``tftp_rrq.pcap``, borrowed from `Wireshark's pcap archive
|
|
<https://wiki.wireshark.org/SampleCaptures#tftp>`_. First, we confirm
|
|
with ``tcpdump`` that the trace contains a single file download:
|
|
|
|
.. code::
|
|
|
|
# tcpdump -ttnr tftp_rrq.pcap
|
|
1367411051.972852 IP 192.168.0.253.50618 > 192.168.0.10.69: 20 RRQ "rfc1350.txtoctet" [\|tftp]
|
|
1367411052.077243 IP 192.168.0.10.3445 > 192.168.0.253.50618: UDP, length 516
|
|
1367411052.081790 IP 192.168.0.253.50618 > 192.168.0.10.3445: UDP, length 4
|
|
[...]
|
|
|
|
We now run Zeek on that trace to perform the batch conversion:
|
|
|
|
.. code::
|
|
|
|
# zeek -r tftp_rrq.pcap policy/frameworks/spicy/record-spicy-batch SpicyBatch::filename=tftp_rrq.dat
|
|
tracking [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp]
|
|
tracking [orig_h=192.168.0.10, orig_p=3445/udp, resp_h=192.168.0.253, resp_p=50618/udp]
|
|
recorded 2 sessions total
|
|
output in tftp_rrq.dat
|
|
|
|
This leaves a new ``spicy-driver`` batch file in ``tftp_rrq.dat`` (if
|
|
we had left off the ``SpicyBatch::filename`` argument, the default
|
|
output name is ``batch.dat``).
|
|
|
|
Now we can pass that batch file into ``spicy-driver``:
|
|
|
|
.. spicy-output:: rrq.spicy 2
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp%orig=TFTP::ReadRequest %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp%orig=TFTP::ReadRequest tftp.spicy
|
|
|
|
The one additional piece here is that we need to tell ``spicy-driver``
|
|
on which packets inside the batch file to deploy our parser (because,
|
|
in principle, the batch could contain many different protocols
|
|
distributed over independent connections). We achieve that through
|
|
``-P 69/udp%orig=TFTP::ReadRequest``, which specifies that we want to
|
|
use the ``TFTP::ReadRequest`` on all originator-side UDP packets for
|
|
any connections on port ``69/udp``. See :ref:`spicy-driver
|
|
documentation <spicy-driver-batch>` for more on that syntax.
|
|
|
|
.. note::
|
|
|
|
.. versionadded:: 1.13 parser aliases
|
|
|
|
That option ``-P`` (aka ``--parser-alias``) is a feature added to
|
|
Spicy in version 1.13. An alternative to using that option---which
|
|
works with older Spicy version as well---is providing a
|
|
:ref:`%port <unit_meta_data>` property inside the
|
|
``TFTP::ReadRequest`` unit; the two mechanisms have the
|
|
same effect.
|
|
|
|
Altogether, this gives us an easy way to test our TFTP parser with
|
|
actual packet data, without needing to switch to full Zeek integration
|
|
yet.
|
|
|
|
The batch mode of ``spicy-driver`` is generally worth keeping in mind
|
|
while developing a new analyzer: even if the eventual goal is to
|
|
create a Zeek analyzer, it is usually easier to work with
|
|
``spicy-driver`` for as long as possible before transitioning to the
|
|
Zeek-side glue layer later. The same observation applies to debugging:
|
|
tracking down why a parser isn't quite doing what you would expect is
|
|
normally quicker with Zeek out of the picture. You can even *craft*
|
|
input for ``spicy-driver`` manually if you need to test specific edge
|
|
cases, for example by simply editing the payload data inside an
|
|
existing batch file, tweaking it the way you need it.
|
|
|
|
|
|
Generalizing to More Packet Types
|
|
---------------------------------
|
|
|
|
So far we can parse a Read Request, but nothing else. In fact, we are
|
|
not even examining the ``opcode`` yet at all to see if our input
|
|
actually *is* a Read Request. To generalize our grammar to other TFTP
|
|
packet types, we will need to parse the ``opcode`` on its own first,
|
|
and then use the value to decide how to handle subsequent data. Let's
|
|
start over with a minimal version of our TFTP grammar that looks at
|
|
just the opcode:
|
|
|
|
.. spicy-code:: tftp-1.spicy
|
|
|
|
module TFTP;
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16;
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
.. spicy-output:: tftp-1.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
As you see, we now use ``-P 69/udp=TFTP::Packet`` because we no longer
|
|
need to worry about the direction: from now on, the same ``Packet``
|
|
unit handles both originator and responder sides. However, because the
|
|
way TFTP works, we need an additional parser mapping for the data
|
|
connection that's part of the PCAP as well, because that happens on a
|
|
different port: ``-P 50618/udp=TFTP::Packet``. The handling of such
|
|
dynamic, non-standard ports is something that normally the host
|
|
application (e.g., Zeek) would handle on its side. With
|
|
``spicy-driver``, we need to do it manually ourselves.
|
|
|
|
With this in place we now, in fact, see output for all the packets
|
|
that the original PCAP contains.
|
|
|
|
Next we create a separate type to parse the fields that are specific
|
|
to a Read Request:
|
|
|
|
.. spicy-code::
|
|
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
|
|
We do not declare this type as public because we will use it only
|
|
internally inside our grammar; it is not a top-level entry point for
|
|
parsing (that's ``Packet`` now).
|
|
|
|
Now we need to tie the two units together. We can do that by adding
|
|
the ``ReadRequest`` as a field to the ``Packet``, which will let Spicy
|
|
parse it as a sub-unit:
|
|
|
|
.. spicy-code:: tftp-2.spicy
|
|
|
|
module TFTP;
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16;
|
|
rrq: ReadRequest;
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
# %hide-begin%
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
# %hide-end%
|
|
|
|
.. spicy-output:: tftp-2.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
However, this does not help us much yet: it still resembles our
|
|
original version in that it continues to hardcode one specific packet
|
|
type. Indeed, we are now getting error messages for packets of other
|
|
opcodes because we told ``spicy-driver`` to use ``Packet`` for them as
|
|
well, even though our current definition of ``Packet`` cannot actually
|
|
parse them successfully.
|
|
|
|
But the direction of using sub-units remains promising, we only need
|
|
to instruct the parser to leverage the ``opcode`` to decide what
|
|
particular sub-unit to use. Spicy provides a ``switch`` construct for
|
|
such dispatching:
|
|
|
|
.. spicy-code:: tftp-3.spicy
|
|
|
|
module TFTP;
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16;
|
|
|
|
switch ( self.opcode ) {
|
|
1 -> rrq: ReadRequest;
|
|
};
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
# %hide-begin%
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
# %hide-end%
|
|
|
|
.. spicy-output:: tftp-3.spicy 1
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
The ``self`` keyword always refers to the unit instance currently
|
|
being parsed, and we use that to get to the opcode for switching on.
|
|
If it is ``1``, we descend down into a Read Request. We are still
|
|
getting error messages for other opcodes, but now ``spicy-driver`` is
|
|
no longer complaining that it can't parse it them as a Read Request.
|
|
Instead, we're rightfully being told that our ``switch`` statement
|
|
doesn't provide the alternatives for other opcodes yet.
|
|
|
|
Of course, it is now easy to add more unit types for handling other
|
|
opcodes. Let's start with acknowledgments:
|
|
|
|
.. spicy-code:: tftp-4.spicy
|
|
|
|
# %hide-begin%
|
|
module TFTP;
|
|
# %hide-end%
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16;
|
|
|
|
switch ( self.opcode ) {
|
|
1 -> rrq: ReadRequest;
|
|
4 -> ack: Acknowledgement;
|
|
};
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
type Acknowledgement = unit {
|
|
num: uint16; # block number being acknowledged
|
|
};
|
|
# %hide-begin%
|
|
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
# %hide-end%
|
|
|
|
.. spicy-output:: tftp-4.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
As expected, the output shows that for opcode 4, our TFTP parser now
|
|
descends into the ``ack`` field while leaving ``rrq`` unset. Now
|
|
opcode 3 is the only one remaining in our input that is not handled
|
|
yet, hence the remaining error messages.
|
|
|
|
In total, TFTP defines three more opcodes for other packet types:
|
|
``2`` is a Write Request, ``3`` is file data being sent, and ``5`` is
|
|
an error. Let's add these to our grammar as well, so that we get the
|
|
whole protocol covered (please refer to the RFC for specifics of each
|
|
opcode type):
|
|
|
|
.. spicy-code:: tftp-complete-1.spicy
|
|
|
|
module TFTP;
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16;
|
|
|
|
switch ( self.opcode ) {
|
|
1 -> rrq: ReadRequest;
|
|
2 -> wrq: WriteRequest;
|
|
3 -> data: Data;
|
|
4 -> ack: Acknowledgement;
|
|
5 -> error: Error;
|
|
};
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
|
|
type WriteRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
|
|
type Data = unit {
|
|
num: uint16;
|
|
data: bytes &eod; # parse until end of data (i.e., packet) is reached
|
|
};
|
|
|
|
type Acknowledgement = unit {
|
|
num: uint16;
|
|
};
|
|
|
|
type Error = unit {
|
|
code: uint16;
|
|
msg: bytes &until=b"\x00";
|
|
};
|
|
|
|
.. spicy-output:: tftp-complete-1.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
Now we are finally error-free.
|
|
|
|
This grammar works well already, but we can improve it a bit more.
|
|
|
|
Using Enums
|
|
-----------
|
|
|
|
The use of integer values inside the ``switch`` construct is not
|
|
exactly pretty: they are hard to read and maintain. We can improve our
|
|
grammar by using an enumerator type with descriptive labels instead.
|
|
We first declare an ``enum`` type that provides one label for each
|
|
possible opcode:
|
|
|
|
.. spicy-code::
|
|
|
|
type Opcode = enum { RRQ = 1, WRQ = 2, DATA = 3, ACK = 4, ERROR = 5 };
|
|
|
|
Now we can change the ``switch`` to look like this:
|
|
|
|
.. spicy-code:: tftp-enum.spicy
|
|
|
|
# %hide-begin%
|
|
module TFTP;
|
|
|
|
type Opcode = enum { RRQ = 1, WRQ = 2, DATA = 3, ACK = 4, ERROR = 5 };
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16 &convert=Opcode($$);
|
|
# %hide-end%
|
|
|
|
switch ( self.opcode ) {
|
|
Opcode::RRQ -> rrq: ReadRequest;
|
|
Opcode::WRQ -> wrq: WriteRequest;
|
|
Opcode::DATA -> data: Data;
|
|
Opcode::ACK -> ack: Acknowledgement;
|
|
Opcode::ERROR -> error: Error;
|
|
};
|
|
|
|
# %hide-begin%
|
|
on %done { print self; }
|
|
};
|
|
|
|
type ReadRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
|
|
type WriteRequest = unit {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
};
|
|
|
|
type Data = unit {
|
|
num: uint16;
|
|
data: bytes &eod; # parse until end of data (i.e., packet) is reached
|
|
};
|
|
|
|
type Acknowledgement = unit {
|
|
num: uint16;
|
|
};
|
|
|
|
type Error = unit {
|
|
code: uint16;
|
|
msg: bytes &until=b"\x00";
|
|
};
|
|
# %hide-end%
|
|
|
|
Much better, but there is a catch still: this will not compile because
|
|
of a type mismatch. The switch cases' expressions have type
|
|
``Opcode``, but ``self.opcode`` remains of type ``uint16``. That is
|
|
because Spicy cannot know on its own that the integers we parse into
|
|
``opcode`` match the numerical values of the ``Opcode`` labels. But
|
|
we can convert the former into the latter explicitly by adding a
|
|
:ref:`&convert <attribute_convert>` attribute to the ``opcode`` field:
|
|
|
|
.. spicy-code::
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16 &convert=Opcode($$);
|
|
...
|
|
};
|
|
|
|
This does two things:
|
|
|
|
1. Each time an ``uint16`` gets parsed for this field, it is not
|
|
directly stored in ``opcode``, but instead first passed through the
|
|
expression that ``&convert`` specifies. Spicy then stores the
|
|
*result* of that expression, potentially adapting the field's type
|
|
accordingly. Inside the ``&convert`` expression, the parsed value is
|
|
accessible through the special identifier ``$$``.
|
|
|
|
2. Our ``&convert`` expression passes the parsed integer into the
|
|
constructor for the ``Opcode`` enumerator type, which lets Spicy
|
|
create an ``Opcode`` value with the label that corresponds to the
|
|
integer value.
|
|
|
|
With this transformation, the ``opcode`` field now has type ``Opcode``
|
|
and hence can be used with our updated switch statement. You can see
|
|
the new type for ``opcode`` in the output as well:
|
|
|
|
.. spicy-output:: tftp-enum.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
See :ref:`attribute_convert` for more on ``&convert``, and
|
|
:ref:`type_enum` for more on the ``enum`` type.
|
|
|
|
.. note::
|
|
|
|
What happens when ``Opcode($$)`` receives an integer that does not
|
|
correspond to any of the labels? Spicy permits that and will
|
|
substitute an implicitly defined ``Opcode::Undef`` label. It will
|
|
also retain the actual integer value, which can be recovered by
|
|
converting the enum value back to an integer.
|
|
|
|
Using Unit Parameters
|
|
---------------------
|
|
|
|
Looking at the two types ``ReadRequest`` and ``WriteRequest``, we see
|
|
that both are using exactly the same fields. That means we do not
|
|
really need two separate types here, and could instead define a
|
|
single ``Request`` unit to cover both cases. Doing so is
|
|
straight-forward, except for one issue: when parsing such a
|
|
``Request``, we would now lose the information whether we are seeing
|
|
read or a write operation. For a potential Zeek integration later it will be
|
|
useful to retain that distinction, so let us leverage a Spicy
|
|
capability that allows passing state into a sub-unit: :ref:`unit
|
|
parameters <unit_parameters>`. Here's the corresponding excerpt after
|
|
that refactoring:
|
|
|
|
.. spicy-code:: tftp-unified-request.spicy
|
|
|
|
# %hide-begin%
|
|
module TFTP;
|
|
|
|
type Opcode = enum { RRQ = 1, WRQ = 2, DATA = 3, ACK = 4, ERROR = 5 };
|
|
# %hide-end%
|
|
|
|
public type Packet = unit {
|
|
opcode: uint16 &convert=Opcode($$);
|
|
|
|
switch ( self.opcode ) {
|
|
Opcode::RRQ -> rrq: Request(True);
|
|
Opcode::WRQ -> wrq: Request(False);
|
|
# ...
|
|
# %hide-begin%
|
|
Opcode::DATA -> data: Data;
|
|
Opcode::ACK -> ack: Acknowledgement;
|
|
Opcode::ERROR -> error: Error;
|
|
# %hide-end%
|
|
};
|
|
|
|
on %done { print self; }
|
|
};
|
|
|
|
type Request = unit(is_read: bool) {
|
|
filename: bytes &until=b"\x00";
|
|
mode: bytes &until=b"\x00";
|
|
|
|
on %done { print "We got a %s request." % (is_read ? "read" : "write"); }
|
|
};
|
|
|
|
# %hide-begin%
|
|
type Data = unit {
|
|
num: uint16;
|
|
data: bytes &eod; # parse until end of data (i.e., packet) is reached
|
|
};
|
|
|
|
type Acknowledgement = unit {
|
|
num: uint16; # block number being acknowledged
|
|
};
|
|
|
|
type Error = unit {
|
|
code: uint16;
|
|
msg: bytes &until=b"\x00";
|
|
};
|
|
# %hide-end%
|
|
|
|
We see that the ``switch`` now passes either ``True`` or ``False``
|
|
into the ``Request`` type, depending on whether it is a Read Request
|
|
or Write Request. For demonstration, we added another ``print``
|
|
statement, so that we can see how that boolean becomes available
|
|
through the ``is_read`` unit parameter:
|
|
|
|
.. spicy-output:: tftp-unified-request.spicy
|
|
:exec: spicy-driver -F tutorial/examples/tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet %INPUT
|
|
:show-as: spicy-driver -F tftp_rrq.dat -P 69/udp=TFTP::Packet -P 50618/udp=TFTP::Packet tftp.spicy
|
|
:max-lines: 6
|
|
|
|
Admittedly, the unit parameter is almost overkill in this
|
|
example, but it proves very useful in more complex grammars where one
|
|
needs access to state information, in particular also from
|
|
higher-level units. For example, if the ``Packet`` type stored
|
|
additional state that sub-units needed access to, they could receive
|
|
the ``Packet`` itself as a parameter.
|
|
|
|
Complete Grammar
|
|
----------------
|
|
|
|
Combining everything discussed so far, this leaves us with the
|
|
following complete grammar for TFTP, including the packet formats in
|
|
comments as well:
|
|
|
|
.. literalinclude:: /_static/tftp-no-accept.spicy
|
|
:language: spicy
|
|
|
|
Next Steps
|
|
==========
|
|
|
|
This tutorial provides an introduction to the Spicy language and
|
|
toolchain. Spicy's capabilities go much further than what we could
|
|
show here. Some pointers for what to look at next:
|
|
|
|
- :ref:`programming` provides an in-depth discussion of the Spicy
|
|
language, including in particular all the constructs for
|
|
:ref:`parsing data <parsing>` and a :ref:`reference of language
|
|
elements <spicy_language>`. Note that most of Spicy's :ref:`types
|
|
<types>` come with operators and methods for operating on values.
|
|
The :ref:`debugging` section helps understanding Spicy's operation
|
|
if results do not match what you would expect.
|
|
|
|
- :ref:`examples` summarizes grammars coming with the
|
|
Spicy distribution.
|
|
|
|
- Zeek's :zeek:`Spicy tutorial <devel/spicy/tutorial.html>` continues
|
|
the TFTP example by turning the Spicy code developed here into
|
|
a full Zeek analyzer.
|
|
|
|
- :ref:`zeek_plugin` discusses Spicy's integration into Zeek.
|