========
http.log
========

The HyperText Transfer Protocol (HTTP) log, or :file:`http.log`, is another
core data source generated by Zeek. With the transition from clear-text HTTP to
encrypted HTTPS traffic, the :file:`http.log` is less active in many
environments. In some cases, however, organizations implement technologies or
practices to expose HTTPS as HTTP. Whether you're looking at legacy HTTP on the
wire, or HTTPS that has been exposed as HTTP, Zeek's :file:`http.log` offers
utility for examining normal, suspicious, and malicious activity.

The Zeek scripting manual, derived from the Zeek source code, completely
explains the meaning of each field in the :file:`http.log` (and other logs). It
would be duplicative to manually recreate that information in another format
here. Therefore, this entry seeks to show how an analyst would make use of the
information in the :file:`http.log`. Those interested in getting details on
every element of the :file:`http.log` should refer to :zeek:see:`HTTP::Info`.

Throughout the sections that follow, we will inspect Zeek logs in JSON format.

Inspecting the :file:`http.log`
===============================

To inspect the :file:`http.log`, we will use the same techniques we learned
earlier in the manual. First, we have a JSON-formatted log file, either
collected by Zeek watching a live interface, or by Zeek processing stored
traffic. We use the :program:`jq` utility to review the contents.

.. code-block:: console

  zeek@zeek:~/zeek-test/json$ jq . -c http.log

::

  {"ts":1591367999.512593,"uid":"C5bLoe2Mvxqhawzqqd","id.orig_h":"192.168.4.76","id.orig_p":46378,"id.resp_h":"31.3.245.133","id.resp_p":80,"trans_depth":1,"method":"GET","host":"testmyids.com","uri":"/","version":"1.1","user_agent":"curl/7.47.0","request_body_len":0,"response_body_len":39,"status_code":200,"status_msg":"OK","tags":[],"resp_fuids":["FEEsZS1w0Z0VJIb5x4"],"resp_mime_types":["text/plain"]}

This is a very simple :file:`http.log`: it contains only one entry. As before,
we could see each field printed on its own line:

.. code-block:: console

  zeek@zeek:~/zeek-test/json$ jq . http.log

::

  {
    "ts": 1591367999.512593,
    "uid": "C5bLoe2Mvxqhawzqqd",
    "id.orig_h": "192.168.4.76",
    "id.orig_p": 46378,
    "id.resp_h": "31.3.245.133",
    "id.resp_p": 80,
    "trans_depth": 1,
    "method": "GET",
    "host": "testmyids.com",
    "uri": "/",
    "version": "1.1",
    "user_agent": "curl/7.47.0",
    "request_body_len": 0,
    "response_body_len": 39,
    "status_code": 200,
    "status_msg": "OK",
    "tags": [],
    "resp_fuids": [
      "FEEsZS1w0Z0VJIb5x4"
    ],
    "resp_mime_types": [
      "text/plain"
    ]
  }

HTTP is a protocol that was initially fairly simple. Over time it has become
increasingly complicated. It's not the purpose of this manual to describe how
HTTP can be used and abused. Rather, we will take a brief look at the most
important elements of this :file:`http.log` entry, which is almost all of them.

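
The inspection done with :program:`jq` can also be scripted. The following is a
minimal Python sketch, not part of Zeek, that assumes the log holds one JSON
object per line, as in the output shown earlier. The helper name
``parse_http_log`` is made up for illustration, and the sample line is the
single entry from this section.

.. code-block:: python

  import json
  from datetime import datetime, timezone

  # The single http.log entry from this section, as one line of JSON.
  SAMPLE_LINE = (
      '{"ts":1591367999.512593,"uid":"C5bLoe2Mvxqhawzqqd",'
      '"id.orig_h":"192.168.4.76","id.orig_p":46378,'
      '"id.resp_h":"31.3.245.133","id.resp_p":80,'
      '"trans_depth":1,"method":"GET","host":"testmyids.com","uri":"/",'
      '"version":"1.1","user_agent":"curl/7.47.0",'
      '"request_body_len":0,"response_body_len":39,'
      '"status_code":200,"status_msg":"OK","tags":[],'
      '"resp_fuids":["FEEsZS1w0Z0VJIb5x4"],"resp_mime_types":["text/plain"]}'
  )


  def parse_http_log(lines):
      """Yield one dict per non-empty line of JSON-formatted log output."""
      for line in lines:
          line = line.strip()
          if line:
              yield json.loads(line)


  entry = next(parse_http_log([SAMPLE_LINE]))

  # Zeek's "ts" field is seconds since the Unix epoch; convert it to UTC.
  when = datetime.fromtimestamp(entry["ts"], tz=timezone.utc)

  print(when.isoformat(timespec="seconds"),
        entry["method"], entry["host"] + entry["uri"],
        entry["status_code"], entry["status_msg"])

To read a real file, the list could be replaced by an open file handle, for
example ``parse_http_log(open("http.log"))``.
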
Understanding the :file:`http.log` Entry
========================================

Similar to the previous :file:`dns.log`, the :file:`http.log` is helpful
because it combines elements from the conversation between the source and
destination in one log entry. The most fundamental elements of the log answer
questions concerning who made a request, who responded, and the nature of the
request and response.

In this entry, we see that ``192.168.4.76`` made a request to ``31.3.245.133``.
The originator made an HTTP version 1.1 GET request for ``/``, the root of the
site ``testmyids.com`` hosted by the responder, passing a user agent of
``curl/7.47.0``.

The responder replied with a 200 OK message, with a MIME (Multipurpose Internet
Mail Extensions) type of ``text/plain``. Zeek provides us a file ID (or
``fuid``) of ``FEEsZS1w0Z0VJIb5x4``. If we had configured Zeek to extract files
of type ``text/plain``, we could look at the content returned by the responder.

Finally, note the UID of ``C5bLoe2Mvxqhawzqqd``. This is the same UID found in
the :file:`conn.log` for this TCP connection. This allows us to link the
:file:`conn.log` entry with this :file:`http.log` entry.

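
Linking the two logs on the UID can be sketched in a few lines of Python. This
is only an illustration: the :file:`conn.log` records below are hypothetical,
reduced to a handful of fields and invented for this example (only the UID and
addresses of the first record come from this section), and the helper name
``join_on_uid`` is made up.

.. code-block:: python

  # http.log entry from this section, trimmed to the fields used here.
  http_entries = [
      {"uid": "C5bLoe2Mvxqhawzqqd", "method": "GET",
       "host": "testmyids.com", "uri": "/", "status_code": 200},
  ]

  # HYPOTHETICAL conn.log records, invented for illustration only.
  conn_entries = [
      {"uid": "C5bLoe2Mvxqhawzqqd", "proto": "tcp", "service": "http",
       "id.orig_h": "192.168.4.76", "id.resp_h": "31.3.245.133"},
      {"uid": "CHypotheticalUid0001", "proto": "udp", "service": "dns",
       "id.orig_h": "192.168.4.76", "id.resp_h": "192.168.4.1"},
  ]


  def join_on_uid(conn_entries, http_entries):
      """Pair each http.log entry with the conn.log entry sharing its uid."""
      conn_by_uid = {conn["uid"]: conn for conn in conn_entries}
      pairs = []
      for http in http_entries:
          conn = conn_by_uid.get(http["uid"])
          if conn is not None:
              pairs.append((conn, http))
      return pairs


  pairs = join_on_uid(conn_entries, http_entries)
  for conn, http in pairs:
      print(conn["uid"], conn["proto"], conn["service"],
            http["method"], http["host"] + http["uri"])

Because one connection can carry several HTTP transactions, a connection may
appear in more than one pair; the dictionary lookup handles that case without
changes.
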
Reviewing the Original Traffic
==============================

To better understand the original traffic, and how it relates to the Zeek
:file:`http.log`, let's look at the contents manually. HTTP is a clear-text
protocol. Assuming the contents are also clear text, and not obfuscated or
encrypted, we can read them directly. In the following example we use the
venerable program :program:`tcpflow` to create two files. One contains data
from the originator to the responder, while the second contains data from the
responder to the originator.

.. code-block:: console

  zeek@zeek:~/zeek-test$ tcpflow -r tm1t.pcap port 80

Let's first look at the data from the originator to the responder.

.. code-block:: console

  zeek@zeek:~/zeek-test$ cat 192.168.004.076.46378-031.003.245.133.00080

::

  GET / HTTP/1.1
  Host: testmyids.com
  User-Agent: curl/7.47.0
  Accept: */*

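
The flow file name used above follows a visible pattern: each IPv4 octet is
zero-padded to three digits, each port to five digits, and the source side
comes first. The sketch below rebuilds that name from the fields of the
:file:`http.log` entry. It is inferred from the file names in this section
only, it handles IPv4 only, and the function name ``tcpflow_name`` is made up
for illustration.

.. code-block:: python

  def tcpflow_name(src_ip, src_port, dst_ip, dst_port):
      """Build a flow file name in the style seen in this section (IPv4)."""

      def pad_ip(ip):
          # Zero-pad every octet to three digits: 4 -> 004, 76 -> 076.
          return ".".join(f"{int(octet):03d}" for octet in ip.split("."))

      # Ports are zero-padded to five digits: 80 -> 00080.
      return (f"{pad_ip(src_ip)}.{src_port:05d}"
              f"-{pad_ip(dst_ip)}.{dst_port:05d}")


  # Originator-to-responder flow, from id.orig_h/id.orig_p to id.resp_h/id.resp_p.
  request_file = tcpflow_name("192.168.4.76", 46378, "31.3.245.133", 80)
  # Responder-to-originator flow: swap the two endpoints.
  response_file = tcpflow_name("31.3.245.133", 80, "192.168.4.76", 46378)

  print(request_file)
  print(response_file)
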
Here is the data from the responder to the originator.

.. code-block:: console

  zeek@zeek:~/zeek-test$ cat 031.003.245.133.00080-192.168.004.076.46378

::

  HTTP/1.1 200 OK
  Server: nginx/1.16.1
  Date: Fri, 05 Jun 2020 14:40:07 GMT
  Content-Type: text/html; charset=UTF-8
  Content-Length: 39
  Connection: keep-alive
  Last-Modified: Fri, 10 Jan 2020 21:36:02 GMT
  ETag: "27-59bcfe9932c32"
  Accept-Ranges: bytes

  uid=0(root) gid=0(root) groups=0(root)

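As you can see, there are elements, particularly in the response, that do not
appear in the :file:`http.log`. For example, the ``Server`` header value of
``nginx/1.16.1`` is not logged. An analyst or administrator who wants that data
in the :file:`http.log` can adjust Zeek's logging to include it. Note also that
the ``Content-Type`` header claims ``text/html``, while the ``resp_mime_types``
field reports ``text/plain``, which suggests Zeek derived the field from the
content itself rather than from the header.
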
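
To see exactly which response details the log leaves out, the raw response can
be split into status line, headers, and body. The Python sketch below does
this for the response shown above. It is a simplified illustration rather than
a full HTTP parser: it assumes CRLF line endings, a single complete response,
and a trailing newline on the body (consistent with the ``Content-Length`` of
39). The function name ``parse_response`` is made up.

.. code-block:: python

  # Raw response from this section; the trailing "\n" on the body is assumed.
  RESPONSE = (
      "HTTP/1.1 200 OK\r\n"
      "Server: nginx/1.16.1\r\n"
      "Date: Fri, 05 Jun 2020 14:40:07 GMT\r\n"
      "Content-Type: text/html; charset=UTF-8\r\n"
      "Content-Length: 39\r\n"
      "Connection: keep-alive\r\n"
      "Last-Modified: Fri, 10 Jan 2020 21:36:02 GMT\r\n"
      'ETag: "27-59bcfe9932c32"\r\n'
      "Accept-Ranges: bytes\r\n"
      "\r\n"
      "uid=0(root) gid=0(root) groups=0(root)\n"
  )


  def parse_response(raw):
      """Split a simple HTTP/1.1 response into status, headers, and body."""
      head, _, body = raw.partition("\r\n\r\n")
      status_line, *header_lines = head.split("\r\n")
      version, code, reason = status_line.split(" ", 2)
      headers = {}
      for line in header_lines:
          name, _, value = line.partition(":")
          headers[name.strip().lower()] = value.strip()
      return {"version": version, "status_code": int(code),
              "status_msg": reason, "headers": headers, "body": body}


  parsed = parse_response(RESPONSE)

  # Present on the wire, absent from the http.log entry shown earlier.
  print("Server:", parsed["headers"]["server"])
  # Matches the response_body_len field (39) in the log entry.
  print("Body length:", len(parsed["body"].encode()))
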
The data from the responder also shows the application payload it sent::

  uid=0(root) gid=0(root) groups=0(root)

This is the output of the Unix ``id`` command when run as root. It is hosted at
the server ``testmyids.com`` to trigger a “GPL ATTACK_RESPONSE id check
returned root” alert found in open source intrusion detection engine rule sets,
such as those supported by Suricata. Analysts sometimes use this site to test
if their intrusion detection engines are functioning properly. A more modern
option with many different tests can be found at
https://github.com/0xtf/testmynids.org.

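
The idea behind such an alert can be sketched as a simple content match on the
response payload. The Python below is a deliberately simplified illustration,
not the actual rule from any rule set: it only looks for the ``uid=0(root)``
string, and the function name ``looks_like_root_id_output`` is made up.

.. code-block:: python

  import re

  # Simplified stand-in for an "id check returned root" content match.
  ROOT_ID_PATTERN = re.compile(rb"uid=0\(root\)")


  def looks_like_root_id_output(payload: bytes) -> bool:
      """Return True if the payload contains root-level ``id`` output."""
      return ROOT_ID_PATTERN.search(payload) is not None


  # Payload returned by testmyids.com in this section.
  payload = b"uid=0(root) gid=0(root) groups=0(root)\n"
  print(looks_like_root_id_output(payload))

A real detection rule typically adds constraints this sketch omits, such as
traffic direction and connection state.
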
Conclusion
==========

Zeek's :file:`http.log` is another important log that offers a great deal of
information on how systems are interacting with the Internet and each other. In
the example in this section we looked at a very simple interaction between an
originator and a responder. We could see the benefit of summarizing an HTTP
request and response in a single log entry. In the next section we will look
at other core Internet protocols.