
Tenzir v4.7

· 4 min read
Dominik Lohmann

Tenzir v4.7 brings a new context type, two parsers, four new operators, improvements to existing parsers, and a sizable under-the-hood performance improvement.

Enrich with the GeoIP context

Use the geoip context to enrich events with information from a MaxMind GeoIP® database.

To get started, download the freely available GeoLite2 MaxMind database, or use any other MaxMind database. We'll use the country database file GeoLite2-Country.mmdb.

Create a 'geoip' context named 'country'
context create country geoip --db-path /path/to/GeoLite2-Country.mmdb
Enrich Suricata events with the 'country' context
export
| where #schema == /suricata.*/
/* Apply the context to both source and destination IP address fields */
| enrich src_country=country --field src_ip
| enrich dest_country=country --field dest_ip
/* Use just the country's ISO code, and discard the rest of the information */
| replace src_country=src_country.context.country.iso_code,
dest_country=dest_country.context.country.iso_code
Possible output
{
"timestamp": "2021-11-17T14:02:38.165570",
"flow_id": 1837021175481117,
"pcap_cnt": 357,
"vlan": null,
"in_iface": null,
"src_ip": "45.137.23.27",
"src_port": 47958,
"dest_ip": "198.71.247.91",
"dest_port": 53,
"proto": "UDP",
"event_type": "dns",
"community_id": "1:0nZC/6S/pr+IceCZ04RjDZbX+KI=",
"dns": {
// ...
},
"src_country": "NL",
"dest_country": "US"
}

The geoip context is a powerful building block for in-band enrichments. Besides country codes and country names, you can add region codes, region names, cities, zip codes, and geographic coordinates. With the flexibility of the contextualization framework, you can now get this information in real time.
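To make the data flow of the pipeline above concrete, here is a minimal Python sketch of the same two steps: enrich, then keep only the ISO code. It is not how Tenzir implements contexts. The lookup table FAKE_GEOIP_DB and the helpers enrich and iso_code are hypothetical stand-ins, and the nested context.country.iso_code layout is simply taken from the replace step in the example.

```python
# Hypothetical stand-in for a GeoIP database. A real lookup would query the
# MaxMind .mmdb file instead of this hand-written table.
FAKE_GEOIP_DB = {
    "45.137.23.27": {"country": {"iso_code": "NL"}},
    "198.71.247.91": {"country": {"iso_code": "US"}},
}


def enrich(event, target, field):
    """Return a copy of event with the lookup result for event[field] under target."""
    enriched = dict(event)
    enriched[target] = {"context": FAKE_GEOIP_DB.get(event.get(field))}
    return enriched


def iso_code(enrichment):
    """Extract context.country.iso_code, or None if the lookup found nothing."""
    context = enrichment.get("context") or {}
    return context.get("country", {}).get("iso_code")


event = {"src_ip": "45.137.23.27", "dest_ip": "198.71.247.91", "proto": "UDP"}

# Apply the lookup to both the source and the destination address.
event = enrich(event, "src_country", "src_ip")
event = enrich(event, "dest_country", "dest_ip")

# Mirror the replace step: keep only the ISO code.
event["src_country"] = iso_code(event["src_country"])
event["dest_country"] = iso_code(event["dest_country"])

print(event)
```

Addresses that are missing from the table end up with None in this sketch. How Tenzir represents a failed lookup is not covered here.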

Follow our Blog Post Series

Read more about contexts in our blog post series:

  1. Enrichment Complexity in the Wild
  2. Contextualization Made Simple

Grok and KV Parsers

The kv and grok parsers combine well with the parse operator introduced with Tenzir v4.6. The former reads key-value pairs by splitting strings based on regular expressions, and the latter uses a parser modeled after the Logstash grok plugin in Elasticsearch.

Parse a fictional HTTP request log with grok:

Example input
{
"message": "55.3.244.1 GET /index.html 15824 0.043"
}
Parse with grok
parse message grok "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
Example output
{
"message": {
"client": "55.3.244.1",
"method": "GET",
"request": "/index.html",
"bytes": 15824,
"duration": 0.043
}
}
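
If you haven't used grok before, the idea is that every %{NAME:field} placeholder expands to a regular expression with a named capture group. The Python sketch below illustrates that idea only. The PATTERNS table is a heavily simplified, hypothetical subset of a grok pattern library, and the number conversion is an assumption made to reproduce the example output above.

```python
import re

# Simplified, hypothetical pattern table. Real grok pattern libraries are
# far more thorough than these regular expressions.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATHPARAM": r"\S+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
}


def grok_to_regex(pattern):
    """Expand each %{NAME:field} placeholder into a named capture group."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return f"(?P<{field}>{PATTERNS[name]})"
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, pattern))


def convert(value):
    """Turn integer and decimal strings into numbers; leave the rest as-is."""
    if re.fullmatch(r"[+-]?\d+", value):
        return int(value)
    if re.fullmatch(r"[+-]?\d+\.\d+", value):
        return float(value)
    return value


def grok(pattern, text):
    """Match text against a grok-style pattern and return the captured fields."""
    match = grok_to_regex(pattern).fullmatch(text)
    if match is None:
        return None
    return {key: convert(value) for key, value in match.groupdict().items()}


pattern = "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
print(grok(pattern, "55.3.244.1 GET /index.html 15824 0.043"))
```

The sketch treats everything outside the placeholders as literal regex text, which is good enough for the spaces in this example but not for patterns containing regex metacharacters.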

Extract space-separated key=value pairs with kv:

Example input
{
"message": "foo=1 bar=2 baz=3 qux=4"
}
Parse with kv
parse message kv "\s+" "="
Example output
{
"message": {
"foo": 1,
"bar": 2,
"baz": 3,
"qux": 4
}
}
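
The two arguments to kv are regular expressions: the first separates the pairs from each other, the second separates each key from its value. A rough Python equivalent, assuming that values which look like numbers are converted (as in the example output above), could look like this. The names parse_kv and to_number are made up for illustration.

```python
import re


def to_number(value):
    """Turn integer and decimal strings into numbers; leave the rest as-is."""
    if re.fullmatch(r"[+-]?\d+", value):
        return int(value)
    if re.fullmatch(r"[+-]?\d+\.\d+", value):
        return float(value)
    return value


def parse_kv(text, field_split, value_split):
    """Split text into pairs on field_split, then each pair on value_split."""
    result = {}
    for pair in re.split(field_split, text.strip()):
        parts = re.split(value_split, pair, maxsplit=1)
        if len(parts) != 2:
            continue  # Skip fragments that contain no key-value separator.
        key, value = parts
        result[key] = to_number(value)
    return result


print(parse_kv("foo=1 bar=2 baz=3 qux=4", r"\s+", "="))
```

Splitting each pair only once (maxsplit=1) keeps values that themselves contain the separator intact, for example url=a=b. Whether the kv parser makes the same choice is an assumption of this sketch.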

Slice and Dice Events

The slice operator is a more powerful version of the head and tail operators. It allows for selecting a contiguous range of events given a half-closed interval.

Get the second 100 events
slice --begin 100 --end 200

Negative values for the interval count from the end rather than from the start:

Get the last 5 events
slice --begin -5

Positive and negative values can also be combined:

Get everything but the first 10 and the last 10 events
slice --begin 10 --end -10
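
If you know Python, the three examples above behave like list slicing with a half-open range, where begin is included and end is excluded. The sketch below only illustrates the index semantics on an in-memory list. The slice operator works on a stream of events, and slice_events is a hypothetical helper.

```python
def slice_events(events, begin=None, end=None):
    """Select events[begin:end]; negative values count from the end."""
    return events[begin:end]


# Stand-in for 1000 events, numbered 0 through 999.
events = list(range(1000))

second_hundred = slice_events(events, begin=100, end=200)  # events 100..199
last_five = slice_events(events, begin=-5)                 # events 995..999
middle = slice_events(events, begin=10, end=-10)           # events 10..989

print(len(second_hundred), last_five, len(middle))
```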

Lightweight Endpoint Snapshot

Use the processes, sockets, and nics sources to get a snapshot of running processes, sockets, and available network interfaces, respectively.

Top three running processes by name
processes
| top name
| head 3
Possible output
{
"name": "MTLCompilerService",
"count": 24
}
{
"name": "zsh",
"count": 16
}
{
"name": "VTDecoderXPCServ",
"count": 9
}
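
Conceptually, top name counts how often each value of the name field occurs and sorts the result by descending count, and head 3 keeps the first three rows. A small Python sketch of that idea, with made-up process names and a hypothetical top helper:

```python
from collections import Counter


def top(events, field, limit=None):
    """Count the values of field and return them ordered by descending count."""
    counts = Counter(event[field] for event in events)
    return [{field: value, "count": count} for value, count in counts.most_common(limit)]


# Made-up sample data standing in for the output of the processes source.
processes = [
    {"name": "zsh"},
    {"name": "MTLCompilerService"},
    {"name": "zsh"},
    {"name": "bash"},
    {"name": "MTLCompilerService"},
    {"name": "zsh"},
]

for row in top(processes, "name", limit=3):
    print(row)
```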

Performance Improvements

We've fixed a long-standing bug in Tenzir's pipeline execution engine, which improves performance for some operators:

  1. Operators and loaders that interface with blocking third-party APIs sometimes delayed partial results until the next partial result arrived through the blocking API. This bug affected the tcp, zmq, kafka, and nic loaders and the shell, fluent-bit, velociraptor, and python operators. These loaders and operators are now generally more responsive.
  2. The time-to-first-result for pipelines with many operators is now shorter, and the first result no longer takes an additional 20ms per operator in the pipeline to arrive.

Want More?

We provide a full list of changes in our changelog.

Head over to app.tenzir.com to play with the new features, and join our Discord server: it's the perfect place to ask questions, chat with Tenzir users and developers, and discuss your feature ideas!