VAST's expression language makes it easy to describe a relevant subset of interest over structured data. The "easy" part is that VAST expressions operate on multiple different schemas at once, as opposed to traditional expressions that apply to a single, fixed schema. The language captures this heterogeneity with extractors.
An expression is a function over an event that evaluates to true or false, indicating whether the event qualifies as a result. Expression operands are either sub-expressions or predicates, and can be composed via conjunctions (&&), disjunctions (||), and negations (!).
The following diagram shows an example expression in tree form:
When written out, it looks like this:
(dport <= 1024 || :addr in 10.0.0.0/8) && ! (#type == /zeek.*/)
In this example, the predicate operands dport, :addr, and #type are extractors that resolve to a set of matching fields at runtime.
Let's take a look at the expression components in more depth.
There exist three logical connectives that connect sub-expressions:
&&: the logical AND between two expressions
||: the logical OR between two expressions
!: the logical NOT of one expression
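To make the tree structure concrete, here is a minimal Python sketch of an expression as a function over an event. It is not VAST's implementation: the class names, the dictionary-based event, and the field names src and type are assumptions chosen for illustration, and each extractor is simplified to a single hard-coded field.

```python
import ipaddress
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Event = Dict[str, Any]          # hypothetical event: field name -> value
Node = Callable[[Event], bool]  # every tree node maps an event to true/false


@dataclass(frozen=True)
class Predicate:
    """Leaf node wrapping a single test over an event."""
    test: Node

    def __call__(self, event: Event) -> bool:
        return self.test(event)


@dataclass(frozen=True)
class Conjunction:
    """&&: true if every operand is true."""
    operands: Tuple[Node, ...]

    def __call__(self, event: Event) -> bool:
        return all(op(event) for op in self.operands)


@dataclass(frozen=True)
class Disjunction:
    """||: true if at least one operand is true."""
    operands: Tuple[Node, ...]

    def __call__(self, event: Event) -> bool:
        return any(op(event) for op in self.operands)


@dataclass(frozen=True)
class Negation:
    """!: inverts the result of its single operand."""
    operand: Node

    def __call__(self, event: Event) -> bool:
        return not self.operand(event)


# Simplified stand-ins for the three predicates of the example expression.
low_port = Predicate(lambda e: e["dport"] <= 1024)
internal = Predicate(
    lambda e: ipaddress.ip_address(e["src"]) in ipaddress.ip_network("10.0.0.0/8")
)
is_zeek = Predicate(lambda e: re.fullmatch(r"zeek.*", e["type"]) is not None)

# (dport <= 1024 || :addr in 10.0.0.0/8) && ! (#type == /zeek.*/)
expr = Conjunction((Disjunction((low_port, internal)), Negation(is_zeek)))
```

Calling expr(event) walks the tree from the root and returns whether the event qualifies as a result.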
A predicate has the form LHS op RHS, where LHS denotes the left-hand side operand and RHS the right-hand side operand. The relational operator op is typed, i.e., only a subset of the cross product of operand types is valid.
The following operators separate two operands:
<: less than
<=: less than or equal to
>: greater than
>=: greater than or equal to
==: equal to
!=: not equal to
in: in (left to right)
!in: not in (left to right)
ni: in (right to left)
!ni: not in (right to left)
~: match
!~: not match
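The direction of the membership operators is easy to mix up, so here is a small Python sketch of how "left to right" and "right to left" could be read. The MEMBERSHIP table and the evaluate helper are hypothetical names, and Python's substring containment stands in for VAST's typed membership test.

```python
from typing import Any, Callable, Dict

# Hypothetical operator table: token -> function over (lhs, rhs).
MEMBERSHIP: Dict[str, Callable[[Any, Any], bool]] = {
    "in": lambda lhs, rhs: lhs in rhs,       # LHS is contained in RHS
    "!in": lambda lhs, rhs: lhs not in rhs,  # LHS is not contained in RHS
    "ni": lambda lhs, rhs: rhs in lhs,       # RHS is contained in LHS
    "!ni": lambda lhs, rhs: rhs not in lhs,  # RHS is not contained in LHS
}


def evaluate(lhs: Any, op: str, rhs: Any) -> bool:
    """Apply a membership operator to two already-resolved operand values."""
    return MEMBERSHIP[op](lhs, rhs)
```

Under this reading, evaluate("evil", "in", "evil.example.com") and evaluate("evil.example.com", "ni", "evil") express the same test with the operands swapped.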
The table below illustrates a partial function over the cross product of available types. Each letter in a cell denotes a set of operators:
- E: equality operators
- R: range operators
- M: membership operators
An extractor retrieves a certain aspect of an event. When looking up an expression, VAST binds the extractor to a specific record field, i.e., maps it to the corresponding numeric column offset in the schema. Binding an extractor implicitly creates a disjunction over all matching fields. We find that this existential quantification is the natural user experience when "extracting" data declaratively.
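The implicit disjunction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VAST's binding logic: evaluate_bound is a hypothetical helper, events are plain dictionaries, and the list of matching fields is taken as given.

```python
from typing import Any, Callable, Dict, Iterable


def evaluate_bound(
    event: Dict[str, Any],
    matching_fields: Iterable[str],
    test: Callable[[Any], bool],
) -> bool:
    """True if the test holds for at least one of the fields the extractor matched."""
    return any(
        name in event and test(event[name])
        for name in matching_fields
    )


# Hypothetical event with two address fields.
event = {"id.orig_h": "10.0.0.1", "id.resp_h": "8.8.8.8"}
```

With this event, a test for the value 8.8.8.8 over both address fields succeeds because one field matches; it does not require every matching field to satisfy the predicate.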
VAST has the following extractor types:
Field: extracts all fields whose name matches a given record field name.
Type: extracts all event types that have a field of a given type.
Meta: matches on the type name or field name of a layout instead of the values contained in actual events.
Field extractors have the form x or x.y.z, where x, y, and z match on record field names. The dot (.) accesses fields in nested records. Using a type name as the leftmost element before a . is also possible.
A field extractor has suffix semantics. It is possible to just write z instead of x.y.z. In fact, writing z is equivalent to *.z and creates a disjunction of all fields ending in z.
ts > 1 day ago: events with a record field ts from the last 24 hours
zeek.conn.id.orig_h in 192.168.0.0/24: connections with source IP in 192.168.0.0/24
orig_bytes >= 10Ki: events with a field orig_bytes greater than or equal to 10 * 2^10.
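Suffix semantics can be illustrated with a short Python sketch. The helper resolve_field_extractor and the list of fully qualified field names are hypothetical, and matching only on dot boundaries is an assumption of this sketch rather than a documented guarantee.

```python
from typing import List


def resolve_field_extractor(extractor: str, fields: List[str]) -> List[str]:
    """Return all fully qualified field names that end in `extractor`.

    A match must start at a dot boundary, so `ts` matches `zeek.conn.ts`
    but not `suricata.flow.starts`.
    """
    return [
        name
        for name in fields
        if name == extractor or name.endswith("." + extractor)
    ]


# Hypothetical fully qualified field names: <type name>.<record path>
FIELDS = [
    "zeek.conn.ts",
    "zeek.conn.id.orig_h",
    "zeek.dns.ts",
    "suricata.flow.starts",
]
```

Here resolve_field_extractor("ts", FIELDS) yields both Zeek timestamp fields, and a predicate over ts becomes a disjunction over them.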
Type extractors have the form :T, where T is the type of a field. Type extractors work for all basic types and user-defined aliases.
A search for type :T includes all aliased types. For example, given the alias port that maps to count, the :count type extractor will also consider instances of type port. However, a :port query does not include count types because an alias is a strict refinement of an existing type.
:timestamp > 1 hour ago: events with a timestamp alias in the last hour
:addr == 188.8.131.52: events with any field of type addr equal to 188.8.131.52
:count > 42M: events where any count value is greater than 42M
"evil" in :string: events where any string field contains the substring evil
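The one-directional alias rule can be sketched as a walk along the alias chain. Everything named here is hypothetical: the ALIASES table, the example schema, and the assumption that timestamp aliases a type called time are illustrative only.

```python
from typing import Dict, List, Optional

# Hypothetical alias table: alias name -> underlying type.
ALIASES: Dict[str, str] = {"port": "count", "timestamp": "time"}


def is_a(field_type: str, queried: str) -> bool:
    """True if `field_type` is `queried` or an alias that refines it."""
    current: Optional[str] = field_type
    while current is not None:
        if current == queried:
            return True
        current = ALIASES.get(current)  # step to the underlying type, if any
    return False


def resolve_type_extractor(queried: str, schema: Dict[str, str]) -> List[str]:
    """Return the names of all fields whose type satisfies `:queried`."""
    return [name for name, typ in schema.items() if is_a(typ, queried)]


# Hypothetical schema: field name -> declared type.
SCHEMA = {"dport": "port", "orig_bytes": "count", "ts": "timestamp", "msg": "string"}
```

In this sketch, :count resolves to both dport and orig_bytes, while :port resolves to dport only, because the walk goes from alias to underlying type and never in the other direction.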
Meta extractors have the form #E, where E is from the fixed set of tokens type, field, and import_time. They work on the event metadata (e.g., their schema) instead of the value domain.
#type: matches on the event name in a schema
#field: matches on the field names of a record
#import_time: matches on the ingestion time when the event arrived at the server
#type == "zeek.conn": events of type zeek.conn
"suricata" in #type: events that have suricata in their type name
#field == "community_id": events that have a field with name community_id
#import_time > 1 hour ago: events that have been imported within the last hour
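Because meta extractors look at schemas rather than values, the #type and #field examples can be sketched as lookups over a schema table alone. The SCHEMAS dictionary and the three helper functions are hypothetical, and exact name comparison for #field is an assumption of this sketch.

```python
from typing import Dict, List

# Hypothetical schemas: event type name -> field names.
SCHEMAS: Dict[str, List[str]] = {
    "zeek.conn": ["ts", "id.orig_h", "community_id"],
    "suricata.flow": ["timestamp", "src_ip", "community_id"],
    "suricata.alert": ["timestamp", "src_ip"],
}


def types_equal_to(name: str) -> List[str]:
    """#type == "name": event types whose name equals `name`."""
    return [t for t in SCHEMAS if t == name]


def types_containing(fragment: str) -> List[str]:
    """"fragment" in #type: event types whose name contains `fragment`."""
    return [t for t in SCHEMAS if fragment in t]


def types_with_field(field: str) -> List[str]:
    """#field == "field": event types that have a field named `field`."""
    return [t for t, fields in SCHEMAS.items() if field in fields]
```

None of these helpers inspects an event value; they select event types, and every event of a selected type would then qualify.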
Predicates with type extractors and equality operators can be written tersely as value predicates. That is, if a predicate has the form :T == X, where X is a value and T the type of X, it suffices to write X. The predicate parser deduces the type of X automatically in this case.
For example, 220.127.116.11 is a valid predicate and expands to :addr == 220.127.116.11. This allows for quick type-based point queries, such as (22.214.171.124 || 80/tcp) && "evil".
Value predicates of type subnet expand more broadly. Given a subnet 10.0.0.0/8, the parser expands this to:
:subnet == 10.0.0.0/8 || :addr in 10.0.0.0/8
This makes it easier to search for IP addresses belonging to a specific subnet.
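The expansion rules can be sketched as a small function that deduces the type of a bare value and rewrites it into an explicit predicate. This is a simplified illustration, not VAST's parser: expand_value_predicate is a hypothetical name, it uses Python's ipaddress module for type deduction, and it only covers addresses, subnets, and double-quoted strings.

```python
import ipaddress


def expand_value_predicate(token: str) -> str:
    """Rewrite a bare value into an explicit type-extractor predicate."""
    # Subnets in CIDR notation expand to a disjunction over :subnet and :addr.
    if "/" in token:
        try:
            ipaddress.ip_network(token, strict=False)
        except ValueError:
            pass  # not a subnet, keep trying other types
        else:
            return f":subnet == {token} || :addr in {token}"

    # Plain IPv4 or IPv6 addresses expand to an equality test on :addr.
    try:
        ipaddress.ip_address(token)
    except ValueError:
        pass  # not an address
    else:
        return f":addr == {token}"

    # Double-quoted tokens are treated as string values.
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return f":string == {token}"

    raise ValueError(f"cannot deduce a type for value: {token}")
```

For instance, expand_value_predicate("10.0.0.0/8") returns the two-armed disjunction described above, while a plain address yields a single :addr equality predicate. Values of other types raise a ValueError in this sketch.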
Every type has a corresponding value syntax in the expression language.
Here is an overview of basic types:
| Type | Description |
|------|-------------|
| none | Denotes an absent or invalid value |
| bool | A boolean value |
| int | A 64-bit signed integer |
| count | A 64-bit unsigned integer |
| real | A 64-bit double (IEEE 754) |
| duration | A time span (nanosecond granularity) |
| time | A time point (nanosecond granularity) |
| string | A sequence of characters |
| pattern | A regular expression |
| addr | An IPv4 or IPv6 address |
| subnet | An IPv4 or IPv6 subnet |
| list<T> | An ordered sequence of values where each element has type T |
| map | An associative array which maps keys to values |
| record | A product type with one or more named fields |