Filter flow logs
Big picture
Filter Calico Cloud flow logs.
Value
Filter Calico Cloud flow logs to suppress logs of low significance, and troubleshoot threats.
Concepts
Container monitoring tools versus flow logs
Container monitoring tools are good for monitoring Kubernetes and orchestrated workloads for CPU usage, network usage, and log aggregation. For example, a container monitoring tool can tell if a pod has turned into a bitcoin miner because it uses more CPU than normal.
Calico Cloud flow logs provide a continuous record of the traffic sent and received by all pods in your Kubernetes cluster. Note that flow logs do not contain the packet data itself; they record only the number of packets and bytes that were sent between specific IP addresses and ports, and when. In the previous monitoring tool example, Calico Cloud flow logs would show the traffic going to and from the bitcoin mining network.
Calico Cloud flow logs tell you when a pod is compromised, specifically:
- Where a pod is sending data to
- If the pod is talking to a known command-and-control server
- Other pods that the compromised pod has been talking to (so you can see if they're compromised too)
Flow log format
A flow log contains these space-delimited fields (unless filtered out).
startTime endTime srcType srcNamespace srcName srcLabels dstType dstNamespace dstName
dstLabels srcIP dstIP proto srcPort dstPort numFlows numFlowsStarted numFlowsCompleted
reporter packetsIn packetsOut bytesIn bytesOut action
Example
1528842551 1528842851 wep dev rails-81531* - wep dev memcached-38456* - - - 6 - 3000 7 3 4 out 154 61 70111 49404 allow
- Fields that are not enabled or are aggregated are noted by `-`.
- Aggregated names (such as "pod prefix") are noted by `*` at the end of the name.
- If the `srcName` or `dstName` fields contain only a `*`, aggregation was performed using other means (such as specific labels), and no unique prefix was present.
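The format above can be illustrated with a small parser. This is a minimal sketch, not part of Calico Cloud: the names `parse_flow_log` and `is_aggregated_name` are hypothetical, and it assumes every field is present and that no field value contains a space.

```python
from typing import Optional

# Field order copied from the format description above.
FIELDS = (
    "startTime endTime srcType srcNamespace srcName srcLabels dstType dstNamespace dstName "
    "dstLabels srcIP dstIP proto srcPort dstPort numFlows numFlowsStarted numFlowsCompleted "
    "reporter packetsIn packetsOut bytesIn bytesOut action"
).split()


def parse_flow_log(line: str) -> dict[str, Optional[str]]:
    """Map one space-delimited flow log line to field names; "-" becomes None."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return {name: (None if value == "-" else value) for name, value in zip(FIELDS, values)}


def is_aggregated_name(name: Optional[str]) -> bool:
    """True when a name ends with "*", the marker for an aggregated name."""
    return name is not None and name.endswith("*")


# The example line from this section.
EXAMPLE = (
    "1528842551 1528842851 wep dev rails-81531* - wep dev memcached-38456* - "
    "- - 6 - 3000 7 3 4 out 154 61 70111 49404 allow"
)

record = parse_flow_log(EXAMPLE)
print(record["srcName"], record["dstPort"], record["srcIP"], record["action"])
# prints: rails-81531* 3000 None allow
```

In the example line, `srcIP`, `dstIP`, and `srcPort` parse to `None` because they are reported as `-`, and both pod names are flagged as aggregated.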
How to
Create flow log filters
Filters are written as a YAML list of Fluent Bit filter entries. The calico-fluent-bit log collector ships the grep, record_modifier, parser, and lua filters. Filters you add under the flow key are applied to flow logs automatically; you do not need to set a match on each entry.
Example: filter out a specific namespace
This example filters out all flow logs whose source or destination namespace is dev. A record is dropped when it matches any of the exclude rules; additional namespaces could be filtered by adjusting the regular expressions, or by adding more exclude rules.
- name: grep
  exclude:
    - source_namespace dev
    - dest_namespace dev
Example: filter out internet traffic to a specific deployment
This example filters out inbound internet traffic to the deployment whose pods are named nginx-internet-*. Note the use of logical_op: and, which drops only the traffic that is both to the deployment and from the internet (source pub).
- name: grep
  logical_op: and
  exclude:
    - dest_name_aggr ^nginx-internet
    - source_name_aggr pub
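The difference between the two examples (drop on any matching rule versus drop only when all rules match) can be modeled outside the collector. The following is a minimal sketch under stated assumptions: `is_excluded` is a hypothetical helper, the sample records are invented for illustration, and Python's `re.search` only approximates the regular expression engine that Fluent Bit uses.

```python
import re


def is_excluded(record: dict, rules: list, logical_op: str = "or") -> bool:
    """Return True when the record would be dropped by the exclude rules.

    rules is a list of (key, regex) pairs. A rule on a missing key never matches.
    """
    matches = [
        key in record and re.search(pattern, str(record[key])) is not None
        for key, pattern in rules
    ]
    return all(matches) if logical_op == "and" else any(matches)


# Hypothetical flow records, reduced to the keys used by the two examples.
flows = [
    {"source_namespace": "dev", "dest_namespace": "prod",
     "source_name_aggr": "rails-*", "dest_name_aggr": "db-*"},
    {"source_namespace": "prod", "dest_namespace": "prod",
     "source_name_aggr": "pub", "dest_name_aggr": "nginx-internet-*"},
    {"source_namespace": "prod", "dest_namespace": "prod",
     "source_name_aggr": "frontend-*", "dest_name_aggr": "nginx-internet-*"},
]

namespace_rules = [("source_namespace", "dev"), ("dest_namespace", "dev")]
internet_rules = [("dest_name_aggr", "^nginx-internet"), ("source_name_aggr", "pub")]

# First example: a record is dropped when any rule matches.
kept_by_namespace = [f for f in flows if not is_excluded(f, namespace_rules)]

# Second example: a record is dropped only when every rule matches.
kept_by_internet = [f for f in flows if not is_excluded(f, internet_rules, "and")]

print(len(kept_by_namespace), len(kept_by_internet))
```

In this model the pattern `dev` is unanchored, so it would also match a namespace such as `dev-tools`. If the collector behaves the same way, anchoring the pattern (for example `^dev$`) limits the rule to the exact namespace.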
Add filters to ConfigMap file
- Create a `filters` directory with a file called `flow` that contains your desired filters. If you are also adding DNS filters, add the `dns` file to the directory.
- Create the `fluent-bit-filters` ConfigMap in the `tigera-operator` namespace with the following command.
  kubectl create configmap fluent-bit-filters -n tigera-operator --from-file=filters
The operator inserts the filters inline into the log collector configuration and rolls the calico-fluent-bit DaemonSet automatically.
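The directory layout expected by the `kubectl create configmap ... --from-file=filters` command can be sketched as follows. This is an illustration only: `write_filters_dir` is a hypothetical helper, the temporary working directory is an assumption, and the YAML indentation of the filter body is assumed rather than taken from the product reference.

```python
from pathlib import Path
import tempfile

# Filter list reused from the namespace example; the indentation is an assumption.
FLOW_FILTERS = """\
- name: grep
  exclude:
    - source_namespace dev
    - dest_namespace dev
"""


def write_filters_dir(parent: Path, flow_filters: str) -> Path:
    """Create <parent>/filters with a single file named "flow"."""
    filters_dir = parent / "filters"
    filters_dir.mkdir(parents=True, exist_ok=True)
    (filters_dir / "flow").write_text(flow_filters)
    return filters_dir


workdir = Path(tempfile.mkdtemp())
filters_dir = write_filters_dir(workdir, FLOW_FILTERS)
print(sorted(p.name for p in filters_dir.iterdir()))  # prints: ['flow']
```

When `--from-file` points at a directory, kubectl creates one ConfigMap key per file, named after the file, so the `flow` file becomes the `flow` key that the operator reads.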
:::note Upgrading from a release that used Fluentd
Earlier releases collected logs with Fluentd and read filters in Fluentd <filter> syntax from a ConfigMap named fluentd-filters. That ConfigMap is no longer read, and Fluentd filter syntax cannot be translated automatically. Recreate your filters as Fluent Bit YAML filter lists under the new fluent-bit-filters name. If a filter key does not parse as Fluent Bit YAML, the operator skips that filter, reports a warning on the tigera status output naming the offending key, and continues to ship unfiltered logs.
:::