Ryan Barnes edited this page Sep 16, 2024 · 4 revisions

kprobe is Kentik's high-performance host and sensor network probe.

Usage

kprobe -i <interface> --email <kentik API user email> --token <kentik API access token> --device-name <name>

kprobe is configured to send traffic to Kentik's production US cluster by default. The --region flag points kprobe to the specified region, and the --api-url, --flow-url, and --metrics-url flags are available to override specific endpoints.

When running on a private network that requires an HTTP proxy for outgoing traffic to Kentik, the --proxy-url flag may be used to supply the URL of the proxy.

Each kprobe instance sends data for a specific device, identified by one of the following flags:

--device-id <numeric device identifier>
--device-ip <device IP address>
--device-name <device name>

If a matching Kentik device cannot be found, kprobe will exit with an error. Alternatively, a new Kentik device can be created automatically by supplying --device-name <name> along with the --device-plan <ID> and --device-site <ID> flags, which take a Kentik plan ID and site ID respectively.
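The device-selection rules above can be sketched as a small Python helper that assembles a kprobe command line. This is a minimal sketch, not part of kprobe: the function name build_kprobe_args and its parameters are hypothetical, only flags named on this page are used, and the check that exactly one identifier is given is an assumption about how the flags are meant to be combined.

```python
def build_kprobe_args(interface, email, token, *, device_id=None,
                      device_ip=None, device_name=None,
                      device_plan=None, device_site=None):
    """Assemble a kprobe argument list (hypothetical helper, not part of kprobe)."""
    selectors = [
        ("--device-id", device_id),
        ("--device-ip", device_ip),
        ("--device-name", device_name),
    ]
    chosen = [(flag, value) for flag, value in selectors if value is not None]
    # Assumption: exactly one device identifier is supplied per instance.
    if len(chosen) != 1:
        raise ValueError("supply exactly one of device_id, device_ip, device_name")

    # Automatic device creation needs the name plus both plan and site IDs.
    if (device_plan is None) != (device_site is None):
        raise ValueError("device_plan and device_site must be given together")
    if device_plan is not None and device_name is None:
        raise ValueError("automatic creation requires device_name")

    args = ["kprobe", "-i", interface, "--email", email, "--token", token]
    flag, value = chosen[0]
    args += [flag, str(value)]
    if device_plan is not None:
        args += ["--device-plan", str(device_plan),
                 "--device-site", str(device_site)]
    return args
```

The returned list can be passed to subprocess.run, or joined with shlex.join to print the equivalent shell command.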

Troubleshooting

The first step in troubleshooting an issue with kprobe is to run it with the -v flag, which enables verbose output. A single -v will print the versions of kprobe and libkflow, and the full configuration. Repeating the flag, as in -vv, will enable verbose debug output.

Modes

kprobe supports several mutually exclusive modes of operation.

Flow mode

Flow mode is kprobe's standard mode of operation. It captures all network traffic on the selected interface and generates flow records in Kentik's kflow format.

DNS mode

The --dns flag enables DNS mode, which ignores all traffic except DNS over UDP (port 53). In this mode, DNS responses with A or AAAA records are sent in batches to a dedicated DNS endpoint, and normal flows are not generated.

The DNS endpoint can be overridden with the --dns-url <URL> flag.

Example usage:

kprobe -i <interface> --email <kentik API user email> --token <kentik API access token> --device-name <name> --dns

Then, to enable it on the server side, run:

kt ops admin ott on device_id target_host

where target_host is an FQDN such as c199.iad1.kentik.com.

RADIUS mode

The --radius flag enables RADIUS mode, which ignores all traffic except RADIUS UDP traffic on ports 1812 and 1813. In this mode, RADIUS accounting requests with a user name and IP address are used to populate custom dimensions. Normal flow records are not generated.

Protocol decoding

In flow mode, kprobe will decode the following application protocols and output detailed information, including request latency and protocol-specific fields:

  • HTTP (TCP, port 80)
  • DNS (UDP, port 53)
  • DHCP (UDP, ports 67 and 68)
  • TLS (TCP, port 443)

Protocol decoding incurs some CPU cost and may be disabled with --no-decode. Decoding is only performed for traffic on standard ports. For HTTP, however, the --http-port <port> option may be specified one or more times to decode HTTP on those additional ports.
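As a rough model of the port-based rule described above, the following Python sketch maps a transport protocol and port to the decoder that would apply. It is an illustration only and not kprobe's implementation: the names STANDARD_DECODERS and decoder_for are hypothetical, and the extra_http_ports and decode_enabled parameters stand in for --http-port and --no-decode.

```python
# Standard ports listed on this page, keyed by (transport, port).
STANDARD_DECODERS = {
    ("tcp", 80): "HTTP",
    ("udp", 53): "DNS",
    ("udp", 67): "DHCP",
    ("udp", 68): "DHCP",
    ("tcp", 443): "TLS",
}

def decoder_for(transport, port, extra_http_ports=(), decode_enabled=True):
    """Return the decoder name for a packet, or None if it is not decoded.

    Hypothetical model: extra_http_ports mirrors repeated --http-port flags,
    decode_enabled=False mirrors --no-decode.
    """
    if not decode_enabled:
        return None
    if transport == "tcp" and port in extra_http_ports:
        return "HTTP"
    return STANDARD_DECODERS.get((transport, port))
```

Under this model, traffic on TCP port 8080 is not decoded unless 8080 is listed in extra_http_ports, which matches the behaviour the paragraph describes for HTTP on non-standard ports.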

Sampling

When capturing on an interface whose traffic exceeds the device FPS (flows per second) limit or kprobe's available CPU allocation, the --sample <N> flag may be used to record only 1 in N flows. Flows are selected randomly, but this is flow sampling, not packet sampling. On devices with asymmetric flows or only a small number of flows, data from the sampled flows multiplied by the sample rate may not be representative of actual traffic.
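The caveat about scaled-up sampled data can be illustrated with a short Python simulation. This is a sketch under stated assumptions, not kprobe's sampling code: it assumes each flow is kept independently with probability 1/N, and the function names are hypothetical.

```python
import random

def sample_flows(flow_bytes, n, rng):
    """Keep each flow with probability 1/n (flow sampling, not packet sampling)."""
    return [b for b in flow_bytes if rng.random() < 1.0 / n]

def estimate_total(sampled_bytes, n):
    """Scale the sampled byte count by the sample rate to estimate the total."""
    return sum(sampled_bytes) * n

def relative_error(flow_bytes, n, seed=0):
    """Relative error of the scaled estimate against the true byte total."""
    rng = random.Random(seed)
    actual = sum(flow_bytes)
    estimate = estimate_total(sample_flows(flow_bytes, n, rng), n)
    return abs(estimate - actual) / actual

# Many similar flows versus a handful of flows dominated by one large flow.
many_similar = [1_000] * 100_000
few_skewed = [1_000] * 9 + [10_000_000]
```

Calling relative_error(many_similar, 10) and relative_error(few_skewed, 10) with several seeds shows the difference: with many similar flows the estimate tends to stay close to the true total, while with a few skewed flows it swings widely depending on whether the large flow happens to be sampled.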

Status server

When the --status-host <host> and --status-port <port> flags are supplied, kprobe will start a simple HTTP server on the specified host and port. This server responds to GET requests for /v1/status with some basic statistics in JSON format:

{
  "flows-in": {
    "count": 0,
    "1m.rate": 0,
    "5m.rate": 0
  },
  "flows-out": {
    "count": 0,
    "1m.rate": 0,
    "5m.rate": 0
  }
}
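A monitoring script might poll this endpoint as follows. This is a minimal sketch using only the Python standard library: the function names are hypothetical, the host and port are whatever was passed to --status-host and --status-port, and the field names are taken from the sample response above.

```python
import json
import urllib.request

def fetch_status(host, port, timeout=5.0):
    """GET /v1/status from a running kprobe status server and decode the JSON."""
    url = f"http://{host}:{port}/v1/status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def summarize(status):
    """Reduce the status document to (count, 1m rate, 5m rate) per direction."""
    summary = {}
    for name in ("flows-in", "flows-out"):
        counters = status[name]
        summary[name] = (counters["count"],
                         counters["1m.rate"],
                         counters["5m.rate"])
    return summary
```

For example, summarize(fetch_status("127.0.0.1", 8080)) would return the counters for an instance started with --status-host 127.0.0.1 --status-port 8080; the address and port here are placeholders.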

Advanced flags

--fanout-group <group>

This is the important one. When capturing large volumes of traffic, a single kprobe instance may not be able to handle the full traffic volume. In this scenario, multiple kprobe instances may be started with the same --fanout-group <group> flag, and captured traffic will be balanced across all instances in the same group.

--fanout-mode <hash|cpu>

Changes the algorithm used to select which packets go to which process. The default is hash, which selects packets based on their 5-tuple hash. cpu is a good choice if the NIC is steering all packets for a given flow to the same CPU.
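Starting several instances in one fanout group can be sketched in Python as shown here. This is an illustration, not a supported launcher: the helper names are hypothetical, the flag is assumed to be spelled --fanout-group as in the heading above, and base_args is assumed to already contain the interface, credentials, and device flags.

```python
import subprocess

def fanout_commands(base_args, group, instances, mode=None):
    """Build one identical kprobe command per instance in a fanout group."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    cmd = list(base_args) + ["--fanout-group", str(group)]
    if mode is not None:
        # Only the two modes named on this page are accepted.
        if mode not in ("hash", "cpu"):
            raise ValueError("mode must be 'hash' or 'cpu'")
        cmd += ["--fanout-mode", mode]
    return [list(cmd) for _ in range(instances)]

def launch(commands):
    """Start every command as a child process and return the Popen handles."""
    return [subprocess.Popen(cmd) for cmd in commands]
```

Every instance receives the same group value, which is what lets captured traffic be balanced across them. Calling launch requires kprobe on the PATH and sufficient privileges to capture on the interface.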

--filter <filter>

In cases where only a subset of traffic needs to be captured, the --filter flag may be supplied with a standard PCAP filter specifying which traffic to capture.

For example, --filter "tcp port 80" will capture only TCP traffic on port 80.

--promisc

Enable promiscuous mode on the capture interface.

--snaplen <N>

By default kprobe will capture full packets. The --snaplen <N> flag may be used to capture only the first N bytes, which may improve performance but will disable protocol decoding when the packet size exceeds the limit.

Embarrassing flags

These flags are embarrassing and should be redone or removed entirely:

--device-if <interface> attempts to identify the device by one of the IPs assigned to the specified interface. Using this will likely result in unexpected device selection and should be avoided.

--translate <spec>... each spec is a comma-separated pair of IP addresses, A,B. When A is seen as a src or dst addr in a flow, it will be replaced by B. This was a quick hack for one customer and should be rethought if not removed entirely.
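The replacement rule for --translate can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not kprobe's code: the function names are hypothetical, and the assumption that a spec is written exactly as A,B with no other syntax comes from the description above.

```python
import ipaddress

def parse_translate_spec(spec):
    """Split an 'A,B' spec into a pair of parsed IP addresses (assumed syntax)."""
    a, b = spec.split(",")
    return ipaddress.ip_address(a.strip()), ipaddress.ip_address(b.strip())

def translate(addr, table):
    """Return B if addr matches some A in the table, else addr unchanged."""
    ip = ipaddress.ip_address(addr)
    return str(table.get(ip, ip))

# Build the lookup table from one or more specs, then apply it to both
# the source and destination address of a flow.
table = dict(parse_translate_spec(s) for s in ["10.0.0.1,192.0.2.1"])
```

With this table, translate("10.0.0.1", table) yields the replacement address, and any address without a matching spec passes through untouched.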