Implement packet capture on VM- or container interfaces (netlab capture) (#1323)

Fixes #1086
ipspace authored Sep 24, 2024
1 parent 85421e3 commit 4539311
Showing 14 changed files with 283 additions and 8 deletions.
1 change: 1 addition & 0 deletions docs/labs/clab.md
@@ -1,3 +1,4 @@
(lab-clab)=
# Using Containerlab with *netlab*

[Containerlab](https://containerlab.srlinux.dev/) is a Linux-based container orchestration system that creates virtual network topologies using containers as network devices. To use it:
9 changes: 9 additions & 0 deletions docs/labs/libvirt.md
@@ -129,6 +129,15 @@ The new Vagrant box will be copied into the *libvirt* storage pool the next time
* P2P UDP tunnels are used for links with two nodes, and link **type** is set to **p2p** (the default behavior for links with two nodes). P2P tunnels are transparent; you can run any layer-2 control-plane protocol (including LACP) over them.
* *libvirt* networks are used for all other links. They are automatically created and deleted by **vagrant up** and **vagrant down** commands executed by **netlab up** and **netlab down**. **netlab up** sets the `group_fwd_mask` for all Vagrant-created Linux bridges to 0x4000 to [enable LLDP passthrough](https://blog.ipspace.net/2020/12/linux-bridge-lldp.html).

(libvirt-capture)=
### Packet Capture

The *libvirt* point-to-point UDP tunnels are not implemented as Linux interfaces, making it impossible to start packet capture on the VM interfaces attached to point-to-point tunnels. The VMs must be attached to Linux bridges for the **[netlab capture](netlab-capture)** command to work.

Add **type: lan** to a point-to-point link between two virtual machines to change its implementation into a Linux bridge. You can also set the **defaults.providers.libvirt.p2p_bridge** parameter to *True* if you don't want to use UDP tunnels for point-to-point links (see [](defaults-topology) and [](defaults-user-file) for more information on changing system defaults).

Finally, you could start the lab with the `netlab up -p libvirt:p2p_bridge` command to change the system default for a single lab instance.
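
The `provider:option` syntax used by `-p libvirt:p2p_bridge` can be illustrated with a minimal sketch. It mirrors the splitting logic this commit adds to `netsim/utils/read.py`, but `parse_provider_option` is a hypothetical helper (not a netlab function), and it returns a plain dictionary instead of updating the topology defaults.

```python
import typing

def parse_provider_option(value: str) -> typing.Tuple[str, typing.Dict[str, bool]]:
    """Split 'provider:opt1,opt2' into the provider name and a dict of enabled options.

    Hypothetical helper for illustration only; netlab applies the same idea
    directly to topo.provider and topo.defaults.providers.
    """
    name, *option_groups = value.split(":")
    options: typing.Dict[str, bool] = {}
    for group in option_groups:
        for element in group.split(","):
            options[element] = True          # every listed option is set to True
    return name, options

provider, options = parse_provider_option("libvirt:p2p_bridge")
print(provider, options)
```

A value without a colon (for example `clab`) yields the provider name and an empty option dictionary, so the plain `-p clab` form keeps working.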

(libvirt-network-external)=
### Connecting to the Outside World

92 changes: 92 additions & 0 deletions docs/netlab/capture.md
@@ -0,0 +1,92 @@
(netlab-capture)=
# Packet Capture

The **netlab capture** command can be used to capture packets on [*libvirt* virtual machines](libvirt-capture) or [*containerlab*-created Docker containers](lab-clab). The default packet capturing program is `tcpdump`; you can change that with the [default settings](defaults).

```{warning}
You cannot capture traffic on point-to-point links between *libvirt* virtual machines; you have to change them into Linux bridges ([more details](libvirt-capture)).
```

## Usage

The **netlab capture** command takes two parameters: the node you want to perform packet capture on and the interface name within that node.

```text
$ netlab capture -h
usage: netlab capture [-h] [--snapshot [SNAPSHOT]] node [intf]

Start a packet capture on the specified node/interface

positional arguments:
  node                  Node on which you want to capture traffic
  intf                  Interface on which you want to capture traffic

options:
  -h, --help            show this help message and exit
  --snapshot [SNAPSHOT]
                        Transformed topology snapshot file

All other arguments are passed directly to the packet-capturing utility
```
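
The "all other arguments are passed directly" behavior relies on `argparse.ArgumentParser.parse_known_args`, which returns the recognized arguments together with a list of everything it did not recognize. The following is a minimal standalone sketch of that split; `split_capture_args` is a hypothetical name, and the parser is a trimmed-down version of the one in `netsim/cli/capture.py`.

```python
import argparse
import typing

def split_capture_args(argv: typing.List[str]) -> typing.Tuple[argparse.Namespace, typing.List[str]]:
    """Parse node/interface and return the leftover arguments separately."""
    parser = argparse.ArgumentParser(prog="netlab capture")
    parser.add_argument("--snapshot", nargs="?",
                        default="netlab.snapshot.yml", const="netlab.snapshot.yml")
    parser.add_argument("node")
    parser.add_argument("intf", nargs="?")
    # parse_known_args does not fail on unknown arguments; it returns them as a list
    return parser.parse_known_args(argv)

args, rest = split_capture_args(["r1", "swp1", "proto", "ospf", "-l", "-v"])
print(args.node, args.intf, rest)
```

In this sketch, `rest` holds the filter expression and flags that would be appended to the packet-capturing command.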

## Examples

Let's assume we're using this simple topology:

```
defaults.device: cumulus
provider: clab
module: [ ospf ]
nodes: [ r1, r2 ]
links: [ r1-r2 ]
```

After starting the lab, you can use the **netlab capture r1 swp1** command to capture all the traffic on the R1-R2 link:

```bash
$ netlab capture r1 swp1
Starting packet capture on r1/swp1: sudo ip netns exec clab-X-r1 tcpdump -i swp1 -l -v
tcpdump: listening on swp1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:37:39.031667 IP6 (flowlabel 0xa854f, hlim 255, next-header ICMPv6 (58) payload length: 24) fe80::a8c1:abff:fe84:1dfb > ip6-allnodes: [icmp6 sum ok] ICMP6, router advertisement, length 24
hop limit 64, Flags [none], pref medium, router lifetime 15s, reachable time 0ms, retrans timer 0ms
source link-address option (1), length 8 (1): aa:c1:ab:84:1d:fb
```

```{tip}
If you don't specify additional parameters, **netlab capture** adds the `-l -v` (line-buffered, verbose) flags to the **tcpdump** command line.
```

If you want to capture a subset of traffic, use **tcpdump** traffic filters (you will also have to specify the `-l -v` flags if you wish to have an immediate verbose printout). For example, you can use the following command to display OSPF traffic:

```bash
$ netlab capture r1 swp1 proto ospf -l -v
Starting packet capture on r1/swp1: sudo ip netns exec clab-X-r1 tcpdump -i swp1 proto ospf -l -v
tcpdump: listening on swp1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:39:30.143019 IP (tos 0xc0, ttl 1, id 42863, offset 0, flags [none], proto OSPF (89), length 68)
10.1.0.2 > 224.0.0.5: OSPFv2, Hello, length 48
Router-ID 10.0.0.2, Backbone Area, Authentication Type: none (0)
Options [External]
Hello Timer 10s, Dead Timer 40s, Mask 255.255.255.252, Priority 1
Neighbor List:
10.0.0.1
```

## Changing the Packet-Capturing Utility

**netlab capture** uses **tcpdump** as the default packet-capturing utility. You can change that with the **defaults.netlab.capture.command** parameter ([default changing details](defaults)). The command you specify must include the `{intf}` string at the point where the packet-capturing utility expects the interface name.

To change the default parameters passed to the packet-capturing utility, change the **defaults.netlab.capture.command_args** parameter.

To display the default settings, use the **netlab show defaults netlab.capture** command.

```bash
$ netlab show defaults netlab.capture

netlab default settings within the netlab.capture subtree
=============================================================================

command: tcpdump -i {intf}
command_args: -l -v
```
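
To make the role of the two settings concrete, here is a minimal sketch of how a command template and default arguments could be combined. It is not netlab's implementation: `build_capture_command` is a hypothetical helper, and `shlex.split` plus `str.format` stand in for netlab's `strings.string_to_list` and `strings.eval_format_list`, whose exact behavior is assumed here.

```python
import shlex
import typing

def build_capture_command(
        command: str,
        command_args: str,
        intf: str,
        extra_args: typing.Optional[typing.List[str]] = None) -> typing.List[str]:
    """Expand {intf} in the command template and append user or default arguments."""
    cmd = [part.format(intf=intf) for part in shlex.split(command)]
    # User-supplied arguments replace the defaults; they are not merged with them
    tail = list(extra_args) if extra_args else shlex.split(command_args)
    return cmd + tail

print(build_capture_command("tcpdump -i {intf}", "-l -v", "swp1"))
print(build_capture_command("tcpdump -i {intf}", "-l -v", "swp1", ["proto", "ospf"]))
```

The second call shows why you have to repeat `-l -v` when adding a traffic filter: any extra argument suppresses the default `command_args`.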

2 changes: 2 additions & 0 deletions docs/netlab/cli.md
@@ -18,6 +18,7 @@ The **netlab** command is the *netlab* CLI interface. It includes data model tra
## Configuring and Controlling the Lab

* **[netlab connect](connect.md)** uses the transformed lab topology data to find the IP address, username, and password of the specified lab device or [external tool](../extools.md), and uses SSH or **docker exec** to connect to the lab device/tool.
* **[netlab capture](capture.md)** can be used to perform packet capture on VM- or container interfaces
* **[netlab collect](collect.md)** uses Ansible device facts (or equivalent functionality implemented with Ansible modules) to collect device configurations and store them in the specified directory.
* **[netlab validate](validate.md)** executes tests defined in the lab topology on the lab devices
* **[netlab down](down.md)** destroys the virtual lab.
@@ -52,6 +53,7 @@ The **netlab** command is the *netlab* CLI interface. It includes data model tra
.. toctree::
:maxdepth: 1
netlab capture <capture.md>
netlab clab <clab.md>
netlab collect <collect.md>
netlab config <config.md>
92 changes: 92 additions & 0 deletions netsim/cli/capture.py
@@ -0,0 +1,92 @@
#
# netlab capture command
#
# Starts packet capturing on specified node/interface
#
import sys
import typing
import argparse

from . import load_snapshot,_nodeset,external_commands
from .. import providers
from ..augment import devices
from ..utils import strings,log

#
# CLI parser for 'netlab capture' command
#
def capture_parse(args: typing.List[str]) -> typing.Tuple[argparse.Namespace, typing.List[str]]:
  parser = argparse.ArgumentParser(
    prog="netlab capture",
    description='Start a packet capture on the specified node/interface',
    epilog='All other arguments are passed directly to the packet-capturing utility')
  parser.add_argument(
    '--snapshot',
    dest='snapshot',
    action='store',
    nargs='?',
    default='netlab.snapshot.yml',
    const='netlab.snapshot.yml',
    help='Transformed topology snapshot file')
  parser.add_argument(
    dest='node', action='store',
    help='Node on which you want to capture traffic')
  parser.add_argument(
    dest='intf', action='store',
    nargs='?',
    help='Interface on which you want to capture traffic')

  return parser.parse_known_args(args)

def run(cli_args: typing.List[str]) -> None:
  (args,rest) = capture_parse(cli_args)

  topology = load_snapshot(args)

  if args.node and args.node not in topology.nodes:
    log.error(
      f'Unknown node {args.node}',
      category=log.FatalError,
      module='capture',
      skip_header=True,
      exit_on_error=True,
      more_hints=[ 'Use "netlab status" to display the node names in the current lab topology' ])

  ndata = topology.nodes[args.node]
  if not args.intf or args.intf not in [ intf.ifname for intf in ndata.interfaces ]:
    errmsg = f'Invalid interface name {args.intf} for node {args.node} (device {ndata.device})' if args.intf \
             else 'Missing interface name'
    log.error(
      errmsg,
      category=log.FatalError,
      module='capture',
      skip_header=True,
      exit_on_error=True,
      more_hints=[ f'Use "netlab report --node {args.node} addressing" to display valid interface names and their descriptions' ])
    sys.exit(1)

  node_provider = devices.get_provider(ndata,topology.defaults)
  p_module = providers.get_provider_module(topology,node_provider)
  p_cmd = p_module.call('capture_command',ndata,topology,args)

  if p_cmd is None:
    log.error(
      f'Cannot perform packet capture for node {args.node} using provider {node_provider}',
      module='capture',
      category=log.FatalError,
      exit_on_error=True,
      skip_header=True)

  if not rest:
    rest = strings.string_to_list(topology.defaults.netlab.capture.command_args)

  p_cmd += rest
  print(f'Starting packet capture on {args.node}/{args.intf}: {" ".join(p_cmd)}')
  status = external_commands.run_command(p_cmd,ignore_errors=True,return_exitcode=True)
  if status == 1:
    log.error(
      f'Packet capturing utility reported an error',
      category=log.FatalError,
      module='capture',
      skip_header=True,
      exit_on_error=True)
2 changes: 2 additions & 0 deletions netsim/cli/usage.txt
@@ -27,6 +27,8 @@ collect Collect device configurations from network devices and save them in

down Destroy the virtual lab

capture Start packet capture on the specified node/interface

Reports and graphs
==================
status Display the state of lab instances running on the current server
5 changes: 5 additions & 0 deletions netsim/defaults/hints.yml
@@ -58,3 +58,8 @@ validation:
The 'show' action should return structured data that is then validated with the
'valid' check. If you want to execute a command on the device without checking
the results, use the 'exec' action.

libvirt:
  capture: |
    Change the link type to Linux bridge with 'type: lan' link attribute or see
    https://netlab.tools/labs/libvirt/#libvirt-capture for other options.
3 changes: 3 additions & 0 deletions netsim/defaults/netlab.yml
@@ -0,0 +1,3 @@
capture:
  command: "tcpdump -i {intf}"
  command_args: "-l -v"
4 changes: 2 additions & 2 deletions netsim/providers/__init__.py
@@ -310,10 +310,10 @@ def execute(hook: str, topology: Box) -> None:
"""
Execute a node-level provider hook
"""
-def execute_node(hook: str, node: Box, topology: Box) -> None:
+def execute_node(hook: str, node: Box, topology: Box) -> typing.Any:
   node_provider = devices.get_provider(node,topology.defaults)
   p_module = get_provider_module(topology,node_provider)
-  p_module.call(hook,node,topology)
+  return p_module.call(hook,node,topology)

"""
Mark all nodes and links with relevant provider(s)
6 changes: 6 additions & 0 deletions netsim/providers/clab.py
@@ -5,6 +5,7 @@
import json
from box import Box
import pathlib
import argparse

from . import _Provider,get_forwarded_ports
from ..utils import log, strings
@@ -255,3 +256,8 @@ def validate_node_image(self, node: Box, topology: Box) -> None:
f"If you're using a private Docker repository, use the 'docker image pull {node.box}'",
f"command to pull the image from it or build/install it using this recipe:",
dp_data.build ])

  def capture_command(self, node: Box, topology: Box, args: argparse.Namespace) -> list:
    cmd = strings.string_to_list(topology.defaults.netlab.capture.command)
    cmd = strings.eval_format_list(cmd,{'intf': args.intf})
    return strings.string_to_list(f'sudo ip netns exec clab-{topology.name}-{node.name}') + cmd
41 changes: 40 additions & 1 deletion netsim/providers/libvirt.py
@@ -11,6 +11,7 @@
import pathlib
import tempfile
import netaddr
import argparse

from ..data import types,get_empty_box
from ..utils import log,strings
@@ -219,12 +220,15 @@ def pre_transform(self, topology: Box) -> None:

_Provider.pre_transform(self,topology)

+    p2p_bridge = topology.defaults.get('providers.libvirt.p2p_bridge',False)
     for l in topology.links:
       if l.get('libvirt.uplink',None):      # Set 'public' attribute if the link has an uplink
         if not 'public' in l.libvirt:       # ... but no 'public' libvirt attr
           l.libvirt.public = 'bridge'       # ... default mode is bridge (MACVTAP)
 
-      if l.get('libvirt.provider',None) and 'vlan' not in l.type:
+      must_be_lan = l.get('libvirt.provider',None) and 'vlan' not in l.type
+      must_be_lan = must_be_lan or (p2p_bridge and l.get('type','p2p') == 'p2p')
+      if must_be_lan:
         l.type = 'lan'
         if not 'bridge' in l:
           l.bridge = "%s_%d" % (topology.name[0:10],l.linkindex)
@@ -393,3 +397,38 @@ def validate_node_image(self, node: Box, topology: Box) -> None:
f"If you have the Vagrant box available in a private repository, use the",
f"'vagrant box add <url>' command to add it, or use this recipe to build it:",
dp_data.build ])

  def capture_command(self, node: Box, topology: Box, args: argparse.Namespace) -> typing.Optional[list]:
    intf = [ intf for intf in node.interfaces if intf.ifname == args.intf ][0]
    if intf.get('libvirt.type',None) == 'tunnel':
      log.error(
        f'Cannot perform packet capture on libvirt point-to-point links',
        category=log.FatalError,
        module='libvirt',
        skip_header=True,
        exit_on_error=True,
        hint='capture')

    domiflist = external_commands.run_command(
      ['virsh','domiflist',f'{topology.name}_{node.name}'],
      check_result=True,
      return_stdout=True)
    if not isinstance(domiflist,str):
      return None

    for intf_line in domiflist.split('\n'):
      intf_data = strings.string_to_list(intf_line)
      if len(intf_data) != 5:
        continue
      if intf_data[2] == intf.bridge:
        cmd = strings.string_to_list(topology.defaults.netlab.capture.command)
        cmd = strings.eval_format_list(cmd,{'intf': intf_data[0]})
        return ['sudo'] + cmd

    log.error(
      f'Cannot find the interface on node {node.name} attached to libvirt network {intf.bridge}',
      category=log.FatalError,
      module='libvirt',
      skip_header=True,
      exit_on_error=True)
    return None
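
The lookup in `capture_command` above maps a libvirt network name to the host-side interface by scanning the `virsh domiflist` table. The following standalone sketch shows that matching step only. `find_host_interface` is a hypothetical helper, `str.split` is assumed to behave like netlab's `strings.string_to_list` for this input, and the sample table (interface names, network names, MAC addresses) is made up for illustration.

```python
import typing

# Illustrative sample only: the values are invented, the column layout
# follows the five-column table that 'virsh domiflist' prints
SAMPLE_DOMIFLIST = """\
 Interface   Type      Source            Model    MAC
---------------------------------------------------------------
 vnet0       network   vagrant-libvirt   virtio   52:54:00:00:00:01
 vnet1       network   X_1               virtio   52:54:00:00:00:02
"""

def find_host_interface(domiflist: str, bridge: str) -> typing.Optional[str]:
    """Return the host interface whose Source column matches the bridge name."""
    for line in domiflist.splitlines():
        fields = line.split()
        if len(fields) != 5:                 # skip the separator and empty lines
            continue
        if fields[2] == bridge:              # third column is the libvirt network
            return fields[0]                 # first column is the host interface
    return None

print(find_host_interface(SAMPLE_DOMIFLIST, "X_1"))
```

The header row also has five fields, but its third column is the literal word `Source`, so it only matches if a network happens to carry that name.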
10 changes: 9 additions & 1 deletion netsim/utils/log.py
@@ -184,7 +184,8 @@ def error(
     more_hints: typing.Optional[typing.Union[str,list]] = None, # More hints or extra data
     more_data: typing.Optional[typing.Union[str,list]] = None,
     indent: int = 10,
-    skip_header: typing.Optional[bool] = None) -> None:
+    skip_header: typing.Optional[bool] = None,
+    exit_on_error: bool = False) -> None:

global _ERROR_LOG,err_class_map,_WARNING_LOG,QUIET,err_color_map,_error_header_printed

@@ -225,13 +226,17 @@ def error(
print_more_hints(more_data,'DATA','bright_black',h_warning=category is Warning,indent=indent)

   if hint is None:                    # No pointers to static hints
+    if exit_on_error and category is not Warning:
+      sys.exit(1)
     return
 
   from ..data.global_vars import get_topology
   from .strings import extra_data_printout
 
   topology = get_topology()
   if topology is None:                # No valid topology ==> no static hints
+    if exit_on_error and category is not Warning:
+      sys.exit(1)
     return

mod_hints = topology.defaults.hints[module] # Get static hints for current module
@@ -248,6 +253,9 @@ def error(

mod_hints[hint] = ''

+  if exit_on_error and category is not Warning:
+    sys.exit(1)

"""
Print informational message. The arguments are similar to the ones used in 'error' function
"""
6 changes: 5 additions & 1 deletion netsim/utils/read.py
@@ -287,7 +287,11 @@ def add_cli_args(topo: Box, args: typing.Union[argparse.Namespace,Box]) -> None:
topo.defaults.device = args.device

   if args.provider:
-    topo.provider = args.provider
+    p_list = args.provider.split(':')
+    topo.provider = p_list[0]
+    for p_option in p_list[1:]:
+      for p_element in p_option.split(','):
+        topo.defaults.providers[topo.provider][p_element] = True
 
   if args.plugin:
     if log.debug_active('plugin'):