IPTables basics - Mulepedia

`iptables`

Note : iptables is being deprecated in favor of nftables. Although the syntax differs, the general concepts as well as the kernel built-in tables and chains can be expected to remain the same.

Features overview
Rules configuration
Rules traversal
Commands samples
General syntax
Best practices and caveats
NAT (Network Address Translation)
Connection tracking
Putting everything together
Packets inspection tools

Features overview

netfilter is a networking framework built into the Linux kernel that supports a variety of features.
iptables is a userspace utility that interacts with the packet filtering features of netfilter :
1. It reads the IP header of every incoming / outgoing packet.
2. It then processes the packet according to its configured rules.
3. Thus, packet filtering happens at layer 3 of the OSI model.
Caution : iptables does not support IPv6. Use ip6tables if required.

Rules configuration

Packet filtering configuration is stored in built-in chains :
1. INPUT : rules for packets traversing the local IP layer upwards (enter).
2. OUTPUT : rules for packets traversing the local IP layer downwards (exit).
3. FORWARD : rules for packets routed to another destination (enter then exit).
4. PREROUTING : rules applied to packets before the routing decision (enter).
5. POSTROUTING : rules applied to packets after the routing decision (exit).
When a packet reaches the IP layer, a kernel hook fires and applies rules for the relevant chain :
1. Rules are evaluated sequentially (order matters).
2. Rules are evaluated until one rule matches the processed packet.
3. Once the matching rule is found, the configured target is applied.
4. Some targets are non-terminating (e.g. LOG), i.e. the next rule will be evaluated regardless.
5. If no matching rule is found, the default chain policy is applied.
User-specified chains can be created and used as targets, however this topic is considered out of scope.
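
The evaluation steps above can be sketched as a toy model in shell. This only illustrates the first-match semantics and is not how netfilter is implemented : the `evaluate_chain` helper and its `<protocol> <target>` rule format are made up for the example.

```shell
#!/bin/sh
# toy model of chain traversal (hypothetical helper, not netfilter code)
# usage : evaluate_chain <packet protocol> <chain policy> <rule>...
# each rule is written "<protocol> <target>"
evaluate_chain() {
  proto="$1"
  policy="$2"
  shift 2
  for rule in "$@"; do
    match="${rule%% *}"   # first word : protocol to match
    target="${rule##* }"  # last word : target to apply
    [ "$match" = "$proto" ] || continue
    echo "$target"
    # LOG is non-terminating, any other target ends the traversal
    [ "$target" = "LOG" ] || return 0
  done
  # no terminating rule matched : fall back to the chain policy
  echo "$policy"
}

# ICMP packet : logged by rule 1, then dropped by rule 2
evaluate_chain icmp ACCEPT "icmp LOG" "icmp DROP" "tcp ACCEPT"
# UDP packet : no rule matches, the chain policy applies
evaluate_chain udp DROP "icmp LOG" "icmp DROP" "tcp ACCEPT"
```

The first call prints LOG then DROP, the second one prints DROP only, which mirrors why rule order matters.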

Note : IP packet forwarding and NAT support require setting the kernel variable /proc/sys/net/ipv4/ip_forward to 1.
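
As a minimal sketch, the current value can be checked before writing any forwarding or NAT rule. The `ip_forward_enabled` helper and the `IP_FORWARD_FILE` variable are assumptions introduced for the example; the `net.ipv4.ip_forward` sysctl key maps to the file mentioned above.

```shell
#!/bin/sh
# path of the kernel variable (overridable, hypothetical variable name)
IP_FORWARD_FILE="${IP_FORWARD_FILE:-/proc/sys/net/ipv4/ip_forward}"

# succeed if IPv4 forwarding is currently enabled
ip_forward_enabled() {
  [ "$(cat "$IP_FORWARD_FILE" 2>/dev/null)" = "1" ]
}

if ip_forward_enabled; then
  echo "IPv4 forwarding is enabled"
else
  echo "IPv4 forwarding is disabled"
  # the change below does not survive a reboot
  echo "enable it with : sudo sysctl -w net.ipv4.ip_forward=1"
fi
```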

Rules traversal

The kernel views the configured rules as a single ruleset, however their configuration is split across multiple built-in tables.
Each table supports a distinct set of built-in chains; chain rules pertaining to the table's purpose may be configured there :

chain / table	`filter`	`nat`	`mangle`	`raw`	`security`
`INPUT`	X	X	X		X
`OUTPUT`	X	X	X	X	X
`FORWARD`	X		X		X
`PREROUTING`		X	X	X
`POSTROUTING`		X	X

Depending on the scenario (enter, exit, enter then exit) the packet is evaluated against a specific sequence of chains.
Detailed tables description as well as the list of supported targets per table can be found here.
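
The sequence of chains for each scenario can be summarized with a small lookup. This is a sketch : `chain_sequence` is a hypothetical helper, and the scenario names are the ones used in this page (enter, exit, enter then exit written as `forward`).

```shell
#!/bin/sh
# print the built-in chains traversed for a given scenario (hypothetical helper)
chain_sequence() {
  case "$1" in
    enter)   echo "PREROUTING INPUT" ;;
    exit)    echo "OUTPUT POSTROUTING" ;;
    forward) echo "PREROUTING FORWARD POSTROUTING" ;;
    *)       echo "unknown scenario : $1" >&2; return 1 ;;
  esac
}

for scenario in enter exit forward; do
  echo "$scenario : $(chain_sequence "$scenario")"
done
```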

Commands samples

# list rules for table filter chain INPUT, show packets and bytes count
sudo iptables -t filter -L INPUT -v

# allow incoming ICMP packets from local subnet on eth0, log headers and drop otherwise
sudo iptables -t filter -A INPUT -i eth0 -s 192.168.1.0/24 -p icmp -j ACCEPT
sudo iptables -t filter -A INPUT -p icmp -j LOG --log-prefix "ICMP drop inbound: "
sudo iptables -t filter -A INPUT -p icmp -j DROP

# print kernel messages showing IP headers for dropped packets
sudo dmesg -k --level=err,warn | grep "ICMP drop"

# allow-list patterns built on negated matchers (!) are unreliable, avoid them
sudo iptables -t filter -A OUTPUT ! -o eth0 ! -d 192.168.1.1 -p icmp -j DROP

# change default policy for chain INPUT (only supports ACCEPT and DROP)
sudo iptables -t filter -P INPUT DROP

# reset packet and bytes count for rule 1 of chain INPUT (rules are 1-indexed)
sudo iptables -t filter -Z INPUT 1

# delete rule 1 from chain INPUT (rules are 1-indexed)
sudo iptables -t filter -D INPUT 1

# delete all rules for chain INPUT
sudo iptables -t filter -F INPUT

# print module specific options if available
iptables -m icmp --help
iptables -m conntrack --help

General syntax

Rules can be customized using a number of extension modules.
Since iptables operates at layer 3, specifying a layer 4 protocol with -p is not required to write rules.
When it is specified, iptables automatically looks for a module of the same name and loads it if found.
This feature is somewhat confusing because it hides the complete syntax of rule commands in most cases :

# table
TABLE="filter"
# chain (must match table)
CHAIN="INPUT"
# general options
OPTS=("-i" "eth0" "-s" "192.168.1.0/24" "-p" "icmp")
# extension module
MODULE="icmp"
# module specific options
MODULE_OPTS=("--icmp-type" "echo-request")
# target (must match table AND module)
TARGET="LOG"
# target specific options
TARGET_OPTS=("--log-prefix" "ping from local subnet: ")
# use the -A command to append a rule with the above configuration
# ("${ARRAY[@]}" expands each element as a separate argument)
sudo iptables -t "$TABLE" -A "$CHAIN" "${OPTS[@]}" -m "$MODULE" "${MODULE_OPTS[@]}" -j "$TARGET" "${TARGET_OPTS[@]}"

Note : multiple modules can be loaded in a single command using multiple -m options.

Best practices and caveats

Adding ACCEPT rules to chains permits monitoring packet traffic (packets and bytes count).
Production grade logging rules must be rate limited with the limit module (-m limit --limit) to avoid log flooding (see documentation).
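
A rate limited logging rule could look like the sketch below. The `run` wrapper is a hypothetical helper that only prints the command unless `APPLY=1` is set, and the rate and burst values are arbitrary choices for the example.

```shell
#!/bin/sh
# print commands by default, set APPLY=1 to run them for real (requires root)
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# log at most 5 matching packets per minute once an initial burst of 10 is spent
run iptables -t filter -A INPUT -p icmp \
  -m limit --limit 5/minute --limit-burst 10 \
  -j LOG --log-prefix "ICMP drop inbound: "
```

Packets exceeding the rate simply do not match this rule and move on to the next one, so a DROP rule placed after it still applies to them.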
Limitations for interface based filtering :
- Only packets traversing the FORWARD chain can be matched against both an input and output interface.
- For instance, a rule for the INPUT chain that specifies an output interface will never match.
Limitations for layer 4 based filtering :
- iptables only supports policies based on headers of packetized layer 4 messages as opposed to their contents.
- TCP features such as acknowledgment, retransmission, reordering are too resource intensive to implement in the packet filter.
Advanced use cases may require fine tuning of rules for fragmented IP packets using the -f option :
- In this case, every fragment but the first lacks the layer 4 protocol header, so it may not match rules based on layer 4 filtering.
- This is mitigated by the fact that dropping the first fragment prevents reassembly of the original packet in most cases.

NAT (Network Address Translation)

Source NAT :
- SNAT rules mangle packets by altering their source IP address after the routing decision.
- For instance, SNAT rules are set up in a router to share internet access between all nodes of a layer 3 network segment.
- The router overwrites the initial source IP (local node) of externally routed packets with its own external IP.
- MASQUERADE is a special type of SNAT for dynamic addressing, when rules can't be passed a static IP address.
Destination NAT :
- DNAT rules mangle packets by altering their destination IP address before the routing decision.
- For instance, DNAT rules are set up in a router to distribute packets between nodes connected to a layer 3 network segment.
- The router overwrites the initial destination IP (its own external IP) of incoming packets with the IP of a local node.
iptables mangles IP packets as little as possible with respect to the configured NAT rules.
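
The NAT variants described above roughly translate to the rules below. This is a sketch under assumptions : the interface names, the external address (taken from a documentation range), the local node and the ports are placeholders, and the hypothetical `run` wrapper only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# print commands by default, set APPLY=1 to run them for real (requires root)
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

EXT_IF="eth1"          # external interface (assumption)
EXT_IP="203.0.113.7"   # static external address (placeholder)
LAN="192.168.1.0/24"   # local subnet
NODE="192.168.1.10"    # local node receiving redirected traffic (assumption)

nat_rules() {
  # source NAT with a static external address
  run iptables -t nat -A POSTROUTING -o "$EXT_IF" -s "$LAN" -j SNAT --to-source "$EXT_IP"
  # alternative to the above when the external address is dynamic
  run iptables -t nat -A POSTROUTING -o "$EXT_IF" -s "$LAN" -j MASQUERADE
  # destination NAT : redirect incoming TCP port 80 to port 8080 of a local node
  run iptables -t nat -A PREROUTING -i "$EXT_IF" -p tcp --dport 80 -j DNAT --to-destination "$NODE:8080"
}

nat_rules
```

In practice only one of the SNAT and MASQUERADE rules would be configured for a given interface.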

Connection tracking

Connection tracking is supported for layer 4 protocols: packets are identified as part of an existing connection.
A state machine keeps track of the layer 4 header values that are common to all packets exchanged during a session.
Incoming packets are evaluated against tracked sessions to find out which ones they belong to and apply the relevant rules.

The `conntrack` module

conntrack is a kernel module that implements the state machine used for connection tracking.
It maintains a table of entries, each containing the current and next states for a tracked connection :
1. The current state describes the last transmitted packet identified for the connection.
2. The next state describes the next packet expected to be identified for the connection.
3. When the next packet is identified, the current state is updated and a new next state is computed.
conntrack entries are updated according to the normal flow of messages for the associated layer 4 protocol.
For instance, UDP "connections" are tracked using source / destination IP addresses and source / destination ports.
The structures used for the conntrack entries are specific to the kernel and out of scope when considering iptables.
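
To illustrate the UDP case, here is a toy model that tracks "connections" by their address and port tuple. It is a deliberately simplified sketch (no timeouts, no protocol checks) and has nothing to do with the actual kernel structures : `track_packet` and `CT_TABLE` are hypothetical names.

```shell
#!/bin/sh
# toy model of UDP connection tracking (hypothetical, not the kernel data structure)
CT_TABLE="$(mktemp)"

# usage : track_packet <src ip> <src port> <dst ip> <dst port>
track_packet() {
  tuple="$1:$2>$3:$4"
  reverse="$3:$4>$1:$2"
  # remember the tuple of every packet seen
  grep -qxF -- "$tuple" "$CT_TABLE" || echo "$tuple" >> "$CT_TABLE"
  # traffic seen in the opposite direction means the peer replied
  if grep -qxF -- "$reverse" "$CT_TABLE"; then
    echo "ESTABLISHED"
  else
    echo "NEW"
  fi
}

track_packet 192.168.1.2 40000 8.8.8.8 53   # DNS query : prints NEW
track_packet 8.8.8.8 53 192.168.1.2 40000   # DNS reply : prints ESTABLISHED
rm -f "$CT_TABLE"
```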

Connection tracking in `iptables`

Connection states are updated when packets traverse the OUTPUT (outbound) and PREROUTING (inbound) chains, right after the raw table is evaluated :
1. NEW: packet started a new connection.
2. ESTABLISHED: packet is part of an existing connection.
3. RELATED: packet starts a new connection that is associated with an existing one.
4. INVALID: packet can't be associated with a new or existing connection.
5. UNTRACKED: packet explicitly excluded from connection tracking in the raw table.
Additional state-based filtering rules can be applied in subsequent tables to the matching packets.
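
A typical set of state-based rules might look like the sketch below, using the conntrack module and the CT target. The ports are arbitrary examples, and the hypothetical `run` wrapper only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# print commands by default, set APPLY=1 to run them for real (requires root)
run() {
  if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

state_rules() {
  # exclude inbound NTP packets from connection tracking (state UNTRACKED)
  run iptables -t raw -A PREROUTING -p udp --dport 123 -j CT --notrack
  # drop packets that can't be associated with any connection
  run iptables -t filter -A INPUT -m conntrack --ctstate INVALID -j DROP
  # accept packets belonging or related to an existing connection
  run iptables -t filter -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
  # accept new inbound SSH connections
  run iptables -t filter -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j ACCEPT
}

state_rules
```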

Putting everything together

A router is configured with local eth0 (192.168.1.1) and public eth1 (292.155.42.7) which is the default route.
iptables is configured to allow pings sent from the local subnet, drop everything else and masquerade outgoing packets :

# set default policies, drop everything
sudo iptables -t filter -P INPUT DROP
sudo iptables -t filter -P OUTPUT DROP
sudo iptables -t filter -P FORWARD DROP

# log tracked connections, state is updated to ESTABLISHED once the ping reply arrives
# (raw is evaluated before connection tracking, so state matching rules go in mangle)
sudo iptables -t mangle -A PREROUTING -i eth0 -s 192.168.1.0/24 -p icmp -m icmp --icmp-type echo-request -m state --state NEW -j LOG --log-prefix "Ping: "
sudo iptables -t mangle -A PREROUTING -p icmp -m state --state ESTABLISHED -j LOG --log-prefix "Pong: "

# allow 1-way traffic for new connections, 2-way traffic for existing connections
sudo iptables -t filter -A FORWARD -i eth0 -s 192.168.1.0/24 -p icmp -m icmp --icmp-type echo-request -m state --state NEW -j ACCEPT
sudo iptables -t filter -A FORWARD -p icmp -m state --state ESTABLISHED -j ACCEPT

# masquerade filtered packets, ignore connection state
sudo iptables -t nat -A POSTROUTING -o eth1 -s 192.168.1.0/24 -j MASQUERADE

conntrack identifies a ping request and the matching reply as part of the same connection.

Thus, netfilter performs connection tracking, packet filtering and NAT in the following sequence :

#	src	dest	table	chain	action
1	`192.168.1.2`	`8.8.8.8`	`mangle`	`PREROUTING`	Start connection tracking, log
2	`192.168.1.2`	`8.8.8.8`	`filter`	`FORWARD`	Apply state based filtering rules
3	`192.168.1.2`	`8.8.8.8`	`nat`	`POSTROUTING`	Apply `SNAT` rules
4	`292.155.42.7`	`8.8.8.8`	-	-	Outgoing packet transmitted on `eth1`
5	`8.8.8.8`	`292.155.42.7`	`mangle`	`PREROUTING`	Update connection, log, reverse `SNAT`
6	`8.8.8.8`	`192.168.1.2`	`filter`	`FORWARD`	Apply state based filtering rules
7	`8.8.8.8`	`192.168.1.2`	-	-	Incoming packet transmitted on `eth0`

Packets inspection tools

Since the output of the LOG target is raw and impractical to read, tcpdump should be preferred for traffic inspection.
It matches packets against filter expressions written using the Berkeley Packet Filter syntax :

# list interfaces available for capture
sudo tcpdump --list-interfaces

# prints the link-layer types available for interface
sudo tcpdump -i eth0 --list-data-link-types

# logs ICMP packets arriving on eth0 from 192.168.1.10
sudo tcpdump --print -q -i eth0 src host 192.168.1.10 and ip proto 1

# logs outgoing DNS requests
sudo tcpdump --print -v -i eth0 udp and dst port 53

# logs TCP session between host and 192.168.1.10 on port 8080
sudo tcpdump --print -v -i eth0 host 192.168.1.10 and tcp and port 8080

# saves the next 10 packets transmitted through eth0 to a file
sudo tcpdump -i eth0 -c 10 -w traffic.pcap

# read packets from a saved file (no sudo)
tcpdump -r traffic.pcap

Notes :

tcpdump defaults to promiscuous mode when capturing packets, however this behavior can be disabled with -p.
Man pages for tcpdump and pcap-filter are quite lengthy, so this cheatsheet should cover most use cases.
This cool guide contains a comprehensive list of network tools to use as well.
