On Linux, netfilter/iptables and the iproute2 package (ip/tc/rtmon) work together: iptables marks and mangles packets, and iproute2 acts on those marks, which makes it possible to build sophisticated QoS routers. Packet mangling (manipulation) is the altering of certain bits of the IP header, for example, the ToS field or DSCP.
Table 13-1 shows an overview of the platform availability of these components. Figure 13-1 shows the lab layout for the following subsections.
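To make the ToS/DSCP mangling mentioned above concrete, the following minimal Python sketch shows how the 8-bit ToS byte splits into a 6-bit DSCP and a 2-bit ECN field, and how a DSCP can be rewritten without disturbing the ECN bits. The function names are illustrative only and do not correspond to any tool discussed in this chapter.

```python
from typing import Tuple

EF = 46  # Expedited Forwarding code point


def split_tos(tos: int) -> Tuple[int, int]:
    """Split the 8-bit ToS byte into (DSCP, ECN)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS byte must be in 0..255")
    return tos >> 2, tos & 0x03


def set_dscp(tos: int, dscp: int) -> int:
    """Return a ToS byte with the DSCP replaced and the ECN bits preserved."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must be in 0..63")
    return (dscp << 2) | (tos & 0x03)


new_tos = set_dscp(0x00, EF)
print(hex(new_tos))        # 0xb8
print(split_tos(new_tos))  # (46, 0)
```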
OpenBSD ALTQ+pf
Packet Filter (pf) is OpenBSD's stateful-inspection firewall and NAT engine. It has recently been ported to NetBSD and FreeBSD. pf can also normalize and condition TCP/IP traffic and provide bandwidth control and packet prioritization via ALTQ, a framework for managing queuing disciplines on network interfaces.
Starting with OpenBSD 3.3, ALTQ has been integrated into pf (http://www.benzedrine.cx/pf.html). OpenBSD's ALTQ implementation supports CBQ and PRIQ schedulers (see Examples 13-17 and 13-18). It also supports RED and explicit congestion notification (ECN). The /etc/pf.conf file is the only relevant configuration file (pf.conf(5), pf(4), pfctl(8), pflogd(8), ftp-proxy(8)).
Example 13-17. OpenBSD pf/ALTQ PRIQ Example
[root@europa:~#] cat /etc/pf.conf
#
# Queuing: rule-based bandwidth control
#
# PRIQ Example
altq on xl0 bandwidth 2Mb priq queue {dflt engineering testlab}
queue dflt priority 7 qlimit 50 priq(default red ecn)
queue engineering priority 6 qlimit 50 priq(red ecn)
queue testlab priority 5 qlimit 50 priq(rio ecn)
# Filtering:
pass in log all
pass out on xl0 proto tcp from any to any port 22 queue dflt
pass out on xl0 proto icmp from any to any queue testlab
pass out on xl0 from 172.16.0.0/16 to any queue engineering
pass out log all
[root@europa:~#] pfctl -s queue -v
queue dflt priority 7 priq( red ecn default )
[ pkts: 468 bytes: 47560 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
queue engineering priority 6 priq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
queue testlab priority 5 priq( red ecn rio )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[root@europa:~#] pfctl -s queue -v -v
queue dflt priority 7 priq( red ecn default )
[ pkts: 495 bytes: 50376 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
queue engineering priority 6 priq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
queue testlab priority 5 priq( red ecn rio )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
queue dflt priority 7 priq( red ecn default )
[ pkts: 511 bytes: 52212 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 3.2 packets/s, 2.93Kb/s ]
queue engineering priority 6 priq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 0.0 packets/s, 0 b/s ]
queue testlab priority 5 priq( red ecn rio )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 0.0 packets/s, 0 b/s ]
queue dflt priority 7 priq( red ecn default )
[ pkts: 532 bytes: 54538 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 4.2 packets/s, 3.71Kb/s ]
queue engineering priority 6 priq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 0.0 packets/s, 0 b/s ]
queue testlab priority 5 priq( red ecn rio )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 ]
[ measured: 0.0 packets/s, 0 b/s ]
[root@europa:~#] pfctl -s rules -v
scrub in all random-id fragment reassemble
[ Evaluations: 2979 Packets: 1482 Bytes: 0 States: 0 ]
pass in log all
[ Evaluations: 1536 Packets: 759 Bytes: 64806 States: 0 ]
pass out on xl0 proto tcp from any to any port = ssh queue dflt
[ Evaluations: 1539 Packets: 0 Bytes: 0 States: 0 ]
pass out on xl0 proto icmp all queue testlab
[ Evaluations: 783 Packets: 0 Bytes: 0 States: 0 ]
pass out on xl0 inet from 172.16.0.0/16 to any queue engineering
[ Evaluations: 786 Packets: 0 Bytes: 0 States: 0 ]
pass out log all
[ Evaluations: 789 Packets: 789 Bytes: 73176 States: 0 ]
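The behavior of the PRIQ scheduler in Example 13-17 can be sketched in a few lines of Python. This is a simplified model, not pf's implementation: it reuses the queue names, priorities, and qlimit values from the example, but it uses plain tail drop where the example configures RED/ECN, and the class and method names are hypothetical.

```python
from collections import deque


class PriqScheduler:
    """Strict-priority scheduler: always serve the highest non-empty queue."""

    def __init__(self, queues):
        # queues maps a queue name to a (priority, qlimit) pair
        self.queues = {
            name: {"priority": prio, "qlimit": qlimit, "packets": deque()}
            for name, (prio, qlimit) in queues.items()
        }

    def enqueue(self, name, packet):
        """Append a packet; return False if the queue is full (tail drop)."""
        q = self.queues[name]
        if len(q["packets"]) >= q["qlimit"]:
            return False
        q["packets"].append(packet)
        return True

    def dequeue(self):
        """Return the next packet to transmit, or None if all queues are empty."""
        ordered = sorted(self.queues.values(), key=lambda q: q["priority"], reverse=True)
        for q in ordered:
            if q["packets"]:
                return q["packets"].popleft()
        return None


sched = PriqScheduler({"dflt": (7, 50), "engineering": (6, 50), "testlab": (5, 50)})
sched.enqueue("testlab", "icmp-1")
sched.enqueue("engineering", "eng-1")
sched.enqueue("dflt", "ssh-1")
print([sched.dequeue() for _ in range(3)])  # ['ssh-1', 'eng-1', 'icmp-1']
```

Because the scheduler is strict, a continuously busy higher-priority queue can starve the lower ones, which is the main design trade-off compared with CBQ.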
Example 13-18. OpenBSD pf/ALTQ CBQ Example
[root@europa:~#] cat /etc/pf.conf
#
# Queuing: rule-based bandwidth control
#
# CBQ Example
altq on xl0 bandwidth 2Mb cbq queue {dflt engineering testlab}
queue dflt bandwidth 50% priority 7 qlimit 50 cbq(default red ecn)
queue engineering priority 6 bandwidth 30% qlimit 50 cbq(red ecn borrow)
queue testlab priority 5 bandwidth 20% qlimit 50 cbq(red ecn)
# Filtering
pass in log all
pass out on xl0 proto tcp from any to any port 22 queue dflt
pass out on xl0 proto icmp from any to any queue testlab
pass out on xl0 from 172.16.0.0/16 to any queue engineering
pass out log all
[root@europa:~#] pfctl -s queue -v
queue root_xl0 bandwidth 2Mb priority 0 cbq( wrr root ) {dflt, engineering, testlab}
[ pkts: 25 bytes: 2256 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue dflt bandwidth 1Mb priority 7 cbq( red ecn default )
[ pkts: 25 bytes: 2256 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue engineering bandwidth 600Kb priority 6 cbq( red ecn borrow )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue testlab bandwidth 400Kb priority 5 cbq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
[root@europa:~#] pfctl -s queue -v -v
queue root_xl0 bandwidth 2Mb priority 0 cbq( wrr root ) {dflt, engineering, testlab}
[ pkts: 60 bytes: 6010 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue dflt bandwidth 1Mb priority 7 cbq( red ecn default )
[ pkts: 60 bytes: 6010 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue engineering bandwidth 600Kb priority 6 cbq( red ecn borrow )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue testlab bandwidth 400Kb priority 5 cbq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
queue root_xl0 bandwidth 2Mb priority 0 cbq( wrr root ) {dflt, engineering, testlab}
[ pkts: 80 bytes: 8454 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
[ measured: 4.0 packets/s, 3.90Kb/s ]
queue dflt bandwidth 1Mb priority 7 cbq( red ecn default )
[ pkts: 80 bytes: 8454 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
[ measured: 4.0 packets/s, 3.90Kb/s ]
queue engineering bandwidth 600Kb priority 6 cbq( red ecn borrow )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
[ measured: 0.0 packets/s, 0 b/s ]
queue testlab bandwidth 400Kb priority 5 cbq( red ecn )
[ pkts: 0 bytes: 0 dropped pkts: 0 bytes: 0 ]
[ qlength: 0/ 50 borrows: 0 suspends: 0 ]
[ measured: 0.0 packets/s, 0 b/s ]
[root@europa:~#] pfctl -s rules -v
scrub in all random-id fragment reassemble
[ Evaluations: 584 Packets: 287 Bytes: 0 States: 0 ]
pass in log all
[ Evaluations: 10 Packets: 5 Bytes: 380 States: 0 ]
pass out on xl0 proto tcp from any to any port = ssh queue dflt
[ Evaluations: 10 Packets: 0 Bytes: 0 States: 0 ]
pass out on xl0 proto icmp all queue testlab
[ Evaluations: 5 Packets: 0 Bytes: 0 States: 0 ]
pass out on xl0 inet from 172.16.0.0/16 to any queue engineering
[ Evaluations: 5 Packets: 0 Bytes: 0 States: 0 ]
pass out log all
[ Evaluations: 5 Packets: 5 Bytes: 380 States: 0 ]
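The absolute rates that pfctl reports in Example 13-18 (1Mb, 600Kb, 400Kb) follow directly from the percentages in pf.conf applied to the 2Mb parent. The Python sketch below reproduces that arithmetic. The helper names are hypothetical, and the parser handles only the simple decimal rate suffixes seen in the example.

```python
UNITS = {"Kb": 1_000, "Mb": 1_000_000, "Gb": 1_000_000_000}


def parse_rate(text):
    """Convert a pf.conf-style rate such as '2Mb' to bits per second."""
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    if text.endswith("b"):
        return int(text[:-1])
    return int(text)


def child_rates(parent, percents):
    """Turn per-queue bandwidth percentages into absolute bit rates."""
    if sum(percents.values()) > 100:
        raise ValueError("child queues exceed 100% of the parent bandwidth")
    parent_bps = parse_rate(parent)
    return {name: parent_bps * pct // 100 for name, pct in percents.items()}


print(child_rates("2Mb", {"dflt": 50, "engineering": 30, "testlab": 20}))
# {'dflt': 1000000, 'engineering': 600000, 'testlab': 400000}
```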
FreeBSD ipfilter+ALTQ
In contrast to OpenBSD pf, ipfilter and ALTQ do not form an integrated architecture. ALTQ does not natively ship with FreeBSD; it is now part of the KAME Project (http://www.kame.net/). The original web page is located at http://www.csl.sony.co.jp/~kjc/software.html#ALTQ.
ALTQ is concerned with queuing disciplines and resource-sharing QoS approaches. Remember that a queuing discipline controls outgoing traffic only. ALTQ provides stubs for RSVP and DiffServ support and implements the following queuing disciplines:
- CBQ: Class-based queuing
- HFSC: The hierarchical fair service curve algorithm for link sharing, real-time, and priority service
- JoBS: Joint buffer management and scheduling algorithm
- RED: Random early detection
- RIO: RED with in/out
- Blue: A queue-management algorithm focusing on eliminating packet loss in congestion situations (an alternative to RED)
- WFQ: Weighted fair queuing
- PRIQ: Priority queuing
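Of the disciplines listed above, RED is the easiest to illustrate in isolation. The following Python sketch shows the textbook form of the algorithm: a moving average of the queue length and a drop (or ECN-mark) probability that rises linearly between two thresholds. The threshold, weight, and probability values are illustrative assumptions, not ALTQ defaults.

```python
def ewma(avg, sample, weight):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * sample


def red_drop_probability(avg_qlen, min_th, max_th, max_p):
    """Classic RED: no drops below min_th, linear ramp up to max_p, then drop all."""
    if avg_qlen < min_th:
        return 0.0
    if avg_qlen >= max_th:
        return 1.0
    return max_p * (avg_qlen - min_th) / (max_th - min_th)


# Hypothetical thresholds for a queue with a limit of 50 packets
for avg in (5, 20, 30):
    print(avg, red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

With ECN enabled, a packet selected by this probability is marked instead of dropped when the flow is ECN-capable.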
CBQ, HFSC, and RED are the most mature and recommended approaches. Relevant management tools and man pages are as follows:
- tbrconfig(8): Configure a token-bucket regulator for an output queue
- altq(9): Kernel interfaces for manipulating output queues on network interfaces
- altq.conf(5): ALTQ configuration file for altqd(8)
- altqd(8): The ALTQ daemon
- altqstat(1): Show ALTQ status
- pf.conf(5): Packet filter configuration file
- pfctl(8): Control the packet filter (PF) and NAT device
You can use the ALTQ token-bucket regulator to rate-limit an interface (see Example 13-19).
Example 13-19. Interface Rate-Limiting Without Queuing Discipline
[root@castor:~#] tbrconfig ed0 30M auto
ed0: tokenrate 30.00M(bps) bucketsize 36.62K(bytes)
[root@castor:~#] tbrconfig -d ed0
deleted token bucket regulator on ed0
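The idea behind a token-bucket regulator can be sketched as follows. This is a generic token bucket in Python, not the ALTQ kernel code, and the class name is hypothetical. The 37,500-byte bucket used here equals the 36.62K (bytes) reported in Example 13-19. It also happens to equal 10 ms of traffic at 30 Mbps, though whether tbrconfig derives the value that way is an assumption.

```python
class TokenBucket:
    """Generic token bucket: tokens accrue at `rate_bps`, capped at `bucket_bytes`."""

    def __init__(self, rate_bps, bucket_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.bucket_bytes = float(bucket_bytes)
        self.tokens = float(bucket_bytes)  # start with a full bucket
        self.last = 0.0

    def allow(self, now, packet_bytes):
        """Return True if a packet of this size may be sent at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.bucket_bytes, self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


tbr = TokenBucket(rate_bps=30_000_000, bucket_bytes=37_500)
burst = sum(tbr.allow(0.0, 1500) for _ in range(30))
print(burst)                  # 25 full-size packets fit in one bucket
print(tbr.allow(0.01, 1500))  # True: the bucket has refilled after 10 ms
```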
FreeBSD IP Firewall(ipfw) + dummynet
FreeBSD's ipfw(8) is the utility that is responsible for controlling the ipfirewall(4) and dummynet(4) system facilities. ipfirewall is used for filtering, redirection, accounting, and NAT, whereas dummynet is a flexible bandwidth manager and delay emulator for traffic shaping and networking protocol testing on the FreeBSD operating system. dummynet supports a variant of WFQ and can be used on any type of workstation or gateway acting as either a router or bridge (see Example 13-20). For a detailed introduction, check out the following excellent man pages: divert(4), ipfirewall(4), ipfw(4), ipfw-graph(8), ipfw-al(1).
Example 13-20. Traffic Shaping Across a Crossover Link: FreeBSD (RED and WF2Q+) <--> Cisco IOS (CAR/GTS and RED)
[root@castor:~#] ipfw add pipe 1 icmp from any to any out xmit ed0
[root@castor:~#] ipfw pipe 1 config bw 8Kbit/s queue 10 delay 10ms red
[root@castor:~#] ipfw queue 1 config pipe 1 weight 1 red
[root@castor:~#] ping 192.168.7.254
PING 192.168.7.254 (192.168.7.254): 56 data bytes
64 bytes from 192.168.7.254: icmp_seq=0 ttl=255 time=94.281 ms
64 bytes from 192.168.7.254: icmp_seq=1 ttl=255 time=95.006 ms
64 bytes from 192.168.7.254: icmp_seq=2 ttl=255 time=95.030 ms
64 bytes from 192.168.7.254: icmp_seq=3 ttl=255 time=95.002 ms
scar# show running-config
...
!
interface Ethernet0
bandwidth 10000
ip address 192.168.14.254 255.255.255.0
media-type 10BaseT
random-detect
traffic-shape rate 8000 8000 8000 1000
!
interface Ethernet1
ip address 192.168.7.254 255.255.255.0
rate-limit output 8000 1500 2000 conform-action transmit exceed-action drop
media-type 10BaseT
random-detect
!...
scar# show traffic-shape ethernet 0
Interface Et0
Access Target Byte Sustain Excess Interval Increment Adapt
VC List Rate Limit bits/int bits/int (ms) (bytes) Active
- 8000 2000 8000 8000 1000 1000 -
scar# show traffic-shape statistics ethernet 0
Access Queue Packets Bytes Packets Bytes Shaping
I/F List Depth Delayed Delayed Active
Et0 0 351 32902 0 0 no
scar# show int ethernet 1 rate-limit
Ethernet1
Output
matches: all traffic
params: 8000 bps, 1500 limit, 2000 extended limit
conformed 340 packets, 33320 bytes; action: transmit
exceeded 0 packets, 0 bytes; action: drop
last packet: 940ms ago, current burst: 0 bytes
last cleared 00:12:30 ago, conformed 0 bps, exceeded 0 bps
scar# ping 192.168.7.7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.7.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 108/112/116 ms
dummynet works in symbiosis with ipfw by "intercepting packets and passing them through one or more objects called queues and pipes, which simulate the effects of bandwidth limitations, propagation delays, bounded-size queues, packet losses, and multipath."[1]
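The round-trip times in Example 13-20 can be checked with simple arithmetic. The Python sketch below assumes that the pipe charges each packet for its IP length, that only the direction leaving castor through ed0 is shaped (per the `out xmit ed0` rule), and that the rest of the path adds negligible delay. The function name is illustrative.

```python
def pipe_delay_ms(ip_bytes, bw_bps, delay_ms):
    """Serialization time at the pipe bandwidth plus the configured delay."""
    return ip_bytes * 8 * 1000 / bw_bps + delay_ms


# FreeBSD ping: 56 data bytes + 8 ICMP header + 20 IP header = 84 bytes
print(pipe_delay_ms(56 + 8 + 20, 8000, 10))  # 94.0

# Cisco ping: 100-byte echoes, so the replies leaving castor are also 100 bytes
print(pipe_delay_ms(100, 8000, 10))          # 110.0
```

Both results are close to what the example shows: about 94 to 95 ms from castor and an average of 112 ms from the router.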
Pipes represent fixed-bandwidth channels that can contain one or multiple queues. Think of pipes and queues as analogous to ATM virtual paths (VPs) and virtual channels (VCs), respectively. You can control the utilization and, as a consequence, the proportional bandwidth share of a pipe by assigning a weight to each queue.
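How queue weights translate into bandwidth shares can be sketched as below. This is a simplified model in which every listed queue is continuously backlogged. Idle queues take no share, so real shares vary with load. The queue names and weights are hypothetical.

```python
def queue_shares(pipe_bps, weights):
    """Split a pipe's bandwidth among backlogged queues in proportion to weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one queue needs a positive weight")
    return {name: pipe_bps * w / total for name, w in weights.items()}


# Two hypothetical queues sharing an 8 Kbit/s pipe with weights 1 and 3
print(queue_shares(8000, {"q1": 1, "q2": 3}))  # {'q1': 2000.0, 'q2': 6000.0}
```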
Linux Firewall Marking and iproute2 (ip/tc)
netfilter and iptables are elements of the firewalling, NAT/NAPT, policy-routing, and packet-mangling architecture for the 2.4.x and 2.6.x Linux kernels. netfilter and iptables(8) allow a packet to be marked (tagged) with a number via the --set-mark option of the MARK target. This mark has local significance only: it is packet metadata kept by the kernel, it does not alter the IP header, and it does not survive forwarding to another host. The mark can be used to select a different routing table (policy routing), as demonstrated in Example 13-21.
Example 13-21. Policy Routing Based on iptables Markings
[root@callisto:~#] iptables -A PREROUTING -i eth0 -t mangle -p tcp --dport 25 -j MARK --set-mark 1
[root@callisto:~#] echo 1 lab >> /etc/iproute2/rt_tables
[root@callisto:~#] ip rule add fwmark 1 table lab
[root@callisto:~#] ip rule list
0: from all lookup local
32764: from all fwmark 1 lookup lab
32766: from all lookup main
32767: from all lookup default
[root@callisto:~#] ip route add default via 192.168.14.254 table lab
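The interplay of mark and rule lookup in Example 13-21 can be modeled in a few lines of Python. This is a deliberately simplified sketch, not the kernel's algorithm: it ignores fwmark masks and other selectors, and it only lists the order in which routing tables would be consulted, whereas the kernel stops at the first table that yields a matching route. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    priority: int
    table: str
    fwmark: Optional[int] = None  # None means "from all" with no mark selector


# The rule list shown by `ip rule list` in the example
RULES = [
    Rule(0, "local"),
    Rule(32764, "lab", fwmark=1),
    Rule(32766, "main"),
    Rule(32767, "default"),
]


def mark_packet(in_iface: str, proto: str, dport: int) -> int:
    """Mimic the mangle rule: TCP port 25 arriving on eth0 gets mark 1."""
    return 1 if (in_iface == "eth0" and proto == "tcp" and dport == 25) else 0


def tables_consulted(rules: List[Rule], fwmark: int) -> List[str]:
    """Return the routing tables tried, in priority order, for a given mark."""
    ordered = sorted(rules, key=lambda r: r.priority)
    return [r.table for r in ordered if r.fwmark is None or r.fwmark == fwmark]


print(tables_consulted(RULES, mark_packet("eth0", "tcp", 25)))
# ['local', 'lab', 'main', 'default']
print(tables_consulted(RULES, mark_packet("eth0", "tcp", 80)))
# ['local', 'main', 'default']
```

In this model, SMTP traffic reaches the lab table, and with it the default route via 192.168.14.254, before the main table is tried, while all other traffic skips it.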
Bell Labs' Eclipse—An Operating System with QoS Support
The Eclipse operating system (http://www.bell-labs.com/project/eclipse/release/) is a QoS test platform from Lucent Technologies. It is an independent OS approach that is compatible with FreeBSD and provides a simple application-programming interface (API) for fine-grained QoS support.