Copyright © 2001 by Oskar Andreasson
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1; with the Invariant Sections being "Introduction" and all sub-sections, with the Front-Cover Texts being "Original Author: Oskar Andreasson", and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
All scripts in this tutorial are covered by the GNU General Public License. The scripts are free source; you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation, version 2 of the License.
These scripts are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License within this tutorial, under the section entitled "GNU General Public License"; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Table of Contents
1.1. About the author
1.3. Why this document was written
1.4. How it was written
1.5. How to read this document
1.6. Terms used in this document
2.1. Where to get iptables
2.2. Kernel setup
2.3. Userland setup
2.3.1. Compiling the userland applications
2.3.2. Installation on Red Hat 7.1
3. Traversing of tables and chains
3.2. Mangle table
3.3. Nat table
3.4. Filter table
4. The state machine
4.2. The conntrack entries
4.3. Userland states
4.4. TCP connections
4.5. UDP connections
4.6. ICMP connections
4.7. Default connections
4.8. Complex protocols and connection tracking
5. How a rule is built
5.4.1. Generic matches
5.4.2. Implicit matches
5.4.3. Explicit matches
5.5.1. ACCEPT target
5.5.2. DROP target
5.5.3. QUEUE target
5.5.4. RETURN target
5.5.5. LOG target
5.5.6. MARK target
5.5.7. REJECT target
5.5.8. TOS target
5.5.9. MIRROR target
5.5.10. SNAT target
5.5.11. DNAT target
5.5.12. MASQUERADE target
5.5.13. REDIRECT target
5.5.14. TTL target
5.5.15. ULOG target
6. rc.firewall file
6.1. example rc.firewall
6.2. explanation of rc.firewall
6.2.1. Configuration options
6.2.2. Initial loading of extra modules
6.2.3. proc set up
6.2.4. Displacement of rules to different chains
6.2.5. Setting up default policies
6.2.6. Setting up user specified chains in the filter table
6.2.7. INPUT chain
6.2.8. FORWARD chain
6.2.9. OUTPUT chain
6.2.10. PREROUTING chain of the nat table
6.2.11. Starting SNAT and the POSTROUTING chain
7. Example scripts
7.1. rc.firewall.txt script structure
7.1.1. The structure
A. Detailed explanations of special commands
A.1. Listing your active ruleset
A.2. Updating and flushing your tables
B. Common problems and questions
B.1. Problems loading modules
B.2. Passive FTP but no DCC
B.3. State NEW packets but no SYN bit set
B.4. Internet Service Providers who use assigned IP addresses
B.5. Letting DHCP requests through a iptables
B.6. mIRC DCC problems
C. ICMP types
D. Other resources and links
G. GNU Free Documentation License
1. APPLICABILITY AND DEFINITIONS
2. VERBATIM COPYING
3. COPYING IN QUANTITY
5. COMBINING DOCUMENTS
6. COLLECTIONS OF DOCUMENTS
7. AGGREGATION WITH INDEPENDENT WORKS
10. FUTURE REVISIONS OF THIS LICENSE
How to use this License for your documents
H. GNU General Public License
1. TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
2. How to Apply These Terms to Your New Programs
I. Example scripts codebase
I.1. Example rc.firewall script
I.2. Example rc.DMZ.firewall script
I.3. Example rc.UTIN.firewall script
I.4. Example rc.DHCP.firewall script
I.5. Example rc.flush-iptables script
I.6. Example rc.test-iptables script
List of Tables
3-1. Forwarded packets
3-2. Destination local host (our own machine)
3-3. Source local host (our own machine)
4-1. Userland states
4-2. Internal states
5-4. Generic matches
5-5. TCP matches
5-6. UDP matches
5-7. ICMP matches
5-8. MAC match options
5-9. Limit match options
5-10. Multiport match options
5-11. Mark match options
5-12. Owner match options
5-13. State matches
5-14. TOS matches
5-15. TTL matches
5-16. LOG target options
5-17. MARK target options
5-18. REJECT target
5-19. TOS target
5-20. SNAT target
5-21. DNAT target
5-22. MASQUERADE target
5-23. REDIRECT target
5-24. TTL target
5-25. ULOG target
C-1. ICMP types
Chapter 1. Introduction
1.1. About the author
I am someone with too many old computers on his hands. I have my own LAN and want all my machines to be connected to the Internet, whilst at the same time making my LAN fairly secure. The new iptables is a good upgrade from the old ipchains in this regard. With ipchains, you could make a fairly secure network by dropping all incoming packages not destined for given ports. However, things like passive FTP or outgoing DCC in IRC would cause problems. They assigns ports on the server, tell the client about it, and then let the client connect. There were some toothing problems in the iptables code that I ran into in the beginning, and in some respects I found the code not quite ready for release in full production. Today, I'd recommend everyone who uses ipchains or even older ipfwadm etc .,to upgrade - unless they are happy with what their current code is capable of and if it does what they need it to.
First of all I would like to dedicate this document to my wonderful girlfriend Ninel. She has supported me more than I ever can support her to any degree. I wish I could make you just as happy as you make me.
Second of all, I would like to dedicate this work to all of the incredibly hard working Linux developers and maintainers. It is people like those who makes this wonderful operating system possible.
1.3. Why this document was written
Well, I found a big empty space in the HOWTO's out there lacking in information about the iptables and Netfilter functions in the new Linux 2.4.x kernels. Among other things, I'm going to try to answer questions that some might have about the new possibilities like state matching. Is it possible to allow passive FTP to your server, but not allow outgoing DCC from IRC as an example? I will illustrate this with an example rc.firewall.txt file that you can use in your /etc/rc.d/ scripts. Yes, this file was originally based upon the masquerading HOWTO for those of you who recognize it.
Also, there's a small script that I wrote just in case you screw up as much as I did during the configuration available as rc.flush-iptables.txt.
1.4. How it was written
I've consulted Marc Boucher and others from the core Netfilter team. Many heartfelt thanks to them for their work and for their help on this tutorial, that I wrote and maintain for boingworld.com. This document will guide you through the setup process step by step and hopefully help you to understand some more about the iptables package. I will base most of the stuff here on the example rc.firewall file, since I find that example a good way to learn how to use iptables. I have decided to just follow the basic chains and from there go down into each and one of the chains traversed in each due order. That way the tutorial is a little bit harder to follow, though this way is more logical. Whenever you find something that's hard to understand, just come back to this tutorial.
1.5. How to read this document
This document was written purely so people can start to grasp the wonderful world of iptables. It was never meant to contain information on specific security bugs in iptables or netfilter. If you find peciuliar bugs or behaviours in iptables or any of the subcomponents, you should contact the netfilter mailing lists and tell them about the problem and they can tell you if this is a real bug or if it has already been fixed. There are very seldomly actual security related bugs found in iptables or netfilter, however, one or two do slip by once in a while. These are properly shown on the front page of the netfilter main page, and that is where you should go to get information on such topics.
The above also implies that the rulesets available with this tutorial are not written to deal with actual bugs inside netfilter. The main goal of them is to simply show how to set up rules in a nice simple fashion that deals with all problems we may run into. For example, this tutorial will not cover how we would close down the HTTP port for the simple reason that Apache happens to be vulnerable in version 1.2.12 (This is covered really, though not for that reason).
This document was simply written to give everyone a good and simple primer at how to get started with iptables, but at the same time it was created to be as complete as possible. It does not contain any targets or matches that are in patch-o-matic for the simple reason that it would require too much effort to keep such a list updated. If you need information about the patch-o-matic updates, you should read the info that comes with it in patch-o-matic as well as the other documentations available on the netfilter main page.
1.6. Terms used in this document
This document contains a few terms that may need more detailed explanations before you read them. This section will try to cover the most obvious ones and how I have chosen to use them within this document.
Stream - This term refers to a connection that sends and receives packets that are related to eachother in some fashion. Basically, I have used this term for any kind of connection that sends 2 packets or more in both directions. In TCP this may mean a connection that sends a SYN and then replies with an SYN/ACK, but it may also mean a connection that sends a SYN and then replies with a ICMP Host unreachable. In other words, I use this term very loosely.
State - This term refers to which state the packet is in, either according to RFC 793 - Transmission Control Protocol or according to userside states used in netfilter/iptables.
Chapter 2. Preparations
This chapter is aimed at getting you started and to help you understand the role Netfilter and iptables play in Linux today. This chapter should hopefully get you set up and finished to go with your experimentation, and installation of your firewall. Given time and perseverance, you'll then get it to perform exactly as you want it to.
2.1. Where to get iptables
The iptables userspace package can be downloaded from the Netfilter homepage. The iptables package also makes use of kernel space facilities which can be configured into the kernel during make configure. The necessary steps will be discussed a bit further down in this document.
2.2. Kernel setup
To run the pure basics of iptables you need to configure the following options into the kernel while doing make config or one of its related commands:
CONFIG_PACKET - This option allows applications and utilities that needs to work directly to various network devices. Examples of such utilities are tcpdump or snort.
CONFIG_NETFILTER - This option is required if you're going to use your computer as a firewall or gateway to the internet. In other words, this is most definitely required if for anything in this tutorial to work at all. I assume you will want this, since you are reading this.
And of course you need to add the proper drivers for your interfaces to work properly, i.e. Ethernet adapter, PPP and SLIP interfaces. The above will only add some of the pure basics in iptables. You won't be able to do anything productive to be honest, it just adds the framework to the kernel. If you want to use the more advanced options in IPTables, you need to set up the proper configuration options in your kernel. Here we will show you the options available in a basic 2.4.9 kernel and a brief explanation :
CONFIG_IP_NF_CONNTRACK - This module is needed to make connection tracking. Connection tracking is used by, amongst other things, NAT and Masquerading. If you need to firewall machines on a LAN you most definitely should mark this option. For example, this module is required by the rc.firewall.txt script to work.
CONFIG_IP_NF_FTP - This module is required if you want to do connection tracking on FTP connections. Since FTP connections are quite hard to do connection tracking on in normal cases, conntrack needs a so called helper, this option compiles the helper. If you do not add this module you won't be able to FTP through a firewall or gateway properly.
CONFIG_IP_NF_IPTABLES - This option is required if you want do any kind of filtering, masquerading or NAT. It adds the whole iptables identification framework to the kernel. Without this you won't be able to do anything at all with iptables.
CONFIG_IP_NF_MATCH_LIMIT - This module isn't exactly required but it's used in the example rc.firewall.txt. This option provides the LIMIT match, that adds the possibility to control how many packets per minute that are to be matched, governed by an appropriate rule. For example, -m limit --limit 3/minute would match a maximum of 3 packets per minute. This module can also be used to avoid certain Denial of Service attacks.
CONFIG_IP_NF_MATCH_MAC - This allows us to match packets based on MAC addresses. Every Ethernet adapter has its own MAC address. We could for instance block packets based on what MAC address is used and block a certain computer pretty well since the MAC address very seldom change. We don't use this option in the rc.firewall.txt example or anywhere else.
CONFIG_IP_NF_MATCH_MARK - This allows us to use a MARK match. For example, if we use the target MARK we could mark a packet and then depending on if this packet is marked further on in the table, we can match based on this mark. This option is the actual match MARK, and further down we will describe the actual target MARK.
CONFIG_IP_NF_MATCH_MULTIPORT - This module allows us to match packets with a whole range of destination ports or source ports. Normally this wouldn't be possible, but with this match it is.
CONFIG_IP_NF_MATCH_TOS - With this match we can match packets based on their TOS field. TOS stands for Type Of Service. TOS can also be set by certain rules in the mangle table and via the ip/tc commands.
CONFIG_IP_NF_MATCH_TCPMSS - This option adds the possibility for us to match TCP packets based on their MSS field.
CONFIG_IP_NF_MATCH_STATE - This is one of the biggest news in comparison to ipchains. With this module we can do stateful matching on packets. For example, if we have already seen trafic in two directions in a TCP connection, this packet will be counted as ESTABLISHED. This module is used extensively in the rc.firewall.txt example.
CONFIG_IP_NF_MATCH_UNCLEAN - This module will add the possibility for us to match IP, TCP, UDP and ICMP packets that don't conform to type or are invalid. We could for example drop these packets, but we never know if they are legitimate or not. Note that this match is still experimental and might not work perfectly in all cases.
CONFIG_IP_NF_MATCH_OWNER - This option will add the possibility for us to do matching based on the owner of a socket. For example, we can allow only the user root to have Internet access. This module was originally just written as an example on what could be done with the new iptables. Note that this match is still experimental and might not work for everyone.
CONFIG_IP_NF_FILTER - This module will add the basic filter table which will enable you to do IP filtering at all. In the filter table you'll find the INPUT, FORWARD and OUTPUT chains. This module is required if you plan to do any kind of filtering on packets that you receive and send.
CONFIG_IP_NF_TARGET_REJECT - This target allows us to specify that an ICMP error message should be sent in reply to incoming packets, instead of plainly dropping them dead to the floor. Keep in mind that TCP connections, as opposed to ICMP and UDP, are always reset or refused with a TCP RST packet.
CONFIG_IP_NF_TARGET_MIRROR - This allows packets to be bounced back to the sender of the packet. For example, if we set up a MIRROR target on destination port HTTP on our INPUT chain and someone tries to access this port, we would bounce his packets back to him and finally he would probably see his own homepage.
CONFIG_IP_NF_NAT - This module allows network address translation, or NAT, in its different forms. This option gives us access to the nat table in iptables. This option is required if we want to do port forwarding, masquerading, etc. Note that this option is not required for firewalling and masquerading of a LAN, but you should have it present unless you are able to provide unique IP addresses for all hosts. Hence, this option is required for the example rc.firewall.txt script to work properly, and most definitely on your network if you do not have the ability to add unique IP addresses as specified above.
CONFIG_IP_NF_TARGET_MASQUERADE - This module adds the MASQUERADE target. For instance if we don't know what IP we have to the Internet this would be the preferred way of getting the IP instead of using DNAT or SNAT. In other words, if we use DHCP, PPP, SLIP or some other connection that assigns us an IP, we need to use this target instead of SNAT. Masquerading gives a slightly higher load on the computer than NAT, but will work without us knowing the IP address in advance.
CONFIG_IP_NF_TARGET_REDIRECT - This target is useful together with application proxies, for example. Instead of letting a packet pass right through, we remap them to go to our local box instead. In other words, we have the possibility to make a transparent proxy this way.
CONFIG_IP_NF_TARGET_LOG - This adds the LOG target and its functionality to iptables. We can use this module to log certain packets to syslogd and hence see what is happening to the packet. This is invaluable for security audits, forensics or debugging a script you are writing.
CONFIG_IP_NF_TARGET_TCPMSS - This option can be used to counter Internet Service Providers and servers who block ICMP Fragmentation Needed packets. This can result in webpages not getting through, small mails getting through while larger mails don't, ssh works but scp dies after handshake, etc. We can then use the TCPMSS target to overcome this by clamping our MSS (Maximum Segment Size) to the PMTU (Path Maximum Transmit Unit). This way, we'll be able to handle what the authors of netfilter them selves call "criminally braindead ISPs or servers" in the kernel configuration help.
CONFIG_IP_NF_COMPAT_IPCHAINS - Adds a compatibility mode with the obsolescent ipchains. Do not look to this as any real long term solution for solving migration from Linux 2.2 kernels to 2.4 kernels, since it may well be gone with kernel 2.6.
CONFIG_IP_NF_COMPAT_IPFWADM - Compatibility mode with obsolescent ipfwadm. Definitely don't look to this as a real long term solution.
As you can see, there is a heap of options. I have briefly explained here what kind of extra behaviours you can expect from each module. These are only the options available in a vanilla Linux 2.4.9 kernel. If you would like to take a look at more options, I suggest you look at the patch-o-matic functions in Netfilter userland which will add heaps of other options in the kernel. POM fixes are additions that are supposed to be added in the kernel in the future but has not quite reached the kernel yet. These functions should be added in the future, but has not quite made it in yet. This may be for various reasons - such as the patch not being stable yet, to Linus Torvalds being unable to keep up, or not wanting to let the patch in to the mainstream kernel yet since it is still experimental.
You will need the following options compiled into your kernel, or as modules, for the rc.firewall.txt script to work. If you need help with the options that the other scripts need, look at the example firewall scripts section.
At the very least the above will be required for the rc.firewall.txt script. In the other example scripts I will explain what requirements they have in their respective sections. For now, let's try to stay focused on the main script which you should be studying now.
2.3. Userland setup
First of all, let's look at how we compile the iptables package. It's important to realize that for most part configuration and compilation of iptables goes hand in hand with the kernel configuration and compilation. Certain distributions comes with the iptables package preinstalled, one of these are Red Hat. However, in Red Hat it is disabled per default. We will check closer on how to enable it and take a look at other distributions further on in this chapter.
2.3.1. Compiling the userland applications
First of all unpack the iptables package. Here, we have used the iptables 1.2.6a package and a vanilla 2.4 kernel. Unpack as usual, using bzip2 -cd iptables-1.2.6a.tar.bz2 | tar -xvf - (this can also be accomplished with the tar -xjvf iptables-1.2.6a.tar.bz2, which should do pretty much the same as the first command. However, this may not work with older versions of tar). The package should now be unpacked properly into a directory named iptables-1.2.6a. For more information read the iptables-1.2.6a/INSTALL file which contains pretty good information on compiling and getting the program to run.
After this, there you have the option of configuring and installing extra modules and options etcetera for the kernel.The step described here will only check and install standard patches that are pending for inclusion to the kernel, there are some even more experimental patches further along, which may only be available when you carry out other steps.
Some of these patches are highly experimental and may not be such a good idea to install them. However, there are heaps of extremely interesting matches and targets in this installation step so don't be afraid of at least looking at them.
To carry out this step we do something like this from the root of the iptables package:
make pending-patches KERNEL_DIR=/usr/src/linux/
The variable KERNEL_DIR should point to the actual place that your kernel source is located at. Normally this should be /usr/src/linux/ but this may vary, and most probably you will know yourself where the kernel source is available.
This only asks about certain patches that are just about to enter the kernel anyway. There might be more patches and additions that the developers of Netfilter are about to add to the kernel, but is a bit further away from actually getting there. One way to install these are by doing the following:
make most-of-pom KERNEL_DIR=/usr/src/linux/
The above command would ask about installing parts of what in Netfilter world is called patch-o-matic, but still skip the most extreme patches that might cause havoc in your kernel. Note that we say ask, because that's what these commands actually do. They ask you before anything is changed in the kernel source. To be able to install all of the patch-o-matic stuff you will need to run the following command:
make patch-o-matic KERNEL_DIR=/usr/src/linux/
Don't forget to read the help for each patch thoroughly before doing anything. Some patches will destroy other patches while others may destroy your kernel if used together with some patches from patch-o-matic etc.
You may totally ignore the above steps if you don't want to patch your kernel, it is in other words not necessary to do the above. However, there are some really interesting things in the patch-o-matic that you may want to look at so there's nothing bad in just running the commands and see what they contain.
After this you are finished doing the patch-o-matic parts of installation, you may now compile a new kernel making use of the new patches that you have added to the source. Don't forget to configure the kernel again since the new patches probably are not added to the configured options. You may wait with the kernel compilation until after the compilation of the userland program iptables if you feel like it, though.
Continue by compiling the iptables userland application. To compile iptables you issue a simple command that looks like this:
The userland application should now compile properly. If not, you are on your own, or you could subscribe to the netfilter mailing list, where you have the chance of asking for help you with your problems. There are a few things that might go wrong with the installation of iptables, so don't panic if it won't work. Try to think logically about it and find out what's wrong, or get someone to help you.
If everything has worked smoothly, you're ready to install the binaries by now. To do this, you would issue the following command to install them:
make install KERNEL_DIR=/usr/src/linux/
Hopefully everything should work in the program now. To use any of the changes in the iptables userland applications you should now recompile and reinstall your kernel and modules, if you hadn't done so before. For more information about installing the userland applications from source, check the INSTALL file in the source which contains excellent information on the subject of installation.
2.3.2. Installation on Red Hat 7.1
Red Hat 7.1 comes preinstalled with a 2.4.x kernel that has Netfilter and iptables compiled in. It also contains all the basic userland programs and configuration files that is needed to run it. However, the Red Hat people have disabled the whole thing by using the backwards compatible ipchains module. Annoying to say the least, and a lot of people keep asking different mailing lists why iptables don't work. So, let's take a brief look at how to turn the ipchains module off and how to install iptables instead.
The default Red Hat 7.1 installation today comes with an hopelessly old version of the userspace applications, so you might want to compile a new version of the applications as well as install a new and custom compiled kernel before fully exploiting iptables.
First of all you will need to turn off the ipchains modules so it won't start in the future. To do this, you will need to change some filenames in the /etc/rc.d/ directory-structure. The following command should do it:
chkconfig --level 0123456 ipchains off
By doing this we move all the soft links that points to the /etc/rc.d/init.d/ipchains script to K92ipchains. The first letter which per default would be S, tells the initscripts to start the script. By changing this to K we tell it to Kill the service instead, or to not run it if it was not previously started. Now the service won't be started in the future.
However, to stop the service from actually running right now we need to run another command. This is the service command which can be used to work on currently running services. We would then issue the following command to stop the ipchains service:
service ipchains stop
Finally, to start the iptables service. First of all, we need to know which runlevels we want it to run in. Normally this would be in runlevel 2, 3 and 5. These runlevels are used for the following things:
2. Multiuser without NFS or the same as 3 if there is no networking.
3. Full multiuser mode, i.e. the normal runlevel to run in.
5. X11. This is used if you automatically boot into Xwindows.
To make iptables run in these runlevels we would do the following commands:
chkconfig --level 235 iptables on
The above commands would in other words make the iptables service run in runlevel 2, 3 and 5. If you'd like the iptables service to run in some other runlevel you would have to issue the same command in those. However, none of the other runlevels should be used, so you should not really need to activate it for those runlevels. Level 1 is for single user mode, i.e, when you need to fix a screwed up box. Level 4 should be unused, and level 6 is for shutting the computer down.
To activate the iptables service, we just run the following command:
service iptables start
There are no rules in the iptables script. To add rules to an Red Hat 7.1 box, there is two common ways. Firstly, you could edit the /etc/rc.d/init.d/iptables script. This would have the undesired effect of deleting all the rules if you updated the iptables package by RPM. The other way would be to load the ruleset and then save it with the iptables-save command and then have it loaded automatically by the rc.d scripts.
First we will describe the how to set up iptables by cutting and pasting to the iptables init.d script. To add rules that are to be run when the computer starts the service, you add them under the start) section, or in the start() function. Note, if you add the rules under the start) section don't forget to stop the start() function in the start) section from running. Also, don't forget to edit a the stop) section either which tells the script what to do when the computer is going down for example, or when we are entering a runlevel that doesn't require iptables. Also, don't forget to check out the restart section and condrestart. Note that all this work will probably be trashed if you have, for example, Red Hat Network automatically update your packages. It may also be trashed by updating from the iptables RPM package.
The second way of doing the set up would require the following: First of all, make and write a ruleset in a file, or directly with iptables, that will meet your requirements, and don't forget to experiment a bit. When you find a set up that works without problems, or as you can see without bugs, use the iptables-save command. You could either use it normally, i.e. iptables-save > /etc/sysconfig/iptables, which would save the ruleset to the file /etc/sysconfig/iptables. This file is automatically used by the iptables rc.d script to restore the ruleset in the future. The other way is to save the script by doing service iptables save, which would save the script automatically to /etc/sysconfig/iptables. The next time you reboot the computer, the iptables rc.d script will use the command iptables-restore to restore the ruleset from the save-file /etc/sysconfig/iptables. Do not intermix these two methods, since they may heavily damage each other and render your firewall configuration useless.
When all of these steps are finished, you can deinstall the currently installed ipchains and iptables packages. This because we don't want the system to mix up the new iptables userland application with the old preinstalled iptables applications. This step is only necessary if you are going to install iptables from the source package. It's not unusual that the new and the old package to get mixed up, since the rpm based installation installs the package in non-standard places and won't get overwritten by the installation for the new iptables package. To carry out the deinstallation, do as follows:
rpm -e iptables
And why keep ipchains lying around if you won't be using it any more? Removing it is done the same way as with the old iptables binaries, etc:
rpm -e ipchains
After all this has been completed, you will have finished with the update of the iptables package from source, having followed the source installation instructions. None of the old binaries, libraries or include files etc should be lying around any more.
Chapter 3. Traversing of tables and chains
In this chapter we'll discuss how packets traverse the different chains, and in which order. We will also discuss the order in which the tables are traversed. We'll see how valuable this is later on, when we write our own specific rules. We will also look at the points which certain other components, that also are kernel dependant, enter into the picture. Which is to say the different routing decisions and so on. This is especially necessary if we want to write iptables rules that could change routing patterns/rules for packets; i.e. why and how the packets get routed, good examples of this is DNAT and SNAT. Not to be forgotten are, of course, the TOS bits.
When a packet first enters the firewall, it hits the hardware and then get's passed on to the proper device driver in the kernel. Then the packet starts to go through a series of steps in the kernel, before it is either sent to the correct application (locally), or forwarded to another host - or whatever happens to it. In this example, we're assuming that the packet is destined for another host on another network. The packet goes through the different steps in the following fashion:
Table 3-1. Forwarded packets
Step Table Chain Comment
1 On the wire (i.e., internet)
2 Comes in on the interface (i.e., eth0)
3 mangle PREROUTING This chain is normally used for mangling packets, i.e., changing TOS and so on.
4 nat PREROUTING This chain is used for Destination Network Address Translation mainly. Source Network Address Translation is done further on. Avoid filtering in this chain since it will be bypassed in certain cases.
5 Routing decision, i.e., is the packet destined for our localhost or to be forwarded and where.
6 filter FORWARD The packet gets routed onto the FORWARD chain. Only forwarded packets go through here, and here we do all the filtering. Note that all traffic that's forwarded goes through here (not only in one direction), so you need to think about it when writing your ruleset.
7 nat POSTROUTING This chain should first and foremost be used for Source Network Address Translation. Avoid doing filtering here, since certain packets might pass this chain without ever hitting it. This is also where Masquerading is done.
8 Goes out on the outgoing interface (i.e., eth1).
9 Out on the wire again (i.e., LAN).
As you can see, there are quite a lot of steps to pass through. The packet can be stopped at any of the iptables chains, or anywhere else if it is malformed; however, we are mainly interested in the iptables aspect of this lot. Do note that there are no specific chains or tables for different interfaces or anything like that. FORWARD is always passed by all packets that are forwarded over this firewall/router. Do not use the INPUT chain to filter on in the previous scenario! INPUT is meant solely for packets to our local host that do not get routed to any other destination.
Now, let us have a look at a packet that is destined for our own localhost. It would pass through the following steps before actually being delivered to our application that receives it:
Table 3-2. Destination local host (our own machine)
Step Table Chain Comment
1 On the wire (e.g., Internet)
2 Comes in on the interface (e.g., eth0)
3 mangle PREROUTING This chain is normally used for mangling packets, i.e., changing TOS and so on.
4 nat PREROUTING This chain is used for Destination Network Address Translation mainly. Avoid filtering in this chain since it will be bypassed in certain cases.
5 Routing decision, i.e., is the packet destined for our local host or to be forwarded and where.
6 filter INPUT This is where we do filtering for all incoming traffic destined for our localhost. Note that all incoming packets destined for this host pass through this chain, no matter what interface or in which direction they came from.
7 Local process/application (i.e., server/client program)
Note that this time the packet was passed through the INPUT chain instead of the FORWARD chain. Quite logical. Most probably the only thing that's really logical about the traversing of tables and chains in your eyes in the beginning, but if you continue to think about it, you'll find it will get clearer in time.
Finally we look at the outgoing packets from our own local host and what steps they go through.
Table 3-3. Source local host (our own machine)
Step Table Chain Comment
1 Local process/application (i.e., server/client program)
2 Mangle OUTPUT This is where we mangle packets, it is suggested that you do not filter in this chain since it can have sideeffects.
3 Nat OUTPUT This is currently broken, could someone tell me when this will be fixed? Please?
4 Filter OUTPUT This is where we filter packets going out from the local host.
5 Routing decision. This is where we decide where the packet should go.
6 Nat POSTROUTING This is where we do Source Network Address Translation as described earlier. It is suggested that you don't do filtering here since it can have sideeffects, and certain packets might slip through even though you set a default policy of DROP.
7 Goes out on some interface (e.g., eth0)
8 On the wire (e.g., Internet)
We have now seen how the different chains are traversed in three separate scenarios. If we were to figure out a good map of all this, it would look something like this:
Hopefully you've got a clearer picture of how the packets traverses the built in chains by now. All comments are welcome, the details might still be wrong or they might change in the future. If you feel that you want more information, you could use the rc.test-iptables.txt script. This test script should give you the necessary rules to test how the tables and chains are traversed.
3.2. Mangle table
This table should as we've already noted mainly be used for mangling packets. In other words, you may freely use the mangle matches etc that could be used to change TOS (Type Of Service) fields and so on.
You are strongly adviced not to use this table for any filtering; nor will any DNAT, SNAT or Masquerading work in this table.
Targets that are only valid in the mangle table:
The TOS target is used to set and/or change the Type of Service field in the packet. This could be used for setting up policies on the network regarding how a packet should be routed and so on. Note that this has not been perfected and is not really implemented on the internet and most of the routers don't care about the value in this field, and sometimes, they act faulty on what they get. Don't set this in other words for packets going to the Internet unless you want to make routing decisions on it, with iproute2.
The TTL target is used to change the TTL (Time To Live) field of the packet. We could tell packets to only have a specific TTL and so on. One good reason for this could be that we don't want to give ourself away to nosy Internet Service Providers. Some Internet Service Providers do not like users running multiple computers on one single connection, and there are some Internet Service Providers known to look for a single host generating different TTL values, and take this as one of many signs of multiple computers connected to a single connection.
The MARK target is used to set special mark values to the packet. These marks could then be recognized by the iproute2 programs to do different routing on the packet depending on what mark they have, or if they don't have any. We could also do bandwidth limiting and Class Based Queuing based on these marks.
3.3. Nat table
This table should only be used for NAT (Network Address Translation) on different packets. In other words, it should only be used to translate the packet's source field or destination field. Note that, as we have said before, only the first packet in a stream will hit this chain. After this, the rest of the packets will automatically have the same action taken on them as the first packet. The actual targets that do these kind of things are:
The DNAT (Destination Network Address Translation) target is mainly used in cases where you a public IP and want to redirect accesses to the firewall to some other host (on a DMZ for example). In other words, we change the destination address of the packet and reroute it to the host.
SNAT (Source Network Address Translation) is mainly used for changing the source address of packets. For the most part you'll hide your local networks or DMZ, etc. A very good example would be that of a firewall of which we know outside IP address, but need to substitute our local network's IP numbers whit that of our firewall. With this target the firewall will automatically SNAT and De-SNAT the packets, hence making it possible to make connections from the LAN to the Internet. If youre network uses 192.168.0.0/netmask for example, the packets would never get back from the Internet, because IANA has regulated these networks (amongst others) as private and only for use in isolated LANs.
The MASQUERADE target is used in exactly the same way as SNAT, but the MASQUERADE target takes a little bit more overhead to compute. The reason for this, is that each time that the MASQUERADE target gets hit by a packet, it automatically checks for the IP address to use, instead of doing as the SNAT target does - just using the single configured IP address. The MASQUERADE target makes it possible to work properly with Dynamic DHCP IP addresses that your ISP might provide for your PPP, PPPoE or SLIP connections to the internet.
3.4. Filter table
The filter table is, of course, mainly used for filtering packets. We can match packets and filter them in whatever way we want. There is nothing special to this chain or to pacets that might slip through because they are malformed, etc. This is the place that we actually take action against packets and look at what they contain and DROP or /ACCEPT them, depending on their payload. Of course we may also do prior filtering; however, this particular table, is the place for wich filtering was designed. Almost all targets are usble in this chain; however do keep in mind that the targets discusses previously earlier too, however, this is the place that was designed for it. Almost all targets are usable in this chain, however, the targets discussed previously in this chapter can only be used in their respective tables. We will be more prolific about the filter table here; however you now know that this table is the right place to do your main filtering.
Chapter 4. The state machine
This chapter will deal with the state machine and explain it in detail. After reading trough it, you should have a complete understanding of how the State machine works. We will also go through a large set of examples on how states are dealt within the state machine itself. These should clarifly everything in practice.
The state machine is a special part within iptables that should really not be called the state machine at all, since it is really a connection tracking machine. However, most people recognize it under the first name. Throughout this chapter i will use this namnes more or less as if they where synonymous. This sould not be overly confusing. Connections tracking is done to let the netfilter framework know the state of a specific connection. Firewalls that implement this are generally called statful firewalls. A stateful firewall is generally much more secure than non-stateful firewalls since it allows us to write much tighter rulesets.
Whitin iptables, the state of a connection can be divaded in 4 different basic states. These are known as NEW, ESTABLISHED, RELATED and INVALID. We will discuss each of these in more depth later. With the --state match we can specify which states that we want to allow in or out. The conntrack module keeps all states freshly in memory, according to certain rules on when to release a certain state and on specific information. The connection tracking module calculates the state of a specific packet upon 4 basic values in TCP and UDP and then on a few extra values. The basic values used to calculate a state for TCP and UDP streams are: The source IP address, the destination IP address, the source port and, lastly, the destination port. For ICMP packages, other rules apply. The same goes for other subprotocols of the IP protocol.
In previous kernels, we had the possibility to turn on and off defragmentation. However, since iptables and Netfilter were introduced and connection tracking in particular, this option was gotten rid of. The reason for this, is that connection tracking can not work properly without defragmenting packets, and hence defragmenting has been incorporated to conntrack and is carried out automatically. It can not be turned off, except by turning off connection tracking. Defragmentation is always carried out if connection tracking is turned on.
All connection tracking is handled in the PREROUTING chain. What this means, is that iptables will do all recalculation of states and so on, within the PREROUTING chain. If we send the initial packet in a stream, this is where the state gets set to NEW, and when we receive a return packet, this is where the state is changed to ESTABLISHED, and so on.
4.2. The conntrack entries
Let's take a brief look at a conntrack entry and how to read them in /proc/net/ip_conntrack. This gives a list of all the current entries in your conntrack database. If you have the ip_conntrack module loaded, a cat of /proc/net/ip_conntrack might look like:
tcp 6 117 SYN_SENT src=192.168.1.6 dst=192.168.1.9 sport=32775 dport=22 [UNREPLIED] src=192.168.1.9 dst=192.168.1.6 sport=22 dport=32775 use=2
This example contains all the information that the conntrack module maintains to know which state a specific connection is in. First of all, we have a protocol, which in this case is tcp. Next, the same value in normal decimal coding. After this, we see how long this conntrack entry has to live. This value is set to 117 seconds right now and is decremented regularly until we see more traffic. This value is then reset to the default value for the specific state that it is in at that relevant point of time. Next comes the actual state that this entry is in at the present point of time. In the above mentioned case we are looking at a packet that is in the SYN_SENT state. The internal value of a connection is slightly different from the ones used externally with iptables. The value SYN_SENT tells us that we are looking at a connection that has only seen a TCP SYN packet in one direction. Next, we see the source IP address, destination IP address, source port and destination port. At this point we see a specific keyword that tells us that we have seen no return traffic for this connection. Lastly, we see what we expect of return packets. The information details the source IP address and destination IP address (which are both inverted, since the packet is to be directed back to us). The same thing goes for the source port and destination port of the connection. These are the values that should be of any interest to us.
The connection tracking entries may take on a series of different values, all specified in the conntrack headers available in linux/include/netfilter-ipv4/ip_conntrack*.h files. These values are dependant on which subprotocol of IP we use. TCP, UDP or ICMP protocols take specific default values as specified in linux/include/netfilter-ipv4/ip_conntrack.h. We will look closer at this when we look at each of the protocols; however, we will not use them extensively through this chapter, since they are not used outside of the conntrack internals. Also, depending on how this state changes, the default value of the time until the connection is destroyed will also change.
Recently there was a new patch made available in iptables patch-o-matic, called tcp-window-tracking. This patch adds, among other things, all of the above timeouts to special sysctl variables, which means that they can be changed on the fly, while the system is still running. Hence, this makes it unnecessary to recompile the kernel every time you want to change the timeouts.
These can be altered via using specific system calls available in the /proc/sys/net/ipv4/netfilter directory. You should in particular look at the /proc/sys/net/ipv4/netfilter/ip_ct_* variables.
When a connection has seen traffic in both directions, the conntrack entry will erase the [UNREPLIED] flag, and then reset it. The entry tells us that the connection has not seen any traffic in both directions, will be replaced by the [ASSURED] flag, to be found close to the end of the entry. The [ASSURED} flag tells us that this connection is assured and that it will not be erased if we reach the maximum possible tracked connections. Thus, connections marked as [ASSURED] will not be erased, contrary to the non assured connections (those not marked as [ASSURED]). How many connections that the connection tracking table can hold depends upon a variable that can be set through the ipsysctl functions in recent kernels. The default value held by this entry varies heavily depending on how much memory you have. On 128 MB of RAM you will get 8192 possible entries, and at 256 MB of RAM, you will get 16376 entries. You can read and set your settings through the /proc/sys/net/ipv4/ip_conntrack_max setting.
4.3. Userland states
As you have seen, packets may take on several different states within the kernel itself, depending on what protocol we are talking about. However, outside the kernel, we only have the 4 states as described previously. These states can mainly be used in conjunction with the state match which will then be able to match packets based on their current connection tracking state. The valid states are NEW, ESTABLISHED, RELATED and INVALID states. The following table will briefly explain each possible state.
Table 4-1. Userland states
NEW The NEW state tells us that the packet is new in the connection. This means that the first packet that the conntrack module sees will be matched. For example, if we see a SYN packet and it is the first packet in a connection that we see, it will match. However, the packet may as well not be a SYN packet and still be considered NEW. This may lead to certain problems in some instances, but it may also be extremely helpful when we need to pick up lost connections from other firewalls, or when a connection has already timed out, but in reality is not closed.
ESTABLISHED The ESTABLISHED state has seen traffic in both directions and will then match those packets. ESTABLISHED connections are fairly easy to understand. The only requirement to get into an ESTABLISHED state is that one host sends a packet, and that it later on gets a reply from the other host. The NEW state will upon receipt to the firewall change to the ESTABLISHED state.
RELATED The RELATED state is one of the more tricky states. A connection is considered RELATED when it is related to another already ESTABLISHED connection. What this means, is that for a connection to be considered as RELATED, we must first have a connection that is considered ESTABLISHED. The ESTABLISHED connection will then spawn a connection outside of the main connection. The newly spawned connection will then be considered RELATED, if the conntrack module is able to understand that it is RELATED. ICMP error messages and redirects etc can be considered as RELATED if we have generated a packet that in turn generated the ICMP message. Other good examples of connections that can be considered as RELATED are the FTP-data connections that are considered RELATED to the FTP control port, and the DCC connections issued through IRC. This could be used to allow ICMP replies, FTP transfers and DCC's to work properly through the firewall. Do note that most TCP protocols and some UDP protocols that rely on this mechanism are quite complex and send connection information within the payload of the TCP or UDP data segments, and hence require special helper modules to be correctly understood.
INVALID The INVALID state means that the packet can not be identified or that it does not have any state. This may be due to several reasons, such as the system running out of memory or ICMP error messages that do not respond to any known connections. Generally, it is a good idea to DROP everything in this state.
These states can be used together with the --state match to match packets based on their connection tracking state. This is what makes the state machine so incredibly strong and efficient for our firewall. Previously, we often had to open up all ports above 1024 to let all traffic back into our local networks again. With the state machine in place this is not necessary any longer, since we can now just open up the firewall for return traffic and not for all kinds of other traffic.
4.4. TCP connections
In this section and the upcoming ones, we will take a closer look at the states and how they are handled for each of the three basic protocols TCP, UDP and ICMP. Also, we will take a closer look at how connections are handled per default, if they can not be classified as either of these three protocols. We have chosen to start out with the TCP protocol since it is a stateful protocol in itself, and has a lot of interesting details with regard to the state machine in iptables.
A TCP connection is always initiated with the 3-way handshake, which establishes and negotiates the actual connection over which data will be sent. The whole session is begun with a SYN packet, then a SYN/ACK packet and finally an ACK packet to acknowledge the whole session establishment. At this point the connection is established and able to start sending data. The big problem is, how does connection tracking hook up into this? Quite simply really.
As far as the user is concerned, connection tracking works basically the same for all connection types. Have a look at the picture below to see exactly what state the stream enters during the different stages of the connection. As you can see, the connection tracking code does not really follow the flow of the TCP connection, from the users viewpoint. Once it has seen one packet(the SYN), it considers the connection as NEW. Once it sees the return packet(SYN/ACK), it considers the connection as ESTABLISHED. If you think about this a second, you will understand why. With this particular implementation, you can allow NEW and ESTABLISHED packets to leave your local network, only allow ESTABLISHED connections back, and that will work perfectly. Conversely, if the connection tracking machine werer to consider the whole connection establishment as NEW, we would never really be able to stop outside connections to our local network, since we would have to allow NEW packets back in again. To make things more complicated, there is a number of other internal states that are used for TCP connections inside the kernel, but which are not available for us in Userland. Roughly, they follow the state standards specified within RFC 793 - Transmission Control Protocol at page 21-23. We will consider these in more detail further along in this section.
As you can see, it is really quite simple, seen from the user's point of view. However, looking at the whole construction from the kernel's point of view, it's a little more difficult. Let's look at an example. Consider exactly how the connection states change in the /proc/net/ip_conntrack table. The first state is reported upon receipt of the first SYN packet in a connection.
tcp 6 117 SYN_SENT src=192.168.1.5 dst=192.168.1.35 sport=1031 dport=23 [UNREPLIED] src=192.168.1.35 dst=192.168.1.5 sport=23 dport=1031 use=1
As you can see from the above entry, we have a precise state in which a SYN packet has been sent, (the SYN_SENT flag is set), and to which as yet no reply has been sent (witness the [UNREPLIED] flag). The next internal state will be reached when we see another packet in the other direction.
tcp 6 57 SYN_RECV src=192.168.1.5 dst=192.168.1.35 sport=1031 dport=23 src=192.168.1.35 dst=192.168.1.5 sport=23 dport=1031 use=1
Now we have received a corresponding SYN/ACK in return. As soon as this packet has been received, the state changes once again, this time to SYN_RECV. SYN_RECV tells us that the original SYN was delivered correctly and that the SYN/ACK return packet also got through the firewall properly. Moreover, this connection tracking entry has now seen traffic in both directions and is hence considered as having been replied to. This is not explicit, but rather assumed, as was the [UNREPLIED] flag above. The final step will be reached once we have seen the final ACK in the 3-way handshake.
tcp 6 431999 ESTABLISHED src=192.168.1.5 dst=192.168.1.35 sport=1031 dport=23 src=192.168.1.35 dst=192.168.1.5 sport=23 dport=1031 use=1
In the last example, we have gotten the final ACK in the 3-way handshake and the connection has entered the ESTABLISHED state, as far as the internal mechanisms of iptables are aware. After a few more packets, the connection will also become [ASSURED], as shown in the introduction section of this chapter.
When a TCP connection is closed down, it is done in the following way and takes the following states.
As you can see, the connection is never really closed until the last ACK is sent. Do note that this picture only describes how it is closed down under normal circumstances. A connection may also, for example, be closed by sending a RST(reset), if the connection were to be refused. In this case, the connection would be closed down after a predetermined time.
When the TCP connection has been closed down, the connection enters the TIME_WAIT state, which is per default set to 2 minutes. This is used so that all packets that have gotten out of order can still get through our ruleset, even after the connection has already closed. This is used as a kind of buffer time so that packets that have gotten stuck in one or another congested router can still get to the firewall, or to the other end of the connection.
If the connection is reset by a RST packet, the state is changed to CLOSE. This means that the connection per default have 10 seconds before the whole connection is definitely closed down. RST packets are not acknowledged in any sense, and will break the connection directly. There are also other states than the ones we have told you about so far. Here is the complete list of possible states that a TCP stream may take, and their timeout values.
Table 4-2. Internal states
State Timeout value
NONE 30 minutes
ESTABLISHED 5 days
SYN_SENT 2 minutes
SYN_RECV 60 seconds
FIN_WAIT 2 minutes
TIME_WAIT 2 minutes
CLOSE 10 seconds
CLOSE_WAIT 12 hours
LAST_ACK 30 seconds
LISTEN> 2 minutes
These values are most definitely not absolute. They may change with kernel revisions, and they may also be changed via the proc filesystem in the /proc/sys/net/ipv4/netfilter/ip_ct_tcp_* variables. The default values should, however, be fairly well established in practice. These values are set in jiffies (or 1/100th parts of seconds), so 3000 means 30 seconds.
Also note that the Userland side of the state machine does not look at TCP flags set in the TCP packets. This is generally bad, since you may want to allow packets in the NEW state to get through the firewall, but when you specify the NEW flag, you will in most cases mean SYN packets.
This not what happens with the current state implementation; instead, even a packet with no bit set or an ACK flag, will count as NEW and if you match on NEW packets. This can be used for redundant firewalling and so on, but it is generally extremely bad on your home network, where you only have a single firewall. To get around this behavior, you could use the command explained in the State NEW packets but no SYN bit set section of the Common problems and questions appendix. Another way is to install the tcp-window-tracking extension from patch-o-matic, which will make the firewall able to track states depending on the TCP window settings.
4.5. UDP connections
UDP connections are in themself not stateful connections, but rather stateless. This is so for several reasons, mainly it does not contain any connection establishment or connection closing, but most of all it lacks sequencing. Receiving two UDP datagrams in a specific order does not say anything about which order they where sent in. It is however still possible to set states on the connections within the kernel. Lets have a look at how a connection can be tracked and how it may look in conntrack.
As you can see from this, the connection is brought up almost exactly the same as the TCP connection, from the userland view. Inside conntrack it looks quite a bit different, but they are generally the same. First of all, lets have a look at the entry after the initial UDP packet has been sent.
udp 17 20 src=192.168.1.2 dst=192.168.1.5 sport=137 dport=1025 [UNREPLIED] src=192.168.1.5 dst=192.168.1.2 s
时间:2002-12-06 18:42 来源: 作者:juna2001 原文链接