The last estimates I heard put about 45% of the world’s Internet users behind some kind of firewall or NAT device. With the huge expansion of broadband in the last five years, it’s probably even more than that, since practically every home broadband device has some kind of NAT built in, and of course PCs have their own software firewalls: there’s one built into Windows, and many add-on products also have firewall functionality.

As we move into the VoIP age, this is a concern. A lot of today’s applications can cope with being behind a firewall, and since so many things piggyback on HTTP and TCP, firewalls are now designed with this in mind. VoIP is not so easy. SIP phones and servers use two separate streams of data: the first is the SIP flow used for call control, and the second is the RTP flow that carries the media (audio or video). SIP can use UDP, TCP or TLS for its transport. The convention has always been to use UDP, but companies like Microsoft are increasingly coming round to TCP or TLS, since (among other benefits) these get through NATs and firewalls much better.

The RTP side is more problematic. RTP also runs over UDP, and the UDP ports for the flow are only decided at call setup time and communicated via the SIP negotiation. With a firewall in the path this is a problem, because those ports have to be opened before RTP can flow in, and this is where all the mess starts. One approach is to co-locate a SIP proxy with the firewall; another is to put a SIP-aware ALG in the firewall. The IETF has a working group called MIDCOM, which is working on a way for an external agent to control the ‘middlebox’ (the firewall or NAT) on behalf of SIP flows, without putting the SIP knowledge into the firewall itself.
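To make the port problem concrete, here is a minimal Python sketch that pulls the RTP address and port out of the SDP body that SIP carries during call setup. The function name `rtp_endpoints` and the sample SDP are my own illustration, not taken from any particular SIP stack, and the parsing handles only the `c=` and `m=` lines.

```python
def rtp_endpoints(sdp):
    """Return a list of (media, address, port) tuples found in an SDP body.

    Hypothetical helper for illustration: only c= and m= lines are read.
    """
    session_addr = None
    endpoints = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            # Example: c=IN IP4 10.0.0.5
            addr = line.split()[-1]
            if endpoints:
                # A c= line after an m= line applies to that media stream only.
                media, _, port = endpoints[-1]
                endpoints[-1] = (media, addr, port)
            else:
                session_addr = addr
        elif line.startswith("m="):
            # Example: m=audio 49170 RTP/AVP 0 (the port may be "49170/2")
            fields = line[2:].split()
            port = int(fields[1].split("/")[0])
            endpoints.append((fields[0], session_addr, port))
    return endpoints


SAMPLE_SDP = "\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 10.0.0.5",
    "s=-",
    "c=IN IP4 10.0.0.5",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
])

print(rtp_endpoints(SAMPLE_SDP))  # [('audio', '10.0.0.5', 49170)]
```

Notice that the phone in this made-up example advertises 10.0.0.5, a private address that means nothing on the far side of the NAT, and a port that the firewall only learns about if it reads the SIP body.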

Some have hoped that IPv6 would rid the world of NATs, but there seems to be little political will or commercial drive to make IPv6 happen.

You may be aware of the NAT traversal protocols that are now used on the client side, namely STUN, TURN and ICE. These protocols let the client probe the network, with the help of outside servers, to discover the ‘outside’ address of the firewall and the port numbers that will be used at call setup time. Sometimes the solution requires SIP servers to have a media proxy available; for example, SIP Express Router (SER) can be used with the SIP Express Media Server (SEMS) to relay RTP traffic that would otherwise get blocked. Even then the protocols cannot cover every scenario. STUN has trouble with symmetric NATs, for example, and when two VoIP users are behind the same firewall it would be better for them to send RTP directly to each other than to forward everything on a long hairpin through a media proxy.
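The ‘probing’ part is easier to picture with some code. The following Python sketch builds a STUN Binding request and decodes the XOR-MAPPED-ADDRESS attribute from the reply, following the message layout in RFC 5389 as I understand it. The function names are my own, and there is no retransmission, authentication or IPv6 handling, so treat it as an illustration rather than a usable client.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value in every RFC 5389 message
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte Binding request: the header only, no attributes."""
    txid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + txid


def parse_binding_response(data, transaction_id):
    """Return (address, port) as seen by the server, or None if absent."""
    if len(data) < 20:
        raise ValueError("too short to be a STUN message")
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if (msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE
            or data[8:20] != transaction_id):
        raise ValueError("not a matching Binding success response")
    pos = 20
    end = min(20 + length, len(data))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # Family 0x01 means IPv4; IPv6 is ignored in this sketch.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return addr, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def query(server_host, server_port=3478, timeout=2.0):
    """Send one Binding request over UDP and return the mapped (address, port)."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server_host, server_port))
        data, _ = sock.recvfrom(2048)
    return parse_binding_response(data, request[8:20])
```

Calling `query("stun.example.org")` (a placeholder host name, not a real server) would return the address and port the NAT assigned to that one UDP flow. A symmetric NAT assigns a different mapping for each destination, which is why the answer learned from the STUN server is no use to the remote phone.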

So NAT traversal is a huge problem, and there is no silver bullet that solves all of it, but this is part of what makes working with VoIP and SIP so interesting today.