Sunday, March 4, 2012
SIP, UDP and NAT
Writing about SIP and NAT could fill another blog. The SIP standard neglected this problem. Maybe that was done to push the need for IPv6; if so, whoever had this idea certainly did not make our lives easier over the last ten years. NAT remains a major pain in the neck for the whole VoIP world.
For those who don’t know NAT: it stands for “network address translation”, and it deals with the problem that there are fewer IPv4 addresses in the world than there are telephone numbers in the USA. The trick is to abuse the port numbers for addressing purposes. Essentially, it puts 16 bits behind the 32-bit IPv4 address, so that we have 48 bits for addressing a service on the Internet. That is the same number of bits that we have for Ethernet addresses, where so far that number has been sufficient. The trick worked well enough to keep the service providers from upgrading to IPv6.
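To make the 32 + 16 bit arithmetic concrete, here is a small Python sketch. It is only an illustration: `endpoint_key` and `ToyNat` are hypothetical names made up for this post, the starting port 40000 and the addresses are arbitrary examples, and a real NAT also tracks the protocol, timeouts and the remote endpoint.

```python
import ipaddress


def endpoint_key(ip: str, port: int) -> int:
    """Pack a 32-bit IPv4 address and a 16-bit port into one 48-bit number."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return (int(ipaddress.IPv4Address(ip)) << 16) | port


class ToyNat:
    """Heavily simplified port-translating NAT table (illustration only)."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port  # assumption: ports are handed out in order
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip: str, private_port: int):
        """Return the public (ip, port) used for a private endpoint."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port: int):
        """Return the private endpoint behind a public port, or None."""
        return self.inbound.get(public_port)


nat = ToyNat("203.0.113.5")
phone_a = nat.translate_out("192.168.1.10", 5060)
phone_b = nat.translate_out("192.168.1.11", 5060)
```

Two phones that both use port 5060 on their private addresses end up behind the same public address with different public ports. A packet arriving on a public port that has no table entry has nowhere to go, which is the root of most of the trouble described here.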
The other major problem in SIP was that it mandated support for the UDP transport protocol. This is not only a pain in the neck for the poor programmers who have to deal with message retransmission all the time (which may explain the poor code quality found in many products). It is also a problem for the message size. Try to send a UDP packet with more than about 1500 bytes, the usual Ethernet MTU; you will get a lot of “funny” effects. This is called UDP fragmentation (strictly speaking, the fragmentation happens at the IP layer underneath UDP). The problem behind it is that the message cannot be sent in one packet over the network, so it has to be split and reassembled on the other side. While the splitting usually works, most of the cheap routers available in the real world don’t handle the fragments properly, and even many SIP devices are not able to reassemble them. In other words: if you have a very long name and your name is included in the SIP packet, your phone will probably not ring. Okay, most people don’t have names with 1000 characters; however, with the advance of SIP there are many extensions that make the packet longer. And you end up with packets that don’t make it through NAT. Nice.
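A rough way to see where the limit sits is sketched below in Python. The header sizes assume IPv4 without options (20 bytes) plus the 8-byte UDP header. The 1300-byte and 200-byte thresholds follow my reading of the transport rule in RFC 3261, section 18.1.1; the function names are made up for this example, and a real stack would also need a fallback when TCP is not available.

```python
from typing import Optional

ETHERNET_MTU = 1500       # typical Ethernet MTU in bytes
IPV4_HEADER = 20          # assumption: IPv4 header without options
UDP_HEADER = 8
MTU_MARGIN = 200          # RFC 3261: stay 200 bytes below the path MTU
UNKNOWN_MTU_LIMIT = 1300  # RFC 3261: limit when the path MTU is unknown


def would_fragment(payload_len: int, mtu: int = ETHERNET_MTU) -> bool:
    """True if a UDP payload of this size does not fit into one IP packet."""
    return payload_len + UDP_HEADER + IPV4_HEADER > mtu


def pick_transport(message: bytes, path_mtu: Optional[int] = None) -> str:
    """Choose UDP for small SIP messages and TCP for large ones."""
    size = len(message)
    if path_mtu is None:
        limit = UNKNOWN_MTU_LIMIT
    else:
        limit = path_mtu - MTU_MARGIN
    return "TCP" if size > limit else "UDP"


small_invite = b"x" * 600   # stand-in for a short SIP request
big_invite = b"x" * 1400    # stand-in for a request with many extensions
small_choice = pick_transport(small_invite)
big_choice = pick_transport(big_invite)
```

With these numbers, anything above 1472 bytes of SIP text gets fragmented on plain Ethernet, and the sketch already switches to TCP above 1300 bytes to keep a safety margin.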
Life could have been so easy. All the SIP authors had to do was use TCP as the transport layer. Then on the signaling side we would not have fragmentation problems, and even NAT would work without major glitches. The argument that servers don’t scale well with TCP is frankly from another world, looking at HTTP and email traffic today (all on TCP).
The m9 supports UDP (including fragmentation, thanks to Linux), TCP and TLS. Whenever you can, my suggestion is to use TCP or TLS to minimize the trouble with NAT, especially when operating the m9 behind a cheap router. Luckily, most SIP server software today supports TCP. Some of them don’t even support UDP any more (like Microsoft Lync), which is good news.
For the media side, NAT would always have been a problem, even if SIP had used only TCP. For media, you want short delays. That means you want to send the packets directly between the devices if possible. And that means you have to use UDP. So even if your signaling is working okay, you still have to fear that your voice connection will not make it. There is a solution for that called ICE, but that is a topic for another long blog post.