NATCracker: NAT Combinations Matter


NATCracker: NAT Combinations Matter

Authors: Roberto Roverso, Sameh El-Ansary and Seif Haridi

INTRODUCTION
Dealing with Network Address Translators (NATs) is nowadays an essential need for any P2P application. The techniques used to deal with NAT have been more or less “coined” and there are several widely-used methods. Some of them are rather a defacto standard like STUN, TURN, ICE.

In the context of our a P2P live video streaming application PeerTV, we are mainly concerned with media streaming using UDP and therefore the scope of this paper is UDP NAT traversal. Moreover, we are strictly interested in solutions that do not use relay, such as TURN for instance, due to the high bandwidth requirements of video streaming.

We have found lots of of previous work on the subject that aims to answer the following question: For every t in the set of NAT types T , which s in the set of traversal strategies S should be used to traverse t? The answer is of the form f : T → S. i.e. the following is an example with a couple of types f : { Simple Hole Punching, Port-Prediction } → { Full-Cone, Symmetric}.

However, the point which we found not gaining enough attention is that the presence of a feasible traversal technique that enables two peers behind NAT to communicate depends on the “combination” of the NAT types and not on the type of each peer separately. Thus, the question should be: “Given 2 peers pa and pb with respective NAT types t(pa) and t(pb), which traversal strategy s is needed for p1 and p2 to talk? The answer is of the form f : T × T → S”, i.e we need to analyze traversable combinations rather than traversable types.

Most works contain a few examples of combinations for explanation purposes. However, we have failed to find any comprehensive analysis that states, for every possible combination of NAT types, whether direct (i.e. with no relay) connectivity is possible and how. The analysis is more topical given that NAT community is switching from the classical set of NAT types Tclassic = { Full-Cone, Restricted-Cone, PortRestricted, Symmetric} to a more elaborate set that defines a NAT type by a combination of three different policies, namely, port mapping, port allocation and port filtering. With that, a statement like “two peers behind symmetric NAT can not communicate” becomes imprecise, as we will show that in many cases it is possible given the nuances available in the presently wide spectrum of NAT types.

RELATED WORK
The work in“Teredo extensions” includes a matrix for a number of combinations, however mostly drawn from Tclassic rather than the more elaborate classification in“Network Address Translation (NAT) Behavioral Requirements for Unicast UDP”. The work in“Symmetric nat traversal using stun” is probably the closest to ours, one can see our work as a superset of the set of combinations mentioned in that work.

Download white paper

NAT TYPES AS COMBINATIONS OF POLICIES
In this section we try to semi-formally summarize the more elaborate classification of NATs known as “BEHAVE-compliant” and craft the notation that we will use in the rest of the paper.

Notation. Let na and nb be NAT gateways. For i ∈ {a, b}, Let Pi = {pi , p0 i , p00 i , . . . } be the set of peers behind ni . An “endpoint” e is a host-port pair e = (h, p), where h(e) is the host of e and p(e) is its port. Let Vi = {vi , v0 i , v00 i , . . . } denote the set of all private endpoints of all peers behind ni and Ui = {ui , u0 i , u00 i . . . } be the set of public endpoints of ni . i.e ∀v ∈ Vi , h(v) ∈ Pi and ∀u : Ui , h(u) = ni .

When a packet is sent out from a certain private endpoint vi of a peer pi behind a gateway ni , to some public endpoint d, a rule in the NAT table of ni is created. We define the set of NAT table rules Ri = {ri , r0 i , r000 i } at ni , the rule records the fact that some public port ui and some private port vi are associated, e.g ra = (va ↔ ua).

The behavior of a gateway ni is defined by three policies, namely, port mapping, port filtering and port allocation. We use the notation f(ni), m(ni), a(ni) to denote the respective policies of gateway ni .

A. Mapping Policy

The mapping policy is triggered every time a packet is sent from a private endpoint vi behind the NAT to some external public port d. The role of a mapping policy is deciding whether a new rule will be added or an existing one will be reused. We use the notation:

1) −−→vi , d  ri to specify that the sending of a packet from vi to d resulted in the creation of a new NAT table rule ri at ni . That is the binding of a new public port on ni . However, we say that a rule was created because we care not only about the binding of the port but also the constraints on using this new port.

2) −−→vi , d ⇒ ri to specify that the sending of the packet reused an already existing rule ri .

3) −−→vi , d reason =6⇒ ri to specify that the sending of the packet did not reuse some ri in particular because of some “reason”.

Irrespective of the mapping policy, whenever a packet is sent from a private port vi to an arbitrary public destination endpoint d and @ri ∈ Ri of the form ri = (vi ↔ ui), for an arbitrary ui , the following is true −−→vi , d  ri . However, if such a mapping exists, the mapping policy would make the reuse decision based on the destination. For all subsequent packets from vi to d, naturally, −−→vi , d ⇒ ri . However, for any d 0 6= d, there are 3 different behaviors:

  • Endpoint-Independent, m(ni) = EI: −−→ vi , d0 ⇒ ri , for any d 0
  • Host-Dependent, m(ni) = HD: −−→ vi , d0 ⇒ ri , iff h(d) = h(d 0 ) −−→ vi , d0  r 0 i , iff h(d) 6= h(d 0 ), where r 0 i = (vi ↔ u 0 i ) and u 0 i 6= ui
  •  Port-Dependent, m(ni) = PD: −−→ vi , d0  r 0 i

Having introduced the different policies, we decorate the notation of the rule to include the criteria that will be used to decide whether a certain rule will be reused as follows:

Where the syntax m : x → y means that the rule will be reused if the source endpoint of the packet is x and the destination is y. The ∗ denotes any endpoint. Order. We impose the EI < HD < PD according to the increasing level of restrictiveness.

B. Allocation Policy.

Every time a new ri is added to Ri , a new public endpoint ui is bound. This policy allocates p(ui). That is, the mapping policy decides when to bind a new port and the allocation policy decides which port should be bound as follows:

  • 1) Port-Preservation, a(ni) = PP: Given −−→vi , d  ri , where ri = (vi ↔ ui), it is always the case that: p(ui) = p(vi). Naturally, this may cause conflicts if any two pi and p 0 i behind ni decided to bind private endpoints with a common port.
  • 2) Port Contiguity, a(ni) = PC: Given any two sequentially allocated public endpoints ui and u 0 i it is always the case that: p(u 0 i ) = p(ui) + ∆, for some ∆ = 1, 2, …
  • 3) Random, a(ni) = RD: ∀ui , p(ui) is allocated at random.

Order. We impose the order PP < PC < RD according to the increasing level of difficulty of handling.

 

C. Filtering Policy.

The filtering policy decides whether a packet from the outside world to a public endpoint of a NAT gateway should be forwarded to the corresponding private endpoint. Given an existing rule ri = (vi ↔ ui) that was created to send a packet from vi to d, we use the notation:

  • 1) ri ⇐ ←−− ui , s to denote that the receival of a packet from the public endpoint s to ni’s public endpoint ui is permitted by ri
  • 2) ri reason 6⇐= ←−− ui , s to denote that the receival is not permitted because of some “reason”.

There are 3 filtering policies with the following conditions for allowing receival:

  • Endpoint-Independent, f(ni) = EI: ri ⇐ ←−− ui , s, for any s
  • Host-Dependent, f(ni) = HD: ri ⇐ ←−− ui , s, iff h(s) = h(d)
  • Port-Dependent, f(ni) = PD: ri ⇐ ←−− ui , s, iff s = d

We also decorate the rules to include conditions for accepting packets as follows:

Order. We impose the order EI < HD < PD according to the increasing level of restrictiveness.

D. The Set of NAT Types

Having defined the above policies, the NAT type of a given NAT gateway is simply a matter of listing which behavior is used for each of the policies. We define the set of triplets representing all possible NAT types τ = {(m, a, f)|f, m ∈ {EI, HD, PD}, a ∈ {PP, PC, RD}}.

 

NAT TYPE DISCOVERY
Before traversing a NAT gateway, one needs to know its type. STUN is the most-widely used method for accomplishing this and there exists many publicly-available STUN servers that assist in the discovery process. The original STUN algorithm produces a classification withdrawn from the set τclassic. Due to space limitations and the fact that our main focus is on traversal strategies, we will not delve into the details of performing the discovery process. However, we just need to clarify that we have expanded the scope of the discovery process to discover information about the allocation policy. With that, our classification is capable of reporting all elements in the set τ .

 

NAT TRAVERSAL TECHNIQUES
We explain our traversal techniques which are an augmented version of the well-known techniques.

Basic Assumptions. We assume that there is a Rendez-vous server with public IP referred to by z. The traversal process always starts after: i) two Peers pa and pb respectively behind NATs na and nb register themselves at z and have an “out-ofband” communication channel with z, which is in our case a TCP connection initiated by the peer, we refer to all endpoints of z and z itslef by the same symbol; ii) The 2 peers know that they need to communicate and know the other peer’s public IP, i.e. the corresponding NAT IP, some peers supply additional information during registration as we will shortly explain in Section VII-B; iii) all the policies of pa, pb are known to z using a discovery process before any traversal process takes place.

SIMPLE HOLE-PUNCHING (SHP)

A. Traversal Process

1) pa sends from some va to z through na.

2) na creates ra = (va ↔ ua) and forwards to z.

3) z receives and consequently knows ua.

4) z informs pb about ua (Out-of-band).

5) pb sends from some vb to ua through nb.

6) nb creates rb = (vb ↔ ub) and forwards to ua.

7) na receives, if the filtering allows, forwards to va

8) pa sends from va to ub through na, if the mapping allows, ra is reused. Otherwise, r 0 a will be created and sending will occur from some other public endpoint u 0 a 6= ua

9) nb receives, if the filtering allows, forwards to vb

 

B. SHP Feasibility

Theorem 6.1: Simple hole punching is feasible for establishing direct communication between two peers pa and pb respectively behind na and nb if ∃nx ∈ {na, nb} s.th. f(nx) = EI, and either m(nx) = EI or m(nx) > EI and f(nx06=x) < PD.

Proof: We consider the most restrictive case where f(na) = f(nb) = m(na) = m(nb) = PD and a(na) = a(nb) = RD and show the minimum relaxations that we need to do for SHP to work. By looking at the steps in section VI, and considering all the very restrictive mapping and filtering on both sides, we can see that after steps 5 and 6, ra and rb will be as follows: ra =  va f:ua←uz ←−−−−−−→ m:va→uz ua  , rb =  vb f:ub←ua ←−−−−−−→ m:vb→ua ub 

Which will cause the following problems:

In step 7: ra ub6=uz 6⇐= ←−−− ub, ua and there is nothing that we can relax at nb which can help. Instead, we have to relax the filtering at pa to indulge receiving on ua from ub while it was initially opened for receiving from uz. i.e, ra has to tolerate host change which is not satisfied by PD nor HD filtering, therefore f(na) = EI is necessary, resulting into ra =  va f:ua←∗ ←−−−−−−→ m:va→uz ua 

In step 8: −−−→ va, ub ub6=uz =6⇒ ra and −−−→ va, ub  r 0 a where r 0 a =  va f:u 0 a←∗ ←−−−−−−→ m:va→ub u 0 a  . Consequently, rb ua6=u 0 a 6⇐= ←−−− u 0 a , ub. To solve this, we have two solutions, the first is to let the mapping reuse ra and not create r 0 a which needs relaxing m(na) to be EI, in which case we can keep f(nb) as restrictive. The second solution is to keep na as restrictive and relax f(nb) to tolerate receiving from u 0 a . In the second solution, there is a minor subtlety that needs to he handled, where pb has to be careful to keep sending to pa on ua despite the fact that it is receiving from u 0 a . Similarly pa should always send to pb on ub despite the fact it is receiving from u 0 b . That is an asymmetry that is not in general needed.

C. Coverage of SHP

Since |τ | = 27 types, we have a 27×28 2 = 378 distinct combinations of NAT types of two peers. Using Theorem 6.1, we find that 186 combinations, i.e. 49.2% of the total number of possible ones are traversable using the Simple Hole Punching approach. That said, this high coverage is totally orthogonal to how often one is likely to encounter combinations in the covered set in practice, which we discuss in our evaluation (Section IX-A). Traversable SHP combinations are shown in Figure 1 with label SHP(*).

To cover the rest of the cases, we use port prediction which enables a peer to punch a hole by sending to the opposite peer instead of z, which makes it possible to tolerate more restrictive filtering and mapping policies, as explained below.

 

PREDICTION

A. Prediction using Contiguity (PRC)

The traversal process consists in the following steps:

1) pa sends two consecutive messages:

  • from some va to z through na
  • from va to u dum b , an arbitrary endpoint of nb

2) na creates the following two rules:

  • r 0 a = (va ↔ u 0 a ) and forwards to z.
  • ra = (va ↔ ua) and forwards to u dum b . Actually, the whole point of sending u dum b is to open ua by sending to nb but be able to predict it at z.

3) The messages are received as follows:

  • a) z receives and consequently knows u 0 a and additionally predicts ua = u 0 a + ∆ where ∆ is known during the discovery process.
  • b) nb drops the message since no endpoint u dum b was ever bound.

4) z informs pb about ua (Out-of-Band).

5) Steps 5 − 9 follow the same scheme as in simple hole punching.

Port scanning.The process is susceptible to failure if another peer p 0 a happens by coincidence to send a packet between the two consecutive packets. For that, a technique called port scanning is used such that when pb tries to connect to ua, pb will try ua + ∆, ua + 2∆, ua + 3∆, etc.. until a reply is received. Some gateways might identify this as a malicious UDP port scan and block it as is the case in some corporate firewalls. Port scanning might be used only when pb connecting to pa where a(na) = P C has m(nb) < P D.

 

B. Prediction using Preservation (PRP)

Another technique is to exploit the port-preservation allocation policy. However, to do that, we assume that when a peer with port-preservation policy registers at z, the peer supplies a pool of free candidate ports to z. The main point here is to avoid conflicts with ports of other peers behind the same NAT. The rendez-vous server z is stateful regarding which ports are bound by each NAT and chooses from the pool of the ports supplied by the peer a port which is not already bound.

1) z chooses some arbitrary port ρ and tells pa (Out-ofBand) to bind ρ

2) pa sends from va where p(va) = ρ to u dum b through na.

3) na creates a new rule ra = (va ↔ ua)u dum b and forwards to u dum b and since a(pa) = PP, p(ua) = p(va) = ρ.

4) z informs pb about ua (Out-of-Band).

5) Steps 5-9 follow the same scheme as in SHP.

Note that the process is shorter than prediction by contiguity and z chooses the port for the peer behind NAT instead of the NAT of the peer deciding it and z observing it. However, for the sake of reasoning, the two are equivalent because what matters is what happens after the opposite peer learns about the punched port irrespective of how the port was predicted.

C. Prediction-on-a-Single-Side Feasibility

Theorem 7.1: Prediction using contiguity or preservation on a single side is feasible for establishing direct communication between two peers pa and pb respectively behind na and nb if:

  • Condition 1: ∃nx ∈ {na, nb} s.th. a(nx) < RD and f(nx) < PD
  • Condition 2: Either m(nx) < PD or m(nx) = PD and f(nx06=x) < PD.

Proof: Similar to theorem 6.1, we start with the most restrictive policies and we relax until prediction is feasible. The allocation policy of the side to be predicted (na in Section VII-B,VII-A) can not be random, because the whole idea of prediction relies on a predictable allocation policy, thus the needed relaxation is a(na) < RD.

D. Coverage of PRP & PRC

PRP and PRC together cover another 18% of the combinations. That said, we can say that PRP is as good as SHP in terms of traversal time and success rate (see Section IX), which means in addition to the cases where PRP on a single side is used in Figure 1, we can also use PRP instead of SHP when the allocation policy is port preservation.

 

INTERLEAVED PREDICTION ON TWO SIDES

The remaining combinations are these not covered by SHP nor prediction. The final stretch to go is to do simultaneous prediction on both sides. However, it is a seemingly tricky deadlock situation because every peer needs to know the port that will be opened by the other peer without the other peer sending anything. Which we solve as follows.

  • Interleaved PRP-PRP. In this case actually double prediction is very simple because the rendez-vouz server can pick a port for each side and instruct the involved peers to simultaneously bind it and start the communication process.
  • Interleaved PRP-PRC This case is also easily solvable thanks to preservation. Because z can inform the peer with a port contiguity allocation policy about the specific endpoint of the opposite peer. The latter in turn will run a port prediction process using the obtained endpoint in the second consecutive message.
  • Interleaved PRC-PRC This one is the trickiest and it needs a small modification in the way prediction by contiguity is done. The idea is that the two consecutive packets, the first to z and the second to the opposite peer can not be sent after each other immediately. Instead, both peers are commanded by z to send a packet to z itself. From that, z deduces the ports that will be opened on each side in the future and sends to both peers informing them about the opposite peer’s predicted endpoint. Both peers in their turn send a punching packet to each other. The problem with this scheme is that there is more time between the consecutive packets which makes it more susceptible to the possibility of another peer behind any of the NATs sending a packet in in between. Like the case in single PRC, port scanning is the only resort, but in general this combination has lower success rate compared to single PRC (see SectionIX).

 

EVALUATION

Apart from the reasoning above, we have done a sanity check on our logic using our emulation platform. That is, we wrote our own NAT boxes, which behave according to the semantics defined in Section III. We also implemented the Rendez-Vous server and the nodes that are capable of performing all the traversal techniques in Section V. For each case, we ran the suggested traversal technique and we made sure direct communication is indeed achievable. Real-life evaluation was needed to gain insights on other aspects like probability of encountering a given type, success rates of traversal techniques and time needed for the traversal process to complete.

A. Distribution of Types

We wanted to know how likely is it to encounter each of the types in τ . We have collected cumulative results for peers who have joined our network over time. As shown in Figure 2: i) we encountered 13 out of the 27 possible types; ii) we found that (m = EI, f = PD, a = PP) is a rather popular type (approx. 37%) of all encountered types, which is fortunate because port preservation is quite friendly to deal with and it is with a very relaxed mapping; iii) about 11% are the worst kind to encounter, because when two peers of this type need to talk, interleaved prediction is needed with a shaky success probability.

B. Adoption of BEHAVE RFC

By looking at each policy alone, we can see to what extent the recommendations of the BEHAVE RFC [8] (f = EI/HD, m = EI) are adopted. As shown in Table IX, for filtering, the majority are adopting the policy discouraged by the RFC, while for mapping the majority were following the recommendation. For allocation, the RFC did not make any specific relevant recommendation. The percentage of NATs following both recommendations was 30%.

C. Success Rate

Given the set of peers present in the network at one point in time, we conduct a connectivity test where all peers try to connect to each other. We group the result by traversal techniques, e.g. SHP is applicable for 186 combinations, so we average the success rate over all combinations and the whole process is repeated a number of times, we have found  as expected that SHP is rather dependable as it succeeds 96% of the time. We also found that PRP is as good as SHP, which is quite positive given that we found that the probability of occurrence of preservation is quite high in the last section. Interleaved PRP-PRP is also rather good with slightly worse success rate. The three remaining techniques involving PRC in a way or the other are causing the success rate to drop significantly especially for PRC-PRC mainly because of the additional delay for interleaving.

D. Time to traverse

When it comes to the time needed for the traversal process to complete, we find two main classes, SHP and PRP in one class and PRC in another class, even when we do PRC-PRP, it is faster than PRC alone because the number of messages is less.

CONCLUSION & FUTURE WORK
In this paper, we have presented our experience with trying to find a comprehensive analysis of what combinations of NAT types are traversable. We have shown that using a semi-formal reasoning that covers all cases and we provided a slightly augmented versions of the well-known traversal techniques and shown which ones are applicable for which combinations.We have shown that about 80% of all possible combinations are traversable.

Using our deployment base for P2P live streaming, we have shown that only 50% for all possible types are encounterable. We have also reported our findings on the success probability and time of traversing the different combinations.

For future work: a) Modeling: we would like to enrich the model to make it capture real-life aspects like expiration of NAT rules, multiple levels of NAT, subtleties of conflicts between many peers behind the same NAT, NATs that use different policies in different situations, and support for uPnP and TCP; b) Real-life Evaluation: more insight into the tradeoff between success probability and timing, preventing the techniques as being identified as malicious actions in some corporate firewalls; c) Dissemination: releasing our library and simulator as open-source for third-party improvement and evaluation. XI.

 

ACKNOWLEDGMENTS

We would like to thank all members of Peerialism’s development team for the help and collaboration on the implementation of our NAT traversal techniques, in particular Magnus Hedbeck for his patience and valuable feedback. The anonymous reviewers of ICCCN have provided a really inspiring set of comments that helped us to improve the quality of this paper.