diff --git a/router/doc/tunnel-alt.html b/router/doc/tunnel-alt.html new file mode 100644 index 000000000..075709e45 --- /dev/null +++ b/router/doc/tunnel-alt.html @@ -0,0 +1,428 @@ +$Id: tunnel.html,v 1.10 2005/01/16 01:07:07 jrandom Exp $ +
+1) Tunnel overview
+2) Tunnel operation
+2.1) Message preprocessing
+2.2) Gateway processing
+2.3) Participant processing
+2.4) Endpoint processing
+2.5) Padding
+2.6) Tunnel fragmentation
+2.7) PRNG pairs
+2.8) Alternatives
+2.8.1) Adjust tunnel processing midstream
+2.8.2) Use bidirectional tunnels
+3) Tunnel building
+3.1) Peer selection
+3.1.1) Exploratory tunnel peer selection
+3.1.2) Client tunnel peer selection
+3.2) Request delivery
+3.3) Pooling
+3.4) Alternatives
+3.4.1) Telescopic building
+3.4.2) Non-exploratory tunnels for management
+4) Tunnel throttling
+5) Mixing/batching
+
+ +

1) Tunnel overview

+ +

Within I2P, messages are passed in one direction through a virtual +tunnel of peers, using whatever means are available to pass the +message on to the next hop. Messages arrive at the tunnel's +gateway, get bundled up and/or fragmented into fixed sizes tunnel messages, +and are forwarded on to the next hop in the tunnel, which processes and verifies +the validity of the message and sends it on to the next hop, and so on, until +it reaches the tunnel endpoint. That endpoint takes the messages +bundled up by the gateway and forwards them as instructed - either +to another router, to another tunnel on another router, or locally.

+ +

Tunnels all work the same, but can be segmented into two different +groups - inbound tunnels and outbound tunnels. The inbound tunnels +have an untrusted gateway which passes messages down towards the +tunnel creator, which serves as the tunnel endpoint. For outbound +tunnels, the tunnel creator serves as the gateway, passing messages +out to the remote endpoint.

+ +

The tunnel's creator selects exactly which peers will participate +in the tunnel, and provides each with the necessary configuration +data. Tunnels may have any number of hops, but may be constrained with various +proof-of-work requests to add on additional steps. It is the intent to make +it hard for either participants or third parties to determine the length of +a tunnel, or even for colluding participants to determine whether they are a +part of the same tunnel at all (barring the situation where colluding peers are +next to each other in the tunnel). A pair of synchronized PRNGs are used at +each hop in the tunnel to validate incoming messages and prevent abuse through +loops.

+ +

Beyond their length, there are additional configurable parameters +for each tunnel that can be used, such as a throttle on the frequency of +messages delivered, how padding should be used, how long a tunnel should be +in operation, whether to inject chaff messages, and what, if any, batching +strategies should be employed.

+ +

In practice, a series of tunnel pools are used for different +purposes - each local client destination has its own set of inbound +tunnels and outbound tunnels, configured to meet its anonymity and +performance needs. In addition, the router itself maintains a series +of pools for participating in the network database and for managing +the tunnels themselves.

+ +

I2P is an inherently packet switched network, even with these +tunnels, allowing it to take advantage of multiple tunnels running +in parallel, increasing resilience and balancing load. Outside of +the core I2P layer, there is an optional end to end streaming library +available for client applications, exposing TCP-esque operation, +including message reordering, retransmission, congestion control, etc.

+ +

2) Tunnel operation

+ +

Tunnel operation has four distinct processes, taken on by various +peers in the tunnel. First, the tunnel gateway accumulates a number +of tunnel messages and preprocesses them into something for tunnel +delivery. Next, that gateway encrypts that preprocessed data, then +forwards it to the first hop. That peer, and subsequent tunnel +participants, unwrap a layer of the encryption, verifying the +integrity of the message, then forward it on to the next peer. +Eventually, the message arrives at the endpoint where the messages +bundled by the gateway are split out again and forwarded on as +requested.

+ +

Tunnel IDs are 4 byte numbers used at each hop - participants know what +tunnel ID to listen for messages with and what tunnel ID they should be forwarded +on as to the next hop. Tunnels themselves are short lived (10 minutes at the +moment). Depending upon the tunnel's purpose, subsequent tunnels +may be built using the same sequence of peers, but each hop's tunnel ID will change.

+ +

2.1) Message preprocessing

+ +

When the gateway wants to deliver data through the tunnel, it first +gathers zero or more I2NP messages, selects how much padding will be used, +fragments it across the necessary number of 1KB tunnel messages, and decides how +each I2NP message should be handled by the tunnel endpoint, encoding that +data into the raw tunnel payload:

+ + +

The instructions are encoded as follows:

+ + +

The I2NP message is encoded in its standard form, and the +preprocessed payload must be padded to a multiple of 16 bytes.
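The 16 byte alignment rule above can be sketched as follows. This is only an illustration: the helper name `pad_to_block` is hypothetical, and filling the gap with random bytes is an assumption, since the text does not say what the padding bytes contain or how the endpoint strips them.

```python
import os

BLOCK = 16  # the preprocessed payload must be a multiple of 16 bytes


def pad_to_block(payload: bytes, block: int = BLOCK) -> bytes:
    """Append random bytes until len(payload) is a multiple of `block`.

    Assumption: random fill. The padding content and how the endpoint
    removes it are not specified in this section.
    """
    shortfall = (-len(payload)) % block  # 0 when already aligned
    return payload + os.urandom(shortfall)


padded = pad_to_block(b"x" * 17)
print(len(padded))
```

A payload that is already aligned is returned unchanged, so the helper never adds a full extra block.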

+ +

2.2) Gateway processing

+ +

After the preprocessing of messages into a padded payload, the gateway builds +a random 4 byte preIV value, iteratively encrypting it and the tunnel message as +necessary, selects the next message ID from its outbound PRNG, and forwards the tuple +{tunnelID, messageID, preIV, encrypted tunnel message} to the next hop.

+ +

How encryption at the gateway is done depends on whether the tunnel is an +inbound or an outbound tunnel. For inbound tunnels, they simply select a random +preIV, postprocessing and updating it to generate the IV for the gateway and using +that IV along side their own layer key to encrypt the preprocessed data. For outbound +tunnels they must iteratively decrypt the (unencrypted) preIV and preprocessed +data with the layer keys for all hops in the tunnel. The result of the outbound +tunnel encryption is that when each peer encrypts it, the endpoint will recover +the initial preprocessed data.
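The outbound case can be sketched with a toy cipher. Everything in the block is an assumption made for illustration: the cipher is a SHA-256 keystream added byte-wise, standing in for AES256; a single IV is reused at every hop rather than derived from the preIV; and the function names are hypothetical. Because this toy cipher is commutative, it does not capture the ordering of the layers - it only shows that pre-decrypting at the gateway lets the per-hop encryption cancel out.

```python
import hashlib


def keystream(key: bytes, iv: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from key and iv (toy construction)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    ks = keystream(key, iv, len(data))
    return bytes((d + k) % 256 for d, k in zip(data, ks))


def toy_decrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    ks = keystream(key, iv, len(data))
    return bytes((d - k) % 256 for d, k in zip(data, ks))


def outbound_gateway_wrap(layer_keys, iv: bytes, preprocessed: bytes) -> bytes:
    """The creator 'decrypts' the plaintext once per hop before sending."""
    data = preprocessed
    for key in layer_keys:
        data = toy_decrypt(key, iv, data)
    return data


def pass_through_hops(layer_keys, iv: bytes, data: bytes) -> bytes:
    """Each hop (the endpoint included) encrypts with its own layer key."""
    for key in layer_keys:
        data = toy_encrypt(key, iv, data)
    return data


keys = [b"hop-1-key", b"hop-2-key", b"endpoint-key"]
iv = b"\x00" * 16
message = b"preprocessed tunnel payload....."
wrapped = outbound_gateway_wrap(keys, iv, message)
assert wrapped != message
assert pass_through_hops(keys, iv, wrapped) == message
```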

+ +

The preIV postprocessing should be a secure transform of the received value +with sufficient expansion to provide the full 16 byte IV necessary for AES256. +What transform should be used - HMAC-SHA256(preIV, layerKey), using bytes +0-15 as the IV, passing on bytes 16-19 as the next step's preIV? Should +we deliver an additional postprocessing layer key to each peer during the +tunnel creation to reduce the potential exposure +of the layerKey? Should we replace the 4 byte preIV with a full 16 byte preIV +(even though 4 bytes will likely provide a sufficient keyspace in which to +operate, as a single tunnel pumping 100KBps would only use 60,000 IVs +over its 10 minute lifetime)?
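One of the candidate transforms floated above can be sketched as follows. This is not a settled design: the section leaves the transform open, and treating the layer key as the HMAC key (with the preIV as the message) is an assumption about the argument order. The helper names are hypothetical. The second function just reproduces the 60,000 figure from the message rate and tunnel lifetime.

```python
import hashlib
import hmac
from typing import Tuple


def postprocess_pre_iv(pre_iv: bytes, layer_key: bytes) -> Tuple[bytes, bytes]:
    """Expand a 4 byte preIV into (16 byte IV, next 4 byte preIV).

    Assumption: layer_key is the HMAC key and pre_iv is the message.
    """
    digest = hmac.new(layer_key, pre_iv, hashlib.sha256).digest()
    iv = digest[0:16]            # bytes 0-15 become this hop's IV
    next_pre_iv = digest[16:20]  # bytes 16-19 are passed to the next hop
    return iv, next_pre_iv


def ivs_per_tunnel(rate_bytes_per_sec: int, lifetime_sec: int, message_size: int) -> int:
    """Number of tunnel messages (and hence IVs) used over a tunnel's life."""
    return rate_bytes_per_sec * lifetime_sec // message_size


iv, next_pre_iv = postprocess_pre_iv(b"\x01\x02\x03\x04", b"k" * 32)
print(len(iv), len(next_pre_iv))
print(ivs_per_tunnel(100 * 1024, 10 * 60, 1024))  # 100KBps, 10 minutes, 1KB messages
```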

+ +

2.3) Participant processing

+ +

When a peer receives a tunnel message, it checks the inbound PRNG for that +tunnel, verifying that the message ID specified is one of the next available IDs, +thereby removing it from the PRNG and moving the window. If the message ID is +not one of the available IDs, it is dropped. The participant then postprocesses +and updates the preIV received to determine the current hop's IV, using that +with the layer key to encrypt the tunnel message. They then select the next +message ID from their outbound PRNG, forwarding the tuple +{nextTunnelID, nextMessageID, nextPreIV, encrypted tunnel message} to the next hop.

+ +

Each participant also maintains a bloom filter of preIV values used for the +lifetime of the tunnel at their hop, allowing them to drop any messages with +duplicate preIVs. The details of the hash functions used in the bloom filter +are not yet worked out. Suggestions?
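As one possible answer to the question above, here is a minimal sketch of such a filter. The hash construction (SHA-256 over a one byte index plus the preIV), the filter size, and the number of hashes are all assumptions picked for illustration, and the class name is hypothetical.

```python
import hashlib


class PreIVBloomFilter:
    """Bloom filter over preIV values seen during one tunnel's lifetime."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 4):
        self.size_bits = size_bits      # assumed size; must be a multiple of 8
        self.num_hashes = num_hashes    # assumed number of hash functions
        self.bits = bytearray(size_bits // 8)

    def _positions(self, pre_iv: bytes):
        # Derive each "hash function" by prefixing a distinct index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + pre_iv).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def seen_before(self, pre_iv: bytes) -> bool:
        """Record pre_iv; return True if it was (probably) already present."""
        already = True
        for pos in self._positions(pre_iv):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                already = False
                self.bits[byte] |= 1 << bit
        return already


seen = PreIVBloomFilter()
print(seen.seen_before(b"\x00\x00\x00\x01"))
print(seen.seen_before(b"\x00\x00\x00\x01"))
```

A bloom filter can report false positives, so a participant using it would occasionally drop a legitimate message; it never misses a true duplicate.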

+ +

2.4) Endpoint processing

+ +

After receiving and validating a tunnel message at the last hop in the tunnel, +how the endpoint recovers the data encoded by the gateway depends upon whether +the tunnel is an inbound or an outbound tunnel. For outbound tunnels, the +endpoint encrypts the message with its layer key just like any other participant, +exposing the preprocessed data. For inbound tunnels, the endpoint is also the +tunnel creator so they can merely iteratively decrypt the preIV and message, using the +layer keys of each step in reverse order.

+ +

At this point, the tunnel endpoint has the preprocessed data sent by the gateway, +which it may then parse out into the included I2NP messages and forward them as +requested in their delivery instructions.

+ +

2.5) Padding

+ +

Several tunnel padding strategies are possible, each with their own merits:

+ + + +

Which to use? No padding is most efficient, random padding is what +we have now, fixed size would either be an extreme waste or force us to +implement fragmentation. Padding to the closest exponential size (à la Freenet) +seems promising. Perhaps we should gather some stats on the net as to what size +messages are, then see what costs and benefits would arise from different +strategies? See gathered +stats

+ +

2.6) Tunnel fragmentation

+ +

To prevent adversaries from tagging the messages along the path by adjusting +the message size, all tunnel messages are a fixed 1KB in size. To accommodate +larger I2NP messages as well as to support smaller ones more efficiently, the +gateway splits up the larger I2NP messages into fragments contained within each +tunnel message. The endpoint will attempt to rebuild the I2NP message from the +fragments for a short period of time, but will discard them as necessary.
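The split-and-rebuild behavior can be sketched as follows. The fragment header fields (message ID, index, total), the 1000 byte payload capacity, and the 10 second reassembly timeout are hypothetical values chosen for illustration - the actual fragment encoding is not given in this section.

```python
from typing import Dict, List, Optional, Tuple

FRAGMENT_PAYLOAD = 1000    # hypothetical room left in a 1KB tunnel message
REASSEMBLY_TIMEOUT = 10.0  # hypothetical "short period of time", in seconds

Fragment = Tuple[int, int, int, bytes]  # (message_id, index, total, chunk)


def fragment(message_id: int, data: bytes) -> List[Fragment]:
    """Gateway side: split one I2NP message into numbered fragments."""
    chunks = [data[i:i + FRAGMENT_PAYLOAD]
              for i in range(0, len(data), FRAGMENT_PAYLOAD)] or [b""]
    return [(message_id, idx, len(chunks), c) for idx, c in enumerate(chunks)]


class Reassembler:
    """Endpoint side: rebuild messages, discarding stale partial ones."""

    def __init__(self) -> None:
        self.pending: Dict[int, Tuple[float, Dict[int, bytes]]] = {}

    def add(self, frag: Fragment, now: float) -> Optional[bytes]:
        # Discard partial messages that have waited longer than the timeout.
        self.pending = {m: (t, p) for m, (t, p) in self.pending.items()
                        if now - t <= REASSEMBLY_TIMEOUT}
        message_id, idx, total, chunk = frag
        _, parts = self.pending.setdefault(message_id, (now, {}))
        parts[idx] = chunk
        if len(parts) == total:
            del self.pending[message_id]
            return b"".join(parts[i] for i in range(total))
        return None


frags = fragment(7, b"A" * 2500)
endpoint = Reassembler()
results = [endpoint.add(f, now=0.0) for f in reversed(frags)]
print(len(frags), results[-1] == b"A" * 2500)
```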

+ +

2.7) PRNG pairs

+ +

To minimize the damage from a DoS attack created by looped tunnels, a series +of synchronized PRNGs are used across the tunnel - the gateway has one, the +endpoint has one, and every participant has two. These in turn are broken down +into the inbound and outbound PRNG for each tunnel - the outbound PRNG is +synchronized with the inbound PRNG of the peer after you (obvious exception being +the endpoint, which has no peer after it). Outside of the PRNG with which each +is synchronized, there is no relationship between any of the other PRNGs. +This is accomplished by using a common PRNG algorithm [tbd, perhaps +java.util.Random?], +seeded with the values delivered with the tunnel creation request. Each peer +prefetches the next few values out of the inbound PRNG so that it can handle +lost or briefly out of order delivery, using these values to compare against the +received message IDs.
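A minimal sketch of the receiving side of one such PRNG pair follows. The PRNG algorithm is still to be decided, so Python's `random.Random` (a Mersenne Twister, not cryptographically secure) is used purely as a stand-in; the lookahead of 8 values and the class name are assumptions. How IDs skipped by lost messages age out of the window is also left open here - this sketch simply leaves them in place.

```python
import random
from collections import deque


class MessageIdWindow:
    """Inbound PRNG with a prefetched window of acceptable message IDs."""

    def __init__(self, seed: int, lookahead: int = 8):
        self._prng = random.Random(seed)  # stand-in PRNG, seeded at tunnel creation
        self._window = deque(self._next_id() for _ in range(lookahead))

    def _next_id(self) -> int:
        return self._prng.getrandbits(32)  # 4 byte message IDs

    def accept(self, message_id: int) -> bool:
        """Accept an ID only if it is in the window, then move the window."""
        if message_id not in self._window:
            return False  # unknown or replayed ID: drop the message
        self._window.remove(message_id)        # each ID is usable once
        self._window.append(self._next_id())   # prefetch a replacement
        return True


# The previous hop's outbound PRNG is seeded with the same value.
sender = random.Random(42)
receiver = MessageIdWindow(seed=42)
ids = [sender.getrandbits(32) for _ in range(3)]
print(receiver.accept(ids[1]))  # briefly out of order delivery is tolerated
print(receiver.accept(ids[0]))
print(receiver.accept(ids[0]))  # a replayed ID is rejected
```

A message that loops back into an earlier hop would carry an ID drawn from a different PRNG, so it would fall outside that hop's window and be dropped.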

+ +

An adversary can still build loops within the tunnels, but the damage done is +minimized in two ways. First, if there is a loop created by providing a later +hop with its next hop pointing at a previous peer, that loop will need to be +seeded with the right value so that its PRNG stays synchronized with the previous +peer's inbound PRNG. While some messages would go into the loop, as they start +to actually loop back, one of two things would happen. Either they would be accepted +by that peer, thereby breaking the synchronization with the other PRNG which is +really "earlier" in the tunnel, or the messages would be rejected if the real +"earlier" peer sent enough messages into the loop to break the synchronization.

+ +

If the adversary is very well coordinated and is colluding with several +participants, they could still build a functioning loop, though that loop would +expire when the tunnel does. This still allows an expansion of their work factor +against the overall network load, but with tunnel throttling this could even +be a useful positive tool for mitigating active traffic analysis.

+ +

2.8) Alternatives

+ +

2.8.1) Adjust tunnel processing midstream

+ +

While the simple tunnel routing algorithm should be sufficient for most cases, +there are three alternatives that can be explored:

+ + +

2.8.2) Use bidirectional tunnels

+ +

The current strategy of using two separate tunnels for inbound and outbound +communication is not the only technique available, and it does have anonymity +implications. On the positive side, using separate tunnels lessens the +traffic data exposed for analysis to participants in a tunnel - for instance, +peers in an outbound tunnel from a web browser would only see the traffic of +an HTTP GET, while the peers in an inbound tunnel would see the payload +delivered along the tunnel. With bidirectional tunnels, all participants would +have access to the fact that e.g. 1KB was sent in one direction, then 100KB +in the other. On the negative side, using unidirectional tunnels means that +there are two sets of peers which need to be profiled and accounted for, and +additional care must be taken to address the increased speed of predecessor +attacks. The tunnel pooling and building process outlined below should +minimize the worries of the predecessor attack, though if it were desired, +it wouldn't be much trouble to build both the inbound and outbound tunnels +along the same peers.

+ +

3) Tunnel building

+ +

When building a tunnel, the creator must send a request with the necessary +configuration data to each of the hops, then wait for the potential participant +to reply stating that they either agree or do not agree. These tunnel request +messages and their replies are garlic wrapped so that only the router who knows +the key can decrypt them, and the path taken in both directions is tunnel routed +as well. There are three important dimensions to keep in mind when producing +the tunnels: what peers are used (and where), how the requests are sent (and +replies received), and how they are maintained.

+ +

3.1) Peer selection

+ +

Beyond the two types of tunnels - inbound and outbound - there are two styles +of peer selection used for different tunnels - exploratory and client. +Exploratory tunnels are used for both network database maintenance and tunnel +maintenance, while client tunnels are used for end to end client messages.

+ +

3.1.1) Exploratory tunnel peer selection

+ +

Exploratory tunnels are built out of a random selection of peers from a subset +of the network. The particular subset varies depending on the local router and on what its +tunnel routing needs are. In general, the exploratory tunnels are built out of +randomly selected peers who are in the peer's "not failing but active" profile +category. The secondary purpose of the tunnels, beyond merely tunnel routing, +is to find underutilized high capacity peers so that they can be promoted for +use in client tunnels.

+ +

3.1.2) Client tunnel peer selection

+ +

Client tunnels are built with a more stringent set of requirements - the local +router will select peers out of its "fast and high capacity" profile category so +that performance and reliability will meet the needs of the client application. +However, there are several important details beyond that basic selection that +should be adhered to, depending upon the client's anonymity needs.

+ +

For some clients who are worried about adversaries mounting a predecessor +attack, the tunnel selection can keep the peers selected in a strict order - +if A, B, and C are in a tunnel, the hop after A is always B, and the hop after +B is always C. A less strict ordering is also possible, assuring that while +the hop after A may be B, B may never be before A. Other configuration options +include the ability for just the inbound tunnel gateways and outbound tunnel +endpoints to be fixed, or rotated on an MTBF rate.
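The less strict ordering described above (B may never come before A) can be sketched by sorting every selection under one fixed order. Deriving that order from a hash keyed with a per-client secret is an assumption made for illustration, not something the text specifies, and the function names are hypothetical.

```python
import hashlib
import random


def order_key(client_secret: bytes, peer_id: bytes) -> bytes:
    """Position of a peer in this client's fixed ordering (assumed scheme)."""
    return hashlib.sha256(client_secret + peer_id).digest()


def select_ordered_peers(candidates, hops: int, client_secret: bytes, rng=random):
    """Pick `hops` random peers, then arrange them in the fixed order."""
    chosen = rng.sample(candidates, hops)
    return sorted(chosen, key=lambda peer: order_key(client_secret, peer))


peers = [bytes([i]) * 32 for i in range(10)]  # stand-ins for 32 byte peer hashes
tunnel_a = select_ordered_peers(peers, 3, b"client-secret")
tunnel_b = select_ordered_peers(peers, 3, b"client-secret")
print(len(tunnel_a), len(tunnel_b))
```

Any two peers that show up together in several tunnels always appear in the same relative order, which is the property the predecessor attack defense relies on.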

+ +

3.2) Request delivery

+ +

As mentioned above, once the tunnel creator knows what peers should go into +a tunnel and in what order, the creator builds a series of tunnel request +messages, each containing the necessary information for that peer. For instance, +participating peers will be given the 4 byte tunnel ID on which they are to +receive messages, the 4 byte tunnel ID on which they are to send out the messages, +the 32 byte hash of the next hop's identity, the pair of PRNG seeds for the inbound +and outbound PRNG, and the 32 byte layer key used to +remove a layer from the tunnel. Of course, outbound tunnel endpoints are not +given any "next hop" or "next tunnel ID" information, and neither the inbound +tunnel gateways nor the outbound tunnel endpoints need both PRNG seeds. To allow +replies, the request contains a random session tag and a random session key with +which the peer may garlic encrypt their decision, as well as the tunnel to which +that garlic should be sent. In addition to the above information, various client +specific options may be included, such as what throttling to place on the tunnel, +what padding or batch strategies to use, etc.

+ +

After building all of the request messages, they are garlic wrapped for the +target router and sent out an exploratory tunnel. Upon receipt, that peer +determines whether they can or will participate, creating a reply message and +both garlic wrapping and tunnel routing the response with the supplied +information. Upon receipt of the reply at the tunnel creator, the tunnel is +considered valid on that hop (if accepted). Once all peers have accepted, the +tunnel is active.

+ +

3.3) Pooling

+ +

To allow efficient operation, the router maintains a series of tunnel pools, +each managing a group of tunnels used for a specific purpose with their own +configuration. When a tunnel is needed for that purpose, the router selects one +out of the appropriate pool at random. Overall, there are two exploratory tunnel +pools - one inbound and one outbound - each using the router's exploration +defaults. In addition, there is a pair of pools for each local destination - +one inbound and one outbound tunnel. Those pools use the configuration specified +when the local destination connected to the router, or the router's defaults if +not specified.

+ +

Each pool has within its configuration a few key settings, defining how many +tunnels to keep active, how many backup tunnels to maintain in case of failure, +how frequently to test the tunnels, how long the tunnels should be, whether those +lengths should be randomized, how often replacement tunnels should be built, as +well as any of the other settings allowed when configuring individual tunnels.
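The settings listed above could be grouped roughly as follows. Every field name and default value here is hypothetical - the text names the kinds of settings but not their names, units, or defaults.

```python
import random
from dataclasses import dataclass


@dataclass
class TunnelPoolSettings:
    """Per-pool configuration; all names and defaults are illustrative."""
    active_tunnels: int = 2          # how many tunnels to keep active
    backup_tunnels: int = 1          # spares kept in case of failure
    test_interval_sec: int = 60      # how frequently to test the tunnels
    length: int = 2                  # how long the tunnels should be, in hops
    length_variance: int = 0         # 0 means lengths are not randomized
    rebuild_interval_sec: int = 600  # how often replacement tunnels are built

    def pick_length(self, rng=random) -> int:
        """Choose a length for the next tunnel, randomized if configured."""
        delta = rng.randint(-self.length_variance, self.length_variance)
        return max(0, self.length + delta)


client_pool = TunnelPoolSettings(length=3, length_variance=1)
print(client_pool.pick_length())
```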

+ +

3.4) Alternatives

+ +

3.4.1) Telescopic building

+ +

One question that may arise regarding the use of the exploratory tunnels for +sending and receiving tunnel creation messages is how that impacts the tunnel's +vulnerability to predecessor attacks. While the endpoints and gateways of +those tunnels will be randomly distributed across the network (perhaps even +including the tunnel creator in that set), another alternative is to use the +tunnel pathways themselves to pass along the request and response, as is done +in TOR. This, however, may lead to leaks +during tunnel creation, allowing peers to discover how many hops there are later +on in the tunnel by monitoring the timing or packet count as the tunnel is +built. Techniques could be used to minimize this issue, such as using each of +the hops as endpoints (per 2.7.2) for a random +number of messages before continuing on to build the next hop.

+ +

3.4.2) Non-exploratory tunnels for management

+ +

A second alternative to the tunnel building process is to give the router +an additional set of non-exploratory inbound and outbound pools, using those for +the tunnel request and response. Assuming the router has a well integrated view +of the network, this should not be necessary, but if the router was partitioned +in some way, using non-exploratory pools for tunnel management would reduce the +leakage of information about what peers are in the router's partition.

+ +

4) Tunnel throttling

+ +

Even though the tunnels within I2P bear a resemblance to a circuit switched +network, everything within I2P is strictly message based - tunnels are merely +accounting tricks to help organize the delivery of messages. No assumptions are +made regarding reliability or ordering of messages, and retransmissions are left +to higher levels (e.g. I2P's client layer streaming library). This allows I2P +to take advantage of throttling techniques available to both packet switched and +circuit switched networks. For instance, each router may keep track of the +moving average of how much data each tunnel is using, combine that with all of +the averages used by other tunnels the router is participating in, and be able +to accept or reject additional tunnel participation requests based on its +capacity and utilization. On the other hand, each router can simply drop +messages that are beyond its capacity, exploiting the research used on the +normal internet.
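The circuit-style admission check described above can be sketched with an exponential moving average per tunnel. The smoothing factor, the 90% utilization threshold, and the class and method names are all assumptions chosen for illustration.

```python
class ParticipationThrottle:
    """Track per-tunnel bandwidth averages and gate new tunnel requests."""

    def __init__(self, capacity_bps: float, alpha: float = 0.2):
        self.capacity_bps = capacity_bps  # what this router can carry
        self.alpha = alpha                # assumed smoothing factor
        self.avg_bps = {}                 # tunnel ID -> moving average

    def record(self, tunnel_id: int, observed_bps: float) -> None:
        """Fold a new bandwidth sample into that tunnel's moving average."""
        prev = self.avg_bps.get(tunnel_id, observed_bps)
        self.avg_bps[tunnel_id] = (1 - self.alpha) * prev + self.alpha * observed_bps

    def utilization(self) -> float:
        """Combined average usage across all participating tunnels."""
        return sum(self.avg_bps.values()) / self.capacity_bps

    def accept_request(self, max_utilization: float = 0.9) -> bool:
        """Agree to join another tunnel only while there is headroom."""
        return self.utilization() < max_utilization


throttle = ParticipationThrottle(capacity_bps=1000)
throttle.record(1, 500)
print(throttle.accept_request())
throttle.record(2, 450)
print(throttle.accept_request())
```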

+ +

5) Mixing/batching

+ +

What strategies should be used at the gateway and at each hop for delaying, +reordering, rerouting, or padding messages? To what extent should this be done +automatically, how much should be configured as a per tunnel or per hop setting, +and how should the tunnel's creator (and in turn, user) control this operation? +All of this is left as unknown, to be worked out for +I2P 3.0

\ No newline at end of file