Filename: 188-bridge-guards.txt
Title: Bridge Guards and other anti-enumeration defenses
Author: Nick Mathewson, Isis Lovecruft
Created: 14 Oct 2011
Modified: 10 Sep 2015
Status: Reserve
[NOTE: This proposal is marked as "reserve" because the enumeration
technique it addresses does not currently seem to be in use. See
ticket tor#7144 for more information. (2020 July 31)]
1. Overview
Bridges are useful against censors only so long as the adversary
cannot easily enumerate their addresses. I propose a design to make
it harder for an adversary who controls or observes only a few
nodes to enumerate a large number of bridges.
Briefly: bridges should choose guard nodes, and use the Tor
protocol's "Loose source routing" feature to re-route all extend
requests from clients through an additional layer of guard nodes
chosen by the bridge. This way, only a bridge's guard nodes can
tell that it is a bridge, and the attacker needs to run many more
nodes in order to enumerate a large number of bridges.
I also discuss other ways to avoid enumeration, recommending some.
These ideas are due to a discussion at the 2011 Tor Developers'
Meeting in Waterloo, Ontario. Practically none of the ideas here
are mine; I'm just writing up what I remember.
2. History and Motivation
Under the current bridge design, an attacker who runs a node can
identify bridges by seeing which "clients" make a large number of
connections to it, or which "clients" make connections to it in the
same way clients do. This has been a known attack since early
versions {XXXX check} of the design document; let's try to fix it.
2.1. Related idea: Guard nodes
The idea of guard nodes isn't new: since 0.1.1, Tor has used guard
nodes (first designed as "Helper" nodes by Wright et al in {XXXX})
to make it harder for an adversary who controls a smaller number of
nodes to eavesdrop on clients. The rationale was: an adversary who
controls or observes only one entry and one exit will have a low
probability of correlating any single circuit, but over time, if
clients choose a random entry and exit for each circuit, such an
adversary will eventually see some circuits from each client with a
probability of 1, thereby building a statistical profile of the
client's activities. Therefore, let each client choose its entry
node only from among a small number of client-selected "guard"
nodes: the client is still correlated with the same probability as
before, but now the client has a nonzero chance of remaining
unprofiled.
2.2. Related idea: Loose source routing
Since the earliest versions of Onion Routing, the protocol has
provided "loose source routing". In strict source routing, the
source of a message chooses every hop on the message's path. But
in loose source routing, the message traverses the selected nodes,
but may also traverse other nodes as well. In other words, the
client selects nodes N_a, N_b, and N_c, but the message may in fact
traverse any sequence of nodes N_1...N_j, so long as N_1=N_a,
N_x=N_b, and N_y=N_c, for 1 < x < y.
Tor has retained this feature, but has not yet made use of it.
3. Design
Every bridge currently chooses a set of guard nodes for its
circuits. Bridges should also re-route client circuits through
these circuits.
Specifically, when a bridge receives a request from a client to
extend a circuit, it should first create a circuit to its guard,
and then relay that extend cell through the guard. The bridge
should add an additional layer of encryption to outgoing cells on
that circuit corresponding to the encryption that the guard will
remove, and remove a layer of encryption on incoming cells on that
circuit corresponding to the encryption that the guard will add.
3.1. Loose-Source Routed Circuit Construction
Alice, an OP, is using a bridge, Bob, and she has chosen the
following path through the network:
Alice -> Bob -> Charlie -> Deidra
However, Bob has decided to take advantage of the loose-source
routing circuit characteristic (for example, in order to use a bridge
guard), and Bob has chosen N additional loose-source routed hop(s),
through which he will transparently relays cells.
NOTE: For the purposes of bridge guards, N is always 1. However, for
completion's sake, the following details of the circuit construction
are generalized to include N > 1. Additionally, the following steps
should hold for a hop at any position in Alice's circuit that has
decided to take advantage of the loose-source routing feature, not
only for bridge ORs.
From Alice's perspective, her circuit path matches the one diagrammed
above. However, the overall path of the circuit is:
Alice -> Bob -> Guillaume -> Charlie -> Deidra
From Bob's perspective, the circuit's path is:
Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN
Interestingly, because Bob's behaviour towards Guillaume and choices
of cell types is that of a normal OP, Guillaume's perspective of the
circuit's path is:
Bob -> Guillaume -> Charlie -> UNKNOWN
That is, to Guillaume, Bob appears (for the most part) to be a
normally connecting client. (See §4.1 for more detailed analysis.)
3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction
1. Connection from OP
Alice has connected to Bob, and she has sent to Bob either a
CREATE/CREATE_FAST or CREATE2 cell.
2. Loose-Source Path Selection
In anticipation of Alice's first RELAY_EARLY cell (which will
contain an EXTEND cell to Alice's next hop), Bob begins
constructing a loose-source routed circuit. To do so, Bob chooses
N additional hop(s):
2.a. For the first additional hop, H_1, Bob chooses a suitable
entry guard node, Guillaume, using the same algorithm as OPs.
See "§5 Guard nodes" of path-spec.txt for additional
information on the selection algorithm.
2.b. Each additional hop, [H_2, ..., H_N], is chosen at random
from a list of suitable, non-excluded ORs.
3. Loose-Source Routed Circuit Extension and Cell Types
Bob now follows the same procedure as OPs use to complete the key
exchanges with his chosen additional hop(s).
While undergoing these following substeps, Bob SHOULD continue to
proceed with Step 4, below, in parallel, as an optimization for
speeding up circuit construction.
3.a. Create Cells
Bob sends the appropriate type of create cell to Guillaume.
For ORs new enough to support the NTor handshake (nearly all
of them at this point), Bob sends a CREATE2 cell. Otherwise,
for ORs which only support the older TAP handshake, Bob sends
either a CREATE or CREATE_FAST cell, using the same
decision-making logic as OPs.
See §4.1 for more information the distinguishability of
bridges based upon whether they use CREATE versus
CREATE_FAST. Also note that the CREATE2 cell has since
become ubiquitous after this proposal was originally drafted.
Thus, because we prefer ORs which use loose-source routing to
behave (as much as possible) like OPs, we now prefer to use
CREATE2.
3.b. Created Cells
Later, when Bob receives a corresponding CREATED/CREATED_FAST
or CREATED2 cell from Guillaume, Bob extracts key material
for the shared forward and reverse keys, KG_f and KG_b,
respectively.
3.c. Extend Cells
When N > 1, for each additional hop, H_i, in [H_2, ..., H_N],
Bob chooses the appropriate type of extend cell for H_i, and
sends this extend cell to H_i-1, who transforms it into a
create cell in order to perform the extension. To choose
which type of extend cell to send, Bob uses the same
algorithm as an OP to determine whether to use EXTEND or
EXTEND2. Similar to the CREATE* cells above, for most modern
ORs, this will very likely mean an EXTEND2 cell.
3.d. Extended Cells
When a corresponding EXTENDED/EXTENDED2 cell is received for
an additional hop, H_i, Bob extracts the shared forward and
reverse keys, Ki_f and Ki_b, respectively.
4. Responding to the OP
Now that the additional hops in Bob's loose-source routed circuit
are chosen, and construction of the loose-source routed circuit
has begun, Bob answers Alice's original CREATE/CREATE_FAST or
CREATE2 cell (from Step 1) by sending the corresponding created
cell type.
Alice has now built a circuit through Bob, and the two share the
negotiated forward and reverse keys, KB_n and KB_p, respectively.
Note that Bob SHOULD do this step in tandem with the loose-source
routed circuit construction procedure outlined in Step 3, above.
5. OP Circuit Extension
Alice then wants to extend the circuit to node Charlie. She makes
a hybrid-encrypted onionskin, encrypted to Charlie's public key,
containing her chosen g^x value. She puts this in an extend cell:
"Extend (Charlie's address) (Charlie's OR Port) (Onionskin)
(Charlie's ID)". She encrypts this with KB_n and sends it as a
RELAY_EARLY cell to Bob.
Bob's behaviour is now dependent on whether the loose-source
routed circuit construction steps (as outlined in Step 3, above)
have already completed.
5.a. The Loose-Source Routed Circuit Construction is Incomplete
If Bob has not yet finished the loose-source routed circuit
construction, then Bob MUST store the first outgoing
(i.e. exitward) RELAY_EARLY cell received from Alice until
the loose-source routed circuit construction has been
completed.
If any incoming (i.e. toward the OP) RELAY* cell is received
while the loose-source routed circuit is not fully
constructed, Bob MUST drop the cell.
If Bob has already stored Alice's first RELAY_EARLY cell, and
Alice sends any additional RELAY* cell, then Bob SHOULD mark
the entire circuit for close with END_CIRC_REASON_TORPROTOCOL.
5.b. The Loose-Source Routed Circuit Construction is Completed
Later, when the loose-source routed circuit is fully
constructed, Bob MUST send any stored cells from Alice
outward by following the procedure described in Step 6.a.
6. Relay Cells
When receiving a RELAY* cell in either direction, Bob MAY keep
statistics on the number of relay cells encountered, as well as
the number of relay cells relayed.
6.a. Outgoing Relay Cells
Bob decrypts the RELAY* cell with KB_n. If the cell becomes
recognized, Bob should now follow the relay command checks
described in Step 6.c.
Bob MUST encrypt the relay cell's underlying payload to each
additional hop in the loose-source routed circuit, in
reverse: for each additional hop, H_i, in [H_N, ..., H_1],
Bob encrypts the relay cell payload to Ki_f, the shared
forward key for the hop H_i.
Bob MUST update the forward digest, DG_f, of the relay cell,
regardless of whether or not the cell is recognized. See
6.c. for additional information on recognized cells.
Bob now sends the cell outwards through the additional hops.
At each hop, H_i, the hop removes a layer of the onionskin by
decrypting the cell with Ki_f, and then hop H_i forwards the
cell to the next addition additional hop H_i+1. When the
final additional hop, H_N, received the cell, the OP's cell
command and payload should be processed by H_N in the normal
manner for an OR.
6.b. Incoming Relay Cells
Bob MUST decrypt the relay cell's underlying payload from
each additional hop in the loose-source routed circuit (in
forward order, this time): For each additional hop, H_i, in
[H_1, ..., H_N], Bob decrypts the relay cell payload with
Ki_b, the shared backward key for the hop H_i.
If the cell has becomes recognized after all decryptions, Bob
should now follow the relay command checks described in Step
6.c.
Bob MUST update the backward digest, DG_b, of the relay cell,
regardless of whether or not the cell is recognized. See
6.c. for additional information on recognized cells.
Bob encrypts the cell towards the OP with KB_p, and sends the
cell inwards.
6.c. Recognized Cells
If a relay cell, either incoming or outgoing, becomes
recognized (i.e. Bob sees that the cell was intended for him)
after decryption, and there is no stream attached to the
circuit, then Bob SHOULD mark the circuit for close if the
relay command contained within the cell is any of the
following types:
- RELAY_BEGIN
- RELAY_CONNECTED
- RELAY_END
- RELAY_RESOLVE
- RELAY_RESOLVED
- RELAY_BEGIN_DIR
Apart from the above checks, Bob SHOULD essentially treat
every cell as "unrecognized" by following the en-/de-cryption
procedures in Steps 6.a. and 6.b. regardless of whether the
cell is actually recognized or not. That is, since this is a
loose-source routed circuit, Bob SHOULD relay cells not
intended for him *and* cells intended for him through the
leaky pipe, no matter what the cell's underlying payload and
command are.
3.1.2. Example Loose-Source Circuit Construction
For example, given the following circuit path chosen by Alice:
Alice -> Bob -> Charlie -> Deidra
when Alice wishes to extend to node Charlie, and Bob the bridge is
using only one additional loose-source routed hop, Guillaume, as his
bridge guard, the following steps are taken:
- Alice packages the extend into a RELAY_EARLY cell and encrypts
the RELAY_EARLY cell with KB_f to Bob.
- Bob receives the RELAY_EARLY cell from Alice, and he follows
the procedure (outlined in §3.1.1. Step 6.a.) by:
* Decrypting the cell with KB_f,
* Encrypting the cell to the forward key, KG_f, which Bob
shares with his guard node, Guillaume,
* Updating the cell forward digest, DG_f, and
* Sending the cell as a RELAY_EARLY cell to Guillaume.
- When Guillaume receives the cell from Bob, he processes it by:
* Decrypting the cell with KG_f. Guillaume now sees that it
is a RELAY_EARLY cell containing an extend cell "intended"
for him, containing: "Extend (Charlie's address) (Charlie's
OR Port) (Onionskin) (Charlie's ID)".
* Performing the circuit extension to the specified node,
Charlie, by acting accordingly: creating a connection to
Charlie if he doesn't have one, ensuring that the ID is as
expected, and then sending the onionskin in a create cell
on that connection. Note that Guillaume is behaving
exactly as a regular node would upon receiving an Extend
cell.
* Now the handshake finishes. Charlie receives the onionskin
and sends Guillaume "CREATED g^y,KH".
* Making an extended cell for Bob which contains
"E(KG_b, EXTENDED g^y KH)", and
* Sending the extended cell to Bob. Note that Charlie and
Guillaume are both still behaving in a manner identical to
regular ORs.
- Bob receives the extended cell from Guillaume, and he follows
the procedure (outlined in §3.1.1. Step 6.b.) by:
* Decrypting the cell with KG_b,
* Encrypting the cell to Alice with KB_b,
* Updating the cell backward digest, DG_b, and
* Sending the cell to Alice.
- Alice receives the cell, and she decrypts it with KB_b, just
as she would have if Bob had extended to Charlie directly.
She then processes the extended cell contained within to
extract shared keys with Charlie. Note that Alice's behaviour
is identical to regular OPs.
3.2. Additional Notes on the Construction
Note that this design does not require that our stream cipher
operations be commutative, even though they are.
Note also that this design requires no change in behavior from any
node other than Bob, and as we can see in the above example in §3.1.2
for Alice's circuit extension, Alice, Guillaume, and Charlie behave
identical to a normal OP and normal ORs.
Finally, observe that even though the circuit N hops longer than it
would be otherwise, no relay's count of permissible RELAY_EARLY cells
falls lower than it otherwise would. This is because the extra hop
that Bob adds is done with RELAY_EARLY cells, then he continues to
relay Alice's cells as RELAY_EARLY, until the appropriate maximum
number of RELAY_EARLY cells is reached. Afterwards, further
RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY
cells.
4. Alternative designs
4.1. Client-enforced bridge guards
What if Tor didn't have loose source routing? We could have
bridges tell clients what guards to use by advertising those guard
in their descriptors, and then refusing to extend circuits to any
other nodes. This change would require all clients to upgrade in
order to be able to use the newer bridges, and would quite possibly
cause a fair amount of pain along the way.
Fortunately, we don't need to go down this path. So let's not!
4.2. Separate bridge-guards and client-guards
In the design above, I specify that bridges should use the same
guard nodes for extending client circuits as they use for their own
circuits. It's not immediately clear whether this is a good idea
or not. Having separate sets would seem to make the two kinds of
circuits more easily distinguishable (even though we already assume
they are distinguishable). Having different sets of guards would
also seem like a way to keep the nodes who guard our own traffic
from learning that we're a bridge... but another set of nodes will
learn that anyway, so it's not clear what we'd gain.
One good reason to keep separate guard lists is to prevent the
*client* of the bridge from being able to enumerate the guards that
the bridge uses to protect its own traffic (by extending a circuit
through the bridge to a node it controls, and finding out where the
extend request arrives from).
5. Additional bridge enumeration methods and protections
In addition to the design above, there are more ways to try to
prevent enumeration.
Right now, there are multiple ways for the node after a bridge to
distinguish a circuit extended through the bridge from one
originating at the bridge. (This lets the node after the bridge
tell that a bridge is talking to it.)
5.1. Make it harder to tell clients from bridges
When using the older TAP circuit handshake protocol, one of the
giveaways is that the first hop in a circuit is created with
CREATE_FAST cells, but all subsequent hops are created with CREATE
cells.
However, because nearly everything in the network now uses the newer
NTor circuit handshake protocol, clients send CREATE2 cells to all
hops, regardless of position. Therefore, in the above design, it's
no longer quite so simple to distinguish an OP connecting through
bridge from an actual OP, since all of the circuits that extend
through a bridge now reach its guards through CREATE2 cells (whether
the bridge originated them or not), and only as a fallback (e.g. if
an additional node in the loose-source routed path does not support
NTor) will the bridge ever use CREATE/CREATE_FAST. (Additionally,
when using the fallback mathod, the behaviour for choosing either
CREATE or CREATE_FAST is identical to normal OP behaviour.)
The CREATE/CREATE_FAST distinction is not the only way for a
bridge's guard to tell bridges from orginary clients, however.
Most importantly, a busy bridge will open far more circuits than a
client would. More subtly, the timing on response from the client
will be higher and more highly variable that it would be with an
ordinary client. I don't think we can make bridges behave wholly
indistinguishably from clients: that's why we should go with guard
nodes for bridges.
[XXX For further research: we should study the methods by which a
bridge guard can determine that they are acting as a guard for a
bridge, rather than for a normal OP, and which methods are likely to
be more accurate or efficient than others. -IL]
5.2. Bridge Reachability Testing
Currently, a bridge's reachability is tested both by the bridge
itself (called "self-testing") and by the BridgeAuthority.
5.2.1. Bridge Reachability Self-Testing
Before a bridge uploads its descriptors to the BridgeAuthority, it
creates a special type of testing circuit which ends at itself:
Bob -> Guillaume -> Charlie -> Bob
Thus, going to all this trouble to later use loose-source routing in
order to relay Alice's traffic through Guillaume (rather than
connecting directly to Charlie, as Alice intended) is diminished by
the fact that Charlie can still passively enumerate bridges by
waiting to be asked to connect to a node which is not contained
within the consensus.
We could get around this option by disabling self-testing for bridges
entirely, by automatically setting "AssumeReachable 1" for all bridge
relays… although I am not sure if this is wise.
Our best idea thus far, for bridge reachability self-testing, is to create
a circuit like so:
Bridge → Guard → Middle → OtherMiddle → Guard → Bridge
While, clearly, that circuit is just a little bit insane, it must be that
way because we cannot simply do:
Bridge → Guard → Middle → Guard → Bridge
because the Middle would refuse to extend back to the previous node
(all ORs follow this rule). Similarly, it would be inane to do:
Bridge → Guard → Middle → OtherMiddle → Bridge
because, obviously, that merely shifts the problem to OtherMiddle and
accomplishes nothing. [XXX Is there something smarter we could do? —IL]
5.2.2. Bridge Reachability Testing by the BridgeAuthority
After receiving Bob's descriptors, the BridgeAuthority attempts to
connect to Bob's ORPort by making a direct TLS connection to the
bridge's advertised ORPort.
Should we change this behaviour? One the one hand, at least this
does not enable any random OR in the entire network to enumerate
bridges. On the other hand, any adversary who can observe packets
from the BridgeAuthority is capable of enumeration.
6. Other considerations
What fraction of our traffic is bridge traffic? Will this alter
our circuit selection weights?