A short introduction to Tor
Basic functionality
Tor is a distributed overlay network designed to anonymize low-latency TCP-based applications such as web browsing, secure shell, and instant messaging. The network is built of a number of servers, called relays (also called "onion routers" or "ORs" in some older documentation).
To connect to the network, a client needs to download an up-to-date signed directory of the relays on the network. These directory documents are generated and signed by a set of semi-trusted directory authority servers, and are cached by the relays themselves. (If a client does not yet have a directory, it finds a cache by looking at a list of stable cache locations, distributed along with its source code.)
For more information on the directory subsystem, see the directory protocol specification.
After the client knows the relays on the network, it can pick one of them and open a channel to it. A channel is an encrypted, reliable, non-anonymous transport between a client and a relay or between two relays, used to transmit messages called cells. (Under the hood, a channel is just a TLS connection over TCP, with a specified encoding for cells.)
To anonymize its traffic, a client chooses a path, a sequence of relays on the network, and opens a channel to the first relay on the path (if it does not already have a channel open to that relay). The client then uses that channel to build a multi-hop cryptographic structure called a circuit. A circuit is built over a sequence of relays (typically three). Every relay in the circuit knows its predecessor and successor, but no other relays in the circuit. Many circuits can be multiplexed over a single channel.
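The limited knowledge each relay has of a circuit can be illustrated with a small sketch. The function name `hop_views` and the relay labels are hypothetical, chosen only for this example; real circuit construction involves cryptographic handshakes that are not modeled here.

```python
from typing import Dict, List, Optional, Tuple


def hop_views(
    path: List[str], client: str = "client"
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return, for each relay on the path, the only two neighbors it learns about."""
    nodes = [client] + list(path)
    views: Dict[str, Tuple[str, Optional[str]]] = {}
    for i, relay in enumerate(path, start=1):
        predecessor = nodes[i - 1]
        # The last relay on the path has no successor inside the circuit.
        successor = nodes[i + 1] if i + 1 < len(nodes) else None
        views[relay] = (predecessor, successor)
    return views


# Hypothetical three-hop path.
views = hop_views(["guard", "middle", "exit"])
for relay, (pred, succ) in views.items():
    print(f"{relay}: predecessor={pred}, successor={succ}")
```

In this toy model only the first relay sees the client as a neighbor, and only the last relay sits at the end of the path; no single relay sees both ends.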
For more information on how paths are selected, see the path specification. The first hop on a path, also called a guard node, has complicated rules for its selection; for more on those, see the guard specification.
Once a circuit exists, the client can use it to exchange fixed-length relay cells with any relay on the circuit. These relay cells are wrapped in multiple layers of encryption: as part of building the circuit, the client negotiates a separate set of symmetric keys with each relay on the circuit. Each relay removes (or adds) a single layer of encryption for each relay cell before passing it on.
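The layering described above can be sketched as follows. This is a toy illustration, not Tor's actual relay cryptography: the hash-based XOR keystream, the 64-byte body length, the hard-coded keys, and all function names are assumptions made for the example. In particular, real keys are negotiated while the circuit is built, and XOR layers commute, so this sketch does not enforce the order in which layers are removed.

```python
import hashlib
from typing import List

CELL_BODY_LEN = 64  # arbitrary fixed length for this sketch, not Tor's real cell size


def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing key || counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def apply_layer(key: bytes, body: bytes) -> bytes:
    """XOR the body with the keystream; applying the same key again removes the layer."""
    return bytes(a ^ b for a, b in zip(body, keystream(key, len(body))))


def client_wrap(hop_keys: List[bytes], message: bytes) -> bytes:
    """Pad the message to the fixed length, then add one layer per hop."""
    if len(message) > CELL_BODY_LEN:
        raise ValueError("message does not fit in one fixed-length body")
    body = message.ljust(CELL_BODY_LEN, b"\x00")
    # The last hop's layer is applied first, so the first hop's layer is outermost.
    for key in reversed(hop_keys):
        body = apply_layer(key, body)
    return body


# Hypothetical per-hop keys, listed from the first relay to the last.
hop_keys = [b"key-guard", b"key-middle", b"key-exit"]
padded = b"hello".ljust(CELL_BODY_LEN, b"\x00")
wrapped = client_wrap(hop_keys, b"hello")

# Each relay removes exactly one layer before passing the cell on.
after_guard = apply_layer(hop_keys[0], wrapped)
after_middle = apply_layer(hop_keys[1], after_guard)
after_exit = apply_layer(hop_keys[2], after_middle)
recovered = after_exit.rstrip(b"\x00")
```

The body stays the same length at every hop, and only after the last layer is removed does the padded plaintext reappear.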
A client uses these relay cells to exchange relay messages with relays on a circuit. These "relay messages" in turn are used to actually deliver traffic over the network. In the simplest use case, the client sends a BEGIN message to tell the last relay on the circuit (called the exit node) to create a new session, or stream, and associate that stream with a new TCP connection to a target host. The exit node replies with a CONNECTED message to say that the TCP connection has succeeded. Then the client and the exit exchange DATA messages to represent the contents of the anonymized stream.
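The BEGIN, CONNECTED, and DATA exchange can be sketched with a toy exit node. Everything here other than the three message names is an assumption made for illustration: the `RelayMessage` and `ToyExit` classes, the `stream_id` field, and the echo behavior. No real TCP connection is opened and no encryption is modeled.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RelayMessage:
    command: str      # "BEGIN", "CONNECTED", or "DATA"
    stream_id: int    # hypothetical field used to tell streams apart
    body: bytes = b""


class ToyExit:
    """Stand-in for an exit node; it records targets instead of opening TCP connections."""

    def __init__(self) -> None:
        self.streams: Dict[int, str] = {}

    def handle(self, msg: RelayMessage) -> RelayMessage:
        if msg.command == "BEGIN":
            # A real exit would try to open a TCP connection to the target here.
            self.streams[msg.stream_id] = msg.body.decode("ascii")
            return RelayMessage("CONNECTED", msg.stream_id)
        if msg.command == "DATA":
            if msg.stream_id not in self.streams:
                raise KeyError(f"no open stream {msg.stream_id}")
            # Pretend the target host echoes whatever it receives.
            return RelayMessage("DATA", msg.stream_id, msg.body)
        raise ValueError(f"unsupported command {msg.command!r}")


exit_node = ToyExit()
reply = exit_node.handle(RelayMessage("BEGIN", 1, b"example.com:80"))
echo = exit_node.handle(RelayMessage("DATA", 1, b"ping"))
print(reply.command, echo.body)
```

The stream only carries DATA after the BEGIN has been answered; a DATA message for an unknown stream is rejected in this sketch.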
Note that as of 2023, the specifications do not perfectly distinguish between relay cells and relay messages. This is because, until recently, there was a 1-to-1 relationship between the two: every relay cell held a single relay message. As proposal 340 is implemented, we will revise the specifications for improved clarity on this point.
Other kinds of relay messages can be used for more advanced functionality.
Using a system called conflux, a client can build multiple circuits to the same exit node, and associate those circuits within a conflux set. Once this is done, relay messages can be sent over either circuit in the set, depending on capacity and performance.
For more on conflux, which has been integrated into the C tor implementation, but not yet (as of 2023) into this document, see proposal 329.
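As a rough illustration of choosing a circuit "depending on capacity and performance", the sketch below picks, for each message, the circuit with the lowest round-trip estimate among those with spare capacity. This is not the scheduling algorithm from proposal 329; the `Leg` class, its `rtt_ms` and `free_slots` fields, and the selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Leg:
    name: str
    rtt_ms: float     # hypothetical performance estimate
    free_slots: int   # hypothetical remaining capacity, counted in messages


def pick_leg(legs: List[Leg]) -> Leg:
    """Choose the fastest circuit that still has spare capacity."""
    usable = [leg for leg in legs if leg.free_slots > 0]
    if not usable:
        raise RuntimeError("no circuit in the set has spare capacity")
    return min(usable, key=lambda leg: leg.rtt_ms)


def send_all(legs: List[Leg], messages: List[str]) -> List[Tuple[str, str]]:
    """Assign each message to a circuit and consume one unit of its capacity."""
    schedule = []
    for msg in messages:
        leg = pick_leg(legs)
        leg.free_slots -= 1
        schedule.append((leg.name, msg))
    return schedule


# Hypothetical two-circuit conflux set.
conflux_set = [
    Leg("circuit-a", rtt_ms=120.0, free_slots=2),
    Leg("circuit-b", rtt_ms=200.0, free_slots=5),
]
schedule = send_all(conflux_set, ["m1", "m2", "m3"])
print(schedule)
```

Here the faster circuit is used until its assumed capacity runs out, after which traffic moves to the slower one.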
Advanced topics: Onion services and responder anonymity
In addition to initiating anonymous communications, clients can also arrange to receive communications without revealing their identity or location. This is called responder anonymity, and the mechanism Tor uses to achieve it is called onion services (or "hidden services" or "rendezvous services" in some older documentation).
For the details on onion services, see the Tor Rendezvous Specification.
Advanced topics: Censorship resistance
In some places, Tor is censored. Typically, censors do this by blocking connections to the addresses of the known Tor relays, and by blocking traffic that resembles Tor.
To resist this censorship, some Tor relays, called bridges, are unlisted in the public directory: their addresses are distributed by other means. (To distinguish ordinary published relays from bridges, we sometimes call the former public relays.)
Additionally, Tor clients and bridges can use extension programs, called pluggable transports, that obfuscate their traffic to make it harder to detect.