Hidden services: overview and preliminaries
Hidden services aim to provide responder anonymity for bidirectional stream-based communication on the Tor network. Unlike regular Tor connections, where the connection initiator receives anonymity but the responder does not, hidden services attempt to provide bidirectional anonymity.
Participants:
Operator -- A person running a hidden service
Host, "Server" -- The Tor software run by the operator to provide
a hidden service.
User -- A person contacting a hidden service.
Client -- The Tor software running on the User's computer
Hidden Service Directory (HSDir) -- A Tor node that hosts signed
statements from hidden service hosts so that users can make
contact with them.
Introduction Point -- A Tor node that accepts connection requests
for hidden services and anonymously relays those requests to the
hidden service.
Rendezvous Point -- A Tor node to which clients and servers
connect and which relays traffic between them.
Improvements over previous versions
Here is a list of improvements of this proposal over the legacy hidden services:
a) Better crypto (replaced SHA1/DH/RSA1024 with SHA3/ed25519/curve25519) b) Improved directory protocol leaking less to directory servers. c) Improved directory protocol with smaller surface for targeted attacks. d) Better onion address security against impersonation. e) More extensible introduction/rendezvous protocol. f) Offline keys for onion services g) Advanced client authorization
Notation and vocabulary
Unless specified otherwise, all multi-octet integers are big-endian.
We write sequences of bytes in two ways:
1. A sequence of two-digit hexadecimal values in square brackets,
as in [AB AD 1D EA].
2. A string of characters enclosed in quotes, as in "Hello". The
characters in these strings are encoded in their ascii
representations; strings are NOT nul-terminated unless
explicitly described as NUL terminated.
We use the words "byte" and "octet" interchangeably.
We use the vertical bar | to denote concatenation.
We use INT_N(val) to denote the network (big-endian) encoding of the unsigned integer "val" in N bytes. For example, INT_4(1337) is [00 00 05 39]. Values are truncated like so: val % (2 ^ (N * 8)). For example, INT_4(42) is 42 % 4294967296 (32 bit).
Cryptographic building blocks
This specification uses the following cryptographic building blocks:
* A pseudorandom number generator backed by a strong entropy source.
The output of the PRNG should always be hashed before being posted on
the network to avoid leaking raw PRNG bytes to the network
(see [PRNG-REFS]).
* A stream cipher STREAM(iv, k) where iv is a nonce of length
S_IV_LEN bytes and k is a key of length S_KEY_LEN bytes.
* A public key signature system SIGN_KEYGEN()->seckey, pubkey;
SIGN_SIGN(seckey,msg)->sig; and SIGN_CHECK(pubkey, sig, msg) ->
{ "OK", "BAD" }; where secret keys are of length SIGN_SECKEY_LEN
bytes, public keys are of length SIGN_PUBKEY_LEN bytes, and
signatures are of length SIGN_SIG_LEN bytes.
This signature system must also support key blinding operations
as discussed in appendix [KEYBLIND] and in section [SUBCRED]:
SIGN_BLIND_SECKEY(seckey, blind)->seckey2 and
SIGN_BLIND_PUBKEY(pubkey, blind)->pubkey2 .
* A public key agreement system "PK", providing
PK_KEYGEN()->seckey, pubkey; PK_VALID(pubkey) -> {"OK", "BAD"};
and PK_HANDSHAKE(seckey, pubkey)->output; where secret keys are
of length PK_SECKEY_LEN bytes, public keys are of length
PK_PUBKEY_LEN bytes, and the handshake produces outputs of
length PK_OUTPUT_LEN bytes.
* A cryptographic hash function H(d), which should be preimage and
collision resistant. It produces hashes of length HASH_LEN
bytes.
* A cryptographic message authentication code MAC(key,msg) that
produces outputs of length MAC_LEN bytes.
* A key derivation function KDF(message, n) that outputs n bytes.
As a first pass, I suggest:
* Instantiate STREAM with AES256-CTR.
* Instantiate SIGN with Ed25519 and the blinding protocol in
[KEYBLIND].
* Instantiate PK with Curve25519.
* Instantiate H with SHA3-256.
* Instantiate KDF with SHAKE-256.
* Instantiate MAC(key=k, message=m) with H(k_len | k | m),
where k_len is htonll(len(k)).
When we need a particular MAC key length below, we choose MAC_KEY_LEN=32 (256 bits).
For legacy purposes, we specify compatibility with older versions of the Tor introduction point and rendezvous point protocols. These used RSA1024, DH1024, AES128, and SHA1, as discussed in rend-spec.txt.
As in [proposal 220], all signatures are generated not over strings themselves, but over those strings prefixed with a distinguishing value.
Protocol building blocks
In sections below, we need to transmit the locations and identities of Tor nodes. We do so in the link identification format used by EXTEND2 messages in the Tor protocol.
NSPEC (Number of link specifiers) [1 byte]
NSPEC times:
LSTYPE (Link specifier type) [1 byte]
LSLEN (Link specifier length) [1 byte]
LSPEC (Link specifier) [LSLEN bytes]
Link specifier types are as described in tor-spec.txt. Every set of link specifiers SHOULD include at minimum specifiers of type [00] (TLS-over-TCP, IPv4), [02] (legacy node identity) and [03] (ed25519 identity key). Sets of link specifiers without these three types SHOULD be rejected.
As of 0.4.1.1-alpha, Tor includes both IPv4 and IPv6 link specifiers in v3 onion service protocol link specifier lists. All available addresses SHOULD be included as link specifiers, regardless of the address that Tor actually used to connect/extend to the remote relay.
We also incorporate Tor's circuit extension handshakes, as used in the CREATE2 and CREATED2 cells described in tor-spec.txt. In these handshakes, a client who knows a public key for a server sends a message and receives a message from that server. Once the exchange is done, the two parties have a shared set of forward-secure key material, and the client knows that nobody else shares that key material unless they control the secret key corresponding to the server's public key.
Assigned relay message types
These relay message types are reserved for use in the hidden service protocol.
32 -- RELAY_COMMAND_ESTABLISH_INTRO
Sent from hidden service host to introduction point;
establishes introduction point. Discussed in
[REG_INTRO_POINT].
33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS
Sent from client to rendezvous point; creates rendezvous
point. Discussed in [EST_REND_POINT].
34 -- RELAY_COMMAND_INTRODUCE1
Sent from client to introduction point; requests
introduction. Discussed in [SEND_INTRO1]
35 -- RELAY_COMMAND_INTRODUCE2
Sent from introduction point to hidden service host; requests
introduction. Same format as INTRODUCE1. Discussed in
[FMT_INTRO1] and [PROCESS_INTRO2]
36 -- RELAY_COMMAND_RENDEZVOUS1
Sent from hidden service host to rendezvous point;
attempts to join host's circuit to
client's circuit. Discussed in [JOIN_REND]
37 -- RELAY_COMMAND_RENDEZVOUS2
Sent from rendezvous point to client;
reports join of host's circuit to
client's circuit. Discussed in [JOIN_REND]
38 -- RELAY_COMMAND_INTRO_ESTABLISHED
Sent from introduction point to hidden service host;
reports status of attempt to establish introduction
point. Discussed in [INTRO_ESTABLISHED]
39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED
Sent from rendezvous point to client; acknowledges
receipt of ESTABLISH_RENDEZVOUS message. Discussed in
[EST_REND_POINT]
40 -- RELAY_COMMAND_INTRODUCE_ACK
Sent from introduction point to client; acknowledges
receipt of INTRODUCE1 message and reports success/failure.
Discussed in [INTRO_ACK]
Acknowledgments
This design includes ideas from many people, including
Christopher Baines,
Daniel J. Bernstein,
Matthew Finkel,
Ian Goldberg,
George Kadianakis,
Aniket Kate,
Tanja Lange,
Robert Ransom,
Roger Dingledine,
Aaron Johnson,
Tim Wilson-Brown ("teor"),
special (John Brooks),
s7r
It's based on Tor's original hidden service design by Roger Dingledine, Nick Mathewson, and Paul Syverson, and on improvements to that design over the years by people including
Tobias Kamm,
Thomas Lauterbach,
Karsten Loesing,
Alessandro Preite Martinez,
Robert Ransom,
Ferdinand Rieger,
Christoph Weingarten,
Christian Wilms,
We wouldn't be able to do any of this work without good attack designs from researchers including
Alex Biryukov,
Lasse Ă˜verlier,
Ivan Pustogarov,
Paul Syverson,
Ralf-Philipp Weinmann,
See [ATTACK-REFS] for their papers.
Several of these ideas have come from conversations with
Christian Grothoff,
Brian Warner,
Zooko Wilcox-O'Hearn,
And if this document makes any sense at all, it's thanks to editing help from
Matthew Finkel,
George Kadianakis,
Peter Palfrader,
Tim Wilson-Brown ("teor"),
[XXX Acknowledge the huge bunch of people working on 8106.] [XXX Acknowledge the huge bunch of people working on 8244.]
Please forgive me if I've missed you; please forgive me if I've misunderstood your best ideas here too.