Filename: 266-removing-current-obsolete-clients.txt
Title: Removing current obsolete clients from the Tor network
Author: Nick Mathewson
Created: 14 Jan 2016
Status: Superseded
Superseded-by: 264, 272.
1. Introduction
Frequently, we find that very old versions of Tor should no longer be
supported on the network. To remove relays is easy enough: we
simply update the directory authorities to stop listing relays that
advertise versions that are too old.
But to disable clients is harder.
In another proposal I describe a system for letting future clients go
gracefully obsolete. This proposal explains how we can safely
disable the obsolete clients we have today (and all other client
versions of Tor to date, assuming that they will someday become
obsolete).
1.1. Why disable clients?
* Security. Anybody who hasn't updated their Tor client in 5
years is probably vulnerable to who-knows-what attacks. They
aren't likely to get much anonymity either.
* Withstand zombie installations. Some Tors out there were once
configured to start-on-boot systems that are now unmaintained.
(See 1.4 below.) They put needless load on the network, and help
nobody.
* Be able to remove backward-compatibility code. Currently, Tor
supports some truly ancient protocols in order to avoid breaking
ancient versions or Tor. This code needs to be maintained and
tested. Some of it depends on undocumented or deprecated or
non-portable OpenSSL features, and makes it hard to produce a
conforming Tor server implementation.
* Make it easier to write a conforming Tor relay. If a Tor relay
needs to support every Tor client back through the beginning of
time, that makes it harder to develop and test compatible
implementations.
1.2. Is this dangerous?
I don't think so. This proposal describes a way to make older
clients gracefully disconnect from the network only when a majority
of authorities agree that they should. A majority of authorities
already have the ability to inflict arbitrary degrees of sabotage on
the consensus document.
1.3. History
The earliest versions of Tor checked the recommended-versions field
in the directory to see whether they should keep running. If they
saw that their version wasn't recommended, they'd shut down. There
was an "IgnoreVersion" option that let you keep running anyway.
Later, around 2004, the rule changed to "shut down if the version is
_obsolete_", where obsolete was defined as "not recommended, and
older than a version that is recommended."
In 0.1.1.7-alpha, we made obsolete versions only produce a warning,
and removed IgnoreVersion. (See 3ac34ae3293ceb0f2b8c49.)
We have still disabled old tor versions. With Tor 0.2.0.5-alpha,
we disabled Tor versions before 0.1.1.6-alpha by having the v1
authorities begin publishing empty directories only.
In version 0.2.5.2-alpha, we completely removed support for the v2
directory protocol used before Tor 0.2.0; there are no longer any v2
authorities on the network.
Tor versions before 0.2.1 will currently not progress past fetching
an initial directory, because they believe in a number of directory
authority identity keys that no longer sign the directory.
Tor versions before 0.2.4 are (lightly) throttled in multihop
circuit creation, because we prioritize ntor CREATE cells over
TAP ones when under load.
1.4. The big problem: slow zombies and fast zombies
It would be easy enough to 'disable' old clients by simply removing
server support for the obsolete protocols that they use. But there's
a problem with that approach: what will the clients do when they fail
to make connections, or to extend circuits, or whatever else they are
no longer able to do?
* Ideally, I'd like such clients to stop functioning _quietly_. If
they stop contacting the network, that would be best.
* Next best would be if these clients contacted the network only
occasionally and at different times. I'll call these clients
"slow zombies".
* Worse would be if the clients contact the network frequently,
over and over. I'll call these clients "fast zombies". They
would be at their worst when they focus on authorities, or when
they act in synchrony to all strike at once.
One goal of this proposal is to ensure that future clients do not
become zombies at all; and that ancient clients become slow zombies
at worst.
2. Some ideas that don't work.
2.1. Dropping connections based on link protocols.
Tor versions before 0.2.3.6-alpha use a renegotiation-based
handshake instead of our current handshake. We could detect these
handshakes and close the connection at the relay side if the client
attempts to renegotiate.
I've tested these changes on versions maint-0.2.0 through
maint-0.2.2. They result in zombies with the following behavior:
The client contact each authority it knows about, attempting to
make a one-hop directory connection. It fails, detects a failure,
then reconnects more and more slowly ... but one hour later, it
resets its connection schedule and starts again.
In the steady state this appears to result in about two connections
per client per authority per hour. That is probably too many.
(Most authorities would be affected: of the authorities that existed
in 0.2.2, gabelmoo has moved and turtles has shut down. The
authorities Faravahar and longclaw are new. The authorities moria1,
tor26, dizum, dannenberg, urras, maatuska and maatuska would all get
hit here.) [two maatuskas? -RD]
(We could simply remove the renegotiation-detection code entirely,
and reply to all connections with an immediate VERSIONS cell. The
behavior would probably be the same, though.)
If we throttled connections rather than closing them, we'd only get
one connection per authority per hour, but authorities would have to
keep open a potentially huge number of sockets.
2.2. Blocking circuit creation under certain circumstances
In tor 0.2.5.1-alpha, we began ignoring the UseNTorHandshake option,
and always preferring the ntor handshake where available.
Unfortunately, we can't simply drop all TAP handshakes, since clients
and relays can still use them in the hidden service protocol. But
we could detect these versions by:
Looking for use of a TAP handshake from an IP not associated
with any known relay, or on a connection where the client
did not authenticate. (This could be from a bridge, but clients
don't build circuits that go to an IntroPoint or RendPoint
directly after a bridge.)
This would still result in clients not having directories, however,
and retrying once an hour.
3. Ideas that might work
3.1. Move all authorities to new ports
We could have each authority known to older clients start listening
for connections at a new port P. We'd forward the old port to the new
port. Once sufficiently many clients were using the new ports, we
could disable the forwarding.
This would result in the old clients turning into zombies as above,
but they would only be scrabbling at nonexistent ports, causing less
load on the authorities.
[This proposal would probably be easiest to implement.]
3.2. Start disabling old link protocols on relays
We could have new relays start dropping support for the old link
protocols, while maintaining support on the authorities and older
relays.
The result here would be a degradation of older client performance
over time. They'd still behave zombieishly if the authorities
dropped support, however.
3.3. Changing the consensus format.
We could allow 'f' (short for "flag") as a synonym for 's' in
consensus documents. Later, if we want to disable all Tor versions
before today, we can change the consensus algorithm so that the
consensus (or perhaps only the microdesc consensus) is spelled with
'f' lines instead of 's' lines. This will create a consensus which
older clients and relays parse as having all nodes down, which will
make them not connect to the network at all.
We could similarly replace "r" with "n", or replace Running with
Online, or so on.
In doing this, we could also rename fresh-until and valid-until, so
that new clients would have the real expiration date, and old clients
would see "this consensus never expires". This would prevent them
from downloading new consensuses.
[This proposal would result in the quietest shutdown.]
A. How to "pull the switch."
This is an example timeline of how we could implement 3.3 above,
along with proposal 264.
TIME 0:
Implement the client/relay side of proposal 264, backported
to every currently extant Tor version that we still
support.
At the same time, add support for the new consensus type to
all the same Tor versions.
Don't disable anything yet.
TIME 1....N:
Encourage all distributions shipping packages for those old
tor versions to upgrade to ones released at Time 0 or later.
Keep informed of the upgrade status of the clients and
relays on the Tor network.
LATER:
At some point after nearly all clients and relays have
upgraded to the versions released at Time 0 or later, we
could make the switchover to publishing the new consensus
type.
B. Next steps.
We should verify what happens when currently extant client
versions get an empty consensus. This will determine whether
3.3 will not work. Will they try to fetch a new one from the
authorities at the end of the validity period.
Another option is from Roger: we could add a flag meaning "ignore
this consensus; it is a poison consensus to kill old Tor
versions." And maybe we could have it signed only by keys that
the current clients won't accept. And we could serve it to old
clients rather than serving them the real consensus. And we
could give it a really high expiration time. New clients
wouldn't believe it. We'd need to flesh this out.
Another option is also from Roger: Tell new clients about new
locations to fetch directories from. Keep the old locations working
for as long as we want to support them. We'd need to flesh this
out too.
The timeline above requires us to keep informed of the status of
the different clients and relays attempting to connect to the tor
network. We should make sure we'll actually able to do so.
http://meetbot.debian.net/tor-dev/2016/tor-dev.2016-02-12-15.01.log.html
has a more full discussion of the above ideas.