crypto

Peer to peer Tweeting, explored

Had another layman's discussion with my dad today (layman's, since he's a statistical physicist and I'm a hacker, so neither of us is an expert on information theory) and became even more convinced that Twitter could only truly unfold its potential as a political censorship-buster if it worked in a decentralized fashion, as I described some weeks ago.

Here's a bit of a brainstorm, listing several points.

Requirements

Short messages must be distributed quickly throughout a scalable network without relying on any individual node. Individual peers must not be able to trace messages back to their physical origin, yet messages must be authenticated as originating from common sources to establish a reputation system.

Authentication

In order to prove that a message originated from the same source as another message, both can be digitally signed using a public key scheme. While the paper on Off-the-Record Communication, or, Why Not To Use PGP warns that signing in this fashion negates deniability, that seems to be an inherent price paid for one-to-many communication. Multiple recipients are by definition multiple witnesses to the exchange.

What doesn't seem to be necessary is authentication of the intermediary nodes. I.e., if a message is sent by Alice to Bob, Bob passes it on to Carl, and Carl sends it to Daniel, then Daniel has absolutely no need to know that Bob was involved in the chain. Alice's signature is enough to establish the message as Alice's message. (The security of Alice's secret key, and contingency in the event it is compromised, are separate problems.)

UDP

A popular tool of disruption (particularly in China) appears to be a manufactured interruption of TCP connections via a forged RST (reset) packet. The messages this protocol sends are by nature tiny (Twitter's great innovation) and could easily fit into a single UDP packet, crypto overhead and all (a UDP datagram can theoretically carry close to 64 kB). Being connectionless, UDP is immune to RST forgery and can be sent via broadcast and multicast. It is not inherently harder to eavesdrop on than TCP, though, so confidentiality still has to come from encryption.
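As a rough sketch of the size argument, the Python snippet below sends one short message as a single UDP datagram over the loopback interface. The 140-character text, the 64-byte placeholder standing in for a signature, and the loopback address are all assumptions made for illustration; nothing here is actually signed or encrypted.

```python
import socket

# Hypothetical payload: a 140-character message plus a 64-byte placeholder
# where a real signature would go (no actual crypto is performed here).
text = ("RT: example update " + "x" * 140)[:140].encode("utf-8")
fake_signature = bytes(64)
datagram = fake_signature + text

# Receiver: bind to an OS-chosen port on loopback.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
address = receiver.getsockname()

# Sender: no connection setup, so there is no TCP session to reset.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(datagram, address)

received, _ = receiver.recvfrom(65535)
sender.close()
receiver.close()

print(len(received), "bytes arrived in one datagram")
```

A payload of this size stays well below the usual 1500-byte Ethernet MTU, so it would normally travel without IP fragmentation. Delivery is still not guaranteed, since UDP has no retransmission.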

A disadvantage is the loss of a handshake-based key exchange. Messages either have to be passed on in the clear, or the initial introduction of two peers must include a key exchange that will be used for all succeeding encryption (regular expiration of the key may be a good idea).

Blind relays

As the requirements state, the origin of a message must be untraceable, as politically sensitive messages originate with valuable and endangered sources. Ideally, even the first recipient (Bob) should not be aware that they are in contact with the original author (Alice). That means that Alice would sign a message, pass it on to Bob, who would then pass it on to further peers without modification or further authentication. The number of hops a message has been through must not be recorded. Likewise, a precise creation timestamp (to within less than an hour) must not be recorded as Bob could then derive that all messages sent by Alice's computer are brand-new and likely originated with her.

The network cannot, by definition, notice circular transmissions, at least without journalling messages sent recently. But the journal doesn't need to be long to prevent a self-amplified flood - it would probably be enough to remember messages received in the past day, and drop any messages received again within that time.
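The journal idea could look roughly like the following. This is a minimal sketch, assuming messages are identified by a SHA-256 digest of their bytes and that one day of memory is enough; the class name `RecentJournal` and the injectable `now` argument are made up for the example.

```python
import hashlib
import time


class RecentJournal:
    """Remembers digests of recently seen messages and flags repeats."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self._seen = {}  # digest -> time the message was first seen

    def should_forward(self, message: bytes, now=None) -> bool:
        now = time.time() if now is None else now
        # Forget entries older than the retention window.
        self._seen = {d: t for d, t in self._seen.items() if now - t < self.ttl}
        digest = hashlib.sha256(message).digest()
        if digest in self._seen:
            return False  # seen within the window: drop it
        self._seen[digest] = now
        return True


journal = RecentJournal()
msg = b"signed message bytes"
first = journal.should_forward(msg, now=0)                  # new message
repeat = journal.should_forward(msg, now=3600)              # an hour later
after_expiry = journal.should_forward(msg, now=25 * 3600)   # after the window
print(first, repeat, after_expiry)
```

Only digests are stored, so the journal records nothing about who delivered a message.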

However, this system still has some vulnerability: If Alice persistently sends her own messages to Bob, Bob will eventually notice that all his messages signed by Alice tend to first reach him from Alice's computer, and might guess that Alice owns Alice's computer. I can see two countermeasures for this that may or may not be mutually exclusive:

1. Alice needs to send her own messages to a few of a wide pool of recipients. When messages signed by Alice are sent by Alice's computer to 1 out of n peers with equal likelihood, then each of those peers will get their first copy from Alice's computer with only 1/n likelihood. The larger n is, the less Alice's computer will stand out as a sender of Alice's messages.
2. Bob must be limited to a small number of peers that will pass anything on to him. If Bob receives messages from n peers, then his likelihood of receiving a particular message from a particular peer is 1/n. The larger n is, the more suspicious Bob will be when he receives multiple messages signed by Alice from the same (Alice's) computer.
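The 1/n figure in the first countermeasure can be checked with a quick simulation. The sketch below assumes Alice picks her direct recipients uniformly at random from her pool and looks only at what one fixed peer (Bob) observes; the function name and the `fanout` parameter (how many peers she sends each message to) are invented for the example.

```python
import random


def share_sent_directly_to_bob(n_peers, fanout, trials, seed=0):
    """Estimate how often Bob is among Alice's direct recipients."""
    rng = random.Random(seed)
    bob = 0  # Bob is arbitrarily peer number 0
    hits = 0
    for _ in range(trials):
        # Alice picks `fanout` distinct peers uniformly at random.
        recipients = rng.sample(range(n_peers), fanout)
        if bob in recipients:
            hits += 1
    return hits / trials


for n in (5, 20, 100):
    estimate = share_sent_directly_to_bob(n, fanout=1, trials=20000)
    print(n, round(estimate, 3), "expected about", 1 / n)
```

With a fanout above 1 the expected share becomes fanout/n, so sending to more peers at once gives back some of the cover that the large pool buys.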

Discovery without discovery

In order to keep peers blind, the second idea above dictates that no single node is allowed to explore a wide section of the network. When Alice introduces Bob and Carl to each other, Bob and Carl both know the other is one of Alice's peers. If Bob finds out a significant portion of Alice's peers, he might start guessing which of them sent her a particular message she passed on to him. What's more, if Bob also learns Carl's peers, he can find the intersection of both their peer sets and try to triangulate the origin of a message he received from both Alice and Carl.

I note that tiredness increasingly instills paranoia in me, so some of the above might be less worrisome than I fear now (especially if Alice and Carl delay their messages by a random number of seconds, so Bob cannot guess whether the message he got from both Alice and Carl reached both of them via the same peer).
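The random delay mentioned above might be sketched as a small queue that holds each incoming message for a random number of seconds before releasing it. The class name `DelayedRelay`, the 30-second ceiling, and the caller-supplied clock are assumptions for the example, not part of any existing protocol.

```python
import heapq
import random


class DelayedRelay:
    """Holds each message for a random delay before it may be forwarded."""

    def __init__(self, max_delay_seconds=30.0, rng=None):
        self.max_delay = max_delay_seconds
        self.rng = rng or random.Random()
        self._queue = []  # heap of (send_time, sequence, message)
        self._sequence = 0  # tie-breaker so messages are never compared

    def receive(self, message: bytes, now: float) -> None:
        send_at = now + self.rng.uniform(0, self.max_delay)
        heapq.heappush(self._queue, (send_at, self._sequence, message))
        self._sequence += 1

    def due(self, now: float) -> list:
        """Pop and return every message whose delay has elapsed."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(heapq.heappop(self._queue)[2])
        return ready


relay = DelayedRelay(max_delay_seconds=30.0, rng=random.Random(1))
relay.receive(b"message A", now=0.0)
relay.receive(b"message B", now=0.0)
sent_by_30s = relay.due(now=30.0)  # both delays have elapsed by now
print(len(sent_by_30s), "messages released")
```

Because the release order depends on the random delays rather than on arrival order, an observer downstream learns less about when, and from whom, the relay first got each message.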

Godfathering

It goes without saying that to be secure, the network must not rely on any single tracker reachable via a public site (analogous to The Pirate Bay for BitTorrent). Even when the tracker is outside the censor's reach, access to it can be blocked too easily. My idea is that the prime contact ("Godfather", so to speak) of every new initiate is the person they received the software from. But this poses a significant risk of deception: When Eve gives the software to Alice, Eve has complete power over Alice's initial contacts. If Eve feeds Alice only peers owned by a single entity (e.g. the Iranian government), Alice is unknowingly immersed in a hostile network: whenever she sends out a message, the government knows she wrote it. How do you ensure that the Godfather is not a double agent?

Will brainstorm more when I'm awake again.

Twitter via P2P

This is a bit of speculative brainstorming. A lot of this must have been thought of before, and when I have time I will research to see what papers have been written about it.

One premise is that Twitter's social network is currently being used in a different way than just a community of contacts. It is also a network through which messages spread virally, repeated and relayed from person to person. Casual users of Twitter may not have seen much of the informal so-called "RT" code yet. It is an abbreviation for "Re-Tweet", and is used interchangeably to signify that a certain message should be resubmitted by anyone who reads it, or that it already is a resubmission originally by someone else. This technique is used particularly in the #iran/#neda "tweetsphere", where news updates are globally significant rather than being personal communication between individuals.

A sample of what such updates may look like is here, picked up some minutes ago.

(14:16:50) Twitter: iranbaan: Mousavi: I don't fear responding to the gov. allegations #IranElection
(14:17:28) Twitter: iranbaan: Mousavi: I'm ready to show those who run the #election are now lining up with those who commit riots and #IranElection
(14:17:52) Twitter: iranbaan: Mousavi: I won't back down by threats that their characters are known to our people #Iranelection
(14:18:13) Twitter: iranbaan: Mousavi:If those who committed atrocities in 1999 in Tehr. Univ. has been punished we wouldn't see today #Iranelection
(14:18:30) Twitter: dominiquerdr: RT : Mousavi has officially announce that he can not get in touch with ppl. http://tinyurl.com/kvbewq #iranelection
(14:18:47) Twitter: dominiquerdr: RT : Mousavi: I don't fear responding to the gov. allegations #IranElection
(14:19:03) Twitter: dominiquerdr: RT : Mousavi: I'm ready to show those who run the #election are now lining up with those who commit riots and #IranElection
(14:19:15) Twitter: lotfan: Ahmadinejad Assails Obama as Opposition Urges Defiance http://bit.ly/2h1wK #iranelection #gr88

This acts as a mass-moderated, theoretically decentralized network spreading short messages between peers without requiring every broadcast to be sent to every person, allowing indirect messages to spread further the more significant they are considered by the people who read them.

I say "theoretically" because in practice, all this still happens over the Twitter database. Twitter's social network is a virtual construct within the central entity of Twitter. "Retweeting" does not actually do anything like "passing on a message"; it merely produces a copy of the same message on the same server, readable by other subscribers. To block this communication, all a censor has to do is filter access to twitter.com (which is in fact happening in Iran, last I heard).

Yesterday around three in the morning, I spent some time feverishly wondering whether Twitter's virtual "relay network" on a single database could be turned into an actual decentralized relay network between a multitude of computers, highly resistant to filtering.

The threats that this network must be safe from are these:

Port blocking
The software must either communicate via randomized ports or via a port that cannot be blocked unilaterally (such as port 80, which carries ordinary web traffic).
Active Infiltration
The network must be resistant to malicious flooding/spamming. This would work via a web of trust where peers gradually gain more trust the more of their messages are passed on.
Attacking the center
Whatever mechanism the network uses to introduce peers to each other, no central database is safe from attack.
Passive Infiltration
The network must not allow peers to harvest peer identities, because privacy is a matter of life or death in Iran right now.

Protections against the first two threats are solidly established, and implemented in many Peer-to-Peer technologies in the wild, such as BitTorrent (which already has reputation networks for prioritizing those peers known to be the best contributors).
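To make the trust idea for the second threat a bit more concrete, here is a toy sketch, not a description of how BitTorrent or any existing network does it. It assumes every signer gets a small daily allowance of messages, and that the allowance grows by one each time a message of theirs is passed on; `TrustLedger` and all of its method names are hypothetical.

```python
from collections import defaultdict


class TrustLedger:
    """Toy reputation table: signers earn allowance when they are passed on."""

    def __init__(self, base_allowance=1):
        self.base_allowance = base_allowance
        self._score = defaultdict(int)           # signer -> times passed on
        self._accepted_today = defaultdict(int)  # signer -> messages taken today

    def record_pass_on(self, signer: str) -> None:
        # Called when a reader chooses to pass on one of the signer's messages.
        self._score[signer] += 1

    def allowance(self, signer: str) -> int:
        return self.base_allowance + self._score[signer]

    def accept(self, signer: str) -> bool:
        # Refuse further messages once the signer's daily allowance is used up.
        if self._accepted_today[signer] >= self.allowance(signer):
            return False
        self._accepted_today[signer] += 1
        return True

    def new_day(self) -> None:
        self._accepted_today.clear()


ledger = TrustLedger()
stranger_results = [ledger.accept("stranger") for _ in range(3)]
ledger.record_pass_on("alice")
ledger.record_pass_on("alice")
alice_results = [ledger.accept("alice") for _ in range(4)]
print(stranger_results, alice_results)
```

An unknown signer who floods the node is cut off after one message per day, while a signer whose messages keep getting passed on earns more room. A real design would also have to deal with attackers who mint fresh keys or pass on each other's messages to inflate their scores.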

The third threat is somewhat tricky, as Blue Frog and more recently The Pirate Bay have shown. Blue Security, for those who missed it, used a peer-based approach to spam fighting three years ago. In May/June 2006, Blue Security was hit by vast DDoS attacks and eventually shut down: The central server, the vulnerable Achilles heel in the system, had been disabled. The Pirate Bay is presently embroiled in a lawsuit for "enabling" copyright infringement, and in spite of the existence of "trackerless Torrents" via a Distributed Hash Table, it is clear that without a central tracker like TPB, discovering an initial peer is less simple.

The last is even more intricate, as it makes discovery of peers barely possible. When you join a BitTorrent cloud, you subscribe to a tracker database that contains your IP and those of the other peers. An interested entity (such as the RIAA and its slightly more evil cousin, the Iranian secret police) can easily subscribe to the same tracker, discover those peers that happen to be within its sphere of influence and then act accordingly (sue for millions, beat to death with axes, et cetera). Those who use BitTorrent for copyright violations don't bother with more privacy, as legal proceedings in the US impose (some) rules on what constitutes evidence or ethical investigation. Basij on motorcycles with axes knocking on your door at night don't have such inhibitions, so people need hard crypto.

The basic dilemma is: how can a decentralized network grow dynamically, while no single computer is allowed access to all peers, and no central database is safe from attack? It seems like a chicken-and-egg problem. Some steganography ideas come to mind (hiding peer addresses in remote corners of the web or in spam mail), or some indirect routing ideas (e.g. all connections inside a censoring country must first go outside that country across jurisdictional boundaries), but that's very theoretical. I don't know enough about network technology.

Brainstorm out.
