Peer to peer Tweeting, explored
Had another layman's discussion with my dad today (layman's since he's a statistical physicist and I'm a hacker, so neither of us is an expert on information theory) and became even more convinced that Twitter could only truly unfold its potential as a political censorship-buster if it worked in a decentralized fashion, as I described some weeks ago.
Here's a bit of a brainstorm, listing several points.
Requirements
Short messages must be distributed quickly throughout a scalable network without relying on any individual node. Individual peers must not be able to trace messages back to their physical origin, yet messages must be authenticated as originating from common sources to establish a reputation system.
Authentication
In order to prove that a message originated from the same source as another message, both can be digitally signed using a public key scheme. While the paper "Off-the-Record Communication, or, Why Not To Use PGP" warns that signing in this fashion negates deniability, that seems to be an inherent price paid for one-to-many communication. Multiple recipients are by definition multiple witnesses to the exchange.
What doesn't seem to be necessary is an authentication of the intermediary nodes. That is, if a message is sent by Alice to Bob, Bob passes it on to Carl, and Carl sends it to Daniel, then Daniel has absolutely no need to know that Bob was involved in the chain. Alice's signature is enough to establish the message as Alice's message. (The security of Alice's secret key, and contingency in the event it is compromised, are separate problems.)
UDP
A popular tool of disruption (particularly in China) appears to be a manufactured interruption of TCP connections via an RST (reset) packet. The messages this protocol sends are by nature tiny (Twitter's great innovation) and could easily fit into a single UDP packet, crypto overhead and all (UDP's theoretical maximum is 64kB). UDP offers the advantage of being harder to eavesdrop on, immune to RST forgery, and able to be sent via broadcast and multicast.
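To make the size claim concrete, here is a rough sketch of how a signed short message might be packed into a single datagram. The field layout (a 20-byte key fingerprint, two length fields, the signature, then the UTF-8 text) and the names pack_message and unpack_message are assumptions made up for illustration, not an existing protocol, and the signature bytes are a placeholder rather than real crypto.

```python
import struct

# Largest UDP payload over IPv4: 65535 minus 20 (IP header) minus 8 (UDP header).
MAX_UDP_PAYLOAD = 65507
FINGERPRINT_LEN = 20  # assumed: a 20-byte fingerprint of the author's public key


def pack_message(fingerprint: bytes, signature: bytes, text: str) -> bytes:
    """Pack one signed message into a single datagram payload (hypothetical layout)."""
    if len(fingerprint) != FINGERPRINT_LEN:
        raise ValueError("fingerprint must be %d bytes" % FINGERPRINT_LEN)
    body = text.encode("utf-8")
    # Header: two unsigned 16-bit lengths in network byte order.
    header = struct.pack("!HH", len(signature), len(body))
    packet = fingerprint + header + signature + body
    if len(packet) > MAX_UDP_PAYLOAD:
        raise ValueError("message does not fit into one UDP datagram")
    return packet


def unpack_message(packet: bytes):
    """Inverse of pack_message: returns (fingerprint, signature, text)."""
    fingerprint = packet[:FINGERPRINT_LEN]
    sig_len, body_len = struct.unpack_from("!HH", packet, FINGERPRINT_LEN)
    start = FINGERPRINT_LEN + 4
    signature = packet[start:start + sig_len]
    body = packet[start + sig_len:start + sig_len + body_len]
    return fingerprint, signature, body.decode("utf-8")


if __name__ == "__main__":
    # Placeholder bytes stand in for a real fingerprint and a 256-byte signature.
    packet = pack_message(b"\x01" * 20, b"\x02" * 256, "x" * 140)
    print(len(packet))  # 20 + 4 + 256 + 140 = 420 bytes
```

Even with a 256-byte signature, a 140-character ASCII message comes to 420 bytes here, far below the datagram limit.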
A disadvantage is the loss of a handshake-based key exchange. Messages either have to be passed on in the clear, or the initial introduction of two peers must include a key exchange that will be used for all succeeding encryption (regular expiration of the key may be a good idea).
Blind relays
As the requirements state, the origin of a message must be untraceable, as politically sensitive messages originate with valuable and endangered sources. Ideally, even the first recipient (Bob) should not be aware that they are in contact with the original author (Alice). That means that Alice would sign a message, pass it on to Bob, who would then pass it on to further peers without modification or further authentication. The number of hops a message has been through must not be recorded. Likewise, a precise creation timestamp (to within less than an hour) must not be recorded as Bob could then derive that all messages sent by Alice's computer are brand-new and likely originated with her.
The network cannot, by definition, notice circular transmissions, at least without journalling messages sent recently. But the journal doesn't need to be long to prevent a self-amplified flood - it would probably be enough to remember messages received in the past day, and drop any messages received again within that time.
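The journal idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name Journal, the use of a SHA-256 digest as the message identity, and the in-memory dictionary are not part of any existing design. Note that it stores only the local time a message was first received, never a hop count or a creation timestamp.

```python
import hashlib
import time

JOURNAL_TTL = 24 * 60 * 60  # remember messages for one day, as suggested above


class Journal:
    """Remembers digests of recently seen messages so loops die out."""

    def __init__(self, ttl=JOURNAL_TTL):
        self.ttl = ttl
        self.seen = {}  # message digest -> local time it was first received

    def should_relay(self, packet: bytes, now=None) -> bool:
        """Return True the first time a packet is seen within the TTL window."""
        now = time.time() if now is None else now
        # Forget entries older than the TTL so the journal stays short.
        self.seen = {d: t for d, t in self.seen.items() if now - t < self.ttl}
        digest = hashlib.sha256(packet).digest()
        if digest in self.seen:
            return False  # duplicate: drop it instead of passing it on again
        self.seen[digest] = now
        return True
```

A relay would call should_relay() on every incoming packet and forward only those for which it returns True.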
However, this system still has some vulnerability: If Alice persistently sends her own messages to Bob, Bob will eventually notice that all his messages signed by Alice tend to first reach him from Alice's computer, and might guess that Alice owns Alice's computer. I can see two countermeasures for this that may or may not be mutually exclusive:
1. Alice needs to send her own messages to a few of a wide pool of recipients. When messages signed by Alice are sent by Alice's computer to 1 out of n peers with equal likelihood, then each of those peers will get its first copy from Alice's computer only with likelihood 1/n. The larger n is, the less Alice's computer will stand out as a sender of Alice's messages.
2. Bob must be limited to a small number of peers that will pass anything on to him. If Bob receives messages from n peers, then his likelihood of receiving a particular message from a particular peer is 1/n. The larger n is, the more suspicious Bob will be to receive multiple messages signed by Alice from the same (Alice's) computer.
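The 1/n argument from the first countermeasure can be checked with a small simulation. This is a toy model under a stated assumption: for every message, Alice picks exactly one of her n peers uniformly at random as the first recipient, and Bob is one of those peers. The function name first_copy_fraction is made up for this sketch.

```python
import random


def first_copy_fraction(n_peers: int, n_messages: int, seed: int = 0) -> float:
    """Fraction of Alice's own messages that reach Bob directly from Alice.

    Model (an assumption): for every message Alice picks exactly one of her
    n_peers uniformly at random as the first recipient; Bob is peer 0.
    """
    rng = random.Random(seed)
    direct = sum(1 for _ in range(n_messages) if rng.randrange(n_peers) == 0)
    return direct / n_messages


if __name__ == "__main__":
    for n in (2, 10, 50):
        # The printed fraction should come out close to 1/n.
        print(n, first_copy_fraction(n, 20000))
```

The model ignores copies that reach Bob indirectly through other peers, so it only illustrates how quickly Alice's computer stops standing out as n grows.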
Discovery without discovery
In order to keep peers blind, the second idea above dictates that no single node is allowed to explore a wide section of the network. When Alice introduces Bob and Carl to each other, Bob and Carl both know the other is one of Alice's peers. If Bob discovers a significant portion of Alice's peers, he might start guessing which of them sent her a particular message she passed on to him. Moreover, if Bob also learns Carl's peers, he can find the intersection of both their peer sets and try to triangulate the origin of a message he received from both Alice and Carl.
I note that tiredness increasingly instills paranoia in me, so some of the above might be less worrisome than I fear now (especially if Alice and Carl delay their messages by a random number of seconds, so Bob cannot guess whether the message he got from both Alice and Carl reached both of them via the same peer).
Godfathering
It goes without saying that to be secure, the network must not rely on any single tracker reachable via a public site (analogous to The Pirate Bay for BitTorrent). Even when the tracker is outside the censor's reach, access to it can be blocked too easily. My idea is that the prime contact ("Godfather", so to speak) of every new initiate is the person they received the software from. But this poses a significant risk of deception: When Eve gives the software to Alice, Eve has complete power over Alice's initial contacts. If Eve feeds Alice only peers owned by a single entity (e.g. the Iranian government), Alice is unknowingly immersed in a hostile network: whenever she sends out a message, the government knows she wrote it. How do you ensure that the Godfather is not a double agent?
Will brainstorm more when I'm awake again.
Twitter via P2P
This is a bit of speculative brainstorming. A lot of this must have been thought of before, and when I have time I will research to see what papers have been written about it.
One premise is that Twitter's social network is currently being used in a different way than just a community of contacts. It is also a network through which messages spread virally, repeated and relayed from person to person. Casual users of Twitter may not have seen much of the informal so-called "RT" code yet. It is an abbreviation for "Re-Tweet", and is used both to signify that a message should be resubmitted by anyone who reads it and to mark a message as a resubmission of something originally written by someone else. This technique is used particularly in the #iran/#neda "tweetsphere", where news updates are globally significant rather than being personal communication between individuals.
A sample of what such updates may look like is here, picked up some minutes ago.
This acts as a mass-moderated, theoretically decentralized network spreading short messages between peers without requiring every broadcast to be sent to every person, allowing indirect messages to spread further the more significant the people who read them consider them to be.
I say "theoretically" because in practice, all this still happens over the Twitter database. Twitter's social network is a virtual construct within the central entity of Twitter. "Retweeting" does not actually do anything like "passing on a message"; it merely produces a copy of the same message on the same server, readable by other subscribers. To block this communication, all a censor has to do is filter access to twitter.com (which is in fact happening in Iran, last I heard).
Yesterday around three in the morning, I spent some time feverishly wondering whether Twitter's virtual "relay network" on a single database could be turned into an actual decentralized relay network between a multitude of computers, highly resistant to filtering.
The threats that this network must be safe from are these:
- Port blocking: The software must either communicate via randomized ports or via a port that cannot be blocked unilaterally (80).
- Active Infiltration: The network must be resistant to malicious flooding/spamming. This would work via a web of trust where peers are gradually gaining more trust the more of their messages are passed on.
- Attacking the center: Whatever mechanism the network uses to introduce peers to each other, no central database is safe from attack.
- Passive Infiltration: The network must not allow peers to harvest peer identities, because privacy is a matter of life or death in Iran right now.
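As a very rough illustration of the web-of-trust idea against flooding, a relay could throttle unknown authors and loosen the limit as their messages get passed on. Everything here is an assumption made up for the sketch: the class name TrustTable, identifying authors by a key fingerprint string, and the linear "one extra message per hour for every relay" rule.

```python
class TrustTable:
    """Tracks how often each author's messages were passed on (hypothetical)."""

    def __init__(self, base_quota=1):
        self.base_quota = base_quota  # messages per hour allowed to an unknown author
        self.relayed = {}             # author key fingerprint -> relay count

    def record_relay(self, author: str) -> None:
        """Call when a reader chose to pass on one of the author's messages."""
        self.relayed[author] = self.relayed.get(author, 0) + 1

    def hourly_quota(self, author: str) -> int:
        """Trusted authors may send more; newcomers (and spammers) stay throttled."""
        return self.base_quota + self.relayed.get(author, 0)
```

A real scheme would also need to stop colluding peers from inflating each other's counts, which this sketch does not attempt.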
Protections against the first two threats are solidly established, and implemented in many Peer-to-Peer technologies in the wild, such as BitTorrent (which already has reputation networks for prioritizing those peers known to be the best contributors).
The third threat is somewhat tricky, as Blue Frog and more recently The Pirate Bay have shown. Blue Security, for those who missed it, used a peer-based approach to spam fighting three years ago. In May/June 2006, Blue Security was hit by vast DDoS attacks and eventually shut down: The central server, the Achilles heel of the system, had been disabled. The Pirate Bay is presently embroiled in a lawsuit for "enabling" copyright infringement, and in spite of the existence of "trackerless Torrents" via a Distributed Hash Table, it is clear that without a central tracker like TPB, discovering an initial peer is less simple.
The last is even more intricate, as it makes discovery of peers barely possible. When you join a BitTorrent cloud, you subscribe to a tracker database that contains your IP and those of the other peers. An interested entity (such as the RIAA and its slightly more evil cousin, the Iranian secret police) can easily subscribe to the same tracker, discover those peers that happen to be within its sphere of influence and then act accordingly (sue for millions, beat to death with axes, et cetera). Those who use BitTorrent for copyright violations don't bother with more privacy, as legal proceedings in the US impose (some) rules on what constitutes evidence or ethical investigation. Basij on motorcycles with axes knocking on your door at night don't have such inhibitions, so people need hard crypto.
The basic dilemma is: how can a decentralized network grow dynamically, while no single computer is allowed access to all peers, and no central database is safe from attack? It seems like a chicken-and-egg problem. Some steganography ideas come to mind (hiding peer addresses in remote corners of the web or in spam mail), or some indirect routing ideas (e.g. all connections inside a censoring country must first go outside that country across jurisdictional boundaries), but that's very theoretical. I don't know enough about network technology.
Brainstorm out.
Twitter Daemon, follow-up
In re my previous post: needless to say, it is nearly impossible to properly operate a terminal if a background script interrupts every minute by injecting messages directly into the character console, particularly when you are trying to edit a text file in vim at that moment.
I saw the light of reason and am now simply appending the messages to a text file. Then I can open another terminal and set it to follow the text file using tail -f file. Whew. My shell is calm and peaceful again.
My very own Twitter Daemon
I've built myself a twitter notifications daemon out of duct-tape and spit (with liberal application of bash)!
First of all, here is the daemon script itself. It is in bash, and runs in a continuous loop until killed. I use several components, most notably "twidge" to download new messages. Twidge is a command line utility for twitter. When twidge receives new messages, I display them with notify-send.
    #!/bin/bash
    echo "Starting demon..."
    while true
    do
      echo "Downloading messages on "`date`
      # twidge prints tab-separated fields; keep sender ($2) and text ($4), oldest first.
      messages=`twidge -c ~/.twidgecron lsrecent -alsu|tac|awk -F "\t" '{print $2,$4}'`
      if [ ! -z "$messages" ]
      then
        echo "New messages!"
        # Announce the batch on every open pseudo-terminal.
        for terminal in `ls /dev/pts`
        do
          echo "" > /dev/pts/$terminal
          echo "New Tweets!" > /dev/pts/$terminal
        done
        echo "$messages"|while read message
        do
          message=($message)
          sender=${message[0]}
          # Use the generic mail icon unless an avatar could be fetched.
          icon=/usr/share/icons/gnome/scalable/status/mail-unread.svg
          pic=`bash ~/scripts/twitter-user-pic $sender`
          if [ ! -z "$pic" ]; then icon=$pic; fi
          content=${message[@]:1}
          notify-send -i $icon -t 15000 "$sender" "$content"
          for terminal in `ls /dev/pts`
          do
            echo "$sender $content" > /dev/pts/$terminal
          done
        done
      else
        echo "Nothing new."
      fi
      sleep 60
    done
The "twitter-user-pic" bash script downloads and stores users' profile images. It looks like this.
    #!/bin/bash
    if [ ! -z "$1" ]
    then
      picture="$HOME/graphics/avatars/twitter/$1"
      if [ -f "$picture" ]
      then
        # Cached copy exists: just print its path.
        echo $picture
      else
        url=`python2.6 ~/scripts/python/twitchy/twitchy.py picture -u "$1"`
        if [ ! -z "$url" ]
        then
          picture="$HOME/graphics/avatars/twitter/$1"
          # Download the image into the cache, then print its path.
          wget -qO - "$url" > "$picture"
          echo $picture
        fi
      fi
    fi
"twitchy.py" is my own little Python script that can currently only query user profile picture URLs (may be extended later). As you can see, the twitter-user-pic script checks a local file cache to see if the picture exists, and otherwise downloads it. Everything about this is rickety and hacky, starting from the lack of file extensions.
This is my python script:
    #!/usr/bin/python2.6
    # Python 2 script using the python-twitter module.
    import sys
    import getopt
    import twitter

    def main(argv):
        api = twitter.Api()
        api.SetCredentials("arancaytar", "hunter2")  # Don't bother trying.
        command = ""
        user = ""
        # A leading bare word is taken as the command.
        if argv and argv[0][0] != "-":
            command = argv[0]
            argv = argv[1:]
        opts, args = getopt.getopt(argv, "c:u:", ["command=", "user="])
        for opt, arg in opts:
            if opt in ("-c", "--command"):
                command = arg
            elif opt in ("-u", "--user"):
                user = arg
        if user != "":
            account = api.GetUser(user)
            if command == "picture":
                print account.profile_image_url
        else:
            if command == "picture":
                print "picture command requires -u user"

    main(sys.argv[1:])
So now you can see that while the endless-loop script runs, I get messages sent both to the desktop (where they are unfortunately not clickable, as notify-send doesn't do actions) and to all open TTYs. The latter works by broadcasting them, again in a pretty dirty way, via /dev/pts/* (which I'm anticipating will crash something badly, but nothing has happened so far).
Then I have a starting and stopping script that works like this:
    #!/bin/bash
    if [ "$1" == 'start' ]
    then
      if [ ! -f "$HOME/.twitterd.pid" ]
      then
        echo "Starting"
        # Run the loop in the background and remember its PID.
        ~/scripts/twitcron.sh > /dev/null 2> /dev/null &
        echo $! > "$HOME/.twitterd.pid"
      else
        echo "Already running: "`cat $HOME/.twitterd.pid`
      fi
    else
      if [ "$1" == 'stop' ]
      then
        if [ -f "$HOME/.twitterd.pid" ]
        then
          echo "Stopping"
          pid=`cat $HOME/.twitterd.pid`
          kill $pid
          rm $HOME/.twitterd.pid
        else
          echo "Not running."
        fi
      fi
    fi
And it actually works! I've launched the process and now get those tweets live to the desktop and the terminal. It's great. 
Synching with Twitter
I'm trying to get my Drupal blog to submit every update to my Twitter account. Let's see if it works - I've definitely seen this in action among some Drupalers, so it shouldn't require any tweaking like the Drivel/Taxonomy thing did.
... update, some hours later... well, the Twitter part was easy; it was, predictably, the Drivel/Twitter part that was such horror.
Learning point: Drupal's Blog API has the problem that it implements the API of less powerful tools. MovableType doesn't even seem to have a free-tagging mechanism, let alone multiple vocabularies. The solution, really, is to make Drupal-specific XMLRPC functions. Drupal offers features that leave other blogging engines in the dust, and the only way to make sure all these features can be used by client applications is to set its own standard rather than only implementing other APIs.
