
Back in 2006 the GNU Telephony project delivered the first interoperable free software (free as in freedom) reference implementation of Phil Zimmermann's ZRTP stack, largely thanks to the work of Werner Dittmann and Federico Pouzols; it was first used in the Twinkle softphone. ZRTP relies on "social key verification", so its benefits are limited to real-time VoIP applications where the parties can actually perform that verification. I had thought about creating a more general-purpose solution that offers the benefits of ZRTP, including unique per-session crypto keys that nobody else knows. Such keys protect both the operators, who cannot be tortured into revealing what they do not have, and past sessions, should one of the communicating nodes later be compromised. I have come up with a design for a new kind of stack that meets these objectives, and I would be happy to see it introduced to users in Ubuntu Narwhal.
For some background, the premise of Phil Zimmermann’s ZRTP system is that each party generates a (largely) random per-session private and public key. The public keys are then exchanged during session setup, and a unique hash is generated at each end. This hash can only be the same if there is no man in the middle substituting keys, for they cannot generate a valid hash match on the public keys without knowing the private keys. These hashes are then “socially verified” directly by the users reading their hashes to each other over a VoIP call.
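As a rough illustration of why substituted keys show up as a hash mismatch, here is a minimal Python sketch. It is not ZRTP's actual key agreement or its hash construction: the group parameters `P` and `G`, the helpers `new_keypair` and `session_hash`, and the use of SHA-256 are all assumptions made so the example runs with only the standard library.

```python
import hashlib
import secrets

# Toy group parameters (assumption): a Mersenne prime modulus and a small
# generator, chosen only so the sketch needs nothing beyond the stdlib.
P = 2**521 - 1
G = 3
KEY_BYTES = 66  # 521 bits rounded up to whole bytes


def new_keypair():
    """Generate a fresh per-session private/public key pair."""
    priv = secrets.randbelow(P - 3) + 2  # private value in [2, P - 2]
    pub = pow(G, priv, P)
    return priv, pub


def session_hash(pub_a, pub_b):
    """Hash both public keys; sorting makes the result order-independent."""
    h = hashlib.sha256()
    for pub in sorted((pub_a, pub_b)):
        h.update(pub.to_bytes(KEY_BYTES, "big"))
    return h.hexdigest()


# Honest exchange: both ends saw the same two public keys.
_, alice_pub = new_keypair()
_, bob_pub = new_keypair()
print("honest match:", session_hash(alice_pub, bob_pub) == session_hash(bob_pub, alice_pub))

# Man in the middle: each end actually received Mallory's key instead.
_, mallory_pub = new_keypair()
alice_view = session_hash(alice_pub, mallory_pub)
bob_view = session_hash(mallory_pub, bob_pub)
print("intercepted match:", alice_view == bob_view)
```

In ZRTP the users compare a short form of such a value by voice; the sketch only shows that the two ends compute different values once a key has been swapped.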
The approach I am taking is to exchange the hashes of the per-session public keys directly between the endpoints, with each hash carrying a digital signature that guarantees it is valid. Each end can then compare the hash it generates locally with the signed hash it receives. The hashes by themselves reveal nothing about the actual keys used, and the signature assures that they arrive unaltered, so transmitting them in the clear does not compromise the integrity of the system. Hence, as in GPG itself, the signing keys for the hashes can be static and can be used to verify users and sessions through a web of trust.
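The comparison step can be sketched in Python as follows. This is only an illustration under loud assumptions, not the design's actual wire format or algorithms: the signature is textbook RSA over a SHA-256 digest with a modulus built from two publicly known Mersenne primes, which is insecure and used here purely so the example runs with the standard library (Python 3.8 or later). The names `session_hash`, `sign` and `peer_verified` are hypothetical.

```python
import hashlib

# Toy static signing key (assumption, NOT secure): both primes are public
# Mersenne primes and no padding is used. Illustration only.
_P, _Q = 2**521 - 1, 2**607 - 1
N = _P * _Q                              # public modulus
E = 65537                                # public exponent
D = pow(E, -1, (_P - 1) * (_Q - 1))      # private exponent (Python 3.8+)


def session_hash(pub_x: bytes, pub_y: bytes) -> bytes:
    """Hash the two per-session public keys, independent of their order."""
    h = hashlib.sha256()
    for pub in sorted((pub_x, pub_y)):
        h.update(len(pub).to_bytes(2, "big"))  # length prefix avoids ambiguity
        h.update(pub)
    return h.digest()


def sign(digest: bytes) -> int:
    """Sign a digest with the static private key."""
    return pow(int.from_bytes(digest, "big"), D, N)


def peer_verified(local_pub: bytes, remote_pub_seen: bytes,
                  peer_hash: bytes, signature: int) -> bool:
    """Accept only if the signature is valid AND the hashes agree."""
    signature_ok = pow(signature, E, N) == int.from_bytes(peer_hash, "big")
    hashes_agree = peer_hash == session_hash(local_pub, remote_pub_seen)
    return signature_ok and hashes_agree


# Stand-ins for per-session public keys (fixed bytes keep the demo simple).
alice_pub, bob_pub, mallory_pub = b"A" * 32, b"B" * 32, b"M" * 32

# Honest session: Bob signs the hash of the keys he saw; Alice checks it.
bob_hash = session_hash(bob_pub, alice_pub)
print("honest:", peer_verified(alice_pub, bob_pub, bob_hash, sign(bob_hash)))

# Intercepted session: both ends saw Mallory's key, so the hashes disagree.
bob_hash = session_hash(bob_pub, mallory_pub)
print("intercepted:", peer_verified(alice_pub, mallory_pub, bob_hash, sign(bob_hash)))
```

A real implementation would use an established signature scheme with keys tied to identities through the web of trust, and both ends would sign and verify; the sketch shows only one direction.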
To actually break this kind of system, one has to have access to the private signing key on the node and also be sitting in the middle at the right time. If one has physical access to such a node, or has otherwise directly compromised it, then one need not go through such a complex process to compromise that user’s remaining communication anyway. In any case, the security of uncompromised nodes remains intact.
I feel this methodology works far better for the kinds of secure exchanges that SSL is traditionally used for today, as well as for peer-to-peer cryptographic applications and realtime sessions. Since each session uses a random and unique key set, nothing learned later can be used to decrypt past sessions, just as with ZRTP, even if the node one is currently communicating with has become compromised. Like ZRTP, this also remains a zero-knowledge system in the sense that the operator of such services has no knowledge of the actual keys being used, and hence cannot be tortured to provide information he does not have.
Since the process of verification can be automated, it can be used for protecting things like email exchange (SMTP, IMAP, etc.), VPNs, and so on. In fact, it should be usable for anything SSL is used for. Verification need not be done “in-session”; in realtime communication sessions it can also be done entirely separately, with ZRTP-style social key verification, if that is desired. Signed hashes of past sessions can even be stored somewhere that is separately accessible through alternate means for later verification, since again neither the hashes nor their signatures reveal anything about the actual keys used in past sessions.
Once a reference implementation of the core stack is completed, it should be interesting to see what kinds of applications it might better enable that must communicate securely on the public Internet.