Clearskies Cryptography

Last week I introduced a project I'm working on called ClearSkies, an open-source, peer-to-peer replacement for cloud sync software. I have since finished the first draft of the protocol. While developing it I ended up changing the core cryptography three times, and had to rewrite a good portion of the spec each time. While this sounds painful, it's far better than rewriting an entire program. In this post I will walk through each method and its perceived strengths and weaknesses.

The specific part of cryptography involved is the key exchange.

Pre-shared key

My first solution is similar to that used by BitTorrent Sync (hereafter btsync). Each share has a key which, when given to another person, serves both to discover peers and to secure communication with them.

In my solution, a 128-bit key is generated. The SHA256 hash of the key is used as a share ID, which can then be used to find other people with the same key. The key itself is then used as an AES key to encrypt communication.

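To make it possible to share in read-only mode, the SHA256 of the read-write key becomes the read-only key, and the SHA256 of the read-only key is the share ID. Because the hash only runs one way, someone holding the read-only key cannot recover the read-write key. The keys need a unique prefix of some sort so that it is clear which is which.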
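The derivation chain above can be sketched in a few lines of Python. This is only an illustration of the idea, not the spec: the function names are made up for this example, and keeping the full 32-byte SHA256 digest as the read-only key (rather than truncating it) is an assumption.

```python
import hashlib
import secrets


def derive_keys(read_write_key: bytes) -> dict:
    """Derive the read-only key and share ID from a read-write key.

    Each step is a one-way SHA256 hash, so a holder of the read-only
    key can compute the share ID but cannot recover the read-write key.
    """
    read_only_key = hashlib.sha256(read_write_key).digest()
    share_id = hashlib.sha256(read_only_key).digest()
    return {
        "read_write": read_write_key,
        "read_only": read_only_key,
        "share_id": share_id,
    }


def share_id_from_read_only(read_only_key: bytes) -> bytes:
    """A read-only peer starts one step further down the same chain."""
    return hashlib.sha256(read_only_key).digest()


# A fresh 128-bit read-write key, as described in the post.
read_write_key = secrets.token_bytes(16)
keys = derive_keys(read_write_key)

# Both kinds of peer arrive at the same share ID for peer discovery.
same_share = share_id_from_read_only(keys["read_only"]) == keys["share_id"]
```

Since both kinds of peer compute the same share ID, they can find each other without either key being revealed during discovery.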

The key can be written in base32 to make it easy to share. Base32 carries 5 bits per character, so a 128-bit key ends up being 26 characters, plus an extra character to identify the type.
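Here is a rough sketch of what such an encoding might look like, using the base32 codec from the Python standard library. The prefix letters `W`, `R`, and `S` are hypothetical (the post only says that some prefix is needed), and dropping the `=` padding is also an assumption.

```python
import base64

# Hypothetical one-character type prefixes, chosen only for this example.
PREFIXES = {"read_write": "W", "read_only": "R", "share_id": "S"}


def encode_key(kind: str, key: bytes) -> str:
    """Return a type prefix followed by the unpadded base32 form of the key."""
    body = base64.b32encode(key).decode("ascii").rstrip("=")
    return PREFIXES[kind] + body


def decode_key(text: str) -> tuple:
    """Split off the type prefix and decode the base32 body back to bytes."""
    kinds = {prefix: kind for kind, prefix in PREFIXES.items()}
    kind = kinds[text[0]]
    body = text[1:].upper()
    # b32decode expects input padded with '=' to a multiple of 8 characters.
    body += "=" * (-len(body) % 8)
    return kind, base64.b32decode(body)
```

Usage would be along the lines of `encode_key("read_write", key)` when showing a key to the user and `decode_key(text)` when one is pasted in.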

The advantage of this method is that it is very simple. If the key is shared in person, it can be very secure, and is not vulnerable to a man-in-the-middle attack.

Conversely, a big disadvantage is that in practice a lot of keys would be sent via IM, SMS, and email, where they can be intercepted or sit in message archives long after they were sent.

Another disadvantage is that there's no way to tell whether a peer that claims to be a read-write peer actually is one. Read-write peers are the only peers authorized to make changes to files, but a malicious read-only peer could pretend to be a read-write peer and delete or change files. It could even replace files with a virus. A read-write peer could protect itself by ignoring file changes coming from read-only peers, but that means file changes can't propagate through a read-only peer.

Elliptic Curve

To combat the authenticity weakness, I reworked the protocol to use Elliptic Curve (EC) public key cryptography. To maintain the same level of security, the 128-bit key had to be a 256-bit key. This was used as an EC private key, which could be used as a pre-shared key for read-write shares. The private key could be used to sign updates, which would mean that read-only peers could no longer impersonate read-write peers.

The read-only key derivation was tricky. In order to maintain the same semantics as the pre-shared key, I used the public key of the read-write key as the read-only key. This meant that the "public" key wasn't actually safe to share publicly.

The advantage of this method was that it was possible to sign all updates.

There were quite a few disadvantages:

The base32 version of the key was quite long, at 52 characters for a 256-bit key
EC isn't as well supported by cryptography toolsets
Some cryptographers don't trust the standard NIST curves, since their parameters were generated by the NSA
Reusing the public key as the read-only key seemed too inventive

Access Codes

A much better way to do signatures would be with RSA public key cryptography. It is well understood and has stood the test of time. The problem is that RSA keys are HUGE. A 2048-bit key would take more than 400 characters as base32, far too long to read out or type in. To exchange RSA keys anyway, I decided to use a temporary access code to add a new peer to the share. Once a peer has connected with the access code, the AES and RSA keys can be exchanged and then the connection can be terminated.

This seems to be a good solution since it also reduces the keys-shared-via-email vulnerability. The access codes are by default both temporary and single-use, so an attack would need to happen in real time, before the intended recipient uses the code.

The access codes can also be short. While I settled on 128-bit, I believe that 64-bit would work equally well.
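To get a feel for why a shorter code might be enough, here is a back-of-the-envelope calculation. The guess rate and the code lifetime are made-up numbers for illustration only, and the function name is hypothetical; the point is just that an online attacker has to guess while the code is still valid.

```python
def guess_probability(bits: int, guesses_per_second: float, lifetime_seconds: float) -> float:
    """Chance that an online attacker hits one valid code before it expires.

    Assumes the code is uniformly random and every guess is distinct.
    """
    attempts = guesses_per_second * lifetime_seconds
    return min(1.0, attempts / 2 ** bits)


# Hypothetical attacker: one million guesses per second against a code
# that stays valid for a full day.
p64 = guess_probability(64, 1_000_000, 24 * 3600)
p128 = guess_probability(128, 1_000_000, 24 * 3600)
```

Under these assumptions the chance of guessing a 64-bit code comes out far below one in a million, which is why the extra bits of a 128-bit code buy little for a code that expires.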

By having a separate key exchange, the read-only and read-write keys can be completely independent.

Since some users will want to create shares with long-lived access codes (imagine a group of coworkers in an office), I added the ability to create long-lived and multi-use codes as well.
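The bookkeeping for access codes could look roughly like the sketch below. It is not taken from the spec: the `AccessCodes` class, its method names, and the one-hour default lifetime are all assumptions made for this example. It only tracks when a code expires and how many uses it has left, and leaves out the actual key exchange.

```python
import secrets
import time


class AccessCodes:
    """In-memory table of access codes with optional expiry and use limits."""

    def __init__(self):
        # code -> [expires_at or None, uses_left or None]
        self._codes = {}

    def create(self, ttl_seconds=3600.0, max_uses=1, bits=128, now=None):
        """Create a code. ttl_seconds=None or max_uses=None means unlimited."""
        now = time.time() if now is None else now
        code = secrets.token_hex(bits // 8)
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._codes[code] = [expires_at, max_uses]
        return code

    def redeem(self, code, now=None):
        """Return True if the code is valid, consuming one use if limited."""
        now = time.time() if now is None else now
        entry = self._codes.get(code)
        if entry is None:
            return False
        expires_at, uses_left = entry
        if expires_at is not None and now >= expires_at:
            del self._codes[code]  # expired codes are forgotten
            return False
        if uses_left is not None:
            if uses_left <= 1:
                del self._codes[code]  # last permitted use
            else:
                entry[1] = uses_left - 1
        return True
```

With the defaults, `create()` gives a temporary single-use code, while something like `create(ttl_seconds=None, max_uses=None)` would give the long-lived, multi-use kind for the office case.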

The biggest disadvantage of this method is that it adds more complexity to the protocol. Whether that matters in practice remains to be seen.

Steven Jewel Blog
