Protocol

Version 1 (DRAFT)

Goals

The goals of this protocol are:

  1. Provide semi-trusted parties with the means of peer-to-peer communication in group chats (channels)
  2. Efficient synchronization of channel messages between semi-trusted peers
  3. Confidentiality and integrity of messages. Only past and current channel participants can read the messages. Read and write access has limited timespan. Malicious parties cannot modify message content
  4. Eventual consistency. Messages must be viewable offline. New messages can be posted offline and distributed once remote peers are available.

Security Disclaimer

Current version of protocol DOES NOT have:

Notes

Here and below Sodium is used for all cryptography operations.

Protocol Buffers are used for encoding of all messages.

Inspired by DAT, channel identifiers are generated with:

channel_id = HASH(channel_pub_key, 'peerlinks-channel-id')[:32]

All SyncRequests are encrypted with crypto_secretbox_easy and crypto_secretbox_open_easy using:

symmetric_key = HASH(channel_pub_key, 'peerlinks-symmetric')[:crypto_secretbox_KEYBYTES]

and random nonce.

The protocol below is transport-agnostic in a sense that it could be run using any available transport: MultipeerConnectivity, https, …

Everywhere below “TBS” stands for To Be Signed and indicates the data that is signed by some secret key, and the signature is stored in signature field of next to tbs field.

Initialization sequence

The first message over any wire protocol MUST be:

message Hello {
  uint32 version = 1;
  string peer_id = 2;
}

NOTE: peer_id.length MUST be checked to be equal to 32 bytes.

The hello.version specifies the protocol version and MUST be checked by the recipient. In case of the mismatch and/or other errors Error SHOULD be sent:

message Error {
  string reason = 1;
}

and connection MUST be closed.

Once both parties have sent and received hellos the hello.peer_id and remote_hello.peer_id are converted to big integers local and remote respectively. If abs(local - remote) is less than 2 ** 255 - the peer with the least big integer id MUST send Shake. If abs(local - remote) is greater or equal to 2 ** 255 - the peer with the greatest integer id MUST send Shake:

message Shake {
  bool is_duplicate = 1;
}

Connection MUST be closed if Shake is received in violation of big integer procedure above. Peer sending Shake SHOULD take remote_hello.peer_id in account to prevent duplicate connections. shake.is_duplicate SHOULD be set to true only if the connection to remote peer is deemed to be duplicate by sender. In this case the connection SHOULD be closed after sending Shake.

NOTE: reason.length MUST be checked to be less than 1024 utf-8 codepoints.

Further communication between peers happens using:

message SyncRequest {
  message TBS {
    // Empty for Feeds
    repeated Link chain = 1;

    oneof content {
      Query query = 2;
      Bulk bulk = 3;
    }

    bytes response_pub_key = 4;
  }

  message Content {
    TBS tbs = 1;

    // crypto_sign_detached(signature, tbs, leafSecretKey)
    bytes signature = 2;
  }

  bytes channel_id = 1;
  uint32 seq = 2;

  // `crypto_secretbox_easy(out, Content, symmetric_key)`
  bytes nonce = 3;
  bytes box = 4;
}

message SyncResponse {
  message Content {
    oneof content {
      QueryResponse queryResponse = 1;
      BulkResponse bulkResponse = 2;
    }
  }

  bytes channel_id = 1;
  uint32 seq = 2;

  // Encrypted with `crypto_box_seal` using `sync_request.response_pub_key`
  // from `SyncRequest`
  bytes box = 3;
}

message Packet {
  oneof content {
    Error error = 1;
    EncryptedInvite invite = 2;

    // Synchronization
    SyncRequest sync_request = 3;
    SyncResponse sync_response = 4;

    // Request synchronization on new messages
    Notification notification = 5;
  }
}

Particular packet sub-types are described below.

(TODO(indutny): find a mechanism to deter peers from spamming each other. Rate limit does not work, because the peer cannot be identified consistently in MultipeerConnectivity)

Channels, messages, and identities

At this point the peers SHOULD issue synchronization queries, but this section is not possible to discuss without discussing what channels and messages are.

Each peer SHOULD have one or more identities. The identity is:

  1. Asymmetric key pair (sodium.sign.keyPair)
  2. A set of “chains” for each channel the identity is allowed to post to.

Message

Each identity comes with their own channel (just as twitter users have their own feed). The channel MUST have a root message and MAY have more messages of following format:

message ChannelMessage {
  // First message on any channel
  message Root {
  }

  message Body {
    oneof body {
      Root root = 1;
      string json = 2;
    }
  }

  message TBS {
    // NOTE: can be empty only in the root message
    repeated bytes parents = 1;

    // height = max(p.height for p in parents)
    int64 height = 2;

    // Link chain that leads from the channel's public key to the signer of
    // this message
    repeated Link chain = 3;

    // Floating point unix time
    double timestamp = 4;

    // body of the message
    Body body = 5;
  }

  TBS tbs = 1;

  // crypto_sign_detached(signature, tbs, leafSecretKey)
  bytes signature = 2;
}

Despite having many fields the aim of ChannelMessage is to form a Direct Acyclic Graph (DAG) similar to the commit graph in git. In other words, each message except for the root MUST have one or more parents. content.parents contains the hashes of these parents, and the hash of the message might be used as a parent for some future message. Thus messages form a directed graph with no cycles (merges are possible, but just as in git they are not cycles because of edge directions).

The hash of the message is computed as:

hash = HASH(Content)[:32]

Maximum text length of json in UTF-8 codepoints MUST be enforced and is:

content.body MUST be json for non-root messages.

content.height is a number of edges between the message and the channel.root. channel.root naturally MUST have height = 0, and in general any message MUST have:

height = max(p.height for p in parents) + 1

height is an essential property for synchronization (more on it later).

content.timestamp MUST be a valid Unix time since Jan 1st 1970 00:00:00 UTC, and MUST be greater or equal to the maximum timestamp of the message parents. content.timestamp MUST NOT be too far in the future. Particular implementation SHOULD decide on the value of this leeway (~2 minutes is recommended).

The validity of content.timestamp and the validity of links MUST be enforced through the following mechanism. Suppose that for a received message:

min_timestamp = min(p.content.timestamp for p in content.parents)
max_timestamp = max(p.content.timestamp for p in content.parents)

The message MUST NOT be accepted and should be treated as INVALID (peer SHOULD issue an Error protocol packet and MUST close the connection to such peer) if:

delta_time = max_timestamp - min_timestamp

is greater than 30 days. This mechanism ensures that the peers with expired links can at worst create messages in the past, and that after 30 days those messages will no longer need to be synchronized or persisted.

NOTE: It is still possible for peers with valid chain to re-introduce those messages from the past into the current branch of DAG if they are out-of-sync with other peers. However, the mechanism above makes it unlikely (especially with introductions of https relays.)

The subscribers of the channel MUST verify the messages against full DAG:

Merges

Whenever new message is posted by a participant it SHOULD:

  1. Take at most 128 current DAG leaves
  2. For content.timestamp use maximum of: current unix time, leaves’ timestamps. (NOTE: with the 2 minute leeway above, the maximum difference is bounded anyway)
  3. Remove those that differ from the maximum timestamp by more than 30 days. These leaves SHOULD be used as parents for the new message. Peers naturally merge different branches into a single new leaf.

The chain is a collection of signed links:

message Link {
  message TBS {
    bytes trustee_pub_key = 1;
    string trustee_display_name = 2;
    double valid_from = 3;
    double valid_to = 4;

    // NOTE: This MUST be filled either by sender/recipient before
    // generating/verifying the signature below.
    bytes channel_id = 5;
  }

  TBS tbs = 1;

  // crypto_sign_detached(signature, tbs, parentSecretKey)
  bytes signature = 2;
}

that establish trust in a manner similar to Public Key Infrastructure (PKI). Each successive link in a chain is signed by the private key of the previous link. The first link is signed by the root. The maximum length of the chain is 3. Peer MUST validate the length of the chain against the limit, and MUST NOT accept messages with longer chains. (This should encourage peers to form tighter groups, and have more control over participants. The number 3 MAY be revised in the future version of the protocol.)

trustee_display_name is assigned by the issuer of the link. Maximum length of this string is 128 UTF-8 codepoints and MUST be enforced. Minimum length is 1 and MUST be enforced. Note that each participant of the channel gets unique “display path” (array of display names) starting from the root.

The link.valid_from and link.valid_to are both Unix-times since Jan 1st 1970 00:00:00 UTC. The valid_to of the whole chain is the minimum of valid_to of all links. The valid_from is then the maximum of valid_from of all links. This mechanism ensures granularity of control over peers’ ability to write new messages. The validity of the chain MUST be checked against the content.timestamp (see constraints on content.timestamp above.)

NOTE: When issuing new link set valid_from a bit in the past to avoid issues with slightly out-of-sync time. 2 minutes in the past should be safe to use.

The message.signature MUST be generated using the private key that corresponds to the last public key in the chain, or the channel’s private key if the chain is empty.

Invite

It is easy to see that the write access to the channel MUST be checked by validating the chain. Peers that do not have valid chain can read the channel, but cannot write to it. The members of channel with write access can invite other peers to participate in this channel by first requesting from them in a form of scanned QR code (or by other means):

message InviteRequest {
  bytes peer_id = 1;
  bytes trustee_pub_key = 2;
  bytes box_pub_key = 3;
}

where trustee_pub_key is the invitee’s public key, and box_pub_key is a public part of result of crypto_box_keypair from Sodium (with the secret part being box_secret_key).

NOTE: requesting invite reveals public key and an associated channel.

Upon receiving InviteRequest the peer having write access to the channel MUST consider the invitation carefully and ONLY IN CASE of user confirmation issue an EncryptedInvite:

message EncryptedInvite {
  // NOTE: `request_id = HASH(req.trustee_pub_key, 'peerlinks-invite')[:32]`
  bytes request_id = 1;

  bytes box = 2;
}

NOTE: crypto_box_seal, crypto_box_seal_open from Sodium are used for encrypting/decrypting the box contents using box_secret_key.

NOTE: peer_id.length MUST be checked to be equal to 32 bytes. trustee_pub_key length MUST be checked.

When decrypted encrypted_invite.box becomes:

message Invite {
  bytes channel_pub_key = 1;
  string channel_name = 2;

  repeated Link chain = 4;
}

The channel_name is a suggested name for the channel and MUST have no more than 128 UTF-8 codepoints.

The invite.links MUST be a chain from channel_priv_key to the request.trustee_key.

Synchronization

Notes

All requests has to be encrypted with a symmetric key derived from the channel public key:

symmetric_key = HASH(channel_pub_key, 'peerlinks-symmetric')[:crypto_secretbox_KEYBYTES]

and wrapped into SyncRequest packet’s box and nonce fields using crypto_secretbox_easy. Additional fields are:

SyncResponse is constructred by encrypting sub-response using either sync_request.chain.leaf_key or sync_request.public_key depending on the type of the feed. The encryption/decyption methods are similar to ones used in Invite process (crypto_box_seal, etc).

When chain is present it MUST be checked to be valid using current time.

SyncRequest could be issued for the channel that is not known to the peer. In such case SyncResponse’s box MUST be empty. Issuer of SyncRequest MUST handle such responses as empty responses.

Details

This DAG would make little sense without synchronization.

NOTE: For all messages below and for each channel the first Query and/or Bulk messages SHOULD have seq = 0, and the each next Query or Bulk SHOULD increment it by 1 and wrap it as in unsigned integer addition in C. The QueryResponse and BulkResponse MUST contain the same seq value as in the Query and Bulk respectively that triggered them.

Unencrypted Query is sent in order to request the latest messages from the channel:

message Query {
  repeated Link chain = 1;

  oneof cursor {
    int64 height = 2;
    bytes hash = 3;
  }
  bool is_backward = 4;
  uint32 limit = 5;
}

The query.cursor MUST be either of height or hash.

If query.chain is valid (see Link section above) the remote peer MUST respond with:

message QueryResponse {
  message Abbreviated {
    repeated bytes parents = 1;
    bytes hash = 2;
  }

  repeated Abbreviated abbreviated_messages = 1;
  bytes forward_hash = 2;
  bytes backward_hash = 3;
}

The synchronization process is following:

  1. Send Query with cursor.height set to local min_leaf_height (see below)
  2. Recipient of Query with cursor.height computes the starting height as min(min_leaf_height, cursor.height)
  3. Recipient replies with a slice of the list of abbreviated messages (see CRDT Order below) starting from the first message of content.height == height and either up to the latest message or up to query.limit messages total
  4. The recipient walks over received messages sequentially, Bulk-requesting the messages with known parents and messages with parents in the QueryResponse
  5. The recipient SHOULD keep the list of unknown parents seen so far, and in case if the list does not overflow should proceed to loop below:
  6. If the count of unknown parents is non-zero - issue a query with cursor.hash = response.backward_hash and is_backward = true
  7. Once there are no more unknown parents - the recipient repeatedly issues a query with is_backward = false and cursor.hash = reponse.forward_hash
  8. The synchronization is complete when the response the response with forward_hash == nil is fully processed and has no missing parents.

Malicious nodes or DAGs with divergent branches may overflow the list of unknown parents. In this case, cursor.hash has to be set to root.hash, and is_backward should be set to false. In other words, the sync has to start from the very first message. This is called “Full Sync”.

Bulk fetch

During synchronization process above the recipient MUST request the messages that are not present in their dataset. This can be done with Bulk packet:

message Bulk {
  repeated Link chain = 1;

  repeated bytes hashes = 2;
}

If bulk.chain is valid (see Link section above) the remote peer MUST respond with as many messages as possible, but NOT LESS THAN one. The messages should be in the same order as in bulk.hashes, except for allowed omissions for those hashes that were not found in the datastore or the trailing hashes that are omitted due to some constraints. In the latter case forward_index MUST be set to the number of processes messages:

message BulkResponse {
  repeated ChannelMessage messages = 1;
  uint32 forward_index = 2;
}

The originator of Bulk SHOULD resume the fetch if response.forward_index is not equal to the number of bulk.hashes they sent.

CRDT Order

The messages in channel MUST be sorted by increasing height and then by increasing hash. Thus the list of messages per-channel becomes a CRDT list, and two fully synchronized peers MUST agree completely on the order of the messages.

(TODO(indutny): does 30 day delta mechanism help here?)

New messages

On either received or posted message the peer MUST send:

message Notification {
  bytes channel_id = 1;
}

to notify ALL connected peers that new data is available for the channel. The subscribed peers SHOULD attempt to do a synchronization.

Ideas

(Something that is not implemented in neither client nor protocol, but might be desired in the future)

  1. Backward and forward secrecy of messages. Perhaps KDF chain could be used at least?
  2. Partial sync up to the last available message signed by root