From b9e8dbcf2afac5946cfdd2810fede419b78065f3 Mon Sep 17 00:00:00 2001
From: xdrm-brackets <xdrm.brackets.dev@gmail.com>
Date: Sun, 29 Jul 2018 11:57:27 +0200
Subject: [PATCH] add PROTOCOL.md [WIP]

---
 PROTOCOL.md | 225 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 225 insertions(+)
 create mode 100644 PROTOCOL.md
diff --git a/PROTOCOL.md b/PROTOCOL.md
new file mode 100644
index 0000000..79baf97
--- /dev/null
+++ b/PROTOCOL.md
@@ -0,0 +1,225 @@
+# Stateless Time Scrambling Protocol
+
+
+
+## Motivation
+
+After designing some APIs, I found that at some point you need a *token* system where the token is a fixed-length string. An easy MITM attack is to replay a previously sent request while the token is still valid or, in worse conditions, to capture the client *token* and build a malicious request within its authenticated session.
+
+In the rest of this document we assume that whatever travels through the network is public, so any MITM can store it. I highly recommend using TLS for communication, but the goal here is a token system that is consistent on its own; TLS can only make it stronger.
+
+A good solution could be to use a one-time token, where the server sends back a new token with each response. This means that the token travels through the network, which we consider public, before being used. In addition, we would like the server to create a secret key only once to speed up the authentication system, because it could manage millions of clients.
+
+A better solution would be to keep a private key and wrap it in a one-time system that generates a public token for each request. It would prevent attackers from replaying our requests or guessing the private key from the token. Short-lived one-time passwords also provide a mechanism that we could use to build a time-dependent system.
+
+**What we need**
+1. Generate a *public token* for each request from a fixed *private key*.
+2. The *public token* must never be the same twice.
+3. Each *public token* must only be valid for a few seconds after it is sent.
+4. Each *public token* must give no clue that could help guess the next token.
+5. The *server* must not have to share a private key with each *client*.
+
+**Technology requirements**
+1. Mixing 2 hashes in a way that, without one of them, the other is *cryptographically impossible* to guess (*i.e. [one-time pad](https://en.wikipedia.org/wiki/One-time_pad)*).
+2. Having a time-dependent unique hash that can only be recomputed within a few seconds after it is sent (as for *[TOTP](https://tools.ietf.org/html/rfc6238)*).
+3. A cryptographic hash function that, from an input of any length, outputs a fixed-length digest from which it is *computationally infeasible* to recover the input.
+
+**Protocols to define**
+
+This document defines and bundles 2 distinct protocols to build a token system that satisfies the previous statements.
+
+1. a <u>Stateless Time Scrambling Protocol</u> to take care of the request's expiration over time
+2. a <u>Stateless Cyclic Hash Algorithm</u> to generate several public keys from a single secret key such that published keys give no clue about the secret key.
+
+## General knowledge & Notations
+
+##### Notation
+
+| Symbols | Description |
+|:-----:|:----------|
+| $\parallel a\parallel $ | The absolute value of $a$ ; *e.g.* $\parallel a \parallel = \parallel -a \parallel$ |
+| $\mid a \mid$ | The integer value of $a$ ; *e.g.* $\mid 12.34 \mid = 12$ |
+| $a \oplus b$ | The bitwise operation XOR between binary words $a$ and $b$ |
+| $h(m)$ | The digest of the message $m$ by a consistent cryptographic hashing function $h()$ ; *e.g. sha512* |
+| $h^n(m)$ | The digest of the $n$-recursive hashing function $h()$ with the input data $m$ ; *e.g. $h^2(m) \equiv h(h(m))$ , $h^1(m) \equiv h(m)$ and $h^0(m) \equiv m$*. |
+| $a \mod b$ | The result of $a$ modulo $b$ ; *i.e.* the remainder of the Euclidean division of $a$ by $b$ |
+| $T_{now}$ | The current *Unix Timestamp* in seconds |
+
+
+
+##### Entities
+
+- A machine $C$ (*typically a client*)
+- A machine $S$ (*typically a server*)
+
+##### Common variables
+
+These variables are shared by the server and the clients. They are set by the server, so each client must match them.
+
+| Notation | Name | Description |
+|:--------:|:----:|:------------|
+| $W$ | time window divider | A fixed number of seconds that is typically the maximum transmission time from end to end. |
+
+##### Client variables
+
+These variables define the state of each client, each client having different values.
+
+| Notation | Name | Description |
+|:--------:|:----:|:------------|
+| $K$ | shifting key | The client private key. It is only known by the client and must be large enough not to be brute-forced. |
+| $ n_0 $ | shifting nonce | A private number that is decremented for each request. It is unique to each private key $K$. Before $n_0$ reaches 0, a new key $K$ must be generated and $n_0$ is reset to its highest value. |
+| $s$ | next request order | The next request order is a number that, depending on its value, changes the client's behavior : <br>- $0$ : normal request<br>- $1$ : new key generated<br>- $2$ : rescue mode (resynchronize with the server) |
+
+##### Server variables
+
+| Notation | Name | Description |
+|:--------:|:----:|:------------|
+| $H$ | last valid hash | The server stores the last valid hash from the client to check the next one. |
+
+When a client sends its token $h^{n_0}(K)$ and the token is valid, the server stores it in $H$.
+
+> Note that for the first synchronization, the server has to "blindly" consider the token as valid.
+
+When the client sends its next token $h^{n_0-1}(K)$, the server has to <u>hash</u> it and compare the result with the last token $H$.
+
+>  $h(h^{n_0-1}(K)) = h^{n_0}(K)$
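+
+As a worked illustration, assume (hypothetically) that the last accepted token was $h^3(K)$, so the server holds $H = h^3(K)$. The value $3$ is chosen only for readability:
+
+1. $C$ sends $h^2(K)$; $S$ computes $h(h^2(K)) = h^3(K) = H$, accepts, and stores $H = h^2(K)$.
+2. $C$ sends $h^1(K)$; $S$ computes $h(h^1(K)) = h^2(K) = H$, accepts, and stores $H = h^1(K)$.
+
+An observer who captured $h^2(K)$ would need to invert $h$ to obtain the next token $h^1(K)$.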
+
+## Description of the problem
+
+$C$ wants to send a token that will only be valid one time and within a fixed time window.
+
+> *Note: This document only gives a solution for the time-dependent feature, the one-time aspect is wrapped into the implementation of the $f()$ function. If you only need the time-dependent feature, you can set $f(x, n^1)$ to always return $x$ so the key $K$ will be only protected by the time protection algorithm.*
+
+##### Constraints
+
+- $S$ must be able to recover the token if the data is received within the time window.
+- If the window expired, the token must be invalidated by the server.
+
+##### Limitations
+
+- If an arbitrary attacker intercepts and blocks a request from $C$ to $S$, then sends it afterwards while the time window is still open, it will be authenticated. This case is equivalent to being $C$ (with all secret variables), a situation that TLS is meant to prevent. Notice that the attacker cannot extract anything from the token anyway.
+- With request metadata (*e.g. HTTP headers containing the date*), an attacker knowing $W$ can forge the time hash $h_{n_C}$ and recover the one-time token $T_C$ with a simple *XOR* on the public token. Because the cyclic-hash algorithm generates a unique pseudo-random token from $K$ for each request, this does not give the attacker any clue about the next token to be sent.
+
+
+
+## Protocol
+
+Each request and response will hold a <u>pair of tokens</u>.
+
+### 1. Client request
+
+This is the default case, where $n_0$ is far from $1$, so no key generation is needed.
+
+
+| Step | Description | Formula |
+|:----:|-------------|:--------|
+| `c1` | Decrement the shifting nonce | $n_0 = n_0 - 1$ |
+| `c2` | Calculate the one-time token $T_C$ | $T_C = h^{n_0}(K)$ |
+| `c3` | Get the current window id $n_C$ | $n_C =\ \mid \frac{T_{now}}{W} \mid$ |
+| `c4` | Calculate $m_C$, the parity of $n_C$ | $m_C = n_C \mod 2$ |
+| `c5` | Calculate the time hash $h_{n_C}$ | $h_{n_C} = h(n_C)$ |
+| `c6` | Calculate $T_{req}$, the scrambled *request token* | $T_{req} = T_C \oplus h_{n_C}$ |
+
+**Steps explanation**
+
+- `c3` - The window id is the index of the current time slice, where slices are $W$ seconds wide. By dividing time into slices of $W$ seconds, if we run the same calculation again $W$ seconds or less later, we get either the same result or a result greater by 1.
+- `c4` - The window id parity $m_C$ allows $S$ to adjust the value of $n_S$ it computes when it receives the request. This possible difference of 1 window comes from dividing time into slices: the precision is also divided by $W$. For a request received $W-1$ seconds after it was built:
+  - $T_{now}\mod W = 0 \implies \mid \frac{T_{now}+(W-1)}{W} \mid = \mid \frac{T_{now}}{W} \mid$; no need for adjustment
+  - $T_{now}\mod W = 1 \implies \mid \frac{T_{now}+(W-1)}{W} \mid = \mid \frac{T_{now}}{W} \mid + 1$; need to subtract $1$
+  - $T_{now}\mod W = 2 \implies \mid \frac{T_{now}+(W-1)}{W} \mid = \mid \frac{T_{now}}{W} \mid + 1$; need to subtract $1$
+  - $...$
+  - $T_{now}\mod W = (W-1) \implies \mid \frac{T_{now}+(W-1)}{W} \mid = \mid \frac{T_{now}}{W} \mid + 1$; need to subtract $1$
+- `c5` - $h(n_C)$ makes $h_{n_C}$ exactly $L$ bits long (the digest size of $h$) and prevents $n_C$ from being read directly from $h_{n_C}$.
+- `c6` - We apply a one-time pad between $T_C$ and $h_{n_C}$, so it is crucial that both values have the same size of $L$ bits. It makes $T_C$ impossible to extract without the value $h_{n_C}$, and this property applies both ways.
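+
+As a worked illustration of the window id and parity computation, assume (hypothetically) $W = 10$ and $T_{now} = 127$; these values are chosen only for readability and are not part of the protocol:
+
+$n_C = \mid \frac{127}{10} \mid = \mid 12.7 \mid = 12$
+
+$m_C = 12 \mod 2 = 0$
+
+The client then sends $T_{req} = h^{n_0}(K) \oplus h(12)$ together with $m_C = 0$. Any request built between $T_{now} = 120$ and $T_{now} = 129$ would use the same time hash $h(12)$.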
+
+
+
+
+**Short formulas**
+
+| Field to send | Short formula                                    |
+| :-----------: | ------------------------------------------------ |
+|   $T_{req}$   | $h^{n_0}(K) \oplus h(\mid\frac{T_{now}}{W}\mid)$ |
+|     $m_C$     | $\mid\frac{T_{now}}{W}\mid \mod 2$               |
+
+> Note
+>
+> - In order to send all the data in one request, for instance you can simply concatenate the 2 variables.
+
+
+
+### 2. Server check
+
+<u>Received data</u>
+
+- $T_{req}$ the received request token
+- $m_C$ the received time id parity
+
+| Step | Description | Formula |
+|:----:|-------------|---------|
+| `s1` | Store the reception time window id $n'$ | $n' = \mid \frac{T_{now}}{W}\mid$ |
+| `s2` | Calculate $m_S$, the parity of $n'$ | $m_S = n' \mod 2$ |
+| `s3` | Use $m_C$ to try to correct the reception window id and guess the request time id | $n_S = n' - \parallel m_C - m_S \parallel$ |
+| `s4` | Calculate the time hash $h_{n_S}$ | $h_{n_S} = h(n_S)$ |
+| `s5` | Cancel $h_{n_S}$ to extract $T_{C'}$ | $T_{C'} = T_{req} \oplus h_{n_S}$ |
+| `s6` | Check if $T_{C'}$ matches $T_S$ | $T_{C'} = T_S ?$ |
+
+> If $T_{C'} = T_S$, $S$ can consider that $C$ sent the request less than $2W$ seconds ago (in the current time window or in the previous one).
+
+
+
+**Steps explanation**
+
+- `s1` - If $C$ and $S$ have the same values for $K$ and $n^1$, $f(K,n^1)$ must result in the same output; in other words $T_C=T_S$.
+
+- `s3` - $\parallel m_C - m_S\parallel$ is the absolute difference between $m_C$ and $m_S$.
+
+  - If the receiver time window id ($n'$) is the same as the sender's ($n_C$)
+
+    $n'=n_C \implies m_S=m_C$
+
+    $m_C=m_S \implies \parallel m_C - m_S\parallel = 0$
+
+    $n_S = n_C - 0$
+
+    $n_S=n_C$, the time ids are the same, $S$ can now unscramble the request to check the token
+
+  - If the receiver time window is ahead of the sender's by $1$
+
+    $n'=n_C+1 \implies \parallel m_C - m_S \parallel = 1$
+
+    $n_S = n_C + 1 - 1$
+
+    $n_S = n_C$, the time ids are the same, $S$ can now unscramble the request to check the token
+
+  - If the receiver time window is ahead of the sender's by $2$ or more, let $k \in \N$
+
+    $n'= n_C+2+k \implies \parallel m_C - m_S\parallel \in \{0,1\}$
+
+    $- \parallel m_C - m_S\parallel \in \{-1,0\}$
+
+    $n_S \in \{n_C + 2 + k - 1, n_C + 2 + k + 0\}$
+
+    $n_S \in \{n_C + k + 1, n_C + k + 2\}$
+
+    $\rarr n_S = n_C + k + 1 \implies \forall (k\in \N), n_S \gt n_C$, the time ids differ, $S$ cannot extract $T_C$
+
+    $\rarr n_S = n_C + k + 2 \implies \forall (k \in \N), n_S \gt n_C+1$, the time ids differ, $S$ cannot extract $T_C$
+
+- `s5` - Because the *XOR* operator is its own inverse (*i.e. $a\oplus b \oplus a = b$*), *commutative* and *associative*, considering $h_{n_S}=h_{n_C}$:
+
+  $T_{C'} = T_{req} \oplus h_{n_S} = (T_C \oplus h_{n_C}) \oplus h_{n_S}$
+
+  $h_{n_S} = h_{n_C} \implies T_{C'} = T_C \oplus h_{n_C} \oplus h_{n_C}$
+
+  $T_{C'} = T_C$, the one-time token of $C$ has successfully been extracted
+
+- `s6` - If $K$ and $n^1$ are the same on both machines, $T_S=T_C$. Furthermore, if the time ids are the same (*c.f. step `s3`*), the 2 tokens should match.
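+
+As a worked illustration of the window id correction, assume (hypothetically) $W = 10$ and a request built at $T_{now} = 127$, so that $n_C = 12$ and $m_C = 0$. These values are chosen only for readability. Depending on when $S$ receives the request:
+
+| Reception $T_{now}$ | $n'$ | $m_S$ | $\parallel m_C - m_S \parallel$ | $n_S$ | Result |
+|:---:|:---:|:---:|:---:|:---:|:---|
+| $128$ | $12$ | $0$ | $0$ | $12$ | $n_S = n_C$, the token can be unscrambled |
+| $131$ | $13$ | $1$ | $1$ | $12$ | $n_S = n_C$, the token can be unscrambled |
+| $145$ | $14$ | $0$ | $0$ | $14$ | $n_S \neq n_C$, the request is rejected |
+| $152$ | $15$ | $1$ | $1$ | $14$ | $n_S \neq n_C$, the request is rejected |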
+
+
+
+**Short formulas**
+
+The whole unscrambling process can be shortened into the following formula resulting in $0$ if the client is authenticated.
+
+$T_{req} \oplus h(\mid \frac{T_{now}}{W}\mid - \parallel m_C - (\mid \frac{T_{now}}{W}\mid \mod 2) \parallel) \oplus f(K, n^1)$