
1 Introduction

The vast majority of symmetric cryptographic schemes are built upon a pseudorandom permutation, such as AES [21]. Such a permutation takes as input a key and bijectively transforms its input to an output in such a way that the function is hard to distinguish from random if an attacker has no knowledge of the key. The approach is natural: both the design and the analysis of pseudorandom permutations have received ample research attention. Yet, in many encryption modes [3], message authentication codes [4, 10, 18, 65], authenticated encryption schemes [24, 29, 49] and other applications of pseudorandom permutations [45], the underlying primitive is only used in the forward direction. Here, one does not make use of the invertibility of the permutation; even stronger, the fact that one uses a pseudorandom permutation instead of a pseudorandom function comes at a security penalty.

A prominent example of this is the Wegman-Carter nonce-based message authentication code from 1981  [18, 65]:

$$\begin{aligned} \mathsf {WC}^{F,H}(\nu ,m) = F(\nu ) \oplus H(m)\,, \end{aligned}$$

where F is a pseudorandom function transforming a nonce \(\nu \) to an n-bit output and H is a universal hash function transforming an arbitrary-length message m to an n-bit output. Provided that F and H are sufficiently strong and the nonce is never repeated, this construction is known to achieve \(2^n\) security [65]. However, given the thorough understanding of pseudorandom permutation design, Shoup suggested instantiating Wegman-Carter with a pseudorandom permutation P, leading to a construction now known as the Wegman-Carter-Shoup construction [61]:

$$\begin{aligned} \mathsf {WCS}^{P,H}(\nu ,m) = P(\nu ) \oplus H(m)\,. \end{aligned}$$

This construction, however, is known to achieve only approximately \(2^{n/2}\) birthday bound security in the size of P [11, 47, 61]. This bound may be fine for sufficiently large pseudorandom permutations like the AES, but with the use of legacy ciphers and with the rise of lightweight pseudorandom permutations [1, 2, 15, 16, 23, 28, 35, 43, 60, 66], whose widths can go down to 64 or even 32 bits, birthday attacks are a practical threat, as recently demonstrated by McGrew [48] and Bhargavan and Leurent [12].
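As a toy illustration, the two constructions can be sketched side by side; the only difference is whether the nonce goes through a random function or a random permutation. The 8-bit block size, the seeded tables, and the polynomial hash below are illustrative assumptions, not the real GHASH or AES.

```python
import random

N = 8                                   # toy block size; real schemes use n = 64 or 128
rng = random.Random(1)

# Stand-ins for the keyed primitives: a secret random function F and permutation P.
F = [rng.randrange(2 ** N) for _ in range(2 ** N)]
P = list(range(2 ** N))
rng.shuffle(P)

def H(hkey, msg):
    """Toy polynomial-style hash of a block sequence into N bits (illustrative only)."""
    acc = 0
    for block in msg:
        acc = (acc * hkey + block) % (2 ** N)
    return acc

def wc(nonce, msg, hkey):
    """Wegman-Carter: F(nu) xor H(m)."""
    return F[nonce] ^ H(hkey, msg)

def wcs(nonce, msg, hkey):
    """Wegman-Carter-Shoup: P(nu) xor H(m)."""
    return P[nonce] ^ H(hkey, msg)
```

Because P is injective, WCS tags for a fixed message never collide across nonces, a structural deviation from randomness of the kind that birthday attacks exploit.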

This and comparable examples (e.g., counter mode encryption [3] and GCM authenticated encryption [49]) showcase the value and need for pseudorandom functions. Unfortunately, we have little understanding of how to design dedicated pseudorandom functions, the only two notable exceptions to date being SURF [9] and AES-PRF [52] (see also Sect. 1.1.3). With respect to generic constructions, the well-established PRP-PRF switch dictates that a pseudorandom permutation behaves like a pseudorandom function up to the birthday bound \(2^{n/2}\), where n is the primitive width [6, 8, 19, 25, 34, 36]. This switch allows one to obtain a pseudorandom function by simply taking a pseudorandom permutation, yet it incurs a loss in the security bound that is comparable to the loss in moving from Wegman-Carter to Wegman-Carter-Shoup.

1.1 Beyond Birthday Bound PRP-to-PRF Conversion

Various methods to transform a PRP into a PRF have been proposed that achieve security beyond the birthday bound on the block size of the underlying primitive. This work will mostly be concerned with two of them: the sum of permutations and truncation.

1.1.1 Sum of Permutations

The sum of two independent n-bit permutations \(P_1,P_2\),

$$\begin{aligned} \mathsf {SoP}^{P_1,P_2}(x) = P_1(x) \oplus P_2(x)\,, \end{aligned}$$
(1)

was first introduced by Bellare et al. [7]. Shortly after this introduction, Lucks [46] proved around \(2^{2n/3}\) security and Bellare and Impagliazzo [5] around \(2^n/n\) security. An intensive line of research by Patarin [56,57,58] yielded optimal \(2^n\) security, up to a constant, following the mirror theory. Dai et al. [22] proved around \(2^n\) security using their rather compact and elegant chi-squared method.
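With toy 8-bit permutations standing in for the real primitives, \(\mathsf {SoP}\) of (1) can be sketched as follows; the sketch also exhibits one structural property: over the full domain the outputs XOR to zero (each permutation's outputs XOR to zero for \(n\geqslant 2\)), whereas a random function satisfies this only with probability \(2^{-n}\).

```python
import random
from functools import reduce

N = 8                                       # toy block size
rng = random.Random(2)
P1 = list(range(2 ** N)); rng.shuffle(P1)   # two independent random permutations
P2 = list(range(2 ** N)); rng.shuffle(P2)

def sop(x):
    """Sum of two independent permutations: P1(x) xor P2(x)."""
    return P1[x] ^ P2[x]

# Full-codebook XOR of all outputs; provably 0 for the sum of two permutations.
full_xor = reduce(lambda s, x: s ^ sop(x), range(2 ** N), 0)
```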

The two independent permutations can be simulated using a single one through domain separation  [5, 46]:

$$\begin{aligned} \mathsf {SoSP}^{P}(x) = P(x\Vert 0) \oplus P(x\Vert 1)\,. \end{aligned}$$
(2)

The scheme achieves a similar level of security as \(\mathsf {SoP}\)  [22, 57].
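The domain-separated variant (2) halves the input space of the toy permutation; the bit packing below is an illustrative assumption. Injectivity of P forces the output to be nonzero, a property a random function lacks.

```python
import random

N = 8                                     # toy block size
rng = random.Random(3)
P = list(range(2 ** N)); rng.shuffle(P)   # a single random permutation

def sosp(x):
    """Sum of a single permutation with domain separation: P(x||0) xor P(x||1)."""
    # x is an (N-1)-bit input; the last permutation bit is the separation bit
    return P[(x << 1) | 0] ^ P[(x << 1) | 1]
```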

A generalization worth describing is the CENC construction of Iwata  [37]. CENC offers a tradeoff between counter mode and the sum of permutations. It is determined by a parameter \(w\geqslant 1\) and uses \(P(x\Vert 0)\) to mask w subsequent blocks \(P(x\Vert 1),\ldots ,P(x\Vert w)\). Iwata proved \(2^{2n/3}\) security  [37]. Iwata et al.  [38] argued that, in fact, optimal \(2^n\) security of CENC directly follows from Patarin’s mirror theory. Bhattacharya and Nandi  [14] re-confirmed this bound using the chi-squared method.
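One CENC chunk can be sketched under the same toy conventions; the counter encoding and names are illustrative assumptions, not Iwata's exact specification. Each chunk spends \(w+1\) permutation calls to produce w keystream blocks, and for \(w=1\) it degenerates to a single sum-of-permutation evaluation.

```python
import random

N = 8          # toy block size
W = 3          # CENC width parameter w: w keystream blocks per w + 1 permutation calls
rng = random.Random(4)
P = list(range(2 ** N)); rng.shuffle(P)

def cenc_chunk(j):
    """One CENC chunk: the mask P(j||0) applied to the w blocks P(j||1), ..., P(j||w)."""
    base = j * (W + 1)                        # counters j*(w+1), ..., j*(w+1)+w feed P
    mask = P[base]                            # the masking call
    return [mask ^ P[base + i] for i in range(1, W + 1)]
```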

1.1.2 Truncation

The idea of truncation consists of simply discarding part of the output of an n-bit permutation P:

$$\begin{aligned} \mathsf {Trunc}_a^P(x) = \mathsf {left}_a(P(x))\,, \end{aligned}$$
(3)

where \(0\leqslant a\leqslant n\). The idea dates back to Hall et al. [34], who proved \(2^{n-a/2}\) security for a specific range of the parameter a. Bellare and Impagliazzo [5] and Gilboa and Gueron [26] extended the proof to tight \(2^{n-a/2}\) security for all parameter choices. The first documented solution to the problem, however, dates back to 1978, when Stam [62] derived it in a non-cryptographic context. (See also Gilboa et al. [27].) Bhattacharya and Nandi [13] recently translated Stam's analysis to the chi-squared method and derived the identical \(2^{n-a/2}\) bound. Mennink [50] considered a general treatment of truncation with pre- and post-processing and related the generalized scheme to historical results of Stam from 1986 [63].
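Truncation is easily sketched; the toy parameters below are illustrative. The testable property is that, over the full domain, a truncated permutation hits every a-bit value exactly \(2^{n-a}\) times; this perfect balance is the non-randomness that the \(2^{n-a/2}\) bound captures.

```python
import random
from collections import Counter

N, A = 8, 3                               # keep the A leftmost of N output bits
rng = random.Random(5)
P = list(range(2 ** N)); rng.shuffle(P)

def trunc(x):
    """Trunc_a of (3): the a leftmost bits of P(x)."""
    return P[x] >> (N - A)

# Over the full domain, every A-bit value appears exactly 2^(N-A) times.
counts = Counter(trunc(x) for x in range(2 ** N))
```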

1.1.3 Other Approaches

We briefly elaborate on two more recent approaches to beyond birthday bound secure PRP-to-PRF conversion. Cogliati and Seurin [20] introduced Encrypted Davies-Meyer:

$$\begin{aligned} \mathsf {EDM}^{P_1,P_2}(x) = P_2(P_1(x) \oplus x)\,, \end{aligned}$$
(4)

where \(P_1\) and \(P_2\) are two n-bit permutations. They proved security up to around \(2^{2n/3}\). Dai et al.  [22] proved security of the construction up to around \(2^{3n/4}\) using the chi-squared method and Mennink and Neves  [51] proved security up to around \(2^n/n\) using the mirror theory.

Mennink and Neves  [51] proposed its dual version Encrypted Davies-Meyer Dual:

$$\begin{aligned} \mathsf {EDMD}^{P_1,P_2}(x) = P_2(P_1(x)) \oplus P_1(x)\,. \end{aligned}$$
(5)

They proved that \(\mathsf {EDMD}^{P_1,P_2}\) is at least as secure as \(\mathsf {SoP}^{P_1,P_2}\). In other words, the construction is known to achieve around \(2^n\) security. Mennink and Neves  [52] subsequently used the construction to design a dedicated PRF based on the AES  [21]. Bernstein’s SURF  [9], dating back to 1997, follows the same idea.
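Both constructions can be sketched with toy 8-bit permutations. Since \(P_1\) is a bijection, \(\mathsf {EDMD}\) is the Davies-Meyer map \(y \mapsto P_2(y) \oplus y\) composed with \(P_1\), so the two have identical output multisets over the full domain.

```python
import random

N = 8
rng = random.Random(6)
P1 = list(range(2 ** N)); rng.shuffle(P1)
P2 = list(range(2 ** N)); rng.shuffle(P2)

def edm(x):
    """Encrypted Davies-Meyer (4): P2(P1(x) xor x)."""
    return P2[P1[x] ^ x]

def edmd(x):
    """Encrypted Davies-Meyer Dual (5): P2(P1(x)) xor P1(x)."""
    return P2[P1[x]] ^ P1[x]
```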

1.2 Truncation in GCM-SIV

GCM is a well-established authenticated encryption scheme  [40, 49]. It follows the nonce-based encrypt-then-MAC paradigm, where encryption is performed in counter mode and the associated data and ciphertext are subsequently authenticated using the GHASH universal hash function.

GCM is vulnerable to nonce misuse attacks. Gueron and Lindell introduced GCM-SIV, a nonce misuse resistant authenticated encryption scheme. Several variants of GCM-SIV exist  [29, 32, 33, 39], and we will focus on the most recent one. It follows the nonce misuse resistant SIV mode of Rogaway and Shrimpton  [59] and uses individual ingredients of GCM. In the context of this work, we are particularly interested in the key derivation function of GCM-SIV  [33]. This key derivation function is based on an \((n=128)\)-bit block cipher E and it derives either 256 bits of key material (if E is instantiated with AES-128) or 384 bits of key material (if E is instantiated with AES-256) based on key k and nonce \(\nu \) as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \cdots \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 3))\,, \text { for 256-bit subkey,}\\ \mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \cdots \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 5))\,, \text { for 384-bit subkey.} \end{array}\right. } \end{aligned}$$
(6)

This key derivation was in fact introduced in a follow-up version of GCM-SIV [33] after weaknesses were discovered in the original mechanism [55]. The key derivation of (6) has been disputed over time. Iwata and Seurin [41] advocated for the sum of permutations instead, and Bose et al. [17] noted that one can even just leave the block ciphers in, as bijectivity in the key derivation function does not matter in the bigger picture of GCM-SIV. Despite this dispute, GCM-SIV enjoys strong support from the practical community: it is considered for standardization by the IETF-CFRG [30, 31] and NIST [54]. Therefore, it is a legitimate question to investigate the exact behavior of the key derivation function within GCM-SIV.
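The key derivation (6) can be sketched scaled down to a toy 8-bit block cipher; the nonce/counter packing and the names below are assumptions of this sketch, not the actual GCM-SIV encoding (which uses 128-bit blocks).

```python
import random

N = 8                 # toy block size; GCM-SIV uses n = 128 with AES
rng = random.Random(7)
E = list(range(2 ** N)); rng.shuffle(E)   # E_k for one fixed toy key

def derive_key(nonce, blocks):
    """Key derivation of (6): concatenate left_{n/2}(E_k(nu || i)) for i = 0, ..., blocks-1."""
    half = N // 2
    out = 0
    for i in range(blocks):
        # nonce in the high bits, a 3-bit block counter in the low bits
        out = (out << half) | (E[(nonce << 3) | i] >> half)
    return out
```

Here `blocks = 4` mirrors the 256-bit subkey case and `blocks = 6` the 384-bit case: each call contributes only n/2 bits, which is exactly the waste the Summation-Truncation Hybrid below recovers.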

1.3 Summation-Truncation Hybrid

Besides the difference in security guarantees between truncation and the sum of permutations, \(2^{n-a/2}\) versus \(2^n\), the former has another drawback: \(n-a\) bits are truncated and simply discarded. We will demonstrate that this practice is wasteful: one can make more economical use of the discarded randomness without any sacrifice of security!

Before heading to the main contribution, let us first consider what we can do with the discarded part of truncation if we focus on a single truncation call. In other words, we compute \(y=P(x)\), output \(\mathsf {left}_a(y)\), and discard \(\mathsf {right}_{n-a}(y)\). We wish to make more economical use of \(\mathsf {right}_{n-a}(y)\). One way of doing so is to simply add the value to \(\mathsf {left}_a(y)\); another might be to split \(\mathsf {right}_{n-a}(y)\) into two pieces, add those, and append the result to the output of the truncation. It turns out that, regardless of the adopted approach, one arrives at a generalized truncation function in the terminology of Mennink [50, 63]. His result shows that, whatever post-processing is applied to P, the security of the scheme is tightly determined at \(2^{n-a'/2}\), where \(a'\) is the output size of the generalized truncation function. (In the former example, \(a'=a\); in the latter, \(a'= a + (n-a)/2 = (n+a)/2\).) In other words, the security of the construction does not increase, and might even decrease, if one attempts to use the truncated data more economically.

A next step is to look at two subsequent truncation calls, as appear in, e.g., the GCM-SIV key derivation (6). We present the Summation-Truncation Hybrid \(\mathsf {STH}\), which at a high level consists of two parallel evaluations of truncation, where the truncated parts are not discarded but rather summed together and appended to the output. In detail, if P is an n-bit permutation and a is a parameter satisfying \(0\leqslant a\leqslant n\), the Summation-Truncation Hybrid is a pseudorandom function that maps \(n-1\) bits of input to \(n+a\) bits of output as follows:

$$\begin{aligned} \mathsf {STH}_a^P(x) = \mathsf {left}_a(P(x\Vert 0)) \parallel \mathsf {left}_a(P(x\Vert 1)) \parallel \mathsf {right}_{n-a}\left( P(x\Vert 0)\oplus P(x\Vert 1)\right) \,. \end{aligned}$$
(7)

The function is depicted in Fig. 1.

Fig. 1.
figure 1

Summation-Truncation Hybrid \(\mathsf {STH}_a\) of (7).

Clearly, \(\mathsf {STH}_a\) is exactly as expensive as two evaluations of \(\mathsf {Trunc}_a\), but differs in that it outputs \(n-a\) bits for free. This may give a significant efficiency gain for repeated evaluations of truncation, for instance in the GCM-SIV key derivation in (6). Concretely, considering the case of GCM-SIV with 128-bit keys, it suffices to make three permutation calls instead of four, and for the case of 256-bit keys it suffices to make four permutation calls instead of six. We go into more detail for GCM-SIV in Sect. 7.
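Under the same toy conventions (an 8-bit permutation, \(a=3\)), \(\mathsf {STH}_a\) of (7) can be sketched directly; the bit packing is an illustrative assumption. Two permutation calls yield \(n+a\) output bits, against only 2a bits for two bare truncation calls.

```python
import random

N, A = 8, 3
B = N - A
rng = random.Random(8)
P = list(range(2 ** N)); rng.shuffle(P)

def sth(x):
    """STH_a of (7): u || v || w on an (N-1)-bit input."""
    y0 = P[(x << 1) | 0]             # P(x || 0)
    y1 = P[(x << 1) | 1]             # P(x || 1)
    u = y0 >> B                      # left_A(P(x||0))
    v = y1 >> B                      # left_A(P(x||1))
    w = (y0 ^ y1) & (2 ** B - 1)     # right_B(P(x||0) xor P(x||1))
    return (u << (A + B)) | (v << B) | w   # N + A output bits in total
```

Since P is injective, the output is never of the form \(u\Vert u\Vert 0^b\): if \(u=v\) and \(w=0^b\) then \(P(x\Vert 0)=P(x\Vert 1)\), a contradiction. This is exactly the structural property accounted for in Sect. 4.4.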

We also consider a variant of \(\mathsf {STH}\) based on two permutations without domain separation. In detail, if \(P_1\) and \(P_2\) are n-bit permutations and a is a parameter satisfying \(0 \leqslant a \leqslant n\), the Summation-Truncation Hybrid 2 is a pseudorandom function that maps n bits of input to \(n+a\) bits of output as follows:

$$\begin{aligned} \mathsf {STH2}_a^{P_1,P_2}(x) = \mathsf {left}_a(P_1(x)) \parallel \mathsf {left}_a(P_2(x)) \parallel \mathsf {right}_{n-a}\left( P_1(x) \oplus P_2(x)\right) \,. \end{aligned}$$
(8)

Its properties are very similar to the ones of the original \(\mathsf {STH}\).

1.4 Security of Summation-Truncation Hybrid

In Sect. 3 we formally prove that the security of \(\mathsf {STH}_a\) is determined by the security of truncation, i.e., that q evaluations of \(\mathsf {STH}_a\) are approximately as secure as 2q evaluations of truncation, despite the \(q(n-a)\) bits of free random output. A comparison of the efficiency and security of truncation, summation, and \(\mathsf {STH}\) is shown in Fig. 2.

Fig. 2.
figure 2

Comparison between the efficiency and security of summation, truncation and the Summation-Truncation Hybrid. The rate denotes the average number of input bits needed for every output bit. Lower is more efficient.

The core idea of the proof consists of separating the truncation and the summation. This is not directly possible, as both parts share some secret information: the random permutation. The separation, at a high level, is performed by getting rid of this shared secret so that one only has to reason based on public information.

In more detail, in the proof we first execute the truncation part of the construction, based on the secret permutation. Then, we select a new secret permutation for the summation. On the upside, this trick makes it possible to reason about the truncation and summation parts independently. On the downside, replacing the secret permutation halfway of course gives rise to a construction different from the original one. To remedy this, the new permutation is not selected from the set of all possible permutations, but rather from those that are compatible with the output generated by the truncation. This set is based solely on public information, i.e., information known to the adversary, as the output of the truncation is given to it.

As a bonus, since truncation directly reveals information about the outputs of the permutation, this set of compatible permutations is easy to reason about. We demonstrate that choosing such a random permutation is the same as choosing a family of permutations with the indices equal to the outputs of the truncation. As we can now replace the truncation by a random function, relying on the extensive state of the art on truncation [5, 13, 26, 34, 62], these indices even become uniformly distributed, which makes them convenient to handle.

This transition brings us to the final part of the proof, which generalizes the security of the sum of permutations based on a single permutation to the sum of permutations based on an arbitrary family of permutations. The analysis relies on the chi-squared method and generalizes the proof of Dai et al. [22], with the catch that we not only consider a family of permutations, but that the selection of permutations from this family is moreover uniformly distributed, as it depends on the outputs of the random function (that replaced the truncation).

2 Preliminaries

Let \(n,a,b\in \mathbb {N}\) with \(n\geqslant a,b\). We denote by \(\{0,1\}^{n}\) the set of bit strings of length n. If \(x\in \{0,1\}^{n}\), \(\mathsf {left}_a(x)\) returns the a leftmost bits and \(\mathsf {right}_b(x)\) the b rightmost bits of x, in such a way that

$$\begin{aligned} x = \mathsf {left}_a(x)\parallel \mathsf {right}_{n-a}(x)\,. \end{aligned}$$

For \(n,m\in \mathbb {N}\), we denote by \(\mathsf {Perm}[n]\) the set of all permutations on \(\{0,1\}^{n}\) with \(I_n \in \mathsf {Perm}[n]\) the identity permutation, and by \(\mathsf {Func}[n,m]\) the set of all functions from \(\{0,1\}^{n}\) to \(\{0,1\}^{m}\).

If \(\mathcal {X}\) is a finite set, \(x\xleftarrow {{\scriptscriptstyle \$}}\mathcal {X}\) denotes the event of uniformly randomly drawing x from \(\mathcal {X}\). For two distributions \(\mu ,\nu \) over a finite event space \(\varOmega \), the statistical distance between \(\mu \) and \(\nu \) is defined as

$$\begin{aligned} \left\Vert \mu - \nu \right\Vert&= \sum _{x \in \varOmega } \max (0, \mu (x) - \nu (x)) \\&= \max _{A \subseteq \varOmega } \left| \mu (A) - \nu (A) \right| \,. \end{aligned}$$
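The two expressions for the statistical distance can be checked against each other on small distributions; the sketch below uses exact rationals and brute-forces the maximum over events.

```python
from fractions import Fraction
from itertools import combinations

def stat_dist(mu, nu):
    """Statistical distance as sum_x max(0, mu(x) - nu(x))."""
    support = set(mu) | set(nu)
    zero = Fraction(0)
    return sum(max(zero, mu.get(x, zero) - nu.get(x, zero)) for x in support)

def stat_dist_max(mu, nu):
    """Equivalent form: max over events A of |mu(A) - nu(A)| (brute force)."""
    support = list(set(mu) | set(nu))
    zero = Fraction(0)
    best = zero
    for r in range(len(support) + 1):
        for A in combinations(support, r):
            mA = sum(mu.get(x, zero) for x in A)
            nA = sum(nu.get(x, zero) for x in A)
            best = max(best, abs(mA - nA))
    return best
```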

We treat the variables of a randomized algorithm \(\mathcal {O}\) as random variables, so \(\mathbb {P}_{\mathcal {O}}\left[ x = a\right] \) denotes the probability that the variable x in algorithm \(\mathcal {O}\) equals a. As a shorthand we write \(\mathbb {P}_{\mathcal {O}}\left[ a\right] \) and \(\mathbb {P}_{\mathcal {O}}\left[ A\right] \), with a a single value and A a set of values, for the probability that algorithm \(\mathcal {O}\) returns the value a or a value in the set A, respectively. Plain \(\mathbb {P}_{\mathcal {O}}\) denotes the distribution of the return value of algorithm \(\mathcal {O}\). For a random variable X with distribution \(\mu \), we denote its expectation by \(\mathbb {E}_{\mu }\left[ X\right] \). If the distribution is clear from the context, we write just \(\mathbb {E}\left[ X\right] \).

A distinguisher \(\mathcal {D}\) is an algorithm that is given access to an oracle \(\mathcal {O}\) to which it can make a certain amount of queries, and afterwards it outputs \(b\in \{0,1\}\).

We briefly state an elementary property of conditional expectation.

Lemma 1

Suppose that \(\mathbb {E}\left[ Y \mid X\right] = \mathbb {E}\left[ Y\right] \). Then \(\mathbb {E}\left[ X \cdot Y\right] = \mathbb {E}\left[ X\right] \cdot \mathbb {E}\left[ Y\right] \).

Proof

By the law of total expectation we have

$$\begin{aligned} \mathbb {E}\left[ X \cdot Y\right]&= \mathbb {E}\left[ \mathbb {E}\left[ X \cdot Y \mid X\right] \right] \\&= \mathbb {E}\left[ X \cdot \mathbb {E}\left[ Y \mid X\right] \right] \\&= \mathbb {E}\left[ X \cdot \mathbb {E}\left[ Y\right] \right] \\&= \mathbb {E}\left[ X\right] \cdot \mathbb {E}\left[ Y\right] \,. \end{aligned}$$

   \(\square \)
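Lemma 1 admits a small numerical check. The joint distribution below is an arbitrary illustrative choice in which Y is a fair bit regardless of X (so \(\mathbb {E}\left[ Y \mid X\right] = \mathbb {E}\left[ Y\right] \)) while X itself is biased.

```python
from fractions import Fraction as F

# Joint distribution of (X, Y): Y is fair whatever X is, X equals 1 with probability 2/3.
joint = {(0, 0): F(1, 6), (0, 1): F(1, 6),
         (1, 0): F(1, 3), (1, 1): F(1, 3)}

EX  = sum(p * x for (x, y), p in joint.items())       # E[X] = 2/3
EY  = sum(p * y for (x, y), p in joint.items())       # E[Y] = 1/2
EXY = sum(p * x * y for (x, y), p in joint.items())   # E[X*Y] = 1/3
```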

2.1 Block Ciphers

Let \(\kappa ,n\in \mathbb {N}\). A block cipher \(E:\{0,1\}^{\kappa }\times \{0,1\}^{n}\rightarrow \{0,1\}^{n}\) is a permutation on n-bit strings for every fixed key \(k\in \{0,1\}^{\kappa }\). Denote by \(\mathsf {Perm}[n]\) the set of all permutations on \(\{0,1\}^{n}\). The security of a block cipher is measured by the distance between \(E_k\), for a secret key k, and a random permutation \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\). The advantage of a distinguisher \(\mathcal {D}\) in breaking the PRP (pseudorandom permutation) security of E is defined as

$$\begin{aligned} \mathbf {Adv}_{E}^{\mathrm {prp}}(\mathcal {D}) = \left\Vert \mathbb {P}\left[ \mathcal {D}^{E_k}=1\right] - \mathbb {P}\left[ \mathcal {D}^{P}=1\right] \right\Vert \,, \end{aligned}$$
(9)

where the probabilities are taken over the random drawing of \(k\xleftarrow {{\scriptscriptstyle \$}}\{0,1\}^{\kappa }\), \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\), and the randomness used by \(\mathcal {D}\). Distinguisher \(\mathcal {D}\) is usually bounded by a query complexity q and a time complexity t. The maximum advantage over all such distinguishers is denoted by \(\mathbf {Adv}_{E}^{\mathrm {prp}}(q, t)\).

2.2 Pseudorandom Functions

Let \(n,m\in \mathbb {N}\). Let \(F^P\in \mathsf {Func}[n,m]\) be a function from \(\{0,1\}^{n}\) to \(\{0,1\}^{m}\) that is instantiated with a permutation \(P\in \mathsf {Perm}[n]\). The security of F is measured by the distance between \(F^P\), for a secret and uniformly randomly drawn \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\), and a random function \(R\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Func}[n,m]\). The advantage of a distinguisher \(\mathcal {D}\) in breaking the PRF (pseudorandom function) security of F is defined as

$$\begin{aligned} \mathbf {Adv}_{F}^{\mathrm {prf}}(\mathcal {D}) = \left\Vert \mathbb {P}\left[ \mathcal {D}^{F^P}=1\right] - \mathbb {P}\left[ \mathcal {D}^{R}=1\right] \right\Vert \,. \end{aligned}$$
(10)

For \(q\in \mathbb {N}\), we define by \(\mathbf {Adv}_{F}^{\mathrm {prf}}(q)\) the maximum advantage over all distinguishers \(\mathcal {D}\) making q queries to the oracle.

2.3 Truncation

Our security analysis will, in part, rely on the PRF security of the truncation construction of (3). We restate the result of Stam [62], translated into cryptographic terminology [13, 50].

Lemma 2

(Truncation  [13, 50, 62]). Let \(n,a,q\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\). We have:

$$\begin{aligned} \mathbf {Adv}_{\mathsf {Trunc}_a}^{\mathrm {prf}}(q) \leqslant \left( \left( {\begin{array}{c}q\\ 2\end{array}}\right) /2^{2n-a}\right) ^{1/2}\,. \end{aligned}$$
(11)

3 Summation-Truncation Hybrid

Let \(n,a\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\), and write \(b=n-a\). Let \(P\in \mathsf {Perm}[n]\) be a permutation. We define the Summation-Truncation Hybrid \(\mathsf {STH}_a^P\in \mathsf {Func}[n-1,n+a]\) as follows:

$$\begin{aligned} \mathsf {STH}_a^P(x) = \mathsf {left}_a(P(x\Vert 0)) \parallel \mathsf {left}_a(P(x\Vert 1)) \parallel \mathsf {right}_b\left( P(x\Vert 0)\oplus P(x\Vert 1)\right) \,. \end{aligned}$$
(12)

The function is depicted in Fig. 1. As expressed in this figure, we refer to the first a bits as u, the second a bits as v, and the final b bits as w, and write \(y=u\Vert v\Vert w\).

Clearly, \(\mathsf {STH}_a\) is equivalent to \(\mathsf {SoP}\) for \(a=0\). If \(a=n\), \(\mathsf {STH}_a\) consists of the concatenation of two permutation evaluations. For general a, one evaluation of \(\mathsf {STH}_a\) with the last b bits discarded is equivalent to a double evaluation of \(\mathsf {Trunc}_a\). As we will show, however, there is no reason to discard these b bits. Stated differently, q evaluations of \(\mathsf {STH}_a\) are roughly as secure as 2q evaluations of \(\mathsf {Trunc}_a\), although the former outputs significantly more random data.

Theorem 1

Let \(n,a,q\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\), and write \(b=n-a\). Assume that \(b \geqslant \max (n/12, 10)\). We have:

$$\begin{aligned} \mathbf {Adv}_{\mathsf {STH}_a}^{\mathrm {prf}}(q) \leqslant 3 \left( \frac{q}{2^{n-a/3}}\right) ^{3/2} + \frac{1}{\sqrt{2\pi }} \left( \frac{q}{2^{n-5}}\right) ^{2^{b-2}} + \frac{q}{2^n} + \mathbf {Adv}_{\mathsf {Trunc}_a}^{\mathrm {prf}}(2q)\,. \end{aligned}$$
(13)

The dominating term in the bound of Theorem 1 is, in fact, \(\mathbf {Adv}_{\mathsf {Trunc}_a}^{\mathrm {prf}}(2q)\); therefore, the security of \(\mathsf {STH}_a\) is only marginally worse than that of \(\mathsf {Trunc}_a\), even though it is much more efficient. We prove Theorem 1 in Sect. 4.
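The bounds of Lemma 2 and Theorem 1 are easy to evaluate numerically; the sketch below is a direct transcription, and the parameter choices in the test are illustrative. For \(n=128\), \(a=64\) and \(q=2^{32}\) queries, the truncation term indeed dominates.

```python
from math import comb, sqrt, pi

def trunc_bound(n, a, q):
    """Lemma 2: (binom(q, 2) / 2^(2n - a))^(1/2)."""
    return sqrt(comb(q, 2) / 2 ** (2 * n - a))

def sth_bound(n, a, q):
    """The bound (13) of Theorem 1; requires b = n - a >= max(n/12, 10)."""
    b = n - a
    return (3 * (q / 2 ** (n - a / 3)) ** 1.5
            + (q / 2 ** (n - 5)) ** (2 ** (b - 2)) / sqrt(2 * pi)
            + q / 2 ** n
            + trunc_bound(n, a, 2 * q))
```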

Remark 1

In Theorem 1, as well as in Lemma 2, we focus on PRF security in the information-theoretic setting, where the underlying primitive is a secret random permutation. One can easily transfer these results to a complexity-theoretic setting where P is instantiated as a block cipher instance \(E_k\) for a secret key. In more detail, the bound on \(\mathsf {Trunc}_a\) (Lemma 2) carries over with an additional loss of \(\mathbf {Adv}_{E}^{\mathrm {prp}}(q, t)\), and the bound on \(\mathsf {STH}_a\) carries over with an additional loss of \(\mathbf {Adv}_{E}^{\mathrm {prp}}(2q, t)\), where t bounds the time complexity.

We also consider a variant of \(\mathsf {STH}\) based on two independent random permutations without domain separation. Let \(n,a\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\), and write \(b=n-a\). Let \(P_1,P_2\in \mathsf {Perm}[n]\) be two permutations. We define \(\mathsf {STH2}_a^{P_1,P_2} \in \mathsf {Func}[n, n+a]\) as follows:

$$\begin{aligned} \mathsf {STH2}_a^{P_1,P_2}(x) = \mathsf {left}_a(P_1(x)) \parallel \mathsf {left}_a(P_2(x)) \parallel \mathsf {right}_b\left( P_1(x)\oplus P_2(x)\right) \,. \end{aligned}$$
(14)

We get a similar bound as for the original \(\mathsf {STH}\).

Theorem 2

Let \(n,a,q\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\), and write \(b=n-a\). Assume that \(b \geqslant \max (n/12, 10)\). We have:

$$\begin{aligned} \mathbf {Adv}_{\mathsf {STH2}_a}^{\mathrm {prf}}(q) \leqslant 3 \left( \frac{q}{2^{n-a/3}}\right) ^{3/2} + \frac{1}{\sqrt{2\pi }} \left( \frac{q}{2^{n-5}}\right) ^{2^{b-2}} + 2\mathbf {Adv}_{\mathsf {Trunc}_a}^{\mathrm {prf}}(q)\,. \end{aligned}$$
(15)

We prove Theorem 2 in Sect. 6.

One might also be interested in creating a larger instance of the hybrid. One approach would be to apply functions other than summation to the discarded parts. For example, one could apply the generalized CENC construction [37] on top of them. This could lead to other interesting results and might improve the efficiency as well, but we leave this for potential future work.

4 Proof of Theorem 1

Let \(n,a,q\in \mathbb {N}\) such that \(0\leqslant a\leqslant n\), and write \(b=n-a\). From now on, we drop the subscript a from \(\mathsf {STH}\) for brevity. Consider any distinguisher \(\mathcal {D}\) making q queries to its oracle. Without loss of generality, \(\mathcal {D}\) is deterministic and does not make pointless queries, i.e., \(x_i \ne x_j\) for all \(i \ne j\). Our goal is to bound the distance between \(\mathsf {STH}^P\) for a random permutation \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\) on the one hand and a random function \(R\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Func}[n-1,n+a]\) on the other hand:

$$\begin{aligned} \mathbf {Adv}_{\mathsf {STH}}^{\mathrm {prf}}(\mathcal {D}) = \left\Vert \mathbb {P}\left[ \mathcal {D}^{\mathsf {STH}^P}=1\right] - \mathbb {P}\left[ \mathcal {D}^{R}=1\right] \right\Vert \,. \end{aligned}$$
(16)

We will bound (16) in multiple steps. The first step (Sect. 4.1) is to show that, without loss of generality, we can move to a non-adaptive setting and argue based on the probabilities of transcripts occurring. The second step (Sect. 4.2) consists of transforming the real world \(\mathsf {STH}^P\) into a world that separates the \(\mathsf {Trunc}\) and \(\mathsf {SoP}\) parts within \(\mathsf {STH}\). The third step (Sect. 4.3) replaces the \(\mathsf {Trunc}\) part by a random function, at the cost of \(\mathbf {Adv}_{\mathsf {Trunc}}^{\mathrm {prf}}(2q)\). The fourth step (Sect. 4.4) operates on the ideal world R: it transforms it into a world that never outputs strings of the form \(u\Vert u\Vert 0^b\) for \(u\in \{0,1\}^{a}\), noting that these never occur in the real world we were left with in the third step. Finally, the fifth step (Sect. 4.5) bounds the distance between the remaining two worlds using the chi-squared method.

4.1 Moving Towards Transcripts

As a first step, we note that input adaptivity does not help, and that it suffices to simply consider the probabilities of transcripts occurring in the real world and in the ideal world. Let \(\mathcal {O}_1\) be an oracle that generates transcripts as lists of random strings (Algorithm 1), and let \(\mathcal {O}_2 = \mathsf {NSTH}\) (Non-adaptive \(\mathsf {STH}\)) be an oracle that generates transcripts as results of \(\mathsf {STH}^P\) with random \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\) and fixed inputs \(x_i=i\) (Algorithm 2).

We will show that the advantage of distinguisher \(\mathcal {D}\) in distinguishing the two worlds of (16) is at most the statistical distance between world \(\mathcal {O}_1\) and world \(\mathcal {O}_2\). Suppose we obtain a transcript \(\varvec{\tau }\) from either of the two oracles \(\mathcal {O}_1\) or \(\mathcal {O}_2\); we use it to simulate \(\mathcal {D}\)'s oracle as follows. If \(\mathcal {D}\) makes query \(x_i\), we respond with \(u_i\Vert v_i\Vert w_i\) from the transcript. Denote by A the set of all transcripts \(\varvec{\tau }\) for which \(\mathcal {D}\) returns 1. Note that this is well-defined, as \(\mathcal {D}\) is deterministic and its decision depends only on \(\varvec{\tau }\); moreover, the fact that world \(\mathcal {O}_2\) uses fixed inputs to P does not matter, as P is a random permutation. Then, the advantage of \(\mathcal {D}\) is at most

$$\begin{aligned} \left| \mathbb {P}_{\mathcal {O}_1}\left[ A\right] - \mathbb {P}_{\mathcal {O}_2}\left[ A\right] \right| \leqslant \left\Vert \mathbb {P}_{\mathcal {O}_1} - \mathbb {P}_{\mathcal {O}_2} \right\Vert \,. \end{aligned}$$
(17)

Henceforth, it is sufficient to restrict our focus to the statistical distance between \(\mathcal {O}_1\) and \(\mathcal {O}_2\).

figure a

4.2 Permutation-Separated \(\mathsf {STH}\)

We define a variant of \(\mathcal {O}_2\), namely \(\mathcal {O}_3=\mathsf {PSTH}\) (Permutation-separated \(\mathsf {STH}\)), in Algorithm 3. This oracle “separates” the \(\mathsf {Trunc}\) part and the \(\mathsf {SoP}\) part within \(\mathsf {NSTH}\). In more detail, it first calls internal procedure \(\mathsf {PTrunc}\) that draws a random permutation \(P\xleftarrow {{\scriptscriptstyle \$}}\mathsf {Perm}[n]\) and outputs the lists \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\). Then, it calls internal procedure \(\mathsf {PSoP}\) that takes the two lists \((\textit{\textbf{u}},\textit{\textbf{v}})\) and returns a list \(\textit{\textbf{w}}\) using a random permutation \(P'\). This permutation is randomly drawn from a set \(\mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\subseteq \mathsf {Perm}[n]\) defined as the set of all permutations from \(\mathsf {Perm}[n]\) for which \(\mathsf {PTrunc}\) would return \((\textit{\textbf{u}},\textit{\textbf{v}})\). Note that in our analysis this set will never be empty, so the ‘else’ branch will never be taken and is included solely to complete the algorithm.

figure b

We will prove that any transcript \(\varvec{\tau }\) is equally likely in \(\mathcal {O}_2\) and \(\mathcal {O}_3\). Consider any valid transcript \(\varvec{\tau }\), and define by \(\mathsf {Perm}_{\text {result}}(\varvec{\tau })\subseteq \mathsf {Perm}[n]\) the set of all permutations that give result \(\varvec{\tau }=(\textit{\textbf{u}},\textit{\textbf{v}},\textit{\textbf{w}})\) when used in \(\mathcal {O}_2=\mathsf {NSTH}\). Then,

$$\begin{aligned} \mathbb {P}_{\mathcal {O}_2}\left[ \varvec{\tau }\right] = \mathbb {P}_{\mathsf {NSTH}}\left[ \varvec{\tau }\right] = \frac{\left| \mathsf {Perm}_{\text {result}}(\varvec{\tau })\right| }{\left| \mathsf {Perm}[n]\right| }\,. \end{aligned}$$

On the other hand, for \(\mathcal {O}_3=\mathsf {PSTH}\), we first have to get the right \((\textit{\textbf{u}},\textit{\textbf{v}})\):

$$\begin{aligned} \mathbb {P}_{\mathsf {PTrunc}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] = \frac{\left| \mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\right| }{\left| \mathsf {Perm}[n]\right| }\,. \end{aligned}$$

Next, we have to get the right \(\textit{\textbf{w}}\). As \(\mathsf {Perm}_{\text {result}}(\varvec{\tau })\subseteq \mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\), this probability is equal to:

$$\begin{aligned} \mathbb {P}_{\mathsf {PSoP}(\textit{\textbf{u}},\textit{\textbf{v}})}\left[ \textit{\textbf{w}}\right]&= \frac{\left| \mathsf {Perm}_\text {result}(\varvec{\tau })\right| }{\left| \mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\right| }\,. \end{aligned}$$

The random choices in \(\mathsf {PTrunc}\) and \(\mathsf {PSoP}\) are independent, hence so are the two corresponding events. This means that the probability of getting \(\varvec{\tau }\) in \(\mathcal {O}_3=\mathsf {PSTH}\) is equal to their product. In other words:

$$\begin{aligned} \mathbb {P}_{\mathcal {O}_3}\left[ \varvec{\tau }\right] = \mathbb {P}_{\mathsf {PSTH}}\left[ \varvec{\tau }\right]&= \mathbb {P}_{\mathsf {PSTH}}\left[ \varvec{\tau }\mid (\textit{\textbf{u}},\textit{\textbf{v}}) \leftarrow \mathsf {PTrunc}\right] \cdot \mathbb {P}_{\mathsf {PTrunc}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] \\&+ \mathbb {P}_{\mathsf {PSTH}}\left[ \varvec{\tau }\mid (\textit{\textbf{u}},\textit{\textbf{v}}) \not \leftarrow \mathsf {PTrunc}\right] \cdot (1 - \mathbb {P}_{\mathsf {PTrunc}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] ) \\&= \mathbb {P}_{\mathsf {PSoP}(\textit{\textbf{u}},\textit{\textbf{v}})}\left[ \textit{\textbf{w}}\right] \cdot \mathbb {P}_{\mathsf {PTrunc}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] + 0 \\&= \frac{\left| \mathsf {Perm}_\text {result}(\varvec{\tau })\right| }{\left| \mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\right| } \cdot \frac{\left| \mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\right| }{\left| \mathsf {Perm}[n]\right| }\\&= \frac{\left| \mathsf {Perm}_\text {result}(\varvec{\tau })\right| }{\left| \mathsf {Perm}[n]\right| } \\&= \mathbb {P}_{\mathsf {NSTH}}\left[ \varvec{\tau }\right] = \mathbb {P}_{\mathcal {O}_2}\left[ \varvec{\tau }\right] \,. \end{aligned}$$

We have hence obtained that

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}_2} - \mathbb {P}_{\mathcal {O}_3} \right\Vert = 0\,. \end{aligned}$$
(18)

4.3 Isolating Truncation Advantage

Next, we define \(\mathcal {O}_4=\mathsf {RSTH}\) (Random function-based \(\mathsf {STH}\)) in Algorithm 4. The algorithm is identical to \(\mathcal {O}_3=\mathsf {PSTH}\), but with the function \(\mathsf {PTrunc}\) replaced by a random function \(S\). Note that \(S\) is written as a separate procedure; this is done to suit further analysis in Sect. 4.5.

[Algorithm 4: the oracle \(\mathcal {O}_4 = \mathsf {RSTH}\)]

The only difference between \(\mathcal {O}_3=\mathsf {PSTH}\) and \(\mathcal {O}_4=\mathsf {RSTH}\) is in the generation of \((\textit{\textbf{u}},\textit{\textbf{v}})\): in the former, they are generated as a truncated permutation, whereas in the latter they are generated as a random function. Therefore, we immediately have:

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}_3} - \mathbb {P}_{\mathcal {O}_4} \right\Vert \leqslant \left\Vert \mathbb {P}_{\mathsf {PTrunc}} - \mathbb {P}_{S} \right\Vert = \mathbf {Adv}_{\mathsf {Trunc}}^{\mathrm {prf}}(2q) \,. \end{aligned}$$
(19)

4.4 Discarding the Zero

We proceed on the other end of (17). We turn \(\mathcal {O}_1=R_1\) into \(\mathcal {O}_0=R_0\), which operates identically except that it never returns \(w_i = 0^b\) when \(u_i = v_i\). The oracle is given in Algorithm 5. As before, we write \(T\) as a separate procedure to suit further analysis in Sect. 4.5.

[Algorithm 5: the oracle \(\mathcal {O}_0 = R_0\)]

We look at the statistical distance between \(\mathbb {P}_{\mathcal {O}_1}\) and \(\mathbb {P}_{\mathcal {O}_0}\). Let \(\mathsf {bad}_1\) be the set of transcripts \(\varvec{\tau }= (\textit{\textbf{u}},\textit{\textbf{v}},\textit{\textbf{w}})\) such that \(u_i = v_i\) and \(w_i = 0^b\) for some i. As \(\mathbb {P}_{\mathcal {O}_0}\left[ \varvec{\tau }\right] = 0\) for \(\varvec{\tau }\in \mathsf {bad}_1\) and \(\mathbb {P}_{\mathcal {O}_1}\left[ \varvec{\tau }\right] \leqslant \mathbb {P}_{\mathcal {O}_0}\left[ \varvec{\tau }\right] \) for \(\varvec{\tau }\notin \mathsf {bad}_1\), we see that, with A ranging over all sets of transcripts,

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}_1} - \mathbb {P}_{\mathcal {O}_0} \right\Vert&= \max _A \left| \mathbb {P}_{\mathcal {O}_1}\left[ A\right] - \mathbb {P}_{\mathcal {O}_0}\left[ A\right] \right| \nonumber \\&= \mathbb {P}_{\mathcal {O}_1}\left[ \mathsf {bad}_1\right] \nonumber \\&\leqslant \sum _{i=1}^q \mathbb {P}_{\mathcal {O}_1}\left[ u_i = v_i, w_i = 0^b\right] \nonumber \\&= \frac{q}{2^n} \,. \end{aligned}$$
(20)
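The per-query probability used in the last step can be confirmed by exhaustive enumeration for toy parameters (a sketch, assuming \(n = a + b\); the values \(a = b = 2\) are arbitrary illustrative choices):

```python
from itertools import product

# In O_1 = R_1, each of u_i, v_i (a bits) and w_i (b bits) is uniform and
# independent, so P[u_i = v_i, w_i = 0^b] = 1/2^a * 1/2^b = 1/2^n.
a, b = 2, 2
hits = sum(1 for u, v, w in product(range(2**a), range(2**a), range(2**b))
           if u == v and w == 0)
total = 2**(2 * a + b)
assert hits * 2**(a + b) == total  # per-query probability is exactly 1/2^(a+b)
```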

4.5 Final Step

Looking back, Eqs. (17), (18), (19), and (20) have transformed our original goal (16) into

$$\begin{aligned} \mathbf {Adv}_{\mathsf {STH}}^{\mathrm {prf}}(\mathcal {D}) \leqslant \left\Vert \mathbb {P}_{\mathcal {O}_0} - \mathbb {P}_{\mathcal {O}_4} \right\Vert + \frac{q}{2^n} + \mathbf {Adv}_{\mathsf {Trunc}}^{\mathrm {prf}}(2q)\,. \end{aligned}$$
(21)

We now look at the worlds \(\mathcal {O}_0\) and \(\mathcal {O}_4\). Noting that in both worlds \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) are generated identically, we can parameterize these worlds. We define \(\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}} = T(\textit{\textbf{u}},\textit{\textbf{v}})\) and \(\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}} = \mathsf {PSoP}(\textit{\textbf{u}},\textit{\textbf{v}})\), so that in both cases \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) are generated by \(S\) and \(\textit{\textbf{w}}\) by \(\mathcal {O}_c^{\textit{\textbf{u}},\textit{\textbf{v}}}\) for \(c \in \{0,4\}\). This means that

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}_0} - \mathbb {P}_{\mathcal {O}_4} \right\Vert&= \sum _{\textit{\textbf{u}},\textit{\textbf{v}}} \sum _{\textit{\textbf{w}}} \max (0, \mathbb {P}_{\mathcal {O}_0}\left[ (\textit{\textbf{u}},\textit{\textbf{v}},\textit{\textbf{w}})\right] - \mathbb {P}_{\mathcal {O}_4}\left[ (\textit{\textbf{u}},\textit{\textbf{v}},\textit{\textbf{w}})\right] ) \\&= \sum _{\textit{\textbf{u}},\textit{\textbf{v}}} \sum _{\textit{\textbf{w}}} \max (0, \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}}\left[ \textit{\textbf{w}}\right] - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}}\left[ \textit{\textbf{w}}\right] ) \cdot \mathbb {P}_{S}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] \\&= \sum _{\textit{\textbf{u}},\textit{\textbf{v}}} \left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \cdot \mathbb {P}_{S}\left[ (\textit{\textbf{u}},\textit{\textbf{v}})\right] \\&= \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ \left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \right] \,, \end{aligned}$$

with \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) drawn uniformly. The remaining task boils down to bounding the distance between \(\mathsf {PSoP}\) of Algorithm 3 and the random function \(T\) of Algorithm 5.

For this, we first define \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) and \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) as the number of previous elements in \(\textit{\textbf{u}},\textit{\textbf{v}}\) equal to \(u_i\) and \(v_i\), respectively, as follows:

$$\begin{aligned} C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)&= \left| \{j | j< i, u_j = u_i\}\right| + \left| \{j | j< i, v_j = u_i\}\right| \,,\\ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)&= \left| \{j | j< i, u_j = v_i\}\right| + \left| \{j | j < i, v_j = v_i\}\right| \,. \end{aligned}$$
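These counters are straightforward to compute directly from the definitions; a minimal sketch (0-indexed, with `C` and `Cp` standing for \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}\) and \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}\); the example vectors are arbitrary):

```python
def C(u, v, i):
    # previous entries of u and of v that are equal to u[i]
    return sum(u[j] == u[i] for j in range(i)) + sum(v[j] == u[i] for j in range(i))

def Cp(u, v, i):
    # previous entries of u and of v that are equal to v[i]
    return sum(u[j] == v[i] for j in range(i)) + sum(v[j] == v[i] for j in range(i))

u, v = (1, 1, 2), (2, 1, 3)
assert C(u, v, 2) == 1   # only v[0] = 2 matches u[2] = 2
assert Cp(u, v, 2) == 0  # nothing before position 2 matches v[2] = 3
```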

In our derivation we want to assume that these values stay below \(2^{b-2}\). We define \(\mathsf {bad}_2\) as the set of all \((\textit{\textbf{u}},\textit{\textbf{v}})\) such that \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \geqslant 2^{b-2}\) or \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \geqslant 2^{b-2}\) for some i. We want to discard the bad cases while still reasoning about \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) as uniformly random values. For this, we use the following lemma.

Lemma 3

Let f be a non-negative function such that \(\left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \leqslant f(\textit{\textbf{u}}, \textit{\textbf{v}})\) for \((\textit{\textbf{u}}, \textit{\textbf{v}}) \notin \mathsf {bad}_2\). Then

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ \left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \right]&\leqslant \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}})\right] \end{aligned}$$
(22)
$$\begin{aligned}&+ \mathbb {P}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}}) \in \mathsf {bad}_2\right] \,. \end{aligned}$$
(23)

Note that in \(\mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}})\right] \) the values \(\textit{\textbf{u}},\textit{\textbf{v}}\) are still drawn uniformly.

Proof

For \((\textit{\textbf{u}},\textit{\textbf{v}}) \notin \mathsf {bad}_2\) we have that \(\left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \leqslant f(\textit{\textbf{u}},\textit{\textbf{v}})\). On the other hand, for \((\textit{\textbf{u}},\textit{\textbf{v}}) \in \mathsf {bad}_2\) we get \(\left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \leqslant 1 \leqslant f(\textit{\textbf{u}},\textit{\textbf{v}}) + 1\). Together, this means that \(\left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \leqslant f(\textit{\textbf{u}},\textit{\textbf{v}}) + \mathbf {1}_{\mathsf {bad}_2}(\textit{\textbf{u}},\textit{\textbf{v}})\), where \(\mathbf {1}_{\mathsf {bad}_2}\) is the indicator function of \(\mathsf {bad}_2\), which is 1 for \((\textit{\textbf{u}},\textit{\textbf{v}}) \in \mathsf {bad}_2\) and 0 otherwise. By taking the expectation on both sides this results in

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ \left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \right]&\leqslant \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}}) + \mathbf {1}_{\mathsf {bad}_2}(\textit{\textbf{u}},\textit{\textbf{v}})\right] \\&= \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}})\right] + \mathbb {P}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}}) \in \mathsf {bad}_2\right] \,. \end{aligned}$$

   \(\square \)

We derive bounds for (22) with suitable f and (23) separately.

4.5.1 Bounding (22)

As a first step we have to find a non-negative function f such that \(\left\Vert \mathbb {P}_{\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}} - \mathbb {P}_{\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}} \right\Vert \leqslant f(\textit{\textbf{u}}, \textit{\textbf{v}})\) for \((\textit{\textbf{u}}, \textit{\textbf{v}}) \notin \mathsf {bad}_2\). The following theorem gives such a function.

Theorem 3

Let \(a,b,q \in \mathbb {N}\) and let \(\textit{\textbf{u}}= (u_1, \ldots , u_q)\) and \(\textit{\textbf{v}}= (v_1, \ldots , v_q)\) be vectors of length q with elements in \(\{0,1\}^{a}\) such that \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i), C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) < 2^{b-2}\) for all i. Let \(\mathcal {O}\) and \(\mathcal {R}\) be as in Algorithm 6. Then

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}} - \mathbb {P}_{\mathcal {R}} \right\Vert \leqslant \sqrt{\frac{4}{2^{3b}} \sum _{i=1}^q C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \cdot C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)} \,. \end{aligned}$$
[Algorithm 6: the oracles \(\mathcal {O}\) and \(\mathcal {R}\)]

Here \(\langle x \rangle _n\) is the encoding of x as an n-bit string.

The proof of Theorem 3 will be given in Sect. 5.

It is obvious that \(\mathcal {R}\) equals \(\mathcal {O}_0^{\textit{\textbf{u}},\textit{\textbf{v}}}\). We will next show that \(\mathcal {O}\) generates the same distribution as \(\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}\), by looking at the distribution of \(U_i\) given all previous values (the analysis is symmetrical for the values \(V_i\)).

In world \(\mathcal {O}\), the value is generated by the permutation \(P_{u_i}\) with the input \(\langle C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \rangle _{b-1} \Vert 0\). Note that we can encode \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) as a \((b-1)\)-bit string, as we assume that \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)< 2^{b-2} < 2^{b-1}\). The output value of \(P_{u_i}\) will be distributed uniformly over \(\{0,1\}^{b}\) minus its previously generated values. These values, in turn, are the \(U_j\) and \(V_j\) such that \(u_j = u_i\) or \(v_j = u_i\), respectively, for \(j < i\). Note that we do get a new value, as \(\langle C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \rangle _{b-1} \Vert 0\) is different from \(\langle C_{\textit{\textbf{u}},\textit{\textbf{v}}}(j) \rangle _{b-1} \Vert 0\) and from \(\langle C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(j) \rangle _{b-1} \Vert 1\) for such j.

In world \(\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}\), the value is generated by the single permutation \(P'\) selected from the set \(\mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\) with the new input \(i \Vert 0\). Note that \(\mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\) is never empty as we assume that \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i), C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)< 2^{b-2} < 2^b\) for all i, hence there always exists a permutation that would generate \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\). We know that the first a bits of the output of \(P'\) have to be equal to \(u_i\). This means that previously generated values of \(P'\) do not matter as long as their first a bits are different. Again, for the last b bits we know that they cannot be equal to \(U_j\) or \(V_j\) with \(u_j = u_i\) or \(v_j = u_i\), respectively, for \(j < i\). Furthermore, the value is uniformly chosen from the remaining elements in the set \(\{0,1\}^{b}\), as \(P'\) is selected uniformly from \(\mathsf {Perm}_\text {comp}(\textit{\textbf{u}},\textit{\textbf{v}})\).

This means that the distribution of all \(U_i\)’s is the same in both worlds. As the analysis of all \(V_i\)’s is similar, both \(\mathcal {O}\) and \(\mathcal {O}_4^{\textit{\textbf{u}},\textit{\textbf{v}}}\) have the same distribution.
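The lazy-sampling view of world \(\mathcal {O}\) just described can be sketched in code: for every prefix value p, a partial permutation \(P_p\) is grown on demand, and \(U_i\) (resp. \(V_i\)) is drawn uniformly from \(\{0,1\}^b\) minus the outputs already assigned under prefix \(u_i\) (resp. \(v_i\)). This is an illustrative simulation under those assumptions, not the paper's pseudocode:

```python
import random

def world_O(u, v, b, rng=random):
    # outputs[p] = set of b-bit values already produced under prefix p
    outputs = {}
    w = []
    for ui, vi in zip(u, v):
        used = outputs.setdefault(ui, set())
        Ui = rng.choice([x for x in range(2**b) if x not in used])
        used.add(Ui)
        used = outputs.setdefault(vi, set())  # same set as above when ui == vi
        Vi = rng.choice([x for x in range(2**b) if x not in used])
        used.add(Vi)
        w.append(Ui ^ Vi)
    return w

# When u_i = v_i, both values come from the same permutation, so w_i != 0^b.
w = world_O([0, 0, 0], [0, 0, 0], b=4)
assert all(x != 0 for x in w)
```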

We will now use Theorem 3 to bound (22). As the property \(\mathbb {E}\left[ X\right] ^2 \leqslant \mathbb {E}\left[ X^2\right] \) implies that \(\mathbb {E}\left[ \sqrt{X}\right] \leqslant \sqrt{\mathbb {E}\left[ X\right] }\), we get

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}})\right]&\leqslant \sqrt{\mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ \frac{4}{2^{3b}} \sum _{i=1}^q C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \cdot C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] } \nonumber \\&= \sqrt{\frac{4}{2^{3b}} \sum _{i=1}^q \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \cdot C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] } \,. \end{aligned}$$
(24)
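The concavity step \(\mathbb {E}[\sqrt{X}] \leqslant \sqrt{\mathbb {E}[X]}\) is easily illustrated numerically (a sketch with arbitrary uniform samples):

```python
import math
import random

random.seed(42)
xs = [random.random() for _ in range(10_000)]
mean_sqrt = sum(map(math.sqrt, xs)) / len(xs)  # E[sqrt(X)]
sqrt_mean = math.sqrt(sum(xs) / len(xs))       # sqrt(E[X])
assert mean_sqrt <= sqrt_mean  # Jensen's inequality for the concave square root
```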

Although \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) and \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) are not independent, we will show that \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) is mean-independent of \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\), i.e. that \(\mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \mid C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] = \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] \). First of all, as \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) are uniformly distributed, every \(u_j\) and \(v_j\) with \(j < i\) has probability \(1/2^a\) of being equal to \(u_i\) or \(v_i\), hence

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] = \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] = \frac{2(i-1)}{2^a} \,. \end{aligned}$$
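This expectation can be confirmed by exhaustive enumeration for tiny parameters (a sketch; a = 1 and q = 3 are arbitrary illustrative choices, and indices are 0-based, so i previous pairs correspond to \(i-1\) in the paper's 1-indexed notation):

```python
from itertools import product
from fractions import Fraction

a, q, i = 1, 3, 2  # i = 2 is the third query (0-indexed)
total, count = Fraction(0), 0
for u in product(range(2**a), repeat=q):
    for v in product(range(2**a), repeat=q):
        # C_{u,v}(i): previous entries of u and v equal to u[i]
        c = sum(u[j] == u[i] for j in range(i)) + sum(v[j] == u[i] for j in range(i))
        total += c
        count += 1
assert total / count == Fraction(2 * i, 2**a)  # i.e. 2(i-1)/2^a when 1-indexed
```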

Next, we have to compute \(\mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \mid C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] \). In this case we condition on whether \(u_i = v_i\). If so, we know the value of \(C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) exactly, as it is equal to \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\). On the other hand, if \(u_i \ne v_i\), there are \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) fewer candidates, but every candidate has the higher probability \(1/(2^a-1)\) of being equal to \(v_i\). This gives the following result:

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \mid C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right]&= \mathbb {P}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ u_i = v_i\right] \cdot \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \mid C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i), u_i = v_i\right] \\&+ \mathbb {P}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ u_i \ne v_i\right] \cdot \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \mid C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i), u_i \ne v_i\right] \\&= \frac{1}{2^a} \cdot C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) + \left( 1-\frac{1}{2^a}\right) \cdot \frac{2(i-1) - C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)}{2^a-1} \\&= \frac{1}{2^a} \cdot C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) + \frac{2^a-1}{2^a} \cdot \frac{2(i-1) - C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)}{2^a-1} \\&= \frac{1}{2^a} \cdot C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) + \frac{1}{2^a} \cdot \left( 2(i-1) - C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right) \\&= \frac{2(i-1)}{2^a} \\&= \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] \,. \end{aligned}$$

By Lemma 1 this means that we have \(\mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \cdot C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] = \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] \cdot \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\right] \), so that (24) becomes

$$\begin{aligned} \mathbb {E}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ f(\textit{\textbf{u}},\textit{\textbf{v}})\right] \leqslant \sqrt{\frac{4}{2^{3b}} \sum _{i=1}^q \left( \frac{2(i-1)}{2^a}\right) ^2} \leqslant \sqrt{\frac{16}{3} \cdot \frac{q^3}{2^{2a+3b}}} \,. \end{aligned}$$

This finishes the first part of the bound.
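The product rule \(\mathbb {E}[C \cdot C'] = \mathbb {E}[C] \cdot \mathbb {E}[C']\) can be checked exhaustively for tiny parameters (a sketch; a = 1 and q = 2 are arbitrary illustrative choices, with 0-based indices):

```python
from itertools import product
from fractions import Fraction

a, q, i = 1, 2, 1  # second query, 0-indexed
EC = ECp = ECCp = Fraction(0)
count = 0
for u in product(range(2**a), repeat=q):
    for v in product(range(2**a), repeat=q):
        c  = sum(u[j] == u[i] for j in range(i)) + sum(v[j] == u[i] for j in range(i))
        cp = sum(u[j] == v[i] for j in range(i)) + sum(v[j] == v[i] for j in range(i))
        EC, ECp, ECCp = EC + c, ECp + cp, ECCp + c * cp
        count += 1
EC, ECp, ECCp = EC / count, ECp / count, ECCp / count
assert ECCp == EC * ECp  # expectations multiply despite the dependence
```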

4.5.2 Bounding (23)

We now look at (23). The event \((\textit{\textbf{u}}, \textit{\textbf{v}}) \in \mathsf {bad}_2\) means that a \(2^{b-2}\)-collision occurs inside \((\textit{\textbf{u}}, \textit{\textbf{v}})\). As \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) are chosen uniformly, the probability of a t-collision is bounded by

$$\begin{aligned} \frac{(2q)^t}{2^{a(t-1)} \cdot t!} \,, \end{aligned}$$

where we later substitute \(t = 2^{b-2}\). By Stirling’s approximation, which says that

$$\begin{aligned} t! \geqslant \sqrt{2\pi t} \left( \frac{t}{e}\right) ^t \geqslant \sqrt{2\pi } \left( 2^{-3/2} \cdot t\right) ^t \,, \end{aligned}$$

we get that

$$\begin{aligned} \frac{(2q)^t}{2^{a(t-1)} \cdot t!} \leqslant \frac{2^a}{\sqrt{2\pi }} \left( \frac{2^{5/2} \cdot q}{2^a \cdot t}\right) ^t \,. \end{aligned}$$
(25)
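The Stirling lower bound \(t! \geqslant \sqrt{2\pi }\,(2^{-3/2} t)^t\) used above is easy to confirm numerically (the values of t below are arbitrary; note that the second inequality in the bound holds because \(2^{3/2} > e\)):

```python
import math

for t in (1, 2, 4, 8, 16):
    stirling = math.sqrt(2 * math.pi * t) * (t / math.e)**t  # classical lower bound
    weaker = math.sqrt(2 * math.pi) * (2**-1.5 * t)**t       # the bound used here
    assert math.factorial(t) >= stirling >= weaker
```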

From the assumption that \(b \geqslant n/12\) and \(b \geqslant 10\) (hence \(b \leqslant 2^b/96\)), we get that \(a \leqslant n \leqslant 12b \leqslant 2^b/8 = t/2\), so

$$\begin{aligned} \frac{2^a}{\sqrt{2\pi }} \left( \frac{2^{5/2} \cdot q}{2^a \cdot t}\right) ^t \leqslant \frac{1}{\sqrt{2\pi }} \left( \frac{2^3 \cdot q}{2^a \cdot t}\right) ^t \,. \end{aligned}$$
(26)

Finally, by substituting \(t = 2^{b-2}\) we get

$$\begin{aligned} \mathbb {P}_{\textit{\textbf{u}},\textit{\textbf{v}}}\left[ (\textit{\textbf{u}},\textit{\textbf{v}}) \in \mathsf {bad}_2\right] \leqslant \frac{1}{\sqrt{2\pi }} \left( \frac{32 \cdot q}{2^{a+b}}\right) ^{2^{b-2}} \,. \end{aligned}$$

This finishes the second part of the bound.

5 Proof of Theorem 3

Let \(a,b,q \in \mathbb {N}\) and let \(\textit{\textbf{u}}= (u_1, \ldots , u_q)\) and \(\textit{\textbf{v}}= (v_1, \ldots , v_q)\) be vectors of length q with elements in \(\{0,1\}^{a}\) such that \(C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i), C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) < 2^{b-2}\) for all i. Let \(\mathcal {O}\) and \(\mathcal {R}\) be as in Algorithm 6. We denote their outputs by \(\textit{\textbf{w}}= (w_1, \ldots , w_q)\). Further, for \(i\in \{0,\ldots ,q\}\) denote \(\textit{\textbf{w}}_i=(w_1,\ldots ,w_i)\).

We will rely on the chi-squared method by Dai et al.  [22]. For each \(i=1,\ldots ,q\) and each \(\textit{\textbf{w}}_{i-1}\), define

$$\begin{aligned} \chi ^2(\textit{\textbf{w}}_{i-1}) = \sum _{w} \frac{\big (\mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \big )^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\,. \end{aligned}$$
(27)

The chi-squared method gives the following bound  [22]:

Lemma 4 (Chi-Squared Method)

Consider two systems \(\mathcal {O},\mathcal {R}\). Suppose that for any vector \(\textit{\textbf{w}}_i\), \(\mathbb {P}_{\mathcal {R}}\left[ \textit{\textbf{w}}_i\right] >0\) whenever \(\mathbb {P}_{\mathcal {O}}\left[ \textit{\textbf{w}}_i\right] >0\). Then,

$$\begin{aligned} \left\Vert \mathbb {P}_{\mathcal {O}} - \mathbb {P}_{\mathcal {R}} \right\Vert \leqslant \left( \frac{1}{2}\sum _{i=1}^q \mathbb {E}_{\mathcal {O}}\left[ \chi ^2(\textit{\textbf{w}}_{i-1})\right] \right) ^{1/2}\,. \end{aligned}$$
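For intuition, here is the lemma in its smallest instance q = 1, with world \(\mathcal {O}\) outputting a biased bit and world \(\mathcal {R}\) a uniform bit (the bias 0.7 is an arbitrary illustrative choice):

```python
import math

p = 0.7                    # O outputs 1 with probability p; R is uniform
tv = abs(p - 0.5)          # statistical distance between the two output bits
chi2 = (p - 0.5)**2 / 0.5 + ((1 - p) - 0.5)**2 / 0.5  # chi-squared divergence
assert tv <= math.sqrt(chi2 / 2)  # the bound of the chi-squared method
```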

Our proof of Theorem 3 is related to that of Dai et al.  [22], where they look at both the \(\mathsf {SoSP}\) construction of (2) for a single permutation and the \(\mathsf {SoP}\) construction of (1) based on two different permutations. In our terminology, these correspond to the cases \(\textit{\textbf{u}}= \textit{\textbf{v}}= (0^a, \ldots , 0^a)\) and \(\textit{\textbf{u}}= (0^a, \ldots , 0^a)\), \(\textit{\textbf{v}}= (1^a, \ldots , 1^a)\), respectively. Our analysis thus carefully combines and generalizes these approaches. An additional difficulty arises from the fact that the different cases depend on the values of \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\), which may be arbitrary.

In the chi-squared method we have to reason over \(\mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \). However, in our case it is difficult to do this directly, as the conditional probability does not give information about the intermediate values \(U_j\) and \(V_j\) for \(j < i\), but only about their sum \(w_j = U_j \oplus V_j\). The following lemma shows that we can assume this extra information without increasing the bound. Intuitively, this is similar to the fact that giving an adversary more information does not lower its advantage.

Lemma 5

Let \(Z_{i-1}\) be a random variable in world \(\mathcal {O}\) (but not necessarily in world \(\mathcal {R}\)). Then,

$$\begin{aligned} \mathbb {E}_{\mathcal {O}}\left[ \chi ^2(\textit{\textbf{w}}_{i-1})\right]&\leqslant \sum _w \mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \,. \end{aligned}$$

Proof

Recall that

$$\begin{aligned} \chi ^2(\textit{\textbf{w}}_{i-1})&= \sum _{w} \frac{\big (\mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \big )^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] } \,. \end{aligned}$$

Let \(\textit{\textbf{w}}_{i-1}\) and w be fixed and write \(p = \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \). Then

$$\begin{aligned}&\frac{1}{p} \left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] - p\right) ^2 \\&= \frac{1}{p} \left( \sum _z \mathbb {P}_{\mathcal {O}}\left[ Z_{i-1} = z \mid \textit{\textbf{w}}_{i-1}\right] \cdot \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1} = z\right] - p\right) ^2 \\&= \frac{1}{p} \left( \mathbb {E}_{\mathcal {O}}\left[ \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] \biggm | \textit{\textbf{w}}_{i-1}\right] - p\right) ^2 \\&= \frac{1}{p} \mathbb {E}_{\mathcal {O}}\left[ \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - p \biggm | \textit{\textbf{w}}_{i-1}\right] ^2 \\&\leqslant \frac{1}{p} \mathbb {E}_{\mathcal {O}}\left[ \left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - p\right) ^2 \biggm | \textit{\textbf{w}}_{i-1}\right] \\&= \mathbb {E}_{\mathcal {O}}\left[ \frac{1}{p}\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - p\right) ^2 \biggm | \textit{\textbf{w}}_{i-1}\right] \,. \end{aligned}$$

Furthermore, by taking the expectation on both sides we get

$$\begin{aligned}&\mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \\&\leqslant \mathbb {E}_{\mathcal {O}}\left[ \mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] } \Biggm | \textit{\textbf{w}}_{i-1}\right] \right] \\&= \mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}, Z_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \,. \end{aligned}$$

The proof is completed by combining both equations.    \(\square \)

In our case we take \(Z_i = (\textit{\textbf{U}}_i, \textit{\textbf{V}}_i)\) with \(\textit{\textbf{U}}_i = (U_1, \ldots , U_i)\) and \(\textit{\textbf{V}}_i = (V_1, \ldots , V_i)\). Note that in this case we can ignore \(\textit{\textbf{w}}_i\), as its value is fixed given \(Z_i\).

We now reformulate \(\mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] \). Given \(\textit{\textbf{U}}_{i-1}\) and \(\textit{\textbf{V}}_{i-1}\), we look at the probability that \(U_i \oplus V_i = w\) for an arbitrary w. For this, we define:

$$\begin{aligned} S_i&= \{U_j | j< i, u_j = u_i\} \cup \{V_j | j< i, v_j = u_i\}\,,\\ S'_i&= \{U_j | j< i, u_j = v_i\} \cup \{V_j | j < i, v_j = v_i\}\,. \end{aligned}$$

We write \(s_i = \left| S_i\right| \), \(s'_i = \left| S'_i\right| \), and \(D_{i,w} = \left| S_i \cap (S'_i \oplus w)\right| \).

In order for \(U_i \oplus V_i\) to be equal to w, the variable \(U_i\) must take a value from \(\{0,1\}^{b}\setminus (S_i \cup (S'_i \oplus w))\). The number of choices for this is exactly

$$\begin{aligned} 2^b - |S_i \cup (S'_i \oplus w)|&= 2^b - |S_i| - |S'_i \oplus w| + |S_i \cap (S'_i \oplus w)| \nonumber \\&= 2^b - s_i - s'_i + D_{i,w} \,. \end{aligned}$$
(28)

Moreover, the choice of \(V_i\) is fixed to \(U_i \oplus w\).
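The count (28) is plain inclusion-exclusion on \(S_i \cup (S'_i \oplus w)\); a quick check on arbitrary small sets:

```python
b, w = 3, 1
S, Sp = {1, 2, 5}, {0, 3}
shifted = {x ^ w for x in Sp}  # S'_i XOR w, elementwise
# |{0,1}^b \ (S ∪ shifted)| via inclusion-exclusion
assert 2**b - len(S | shifted) == 2**b - len(S) - len(Sp) + len(S & shifted)
```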

We claim that, regardless of whether \(u_i\) and \(v_i\) are equal or distinct,

$$\begin{aligned} \mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \leqslant \frac{8 s_i s'_i}{2^{4b}} \,. \end{aligned}$$
(29)

The proof of (29) will be given in Sect. 5.2 (for the case where \(u_i=v_i\)) and in Sect. 5.3 (for the case where \(u_i\ne v_i\)). The two proofs will rely on some probabilistic analysis of \(D_{i,w}\), given in Sect. 5.1.

Before getting there, however, we first complete the proof under the hypothesis that (29) holds. Note that \(s_i\) and \(s'_i\) do not depend on the specific values of \(U_j\) or \(V_j\); they depend only on the values of \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\). In fact \(s_i = C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\) and \(s'_i = C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i)\), which means that

$$\begin{aligned}&\left\Vert \mathbb {P}_{\mathcal {O}} - \mathbb {P}_{\mathcal {R}} \right\Vert ^2 \\&\leqslant \frac{1}{2} \sum _{i=1}^q \mathbb {E}_{\mathcal {O}}\left[ \chi ^2(\textit{\textbf{w}}_{i-1})\right] \\&\leqslant \frac{1}{2} \sum _{i=1}^q \sum _w \mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \\&\leqslant \frac{4}{2^{4b}} \sum _{i=1}^q \sum _w s_i s'_i \\&\leqslant \frac{4}{2^{3b}} \sum _{i=1}^q C_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \cdot C'_{\textit{\textbf{u}},\textit{\textbf{v}}}(i) \,. \end{aligned}$$

5.1 Expectation and Variance of \(D_{i,w}\)

The value \(D_{i,w}\) counts the number of elements \(g \in \{0,1\}^{b}\) such that \(g \in S_i\) and \(g \oplus w \in S'_i\). Our goal is to derive two bounds, one on its expected value \(\mathbb {E}\left[ D_{i,w}\right] \) and one on its variance \(\mathbf {Var}\left[ D_{i,w}\right] \), with the randomness chosen over the sets \(S_i\) and \(S'_i\), which are drawn uniformly from \(\{0,1\}^{b}\) without replacement. Note that this corresponds to world \(\mathcal {O}\). We will, again, do so for the two different cases: equal permutations (for which \(S_i\) and \(S'_i\) are identical) in Sect. 5.1.2 and different permutations (for which \(S_i\) and \(S'_i\) are independent) in Sect. 5.1.3. The proofs share common analysis, which is first given in Sect. 5.1.1.

The proof is based on Lemma 4 of Bhattacharya and Nandi  [13], that considers a variant of \(\mathsf {SoP}\) where a single output is summed with multiple other outputs, but where all outputs are still from the same permutation. We look at the special case where it is summed with just one value, but extend the analysis to the case of different independent permutations.

5.1.1 General Analysis

Let \(I_g\) be the random variable that is 1 if \(g \in S_i\) and \(g \oplus w \in S'_i\), and 0 otherwise. Note that \(D_{i,w} = \sum _{g \in \{0,1\}^{b}} I_g\). For the expectation we have that

$$\begin{aligned} \mathbb {E}\left[ I_g\right]&= \mathbb {P}\left[ g \in S_i, g \oplus w \in S'_i\right] \\&= \mathbb {P}\left[ g \in S_i\right] \mathbb {P}\left[ g \oplus w \in S'_i \mid g \in S_i\right] \,, \end{aligned}$$

where we have to compute this value separately for equal and different permutations. For the expectation of \(D_{i,w}\) we simply find

$$\begin{aligned} \mathbb {E}\left[ D_{i,w}\right] = \sum _g \mathbb {E}\left[ I_g\right] \,. \end{aligned}$$

We now look at the variance of \(D_{i,w}\). We use the following property:

$$\begin{aligned} \mathbf {Var}\left[ D_{i,w}\right]&= \mathbf {Var}\left[ \sum _g I_g\right] \nonumber \\&= \sum _g \mathbf {Var}\left[ I_g\right] + \sum _{g \ne g'} \mathbf {Cov}\left( I_g,I_{g'}\right) \,, \end{aligned}$$
(30)

where

$$\begin{aligned} \mathbf {Cov}\left( I_g,I_{g'}\right)&= \mathbb {E}\left[ I_g I_{g'}\right] - \mathbb {E}\left[ I_g\right] \mathbb {E}\left[ I_{g'}\right] \\&= \mathbb {E}\left[ I_g\right] \mathbb {P}\left[ I_{g'} = 1 \mid I_g = 1\right] - \mathbb {E}\left[ I_g\right] \mathbb {E}\left[ I_{g'}\right] \,. \end{aligned}$$

First, we will argue that \(\mathbf {Cov}\left( I_g,I_{g'}\right) \leqslant 0\) whenever \(g' \ne g\oplus w\). Indeed, if this condition is satisfied, we have that \(g'\), \(g' \oplus w\), g and \(g \oplus w\) are mutually distinct, and thus that

$$\begin{aligned} \mathbb {P}\left[ I_{g'} = 1 \mid I_g = 1\right]&= \mathbb {P}\left[ g' \in S_i, g' \oplus w \in S'_i \mid g \in S_i, g \oplus w \in S'_i\right] \\&\leqslant \mathbb {P}\left[ g' \in S_i, g' \oplus w \in S'_i\right] \\&= \mathbb {E}\left[ I_{g'}\right] \,. \end{aligned}$$

For the derivation of the inequality, we have used the following observation. On the one hand, for equal permutations, \(S_i\) and \(S'_i\) are identical, so the inequality is satisfied as the probability of having two specific elements in a set of fixed size decreases when it is known that two other elements are already in it. On the other hand, for different permutations, \(S_i\) and \(S'_i\) are independent, so the inequality boils down to two independent cases with one element instead of two. Hence, we have obtained that \(\mathbf {Cov}\left( I_g,I_{g'}\right) \leqslant 0\) whenever \(g' \ne g\oplus w\).

Having shown that \(\mathbf {Cov}\left( I_g,I_{g'}\right) \leqslant 0\) whenever \(g' \ne g\oplus w\), we can proceed as follows for the second term of (30), using in the final steps that \(I_g^2 = I_g\) and that \(\mathbb {E}\left[ I_g\right] \) does not depend on g:

$$\begin{aligned} \sum _{g \ne g'} \mathbf {Cov}\left( I_g,I_{g'}\right)&\leqslant \sum _g \mathbf {Cov}\left( I_g,I_{g \oplus w}\right) \\&= \sum _g \mathbb {E}\left[ I_g\right] \mathbb {P}\left[ I_{g \oplus w} = 1 \mid I_g = 1\right] - \mathbb {E}\left[ I_g\right] \mathbb {E}\left[ I_{g \oplus w}\right] \\&\leqslant \sum _g \mathbb {E}\left[ I_g\right] - \mathbb {E}\left[ I_g\right] \mathbb {E}\left[ I_{g \oplus w}\right] \\&= \sum _g \mathbb {E}\left[ I_g^2\right] - \mathbb {E}\left[ I_g\right] ^2 \\&= \sum _g \mathbf {Var}\left[ I_g\right] \,. \end{aligned}$$

Concluding,

$$\begin{aligned} \mathbf {Var}\left[ D_{i,w}\right]&\leqslant 2 \sum _g \mathbf {Var}\left[ I_g\right] \\&= 2 \sum _g \mathbb {E}\left[ I_g\right] (1 - \mathbb {E}\left[ I_g\right] ) \\&\leqslant 2 \sum _g \mathbb {E}\left[ I_g\right] \\&= 2 \cdot \mathbb {E}\left[ D_{i,w}\right] \,. \end{aligned}$$

5.1.2 Equal Permutations

In this case we have that \(S_i\) and \(S'_i\) are identical. This means that for \(w \ne 0^b\)

$$\begin{aligned} \mathbb {P}\left[ g \in S_i\right] \mathbb {P}\left[ g \oplus w \in S'_i \mid g \in S_i\right] = \frac{s_i(s_i-1)}{2^b(2^b-1)} \,. \end{aligned}$$

Hence, we have obtained:

$$\begin{aligned} \mathbb {E}_{\mathcal {O}}\left[ D_{i,w}\right]&= \frac{s_i(s_i-1)}{2^b-1}\,,\end{aligned}$$
(31)
$$\begin{aligned} \mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right]&\leqslant \frac{2s_i(s_i-1)}{2^b-1} \leqslant \frac{2 s_i s'_i}{2^b}\,. \end{aligned}$$
(32)
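Equation (31) and bound (32) can be double-checked by exhaustive enumeration for toy parameters, again modeling \(S_i\) as a uniformly random \(s_i\)-subset (a sketch for illustration, not part of the proof):

```python
from fractions import Fraction
from itertools import combinations

# Toy parameters for the equal-permutation case; w != 0^b.
b, s, w = 3, 3, 5
subsets = [set(S) for S in combinations(range(2 ** b), s)]

# D_{i,w} counts the g with g in S_i and g xor w in S'_i, where S'_i = S_i here
vals = [sum(1 for g in S if (g ^ w) in S) for S in subsets]
mean = Fraction(sum(vals), len(vals))
var = Fraction(sum(v * v for v in vals), len(vals)) - mean ** 2

assert mean == Fraction(s * (s - 1), 2 ** b - 1)      # equation (31)
assert var <= Fraction(2 * s * (s - 1), 2 ** b - 1)   # bound (32)
```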

5.1.3 Different Permutations

Now \(S_i\) and \(S'_i\) are independent, and hence

$$\begin{aligned} \mathbb {P}\left[ g \in S_i\right] \mathbb {P}\left[ g \oplus w \in S'_i \mid g \in S_i\right] = \frac{s_i s'_i}{2^{2b}} \,. \end{aligned}$$

Hence, we have obtained:

$$\begin{aligned} \mathbb {E}_{\mathcal {O}}\left[ D_{i,w}\right]&= \frac{s_i s'_i}{2^b}\,,\end{aligned}$$
(33)
$$\begin{aligned} \mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right]&\leqslant \frac{2 s_i s'_i}{2^b}\,. \end{aligned}$$
(34)
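As before, (33) and (34) can be double-checked by exhaustive enumeration for toy parameters, now modeling \(S_i\) and \(S'_i\) as independent uniformly random subsets (a sketch for illustration, not part of the proof):

```python
from fractions import Fraction
from itertools import combinations

# Toy parameters for the different-permutation case; S_i and S'_i independent.
b, s, sp, w = 3, 2, 3, 5
domain = range(2 ** b)
S_list = [set(S) for S in combinations(domain, s)]
Sp_list = [set(S) for S in combinations(domain, sp)]

# D_{i,w} counts the g with g in S_i and g xor w in S'_i
vals = [sum(1 for g in S if (g ^ w) in Sp) for S in S_list for Sp in Sp_list]
mean = Fraction(sum(vals), len(vals))
var = Fraction(sum(v * v for v in vals), len(vals)) - mean ** 2

assert mean == Fraction(s * sp, 2 ** b)      # equation (33)
assert var <= Fraction(2 * s * sp, 2 ** b)   # bound (34)
```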

5.2 (29) for Equal Permutations

From (28) the number of valid choices for \(U_i\) and \(V_i\) is equal to \(2^b - 2s_i + D_{i,w}\), as \(s_i = s'_i\) for equal permutations. Furthermore, the total number of possible choices is \(2^b - s_i\) for \(U_i\) and \(2^b - s_i - 1\) for \(V_i\). This means that

$$\begin{aligned} \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right]&= \frac{2^b - 2s_i + D_{i,w}}{(2^b-s_i)(2^b-s_i-1)} \\&= \frac{(2^b-1) - s_i - (s_i-1) + D_{i,w}}{((2^b-1) - (s_i-1))((2^b-1) - s_i)} \,. \end{aligned}$$

As \(0^b\) is not possible in our modified ideal world, we have that \(\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] = 1/(2^b-1)\), which results in

$$\begin{aligned}&\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2 \\&= \left( \frac{(2^b-1) - s_i - (s_i-1) + D_{i,w}}{((2^b-1) - (s_i-1))((2^b-1) - s_i)} - \frac{1}{2^b-1}\right) ^2 \\&= \left( \frac{D_{i,w} - s_i(s_i-1)/(2^b-1)}{(2^b-s_i)(2^b-s_i-1)} \right) ^2 \\&\leqslant \frac{4(D_{i,w} - s_i(s_i-1)/(2^b-1))^2}{2^{4b}} \,, \end{aligned}$$

using that \(s_i < 2^{b-2}\). We know from (31) and (32) that \(\mathbb {E}_{\mathcal {O}}\left[ D_{i,w}\right] = s_i(s_i-1)/(2^b-1)\) and \(\mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right] \leqslant 2s_i^2/2^b\) for any \(w \in \{0,1\}^{b}\setminus \{0^b\}\), hence

$$\begin{aligned}&\mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \\&\leqslant \frac{4}{2^{3b}} \cdot \mathbb {E}_{\mathcal {O}}\left[ \left( D_{i,w} - \frac{s_i(s_i-1)}{2^b-1}\right) ^2\right] \\&= \frac{4}{2^{3b}} \cdot \mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right] \\&\leqslant \frac{8 s_i s'_i}{2^{4b}} \,. \end{aligned}$$
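The simplification of the probability difference above is a polynomial identity in \(D_{i,w}\); it can be confirmed with exact rational arithmetic over small, illustrative parameter ranges:

```python
from fractions import Fraction

# Equal permutations: check as an identity in D that
#   (2^b - 2s + D)/((2^b - s)(2^b - s - 1)) - 1/(2^b - 1)
#     == (D - s(s-1)/(2^b - 1)) / ((2^b - s)(2^b - s - 1))
for b in (3, 4, 5, 6):
    N = 2 ** b
    for s in range(1, N // 4):        # s_i < 2^{b-2}
        for D in range(s + 1):        # D_{i,w} is at most s_i
            lhs = Fraction(N - 2 * s + D, (N - s) * (N - s - 1)) - Fraction(1, N - 1)
            rhs = (D - Fraction(s * (s - 1), N - 1)) / ((N - s) * (N - s - 1))
            assert lhs == rhs
```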

5.3 (29) for Different Permutations

From (28) the number of valid choices for \(U_i\) and \(V_i\) is equal to \(2^b - s_i - s'_i + D_{i,w}\). Furthermore, the total number of possible choices is \(2^b - s_i\) for \(U_i\) and \(2^b - s'_i\) for \(V_i\). This means that

$$\begin{aligned} \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right]&= \frac{2^b - s_i - s'_i + D_{i,w}}{(2^b-s_i)(2^b-s'_i)} \,. \end{aligned}$$

As \(u_i \ne v_i\), all values in \(\{0,1\}^{b}\) are possible in the ideal world, hence \(\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] = 1/2^b\). This results in

$$\begin{aligned}&\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2 \\&= \left( \frac{2^b - s_i - s'_i + D_{i,w}}{(2^b-s_i)(2^b-s'_i)} - \frac{1}{2^b}\right) ^2 \\&= \left( \frac{D_{i,w} - s_i s'_i/2^b}{(2^b-s_i)(2^b-s'_i)}\right) ^2 \\&\leqslant \frac{4(D_{i,w} - s_i s'_i / 2^b)^2}{2^{4b}} \,, \end{aligned}$$

using that \(s_i,s'_i < 2^{b-2}\). We know from (33) and (34) that \(\mathbb {E}_{\mathcal {O}}\left[ D_{i,w}\right] = s_i s'_i/2^b\) and \(\mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right] \leqslant 2 s_i s'_i/2^b\) for any \(w \in \{0,1\}^{b}\), hence

$$\begin{aligned}&\mathbb {E}_{\mathcal {O}}\left[ \frac{\left( \mathbb {P}_{\mathcal {O}}\left[ w_i = w \mid \textit{\textbf{U}}_{i-1}, \textit{\textbf{V}}_{i-1}\right] - \mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] \right) ^2}{\mathbb {P}_{\mathcal {R}}\left[ w_i = w \mid \textit{\textbf{w}}_{i-1}\right] }\right] \\&\leqslant \frac{4}{2^{3b}} \cdot \mathbb {E}_{\mathcal {O}}\left[ \left( D_{i,w} - \frac{s_i s'_i}{2^b}\right) ^2\right] \\&= \frac{4}{2^{3b}} \cdot \mathbf {Var}_{\mathcal {O}}\left[ D_{i,w}\right] \\&\leqslant \frac{8 s_i s'_i}{2^{4b}} \,. \end{aligned}$$
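The analogous simplification for different permutations is again a polynomial identity in \(D_{i,w}\), which can be confirmed the same way with exact rational arithmetic:

```python
from fractions import Fraction

# Different permutations: check as an identity in D that
#   (2^b - s - s' + D)/((2^b - s)(2^b - s')) - 1/2^b
#     == (D - s*s'/2^b) / ((2^b - s)(2^b - s'))
for b in (3, 4, 5, 6):
    N = 2 ** b
    for s in range(1, N // 4):            # s_i < 2^{b-2}
        for sp in range(1, N // 4):       # s'_i < 2^{b-2}
            for D in range(min(s, sp) + 1):
                lhs = Fraction(N - s - sp + D, (N - s) * (N - sp)) - Fraction(1, N)
                rhs = (D - Fraction(s * sp, N)) / ((N - s) * (N - sp))
                assert lhs == rhs
```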

6 Proof of Theorem 2

The proof of Theorem 2 is very similar to the proof of Theorem 1, up to a few minor differences. First of all, Steps 4.1 and 4.2 remain basically the same and can be modified in a straightforward way. Step 4.3 is slightly different, as truncation is applied to two different permutations; this leads to the term \(2\mathbf {Adv}_{\mathsf {Trunc}}^{\mathrm {prf}}(q)\) instead of the old \(\mathbf {Adv}_{\mathsf {Trunc}}^{\mathrm {prf}}(2q)\). Furthermore, Step 4.4 becomes obsolete, as we do not have to limit the range in the case of two independent permutations; this means that the term \(q/2^n\) vanishes. Finally, Step 4.5 remains roughly the same. In fact, as there are two independent permutations, \(\textit{\textbf{u}}\) and \(\textit{\textbf{v}}\) can be analyzed separately. This could be used to improve some constants, but the gain would be limited to those constants; we do not go into such detail and simply reuse the old bounds.

7 Application to GCM-SIV

GCM-SIV is a nonce misuse resistant authenticated encryption scheme of Gueron and Lindell, of which various versions exist [29, 32, 33, 39]. We consider the most recent one, which is also specified in an IETF internet draft [44]. It is built on top of a block cipher \(E:\{0,1\}^{\kappa }\times \{0,1\}^{n}\rightarrow \{0,1\}^{n}\), and the internet draft considers an instantiation with AES-128 (where \(\kappa =n=128\)) or AES-256 (where \(\kappa =256\) and \(n=128\)).

If E is instantiated with AES-128, the first step of an evaluation of GCM-SIV is to derive two 128-bit subkeys \(k_1\parallel k_2\in \{0,1\}^{256}\) based on key k and nonce \(\nu \) as in (6):

$$\begin{aligned} \begin{aligned} k_1&= \mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 1))\,,\\ k_2&= \mathsf {left}_{n/2}(E_k(\nu \Vert 2)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 3))\,. \end{aligned} \end{aligned}$$
(35)

Then, the associated data, message, and nonce are properly fed to the GHASH universal hash function (keyed with \(k_1\)), its outcome is encrypted using \(E_{k_2}\), and the resulting value is used as the tag. This tag is subsequently used as input to counter mode based on \(E_{k_2}\) to obtain a keystream, which is added to the plaintext to obtain the ciphertext. If E, on the other hand, is instantiated with AES-256, the procedure is the same but with a 128-bit subkey \(k_1\) and a 256-bit subkey \(k_2\), i.e., \(k_1\parallel k_2\in \{0,1\}^{384}\), as derived in (6):

$$\begin{aligned} \begin{aligned} k_1&= \mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 1))\,,\\ k_2&= \mathsf {left}_{n/2}(E_k(\nu \Vert 2)) \parallel \cdots \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 5))\,. \end{aligned} \end{aligned}$$
(36)

We refer to [52, Fig. 3] for a clean picture of this algorithm.

The isolated character of the key derivation function in GCM-SIV is also well-reflected in its security bound. The security bound of GCM-SIV as outlined by Mennink and Neves [52, Theorem 3], which is in turn taken from Iwata and Seurin [41], consists of two separate terms:

  • A term upper bounding the PRF security of the key derivation function, namely

    $$\begin{aligned} \mathbf {Adv}_{\mathsf {Trunc}_{n/2}}^{\mathrm {prf}}(c \cdot q) + \mathbf {Adv}_{E}^{\mathrm {prp}}(c\cdot q, t)\,, \end{aligned}$$
    (37)

    where q is the number of invocations of the key derivation function, and where \(c=4\) for the 128-bit keyed variant and \(c=6\) for the 256-bit keyed variant;

  • A term that describes the security of GCM-SIV as an authenticated encryption scheme once \(k_1\) and \(k_2\) are uniformly random. This term is irrelevant for the current discussion.

Now, if we were to replace the truncation in the key derivation of GCM-SIV (Eqs. (35) and (36)) by \(\mathsf {STH}\), we would get

$$\begin{aligned} k_1&= \mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 1))\,,\\ k_2&= \mathsf {right}_{n/2}(E_k(\nu \Vert 0) \oplus E_k(\nu \Vert 1)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 2)) \end{aligned}$$

for the 128-bit keyed variant, and

$$\begin{aligned} k_1 =&\,\mathsf {left}_{n/2}(E_k(\nu \Vert 0)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 1))\,,\\ k_2 =&\,\mathsf {right}_{n/2}(E_k(\nu \Vert 0) \oplus E_k(\nu \Vert 1)) \parallel \\&\, \mathsf {left}_{n/2}(E_k(\nu \Vert 2)) \parallel \mathsf {left}_{n/2}(E_k(\nu \Vert 3)) \parallel \mathsf {right}_{n/2}(E_k(\nu \Vert 2) \oplus E_k(\nu \Vert 3)) \end{aligned}$$

for the 256-bit keyed variant. When we use \(\mathsf {STH}\), we see that for the derivation of a 256-bit subkey the underlying block cipher E is called three times instead of four times, and for the derivation of a 384-bit subkey it is called four times instead of six times. As for security, the original bound of Iwata and Seurin  [41] (see also [52, Theorem 3]) carries over with (37) replaced by

$$\begin{aligned} \mathbf {Adv}_{\mathsf {STH}_{n/2}}^{\mathrm {prf}}(2\cdot q) + \mathbf {Adv}_{E}^{\mathrm {prp}}(c\cdot q, t)\,, \end{aligned}$$

where \(c=3\) for the 128-bit keyed variant and \(c=4\) for the 256-bit keyed variant. As the PRF security of \(\mathsf {STH}_{n/2}\) (Theorem 1) is comparable to the PRF security of truncation (Lemma 2), there is no significant loss in security. In particular, when we allow for a maximum advantage of \(2^{-32}\), we are able to derive approximately \(2^{64}\) different keys for both instantiations, even when \(t \gg 2^{64}\). Hence, security does not degrade when using the more efficient \(\mathsf {STH}\) version.
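To make the efficiency gain concrete, the following sketch contrasts the truncation-based derivation (35) with the \(\mathsf {STH}\)-based one for the 128-bit keyed variant, counting block cipher calls. The function toy_E is a hypothetical stand-in for E (a hash-based toy, not a real pseudorandom permutation), used only to count invocations:

```python
import hashlib

B = 16              # block size in bytes (n = 128 bits)
calls = {"n": 0}

def toy_E(key: bytes, x: bytes) -> bytes:
    # Toy stand-in for the block cipher E -- NOT a real PRP, illustration only.
    calls["n"] += 1
    return hashlib.sha256(key + x).digest()[:B]

def inp(nu: bytes, ctr: int) -> bytes:
    # encode nonce || counter as one n-bit block
    return nu + ctr.to_bytes(B - len(nu), "big")

def left(x: bytes) -> bytes:
    return x[: B // 2]

def right(x: bytes) -> bytes:
    return x[B // 2 :]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def derive_trunc_128(k: bytes, nu: bytes):
    # truncation-based derivation (35): four calls to E
    e = [toy_E(k, inp(nu, i)) for i in range(4)]
    return left(e[0]) + left(e[1]), left(e[2]) + left(e[3])

def derive_sth_128(k: bytes, nu: bytes):
    # STH-based derivation: three calls to E
    e = [toy_E(k, inp(nu, i)) for i in range(3)]
    k1 = left(e[0]) + left(e[1])
    k2 = right(xor(e[0], e[1])) + left(e[2])
    return k1, k2

k, nu = b"\x00" * 16, b"\x01" * 12   # toy key, 96-bit nonce as in GCM-SIV
calls["n"] = 0
derive_trunc_128(k, nu)
n_trunc = calls["n"]
calls["n"] = 0
k1, k2 = derive_sth_128(k, nu)
n_sth = calls["n"]
assert (n_trunc, n_sth) == (4, 3)
assert len(k1) == len(k2) == 16
```

This is only a call-count illustration; in GCM-SIV itself E would be AES and the subkeys would feed GHASH and counter mode as described above.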

We conclude by noting that this only discusses the key derivation in isolation. As bijectivity in the key derivation is not an issue in the bigger picture of GCM-SIV, one can get away with simply using untruncated block ciphers [17]. However, there are many more applications where replacing block cipher evaluations by \(\mathsf {STH}\) truly leads to security gains, most notably Wegman-Carter and counter mode encryption, as also outlined in Sect. 1.