The development of large-scale programmable quantum hardware has opened the door to testing a fundamental question in the theory of computation: can quantum computers outperform classical ones for certain tasks? This idea, termed quantum computational advantage, has motivated the design of novel algorithms and protocols to demonstrate advantage with minimal quantum resources such as qubit number and gate depth [AA11, FH19, BMS16, LBR17, HM17, TER18, BIS+18, BFN+19, AC17, NRK+18]. Such protocols are naturally characterized along two axes: the computational speedup and the ease of verification. The former distinguishes whether a quantum algorithm exhibits a polynomial or super-polynomial speedup over the best known classical one. The latter classifies whether the correctness of the quantum computation is efficiently verifiable by a classical computer. Along these axes lie three broad paths to demonstrating advantage: 1) sampling from entangled quantum many-body wavefunctions, 2) solving a deterministic problem, e.g. prime factorization, via a quantum algorithm, and 3) proving quantumness through interactive protocols.
Sampling-based protocols directly rely on the classical hardness of simulating quantum mechanics [AA11, BMS16, BIS+18, BFN+19, AC17, NRK+18]. The “computational task” is to prepare and measure a generic complex many-body wavefunction with little structure. As such, these protocols typically require minimal resources and can be implemented on near-term quantum devices [AAB+19, ZWD+20]. The correctness of the sampling results, however, is exponentially difficult to verify. This has an important consequence: in the regime beyond the capability of classical computers, the sampling results cannot be explicitly checked, and quantum computational advantage can only be inferred (e.g. extrapolated from simpler circuits).
Algorithms in the second class of protocols are naturally broken down by whether they exhibit polynomial or super-polynomial speed-ups. In the case of polynomial speed-ups, there exist notable examples that are provably faster than any possible classical algorithm [BGK18, BGK+20]. However, polynomial speed-ups are tremendously challenging to demonstrate in practice, due to the slow growth of the separation between classical and quantum run-times. (They also have other caveats: a provable speedup of O(1) quantum complexity over O(n) classical complexity is promising, but just reading the input may require O(n) time, hiding the computational speedup in practice.) Accordingly, the most attractive algorithms for demonstrating advantage tend to be those with a super-polynomial speed-up, including Abelian hidden subgroup problems such as factoring and discrete logarithms [SHO97]. The challenge is that for all known protocols of this type, the quantum circuits required to demonstrate advantage are well beyond the capabilities of near-term experiments.
The final class of protocols demonstrates quantum advantage through an interactive proof [BCM+21, BKV+20, ABE+17, WAT99, KW00, KM03, FV15, MFI+20]. At a high level, this type of protocol involves multiple rounds of communication between the classical verifier and the quantum prover; the prover must give self-consistent responses despite not knowing what the verifier will ask next. This requirement of self-consistency rules out a broad range of classical cheating strategies and can imbue “hardness” into questions that would otherwise be easy to answer. To this end, interactive protocols expand the space of computational problems that can be used to demonstrate quantum advantage; from a more pragmatic perspective, this can enable the realization of efficiently verifiable quantum advantage on near-term quantum hardware.
Recently, a beautiful interactive protocol was introduced that can operate both as a test for quantum advantage and as a generator of certifiable quantum randomness [BCM+21]. The core of the protocol is a two-to-one function, f, built on the computational problem known as learning with errors (LWE) [REG09]. The demonstration of advantage leverages two important properties of the function: first, it is claw-free, meaning that it is computationally hard to find a pair of inputs (x0,x1) such that f(x0)=f(x1). (The term "claw-free" is often used to refer to a pair of functions f0,f1 such that for appropriate x0,x1 we have f0(x0)=f1(x1). Here, we use the slightly more general idea of a single 2-to-1 function f for which it is hard to find x0,x1 such that f(x0)=f(x1); this is a special case of a "collision-resistant function," which could potentially be many-to-one. We also note that a claw-free pair of functions can be converted into a single claw-free function by defining f(b||x)=fb(x), where || denotes concatenation.) Second, there exists a trapdoor: given some secret data t, it becomes possible to efficiently invert f and reveal the pair of inputs mapping to any output. (See Section 5.7.5 for an overview of trapdoor claw-free functions.) However, to fully protect against cheating provers, the protocol requires a stronger version of the claw-free property called the adaptive hardcore bit, namely, that for any input x0 (which may be chosen by the prover), it is computationally hard to find even a single bit of information about x1. (To be precise, it is hard to find both x0 and the parity of any subset of the bits of x1.) The need for an adaptive hardcore bit within this protocol severely restricts the class of functions that can operate as verifiable tests of quantum advantage.
Here, we propose and analyze a novel interactive quantum advantage protocol that removes the need for an adaptive hardcore bit, with essentially zero overhead in the quantum circuit and no extra cryptographic assumptions. We present four main results. First, we demonstrate how an idea from tests of Bell’s inequality can serve the same cryptographic purpose as the adaptive hardcore bit [BEL64]. In essence, our interactive protocol is a variant of the CHSH (Clauser, Horne, Shimony, Holt) game [CHS+69] in which one player is replaced by a cryptographic construction. Normally, in CHSH, two quantum parties are asked to produce correlations that would be impossible for classical devices to produce. If space-like separation is enforced to rule out communication between the two parties, then the correlations constitute a proof of quantumness. In our case, the space-like separation is replaced by the computational hardness of a cryptographic problem. In particular, the quantum prover holds a qubit whose state depends on the cryptographic secret in the same way that the state of one CHSH player’s qubit depends on the secret measurement basis of the other player. An alternative interpretation, from the perspective of Bell’s theorem, is that the protocol can be thought of as a “single-detector Bell test”—the cryptographic task generates the same single-qubit state as would be produced by entangling a second qubit and measuring it with another detector. As in the CHSH game, a quantum device can pass the verifier’s test with probability ∼85%, but a classical device can only succeed with probability at most 75%. This finite gap in success probabilities is precisely what enables a verifiable test of quantum advantage.
Second, by removing the need for an adaptive hardcore bit, our protocol accepts a broader landscape of functions for interactive tests of quantum advantage (see Table 5.1 and Methods). We populate this list with two new constructions. The first is based on the decisional Diffie-Hellman problem (DDH) [DH76, PW08, FGK+10]; the second utilizes the function fN(x) = x² mod N with N the product of two primes, which forms the backbone of the Rabin cryptosystem [RAB79, GMR88]. On the one hand, DDH is appealing because the elliptic-curve version of the problem is particularly hard for classical computers [MIL86, KOB87, BAR16]. On the other hand, x² mod N can be implemented significantly more efficiently, while its hardness is equivalent to factoring. We hope that these two constructions will provide a foundation for the search for more trapdoor claw-free functions (TCFs) with desirable properties, namely small key sizes and efficient quantum circuits.
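To make the trapdoor property concrete, the following is a toy sketch in Python, with illustrative parameters far too small to be cryptographically hard (the function names are ours). Given the secret factors p and q of N, square roots modulo N can be computed from square roots modulo each prime combined via the Chinese remainder theorem, revealing the claw that is hidden without the trapdoor.

```python
# Toy sketch of the TCF f_N(x) = x^2 mod N: hard to invert without the
# trapdoor (the factorization of N), easy with it. Parameters are tiny
# and purely illustrative; real instances use ~1000-bit primes.

def f(x, N):
    return x * x % N

def sqrt_mod_prime(a, p):
    # For primes p ≡ 3 (mod 4), a^((p+1)/4) mod p is a square root of a.
    r = pow(a, (p + 1) // 4, p)
    return r if r * r % p == a % p else None

def invert_with_trapdoor(y, p, q):
    """All x in [0, N) with x^2 ≡ y (mod N), given the trapdoor p, q."""
    N = p * q
    rp, rq = sqrt_mod_prime(y % p, p), sqrt_mod_prime(y % q, q)
    if rp is None or rq is None:
        return []
    roots = set()
    for sp in (rp, p - rp):
        for sq in (rq, q - rq):
            # CRT: the unique x with x ≡ sp (mod p) and x ≡ sq (mod q)
            x = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % N
            roots.add(x)
    return sorted(roots)

p, q = 7, 11                  # both ≡ 3 (mod 4)
N = p * q
x0 = 13
y = f(x0, N)
# Restricting the domain to [0, N/2] makes f two-to-one on valid inputs.
claw = [x for x in invert_with_trapdoor(y, p, q) if x <= N // 2]
print(claw)                   # → [13, 20]
```

Here 13² ≡ 20² ≡ 15 (mod 77), so (13, 20) is a claw; finding it without p and q is as hard as factoring N.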
Third, we describe two innovations that facilitate our protocol’s use in practice: a way to significantly reduce overhead arising from the reversibility requirement of quantum circuits, and a scheme for increasing noisy devices’ probability of passing the test. Normally, quantum implementations of classical functions like the TCFs used in this protocol have some overhead, due to the need to make the circuit reversible in order to be consistent with unitarity [BEN89, LS90, AKN98, BIC+04, KTR14]. Our protocol exhibits the surprising property that it permits a measurement scheme to discard so-called “garbage bits” that arise during the computation, allowing classical circuits to be converted into quantum ones with essentially zero overhead. In the case of a noisy quantum device, the protocol also enables an inherent post-selection scheme for detecting and removing certain types of quantum errors. With this scheme it is possible for quantum devices to trade off low quantum fidelities for an increase in the overall runtime, while still passing the cryptographic test. We note that these constructions are likely applicable to other TCF-based quantum cryptography protocols as well, and thus may be of independent interest for tasks such as certifiable quantum random number generation.
Finally, focusing on the TCF x² mod N, we provide explicit quantum circuits, both asymptotically optimal ones (requiring only O(n log n) gates and O(n) qubits) and ones aimed at near-term quantum devices. We show that a verifiable test of quantum advantage can be achieved with ∼10³ qubits and a gate depth of ∼10⁵ (see Methods). We also co-design a specific implementation of x² mod N optimized for a programmable Rydberg-based quantum computing platform. The native physical interaction corresponding to the Rydberg blockade mechanism enables the direct implementation of multi-qubit-controlled arbitrary phase rotations without the need to decompose such gates into universal two-qubit operations [SAF16, LKS+19, GKG+19, MCS+20, BL20]. Access to such a native gate immediately reduces the gate depth for achieving quantum advantage by an order of magnitude.
Problem | Claw-free | Trapdoor | Adaptive hardcore bit | Gate count
---|---|---|---|---
LWE [BCM+21] | ✓ | ✓ | ✓ | n² log² n
x² mod N | ✓ | ✓ | ✗ | n log n
Ring-LWE [BKV+20] | ✓ | ✓ | ✗ | n log² n
Diffie-Hellman | ✓ | ✓ | ✗ | n³ log² n
Shor's alg. | — | — | — | n² log n
The use of trapdoor claw-free functions for quantum cryptographic tasks was pioneered in two recent breakthrough protocols: (i) a scheme providing classical homomorphic encryption for quantum circuits [MAH20], and (ii) a protocol for generating cryptographically certifiable quantum randomness from an untrusted black-box device [BCM+21]; this latter work also introduced the notion of an adaptive hardcore bit and serves as an efficiently verifiable test of quantum advantage. Remarkably, the scheme was further extended to allow a classical server to cryptographically verify the correctness of arbitrary quantum computations [MAH18]; it has also been applied to remote state preparation, with implications for secure delegated computation [GV19].
Recently, an improvement to the practicality of TCF-based proofs of quantumness was provided in the random oracle model (ROM)—a model of computation in which both the quantum prover and classical verifier can query a third-party "oracle," which returns a random (but consistent) output for each input. In that work, the authors provide a protocol that both removes the need for the adaptive hardcore bit, and also reduces the interaction to a single round [BKV+20]. Because the security of the protocol is proven in the ROM, implementing this protocol in practice requires applying the random oracle heuristic, in which the random oracle is replaced by a cryptographic hash function, but the hardness of classically defeating the protocol is taken to still hold. (Replacing the random oracle with a hash function is termed a heuristic rather than an assumption because the security of this procedure generally holds in practice but is not provable; in fact, there exist constructions that are provably secure in the random oracle model but trivially insecure when instantiated with a hash function [CGH04].) Only contrived cryptographic schemes have ever been broken by attacking the random oracle heuristic [CGH04, KM15], so it seems to be effective in practice, and the ROM protocol has significant potential for use as a practical tool for benchmarking untrusted quantum servers. On the other hand, for a robust experimental test of the foundational complexity-theoretic claims of quantum computing—that quantum mechanics allows for algorithms that are superpolynomially faster than classical Turing machines—we desire the complexity-theoretic backing of the speedup to be as strong as possible (i.e. provable in the "standard model" of computation [AC17]), which is the goal pursued in the present work.
With that said, we emphasize that the various optimizations described below—including the TCF families based on DDH and x2modN, as well as the schemes for postselection and discarding garbage bits—can be applied to the ROM protocol as well.
Our full protocol is shown diagrammatically in Figure 5.1. It consists of three rounds of interaction between the prover and verifier (with a “round” being a challenge from the verifier, followed by a response from the prover). The first round generates a multi-qubit superposition over two bit strings that would be cryptographically hard to compute classically. The second round maps this superposition onto the state of one ancilla qubit, retaining enough information to ensure that the resulting single-qubit state is also hard to compute classically. The third round takes this single qubit as input to a CHSH-type measurement, allowing the prover to generate a bit of data that is correlated with the cryptographic secret in a way that would not be possible classically. Having described the intuition behind the protocol, we now lay out each round in detail.
The goal of the first round is to generate a superposition over two colliding inputs to the trapdoor claw-free function (TCF). It begins with the verifier choosing an instance fi of the TCF along with the associated trapdoor data t; fi is sent to the prover. As an example, in the case of x² mod N, the "index" i is the modulus N, and the trapdoor data is its factorization p, q. The prover now initializes two registers of qubits, which we denote as the x and y registers. On these registers, they compute the entangled superposition |ψ⟩ = ∑x |x⟩x |fi(x)⟩y over all x in the domain of fi. The prover then measures the y register in the standard basis, collapsing the state to (|x0⟩ + |x1⟩)x |y⟩y, with y = f(x0) = f(x1). The measured bitstring y is then sent to the verifier, who uses the secret trapdoor to compute x0 and x1 in full.
At this point, the verifier randomly chooses to either request a projective measurement of the x register, ending the protocol, or to continue with the second and third rounds. In the former case, the prover communicates the result of that measurement, yielding either x0 or x1, and the verifier checks that indeed f(x)=y. In the latter case, the protocol proceeds with the final two rounds.
The second round of interaction converts the many-qubit superposition |ψ⟩ = |x0⟩x + |x1⟩x into one of the single-qubit states {|0⟩b, |1⟩b, |+⟩b, |−⟩b} on an ancilla qubit b; the final state of b depends on the values of both x0 and x1. The round begins with the verifier choosing a random bitstring r of the same length as x0 and x1, and sending it to the prover. Using a series of CNOT gates from the x register to b, the prover computes the state |r⋅x0⟩b |x0⟩x + |r⋅x1⟩b |x1⟩x, where r⋅x denotes the binary inner product. Finally, the prover measures the x register in the Hadamard basis, storing the result as a bitstring d which is sent to the verifier. This measurement disentangles x from b without collapsing b's superposition. At the end of the second round, the prover's state is (−1)^(d⋅x0) |r⋅x0⟩b + (−1)^(d⋅x1) |r⋅x1⟩b, which is one of {|0⟩, |1⟩, |+⟩, |−⟩}. Crucially, it is cryptographically hard to predict whether this state is one of {|0⟩, |1⟩} or {|+⟩, |−⟩}.
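The bookkeeping in this round can be sketched classically; the snippet below is not a quantum simulation, just the state expression above evaluated on example bitstrings of our choosing.

```python
# Which single-qubit state an honest prover holds after round 2, following
# (-1)^(d·x0)|r·x0⟩ + (-1)^(d·x1)|r·x1⟩ (up to global phase). Integers
# stand in for bitstrings; all example values are arbitrary.

def dot(a, b):
    return bin(a & b).count("1") % 2          # binary inner product

def round2_state(x0, x1, r, d):
    b0, b1 = dot(r, x0), dot(r, x1)
    s0, s1 = (-1) ** dot(d, x0), (-1) ** dot(d, x1)
    if b0 == b1:
        # Both branches land on the same basis state. Outcomes d whose
        # phases would cancel (s0 != s1) have zero amplitude and never occur.
        assert s0 == s1, "impossible measurement outcome"
        return str(b0)                        # |0⟩ or |1⟩
    # Branches differ: an equal superposition, |+⟩ or |-⟩ up to global phase.
    return "+" if s0 == s1 else "-"

# Whether the result lies in {|0⟩,|1⟩} or {|+⟩,|-⟩} is set by r·(x0⊕x1),
# which is exactly what a classical cheater cannot predict.
print(round2_state(0b1011, 0b0110, 0b1100, 0b0101))  # → 1
```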
The final round of our protocol can be understood in analogy to the CHSH game [CHS+69]. While the prover cannot extract the polarization axis from their single qubit (echoing the no-signaling property of CHSH), they can make a measurement that is correlated with it. This measurement outcome ultimately constitutes the proof of quantumness. In particular, the verifier requests a measurement in an intermediate basis, rotated from the Z axis around Y, by either θ=π/4 or −π/4. Because the measurement basis is never perpendicular to the state, there will always be one outcome that is more likely than the other (specifically, with probability cos2(π/8)≈0.85). The verifier returns Accept if this “more likely” outcome is the one received.
In the next section, we prove that a quantum device can cause the verifier to Accept with substantially higher probability than any classical prover. A full test of quantum advantage would consist of running the protocol many times, until it can be established with high statistical confidence that the device has exceeded the classical probability bound.
We now prove completeness (the noise-free quantum success probability) and soundness (an upper bound on the classical success probability). Recall that after the first round of the protocol, the verifier chooses to either request a standard basis measurement of the first register, or to continue with the second and third rounds. In the proofs below, we analyze the prover’s success probability across these two cases separately. We denote the probability that the verifier will accept the prover’s string x in the first case as px, and the probability that the verifier will accept the single-qubit measurement result in the second case as pCHSH.
An error-free quantum device honestly following the interactive protocol will cause the verifier to return Accept with px=1 and pCHSH=cos2(π/8)≈0.85.
If the verifier chooses to request a projective measurement of x after the first round, an honest quantum prover succeeds with probability px=1 by inspection.
If the verifier chooses to instead perform the rest of the protocol, the prover will hold one of {|0⟩,|1⟩,|+⟩,|−⟩} after round 2. In either measurement basis the verifier may request in round 3, there will be one outcome that occurs with probability cos2(π/8), which is by construction the one the verifier accepts. Thus, an honest quantum prover has pCHSH=cos2(π/8)≈0.85. ∎
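The completeness claim can be checked numerically. The sketch below writes the four possible round-2 states as explicit real 2-vectors and confirms that each has one outcome of probability cos²(π/8) in both of the verifier's measurement bases.

```python
import math

# Check that each of |0⟩, |1⟩, |+⟩, |-⟩, measured in the basis rotated by
# θ = ±π/4 about Y, has a "likely" outcome with probability cos²(π/8).

states = {
    "0": (1.0, 0.0),
    "1": (0.0, 1.0),
    "+": (1 / math.sqrt(2), 1 / math.sqrt(2)),
    "-": (1 / math.sqrt(2), -1 / math.sqrt(2)),
}

def accept_prob(state, theta):
    # Rotated basis: cos(θ/2)|0⟩ + sin(θ/2)|1⟩ and its orthogonal complement.
    a, b = states[state]
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    p0 = (c * a + s * b) ** 2       # probability of the rotated-|0⟩ outcome
    return max(p0, 1 - p0)          # the verifier accepts the likelier outcome

target = math.cos(math.pi / 8) ** 2
for name in states:
    for theta in (math.pi / 4, -math.pi / 4):
        assert abs(accept_prob(name, theta) - target) < 1e-12
print(f"honest success probability in every case: {target:.4f}")
```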
Assume the function family used in the interactive protocol is claw-free. Then, px and pCHSH for any classical prover must obey the relation
px + 4pCHSH − 4 < ϵ(n)   (5.1)
where ϵ is a negligible function of n, the length of the function family’s input strings.
We prove by contradiction. Assume that there exists a classical machine A for which px+4pCHSH−4≥μ(n), for a non-negligible function μ. We show that there exists another algorithm B that uses A as a subroutine to find a pair of colliding inputs to the claw-free function, a contradiction.
Given a claw-free function instance fi, B acts as a simulated verifier for A. B begins by supplying fi to A, after which A returns a value y, completing the first round of interaction. B now chooses to request the projective measurement of the x register, and stores the result as x0. Letting px0 be the probability that x0 is a valid preimage, by definition of px we have px0=px.
Next, B rewinds the execution of A to its state before x0 was requested. Crucially, rewinding is possible because A is a classical algorithm. B now proceeds by running A through the second and third rounds of the protocol for many different values of the bitstring r (Fig. 5.1), rewinding each time.
We now show that, for r selected uniformly at random, B can extract the value of the inner product r⋅x1 with probability pr⋅x1≥1−2(1−pCHSH). B begins by sending r to A, and receiving the bitstring d. B then requests the measurement result in both the θ=π/4 and θ=−π/4 bases, by rewinding in between. Supposing that both the received values are “correct” (i.e. would be accepted by the real verifier), they uniquely determine the single-qubit state |ψ⟩∈{|0⟩,|1⟩,|+⟩,|−⟩} that would be held by an honest quantum prover. This state reveals whether r⋅x0=r⋅x1, and because B already holds x0, B can compute r⋅x1. We may define the probability (taken over all randomness except the choice of θ) that the prover returns an accepting value in the cases θ=π/4 and θ=−π/4 as pπ/4 and p−π/4 respectively. Then, via union bound, the probability that both are indeed correct is pr⋅x1≥1−(1−pπ/4)−(1−p−π/4). Considering that pCHSH=(pπ/4+p−π/4)/2, we have pr⋅x1≥1−2(1−pCHSH).
Now, we show that extracting r⋅x1 in this way allows x1 to be determined in full even in the presence of noise, by rewinding many times and querying for specific (correlated) choices of r. In particular, the above construction is a noisy oracle to the encoding of x1 under the Hadamard code. By the Goldreich-Levin theorem [GL89], list decoding applied to such an oracle will generate a polynomial-length list of candidates for x1. If the noise rate of the oracle is noticeably less than 1/2, x1 will be contained in that list; B can iterate through the candidates until it finds one for which f(x1)=y.
By Lemma 1 in the Methods, for a particular iteration of the protocol, the probability that list decoding succeeds is bounded by px1 > 2pr⋅x1 − 1 − 2μ′(n), for a noticeable function μ′(n) of our choice. (The oracle's noise rate is not simply pr⋅x1: that is the probability that any single value r⋅x1 is correct, but all of the queries to the oracle are correlated, as they are for the same iteration of the protocol and thus the same value of y.) Setting μ′(n) = μ(n)/4 and combining with the previous result yields px1 > 1 − 4(1 − pCHSH) − μ(n)/2.
Finally, via union bound, the probability that B returns a claw is

PB ≥ 1 − (1 − px0) − (1 − px1) > px + 4pCHSH − 4 − μ(n)/2,

and via the assumption that px + 4pCHSH − 4 ≥ μ(n) we have PB > μ(n)/2, a contradiction. ∎
If we let px=1, the bound requires that pCHSH<3/4+ϵ(n) for a classical device, while pCHSH≈0.85 for a quantum device, matching the classical and quantum success probabilities of the CHSH game. In Section 5.7.7, we provide an example of a classical algorithm saturating the bound with px=1 and pCHSH=3/4.
In this section we describe two variations on the protocol, both of which aim to remove the need for the "preimage" test (Step 6a of Fig. 5.1). The main benefit of doing so is that it simplifies and improves the classical bound to simply p ≤ 3/4 + ϵ(n), where p is now the overall probability that the prover succeeds (equivalent to pCHSH in the normal protocol, since px no longer exists). A secondary benefit is a slightly simpler experimental implementation.
The idea is simple: in Step 6b of Fig. 5.1, instead of choosing a single random bitstring r, the verifier chooses two, r0 and r1. Then, in Step 7b, instead of using the single r for both x0 and x1, the prover computes |r0⋅x0⟩b |x0⟩x + |r1⋅x1⟩b |x1⟩x, taking a different inner product for each of the preimages. Applying the proof of Theorem 3 to this scheme, the responses of the classical machine A can be used to reconstruct whether r0⋅x0 = r1⋅x1 (where originally we reconstructed simply whether r⋅x0 = r⋅x1). The key insight is that the truth value of this new equality is determined by (r0||r1)⋅(x0||x1), where || denotes concatenation. This fact can be used to construct a noisy oracle for the inner product of x0||x1 with arbitrary strings, to which the Goldreich-Levin theorem can be applied to find x0||x1, fully revealing both x0 and x1. (This should be compared to the original proof, which could only decode x0⊕x1 via the Goldreich-Levin theorem, and thus required the preimage test to supply x0 or x1 and thereby reveal the claw.) Since x0 and x1 can both be reconstructed from only the CHSH portion of the protocol, the "preimage" test is not necessary for classical hardness and can be removed. (This variant of the protocol was published in [BGK+23].)
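The concatenation identity underlying this variant is easy to verify directly (the example bitstrings below are arbitrary):

```python
# Check that r0·x0 ⊕ r1·x1 = (r0||r1)·(x0||x1) for bitstrings encoded as
# integers, so the two inner products agree exactly when the concatenated
# inner product is 0.

def dot(a, b):
    return bin(a & b).count("1") % 2

def concat(hi, lo, n):
    return (hi << n) | lo              # hi||lo for n-bit strings

n = 8
examples = [(0b10110001, 0b01011100, 0b11100110, 0b00011011),
            (0b11111111, 0b00000001, 0b10101010, 0b01010101)]
for r0, r1, x0, x1 in examples:
    lhs = dot(r0, x0) ^ dot(r1, x1)
    rhs = dot(concat(r0, r1, n), concat(x0, x1, n))
    assert lhs == rhs
print("identity holds on all examples")
```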
The downside of this variation of the protocol is that the prover needs some way to distinguish x0 from x1, so that the appropriate inner product can be taken with each. For many TCFs, such as the one based on LWE [BCM+21] and the DDH-based TCF we present in this chapter, this is not a problem: there is an extra qubit in the preimages which is in the state |0⟩ for x0 and |1⟩ for x1. For x² mod N, however, it is not so straightforward. It is technically possible to distinguish the two preimages via the Jacobi symbol, because for x² mod N one preimage always has Jacobi symbol +1 and the other −1. But actually computing the Jacobi symbol is very expensive, much more so than computing x² mod N itself, defeating our goal of an efficient implementation! Another, somewhat less expensive, strategy is to switch to the pair of functions f0(x) = x² mod N and f1(x) = 4x² mod N, with their domain defined as the set of quadratic residues less than N (instead of the set of integers [0, N/2] used before). By splitting into two functions we get the desired "marker" qubit distinguishing the two preimages, but we run into the problem of generating a uniform superposition of quadratic residues modulo N. To our knowledge the best way to generate such a superposition is to start with the set of all integers less than N and square them; then another squaring must be performed to actually implement the TCF. Thus, using this TCF would require a quantum circuit twice as large as in the original protocol using x² mod N, a tradeoff that is probably not worth the extra simplicity of removing the preimage test. That being said, if a function other than x² mod N is used which does have the extra qubit, this variation is almost certainly the right choice.
We also note that we learned via personal correspondence with Eitan Porat, Zvika Brakerski, and Thomas Vidick that the original protocol is in fact classically hard even without the preimage test. The intuitive idea is that the "measurement results" convey more information than just whether r⋅x0 = r⋅x1: when that equality holds, we also gain access to the value of r⋅x0 (and of r⋅x1, since they are equal). With this extra information it is possible to use a more complicated scheme based on the Goldreich-Levin theorem to decode x0 and x1 in full, proving the hardness of passing just the CHSH portion directly from the hardness of finding claws. The resulting classical bound, however, is apparently not quite as tight, owing to the more complicated decoding process.
The existence of a finite gap between the classical and quantum success probabilities implies that our protocol can tolerate a certain amount of noise. A direct implementation of our interactive protocol on a noisy quantum device would require an overall fidelity of ∼83% in order to exceed the classical bound. (This number comes from solving the classical bound, Equation 5.1, for the circuit fidelity F, with px = F and pCHSH = F cos²(π/8) + (1 − F)/2.) To allow devices with lower fidelities to demonstrate quantum advantage, our protocol allows for a natural tradeoff between fidelity and runtime, such that the classical bound can, in principle, be exceeded with only a small [e.g. 1/poly(n)] amount of coherence in the quantum device. (This is true even if the coherence is exponentially small in n. Of course, at arbitrarily low coherence the runtime may become so large that quantum advantage cannot be demonstrated; the point is that regardless of runtime, the classical probability bound can be exceeded by a device with arbitrarily low circuit fidelity.)
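The fidelity threshold can be computed directly, assuming the noise model px = F and pCHSH = F·cos²(π/8) + (1 − F)/2 (i.e. a fully depolarized run still passes the CHSH test half the time); this modeling choice is ours, matched to the ∼83% figure quoted above.

```python
import math

# Solve the classical bound px + 4·pCHSH − 4 = 0 for the circuit fidelity F,
# with px = F and pCHSH = F·cos²(π/8) + (1−F)/2:
#   F + 4*(F*c + (1-F)/2) - 4 = 0   =>   F*(4c - 1) = 2
c = math.cos(math.pi / 8) ** 2
F_threshold = 2 / (4 * c - 1)       # equals 2/(1+√2), since 4c = 2+√2
print(f"required circuit fidelity: {F_threshold:.3f}")   # ≈ 0.828
```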
The key idea is based upon postselection. For most TCFs, there are many bitstrings of the correct length that are not valid outputs of f. Thus, if the prover detects such a y value in step 3 (Fig. 5.1), they can simply discard it and try again. (This scheme only removes errors in the first round of the protocol, but fortunately, one expects the overwhelming majority of the quantum computation, and thus also the majority of errors, to occur in that round.) In principle, the verifier can even use their trapdoor data to silently detect and discard iterations of the protocol with invalid y. (This procedure does not leak data to a classical cheater, because the verifier does not communicate which runs were discarded. Furthermore, it does not affect the soundness of Theorem 3, because the machine B in that theorem's proof can simply iterate until it encounters a valid y.) Since y is a function of x0 and x1, one might hope that this postselection scheme also rejects states where x0 or x1 has become corrupt. Although this may not always be the case, we demonstrate numerically that this assumption holds for a specific implementation of x² mod N in the following subsection. One could also compute a classical checksum of x0 and x1 before and after the main circuit to ensure that they have not changed during its execution. Assuming that such bit-flip errors are indeed rejected, the possibility remains of an error in the phase between |x0⟩ and |x1⟩. In Section 5.7.9, we demonstrate that a prover holding the correct bitstrings but with an error in the phase can still saturate the classical bound; if the prover can avoid phase errors even a small fraction of the time, they will push past the classical threshold.
Focusing on the function f(x) = x² mod N, we now explicitly analyze the effectiveness of the postselection scheme. Let m be the length of the outputs of this function. In this case, approximately 1/4 of the bitstrings of length m are valid outputs, so one would naively expect to reject about 3/4 of corrupted bitstrings. By introducing additional redundancy into the outputs of f and thus increasing m, one can further decrease the probability that a corrupted y will incorrectly be accepted. As an example, let us consider mapping x² mod N to the function (kx)² mod k²N for some integer k. This is particularly convenient because the prover can validate y by simply checking whether it is a multiple of k². Moreover, the mapping adds only log k bits to the size of the problem, while rejecting a fraction 1 − 1/k² of corrupted bitstrings.
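The k² check is simple to simulate. Below we use a toy modulus and model corrupted outputs as uniformly random bitstrings of the correct length, which is an assumption made for illustration:

```python
import random

# Every valid output of (kx)^2 mod k^2·N is a multiple of k^2, so a random
# corrupted bitstring survives the check with probability ≈ 1/k^2.

def f_redundant(x, N, k):
    return (k * x) ** 2 % (k * k * N)

N, k = 77, 9                        # toy parameters
assert all(f_redundant(x, N, k) % (k * k) == 0 for x in range(N))

rng = random.Random(1)
m = (k * k * N).bit_length()        # output length in bits
trials = 200_000
accepted = sum(rng.randrange(1 << m) % (k * k) == 0 for _ in range(trials))
print(f"corrupted strings accepted: {accepted / trials:.4f}"
      f"  (1/k² = {1 / k**2:.4f})")
```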
We perform extensive numerical simulations demonstrating that postselection allows quantum advantage to be achieved using noisy devices with low circuit fidelities (Fig. 5.2). We simulate quantum circuits for (kx)² mod k²N at a problem size of n = 512 bits. Assuming a uniform gate fidelity across the circuit, we analyze the success rate of a quantum prover for k = 3^a with a ∈ {0, 1, 2, 3}. For these simulations we use our implementation of the Karatsuba algorithm (see Section 5.5.1), because it is the most efficient in terms of gate count and depth. The choice of k = 3^a, and details of the simulation, are explained in Section 5.7.9.
For a = 0, the circuit implements our original function x² mod N, for which, in the absence of postselection, an overall circuit fidelity of F ∼ 0.83 is required to achieve quantum advantage. As depicted in Fig. 5.2(a), even for a = 0, our postselection scheme lowers the advantage threshold to F ∼ 0.51. For a = 2, circuit fidelities of F ≳ 0.1 remain well above the quantum advantage threshold, while for a = 3 the required circuit fidelity drops below 1%.
However, there is a tradeoff. In particular, one expects the overall runtime to increase for two reasons: (i) there will be a slight increase in the circuit size for a>0 and (ii) one may need to re-run the quantum circuit many times until a valid y is measured. Somewhat remarkably, a runtime overhead of only 4.7x already enables quantum advantage to be achieved with an overall circuit fidelity of 10% [Fig. 5.2(b)]. Crucially, this increase in runtime is overwhelmingly due to re-running the quantum circuit and does not imply the need for longer experimental coherence times.
The central computational step in our interactive protocol (i.e. step 2, Fig. 5.1) is for the prover to apply a unitary of the form:
U_{f_i} ∑_x |x⟩_x |0^{⊗m}⟩_y = ∑_x |x⟩_x |f_i(x)⟩_y,    (5.2)
where f_i(x) is a classical function and m is the length of the output register. This type of unitary operation is ubiquitous across quantum algorithms, and a common strategy for its implementation is to convert the gates of a classical circuit into quantum gates. Generically, this process induces substantial overhead in both time and space complexity owing to the need to make the circuit reversible to preserve unitarity [BEN89, LS90]. This reversibility is often achieved by using an additional register, g, of so-called “garbage bits” and implementing: U′_{f_i} ∑_x |x⟩_x |0^{⊗m}⟩_y |0^{⊗l}⟩_g = ∑_x |x⟩_x |f_i(x)⟩_y |g_i(x)⟩_g. For each gate in the classical circuit, enough garbage bits are added to make the operation injective. In general, to maintain coherence, these bits cannot be discarded but must be “uncomputed” later, adding significant complexity to the circuits.
A particularly appealing feature of our protocol is the existence of a measurement scheme to discard garbage bits, allowing for the direct mapping of classical to quantum circuits with no overhead. Specifically, we envision the prover measuring the qubits of the g register in the Hadamard basis and storing the results as a bitstring h, yielding the state,
|ψ⟩ = ∑_x (−1)^{h⋅g_i(x)} |x⟩_x |f_i(x)⟩_y.    (5.3)
The prover has avoided the need to do any uncomputation of the garbage bits, at the expense of introducing phase flips onto some elements of the superposition. These phase flips do not affect the protocol, so long as the verifier can determine them. While classically computing h⋅g_i(x) is efficient for any x, computing it for all terms in the superposition is infeasible for the verifier. However, our protocol provides a natural way around this. The verifier can wait until the prover has collapsed the superposition onto x_0 and x_1, before evaluating g_i(x) only on those two inputs (this is possible because g_i(x) is the result of adding extra output bits to the gates of a classical circuit, which is efficient to evaluate on any input).
Crucially, the prover can measure away garbage qubits as soon as they would be discarded classically, instead of waiting until the computation has completed. If these qubits are then reused, the quantum circuit will use no more space than the classical one. This feature allows for significant improvements in both gate depth and qubit number for practical implementations of the protocol (see last rows of Table I in Methods). We note that performing many individual measurements on a subset of the qubits is difficult on some experimental systems, which may make this technique challenging to use in practice. However, recent hardware advances have demonstrated these “intermediate measurements” in practice with high fidelity, for example by spatially shuttling trapped ions [ZKL+21, RBL+21]. We thus expect that the capability to perform partial measurements will not be a barrier in the near term. This issue can also be mitigated somewhat by collecting ancilla qubits and measuring them in batches rather than one-by-one, allowing for a direct trade-off between ancilla usage and the number of partial measurements.
Before moving on to proposals for the physical implementation of this protocol, I would like to briefly summarize some of my unsuccessful efforts to find new constructions for trapdoor claw-free functions, in the hope that they will be helpful to anyone trying to do so in the future. Broadly, the goal is to come up with a TCF that can be implemented in as small a quantum circuit as possible, primarily in terms of the number of qubits and the number of gates. Other potentially important statistics include circuit depth (parallelism) and the spatial locality of the gates.
We will focus on the x^2 mod N-based TCF in the later sections of this chapter because it seems to strike the best balance in achieving the goals above, but it is not perfect: the modulus N needs to be quite large for the problem to be classically hard, which has negative consequences for both the qubit and gate counts. For example, considering just qubit count for the moment, if we desire the security of a 1024-bit modulus, there is a hard lower bound of 1024 qubits required to implement the circuit (and in practice, the circuit will probably require considerably more than that). This should be compared to the fact that, in the average case, circuits of fewer than 100 qubits with sufficient depth are infeasible to classically simulate, so there is a large gap between the hardness of simulation and the hardness of the cryptography. Ideally, we would make that gap as small as possible. The DDH-based TCF also proposed in this chapter has the potential to improve the gap considerably: when implemented using elliptic curve cryptography, the group elements can be as small as a couple hundred bits long while the hardness assumption remains secure. Unfortunately, the gate count required to implement that TCF is dramatically worse than for x^2 mod N, which is why we do not focus our efforts on building circuits for it.
Given these considerations, I expended considerable effort looking for other cryptographic assumptions that could be used to build a trapdoor claw-free function. Coming up with new, more efficient TCFs from the ground up is a daunting pursuit: making public-key cryptography more efficient is of central concern in classical cryptography, and it has been a subject of intense research for years. So instead of trying to break new ground there, a more modest goal is to take other existing schemes for public-key cryptography which do not have the precise structure of a TCF, and build TCFs out of them.
In my efforts to do so, one promising candidate seemed to be the Learning Parity with Noise (LPN) problem, which has found use in classical cryptography on devices with very limited computational power, such as RFID cards. The structure of the LPN problem is similar to that of LWE, but the linear algebra takes place over the field F_2 of binary numbers instead of the integers modulo some large q [PIE12]. To be explicit, consider a binary matrix A ∈ {0,1}^{m×n}, with, say, m = 2n. For a secret string s ∈ {0,1}^n and “error” vector e ∈ {0,1}^m, consider the “noisy” image of s defined as y = As + e. The LPN hardness assumption states that for appropriate settings of the problem parameters, given only y and A it is computationally hard (even for a quantum computer) to recover s unless A has some special structure. (When I first learned about LPN I became extremely interested in exploring the classical hardness of the problem. I wrote the first, to my knowledge, GPU-accelerated solver for it, and ended up breaking the world record for the largest instance that had been solved. After about a year I was unseated by another GPU-based implementation. The competition can be found at https://decodingchallenge.org/syndrome; I encourage the reader to try their hand at it!) Obviously recovery is hard if the noise vector e is overwhelming; the problem is interesting because it seems to remain hard even when e is quite sparse (most entries are zero). One can see the potential here for simplicity of implementation: performing the linear algebra requires only addition and multiplication of numbers in F_2, which correspond simply to XOR and AND gates. This is dramatically less complicated than the addition and multiplication circuits for integers modulo some large value q, which are required to implement LWE.
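For concreteness, here is a minimal sketch of generating an LPN instance over F_2; all sizes and the noise rate are illustrative placeholders, far too small to be hard:

```python
import random

# Minimal LPN instance over F_2 (toy sizes; a real instance would use much
# larger n and a carefully chosen noise rate).
random.seed(1)
n, m = 16, 32                 # secret length and number of samples (m = 2n)
tau = 0.1                     # per-bit noise probability (sparse errors)

A = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
s = [random.randint(0, 1) for _ in range(n)]
e = [1 if random.random() < tau else 0 for _ in range(m)]

def mat_vec(A, x):
    # matrix-vector product over F_2: multiplication is AND, addition is XOR
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

# the "noisy image" y = As + e; recovering s from (A, y) is the LPN problem
y = [(ax + ei) % 2 for ax, ei in zip(mat_vec(A, s), e)]
```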
The challenge is to figure out how to build a TCF out of this hardness assumption. Considering the similarity of the LWE and LPN problems, an obvious idea is to follow the structure of the LWE TCF, and define two functions roughly as
f_0(x) = Ax    (5.4)
f_1(x) = Ax + y    (5.5)
Using the definition of y, we see that f_1(x) = A(x + s) + e, and thus that for a pair (x_0, x_1) where x_0 = x_1 + s, we have f_0(x_0) = f_1(x_1) + e; that is, almost a claw, apart from the error vector e (most of whose entries are zero). But for the protocol to work, we need an exact collision rather than an approximate one. In LWE, this is achieved by adding extra error e′ to the output of both f_0 and f_1, to “smear out” the values. If the distribution of e′ is sufficiently wider than that of e, then e disappears into the noise and the probability distributions have good overlap, yielding collisions. Unfortunately, despite considerable effort, it does not seem possible to apply the same trick to LPN. The problem stems from the very reason LPN seemed promising: the linear algebra is over F_2 instead of F_q. Intuitively, because each value can only be 0 or 1, there is simply no “room” for a wider probability distribution over the elements of an extra noise vector e′. (In fact, the LWE TCF requires q to be very large precisely for this reason.) Perhaps there is some other scheme to create exact collisions from these near-collisions in LPN, like rounding the outputs somehow, but I was never able to find one.
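The near-claw structure above is easy to verify numerically. In this toy sketch (illustrative sizes only), the outputs of f_0(x_1 + s) and f_1(x_1) differ exactly by the sparse error vector e:

```python
import random

# Toy check of the "near-claw": over F_2, with f0(x) = Ax and f1(x) = Ax + y
# where y = As + e, the pair x0 = x1 + s satisfies f0(x0) = f1(x1) + e.
random.seed(2)
n, m = 8, 16
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
s = [random.randint(0, 1) for _ in range(n)]
e = [1 if random.random() < 0.1 else 0 for _ in range(m)]

def mv(A, x):
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

def add(u, v):
    return [(a + b) % 2 for a, b in zip(u, v)]  # addition over F_2 is XOR

y = add(mv(A, s), e)
f0 = lambda x: mv(A, x)
f1 = lambda x: add(mv(A, x), y)

x1 = [random.randint(0, 1) for _ in range(n)]
x0 = add(x1, s)
diff = add(f0(x0), f1(x1))  # difference of the two outputs over F_2
assert diff == e            # a collision, up to the sparse noise e
```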
Looking at the problem more broadly, it actually seems very unlikely that it is possible to create perfect collisions in this way, because doing so would break the assumption of post-quantum hardness of LPN, which is widely believed to hold. The reason is that this pair of functions could be used as an oracle for Simon’s algorithm, which would allow a quantum device to find s very efficiently [SIM97]. The only hope seems to be the fact that Simon’s algorithm requires the functions to collide perfectly on all but an exponentially small fraction of inputs, so perhaps if the collisions are not perfect, the LPN assumption would not be broken. However, even broadening the search to look for such “noisy” TCFs based on LPN has yet to yield any useful constructions. One last idea is that maybe there is a way to use LPN in an entirely different manner to create a TCF; but for that, it’s not even clear where to start.
As just discussed, while all of the trapdoor claw-free functions listed in Table 5.1 can be utilized within our interactive protocol, each has its own set of advantages and disadvantages. For example, the TCF based on the Diffie-Hellman problem (described in the Methods) already enables a demonstration of quantum advantage at a key size of 160 bits (with a hardness equivalent to 1024-bit integer factorization [BAR16]); however, building a circuit for this TCF requires a quantum implementation of Euclid’s algorithm, which is challenging [HJN+20]. Thus, we focus on designing quantum circuits implementing Rabin’s function, x^2 mod N.
In Chapter 7 we present what are, to our knowledge, the most highly optimized circuits known for x^2 mod N. Here, we present four more basic circuits that exhibit the range of possible implementations of x^2 mod N and provide a good point of comparison for the optimizations in that chapter. For the circuits presented here, implementations in Python using the Cirq library are included as supplementary files (code is available at https://github.com/GregDMeyer/quantum-advantage and is archived on Zenodo [MEY22]). The first two are quantum implementations of the classical Karatsuba and “schoolbook” integer multiplication algorithms, where we leverage the reversibility optimizations described in Section 5.3.5 (see Section 5.7.8 for details of their implementation). The latter pair, which we call the “phase circuits” and describe below, are intrinsically quantum algorithms that use Ising interactions to directly compute x^2 mod N in the phase. Using those circuits, we propose a near-term demonstration of our interactive protocol on a Rydberg-based quantum computer [LKS+19, BL20]; crucially, the so-called “Rydberg blockade” interaction natively realizes multi-qubit controlled phase rotations, from which the entire circuits shown in Figure 5.3 are built (up to single-qubit rotations). A comparison of approximate gate counts for each of the four circuits is given in Table 5.2. Of the circuits explored here, the Karatsuba algorithm is the most efficient in total gates and circuit depth, while the phase circuits are most efficient in terms of qubit usage and measurement complexity. Chapter 7 manages to combine the benefits of both, yielding circuits with gate counts better than the Karatsuba circuits here, and qubit usage and measurement complexity comparable to the phase circuits.
We now describe the two circuits, amenable to near-term quantum devices, that utilize quantum phase estimation to implement the function f(x) = x^2 mod N. The intuition behind our approach is as follows: we compute x^2/N in the phase and transfer it to an output register via an inverse quantum Fourier transform [DRA00, BEA03]; the modulo operation occurs automatically as the phase wraps around the unit circle, avoiding the need for a separate reduction step.
In order to implement ∑_x |x⟩_x |x^2 mod N⟩_y, we design a circuit to compute:
(I ⊗ IQFT) Ũ_{wN} (I ⊗ H^{⊗m}) |x⟩ |0^{⊗m}⟩ = |x⟩ |w⟩    (5.6)
where H is a Hadamard gate, IQFT represents an inverse quantum Fourier transform, w ≡ x^2/N = 0.w_1 w_2 ⋯ w_m is an m-bit binary fraction (we must take m > n + O(1) to sufficiently resolve the value x^2 mod N in post-processing), and Ũ_{wN} is the diagonal unitary,
Ũ_{wN} |x⟩ |z⟩ = exp(2πi (x^2/N) z) |x⟩ |z⟩.    (5.7)
The simplest circuit implementing Ũ_{wN} decomposes x and z in binary and performs a digit-by-digit multiplication using the schoolbook algorithm:
exp(2πi (x^2/N) z) = ∏_{i,j,k} exp(2πi (2^{i+j+k}/N) x_i x_j z_k),    (5.8)
With this, one immediately finds that Ũ_{wN} is equivalent to applying a series of controlled-controlled-phase rotation gates of angle,
ϕ_{ijk} = 2π ⋅ 2^{i+j+k}/N (mod 2π).    (5.9)
Here, the control qubits are i,j in the x register, while the target qubit is k in the y register. Crucially, the value of this phase for any i,j,k can be computed classically when the circuit is compiled.
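As a sketch of that compile-time step, the following toy code (hypothetical small modulus and register sizes, simplified index conventions) tabulates the angles, one entry per value of ℓ = i + j + k:

```python
import math

# Classical precomputation of the doubly-controlled phase angles
# phi_{ijk} = 2*pi * 2^(i+j+k) / N (mod 2*pi). Since the angle depends only
# on ell = i + j + k, one table entry per value of ell suffices.
N = 77           # toy modulus (a real instance uses a large biprime)
n, m = 7, 10     # input and output bits (illustrative sizes)

def phi(ell):
    # reduce 2^ell mod N first so the angle lands in [0, 2*pi)
    return 2 * math.pi * pow(2, ell, N) / N

table = [phi(ell) for ell in range(2 * n + m)]
```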
Figure 5.3 shows two explicit circuits to implement Ũ_{wN}, one optimizing for qubit count and the other for gate count. The first circuit [Fig. 5.3(a)] takes advantage of the fact that the output register is measured immediately after it is computed; this allows one to replace the m output qubits with a single qubit that is measured and reused m times. Moreover, by replacing groups of doubly-controlled gates with a Toffoli and a series of singly-controlled gates, one ultimately arrives at an implementation that requires n^3/2 + O(n^2) gates, but only n + O(1) qubits. We note that this does require individual measurement and re-use of qubits, which has been a challenge for experiments; recent experiments, however, have demonstrated this capability [ZKL+21, RBL+21].
The second circuit [Fig. 5.3(b)], which optimizes for gate count, leverages the fact that ϕ_{ijk} (Eqn. 5.9) only depends on i + j + k, allowing one to combine gates with a common sum. In this case, one can define ℓ = i + j and then, for each value of ℓ, simply “count” the number of values of i, j for which both control qubits are 1. By then performing controlled gates conditioned on the qubits of the counter register, one can reduce the total gate complexity by a factor of n/log n, leading to an implementation with 2n^2 log n + O(n^2) gates.
Motivated by recent advances in the creation and control of many-body entanglement in programmable quantum systems [ZPH+17, AAB+19, SSW+21, EWL+21], we propose an experimental implementation of our interactive protocol based upon neutral atoms coupled to Rydberg states [BL20]. We envision a three dimensional system of either alkali or alkaline-earth atoms trapped in an optical lattice or optical tweezer array [Fig. 5.4(a)] [WZC+15, WKW+16, KWG+18]. To be specific, we consider 87Rb with an effective qubit degree of freedom encoded in hyperfine states: |0⟩=|F=1,mF=0⟩ and |1⟩=|F=2,mF=0⟩. Gates between atoms are mediated by coupling to a highly-excited Rydberg state |r⟩, whose large polarizability leads to strong van der Waals interactions. This microscopic interaction enables the so-called Rydberg “blockade” mechanism—when a single atom is driven to its Rydberg state, all other atoms within a blockade radius, Rb, become off-resonant from the drive, thereby suppressing their excitation [Fig. 5.4(a,b)] [SAF16].
Somewhat remarkably, this blockade interaction enables the native implementation of all multi-qubit-controlled phase gates depicted in the circuits in Figure 5.3. In particular, consider the goal of applying a C^k R^ℓ_ϕ gate; this gate applies phase rotations {ϕ_1, ϕ_2, …, ϕ_ℓ} to target qubits {j_1, j_2, …, j_ℓ} if all k control qubits {i_1, i_2, …, i_k} are in the |1⟩ state [Fig. 5.4(d)]. Experimentally, this can be implemented as follows: (i) sequentially apply (in any order) resonant π-pulses on the |0⟩ ↔ |r⟩ transition for the k desired control atoms, (ii) off-resonantly drive the |1⟩ ↔ |r⟩ transition of each target atom with detuning Δ and Rabi frequency Ω for a time duration T = 2π/(Ω^2 + Δ^2)^{1/2} [Fig. 5.4(c)], and (iii) sequentially apply [in the opposite order as in (i)] resonant −π-pulses (i.e. π-pulses with the opposite phase) to the k control atoms to bring them back to their original state. The intuition for why this experimental sequence implements the C^k R^ℓ_ϕ gate is straightforward. The first step creates a blockade if any of the control qubits are in the |0⟩ state, while the second step imprints a phase, ϕ = π(1 − Δ/√(Δ^2 + Ω^2)), on the |1⟩ state, only in the absence of a blockade. Note that tuning the values of ϕ_i for each of the target qubits simply corresponds to adjusting the detuning and Rabi frequency of the off-resonant drive in the second step [Fig. 5.4(c,d)].
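As a quick numerical sanity check of the phase formula (units and values arbitrary, not tied to any particular experiment):

```python
import math

# Sketch of the phase imprinted in step (ii): an off-resonant drive of the
# |1> <-> |r> transition with detuning Delta and Rabi frequency Omega, applied
# for T = 2*pi / sqrt(Delta^2 + Omega^2), imprints
# phi = pi * (1 - Delta / sqrt(Delta^2 + Omega^2)) on |1> when unblockaded.
def blockade_phase(delta, omega):
    return math.pi * (1 - delta / math.sqrt(delta**2 + omega**2))

# a resonant drive (Delta = 0) gives phi = pi; large positive detuning gives
# phi -> 0, and large negative detuning gives phi -> 2*pi, so sweeping the
# detuning tunes the phase over the full range
assert math.isclose(blockade_phase(0.0, 1.0), math.pi)
```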
Demonstrations of our protocol can already be implemented in current-generation Rydberg experiments, where a number of essential features have recently been shown, including: 1) the coherent manipulation of individual qubits trapped in a 3D tweezer array [WZC+15, WKW+16], 2) the deterministic loading of atoms in a 3D optical lattice [KWG+18], and 3) fast entangling gate operations with fidelities F ≥ 0.974 [LKS+19, GKG+19, MCS+20]. In order to estimate the number of entangling gates achievable within decoherence time scales, let us imagine choosing a Rydberg state with a principal quantum number n ≈ 70. This yields a strong van der Waals interaction, V(r) = C_6/r^6, with a C_6 coefficient ∼ 2π × 880 GHz⋅μm^6 [LWN+12]. Combined with a coherent driving field of Rabi frequency Ω ∼ 2π × 1–10 MHz, the van der Waals interaction can lead to a blockade radius of up to R_b = (C_6/Ω)^{1/6} ∼ 10 μm. Within this radius, one can arrange ∼10^2 all-to-all interacting qubits, assuming an atom-to-atom spacing of approximately a_0 ≈ 2 μm (this spacing is ultimately limited by a combination of the optical diffraction limit and the orbital size of n ≈ 70 Rydberg states). In current experiments, the decoherence associated with the Rydberg transition is typically limited by a combination of inhomogeneous Doppler shifts and laser phase/intensity noise, leading to 1/T_2 ∼ 10–100 kHz [dBL+18, LKS+19, LSF+21b]. Taking everything together, one should be able to perform ∼10^3 entangling gates before decoherence occurs (comparable to the number of two-qubit entangling gates possible in other state-of-the-art platforms [AAB+19, SBT+18]). While this falls short of enabling an immediate full-scale demonstration of classically verifiable quantum advantage, we hasten to emphasize that the ability to directly perform multi-qubit entangling operations significantly reduces the cost of implementing our interactive protocol.
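The estimates in this paragraph can be reproduced with a few lines of arithmetic. The inputs below are just the values quoted above, and the final line crudely takes the ratio of the fastest drive frequency to the slowest decoherence rate:

```python
# Back-of-the-envelope check of the figures quoted in the text; the factors
# of 2*pi in C6 and Omega cancel in the ratios, so we work in plain Hz.
C6 = 880e9                    # van der Waals coefficient, Hz * um^6 (n ~ 70)
Omega_min = 1e6               # slow end of the drive range, Hz
Rb = (C6 / Omega_min) ** (1 / 6)   # blockade radius: ~10 um at the slow end

a0 = 2.0                      # atom-to-atom spacing, um
n_qubits = (Rb / a0) ** 3     # ~10^2 all-to-all interacting qubits

Omega_max = 10e6              # fast end of the drive range, Hz
T2_inv = 10e3                 # slow end of the 10-100 kHz decoherence range
n_gates = Omega_max / T2_inv  # ~10^3 entangling gates before decoherence
```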
For example, the standard decomposition of a Toffoli gate uses 6 CNOT gates and 7 T and T† gates, with a gate depth of 12 [NC11, SM09, BBC+95]; an equivalent three qubit gate can be performed in a single step via the Rydberg blockade mechanism.
The interplay between classical and quantum complexities ultimately determines the threshold for any quantum advantage scheme. Here, we have proposed a novel interactive protocol for classically verifiable quantum advantage based upon trapdoor claw-free functions; in addition to proposing two new TCFs [Table 5.1], we also provide explicit quantum circuits that leverage the microscopic interactions present in a Rydberg-based quantum computer. Our work allows near-term quantum devices to move one step closer to a loophole-free demonstration of quantum advantage, and opens the door to a number of promising future directions.
First, the proof of soundness contained in this chapter only applies to classical adversaries. Since the work in this chapter was originally published, a work by several colleagues and myself has extended the cryptographic proofs to the quantum case. In particular, we show that when the protocol from this work is instantiated with a quantum-secure TCF like the one based on LWE, it can be used to certify certain facts about the inner workings of the quantum device, with implications for quantum cryptographic applications such as certifiable random number generation or even the verification of arbitrary computations [BGK+23]. Second, our work has motivated the search for new trapdoor claw-free functions, as discussed in Section 5.4. At least one new construction has been discovered since this work was published [AMR22]; ideally more will be found as the search continues. More broadly, one could also attempt to build modified protocols which simplify either the requirements on the cryptographic function or the interactions; interestingly, recent work has demonstrated that using random oracles can remove the need for interaction in a TCF-based proof of quantumness [BKV+20], or even remove the need for a TCF entirely [YZ22]! Finally, while we have focused our experimental discussions on Rydberg atoms, a number of other platforms also exhibit features that facilitate the protocol’s implementation. For example, both trapped ions and cavity-QED systems can allow all-to-all connectivity, while superconducting qubits can be engineered to have biased noise [PSG+20]. This latter feature would allow noise to be concentrated into error modes detectable by our proposed postselection scheme.
In this section we prove a bound on the probability that list decoding will succeed for a particular value of y, given an oracle’s noise rate over all values of y. Recall that by the Goldreich-Levin theorem [GL89], list decoding of the Hadamard code is possible if the noise rate is noticeably less than 1/2.
Consider a binary-valued function over two inputs, g : Y × {0,1}^n → {0,1}, and a noisy oracle G for that function. Assuming some distribution of values y ∈ Y and r ∈ {0,1}^n, define ϵ ≡ Pr_{y,r}[G(y,r) ≠ g(y,r)] as the “noise rate” of the oracle. Now define the conditional noise rate for a particular y ∈ Y as
ϵ_y ≡ Pr_r[G(y,r) ≠ g(y,r)]    (5.10)
Then the probability, over randomly selected y, that ϵ_y is less than 1/2 − μ(n) for any positive function μ is
p_good ≡ Pr_y[ϵ_y < 1/2 − μ(n)] ≥ 1 − 2ϵ − 2μ(n).    (5.11)
Let S ⊆ Y be the set of y values for which ϵ_y < 1/2 − μ(n). Then by definition we have
ϵ = p_good ⋅ ϵ_{y∈S} + (1 − p_good) ⋅ ϵ_{y∉S}    (5.12)
where ϵ_{y∈S} denotes the average of ϵ_y over y ∈ S, and similarly for ϵ_{y∉S}.
Noting that by definition we must have ϵ_y ≥ 1/2 − μ(n) for y ∉ S, we may minimize the right-hand side of Equation 5.12, yielding the bound
ϵ > p_good ⋅ 0 + (1 − p_good) ⋅ (1/2 − μ(n))    (5.13)
Rearranging this expression, and using the fact that (1 − p_good) μ(n) ≤ μ(n), we arrive at
p_good > 1 − 2ϵ − 2μ(n)
which is what we desired to show. ∎
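The bound can also be checked empirically; in this toy sketch the per-y noise rates are drawn from an arbitrary distribution, with y taken uniform so that ϵ is just their average:

```python
import random

# Empirical sanity check of p_good >= 1 - 2*eps - 2*mu(n): draw arbitrary
# per-y noise rates eps_y, compute the overall rate eps as their average
# (uniform distribution over y), and compare against the bound.
random.seed(3)
num_y, mu = 200, 0.05
eps_y = [random.uniform(0.0, 0.5) for _ in range(num_y)]
eps = sum(eps_y) / num_y
p_good = sum(1 for ey in eps_y if ey < 0.5 - mu) / num_y
assert p_good >= 1 - 2 * eps - 2 * mu   # holds for any distribution of eps_y
```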
Here we present two trapdoor claw-free function families (TCFs) for use in the protocol of this paper. These families are defined by three algorithms: Gen, a probabilistic algorithm which selects an index i specifying one function in the family and outputs the corresponding trapdoor data t; fi, the definition of the function itself; and T, a trapdoor algorithm which efficiently inverts fi for any i, given the corresponding trapdoor data t. Here we provide the definitions of the function families; proofs of their cryptographic properties are included in the supplementary information. In these definitions we use a security parameter λ following the notation of cryptographic literature; λ is informally equivalent to the “problem size” n defined in the main text as the length of the TCF input string.
“Rabin’s function”, f_N(x) = x^2 mod N, with N the product of two primes, was first used in the context of public-key cryptography and digital signatures [RAB79, GMR88]. We use it to define the trapdoor claw-free function family F_Rabin, as follows.
Function generation
Gen(1^λ)
Randomly choose two prime numbers p and q, each of length λ/2 bits, with p ≡ q ≡ 3 (mod 4). (In practice, p and q must be selected with some care such that Fermat factorization and Pollard’s p − 1 algorithm [POL74] cannot be used to efficiently factor N classically. Selecting p and q in the same manner as for RSA encryption would be effective [RSA78].)
Return N=pq as the function index, and the tuple (p,q) as the trapdoor data.
Function definition
f_N : [N/2] → [N] is defined as
f_N(x) = x^2 mod N    (5.14)
The domain is restricted to [N/2] to remove extra trivial collisions of the form (x,−x).
Trapdoor
The trapdoor algorithm is the same as the decryption algorithm of the Rabin cryptosystem [RAB79]. On input y and key (p, q), the Rabin decryption algorithm returns four integers (x_0, x_1, −x_0, −x_1) in the range [0, N). x_0 and x_1 can then be selected by choosing the two values that are smaller than N/2. See the proof in the supplementary information for an overview of the algorithm.
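As an illustration of Gen, f_N, and the trapdoor together, here is a toy instance with insecure, hand-sized primes (purely for exposition):

```python
# Toy instance of F_Rabin (insecure parameters, for illustration only).
p, q = 7, 11            # both are 3 mod 4; real instances use large primes
N = p * q               # published function index; (p, q) is the trapdoor

def f(x):
    return x * x % N    # domain restricted to x < N/2

def trapdoor(y):
    # square roots mod p and q via the (p+1)/4 exponent trick (valid since
    # p = q = 3 mod 4), then CRT to combine them into roots mod N
    a = pow(y, (p + 1) // 4, p)
    b = pow(y, (q + 1) // 4, q)
    c = q * pow(q, -1, p)       # c = 1 mod p, 0 mod q
    d = p * pow(p, -1, q)       # d = 0 mod p, 1 mod q
    roots = {(a * c + b * d) % N, (a * c - b * d) % N,
             (-a * c + b * d) % N, (-a * c - b * d) % N}
    return sorted(r for r in roots if r < N / 2)   # the claw (x0, x1)

x0, x1 = trapdoor(f(12))   # recovers the claw containing x = 12
```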
We now present a trapdoor claw-free function family F_DDH based on the decisional Diffie-Hellman problem (DDH). DDH is defined for a multiplicative group G; informally, the DDH assumption states that for a group generator g and two integers a and b, given g, g^a, and g^b it is computationally hard to distinguish g^{ab} from a random group element. We expand on a known DDH-based trapdoor one-way function construction [PW08, FGK+10], adding the claw-free property to construct a TCF.
Function generation
Gen(1^λ)
Choose a group G of order q ∼ O(2^λ), and a generator g for that group.
For dimension k > log_2 q, choose a random invertible matrix M ∈ Z_q^{k×k}.
Compute g^M = (g^{M_{ij}}) ∈ G^{k×k} (element-wise exponentiation).
Choose a secret vector s ∈ {0,1}^k; compute the vector g^{Ms} (where Ms is the matrix-vector product, and again the exponentiation is element-wise).
Publish the pair (g^M, g^{Ms}); retain (g, M, s) as the trapdoor data.
Function definition
Let d be a power of two with d ∼ O(k^2). We define the function f_i as f_i(b||x) := f_{i,b}(x), where || denotes concatenation, for a pair of functions f_{i,b} : Z_d^k → G^k:
f_{i,0}(x) = g^{Mx}    (5.15)
f_{i,1}(x) = g^{Mx} g^{Ms} = g^{M(x+s)}    (5.16)
Trapdoor
The algorithm takes as input the trapdoor data (g, M, s) and a value y = g^{Mx_0} = g^{M(x_1+s)}, and returns the claw (x_0, x_1).
T((g, M, s), y)
Compute M^{−1} from M.
Compute g^{M^{−1}Mx_0} = g^{x_0}.
Take the discrete logarithm of each element of g^{x_0}, yielding x_0. Crucially, this is possible because the elements of x are in Z_d and d = poly(n), so the discrete logarithm can be computed in polynomial time by brute force.
Compute x_1 = x_0 − s.
Return (x_0, x_1).
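A complete toy instance of this construction is small enough to check by hand. Here we take the subgroup of order q = 11 generated by g = 4 in Z_23^*, with k = 2 and d = 4; every parameter is far below any secure size (we even relax the k > log_2 q condition for readability), so this is purely illustrative:

```python
# Toy DDH-based TCF in the order-11 subgroup of Z_23^* (illustrative only).
P, g, q = 23, 4, 11        # ambient prime, generator, group order
M = [[1, 1], [1, 2]]       # invertible over Z_q (determinant 1)
Minv = [[2, 10], [10, 1]]  # M^-1 mod 11
s = [1, 0]                 # secret vector
d = 4                      # input entries lie in Z_d, d a small power of two

def mv(mat, x):            # matrix-vector product in the exponent, mod q
    return [sum(a * b for a, b in zip(row, x)) % q for row in mat]

def f0(x):
    return [pow(g, e, P) for e in mv(M, x)]           # g^(Mx), element-wise

def f1(x):
    return [u * v % P for u, v in zip(f0(x), f0(s))]  # g^(Mx) * g^(Ms)

def trapdoor(y):
    # undo M in the exponent: (g^(M x0))^(M^-1) = g^(x0), element-wise
    gx = [y[0] ** Minv[i][0] * y[1] ** Minv[i][1] % P for i in range(2)]
    # brute-force discrete logs over the small range [0, d)
    x0 = [next(e for e in range(d) if pow(g, e, P) == u) for u in gx]
    x1 = [a - b for a, b in zip(x0, s)]
    return x0, x1
```

For example, f0([3, 1]) and f1([2, 1]) agree (a claw, since [3, 1] = [2, 1] + s), and the trapdoor recovers both preimages from the common image.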
A comparison of the resource requirements for computing x^2 mod N, for various problem sizes and circuit designs, is presented in Table 5.2. These counts are generated in the “abstract circuit” model, in which error correction, qubit routing, and other practical considerations are not included. For the schoolbook and Karatsuba circuits, the circuits are decomposed into a Clifford+T gate set. For the “phase” circuits, we allow controlled arbitrary phase rotations, as we expect these circuits to be appropriate for hardware (physical) qubits where such gates are native. Accordingly, we do not provide T gate counts for those circuits.
Circuit | Qubits | Gates (CCRϕ / Toffoli allowed) | Gates (Clifford + T) | T Gates | Depth | Qubit measmts.
---|---|---|---|---|---|---
n = 128 (takes seconds on a desktop [44]) | | | | | |
Qubit-optimized phase | 128 | 1.1×10^6 | — | — | 1.1×10^6 | 128
Gate-optimized phase | 264 | 4.3×10^5 | — | — | 6.3×10^4 | 0
Schoolbook | 515 | 1.4×10^5 | 9.1×10^5 | 3.9×10^5 | 1.9×10^4 | 3.5×10^4
Karatsuba | 942 | 1.3×10^5 | 7.7×10^5 | 3.3×10^5 | 2.0×10^3 | 3.4×10^4
n = 400 (takes hours on a desktop [44]) | | | | | |
Qubit-optimized phase | 400 | 3.3×10^7∗ | — | — | 3.3×10^7∗ | 400
Gate-optimized phase | 812 | 4.2×10^6∗ | — | — | 6.2×10^5∗ | 0
Schoolbook | 1603 | 1.3×10^6 | 8.7×10^6 | 3.6×10^6 | 5.9×10^4 | 3.3×10^5
Karatsuba | 3051 | 8.8×10^5 | 5.4×10^6 | 2.3×10^6 | 5.3×10^4 | 2.4×10^5
n = 829 (record for factoring [ZIM20]) | | | | | |
Qubit-optimized phase | 829 | 3.0×10^8∗ | — | — | 2.9×10^8∗ | 829
Gate-optimized phase | 1671 | 1.8×10^7∗ | — | — | 2.6×10^6∗ | 0
Schoolbook | 3319 | 5.6×10^6 | 3.8×10^7 | 1.6×10^7 | 1.2×10^5∗ | 1.4×10^6
Karatsuba | 5522 | 3.0×10^6 | 1.8×10^7 | 7.7×10^6 | 1.1×10^5∗ | 8.0×10^5
n = 1024 (exceeds factoring record) | | | | | |
Qubit-optimized phase | 1024 | 5.6×10^8∗ | — | — | 5.5×10^8∗ | 1024
Gate-optimized phase | 2061 | 2.7×10^7∗ | — | — | 4.0×10^6∗ | 0
Schoolbook | 4097 | 8.3×10^6 | 5.7×10^7 | 2.4×10^7 | 1.5×10^5∗ | 2.1×10^6
Karatsuba | 6801 | 4.3×10^6 | 2.6×10^7 | 1.1×10^7 | 1.4×10^5∗ | 1.1×10^6
Other algs. at n = 1024 | | | | | |
Rev. schoolbook † | 8192 | — | 6.4×10^8 | 2.2×10^8 | 1.1×10^8 | 0
Rev. Karatsuba † | 12544 | — | 5.7×10^8 | 1.9×10^8 | 2.4×10^7 | 0
Shor’s alg. ‡ | 3100 | — | — | 1.9×10^9∗ | — | —
Here we prove the cryptographic properties of the trapdoor claw-free functions (TCFs) presented in the Methods section of the main text. We base our definitions on the Noisy Trapdoor Claw-free Function family (NTCF) definition given in Definition 3.1 of Ref. [BCM+21], with certain modifications such as removing the adaptive hardcore bit requirement and the “noisy” nature of the functions.
We emphasize that in the definitions below, we define security only against classical attackers. Both the x^2 mod N and DDH constructions could be trivially defeated by a quantum adversary via Shor’s algorithm; since the purpose of the protocol in this paper is to demonstrate quantum capability, this type of adversary is allowed.
We also note that the TCF definition allows the 2-to-1 property to be “imperfect”: that is, we allow the fraction of pre-images which have a colliding pair to be less than 1. In the protocol, the verifier may simply discard any runs in which the prover supplied an output value y that is not part of a claw, that is, does not have two corresponding inputs. This will not affect the prover’s ability to pass the classical threshold (since these runs are counted neither for nor against the prover); it will only possibly affect the number of iterations of the protocol required to exceed the classical bound with the desired statistical significance. In the definition below we require that the fraction of “good” inputs be at least a constant (which we set to 0.9); in principle the fraction could be as low as 1/poly(λ) without interfering with the protocol’s effectiveness.
We use the following definition of a Trapdoor Claw-free Function family:
Let λ be a security parameter, I a set of function indices, and X_i and Y_i finite sets for each i ∈ I. A family of functions
F = {f_i : X_i → Y_i}_{i∈I}
is called a trapdoor claw-free (TCF) family if the following conditions hold:
Efficient Function Generation. There exists an efficient probabilistic algorithm Gen which generates a key i ∈ I and the associated trapdoor data t_i:
(i, t_i) ← Gen(1^λ)
Trapdoor Injective Pair. For all indices i∈I, the following conditions hold:
Injective pair: Consider the set R_i of all tuples (x_0, x_1) such that f_i(x_0) = f_i(x_1). Let X′_i ⊆ X_i be the set of values x which appear in the elements of R_i. For all x ∈ X′_i, x appears in exactly one element of R_i; furthermore, there exists a value λ_c such that for all λ > λ_c, |X′_i|/|X_i| > 0.9.
Trapdoor: There exists an efficient deterministic algorithm T such that for all y ∈ Y_i and (x_0, x_1) such that f_i(x_0) = f_i(x_1) = y, T(t_i, y) = (x_0, x_1).
Claw-free. For any non-uniform probabilistic polynomial time (nu-PPT) classical Turing machine A, there exists a negligible function ϵ(⋅) such that
Pr[f_i(x_0) = f_i(x_1) ∧ x_0 ≠ x_1 | (x_0, x_1) ← A(i)] < ϵ(λ)
where the probability is over both choice of i and the random coins of A.
Efficient Superposition. There exists an efficient quantum circuit that on input a key i prepares the state
(1/√|Xi|) ∑x∈Xi |x⟩ |fi(x)⟩
In this section we prove that the function family FRabin (defined in Methods) is a TCF by demonstrating each of the properties of Definition 1. Most of the properties follow directly from properties of the Rabin cryptosystem [RAB79]; we reproduce several of the arguments here for completeness.
The function family FRabin is trapdoor claw-free, under the assumption of hardness of integer factorization.
We demonstrate each of the properties of Definition 1:
Efficient Function Generation. Sampling large primes to generate p,q and N is efficient [RAB79].
Trapdoor Injective Pair.
Injective pair: By definition of the function, Yi is the set of quadratic residues modulo N. For any y∈Yi, consider the two values a < p/2 and b < q/2 such that a² ≡ y mod p and b² ≡ y mod q. These values exist because y is a quadratic residue modulo pq, and therefore also a quadratic residue modulo p and modulo q. Define c to satisfy c ≡ 1 mod p and c ≡ 0 mod q, and d to satisfy d ≡ 0 mod p and d ≡ 1 mod q. The following four values x in the range [0, N) have x² ≡ y mod N: ac+bd, ac−bd, −ac+bd, −ac−bd. Exactly two of these values are in the domain [N/2] of the TCF, and constitute the injective pair; moreover, these two values will be unique as long as a, b ≠ 0. Thus we may define the set X′i = {x ∈ [N/2] | x ≢ 0 mod p ∧ x ≢ 0 mod q}. There exist exactly ((p−1)+(q−1))/2 multiples of p or q in the set of integers Xi = [N/2], thus |X′i|/|Xi| = 1 − ((p−1)+(q−1))/N. Recall that p, q are defined to have length λ/2; if we let λc = 12, then p, q > 2^5 = 32. Since 1 − (31+31)/32² > 0.9 and |X′i|/|Xi| increases monotonically with λ, we have |X′i|/|Xi| > 0.9 for all λ > λc.
Trapdoor: Because p and q were selected to satisfy p ≡ q ≡ 3 mod 4, a and b in the expressions above can always be computed as a = y^((p+1)/4) mod p and b = y^((q+1)/4) mod q, and then the preimages can be computed as defined above.
Claw-free. We show that knowledge of a claw (x0, x1) can be used directly to factor N. Writing the claw as (ac+bd, ac−bd) using the values a, b, c, d from above, we have x0 + x1 = 2ac. Because c ≡ 0 mod q, gcd(x0 + x1, N) = q can be computed efficiently, which then also yields p = N/q. Thus an algorithm that could efficiently find claws could equally be used to efficiently factor N, which we assume to be hard.
Efficient Superposition. The set of preimages Xi is the set of integers [N/2]. A uniform superposition ∑x∈Xi|x⟩ may be computed by generating a uniform superposition of all bitstrings of length n (via a Hadamard gate on every qubit), and then evaluating a comparator circuit that generates the state ∑x|x⟩|x<N/2⟩, where |x<N/2⟩ is a single bit stored in an ancilla. If this ancilla is measured and the result is |1⟩, the state collapses onto the superposition ∑x∈Xi|x⟩ (if the result is |0⟩ the process is simply repeated). A multiplication circuit writing to an empty register may then be executed to generate the desired state ∑x∈Xi|x⟩|x² mod N⟩.
∎
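As a concrete illustration of the trapdoor and claw-free arguments above, the following sketch works through a toy instance of FRabin (the tiny primes p = 7, q = 11 are our own illustrative choice; real instances use primes hundreds of bits long):

```python
from math import gcd

# Toy instance of the Rabin TCF f(x) = x^2 mod N on the domain [0, N/2)
# (illustrative parameters only; both primes are chosen ≡ 3 (mod 4))
p, q = 7, 11
N = p * q
y = 4                                    # a quadratic residue mod N

# Trapdoor algorithm T: square roots mod p and mod q, combined via CRT
a = pow(y, (p + 1) // 4, p)              # a^2 ≡ y (mod p)
b = pow(y, (q + 1) // 4, q)              # b^2 ≡ y (mod q)
c = q * pow(q, -1, p) % N                # c ≡ 1 (mod p), c ≡ 0 (mod q)
d = p * pow(p, -1, q) % N                # d ≡ 0 (mod p), d ≡ 1 (mod q)
roots = {(s * a * c + t * b * d) % N for s in (1, -1) for t in (1, -1)}
x0, x1 = sorted(x for x in roots if x < N / 2)   # the injective pair
assert (x0 * x0) % N == (x1 * x1) % N == y

# Claw-freeness: a claw factors N directly, since x0 + x1 = 2ac and c ≡ 0 (mod q)
q_rec = gcd(x0 + x1, N)
assert {q_rec, N // q_rec} == {p, q}
print(x0, x1, q_rec)                     # 2 9 11
```

Note that the three-argument `pow(q, -1, p)` (modular inverse) requires Python 3.8 or later.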
We now prove that FDDH (defined in Methods) forms a trapdoor claw-free function family.
The function family FDDH is trapdoor claw-free, under the assumption of hardness of the decisional Diffie-Hellman problem for the group G.
We demonstrate each of the properties of Definition 1:
Efficient Function Generation. Each step of Gen is efficient by inspection.
Trapdoor Injective Pair.
Injective pair: First we note that the matrix M is chosen to be invertible, thus f0 and f1 are one-to-one. Therefore for all x0∈Xi, at most one other preimage x1∈Xi has fi(x0)=fi(x1). Furthermore, since colliding pairs have the structure (0||x′0), (1||x′1) with x′0 = x′1 + s and s∈{0,1}^k, the only preimages that do not form part of a colliding pair are those where x′0 has a zero element at an index where s is nonzero, or x′1 has an element equal to d−1 at an index where s is nonzero (at such an index, the corresponding element of the would-be partner falls outside the allowed range of vector elements). Thus |X′k|/|Xk| > (1 − 1/d)^k. Since d ∼ O(k²) and k ∼ O(λ), we have limλ→∞ |X′k|/|Xk| = 1 with |X′k|/|Xk| monotonically increasing. Therefore, there exists a value λc such that |X′k|/|Xk| > 0.9 for all λ > λc. (We note that if we set k = λ and d = k², then λc = 10.)
Trapdoor: The steps of the algorithm T are efficient by inspection. Crucially, computing the discrete logarithm of each vector element by brute force is feasible, because the elements of x0 only take values up to polynomial in λ.
Claw-free. An algorithm which could efficiently compute a claw (0||x′0, 1||x′1) could then trivially compute the secret vector s = x′0 − x′1. For any matrix M′, the existence of an algorithm to uniquely determine s from (g^M′, g^(M′s)) would directly imply an algorithm for determining whether M′ has full rank. But DDH implies it is computationally hard to determine whether a matrix M′ is invertible given g^M′ [PW08, FGK+10]. Therefore DDH implies the claw-free property.
Efficient Superposition. Because d is a power of two, a superposition of all possible preimages x can be computed by applying a Hadamard gate to every qubit of a register initialized to |0⟩. The function f can then be computed by a quantum circuit implementing a classical algorithm for the group operation of G.
∎
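The claim λc = 10 in the injective-pair argument above can be sanity-checked numerically (a quick sketch of our own, using the parameter instantiation k = λ, d = k² from the proof text):

```python
# Numerical check of the good-input fraction bound |X'_k|/|X_k| > (1 - 1/d)^k
# with k = λ and d = k^2, as in the injective-pair argument
def good_fraction_bound(lam: int) -> float:
    k, d = lam, lam * lam
    return (1 - 1 / d) ** k

# The bound increases monotonically and exceeds 0.9 for all λ > 10
assert all(good_fraction_bound(lam) > 0.9 for lam in range(11, 200))
assert good_fraction_bound(11) < good_fraction_bound(12)
```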
In this section, we provide a brief overview of the cryptographic concepts upon which this work relies.
Foundational to the field of cryptography is the idea of a one-way function. Informally, this type of function is easy to compute, but hard to invert. Here, “easy” means that the function can be evaluated in time polynomial in the length of the input. By “hard” we mean that the cost of the best algorithm to invert the function is superpolynomial in the length of the input. In practice, for a given one-way function we desire that there exists a particular problem size (input length) for which the function can be evaluated fast enough that it is not overly costly to use, but for which inversion would be infeasible even for an adversary with large (but realistic) computing power. One-way functions can be used directly to construct many useful cryptographic schemes, including pseudorandom number generators, private-key encryption, and secure digital signatures.
In this work, we rely on a specific type of one-way function called a trapdoor claw-free function (TCF). This class of functions has two additional features.
First, it has a trapdoor. This means that while the function is hard to invert in general, with knowledge of some secret data (the trapdoor key) inversion becomes easy. This secret data should be easy to generate when the function is chosen (from a large family of similar functions), but should be hard to find given just the description of the function itself. For example, in this work we describe the function x² mod N, with N the product of two primes. The trapdoor is the factorization of N. It is easy to generate this function along with the trapdoor, by simply selecting two primes and multiplying them together. However, under the assumption of hardness of integer factorization, given only the function description (namely the value N) it is computationally hard to find the trapdoor (the factors p and q).
The second additional feature of a TCF is that it is claw-free. This means that the function is two-to-one (has two inputs that map to each output), but it is computationally hard to find two such colliding inputs without the trapdoor. Note that if it were possible to invert the function it would be trivial to find a collision (by picking an input, computing the function to get the output corresponding to it, and then inverting the function to find the second input mapping to that output). However the claw-free property is a bit stronger than the hardness of inversion: there exist some two-to-one functions which are one-way but not claw-free.
Importantly, in this work we only require that breaking the claw-free property is hard classically; indeed, the claw-free property of the DDH and x² mod N TCFs described here can be fully broken by quantum computers. However, perhaps surprisingly, we do not require that breaking the claw-free property is easy for a quantum machine. In fact, the claw-free property of the LWE and Ring-LWE based TCFs remains secure even against quantum attacks. This corresponds to a very powerful property of the protocol in this paper, and of other related protocols: a quantum computer can pass the test without actually being able to find a claw. This subtle distinction stems from the fact that the quantum prover generates a superposition over two colliding inputs. No measurement of such a state can yield both superposed values classically in full, but the test is designed not to require both values, only the results of an appropriate measurement of the superposition. A classical cheater, on the other hand, still cannot pass the test because no classical analogue of a superposition exists.
Here we describe each of the asymptotic circuit complexities listed in Table I of the main text. For these estimates we drop factors of log log n or less. In all cases, we assume integer multiplication can be performed in time O(n log n) using the Schonhage-Strassen algorithm.
We emphasize that the value of n necessary to achieve classical hardness in practice varies widely among these functions, and also that the asymptotic complexities here may not be applicable at practical values of n.
LWE [BCM+21, REG09] The LWE cost is dominated by multiplying an O(n log n) × n matrix of integers by a length-n vector. The integers are of length log n, so each multiplication is expected to take approximately O(log n) time. Thus, the evaluation of the entire function requires O(n² log² n) operations.
x² mod N [RAB79] The function can be computed in time O(n log n) using the Schonhage-Strassen multiplication algorithm and Montgomery reduction for the modulus.
Ring-LWE [BKV+20, LPR13, dRV+15, RVM+14] Ring-LWE is dominated by the cost of multiplying one polynomial by log n other polynomials. Through Number Theoretic Transform techniques similar to the Schonhage-Strassen algorithm, each polynomial multiplication can be performed in time O(n log n), so the total runtime is O(n log² n). We note that integer multiplication and polynomial multiplication can be mapped onto each other, so the runtimes for x² mod N and Ring-LWE scale identically except that Ring-LWE requires log n multiplications instead of O(1).
Diffie-Hellman [DH76, PW08, FGK+10] The Diffie-Hellman based construction defined in Methods requires multiplying a k × k matrix by a vector, with k ∼ O(n). However, the “addition” operation for the matrix-vector multiply is the group operation of G; we expect this operation to have complexity at least O(n log n) (e.g. for integer multiplication). The exponentiation operations have exponent at most d ∼ O(k²), so they can be performed in O(log n) group operations. Thus for each of the k² matrix elements one must perform an operation of complexity O(n log² n), yielding a total complexity of O(n³ log² n).
Here we provide an example of a classical algorithm that saturates the probability bound of Theorem 2 of the main text. It has px = 1 and pCHSH = 3/4.
For a TCF f : X → Y, consider a classical prover that simply picks some value x0∈X and computes y as f(x0), without ever having knowledge of x1. If the verifier requests a projective x measurement, the prover returns x0, causing the verifier to accept with px = 1. In the other case (performing rounds 2 and 3 of the protocol), upon receiving r the prover computes b0 = x0⋅r. The cheating prover now simply assumes that x0⋅r = x1⋅r, and thus that the correct single-qubit state that would be held by a quantum prover is |b0⟩, and returns measurement outcomes accordingly. With probability 1/2, |b0⟩ is in fact the correct single-qubit state; in this case the prover can always cause the verifier to accept. On the other hand, if x0⋅r ≠ x1⋅r, the correct state is either |+⟩ or |−⟩. With probability 1/2, the measurement outcome reported by the cheating prover will happen to be correct for this state too. Overall, this cheating prover achieves pCHSH = (1 + 1/2)/2 = 3/4.
Thus we see px + 4pCHSH − 4 = 1 + 4⋅(3/4) − 4 = 0, which saturates the bound.
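This cheating strategy is easy to verify with a quick Monte Carlo estimate (a sketch of our own; it models only the probability that the cheater's reported outcome is consistent with the correct single-qubit state, as described above):

```python
import random

# Estimate the cheating prover's CHSH success rate over random instances
random.seed(0)
n, trials, wins = 64, 200_000, 0
for _ in range(trials):
    x0 = random.getrandbits(n)           # the preimage the cheater knows
    x1 = random.getrandbits(n)           # the unknown second preimage
    r = random.getrandbits(n)
    b0 = bin(x0 & r).count("1") % 2      # cheater's guess for the bit x0 . r
    b1 = bin(x1 & r).count("1") % 2      # the bit x1 . r
    if b0 == b1:
        wins += 1                        # correct state |b0>: always accepted
    else:
        wins += random.random() < 0.5    # state is |+> or |->: 50/50 chance
p_chsh = wins / trials
assert abs(p_chsh - 0.75) < 0.01         # converges to pCHSH = 3/4
```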
Classically, multiplication of large integers is generally performed using recursive algorithms such as Karatsuba and Schonhage-Strassen [SS71], which have complexity as low as O(n log n). In the quantum setting, the need to store garbage bits at each level of recursion has limited their usefulness [KPF06, PRM18]. There does exist a reversible construction of Karatsuba multiplication that uses a linear number of qubits [GID19a], but due to the overhead required for its implementation it does not begin to outperform schoolbook multiplication until the problem size reaches tens of thousands of bits.
Leveraging the irreversibility described in Section IID of the main text, we are able to use these recursive algorithms directly, without needing to maintain garbage bits for later uncomputation. We implement both the O(n^1.58) Karatsuba multiplication algorithm and the simple O(n²) “schoolbook” algorithm. Due to the efficiencies gained from discarding garbage bits, we find that the Karatsuba algorithm begins to outcompete schoolbook multiplication already at problem sizes of under 100 bits. Thus Karatsuba seems to be the best candidate for “full-scale” tests of quantum advantage at problem sizes of n ∼ 500–1000 bits. We also note that the Schonhage-Strassen algorithm scales even better than Karatsuba, as O(n log n log log n). However, even in classical applications it has too much overhead to be useful at these problem sizes. We leave its potential quantum implementation to future work.
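For reference, the recursion underlying a Karatsuba multiplier can be sketched classically as follows (plain Python, not our reversible circuit; the base-case cutoff is an arbitrary choice of ours):

```python
def karatsuba(x: int, y: int, n: int) -> int:
    """Multiply two n-bit integers using three half-size multiplications."""
    if n <= 8:                            # small inputs: schoolbook multiply
        return x * y
    h = n // 2
    x_hi, x_lo = x >> h, x & ((1 << h) - 1)
    y_hi, y_lo = y >> h, y & ((1 << h) - 1)
    a = karatsuba(x_hi, y_hi, n - h)
    b = karatsuba(x_lo, y_lo, h)
    c = karatsuba(x_hi + x_lo, y_hi + y_lo, n - h + 1)
    # x*y = a*2^(2h) + (c - a - b)*2^h + b
    return (a << (2 * h)) + ((c - a - b) << h) + b

assert karatsuba(123456789, 987654321, 30) == 123456789 * 987654321
```

In the quantum circuit, the intermediate products a, b, and c can simply be discarded after use rather than uncomputed, which is the source of the savings described above.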
The multiplication algorithms just described do not include the modulo-N operation; it must be performed in a separate step. We implement the modulo using only two classical-quantum multiplications and one addition, via Montgomery reduction [MON85]. Montgomery reduction does introduce a constant factor R′ into the product, but this factor can be removed in classical post-processing after y = x²R′ mod N is measured.
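The structure of Montgomery reduction can be sketched as follows (a minimal classical sketch of the standard algorithm, not our quantum circuit; the parameter values are illustrative choices of ours):

```python
def montgomery_reduce(T, N, R):
    """Return T * R^(-1) mod N using only multiplications, additions, and
    divisions by R (bit shifts when R is a power of two)."""
    N_neg_inv = (-pow(N, -1, R)) % R     # precomputed from the public modulus
    m = (T % R) * N_neg_inv % R
    t = (T + m * N) // R                 # exact division: the low bits cancel
    return t - N if t >= N else t

N, R = 77, 128                           # R a power of two greater than N
x = 29
y = montgomery_reduce(x * x, N, R)       # y = x^2 * R^(-1) mod N
# The constant factor R^(-1) (the R' of the text) is stripped classically:
assert y * R % N == (x * x) % N
```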
Finally, we note that at the implementation level, optimizing classical circuits for modular integer multiplication has received significant study in the context of performing cryptography on embedded devices and FPGAs [JIW16, MD16, YWL+16]. Mapping such optimized circuits into the quantum context may be a promising avenue for further research.
In this section we describe several details of the post-selection scheme proposed in Section IIC of the main text.
Consider the two states |ψ±⟩ = (|x0⟩ ± |x1⟩)|y⟩/√2, where the first factor resides in the x register and the second in the y register, for some claw (x0, x1) with y = fk(x0) = fk(x1). Note that |ψ+⟩ is the state that would be held by a noise-free prover. Suppose a noisy quantum prover is capable of generating the mixed state
ρδ = (1/2 + δ)|ψ+⟩⟨ψ+| + (1/2 − δ)|ψ−⟩⟨ψ−|.   (5.17)
In words, the prover is able to generate a state that is a superposition of the correct bitstrings, but with the correct phase only a 1/2 + δ fraction of the time. Here we show that such a prover can exceed the classical threshold of Theorem 2 of the main text whenever δ > 0. We proceed by examining this prover’s behavior during the protocol.
First, we note that if the verifier requests a projective x measurement after Round 1 of the protocol, this prover will always succeed: they simply measure the x register as instructed, and the phase is not relevant. Thus, using the notation of Theorem 2, px = 1. With this value set, to exceed the bound we must achieve pCHSH > 3/4. Naively performing the rest of the protocol as described in the main text does not exceed the bound when δ is small. However, the noisy prover can exceed the bound if they adjust the angle of their measurements in the third round of the protocol (while preserving the sign of the measurement basis requested by the verifier). We now demonstrate how.
Define |ϕ⟩ as the “correct” single-qubit state at the end of Round 2, one of {|0⟩, |1⟩, |+⟩, |−⟩}. Let f↕ be the probability that our noisy prover holds the correct state when |ϕ⟩∈{|0⟩,|1⟩}, and f↔ the corresponding probability when |ϕ⟩∈{|+⟩,|−⟩}. In the first case, the potential phase error of our prover does not affect the single-qubit state, so f↕ = 1. In the other case, the state is only correct when the phase is correct, so f↔ = 1/2 + δ. We see that our prover holds the correct single-qubit state with probability greater than 3/4 on average. But if they naively measure in the prescribed off-diagonal basis θ∈{π/4, −π/4} from the verifier, for small δ their success probability will be less than 3/4. This can be rectified by adjusting the rotation angle of the measurement basis.
Letting ±θ′ define the pair of measurement angles used by the prover in step 3 of the protocol (nominally θ′=|θ|=π/4), we can now express the prover’s success probability pCHSH as
pCHSH = (1/2)[cos²(θ′/2) f↕ + cos²(θ′/2 − π/4) f↔ + sin²(θ′/2)(1 − f↕) + sin²(θ′/2 − π/4)(1 − f↔)]   (5.18)
If the prover measures with θ′ = π/4 as prescribed in the protocol, the success rate will be pCHSH ≈ 0.68 + O(δ) < 3/4. However, if they instead adjust their measurement angle to θ′ = δ, they achieve pCHSH = 3/4 + 3δ²/8 − O(δ³), which exceeds the classical bound (provided δ is large enough for the excess to be resolved statistically).
In practice, both f↕ and f↔ are likely to be less than one; the optimal measurement angle can be determined as
θ′opt = tan⁻¹((2f↔ − 1)/(2f↕ − 1))   (5.19)
which is the result of optimizing Equation 5.18 over θ′. In a real experiment, it would be most effective to empirically determine f↕ and f↔ and then use Equation 5.19 to determine the optimal measurement angle.
We now describe the details of the numerical simulation that was used to generate Figure 2 of the main text. For several values of the overall circuit fidelity F, we established a per-gate fidelity as f = F^(1/Ng), where Ng is the number of gates in the x² mod N circuit. We then generated a new circuit to compute the function (3^a x)² mod 3^(2a) N for various values of a (see the next subsection for an explanation of the choice k = 3^a). For each gate in the new circuit, with probability 1 − f we added a Pauli “error” operator randomly chosen from {X, Y, Z} to one of the qubits to which the gate was being applied.
For the simulation, we randomly chose two primes p and q that multiplied to yield an integer N of length 512 bits. We then randomly chose a large set of colliding preimage pairs, and simulated the circuit separately for each such preimage (which is classically efficient, since the circuits only consist of X, CNOT, and Toffoli gates). The relative phase between each pair of preimages (due to error gates) was tracked explicitly during the simulation. Finally, the expected success rate of the prover was determined by analyzing the correctness of the bitstrings and their relative phase at the end of the circuit.
The primes p and q used to generate Figure 2 of the main text are (in base 10):
p = 113287732919697174280284729511923238986362403955638184856698528941220766063369
q = 98359967382337110635377957241353362183812709461386334819166502848512740692727
In the previous subsection, we mapped the TCF fN = x² mod N to the function f′N = (kx)² mod k²N. To achieve this at the implementation level, we may use essentially the same circuit for modular multiplication; the only new requirement is to efficiently generate a superposition of multiples kx in the x register. We generate this superposition by starting with a uniform superposition over values x and then multiplying by k.
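The correctness of this mapping can be checked directly: since (kx)² mod k²N = k²·(x² mod N), every claw of fN maps to a claw of f′N. A quick check with toy values of our own:

```python
# Claws are preserved under the mapping f'_N(x) = (kx)^2 mod k^2 N
N, k = 77, 9                         # toy modulus; k = 3^2, a power of three
x0, x1 = 2, 9                        # a claw of x^2 mod N (4 ≡ 81 mod 77)
f = lambda x: x * x % N
fp = lambda x: (k * x) ** 2 % (k * k * N)
assert f(x0) == f(x1)
assert fp(x0) == fp(x1) == k * k * f(x0)
```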
Normally, quantum multiplication circuits (like those we use to evaluate x2modN) perform an out-of-place multiplication, where the result is stored in a new register. In this case, however, it is preferable to do the multiplication “in-place,” where the result is stored in the input register itself—this way the y value is computed directly from the input register and thus is more likely to reflect errors that may occur in the input.
In general, performing in-place multiplication is complicated, particularly on a quantum register, because the input is being modified as it is being consumed (not to mention concerns about reversibility). However, multiplication by small constants is much simpler to implement. By setting k to a power of three, we are able to implement the in-place multiplication as a sequence of in-place multiplications by 3, each of which can be performed quite efficiently (see the implementation in the attached Cirq code; the code is available at https://github.com/GregDMeyer/quantum-advantage and is archived on Zenodo [MEY22]).
For the dashed “theory prediction” lines of Figure 2 of the main text, we predicted the success probabilities under two assumptions (which the numerical experiments are intended to test). First, we assume that among noisy runs in which at least one bit flip error occurs, the output bitstring is approximately uniformly distributed. Second, we assume that when at least one phase flip error occurs, the probability that the phase is correct in the final state is 1/2.
Under these assumptions, we compute the predicted success rates px and pCHSH as follows:
For a given overall fidelity F of the original x² mod N circuit containing Ng gates, compute a per-gate fidelity f = F^(1/Ng). Then compute the expected overall fidelity F′ of running the slightly larger (kx)² mod k²N circuit containing N′g gates as F′ = f^(N′g).
Using F′ and the given error model (see “Details of simulation and error model” section above), compute three disjoint probabilities: that no errors occur, that only phase errors occur, or that at least one bit flip error (and possibly also phase errors) occurs.
Compute the probability that the output will pass postselection, which includes both cases with no bit flip errors and those that are corrupted but happen to pass postselection by chance.
Normalizing to only those runs that pass postselection, compute px and pCHSH:
px is computed as the probability that no bit flip errors occurred (among those runs that pass postselection). This is a lower bound (that seems intuitively tight); it assumes a negligible probability that the measured pair (x,y) still has y=f(x) despite bit flip errors.
pCHSH is computed by finding the probability that no errors occurred that would affect the single-qubit state at the end of round 2. When the correct single-qubit state should be polarized along Z, this is taken to be the probability that no bit flip errors occurred (phase errors are allowed since they will not affect this state); when the correct state should be polarized along X, it is taken as the probability that no errors at all have occurred. In these “no-error” cases, we compute the verifier’s probability of accepting by applying the adjusted measurement basis described in the first sub-section above, “Quantum prover with no phase coherence saturates the classical bound”. Finally, for the case that there was an error that could affect the single-qubit state, the probability that the verifier receives a correct measurement outcome is taken to be 1/2 (the single-qubit state is taken to be maximally mixed).
Compute the measure of “quantumness” from px and pCHSH.
Compute the estimated runtime by multiplying the increase in quantum circuit size by the expected number of iterations required to pass postselection (which is computed from the analysis above).