Chapter 1 Introduction

All computers are just carefully organized sand.

— Randall Munroe, xkcd.com/1349

In the lobby of the Stata Center at MIT sits a wooden contraption that looks somewhat like an oversized pinball machine. The sloped surface contains a number of barriers, holes, and movable toggles that can be set to the left or the right. In a slot at the top sits a set of billiard balls. Passersby may find a clue to the device’s purpose in the labels next to some of the toggles, such as “COUNT,” “MULTIPLY,” and “MEMORY REGISTER.” It is a mechanical calculator called the DIGI-COMP II. By flipping the toggles one can input numbers and choose an arithmetic operation to perform. With the computation set, the user releases a billiard ball down the slope. As it weaves its way through the various elements on the surface of the board, the ball will flip various toggles, until it reaches the bottom and causes another ball to release from the top. This next ball will follow a new path due to the action of the previous one, flipping yet more toggles and eventually releasing yet another ball. Eventually the position of the toggles will be such that new balls will stop rolling down the slope, and the operation will halt. In what may seem a miracle to the uninitiated, the user will find that the toggles now show the mathematical result—the sum, difference, product, or quotient of the numbers that were input at the start!

While the physical nature of computing is particularly obvious in the DIGI-COMP II, this is broadly how all computers work. Most of them use electrons instead of billiard balls, but the general principle of operation is the same: data is input via the physical manipulation of part of the device and then the physics of the system evolves forward in time, ultimately resulting in a physical output that can be interpreted by a human (or perhaps cause some other useful effect in the world). Of course, in modern times, these physical processes are easy to forget, or at least ignore: the size of the smallest features in semiconductor chips being manufactured in 2023 is a few nanometers—roughly the diameter of the helical structure of DNA! Yet even if we do not think about it, we all still intuitively know that computations are physical processes happening inside the devices we use every day—a fact that somehow feels simultaneously obvious and surprising. When watching a high-resolution video, we may find that our device gets hot, as energy gets expended by the operations required to update each of the millions of pixels 60 times every second. If we let the video play for long enough, we may need to recharge the device’s battery by physically connecting it to a source of energy, which otherwise could power a vacuum cleaner or a lamp. More directly, if we put a photo album on a USB thumb drive, we know the photos are somehow in it, in a very physical sense—if we carry the drive to a friend’s house, or send it through the mail, or throw it off the Golden Gate Bridge, the photos go with it. Ultimately, computing is physics, and the task of building (and using) computers corresponds to applying the laws of physics to very carefully organize the matter around us in such a way that it can encode and manipulate information.

Hiding the physical nature of computing has been crucially important, however, for facilitating the development of the diversity of computing devices that surround us today. When we open an email, or tap an Instagram ad for a sunflower-shaped dog costume, the abstract result should not depend on the specific physical processes that occur. On any of the myriad models of smartphone, tablet, laptop, or whatever else, which at a low level may operate with dramatically different computational building blocks, we will see the email’s contents, or have 30 fewer dollars in our bank accounts, respectively. This abstraction applies even for most software developers: the advent of compilers in the mid-20th century created a separation between the creation of programs and their execution, allowing code to be written in a universal language like Fortran or C, and only later compiled into the specific, individual operations used by a particular processor. It is not an exaggeration to say that modern computing could not exist without it.

The intellectual foundations of this idea were laid by a brilliant insight of Alan Turing and others in the 1930s. Turing proposed a simple abstract model for a physical system that can compute any function that is computable at all—a device soon termed a “Turing machine.” [Footnote 1: To be precise, this depends on one’s definition of “the set of computable functions”; Turing showed that the set of functions his machine could compute was equal to that of two other contemporaneous definitions of computability. In modern theoretical computer science, computability is in fact usually defined in terms of Turing machines.] This led to the concept of Turing completeness: a computational system is Turing complete if it can be configured to simulate a Turing machine, which implies that it can compute any computable function. Over the succeeding decades, this idea was extended to explore not only which functions can be computed, but how efficiently it can be done. Astoundingly, computer scientists found that no matter how they designed a computer (within the laws of physics), Turing’s simple, abstract construction could simulate it efficiently. Viewed from another perspective, they found that the set of mathematical problems that are possible but inefficient for a Turing machine to compute remains stubbornly inefficient no matter what computational system is used. This idea has been termed the extended Church-Turing thesis. [AV13]

However, hints that this may not represent the full story began to emerge near the end of the century, when scientists began to consider the prospects of simulating quantum mechanics using computers. In a foundational talk in 1981, Feynman pointed out that unlike in the simulation of classical physics, fully describing the state of a quantum system seems to require manipulating an exponential number of values—a task that certainly is not efficient. He proposed that instead of using standard computing hardware, perhaps such a simulation could be performed efficiently on a new type of computer, itself built from quantum mechanical parts. Such a device would constitute a clear violation of the extended Church-Turing thesis. This led to an extremely provocative further question: are there other problems, not directly related to quantum mechanics, that are hard for Turing machines but can be computed efficiently using a machine built from quantum mechanics? Just 13 years later, Peter Shor stunned the world of computer science by giving a fast quantum algorithm for factoring numbers—a computational problem that had been the target of efforts by number theorists for millennia. (In fact, factoring, and the related discrete logarithm problem, were considered to be hard with such certainty that their hardness has been used as the backbone of much of digital security. This fact has played a large part in the interest of national governments in quantum computing.) After Shor’s work, research efforts in the new field of quantum computing increased dramatically, both with intense experimental work to physically construct such devices and theoretical efforts to develop new algorithms that could run on them. [Footnote 2: For in-depth yet personal discussions of the history of quantum computing, we recommend the recent retrospectives by Peter Shor and John Preskill. [SHO22, PRE23]] Yet despite the gargantuan efforts made over the past few decades, quantum computing remains stubbornly difficult. As of yet, the physical quantum computers that have been constructed are only beginning to pass the cusp of what can be simulated with modern classical supercomputers, and even so, while a wide array of theoretical applications have been proposed, finding realistic and useful problems for which these first small, noisy quantum machines can meaningfully improve on the performance of classical computers has remained elusive. To put things in perspective, the current record for integer factorization via Shor’s algorithm seems to be the factorization of 21 in 2012, [MLL+12] but even that record had its validity soon called into question [SSV13]. [Footnote 3: There are so-called “variational” quantum factoring algorithms which have been used to demonstrate factoring of somewhat larger numbers than 21 on near-term quantum devices. However, these algorithms become inefficient rapidly as the numbers get larger, and would not be useful at all for factoring numbers larger than what can already be done with classical computers, even if a large error-free quantum computer could be constructed.]

As a result of all of this, we find ourselves at an exciting time for quantum science. In terms of looking for new physics in many-body quantum systems, we are at a juncture where modern supercomputers and cutting-edge experiments can simulate quantum systems of roughly similar complexity. Each technique has its own strengths, and the ability to look for new physical phenomena in numerical simulations and then explore them further with experiments, or vice versa, has proven extremely powerful, leading to the discovery of new phases of matter and new insights into quantum materials. Encouragingly too, both experiments and numerics have improved rapidly over the past few decades. On the quantum computing side, despite the current lack of a “killer application” for the near term, the astounding progress in the precise control of quantum systems holds promise. It has been said that when algorithms can only run in one’s imagination, it is very difficult to develop them; we are finally reaching the point where those designing quantum algorithms can move off the blackboard and play with real quantum devices. With any luck, this will lead to the exploration of new directions that we have not yet imagined.

This dissertation focuses on the implications of the difference in computational power of classical and quantum systems, and how they can interact with each other. For the rest of this chapter, we give an overview of some of the foundational ideas upon which this work relies. The discussion is targeted at a reader who is knowledgeable of physics, but perhaps new to the subfields being discussed. It does not endeavor to go deeply into technical details, but instead to give a broad intuitive landscape of the techniques currently being used to make progress in these fields. Hopefully this can serve as a resource for early graduate students looking to study these topics.

After the introduction, this dissertation is divided into two parts. In the first part, we examine how, despite the complexity-theoretic hardness of classically simulating quantum mechanics, the tools of modern supercomputing can be used to simulate quantum systems of sufficiently large size that meaningful scientific insight can be produced. In Chapter 2, we present a numerical library called dynamite, which has a simple and intuitive programmatic interface for performing numerical simulations of many-body quantum systems, but is very powerful, with the capability to accelerate the computations via massive parallelism on supercomputers and graphics processing units (GPUs). In Chapter 3 we present the implementation of algorithmic and practical innovations targeting a specific many-body physics problem, that of “many-body localization” (MBL). Our approach dramatically reduces the memory usage compared to previous works, enabling the analysis of larger physical systems with fewer resources. In an amusing connection to the physicality of computing, for both these chapters the work required moving past the compiler abstractions described earlier, instead incorporating knowledge of the physical characteristics and low-level operations of the compute hardware upon which the code would run in order to eke out as much performance as possible.

In the second part, we focus on a subtlety of the ongoing race to build quantum computers: if a quantum device claims to be able to perform computations that no classical computer can, how can we verify that it actually has done so? What if we do not trust the quantum device, either because its behavior is not well characterized, or because we do not trust the humans behind it? In Chapter 4, we examine a proposal from 2008 for a “proof of quantumness” by which a quantum prover can convince a skeptical classical verifier of its capability to perform computations that would be infeasible classically. We break the protocol, showing that there in fact exists an algorithm by which a classical cheater could reproduce the behavior of an honest quantum prover in the protocol, which destroys the test’s guarantee that the prover is quantum. In Chapter 5 we propose a new proof of quantumness protocol in which classical cheating is provably as hard as factoring integers, avoiding the pitfalls described in the previous chapter. The new protocol can be implemented with fewer quantum resources than Shor’s algorithm, yielding potential for its real-world use in the time before quantum computers are capable of running algorithms as complex as Shor’s. In Chapter 6, we present the results of a first, small-scale demonstration of that proof of quantumness protocol (and another related protocol) on an ion-trap quantum computer. The results represent a technological step forward, as they involve performing measurements in the middle of a set of quantum operations and then continuing operation afterwards. These mid-circuit measurements have historically proven to be extremely technically challenging to implement, and open the door to important new paradigms of quantum computing such as real-time error correction. In Chapter 7, we present a novel construction for quantum circuits implementing integer multiplication—a key ingredient for both the proofs of quantumness just described, and for Shor’s algorithm itself. We show that the math behind fast multiplication algorithms which have been known for decades in the classical setting can be applied to an inherently quantum multiplication method which uses the phases of quantum states to perform arithmetic. Furthermore we show that a considerable fraction of the work can be performed in classical pre-computation when the quantum circuits are being compiled. The construction dramatically reduces the number of extra qubits used, while remaining competitive in the number of quantum operations required. Finally, in Chapter 8, we give concluding remarks and look towards the future.

1.1 Numerical studies of many-body quantum systems

The study of many-body physics has been known for centuries to be challenging, even in the classical setting. Long before the advent of quantum mechanics, in studying the collective motion of the Sun, Moon, and Earth, physicists found (and eventually proved) that the dynamics of just three masses interacting via Newtonian gravitation do not have a general closed-form solution. Worse, many of the physical systems relevant to our everyday lives have far more than three particles—a teaspoon of water contains roughly $10^{23}$ molecules! Clearly, directly solving for the dynamics of each individual molecule is out of the question.

Until the mid- to late-20th century, there were only two main strategies available for treating many-body systems. The first is to find specific cases that are solvable, and then leverage them to find approximate “nearby” solutions as well. (For the three-body orbit problem, an example of this is approximating one of the bodies as having zero mass in comparison to the other two, which is frequently appropriate in real astrophysical scenarios.) The second, which is appropriate for systems of very many particles and is used in the fields of statistical and fluid mechanics, is to not focus on the dynamics of each individual particle but to instead find a description of their collective behavior as a whole. These tools allowed physicists to make an astounding amount of progress, but certain questions remained impervious to these methods.

In the past several decades, a powerful new tool has come onto the scene: modern computers, capable of performing billions of mathematical operations per second, with which the dynamics of many interacting particles can be computed numerically! In the classical case, thousands or even millions of particles can be simulated at once, providing a good approximation of the thermodynamic limit of infinite system size. Quantum many-body physics, however, presents yet another challenge: the exponential size of the Hilbert space as a function of the number of particles. Consider a collection of $N$ quantum particles, each with just a two-dimensional local Hilbert space (for example, spin-$1/2$ particles without any other degrees of freedom). Their collective wavefunction is a vector [Footnote 4: Technically a ray, since the normalization is not physically important.] in a Hilbert space of dimension $2^N$—over 1,000 for a collection of just 10 particles, and over 1,000,000 for 20 particles. This is the practical manifestation of the exponential complexity of classically simulating quantum mechanics discussed earlier in the introduction: informally, the cost of simulating 1,000,000 classical particles can be compared to the cost of simulating just 20 quantum ones.

Considering this fact, and Feynman’s point that quantum computers may be better suited for simulating quantum mechanics than classical ones, one might ask why classical simulation of quantum mechanics is worth pursuing in the first place. There are three main benefits that have made numerical study invaluable in quantum many-body physics. First, a classical simulation allows one to interrogate the behavior of a quantum system under ideal conditions. The noise that is ever-present in experiments (and has particularly been the bane of those attempting to build quantum computers) can simply be turned off in a simulation. This can be hugely helpful in assessing whether an observed effect is real or simply due to imperfections of the experiment. Second, the development process is usually much faster. Depending on the software used for simulation, it can be possible to arbitrarily adjust the interactions between particles, or the physical geometry of the system, or any of a number of other parameters, at the press of a button—adjustments which may require considerable effort to implement in an experiment, if they are possible at all. (Indeed, the goal of the software library we present in Chapter 2 is precisely to make such simulations as easy and quick as possible!) Finally, and most importantly, classical simulation gives access to data which simply is inaccessible in a real quantum experiment. Performing measurements on quantum states causes wavefunction collapse—by Holevo’s theorem, even though $N$ spin-1/2 particles have a state vector of dimension $2^N$, the number of classical bits of information that can be extracted by measuring that wavefunction is only $N$! Meanwhile, in a classical simulation, arbitrary functions of the state vector can be computed. This is particularly important for “non-observable” quantities such as the entanglement entropy of a state, which in an experiment cannot always be estimated with good statistics without performing an exponential number of trials.

1.1.1 How to represent a quantum state in a classical computer

Having hopefully convinced the reader of the merits of pursuing classical simulation of quantum many-body physics, as difficult as it may be, we now move on to discussing the first necessary step to performing numerical simulations: representing a quantum state on a classical computer. Much work has been done exploring various strategies for this. Here we discuss two that are most commonly used: storing the state vector explicitly as an array of complex numbers, and tensor network methods. We note that there exist a wide range of other techniques whose use is somewhat less widespread. These include, for example, stabilizer states, which can efficiently represent states generated by quantum circuits of Clifford gates (and “nearby” states) [GOT98, AG04]; and neural network states, in which a neural network is trained to produce the coefficients of the state vector [CT17, VMB+22].

In what follows, we will consider quantum systems consisting of $N$ two-level systems. The Hilbert space is mathematically represented by the space $(\mathbb{C}^2)^{\otimes N}$. We consider a pure quantum state $|\psi\rangle$ on this Hilbert space that we desire to represent numerically. We do not cover here the various techniques for numerically representing and computing with mixed states; however, we note that models with noise can be simulated with pure states, for example via the formalism of “quantum trajectories.” [DAL14]

Vectors of coefficients

The most straightforward way to represent such a state numerically is by picking a set of basis states $\{|b_i\rangle\}$ that spans the Hilbert space, and storing an array of complex numbers $c_i$ such that $|\psi\rangle = \sum_i c_i |b_i\rangle$. The obvious challenge with this strategy is that the number of coefficients $c_i$ that must be stored is equal to the dimension of the Hilbert space, which is exponential in $N$. To give a sense of scale, with 1 Terabyte of RAM (achievable with a few nodes of a modern computer cluster), and complex numbers stored as a pair of 8-byte floating point numbers, one can represent a system of 35 spins—not a huge number, but certainly large enough to see many-body collective behavior in certain systems. Despite the limitation to moderate system sizes, the power of this representation is that it can describe entirely arbitrary quantum states. This benefit is often worth the exponential cost—indeed, this is how states are represented in the numerical work of Chapters 2 and 3.
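To make that scaling concrete, here is a quick back-of-the-envelope check (a sketch of mine in Python, assuming 1 Terabyte means $10^{12}$ bytes and 16 bytes per complex amplitude, i.e. two 8-byte floats):

# How many spin-1/2 particles can we represent with a full state vector in 1 TB?
# (Assumes 1 TB = 1e12 bytes and 16 bytes per complex amplitude; these
#  conventions are my own for this estimate.)
BYTES_PER_AMPLITUDE = 16
MEMORY_BYTES = 1e12

n_spins = 0
while (2 ** (n_spins + 1)) * BYTES_PER_AMPLITUDE <= MEMORY_BYTES:
    n_spins += 1

print(n_spins)  # -> 35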

From an information-theoretic perspective, an exponential number of classical bits is simply required to represent arbitrary quantum states, because the space of quantum states is exponentially large. Fortunately, in many physical situations, the states of interest are likely to not be entirely arbitrary but to belong to some subspace of the larger Hilbert space, and if we can find a more efficient way of representing the states in that subspace it becomes possible to handle larger system sizes. One application of this idea, which is used widely, is to use conservation laws to divide the Hilbert space into the direct sum of several smaller subspaces. As an example, the Hamiltonian governing the dynamics of a system of quantum spins might conserve their total magnetization in a particular direction. In that case we can ensure that we choose a set of basis states for which that magnetization operator is diagonal, and break the Hilbert space into “sectors,” each with a different magnetization. In this case, for spin-1/2 particles, our 1 Terabyte of RAM can store the state vector of 38 spins with the total magnetization set to zero, as opposed to the 35 spins that were possible in the general case. (The zero-magnetization subspace has the largest dimension; for other values of the magnetization things improve even more.) Note that this specific conservation law is implemented in both Chapters 2 and 3; a number of other conservation laws are frequently used in physics studies and several more are implemented in the dynamite package presented in Chapter 2.
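The same kind of estimate can be made for the zero-magnetization sector, whose dimension for $N$ spin-1/2 particles is the binomial coefficient $\binom{N}{N/2}$ (again a sketch of mine, with the same memory conventions as above):

from math import comb

BYTES_PER_AMPLITUDE = 16   # two 8-byte floats
MEMORY_BYTES = 1e12        # 1 TB, taken here to mean 1e12 bytes

def sector_dimension(n_spins):
    """Dimension of the zero-magnetization sector: choose which half are spin-up."""
    return comb(n_spins, n_spins // 2)

# largest even number of spins whose zero-magnetization sector fits in memory
n = 2
while sector_dimension(n + 2) * BYTES_PER_AMPLITUDE <= MEMORY_BYTES:
    n += 2

print(n)  # -> 38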

Even with conservation laws that allow us to ignore a large fraction of the Hilbert space, the number of coefficients that must be stored usually remains superpolynomial in the system size. We now move on to discussing a way of representing quantum states using only a polynomial amount of data—so-called “tensor networks.”

Tensor networks

In the previous section, we reduced the size of the effective Hilbert space by only considering states with a given value of a conserved quantity; the intuition behind tensor network representations of quantum states is to focus only on states with low entanglement. This idea is well-motivated because the states encountered in physics studies frequently have this property—for example, this is the case for the ground states of a large class of Hamiltonians. The challenge, of course, is to find a way of representing low-entanglement states such that they can be stored, and manipulated, efficiently. Tensor networks attempt to do just that.

A number of extensive, pedagogical introductions to tensor network methods have been written, to which I would direct any reader interested in deeply exploring this topic; [ORÚ14, ORÚ19, CPS+21] instead of creating yet another (probably of worse quality), here I will give a high-level overview of their structure, as I like to view it. My hope is that it can be helpful in building intuition for those whose mathematical style is similar to my own.

The broad idea is based on the Schmidt decomposition. Consider a quantum system with two parts, which we denote $A$ and $B$, with local Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ of dimension $d_A$ and $d_B$ respectively. In general, we may write the global state of the system in terms of any orthonormal sets of basis vectors $\{|a_i\rangle\}$ and $\{|b_j\rangle\}$, as $|\psi\rangle = \sum_{ij} c_{ij}\, |a_i\rangle \otimes |b_j\rangle$. The power of the Schmidt decomposition is that it shows how to construct particular orthonormal bases $\{|u_i\rangle\}$ and $\{|v_i\rangle\}$, and non-negative coefficients $\lambda_i$, such that the sum only needs to run over a single index:

$|\psi\rangle = \sum_i \lambda_i\, |u_i\rangle \otimes |v_i\rangle$    (1.1)

Viewed another way, it finds bases for $\mathcal{H}_A$ and $\mathcal{H}_B$ such that the coefficients $c_{ij}$ are only nonzero when $i = j$. At first this seems too good to be true: we are representing a quantum state on the global Hilbert space, which has dimension $d_A d_B$, using only $\min(d_A, d_B)$ coefficients! Alas, there is no free lunch; the trick is that the basis vectors $|u_i\rangle$ and $|v_i\rangle$ are themselves dependent on $|\psi\rangle$, and so must be stored explicitly. The real benefit of the Schmidt decomposition comes when we consider entanglement.

Recall the definition of the von Neumann entanglement entropy of subsystem $A$, denoted $S_A$. It is a function of $\rho_A = \mathrm{Tr}_B\, |\psi\rangle\langle\psi|$, the reduced density matrix of subsystem $A$ when $B$ has been removed via a partial trace:

$S_A = -\mathrm{Tr}\left[\rho_A \ln \rho_A\right]$    (1.2)

It is straightforward to see from the definition of the partial trace, and the orthonormality of the basis vectors $|u_i\rangle$ and $|v_i\rangle$, that the Schmidt decomposition diagonalizes the reduced density matrix:

$\rho_A = \sum_i \lambda_i^2\, |u_i\rangle\langle u_i|$    (1.3)

and thus the entanglement entropy can be written straightforwardly in terms of the $\lambda_i$:

$S_A = -\sum_i \lambda_i^2 \ln \lambda_i^2$    (1.4)

Careful inspection of this expression, combined with the fact that $\sum_i \lambda_i^2 = 1$, yields a powerful fact: informally, if the entanglement entropy is small, then only a few of the $\lambda_i$ are non-negligible! This suggests the following approximation: for some cutoff $\epsilon$, simply drop the terms of Eq. 1.1 for which $\lambda_i < \epsilon$.

$|\psi\rangle \approx \sum_{i\,:\,\lambda_i \geq \epsilon} \lambda_i\, |u_i\rangle \otimes |v_i\rangle$    (1.5)

If most of the $\lambda_i$ are smaller than $\epsilon$, we only need to store a small number of tuples $(\lambda_i, |u_i\rangle, |v_i\rangle)$. The intuition here is really nice: if there is no entanglement between the two parts, then the state is trivially the tensor product of a pure state on each part: $|\psi\rangle = |u\rangle \otimes |v\rangle$. The Schmidt decomposition allows us to see that this is the most extreme case of a more general fact, that low-entanglement states are well approximated by a linear combination of just a few tensor products of that form.
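Numerically, the Schmidt decomposition is simply a singular value decomposition of the state vector reshaped into a $d_A \times d_B$ matrix. The following is a minimal NumPy sketch (my own illustration, not code from the text) that computes the Schmidt coefficients, the entanglement entropy of Eq. 1.4, and a truncation in the spirit of Eq. 1.5:

import numpy as np

def schmidt(psi, d_A, d_B):
    """Schmidt decomposition of a pure state of a bipartite system A (x) B.

    The singular values of the reshaped state vector are exactly the Schmidt
    coefficients lambda_i; the columns of u and rows of vh are |u_i> and |v_i>.
    """
    u, lam, vh = np.linalg.svd(psi.reshape(d_A, d_B), full_matrices=False)
    return lam, u, vh

# example: a weakly entangled two-qubit state
psi = np.kron([1, 0], [1, 0]) + 0.1 * np.kron([0, 1], [0, 1])
psi = psi.astype(complex) / np.linalg.norm(psi)

lam, u, vh = schmidt(psi, 2, 2)
entropy = -np.sum(lam**2 * np.log(lam**2))   # Eq. 1.4
print(lam, entropy)

# truncation as in Eq. 1.5: keep only Schmidt vectors with lambda_i >= epsilon
epsilon = 0.5
kept = [i for i in range(len(lam)) if lam[i] >= epsilon]
psi_approx = sum(lam[i] * np.kron(u[:, i], vh[i, :]) for i in kept)
print(np.abs(np.vdot(psi, psi_approx)))      # overlap with the original state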

Let’s now look at how this can be applied to our system of $N$ two-level spins, arranged in a 1D chain. Let system $A$ be the leftmost spin, and system $B$ be the remainder of the chain. Since $\mathcal{H}_A$ has dimension two, applying the Schmidt decomposition we will get (at most) two tuples $(\lambda_i, |u_i\rangle, |v_i\rangle)$, where the two $|u_i\rangle$ are vectors each of dimension 2 and the two $|v_i\rangle$ are vectors of dimension $2^{N-1}$. The key to making this useful is that we may now apply the Schmidt decomposition again, to the vectors $|v_i\rangle$! Letting our new subsystem $A'$ be the second-to-leftmost spin and $B'$ be the remaining spins on the right, we now get a total of four tuples—two from each of the two $|v_i\rangle$. The new left vectors are once again each of dimension 2, and the new right vectors are of dimension $2^{N-2}$. We may continue this plan across the entire chain of spins, ultimately decomposing our state into a collection of $N$ sets of basis vectors, each of dimension 2. The benefit is obvious—we are never storing any vectors larger than dimension 2! The downside is that without any truncation, the number of such basis vectors grows exponentially with the distance from the end of the spin chain. However, as discussed above, for states with low entanglement we can drop most of the vectors, since their associated $\lambda_i$ are small. In fact, for states without extensive entanglement, we need only keep a constant number of vectors on each spin to achieve a good approximation of $|\psi\rangle$. This constant is called the “bond dimension” and is usually denoted by $\chi$.
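To make the repeated-Schmidt-decomposition picture concrete, here is a bare-bones sketch (my own, under the usual conventions; real MPS libraries are considerably more sophisticated) that sweeps across a chain of qubits, splitting off one site at a time with an SVD and keeping at most $\chi$ singular values at each bond:

import numpy as np

def to_mps(psi, n_sites, chi):
    """Decompose a 2**n_sites state vector into a matrix product state.

    Each tensor has shape (left bond, physical dimension 2, right bond);
    at most chi singular values are kept at each bond.
    """
    tensors = []
    remainder = psi.reshape(1, -1)                  # (left bond, rest of chain)
    for _ in range(n_sites - 1):
        left = remainder.shape[0]
        u, s, vh = np.linalg.svd(remainder.reshape(left * 2, -1),
                                 full_matrices=False)
        keep = min(chi, np.count_nonzero(s > 1e-12))
        u, s, vh = u[:, :keep], s[:keep], vh[:keep, :]
        tensors.append(u.reshape(left, 2, keep))    # tensor for this site
        remainder = s[:, None] * vh                 # pass the rest to the right
    tensors.append(remainder.reshape(-1, 2, 1))     # last site
    return tensors

def from_mps(tensors):
    """Contract an MPS back into a dense state vector (for checking only)."""
    out = tensors[0]
    for t in tensors[1:]:
        out = np.tensordot(out, t, axes=([-1], [0]))
    return out.reshape(-1)

# sanity check on a GHZ state, which has Schmidt rank 2 across every cut
n = 10
ghz = np.zeros(2**n, dtype=complex)
ghz[0] = ghz[-1] = 1 / np.sqrt(2)
mps = to_mps(ghz, n, chi=2)                         # bond dimension 2 suffices
print(np.allclose(from_mps(mps), ghz))              # -> True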

The construction just described is called a matrix product state (MPS). The standard way of viewing it is as a tensor network: a set of tensors, each one representing the set of basis vectors associated with each spin, with “bonds” between them corresponding to the shared indices in the sum of Eq. 1.1. It turns out the linear chain of tensors we have described so far is just one of a large class of methods for representing quantum states via tensor networks. This has yielded new ways of representing states with different physical geometry of the interactions, and different entanglement structures—well-known examples include Projected Entangled Pair States (PEPS) and Multiscale Entanglement Renormalization Ansatz (MERA). [CPS+21, VID07] Moving past the case of a 1D physical system is a challenging pursuit, with surprising pitfalls such as constructions that can efficiently represent certain quantum states accurately, but for which actually computing any quantities of interest is exponentially computationally hard! This is an active area of research, and we direct the interested reader to any of the extensive reviews on the subject. [ORÚ14, ORÚ19, CPS+21]

1.1.2 Simulating quantum dynamics

Having discussed various ways of representing quantum states on classical computers, we now turn to performing computations on them. The most obvious operation of interest is the numerical simulation of their evolution through time. Consider a quantum system with some Hamiltonian $H(t)$ (with the argument $t$ making explicit that the Hamiltonian may depend on time). Time evolution is represented mathematically by the time evolution unitary $U(t)$ that corresponds to the solution of the time-dependent Schrödinger equation

$i\hbar\, \frac{\partial}{\partial t} U(t) = H(t)\, U(t)$, with $|\psi(t)\rangle = U(t)\, |\psi(0)\rangle$    (1.6)

For a time-independent Hamiltonian, the solution has a straightforward form:

$U(t) = e^{-iHt/\hbar}$    (1.7)

For a time-dependent Hamiltonian, things are a bit more complicated. A classic mistake made by those new to the field is to write $U(t)$ as

$U(t) \stackrel{?}{=} \exp\left(-\frac{i}{\hbar} \int_0^t H(t')\, dt'\right)$

which at first glance looks great—the time evolution should correspond to the cumulative effect of the Hamiltonian acting from time $0$ to time $t$. (It feels like a rite of passage to screw this up! I certainly did at least once early in my research.) But it is incorrect, which can be seen as follows. Consider a simple time-dependent Hamiltonian which consists of a static Hamiltonian $H_1$ acting for time $t_1$, followed by another static Hamiltonian $H_2$ acting for time $t_2$. The evolution over one cycle of time $t_1 + t_2$ is (correctly) described by the unitary

$U(t_1 + t_2) = e^{-iH_2 t_2/\hbar}\, e^{-iH_1 t_1/\hbar}$    (1.8)

which crucially is not necessarily equal to $e^{-i(H_1 t_1 + H_2 t_2)/\hbar}$—that equality only holds if $H_1$ and $H_2$ commute! To account for the fact that $H(t)$ may not commute with itself for all times $t$, one must use what is called the time-ordered exponential, which is denoted thus:

$U(t) = \mathcal{T} \exp\left(-\frac{i}{\hbar} \int_0^t H(t')\, dt'\right)$    (1.9)

The precise definition and use of the time-ordered exponential are outside the scope of this introduction, but it turns out we will not need it for our numerical purposes: during my research, the best strategy has essentially always been to simply break up the time-dependent Hamiltonian into a piecewise time-independent one, computing each piece separately as in Equation 1.8. Thus for the rest of this section we will discuss how numerical time evolution under a static Hamiltonian is performed.

The most straightforward way of implementing time evolution is to simply create a numerical representation of $H$ as a matrix, and compute $U(t)$ via the matrix exponential (Eq. 1.7). This can be done via a matrix exponential algorithm (e.g. via the linalg.expm() function of the SciPy library); however, in my experience, for larger system sizes it is actually faster to solve for the eigendecomposition $H = V D V^\dagger$, where $V$ is a matrix whose columns are the eigenvectors and $D$ is the diagonal matrix of the eigenvalues. The unitary can then be computed as $U(t) = V e^{-iDt/\hbar} V^\dagger$, where exponentiating $D$ is very straightforward because it is diagonal—one simply exponentiates each eigenvalue separately. This strategy is particularly efficient if the unitary is required at many different times $t$, because the eigendecomposition only needs to be performed once.
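As a concrete illustration (a sketch of mine, with an arbitrary random Hermitian matrix standing in for the Hamiltonian and units in which $\hbar = 1$), the two approaches can be compared directly for a small system:

import numpy as np
from scipy.linalg import expm, eigh

dim = 2**3                                       # a tiny 3-spin Hilbert space
A = np.random.randn(dim, dim) + 1j * np.random.randn(dim, dim)
H = (A + A.conj().T) / 2                         # random Hermitian "Hamiltonian"
t = 1.3

# option 1: direct matrix exponential (Eq. 1.7)
U_direct = expm(-1j * H * t)

# option 2: eigendecomposition H = V D V^dagger, done once and reusable for any t
evals, V = eigh(H)
U_eig = V @ np.diag(np.exp(-1j * evals * t)) @ V.conj().T

print(np.allclose(U_direct, U_eig))              # -> True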

The benefit of explicitly computing the unitary is that, well, one has an explicit representation of it! With that, one can easily determine its action on arbitrary state vectors, or even on mixed states. Explicitly computing $U(t)$ is very costly, however: it is a square matrix of the same dimension as the Hilbert space, meaning that even just storing it requires computer memory proportional to the Hilbert space dimension squared—without the use of conservation laws, it becomes impractical for systems of more than just 16 spins or so. Fortunately, it is almost always overkill—the only instance I have ever encountered in which it has been necessary to explicitly compute $U(t)$ is in the study of Floquet systems, where the eigenvalues of the time evolution unitary over one driving period encode “quasi-energies” that are physically relevant. In virtually all other cases, what we are really interested in is the action of $U(t)$ on a state.

The hope that computing $U(t)\,|\psi\rangle$ for some state $|\psi\rangle$ may be more efficient than computing $U(t)$ in full is well-motivated if we consider the Taylor expansion of $e^{-iHt/\hbar}\,|\psi\rangle$ in $t$:

$e^{-iHt/\hbar}\,|\psi\rangle = |\psi\rangle + \left(\frac{-iHt}{\hbar}\right)|\psi\rangle + \frac{1}{2!}\left(\frac{-iHt}{\hbar}\right)^2|\psi\rangle + \frac{1}{3!}\left(\frac{-iHt}{\hbar}\right)^3|\psi\rangle + \cdots$    (1.10)

If $t$ is small, the norm of each term of the expansion is much smaller than that of the last, and we may obtain a good approximation of $U(t)\,|\psi\rangle$ with only a small number of terms. Importantly, it is possible to compute this expansion while only storing three vectors: one to hold the result, and two more in which $H|\psi\rangle$, $H^2|\psi\rangle$, $H^3|\psi\rangle$, etc. are computed in alternating fashion. (Depending on the specifics of the situation, it may even be possible to reduce this to fewer than three, if the multiplications of the state by $H$ can be performed in place.)

Of course, we are usually interested in time evolution for longer times than those for which this expansion is well-controlled. In that case, we may consider the following exact decomposition

$e^{-iHt/\hbar} = \left(e^{-iH(t/n)/\hbar}\right)^n$    (1.11)

That is, we can break down the total evolution over a time $t$ into $n$ evolutions of time $t/n$. If $n$ is sufficiently large, each of these smaller time evolutions is in the regime in which the above expansion is well-controlled, and thus the error is well-controlled across the entire evolution. This idea forms the backbone of a number of algorithms for time evolution of states, whether they are represented explicitly as a vector of coefficients or via matrix product states.
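A minimal sketch of this strategy (my own illustration, with $\hbar = 1$; the function names, the expansion order, and the number of steps are arbitrary choices) combines the truncated Taylor series of Eq. 1.10 with the splitting of Eq. 1.11, while storing only a couple of working vectors of the full Hilbert-space dimension:

import numpy as np
from scipy.linalg import expm   # only used for the sanity check at the end

def taylor_step(H, psi, dt, order=10):
    """Approximate exp(-i H dt)|psi> with a truncated Taylor series (Eq. 1.10)."""
    result = psi.copy()
    term = psi.copy()
    for k in range(1, order + 1):
        term = (-1j * dt / k) * (H @ term)   # builds (-i H dt)^k / k! |psi> iteratively
        result = result + term
    return result

def evolve(H, psi, t, n_steps=100, order=10):
    """Evolve for a total time t by taking n_steps short Taylor steps (Eq. 1.11)."""
    for _ in range(n_steps):
        psi = taylor_step(H, psi, t / n_steps, order)
    return psi

# sanity check against exact evolution for a small random Hamiltonian
dim = 2**4
A = np.random.randn(dim, dim)
H = (A + A.T) / 2
psi0 = np.random.randn(dim).astype(complex)
psi0 /= np.linalg.norm(psi0)
print(np.allclose(evolve(H, psi0, 2.0), expm(-2j * H) @ psi0))   # -> True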

In Time-Evolving Block Decimation (TEBD), an algorithm for time-evolving matrix product states, it is observed that if the expansion in Eq. 1.10 is truncated after the first order in $t$, the operators that need to be applied have support on at most two sites at once—and thus only need to be applied to at most two tensors of the matrix product state, which is efficient for matrix product states in which the bond dimension is not too large. As long as the entanglement remains small throughout the time evolution, terms of the Schmidt decomposition can continuously be truncated throughout the process (as in Eq. 1.5) while maintaining a good approximation of the state. [ORÚ19]

For states stored as vectors of coefficients, as in Section 1.1.1, the strategy of Eq. 1.10 can be improved through the use of so-called Krylov subspace methods. This type of algorithm forms the backbone of the numerical package dynamite presented in Chapter 2, and a detailed exposition is provided there; for now, we give the broad intuition. Instead of explicitly computing the sum in Eq. 1.10, Krylov methods compute an orthonormal basis for the subspace $\mathrm{span}\{|\psi\rangle, H|\psi\rangle, H^2|\psi\rangle, \ldots, H^{k-1}|\psi\rangle\}$ up to some cutoff order $k$. Then, the matrix $H$ is projected into this small subspace and the matrix exponential is computed explicitly in the small-dimensional space. Using this, the vector $e^{-iHt/\hbar}|\psi\rangle$ is computed (approximately) and then projected back into the original Hilbert space. It is clear by inspection that this strategy will do at least as well as Eq. 1.10 for the same order of approximation $k$, but it turns out that in practice it usually does much better, because it also captures the parts of higher-order terms of the expansion that fall in the subspace. As before, for larger $t$ it may be useful to break the evolution down into $n$ evolutions of a shorter time $t/n$, and compute each of these shorter evolutions with the method just described.
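Below is a bare-bones sketch of this idea (my own illustration, not dynamite's implementation; the function name, the subspace size $k = 20$, and the test parameters are arbitrary, and $\hbar = 1$). It builds an orthonormal basis of the Krylov subspace with the Arnoldi process, exponentiates the projected Hamiltonian, and maps the result back to the full Hilbert space:

import numpy as np
from scipy.linalg import expm

def krylov_evolve(H, psi, t, k=20):
    """Approximate exp(-i H t)|psi> using a k-dimensional Krylov subspace."""
    dim = len(psi)
    V = np.zeros((dim, k), dtype=complex)   # orthonormal basis of the Krylov subspace
    h = np.zeros((k, k), dtype=complex)     # H projected into that subspace
    beta = np.linalg.norm(psi)
    V[:, 0] = psi / beta
    m = k
    for j in range(k):
        w = H @ V[:, j]
        for i in range(j + 1):              # Gram-Schmidt against previous basis vectors
            h[i, j] = np.vdot(V[:, i], w)
            w = w - h[i, j] * V[:, i]
        if j + 1 < k:
            nrm = np.linalg.norm(w)
            if nrm < 1e-12:                 # subspace closed early; result is exact
                m = j + 1
                break
            h[j + 1, j] = nrm
            V[:, j + 1] = w / nrm
    small_U = expm(-1j * t * h[:m, :m])     # matrix exponential in the small subspace
    return beta * (V[:, :m] @ small_U[:, 0])

# quick comparison against exact evolution for a small system
dim = 2**5
A = np.random.randn(dim, dim)
H = (A + A.T) / 2
psi = np.random.randn(dim).astype(complex)
psi /= np.linalg.norm(psi)
exact = expm(-0.3j * H) @ psi
print(np.linalg.norm(krylov_evolve(H, psi, 0.3) - exact))   # typically very small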

1.1.3 Eigensolving

Another operation that is widely relevant for physics studies is solving for the eigenvalues and eigenvectors of a Hamiltonian. The eigenvalues determine the energy levels of the system, and eigenstates provide insight into its physical characteristics. In many cases the ground state and first few excited states are particularly important, as they determine the system’s behavior at low temperature.

Once again, the most straightforward way to find the eigenvalues and eigenvectors of a quantum operator is to numerically construct the Hamiltonian as a matrix and then apply a generic matrix eigensolver (e.g. numpy.linalg.eigh()). But as in the time evolution case, this is simultaneously very expensive and usually overkill. In situations where we only need the low-lying states, we can use the variational method. The idea is that for a given Hamiltonian $H$, the energy $\langle\psi|H|\psi\rangle$ of any state $|\psi\rangle$ is lower-bounded by the energy of the ground state—and if we can optimize $|\psi\rangle$ to have as low an energy as possible, we can find a good approximation of the ground state energy (and hopefully of the ground state itself).

For matrix product states, the classic way of doing this is via the Density Matrix Renormalization Group (DMRG) algorithm. Informally, the idea of DMRG is to sweep back and forth across the tensors of the matrix product state, minimizing the energy at each step by locally optimizing over each one. By doing several sweeps in this way, the state becomes a better and better approximation of the true ground state; if the true ground state has low entanglement (and thus can be well-represented by a matrix product state), in practice it can usually be converged with only a few sweeps. [ORÚ14, ORÚ19, CPS+21]

For states stored as vectors of coefficients, we may use any of a number of optimization algorithms to attempt to find the vector with the lowest value of $\langle\psi|H|\psi\rangle$. The most frequently used methods build off of the intuition of the power method for eigensolving. Consider a uniformly random vector, which can be written in the basis of the eigenvectors $|\phi_i\rangle$ of $H$ (which are as yet unknown): $|\psi\rangle = \sum_i a_i |\phi_i\rangle$. Observe that repeatedly multiplying this vector by $H$ exponentially enhances the component along the eigenvector corresponding to the largest-magnitude eigenvalue: $H^k |\psi\rangle = \sum_i a_i E_i^k |\phi_i\rangle$, which for large $k$ is dominated by the term with the largest $|E_i|$. By shifting the zero point of energy we may ensure that the largest-magnitude eigenvalue is the most negative one, and thus this method will converge to the ground state! The power method is not usually used directly, because there exist algorithms which are based on the same intuition but converge even more quickly—a class of methods called iterative eigensolvers. These include the Lanczos algorithm, Krylov subspace methods more generally (as described earlier for time evolution), and others. Chapters 2 and 3 make extensive use of these iterative-type eigensolvers, and detailed descriptions of how they work can be found there.
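A toy sketch of that intuition (my own illustration, with an arbitrary random symmetric matrix standing in for the Hamiltonian and an arbitrary iteration count): shift the spectrum so the ground state has the largest-magnitude eigenvalue, then repeatedly apply the shifted matrix to a random vector.

import numpy as np

dim = 2**6
A = np.random.randn(dim, dim)
H = (A + A.T) / 2                         # random symmetric "Hamiltonian"

shift = np.linalg.norm(H, ord=2)          # any upper bound on the spectrum works
M = H - shift * np.eye(dim)               # now the ground state dominates in magnitude

psi = np.random.randn(dim)
for _ in range(1000):
    psi = M @ psi
    psi /= np.linalg.norm(psi)            # normalize to avoid overflow

print(psi @ H @ psi)                      # variational energy estimate
print(np.linalg.eigvalsh(H)[0])           # exact ground state energy, for comparison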

1.2 Quantum advantage

As it stands, the central concern of quantum computing is, of course, whether it can outperform classical computing—a goal termed “quantum advantage.” But determining where and if quantum advantage is possible, and before that, even clearly defining it, is a surprisingly subtle pursuit. On the theoretical side, a wide array of algorithms have been discovered that solve various problems in asymptotically fewer operations on a quantum computer than on a classical one. [Footnote 5: In fact, this point is subtle too. Explicit proofs of a lower bound on the cost of solving a problem on a classical computer are hard to come by, so these algorithms usually correspond to speedups over the best known classical algorithm.] But translating these algorithms into a speedup observable in practice has proved very difficult. A main challenge is that modern classical computers are extremely fast, in terms of the number of operations they can perform per second. Today’s top supercomputers can achieve exascale performance: $10^{18}$, or one quintillion, 64-bit floating point operations per second! Meanwhile, most modern quantum experiments are limited to a few hundred or thousand quantum gates total before noise destroys the quantum state. Even ignoring noise, the rate at which modern quantum computers can execute gates is many orders of magnitude lower, with the exact figure depending on the platform—dramatically slower than their classical counterparts.

To demonstrate a speedup in practice, the algorithmic gain from using quantum hardware must outcompete this extreme disadvantage in raw speed. It is clear that quantum algorithms which only reduce the number of operations by a polynomial amount—say, requiring on the order of $\sqrt{N}$ operations when the classical computer needs $N$—will not close the gap without revolutionary improvements in quantum technology. Instead, near-term quantum advantage requires algorithms which provide a superpolynomial speedup, ideally the fully exponential speedup corresponding to the classical complexity of simulating quantum mechanics itself. A few of the classic quantum algorithms, in particular Shor’s, do exhibit this dramatic superpolynomial improvement over the best known classical algorithm for the same problem. Unfortunately, they are quite complicated to implement, and so remain far out of reach of near-term quantum devices. For these reasons, the pursuit of experimental quantum advantage has required the devising of new computational problems and associated quantum algorithms, which are as undemanding as possible for a quantum computer yet as costly as possible for a classical one.

Note that here we will focus on quantum advantage in computational cost, broadly defined but with a focus on runtime. The setting to keep in mind is that of a quantum device connected to a classical computer. The classical computer sends the quantum device commands (usually the quantum gates to perform), and receives some classical data back. A weak version of the goal is that if the quantum device is replaced by a classical machine of comparable resources—say, in size, energy usage, and computation time—it should be unable to reproduce the computation performed by the quantum device. The strongest version of quantum advantage, which is the one pursued by most of the research described next, is that no classical machine, not even the world’s fastest supercomputers, can reproduce the quantum device’s behavior, given any practical amount of computation time and other resources.

1.2.1 The first experimental demonstrations

The problem of finding specific computational tasks for demonstrating quantum advantage in practice did not receive much attention until recently, because quantum devices were so small and noisy that they could be easily simulated by classical computers no matter what operations they performed. But in the past few years, quantum hardware has improved to the point that at least direct classical simulation has become infeasible—leading to the exploration of whether there is some computation, perhaps contrived and not necessarily useful, for which a speedup could be experimentally observed. Theoretical work along these lines led to the conclusion that the most achievable way to show quantum advantage would not be to solve a problem with a deterministic answer (like “factor this integer”) but instead to solve a sampling problem: given some data that defines a probability distribution, the task is to generate (perhaps approximately) samples from that distribution. [AA11, BMS16, LBR17, HM17, TER18, BIS+18, BFN+19, AC17, NRK+18] By defining the target probability distribution to correspond to the distribution of measurement results of a particular quantum state, produced by running a quantum circuit, the problem becomes very naturally suited to a quantum computer. Furthermore, intuitively, the hardness of classically reproducing the sampling results comes directly from the hardness of classically simulating generic quantum circuits. In particular, if the quantum circuit to be run has little structure (for example, if it consists of a series of random quantum gates), there should be no “shortcuts” by which a classical computer can reproduce the results, short of accurately simulating a quantum device. [Footnote 6: While the intuition may be clear, a considerable effort was required to give theoretical backing to the hardness of random circuit sampling, and the hardness of approximately sampling from the distribution is still an active area of research. [HZN+20, PZ22, GK21, PCZ22, LLL+21c, LGL+21a, GKC+21, AGL+23]] Starting in 2019, a series of experiments were published implementing this idea at scale, marking the first quantum computations to not only outperform the top classical supercomputers, but do so to such an extent that the results could not be reproduced classically at all! [Footnote 7: A number of follow-up papers from other research groups demonstrated various improvements to classical techniques for more efficiently solving the sampling problems implemented in the experiments; [HZN+20, PZ22, GK21, PCZ22, LLL+21c, LGL+21a, GKC+21, AGL+23] in the meantime, the quantum experiments have also improved. At this point it seems to be generally accepted that the most recent experiments are truly out of reach of classical computing.] [AAB+19, ZWD+20, WBC+21, ZCC+22]

With those results came a subtlety, however: if the output cannot be reproduced classically, how is it possible to verify that it’s actually correct? The papers followed two parallel strategies for handling this. The first is to perform experiments in the so-called “Goldilocks zone,” where finding a classical solution is very difficult but not impossible. That way, the difference in computation time (or another metric such as energy usage) can still constitute quantum advantage, yet via a large classical computation the results can be verified. The second strategy applies to computational problems past the Goldilocks zone, where direct verification is truly infeasible. In that case, experiments resorted to showing that quantum mechanical processes were the only “reasonable” explanation for the observed results. For example, in their landmark paper that began the series of experimental claims to quantum advantage, Google showed that their device performed as expected for a set of “nearby” computations that were possible to classically simulate, extrapolating that the device probably would perform as desired when running the classically hardest computations as well. In a similar vein, the second paper to claim quantum advantage supported their results by “ruling out alternative [classical] hypotheses.” [ZWD+20]

1.2.2 Efficiently-verifiable tests

But what if there is some classical explanation that we have not considered? Or, what if we do not have any ability to look “inside” the device running our computations—for example, if we want to test the power of a quantum cloud computing service being offered over the internet? To demonstrate quantum advantage past the Goldilocks regime in these scenarios requires setting up an asymmetry in the computational problem: it should be hard to classically solve, but easy to classically verify. We can frame this idea via a structure from classical complexity theory called an interactive proof. Here, a quantum prover desires to demonstrate its capability to a skeptical classical verifier. Due to this connection, this type of test has been termed a “proof of quantumness.”

Shor’s algorithm actually provides a straightforward protocol for achieving these goals: the classical verifier chooses two large prime numbers (in a way that would be secure for RSA encryption), multiplies them together, and sends the result to the quantum prover. If the prover can find the factors and return them, the verifier can be confident that the prover is truly quantum, to the same level of confidence that it is believed that classically factoring numbers is hard. We have already discussed why this protocol begs to be improved: the inherent challenges in running Shor’s algorithm make it totally infeasible to run at scale on today’s quantum devices. So, recent excitement has focused on whether the goal of creating an efficiently-verifiable “proof of quantumness” can be achieved in a way that is compatible with near-term devices.

There seem to be two direct paths towards achieving this. The first is to take a sampling problem, like the ones that were first used to show (non-efficiently-verifiable) quantum advantage, and somehow add structure to it so that the output can be efficiently verified. The second is to take a cryptographic problem, like factoring, and somehow “strip it down” to its core, such that it hopefully could require less resources than the full machinery of an algorithm like Shor’s.

Adding structure to sampling problems

The challenge with the first approach is that the classical hardness of sampling problems at some level depends on the fact that they do not have much structure. To my knowledge, there has only been one protocol ever proposed that attempts to create a proof of quantumness in this way. It actually came long before even the non-verifiable tests of quantum advantage were proposed. In 2008, a paper was released introducing a new quantum complexity class called IQP (“Instantaneous Quantum Polynomial” time, a class of quantum circuits in which all gates commute with each other). The authors realized that the structure of IQP computations seemed to lend itself to a particular sampling problem, which could be set up such that the underlying measurement distribution frequently yielded bitstrings with a special, efficiently-checkable relationship to a secret string $s$ that should only be known to the verifier. Since $s$ is secret, and simulating IQP circuits is classically hard, [BJS11, BMS16] only a real quantum computer should be able to reliably produce bitstrings having this special relationship to $s$. The protocol stood unbroken for over a decade, and experimental efforts to implement it began to be undertaken. [CCL+21] Unfortunately, it turned out that the concern raised above, that adding structure might compromise the classical hardness, was real. In 2019 I found a classical algorithm by which the secret string $s$ can be recovered in its entirety, destroying the classical hardness claim of the protocol. The original protocol, and the algorithm to break it, are described in Chapter 4.

Protocols based on cryptography

While that first approach does not seem to have led to any further advances, progress has been made on the second—simplifying cryptographic problems to make them more feasible. This may initially seem surprising, considering that “make factoring easier for near term devices” is an obvious and intensely sought-after goal in quantum computing! The key observation is that actually fully factoring numbers is overkill. All we need is to do something that classical computers can’t, and perhaps that something can be based on the hardness of factoring, without needing to return the factors themselves.

Intuition for how this might work can be found in the cryptographic concept of the zero-knowledge proof, which allows a prover to demonstrate that they know a particular fact or value to the verifier, without actually revealing any further information about it. As an example, consider the following (classical) protocol by which a prover can demonstrate that they know the discrete logarithm of a value without revealing it. [CEv88] Suppose $g$, $y$, and $p$ are publicly known integers, such that $0 < g, y < p$, $p$ is a large prime, and $g$ is a generator for the multiplicative group of integers modulo $p$. The prover wants to demonstrate that they secretly hold a value $x$ such that $g^x \equiv y \pmod{p}$. They first choose a random integer $r$ such that $0 \leq r < p-1$, compute $C = g^r \bmod p$, and send $C$ to the verifier. The verifier now randomly chooses to ask for either the value $r$ or the value $(x + r) \bmod (p-1)$. Upon receipt, either can be easily checked: with knowledge of $r$, the verifier checks that $g^r \bmod p$ indeed is equal to $C$; with $(x + r) \bmod (p-1)$, the verifier checks that $g^{(x+r) \bmod (p-1)} \equiv C \cdot y \pmod{p}$. We observe two guarantees: first, the verifier gets no extra information from their receipt of only $r$ or only $(x + r) \bmod (p-1)$—in either case, the information about $x$ is perfectly statistically hidden by the randomness in $r$. Second, if the prover can consistently answer correctly over many repetitions of the above protocol, the verifier should be convinced that the prover knows both $r$ and $(x + r) \bmod (p-1)$ each time, and thus $x$! Crucially, the prover did not know which question would be asked until after making the commitment $C$. It’s also crucial that a new $r$ is chosen for each repetition, otherwise the verifier could easily extract $x$. This protocol is known specifically as a zero-knowledge interactive proof, because of the requirement that several messages be sent back and forth between the prover and verifier. It has a structure shared by many zero-knowledge interactive proofs: first, the prover makes a commitment, and then the verifier makes a query chosen at random from a set. Knowledge of the correct response to a single query is not sufficient to recover anything about the hidden information, but knowledge of the correct responses to all of the queries is sufficient to recover the hidden information in full. By repeating the protocol many times, the verifier becomes confident that the prover knows the answer to all the queries simultaneously, and therefore knows the secret.
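For concreteness, here is a toy simulation of that protocol in Python (the tiny parameters are purely illustrative and provide no security; a real instantiation would use a prime of thousands of bits):

import random

p = 23                  # a small prime, for illustration only
g = 5                   # a generator of the multiplicative group modulo 23
x = 7                   # the prover's secret
y = pow(g, x, p)        # publicly known: y = g^x mod p

def prover_commit():
    r = random.randrange(0, p - 1)
    return r, pow(g, r, p)               # keep r secret; send the commitment C = g^r mod p

def verifier_check(C, query, response):
    if query == "r":                     # check that g^r mod p equals the commitment
        return pow(g, response, p) == C
    else:                                # check that g^(x+r) mod p equals C * y mod p
        return pow(g, response, p) == (C * y) % p

# one round of the protocol
r, C = prover_commit()
query = random.choice(["r", "x+r"])
response = r if query == "r" else (x + r) % (p - 1)
print(verifier_check(C, query, response))   # -> True for an honest prover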

There is a really nice way that such a structure can be applied to the quantum setting. In the commitment phase, the prover commits to the claim that they hold a particular quantum state. Then, the verifier’s queries can correspond to different ways of measuring that quantum state. Intuition from quantum state tomography tells us that if “correct” [Footnote 8: Following the probability distribution of the committed state.] measurement results can be produced for arbitrary measurement bases, the state can be reconstructed. Thus by repeating the protocol many times to ensure the prover always responds correctly regardless of the measurement basis, the verifier can ensure that the prover does indeed hold (or at least have a description of) the state to which they commit. This intuition only goes so far, however, because we desire that the protocols remain convincing even in the adversarial setting, where the prover is not just noisy, but instead is actively trying to fool the verifier by producing false measurement results. In this setting it is necessary to show more than just that the prover’s data could reasonably have come from the committed quantum state. We also must ensure that there are no shortcuts by which a classical cheater could produce results that seem to follow the same distribution!

The first protocol with this structure was introduced to the literature in 2018, by Brakerski et al. [BCM+21] (although the general structure just described was not explicitly presented by the authors in that work). The authors construct a protocol by which the prover can use a cryptographic construction called a trapdoor claw-free function (TCF) to commit to holding a superposition of the form $(|x_0\rangle + |x_1\rangle)/\sqrt{2}$, where $x_0$ and $x_1$ are $n$-bit (classical) strings. They use a cryptographic problem called Learning with Errors (LWE) to construct a TCF with the necessary properties. Importantly, the protocol is set up such that the specific values $x_0$ and $x_1$ are computationally hard for a classical cheater to find (under the LWE assumption). But the classical verifier has access to some secret information (the trapdoor), with which the two values are easy to compute given the prover’s commitment. After the verifier has received the prover’s commitment, they move on to the “query” phase, in which the verifier asks the prover to make one of two simple measurements: either measure all of the qubits in the computational ($Z$) basis, collapsing the superposition and yielding $x_0$ or $x_1$, or measure them all in the Hadamard ($X$) basis, yielding a measurement result that depends on quantum interference between $x_0$ and $x_1$. The authors show that a prover that is able to answer both queries consistently would be able to use those answers to break the LWE assumption, thereby bounding the probability with which a classical cheater could pass the protocol and providing a proof of quantumness.

Unfortunately, setting the cryptographic parameters to values that make classical cheating hard causes the quantum circuits to be so large that implementation is quite far out of reach of near-term devices. [Footnote 9: No extensive analysis of the quantum resources needed for this protocol seems to have been published in the academic literature; however, a brief investigation done by myself together with Dr. Andru Gheorghiu (Chalmers University of Technology, Sweden) convinced me that it will not be feasible for some time.] For this reason, further studies explored whether cryptographic problems other than LWE could be used in the same, or a similar, protocol. The main challenge is that the protocol requires a very strong cryptographic assumption called the adaptive hardcore bit assumption, which roughly states that given one of the two bitstrings in the superposition (say, $x_0$ above), it is computationally hard to find even a single bit of information about the second bitstring ($x_1$). To my knowledge, it is not known how to build a TCF with such a strong cryptographic guarantee from anything other than LWE. [Footnote 10: There is one other proposal, based on isogeny-based group actions, that is reasonably conjectured to have the adaptive hardcore bit property. [AMR22] In any case, that construction does not yield any benefits over LWE in terms of practical efficiency.] Instead, studies have focused on modifying the protocol itself to relax the required cryptographic assumptions of the TCF.

The first paper to take this approach constructed a related protocol in the random oracle model, where both the prover and verifier have access to an oracle implementing a random function (the outputs of the function are perfectly random, but consistent when given the same input). [BKV+20] Intuitively, the randomness is used to "scramble" the values $x_0$ and $x_1$ before measurement, removing the extra structure a classical cheater could leverage (which is what created the need for the adaptive hardcore bit requirement). The challenge of using this protocol in practice, of course, is that random oracles do not exist in real life; they can be approximately implemented via the random oracle heuristic, which replaces the random oracle with a cryptographic hash function. (The validity of the random oracle heuristic has been explored at length; [CGH04, KM15] the broad consensus in the cryptographic community seems to be that, despite lacking theoretical backing, it is fine in practice.) Additionally, if the random oracle heuristic is applied, the protocol requires that the cryptographic hash function be evaluated coherently on a superposition of inputs, adding to the quantum circuit size. However, if one is willing to accept that, this protocol yields multiple benefits. First of all, as alluded to earlier, it reduces the cryptographic requirements of the TCF, removing the need for the adaptive hardcore bit property. Taking advantage of this fact, the authors provide a new TCF construction based on a computational problem called Ring-LWE, which is expected to be more efficient to implement than regular LWE. Additionally, the inclusion of the random oracle allows the protocol to use only a single round of messages between the prover and verifier—thus making it non-interactive, which could be useful in certain practical scenarios.
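To give a flavor of how the random oracle removes that extra structure (this is only my informal paraphrase; see [BKV+20] for the actual protocol and proof), the Hadamard-round check in this style of protocol becomes, roughly,

$$d \cdot (x_0 \oplus x_1) = H(x_0) \oplus H(x_1) \pmod 2,$$

where $H$ is the random oracle (taken here to have single-bit output). Intuitively, a classical cheater who could reliably satisfy this equation would have to query $H$ on, and hence effectively know, both preimages at once, which already contradicts the basic claw-freeness of the TCF; no adaptive hardcore bit is required.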

In Chapter 5 of this dissertation, we construct a protocol which removes the need for the adaptive hardcore bit property in the standard model of cryptography—that is, without the need for random oracles. This both makes a stronger demonstration of a fundamental difference in quantum versus classical computational power, and also removes the need to evaluate a cryptographic hash function coherently in addition to the TCF. In that chapter we also introduce two new TCF constructions, which are based on the hardness of factoring and of the decisional Diffie-Hellman (DDH) problem, respectively. The factoring-based construction requires the quantum prover to compute only a single modular squaring, $x^2 \bmod N$, coherently, as opposed to the full modular exponentiation required by Shor's algorithm, leading to a dramatic reduction in the cost of implementation—yet the hardness of classically cheating remains the same. With the efficient circuit constructions that we describe in Chapter 7, we believe that this protocol is the closest yet to being implementable on real quantum devices, potentially within the next few years. As a first step towards this goal, in Chapter 6 we present a proof-of-concept experiment, in which the protocol is implemented at a small size on a trapped-ion quantum computer.
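To illustrate the classical structure underlying the factoring-based construction, the snippet below is a toy sketch only: the parameters are tiny, the quantum half of the protocol is omitted, and the domain restrictions needed to make the function exactly 2-to-1 are ignored (see Chapter 5 for the real construction). It shows how the trapdoor, the factorization of $N$, lets the verifier recover the preimages of $x^2 \bmod N$, and why finding a claw without the trapdoor would amount to factoring $N$.

    # Toy illustration of the claw structure of f(x) = x^2 mod N, with N = p*q.
    # Sketch only: absurdly small parameters, and none of the technical
    # conditions of the actual protocol in Chapter 5.
    from math import gcd

    # Trapdoor: the factorization of N. Choosing p, q = 3 (mod 4) makes square
    # roots easy to compute when the factors are known.
    p, q = 7, 11
    N = p * q

    def f(x):
        # The (toy) trapdoor claw-free function: squaring modulo N.
        return (x * x) % N

    def sqrt_mod_prime(a, r):
        # Square root modulo a prime r = 3 (mod 4), via a^((r+1)/4) mod r.
        return pow(a, (r + 1) // 4, r)

    def preimages(y):
        # Use the trapdoor (p, q) to recover all square roots of y modulo N,
        # by combining the square roots mod p and mod q with the CRT.
        rp, rq = sqrt_mod_prime(y % p, p), sqrt_mod_prime(y % q, q)
        roots = set()
        for sp in (rp, p - rp):
            for sq in (rq, q - rq):
                x = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % N
                roots.add(x)
        return sorted(roots)

    # Two "essentially different" preimages x0, x1 (i.e. x1 not equal to x0 or
    # N - x0) of the same image reveal a nontrivial factor of N via a gcd --
    # which is why finding such a claw without the trapdoor is believed to be
    # classically hard.
    x0 = 2
    for x1 in preimages(f(x0)):
        if x1 not in (x0, N - x0):
            print("claw:", x0, x1, "-> factor of N:", gcd(x1 - x0, N))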

Before concluding this section, I would like to describe two more papers which have taken new approaches to demonstrating quantum computational advantage. Neither seems to have any hope of being implemented on near-term devices, but both make new progress in showing what types of cryptography can be used to build these protocols—and hopefully, they can lead to new constructions that are indeed cheaper to implement. The first is a protocol by Yamakawa and Zhandry, which operates in the random oracle model but requires no TCF at all—the random oracle is the only cryptographic tool required! [YZ22] It introduces a clever construction called "quantum state multiplication", in which two quantum states $|\psi\rangle$ and $|\phi\rangle$ are combined into a single "product" state $|\psi \cdot \phi\rangle$ (an operation that is usually impossible, but is made possible by the specific setup in the protocol). The second is a protocol by Kalai et al., which constructs a compiler that takes multi-party interactive proofs (that is, those with multiple provers working together to try to convince the classical verifier of a fact) and turns them into single-prover protocols. [KLV+22]

Moving past proofs of quantumness

While it seems to be an important milestone on the path towards full-scale quantum computing, demonstrating quantum computational advantage will not forever remain a particularly useful task. Fortunately, the protocols discussed in the preceding section represent just a subset of quantum interactive (or sometimes non-interactive) protocols, which in general can achieve much more than simply demonstrating quantum computational power. In fact, the first protocol described above, based on the LWE problem, was already designed with further applications in mind. Not only was it presented as a test of quantumness, but it also provides a way to use an untrusted quantum device to generate random numbers that are certifiably quantum—that is, true randomness! In a pair of related papers, Mahadev showed that similar constructions could be used to implement classical homomorphic encryption for quantum circuits [MAH20] and even the classical verification of arbitrary quantum computations [MAH18]. Later papers also showed that this type of protocol could be used for other tasks, such as verifiable remote state preparation. [GV19] Finally, it has also been shown (in a paper I co-authored with several colleagues) that some of the quantum advantage protocols described in the previous section can be used unchanged to prove certain facts about the inner workings of a quantum device, leading to implications such as certifiable quantum random number generation directly from those protocols. [BGK+23]