Our argument on the quantum-classical crossover is based on the runtime analysis needed to compute the ground state energy within a desired total energy accuracy, denoted as ϵ. The primary objective of this section is to provide a framework that elucidates the quantum-classical crosspoint for systems whose spectral gap is constant or polynomially shrinking. In this work, we choose two models that are widely known for their profound importance despite their simplicity: the 2d J1-J2 Heisenberg model and the 2d Fermi-Hubbard model on a square lattice (see the Method section for their definitions). Meanwhile, it is entirely unclear whether a feasible crosspoint exists at all when the gap closes exponentially.
It is important to keep in mind that condensed matter physics often entails extracting physical properties beyond merely the energy, such as the magnetization, correlation functions, or dynamical responses. Therefore, in order to assure that expectation value estimations can be done consistently (i.e., satisfy N-representability), we demand the option to measure physical observables after the computation of the ground state energy is done. In other words, in the case of the classical algorithm, we perform the variational optimization up to the desired target accuracy ϵ; we exclude the case where one calculates less precise quantum states with energy errors ϵi (> ϵ) and subsequently performs an extrapolation. A similar requirement is imposed on the quantum algorithm as well.
Among the numerous powerful classical methods available, we have opted to utilize the DMRG algorithm, which has been established as one of the most powerful and reliable numerical tools to study strongly-correlated quantum lattice models, especially in one dimension (1d)18,19. In brief, the DMRG algorithm performs variational optimization on a tensor-network-based ansatz called the Matrix Product State (MPS)20,21. Although the MPS is designed to efficiently capture 1d area-law entangled quantum states22, the efficacy of the DMRG algorithm allows one to explore quantum many-body physics beyond 1d, including quasi-1d and 2d systems, and even all-to-all connected models, as considered in quantum chemistry23,24.
A remarkable characteristic of the DMRG algorithm is its ability to perform systematic error analysis. This is intrinsically connected to the construction of the ansatz, or the MPS, which compresses the quantum state by performing site-by-site truncation of the full Hilbert space. The compression process explicitly yields a metric called the "truncation error," from which we can extrapolate the truncation-free energy E0 to estimate the ground truth. By tracking the deviation δE = E − E0 from the zero-truncation result, we find that the computation time and error typically obey a scaling law (see Fig. 2 for an example of such scaling behavior in the 2d J1-J2 Heisenberg model). The resource estimate is completed by combining the actual simulation results with the estimation from the scaling law (see Supplementary Note 2 for detailed analysis).
Although the simulation itself does not reach ϵ = 0.01, the learning curves for different bond dimensions ranging from D = 600 to D = 3000 collapse onto a single curve, which implies that the obtained scaling law is adequate for estimating the runtime. All DMRG simulations are executed using the ITensor library61.
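To make the extrapolation step concrete, here is a minimal sketch of fitting the power-law scaling between runtime and energy error on a log-log scale and extrapolating to ϵ = 0.01. The (error, runtime) pairs are hypothetical placeholders, not the DMRG data of this work.

```python
# Minimal sketch: extrapolating DMRG runtime to a target accuracy via a
# power-law fit t(eps) ~ a * eps^(-b).  Data values are hypothetical
# placeholders, NOT the paper's measurements.
import numpy as np

# Hypothetical (energy error, wall time in seconds) from runs at increasing
# bond dimension D; the error is the deviation dE = E - E0 from the
# zero-truncation extrapolation E0.
errors = np.array([0.32, 0.16, 0.08, 0.04, 0.02])
times = np.array([1.2e2, 4.8e2, 1.9e3, 7.7e3, 3.1e4])

# Linear fit in log-log space: log t = intercept + slope * log eps.
slope, intercept = np.polyfit(np.log(errors), np.log(times), 1)
a, b = np.exp(intercept), -slope   # slope is negative; report positive exponent

target = 0.01
print(f"fitted scaling: t ~ {a:.3g} * eps^(-{b:.2f})")
print(f"extrapolated runtime at eps = {target}: {a * target**(-b):.3g} s")
```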
We remark that it is judicious to select the DMRG algorithm for 2d models, even though the number of parameters in the MPS is formally expected to increase exponentially with system size N, owing to its intrinsic 1d-oriented structure. Indeed, one may consider other tensor network states that are designed for 2d systems, such as the Projected Entangled Pair States (PEPS)25,26. When one uses PEPS, the bond dimension is anticipated to scale as D = O(log(N)) for gapped or gapless non-critical systems and D = O(poly(N)) for critical systems27,28,29 to represent the ground state with a fixed total energy accuracy of ϵ = O(1) (it is important to note that the former would be D = O(1) if one considers a fixed energy density instead). Therefore, in the asymptotic limit, the scaling of the number of parameters of PEPS is exponentially better than that of the MPS. Nonetheless, in actual calculations the overhead involved in simulating the ground state with PEPS is substantially high, to the extent that there are practically no scenarios where the runtime of the variational PEPS algorithm outperforms that of the DMRG algorithm for our target models.
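As a rough illustration of this asymptotic argument, the sketch below compares naive parameter counts of MPS and PEPS under the bond-dimension scalings quoted above. The growth prefactors are illustrative assumptions only.

```python
# Rough parameter-count comparison for an N-site square lattice with local
# dimension d = 2.  An MPS tensor holds d*D^2 entries; a bulk PEPS tensor on
# a square lattice holds d*D^4.  Following the text, D_MPS is assumed to grow
# exponentially with the cylinder width sqrt(N), while D_PEPS = O(log N) for
# non-critical 2d states; the prefactors here are assumptions.
import numpy as np

for L in [4, 6, 8, 10, 12]:
    N = L * L
    D_mps = 2 ** L                        # assumed exp(width) growth
    D_peps = max(2, int(np.log2(N)))      # assumed O(log N) growth
    params_mps = N * 2 * D_mps ** 2
    params_peps = N * 2 * D_peps ** 4
    print(f"L={L:2d}  MPS ~ {params_mps:.2e} params,  PEPS ~ {params_peps:.2e} params")
```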
Quantum phase estimation (QPE) is a quantum algorithm designed to extract the eigenphase of a given unitary U by utilizing ancilla qubits to indirectly read out the complex phase of the target system. More concretely, given a trial state |ψ⟩ whose fidelity with the k-th eigenstate |k⟩ of the unitary is given as fk = |⟨k|ψ⟩|², a single run of QPE projects the state onto |k⟩ with probability fk and yields a random variable ϕ̂ which corresponds to an m-digit readout of the eigenphase ϕk.
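The following toy numerical model reflects this description: the trial state collapses onto eigenstate |k⟩ with probability fk, and the m ancilla qubits then return an m-bit phase estimate distributed according to the standard QPE readout kernel for an exact eigenstate. The fidelities and phases below are made-up inputs, not quantities from this work.

```python
# Toy model of a single QPE run (numpy only).
import numpy as np

rng = np.random.default_rng(0)

def qpe_single_run(fidelities, phases, m):
    """One QPE shot: collapse onto |k> w.p. f_k, then read m ancilla bits."""
    k = rng.choice(len(fidelities), p=fidelities)   # projection onto |k>
    M = 2 ** m
    grid = np.arange(M)
    # Ancilla outcome amplitudes for an exact eigenstate with phase phases[k].
    amps = np.array([np.exp(2j * np.pi * grid * (phases[k] - x / M)).sum() / M
                     for x in range(M)])
    probs = np.abs(amps) ** 2
    x = rng.choice(M, p=probs / probs.sum())
    return k, x / M        # collapsed eigenstate and m-digit phase estimate

f = np.array([0.7, 0.3])        # made-up fidelities of the trial state
phi = np.array([0.137, 0.548])  # made-up eigenphases of U
print(qpe_single_run(f, phi, m=8))
```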
It was originally proposed in ref. 30 that eigenenergies of a given Hamiltonian can be computed efficiently via QPE by taking advantage of quantum computers to perform Hamiltonian simulation, e.g., U = exp(−iHτ). To elucidate this concept, it is beneficial to express the gate complexity of the QPE algorithm, shown schematically in Fig. 3, as
$$C \sim C_{\mathrm{SP}} + C_{\mathrm{HS}} + C_{\mathrm{QFT}^{\dagger}},$$
(1)
where we have defined CSP as the cost for state preparation, CHS as the cost for the controlled Hamiltonian simulation, and C_QFT† as the cost for the inverse quantum Fourier transformation (see Supplementary Note 4 for details). The third term C_QFT† is expected to be the least problematic, with C_QFT† = O(log(N)), while the second term is typically evaluated as CHS = O(poly(N)) when the Hamiltonian is, for instance, sparse, local, or constituted from polynomially many Pauli terms. Conversely, the scaling of the first term, CSP, is markedly nontrivial. In fact, ground state preparation for a local Hamiltonian generally necessitates exponential cost, which is related to the fact that computing the ground state energy of a local Hamiltonian is QMA-complete31,32.
Here, the ancilla qubits are projected onto |x1⋯xm⟩, which gives an m-digit readout of the eigenphase associated with the ground state |ψGS⟩ of an N-qubit system.
Although the aforementioned argument may seem rather formidable, it is important to note that QMA-completeness pertains to the worst-case scenario. Meanwhile, the average-case hardness for translationally invariant lattice Hamiltonians remains an open problem, and furthermore we have no means to predict the complexity of specific problem instances. In this context, it is widely believed that a significant number of ground states of substantial interest in condensed matter problems can be readily prepared at polynomial cost33. In this work, we take a further step and argue that the state preparation cost can be considered negligible, i.e., CSP ≪ CHS, for our specific target models, namely the gapless spin liquid state in the J1-J2 Heisenberg model and the antiferromagnetic state in the Fermi-Hubbard model. Our argument is based on numerical findings combined with upper bounds on the complexity, while we leave the theoretical derivation of the scaling (e.g., Eq. (4)) as an open problem.
For concreteness, we focus on the scheme of Adiabatic State Preparation (ASP) as a deterministic method to prepare the ground state through a time evolution of duration tASP. We introduce a time-dependent interpolating function s(t): ℝ ↦ [0, 1] with s(0) = 0 and s(tASP) = 1, such that the ground state is prepared via the time-dependent Schrödinger equation
$$i\frac{\partial}{\partial t}\vert\psi(t)\rangle = H(t)\vert\psi(t)\rangle,$$
(2)
where H(t) = H(s(t)) = sHf + (1 − s)H0 for the target Hamiltonian Hf and the initial Hamiltonian H0. We assume that the ground state of H0 can be prepared efficiently, and take it as the initial state of the ASP. Early studies suggested that a sufficient (but not necessary) condition for preparing the target ground state scales as tASP = O(1/(ϵfΔ³))34,35,36, where ϵf = 1 − |⟨ψGS|ψ(tASP)⟩|² is the target infidelity and Δ is the spectral gap. This has been refined in recent research as
$$t_{\mathrm{ASP}}=\begin{cases}O\!\left(\dfrac{1}{\epsilon_f^{2}\Delta^{2}}\,|\log(\Delta)|^{\zeta}\right) & (\zeta>1)\\[2mm] O\!\left(\dfrac{1}{\Delta^{3}}\log(1/\epsilon_f)\right) & \end{cases}$$
(3)
The two conditions independently achieve optimality with respect to Δ and ϵf. Evidently, the ASP algorithm can prepare the ground state efficiently if the spectral gap is constant or only polynomially small, Δ = O(1/poly(N)).
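To make the ASP procedure concrete, here is a minimal integration of Eq. (2) on a toy two-qubit problem with a linear ramp s(t) = t/tASP. The Hamiltonians are stand-ins rather than the paper's models, and serve only to show how the final infidelity decreases with tASP.

```python
# Minimal ASP sketch (scipy): evolve under H(t) = s H_f + (1-s) H_0 with a
# linear ramp s = t/T, then check the infidelity against the true ground
# state.  Toy two-qubit Hamiltonians, NOT the paper's J1-J2 or Hubbard models.
import numpy as np
from scipy.integrate import solve_ivp

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

H0 = -(np.kron(X, I) + np.kron(I, X))               # ground state |+>|+>
Hf = np.kron(Z, Z) - 0.5 * (np.kron(X, I) + np.kron(I, X))  # toy target

psi0 = np.linalg.eigh(H0)[1][:, 0]   # initial state: ground state of H0
gs = np.linalg.eigh(Hf)[1][:, 0]     # target: ground state of Hf

def infidelity(T):
    rhs = lambda t, psi: -1j * ((t / T) * Hf + (1 - t / T) * H0) @ psi
    sol = solve_ivp(rhs, (0.0, T), psi0, rtol=1e-8, atol=1e-10)
    return 1 - abs(np.vdot(gs, sol.y[:, -1])) ** 2

for T in [1, 5, 20, 80]:
    print(f"t_ASP = {T:3d}  ->  infidelity = {infidelity(T):.3e}")
```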
For both of our target models, numerous works suggest that Δ ∝ N^{−1/2} (refs. 37,38,39), which is one of the most typical scalings in 2d gapless/critical systems, such as spontaneously symmetry-broken phases with Goldstone modes and critical phenomena described by 2d conformal field theory. With the polynomial scaling of Δ granted, we now ask what the scaling of CSP is, and how it compares to the other constituents, namely CHS and C_QFT†.
In order to estimate the actual cost, we have numerically calculated the tASP required to achieve the target fidelity (see Supplementary Note 3 for details) for systems of up to 48 qubits. With the aim of providing a quantitative way to estimate the scaling of tASP at larger sizes, we reasonably consider the combination of the upper bounds provided in Eq. (3) as
$$t_{\mathrm{ASP}} = O\!\left(\frac{1}{\Delta^{\beta}}\,\log(1/\epsilon_f)\right).$$
(4)
Figure 4a, b illustrates the scaling of tASP with respect to ϵf and Δ, respectively. Remarkably, we find that Eq. (4) with β = 1.5 gives an accurate prediction for the 2d J1-J2 Heisenberg model. This implies that the ASP time scales as tASP = O(N^{β/2} log(1/ϵf)), which yields a gate complexity of O(N^{1+β/2} polylog(N/ϵf)) under optimal simulation of time-dependent Hamiltonians40,41. Thus, CSP proves to be subdominant compared to CHS if β < 2, as suggested by our simulations. Furthermore, under the assumption of Eq. (4), we estimate tASP to be at most a few tens for a practical system size of N ~ 100 at infidelity ϵf ~ 0.1. This is fairly negligible compared to the controlled Hamiltonian simulation, which requires the dynamics duration to be on the order of tens of thousands in our target models. (One must note that there is a slight difference between the two schemes: the time-dependent Hamiltonian simulation involves quantum signal processing using the block-encoding of H(t), while the qubitization for the phase estimation only requires the block-encoding itself. This implies that the T-count of the former incurs an overhead, as seen in the Taylorization technique. However, we confirm that this overhead, determined by the polynomial degree in the quantum signal processing, is on the order of tens41, so that the T-count required for state preparation is still suppressed by orders of magnitude compared to the qubitization.) This outcome stems from the fact that the controlled Hamiltonian simulation for the purpose of eigenenergy extraction obeys the Heisenberg limit CHS = O(1/ϵ), a consequence of the time-energy uncertainty relation. This is in contrast to the state preparation, which is not tied to any quantum measurement, so no such polynomial lower bound exists.
(a) Scaling with the target infidelity ϵf for a 4×4 lattice. The interpolation function is taken such that its derivatives up to κ-th order vanish at t = 0, tASP. Here we consider linear interpolation for κ = 0, and for smoother ramps we take 𝒮κ and ℬκ, defined from sinusoidal and incomplete Beta functions, respectively (see Supplementary Note 3). While smoothness at higher κ ensures logarithmic scaling down to smaller ϵf, for the current target model we find it sufficient to take an s(t) whose derivatives vanish up to κ = 2 at t = 0, tASP. (b) Scaling with the spectral gap Δ. Here we perform the ASP using the MPS state for systems of size Lx × Ly, where results for Lx = 2, 4, 6 are shown as cyan, blue, and green data points. We find that the scaling exhibits tASP ∝ 1/Δ^β with β ≈ 1.5.
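Extracting β as in panel (b) amounts to a power-law fit of tASP against Δ. A short sketch with hypothetical data points:

```python
# Sketch of extracting beta in t_ASP ~ c / Delta^beta from (gap, t_ASP)
# pairs.  Data values are hypothetical placeholders, not the paper's.
import numpy as np
from scipy.optimize import curve_fit

gaps = np.array([0.9, 0.6, 0.4, 0.25, 0.15])
t_asp = np.array([12.0, 22.5, 41.0, 83.0, 180.0])

model = lambda delta, c, beta: c / delta ** beta
(c, beta), _ = curve_fit(model, gaps, t_asp, p0=(10.0, 1.5))
print(f"fitted t_ASP ~ {c:.2f} / Delta^{beta:.2f}")
```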
As we have seen in the previous sections, the dominant contribution to the quantum resources is CHS, namely the controlled Hamiltonian simulation from which the eigenenergy phase is extracted into the ancilla qubits. Fortunately, with the scope of performing quantum resource estimation for QPE and digital quantum simulation, numerous works have analyzed the error scaling of various Hamiltonian simulation techniques, in particular the Trotter-based methods42,43,44. Nevertheless, we point out that crucial questions remain open: (A) which technique is the best practice to achieve the earliest quantum advantage for condensed matter problems, and (B) at which point does the crossover occur?
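As a self-contained illustration of the kind of error analysis these works build on, the following check shows that for a toy two-qubit H = A + B the second-order (Strang) Trotter error shrinks as 1/r² with the number of Trotter steps r. This is a generic demonstration, not the error model compiled in this work.

```python
# Numerical check of 2nd-order Trotter error: the splitting
# (e^{-iA t/2r} e^{-iB t/r} e^{-iA t/2r})^r approaches e^{-iHt} as 1/r^2.
import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

A = np.kron(Z, Z)                       # toy non-commuting pieces
B = np.kron(X, I) + np.kron(I, X)
H, t = A + B, 1.0
U_exact = expm(-1j * H * t)

for r in [1, 2, 4, 8, 16]:
    step = expm(-1j*A*t/(2*r)) @ expm(-1j*B*t/r) @ expm(-1j*A*t/(2*r))
    err = np.linalg.norm(U_exact - np.linalg.matrix_power(step, r), 2)
    print(f"r={r:2d}  ||U - U_trotter|| = {err:.3e}")
```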
Here we perform resource estimation under the following common assumptions: (1) logical qubits are encoded using the formalism of surface codes45; (2) quantum gate implementation is based on the Clifford+T formalism. We first address question (A) by comparing the total number of T-gates, or T-count, across various Hamiltonian simulation algorithms, since the application of a T-gate involves a time-consuming procedure known as magic-state distillation. Although not necessarily the case, this procedure is considered to dominate the runtime in many realistic setups. Therefore, we argue that the T-count provides sufficient information to determine the best Hamiltonian simulation technique. Then, with the aim of addressing question (B), we further perform a high-resolution analysis of the runtime. In particular, we consider concrete quantum circuit compilation with a specific physical/logical qubit configuration compatible with the surface code implemented on a square lattice.
Let us first compute the T-counts to compare the state-of-the-art Hamiltonian simulation techniques: the (randomized) Trotter product formula46,47, qDRIFT44, Taylorization48,49,50, and qubitization40. The former two rely on the Trotter decomposition to approximate the unitary time evolution by a sequential application of (controlled) Pauli rotations, while the latter two, dubbed "post-Trotter methods," are based on a technique called block-encoding, which utilizes ancillary qubits to encode desired (non-unitary) operations on the target system (see Supplementary Note 5). While post-Trotter methods are known to be exponentially more efficient in terms of gate complexity with respect to the simulation accuracy48, it is nontrivial which method is the best practice in the crossover regime, where the prefactor plays a significant role.
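The block-encoding at the heart of the post-Trotter methods can be demonstrated in a few lines of numpy: for H = Σᵢ αᵢ Pᵢ, the textbook PREPARE/SELECT construction places H/λ, with λ = Σᵢ αᵢ, in the top-left block of a unitary. This is a generic sketch, not the specific circuits compiled in this work.

```python
# LCU block-encoding sketch: verify that (PREP^dag (x) I) SELECT (PREP (x) I)
# holds H/lambda in its top-left block, for a toy 2-qubit H with 4 Pauli terms.
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

alphas = np.array([1.0, 1.0, 1.0, 0.5])          # LCU coefficients (toy)
paulis = [np.kron(X, X), np.kron(Y, Y), np.kron(Z, Z), np.kron(Z, I2)]
lam = alphas.sum()

# PREPARE: any unitary whose first column is sqrt(alpha_i / lambda).
M = np.column_stack([np.sqrt(alphas / lam), np.eye(4)[:, 1:]])
prep, _ = np.linalg.qr(M)
if prep[0, 0] < 0:          # fix the overall sign ambiguity of QR
    prep = -prep

# SELECT = sum_i |i><i| (x) P_i  (block-diagonal in the ancilla index).
select = np.zeros((16, 16), dtype=complex)
for i, P in enumerate(paulis):
    proj = np.zeros((4, 4)); proj[i, i] = 1.0
    select += np.kron(proj, P)

U = np.kron(prep.conj().T, np.eye(4)) @ select @ np.kron(prep, np.eye(4))
H = sum(a * P for a, P in zip(alphas, paulis))
print("block-encoding error:", np.abs(lam * U[:4, :4] - H).max())
```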
We have compiled quantum circuits based on existing error analysis to reveal the required T-counts (see Supplementary Notes 4, 6, and 7). From the results presented in Table 1, we find that the qubitization algorithm provides the most efficient implementation for reaching the target energy accuracy ϵ = 0.01. Although the post-Trotter methods, i.e., the Taylorization and qubitization algorithms, require O(log(N)) additional ancillary qubits to perform the block-encoding, we regard this overhead as no serious roadblock, since the target system itself and the quantum Fourier transformation require O(N) and O(log(N/ϵ)) qubits, respectively. In fact, as we show in Fig. 5, the qubitization algorithms are efficient in the near-crosspoint regime in terms of physical qubit count as well, due to the suppressed code distance (see Supplementary Note 9 for details).
Here, we estimate the ground state energy up to target accuracy ϵ = 0.01 for the 2d J1-J2 Heisenberg model (J2/J1 = 0.5) and the 2d Fermi-Hubbard model (U/t = 4), both with lattice size 10×10. The blue, orange, green, and red points indicate the results that employ qDRIFT, 2nd-order random Trotter, Taylorization, and qubitization, where the circle and star markers denote the spin and fermionic models, respectively. Two flavors of the qubitization, the sequential and the newly proposed product-wise construction (see Supplementary Note 5 for details), are discriminated by filled and unfilled markers. Note that Nph here does not account for the magic state factories, which are incorporated in Fig. 7.
We also mention that, for the 2d Fermi-Hubbard model, there exist specialized Trotter-based methods that improve the performance significantly16,17. For instance, the T-count of QPE based on the state-of-the-art PLAQ method proposed in ref. 17 can be estimated to be approximately 4×10⁸ for the 10×10 system under ϵ = 0.01, which is slightly higher than the T-count required for the qubitization technique. Since the scaling of PLAQ is similar to that of the 2nd-order Trotter method, we expect that qubitization remains the best choice for all system sizes N.
The above results motivate us to study the quantum-classical crossover entirely using the qubitization technique as the subroutine of the QPE. As detailed in Supplementary Note 8, our runtime analysis involves the following steps:
(I) Hardware configuration. Determine the architecture of the quantum computer (e.g., number of magic state factories, qubit connectivity, etc.).
(II) Circuit synthesis and transpilation. Translate the high-level description of the quantum circuit into the Clifford+T formalism at the provided optimization level.
(III) Compilation to executable instructions. Decompose logical gates into a sequence of executable instruction sets based on lattice surgery.
It should be noted that ordinary runtime estimation involves only step (II), simply multiplying the T-count by the execution time of a T-gate, as NT·tT. However, we emphasize that this estimation method misses several vital factors in the time analysis, which may eventually lead to deviations of one or two orders of magnitude. In sharp contrast, our runtime analysis comprehensively takes all steps into account to yield reliable estimates for realistic quantum computing platforms.
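For orientation, the step-(II)-only estimate is a one-line calculation; the T-count below is an illustrative placeholder, while the 1 kHz and 1 MHz rates are the reference values used for the dotted lines in Fig. 6.

```python
# Naive step-(II)-only runtime estimate: t = N_T * t_T.
n_T = 4e8                      # assumed T-count of one QPE run (placeholder)
for rate_hz, label in [(1e3, "1 kHz"), (1e6, "1 MHz")]:
    print(f"T-gate rate {label}: runtime ~ {n_T / rate_hz:.2e} s")
```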
Figure 6 shows the runtime of the classical/quantum algorithms computing the ground state energy of the 2d J1-J2 Heisenberg model and the 2d Fermi-Hubbard model. In both figures, we observe clear evidence of a quantum-classical crosspoint below a hundred-qubit system (at lattice sizes of 10×10 and 6×6, respectively) within plausible runtimes. Furthermore, a significant difference from ab initio quantum chemistry calculations is highlighted by the feasibility of simulations at system size N ~ 1000 logical qubits, especially for the 2d Heisenberg model, which utilizes the parallelization technique for the oracles (see Supplementary Note 8 for details).
Here we show the results for (a) the 2d J1-J2 Heisenberg model at J2/J1 = 0.5 and (b) the 2d Fermi-Hubbard model at U/t = 4. The blue and red circles are the runtime estimates for the quantum phase estimation using the qubitization technique as a subroutine, whose analysis involves quantum circuit compilation through all the steps (I), (II), and (III). All gates are compiled in the Clifford+T formalism, with each logical qubit encoded by the surface code with code distance d around 17 to 25, assuming a physical error rate of p = 10⁻³ (see Supplementary Note 9). Here, the number of magic state factories nF and the number of parallelization threads nth are taken as (nF, nth) = (1, 1) and (16, 16) for "Single" and "Parallel," respectively. The dotted and dash-dotted lines are estimates that only involve the analysis of step (II); the calculation is based solely on the T-count of the algorithm with realistic T-gate consumption rates of 1 kHz and 1 MHz, respectively. The green stars and purple triangles are data obtained from actual simulations with the classical DMRG and variational PEPS algorithms, respectively, with the shaded region denoting the potential room for improvement by using the most advanced computational resources (see Supplementary Note 2). Note that the system size is related to the lattice size M×M as N = 2M² in the Fermi-Hubbard model.
For concreteness, let us focus on the simulation of systems with lattice size 10×10, where we find the quantum algorithm to outperform the classical one. Using the error scaling, we estimate that the DMRG simulation takes about 10⁵ and 10⁹ seconds for the 2d Heisenberg and 2d Fermi-Hubbard models, respectively. On the other hand, the estimate based on the dedicated quantum circuit compilation with the most pessimistic equipment (denoted as "Single" in Fig. 6) achieves runtimes below 10⁵ seconds in both models. This improves further by an order of magnitude when we assume more abundant quantum resources. Concretely, using a quantum computer with multiple magic state factories (nF = 16) that performs multi-thread execution of the qubitization algorithm (nth = 16), the quantum advantage can be achieved within a computational time frame of several hours. We find it informative to also display the usual T-count-based estimates; it is indeed reasonable to assume a clock rate of 1-10 kHz for single-thread execution, while its precise value fluctuates depending on the problem instance.
We note that the classical algorithm (DMRG) experiences an exponential increase in runtime to reach the desired total energy accuracy ϵ = 0.01. This outcome is somewhat expected, since one must force the MPS to represent 2d quantum correlations in 1d via a cylindrical boundary condition38,51. Meanwhile, its prefactor is significantly lower than that of other tensor-network-based methods, enabling its practical use in discussing the quantum-classical crossover. For instance, although the formal scaling is exponentially better for the variational PEPS algorithm, its runtime on the 2d J1-J2 Heisenberg model already exceeds 10⁴ seconds for the 6×6 model, while the DMRG algorithm consumes only 10² seconds (see Fig. 6a). Even if we assume that the bond dimension of the PEPS can be kept constant for larger N, the crossover between DMRG and variational PEPS occurs only above size 12×12. As we have discussed previously, we reasonably expect D = O(log(N)) for simulations at fixed total accuracy, and furthermore expect that the number of variational optimization steps also scales polynomially with N. This implies that the scaling is much worse than O(N); in fact, we have used a constant value of D for L = 4, 6, 8 and observe that the scaling is already worse than cubic in our setup. Given such scaling, we conclude that DMRG is better suited than variational PEPS for investigating the quantum-classical crossover, and also that a quantum algorithm with quadratic scaling in N runs faster in the asymptotic limit.
It is informative to vary the hardware/algorithmic requirements to explore how the quantum-classical crosspoint shifts. For instance, the code distance of the surface code depends on p and ϵ as (see Supplementary Note 9)
$$d=O\!\left(\frac{\log(N/\epsilon)}{\log(1/p)}\right).$$
(5)
Note that this also affects the number of physical qubits, via the number of physical qubits per logical qubit, 2d². We visualize the above relationship explicitly in Fig. 7, which considers the near-crosspoint regime of the 2d J1-J2 Heisenberg model and the 2d Fermi-Hubbard model. It can be seen from Fig. 7a, b, d, e that an improvement of the error rate directly triggers a reduction of the required code distance, which results in a significant suppression of the number of physical qubits. This is even better captured by Fig. 7c, f. By achieving a physical error rate of p = 10⁻⁴ or 10⁻⁵, for instance, one may realize a 4-fold or 10-fold reduction in the number of physical qubits.
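The trend of Eq. (5) can be evaluated directly; the prefactor c below is a hypothetical constant chosen only so that p = 10⁻³ lands in the d = 17-25 range quoted above, and the exact reductions in Fig. 7 also depend on factors this sketch omits.

```python
# Evaluating the code-distance trend of Eq. (5), d ~ c*log(N/eps)/log(1/p),
# with an assumed prefactor c; physical qubits per logical qubit ~ 2d^2.
import numpy as np

def code_distance(N, eps, p, c=12.0):
    """Eq. (5) with a hypothetical prefactor c (illustration only)."""
    d = int(np.ceil(c * np.log(N / eps) / np.log(1.0 / p)))
    return d + 1 if d % 2 == 0 else d   # surface-code distances are odd

N, eps = 200, 0.01                       # roughly 10x10-Hubbard-scale problem
for p in [1e-3, 1e-4, 1e-5]:
    d = code_distance(N, eps, p)
    print(f"p = {p:.0e}:  d = {d:2d},  physical/logical ~ 2d^2 = {2 * d * d}")
```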
The panels denote (a) the code distance d and (b) the number of physical qubits Nph required to simulate the ground state of the 2d J1-J2 Heisenberg model with lattice size 10×10 and J2 = 0.5. Here, the qubit plane is assumed to be organized as (nF, #thread) = (1, 1). The setup used in the main text, ϵ = 0.01 and p = 10⁻³, is indicated by the orange stars. (c) Focused plot at ϵ = 0.01. Blue and red points show the results for the code distance d and Nph, respectively, where the filled and empty markers correspond to floor plans with (nF, #thread) = (1, 1) and (16, 16), respectively. (d-f) Plots for the 2d Fermi-Hubbard model of lattice size 6×6 with U = 4, corresponding to (a-c) for the Heisenberg model.
The logarithmic dependence on ϵ in Eq. (5) implies that the target accuracy does not significantly affect the qubit counts; it rather affects the runtime, since the total runtime scaling is given as
$$t=O\!\left(\frac{N^{2}\log(N/\epsilon)}{\epsilon\,\log(1/p)}\right),$$
(6)
which shows a polynomial dependence on ϵ. Note that this scaling is based on multiplying a factor of d into the gate complexity, since we assumed that the runtime is dominated by magic state generation, whose time is proportional to the code distance d, rather than by the classical postprocessing (see Supplementary Notes 8 and 9). As highlighted in Fig. 8, we observe that in the regime of larger ϵ, the computation is completed within minutes. However, we do not regard such a regime as an optimal field for quantum advantage. The runtime of classical algorithms typically shows a higher-power dependence on ϵ, denoted as O(1/ϵ^γ), with γ ~ 2 for the J1-J2 Heisenberg model and γ ~ 4 for the Fermi-Hubbard model (see Supplementary Note 2), which implies that classical algorithms are likely to run even faster than quantum algorithms at large values of ϵ. We thus argue that the setup of ϵ = 0.01 provides a platform that is both feasible for the quantum algorithm and challenging for the classical algorithm.
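This trade-off in ϵ can be caricatured by plugging hypothetical prefactors into Eq. (6) and into the classical scaling O(1/ϵ^γ). All constants below are assumptions chosen only to reproduce the qualitative picture of a crossover near ϵ ~ 0.01, not fits to the paper's data.

```python
# Hedged epsilon-dependence comparison: quantum Eq. (6) vs classical 1/eps^gamma.
import numpy as np

N, p = 200, 1e-3
a_q, a_c = 5e-4, 0.5    # hypothetical prefactors, NOT fitted to the paper
gamma = 2               # ~2 for Heisenberg, ~4 for Hubbard (see text)

for eps in [0.1, 0.03, 0.01, 0.003]:
    t_q = a_q * N**2 * np.log(N / eps) / (eps * np.log(1 / p))
    t_c = a_c / eps**gamma
    winner = "quantum" if t_q < t_c else "classical"
    print(f"eps = {eps:5.3f}:  t_q = {t_q:9.3g} s,  t_c = {t_c:9.3g} s  -> {winner}")
```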
Panels (a) and (c) show results for 2d J1-J2 Heisenberg model of lattice size 1010 with J2=0.5, while (b) and (d) show results for 2d Fermi-Hubbard model of lattice size 66 with U=4. The floor plan of the qubit plane is assumed as (nF,#thread)=(1,1) and (16,16) for (a, b) and (c, d), respectively. The setup =0.01 and p=103, employed in Fig. 6, is shown by the black open stars.