Theoretical framework
We develop QNIR theory starting from general reservoir computing (RC) theory. RC is a computational paradigm and class of machine learning algorithms derived from recurrent neural networks (RNNs). RC maps input signals, or time series sequences, into higher-dimensional feature spaces provided by the dynamics of a nonlinear system with fixed coupling constants, called a reservoir. A core benefit of RC is that the trainable weights are few and confined to a single output layer, which makes training fast and efficient compared with RNNs. A reservoir should satisfy a number of properties [28,29], including adequate reservoir dimensionality, nonlinearity, fading memory/echo state property (ESP) and response separability.
For the univariate case, a reservoir, \(f\), is a recurrent function of an input sequence, \(u_t\), and prior reservoir states, \(\bar{x}_{t-1}\), as
$$\begin{aligned} \bar{x}_t = f(\bar{x}_{t-1},u_t). \end{aligned}$$
(1)
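As a purely classical illustration of the recurrence in Eq. (1), the sketch below iterates a small echo-state-style reservoir. The choice \(f = \tanh\), the random couplings `W_res` and `w_in`, the spectral-radius scaling and the sine input are all assumptions made for this example; they are not the quantum reservoir developed in this work.

```python
import numpy as np

def run_reservoir(u, n_nodes=20, seed=0):
    """Iterate x_t = f(x_{t-1}, u_t) for a univariate input sequence u."""
    rng = np.random.default_rng(seed)
    # Fixed (untrained) couplings; scaling the spectral radius below 1 is a
    # common heuristic for obtaining fading memory (assumption of this sketch).
    W_res = rng.normal(size=(n_nodes, n_nodes))
    W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))
    w_in = rng.uniform(-1.0, 1.0, size=n_nodes)

    x = np.zeros(n_nodes)
    states = []
    for u_t in u:
        x = np.tanh(W_res @ x + w_in * u_t)  # Eq. (1) with f chosen as tanh
        states.append(x)
    return np.array(states)  # shape: (len(u), n_nodes)

u = np.sin(0.2 * np.arange(100))  # illustrative input sequence
X = run_reservoir(u)
print(X.shape)
```

Each row of `X` is one reservoir state \(\bar{x}_t\); only a linear readout on top of these states would be trained.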
From the output sequences, \(\bar{x}_t\), training sequences are selected between time steps \(t=t_i\) and \(t=t_f\) and form a training design matrix, \(\textbf{X}_{tr}\). The initial sequence, \(t < t_i\), is discarded as washout. The linear model
$$\begin{aligned} \textbf{y} = W^T \textbf{X}_{tr}, \end{aligned}$$
(2)
is trained by least squares, where \(\textbf{y}\) is the target vector and \(W\) is an initial weight vector. The trained model has the form
$$\begin{aligned} \hat{\textbf{y}} = W^T_{opt}\textbf{X}, \end{aligned}$$
(3)
with an optimized weight vector, \(W^T_{opt}\), to give a predicted sequence, \(\hat{\textbf{y}}\), from new sequences, \(\textbf{X}\).
Figure 1. Circuit channel diagrams of the QNIR computer in the unrolled view, composed using [30]. The initial state of the quantum reservoir is \(|+\rangle^{\otimes n}\) and the quantum channels labeled \(\mathscr{T}_{u_i}\) evolve the density operator as in Eq. (4), where N quantum circuits are required for N time steps. A number of output sequences, n, are concatenated from sequential, single-qubit expectation value measurements \(\langle Z_{i} \rangle\) on n qubits.
For QNIR with artificial noise channels, the RC framework developed above is instantiated in the following way. The density operator evolves in time steps as
$$\begin{aligned} \rho_t = \mathscr{T}_{u_t}(\rho_{t-1}), \end{aligned}$$
(4)
where the reservoir map \(\mathscr{T}_{u_t}\) is composed of a sequence of unitary quantum gates, \(U_i\), and associated artificial noise channels, \(\mathscr{E}_i\), that are completely positive and trace preserving (CPTP). The reservoir map can be represented as a composition of quantum channels
$$\begin{aligned} \mathscr{T}_{u_t}(\rho_{t-1}) = \mathscr{E}_{U_K} \circ \ldots \circ \mathscr{E}_{U_2} \circ \mathscr{E}_{U_1} (\rho_{t-1}), \end{aligned}$$
(5)
where the notation \(\mathscr{E}_{U_i} = \mathscr{E}_i( U_i \rho U_i^{\dagger} )\) is used for clarity and to emphasize that each quantum gate is followed by a noise channel, and K is the number of noise channels in the time step.
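Equations (2)-(5) can be tied together in a toy single-qubit simulation. This is a minimal sketch under stated assumptions, not the circuits used in this work: it uses one RX encoding gate per time step (K = 1), a reset-to-\(|0\rangle\) noise channel with a fixed probability `p = 0.2`, a one-step-ahead prediction task on a sine input, and a bias row appended to the design matrix. All of these choices and all function names are illustrative.

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)
ZERO = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
PLUS = np.full((2, 2), 0.5, dtype=complex)        # |+><+|

def rx(theta):
    """Single-qubit rotation exp(-i * theta * X / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def reset_channel(rho, p):
    """Reset-to-|0> noise on a trace-one state: p|0><0| + (1 - p) rho."""
    return p * ZERO + (1 - p) * rho

def reservoir_map(rho, u_t, p):
    """One time step, Eqs. (4)-(5) with K = 1: unitary gate, then noise."""
    U = rx(u_t)
    return reset_channel(U @ rho @ U.conj().T, p)

u = 1.0 + np.sin(0.3 * np.arange(200))  # illustrative input (assumption)
rho, feats = PLUS, []
for u_t in u:
    rho = reservoir_map(rho, u_t, p=0.2)
    feats.append(np.real(np.trace(Z @ rho)))  # <Z> = Tr(Z rho)
feats = np.array(feats)

# Design matrix: one feature row plus a bias row (assumption); columns are
# time steps, matching y = W^T X.
X = np.vstack([feats, np.ones_like(feats)])
t_i = 20                                   # washout length (assumption)
X_tr, y_tr = X[:, t_i:-1], u[t_i + 1:]     # predict the next input value
W_opt, *_ = np.linalg.lstsq(X_tr.T, y_tr, rcond=None)  # least squares, Eq. (2)
y_hat = W_opt @ X_tr                                   # prediction, Eq. (3)
print("training MSE:", np.mean((y_hat - y_tr) ** 2))
```

With a single feature the fit is necessarily crude; the point is only the data flow from the channel of Eq. (4) to the linear readout of Eq. (3).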
We will refer to \(\mathscr{T}_{u_t}\) as a noisy quantum circuit. QNIR requires an initial washout phase, \(t < t_i\). The unitary, noiseless part of the quantum circuit is composed of an initial layer of RX gates followed by an entanglement scheme of \({RZ\!Z}_{i,j}\) gates, which are 2-qubit entangling gates
$$\begin{aligned} (C\!X_{i,j}RZ_j(\theta)C\!X_{i,j})RX^{\otimes n}(\theta) = {RZ\!Z}_{i,j}(\theta)RX^{\otimes n}(\theta), \end{aligned}$$
(6)
where all \(RX(\theta)\) and \(RZ(\theta)\) rotation gates encode the time series data with a scaling map, \(\theta=\phi(u)\). The purpose and structure of the unitary encoding gates are detailed in subsection: Reservoir circuit designs. Single-qubit expectation values, \(\langle Z_{i} \rangle = Tr(Z_i \rho)\), are measured for all n qubits at each time step,
$$\begin{aligned} h_t = [\langle Z_{1} \rangle, \langle Z_{2} \rangle, \ldots, \langle Z_{n} \rangle]^T, \end{aligned}$$
(7)
as shown in a circuit diagram in Fig. 1. Figure 2 shows that time series values are encoded to all reservoir qubits and that \(\langle Z_{i} \rangle\) is measured on all qubits. The measurements are concatenated over the time steps to give n reservoir feature sequences \(q_i = \{\langle Z_{i} \rangle\}_{t=0}^N\), where N is the number of time steps. In turn, the \(q_i\) form a design matrix \(\textbf{X}\) and the QNIR model is trained as in Eq. (3). A schematic of the full QNIR computer is shown in Fig. 3.
Figure 2. This drawing represents many repeats of data encoding of a single value, \(u_i\), to all reservoir qubits (left) and measurements of single-qubit Z expectation values (right). This two-part process occurs at each time step i to build feature signals by concatenation. Noisy quantum circuits are shown for each time step in Fig. 1.
Figure 3. This drawing shows an example of a four-qubit reservoir with fixed, pair-separable dynamics. In this graphic the first layer contains an array of duplicates of a single time series value. Each value in the input array is encoded to all qubits of the reservoir as in Eq. (6).
The second layer is a quantum reservoir with arbitrary entanglement scheme, represented by connecting lines between qubit nodes. The Z observable expectation value, \(\langle Z_{i}\rangle\), is measured for all qubits. These measurements are repeated and concatenated to build output signals, \(q_i\). In the final layer, these signals are used in multiple linear regression for time series prediction, as in Eq. (3).
It is important in RC, and by extension QRC, that the reservoir system can capture the temporal dynamics of the target system. To ensure this, we implement a reservoir optimization scheme for QNIR. The artificial noise channels, \(\mathscr{E}_i\), of the quantum reservoir circuit are iteratively updated by an optimization routine with an MSE cost function based on the time series prediction performance. This serves to optimize the quantum reservoir for time series prediction. Details of the optimization approach are in subsection: Reservoir noise parameterization.
Reservoir circuit designs
This section is concerned with the architecture and purpose of the unitary gates of the quantum circuit, the high-level structure of the noisy quantum circuits and the entanglement scheme. The details of the noise scheme are covered in subsection: Reservoir noise parameterization. The initial state of the quantum reservoir, \(|+\rangle^{\otimes n}\), is prepared by an initial Hadamard gate layer. Continuing with Eq. (6), an n-qubit QNIR circuit has a fixed sequence of quantum gates
$$\begin{aligned} U_{b}(u)&= (C\!X_{i,j}RZ_j(\theta)C\!X_{i,j})RX^{\otimes n}(\theta) \\&= {RZ\!Z}_{i,j}(\theta)RX^{\otimes n}(\theta) \end{aligned}$$
(8)
where i,j are indices for two qubits that denote the placement of multiple 2-qubit \(RZ\!Z\) entangling gates. The decomposed form of the circuit with \(C\!X\) and RZ gates [23] is implemented with noise channels (see subsection: Reservoir noise parameterization).
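The gate identity underlying Eqs. (6) and (8), \(C\!X_{i,j}RZ_j(\theta)C\!X_{i,j} = {RZ\!Z}_{i,j}(\theta)\), can be checked numerically. The sketch assumes the common conventions \(RZ(\theta) = e^{-i\theta Z/2}\) and \(RZ\!Z(\theta) = e^{-i\theta Z\otimes Z/2}\), with qubit i as the control (first tensor factor) and qubit j as the target; the angle value is arbitrary.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

# CX with qubit i (first tensor factor) as control and qubit j as target.
CX = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)

def rz(theta):
    """RZ(theta) = exp(-i * theta * Z / 2) (assumed convention)."""
    return np.diag([np.exp(-0.5j * theta), np.exp(0.5j * theta)])

def rzz(theta):
    """RZZ(theta) = exp(-i * theta * (Z x Z) / 2); diagonal since Z x Z is."""
    zz_diag = np.diag(np.kron(Z, Z))
    return np.diag(np.exp(-0.5j * theta * zz_diag))

theta = 0.73  # arbitrary test angle
lhs = CX @ np.kron(I2, rz(theta)) @ CX  # CX . RZ on qubit j . CX
identity_holds = np.allclose(lhs, rzz(theta))
print(identity_holds)
```

The decomposition matters here because the noise channels are attached to the individual \(C\!X\) and RZ gates rather than to the composite \(RZ\!Z\) gate.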
A time series data value, u, is encoded to all \(RX(\theta)\) and \(RZ\!Z(\theta)\) gates by the angle \(\theta = \phi(u)\), where \(\phi\) is a scaling map. To implement the recurrent architecture of QNIR, a set of N quantum circuits is executed for a time series \(\{u_t\}^N_{t=0}\). The first circuit encodes \(\{u_0\}\), the second circuit encodes \(\{u_0,u_1\}\), and the Nth circuit encodes \(\{u_t\}^N_{t=0}\) as
$$\begin{aligned} \text{U}_{t=N} = U_{b}(u_N) \ldots U_{b}(u_1)U_{b}(u_0). \end{aligned}$$
(9)
All unitaries \(\text{U}_t\), for arbitrary t, constrain the expectation values of all qubits i to a zero bitstring
$$\begin{aligned} \langle Z_{i} \rangle_{t} = \langle \Phi_0|\text{U}^{\dagger}_t Z_i \text{U}_t |\Phi_0\rangle = 000\ldots, \end{aligned}$$
(10)
where \(|\Phi_0\rangle = |+\rangle^{\otimes n}\) is the initial reservoir state and \(Z_i\) represents n single-qubit Z measurement operators. It is the action of noise that ensures the qubit signals are non-zero feature sequences, \(q_i\). Now considering the full QNIR circuits with artificial noise, the noisy quantum circuit for the final iteration, encoding \(\{u_t\}^N_{t=0}\), is the quantum channel
$$\begin{aligned} \boldsymbol{\mathscr{T}}_{N} = \mathscr{T}_{u_N} \circ \ldots \circ \mathscr{T}_{u_2} \circ \mathscr{T}_{u_1}. \end{aligned}$$
(11)
The noisy quantum circuit with artificial noise scheme is detailed in the next subsection: Reservoir noise parameterization. This scheme may further reduce resources by circuit truncation based on a memory criterion [29,31,32,33].
For \(RZ\!Z_{i,j}\) gates, the degree of entanglement between qubits i and j is a function of \(u_t\). It is important that the range of magnitudes of the data values is constrained; we observe that values much larger than \(2\pi\) cause undesirable effects. We consider benchmarks that do not require re-scaling. Drawing from the close connection with quantum feature maps [23,34,35,36], entanglement schemes are defined by the number and placement, i.e. the architecture, of \(RZ\!Z\) gates in Eq.
(6). Common entanglement schemes that could be trialed are full, linear, pair-wise, and what we call pair-separable, used in Suzuki et al. [11]. The pair-separable (PS) and linear entanglement (LE) schemes explored in this work have \(RZ\!Z\) gates indexed as \(i,j \in \{(0,1),(2,3),(4,5),...,(N-1,N)\}\) and \(i,j \in \{(0,1),(1,2),(2,3),...,(N-1,N)\}\), respectively. To clarify, for an LE scheme, every additional \(RZ\!Z\) gate is in a new circuit layer, increasing the circuit depth each time. The LE scheme creates whole-circuit entangled states [23]. The state vector for a PS entanglement scheme evolves in a product state of qubit pairs, \(|\psi\rangle = \bigotimes_{i=1}^{n/2} |\phi\rangle_i\), where \(|\phi\rangle_i\) are two-qubit entangled states. The state, \(|\psi\rangle\), can be efficiently classically simulated and can be parallelized in classical simulation or on quantum computers [37,38].
Reservoir noise parameterization
QNIR uses noise as a necessary resource to generate non-trivial feature sequences. We use artificial noise that can be programmed to a quantum computer. Within this scheme, many such artificial noise models can be implemented to produce different effects. To implement a noise scheme, we associate parameterized, single-qubit noise channels with each unitary gate in the quantum circuit, Eq. (6), as shown in Fig. 4. Note that this differs from Kubota et al. [12], where noise channels were situated at the end of every time step. In the following, we assume each noise channel depends on a single noise parameter.
Figure 4. A 2-qubit quantum circuit channel diagram of a reservoir noise parameterization. Each unitary gate has an associated noise channel represented by \(\mathscr{E}(p_i)\). This represents the novel quantum circuit parameterization approach proposed in this work.
Figure 5. This graphic shows the QNIR noise optimization scheme. The quantum model is trained and tested iteratively in a classical optimization loop, where dual annealing or evolutionary optimization is used.
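The claim that noise is a necessary resource can be checked numerically on one pair of qubits: without noise, the RX and \(RZ\!Z\) layers acting on \(|+\rangle^{\otimes 2}\) leave every \(\langle Z_i\rangle\) at zero, as in Eq. (10), while reset noise produces non-zero feature sequences. The sketch below is a density-matrix toy under assumptions: the \(RZ\!Z\) gate is applied directly rather than in decomposed form, a reset-to-\(|0\rangle\) channel acts on both qubits after each gate, and all channels share a single probability `p`, whereas the scheme in this work assigns each channel its own parameter.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def rx(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rzz(theta):
    # exp(-i * theta * (Z x Z) / 2) in the basis |00>, |01>, |10>, |11>
    return np.diag(np.exp(-0.5j * theta * np.array([1, -1, -1, 1])))

def reset(rho, p, qubit):
    """Single-qubit reset-to-|0> channel on a 2-qubit state (Kraus form)."""
    kraus = [np.sqrt(1 - p) * I2,
             np.sqrt(p) * np.array([[1, 0], [0, 0]], dtype=complex),  # |0><0|
             np.sqrt(p) * np.array([[0, 1], [0, 0]], dtype=complex)]  # |0><1|
    out = np.zeros_like(rho)
    for k in kraus:
        K = np.kron(k, I2) if qubit == 0 else np.kron(I2, k)
        out += K @ rho @ K.conj().T
    return out

def features(u, p):
    """Return <Z_0>, <Z_1> after each time step for input sequence u."""
    plus = np.full((2, 2), 0.5, dtype=complex)
    rho = np.kron(plus, plus)  # |+>|+> initial state
    out = []
    for theta in u:
        # RX layer first, then the entangling gate, as in Eq. (8).
        for U in (np.kron(rx(theta), rx(theta)), rzz(theta)):
            rho = U @ rho @ U.conj().T
            for q in (0, 1):  # assumed noise placement: after every gate
                rho = reset(rho, p, q)
        out.append([np.real(np.trace(np.kron(Z, I2) @ rho)),
                    np.real(np.trace(np.kron(I2, Z) @ rho))])
    return np.array(out)

u = np.linspace(0.1, 2.0, 30)  # illustrative input values
noiseless = features(u, p=0.0)
noisy = features(u, p=0.1)
print(np.max(np.abs(noiseless)), np.max(np.abs(noisy)))
```

Setting `p = 0.0` reduces every channel to the identity, so the first printed value is zero up to rounding, while the second is not.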
The quantum reservoir circuits have a number of gate-associated noise channels, each of which has a single error probability parameter that is iteratively updated.
Noise channels are associated with all quantum gates in the reservoir circuit in Fig. 4. Each noise channel \(\mathscr{E}(p)\) is a function of the probability, p, that the noise effect occurs. We use the probabilities, \(p_i\), to parameterize the reservoir for optimization. The number of probability parameters scales linearly with the number of qubits. For a pair-separable entanglement reservoir, the number of parameters is \(n_{p_i} = \frac{7}{2} n\), where \(n=2,4,6,...\), and for a linear entangled reservoir it is \(n_{p_i} = 6n-5\), where \(n=2,3,4,...\).
QNIR resource-noise optimization is performed through iterative training (Eq. 2) and testing (Eq. 3) of QNIR, giving optimized noise probability parameters, \(p_i \in \textbf{p}\) (see Fig. 5). The parameters in the initial parameter vector, \(\textbf{p}\), are probabilities randomly selected from a uniform distribution, \(p_i \sim U(0,1), \forall i\). Two optimization approaches were trialed in this work, evolutionary optimization [27] and dual annealing [39], where the latter is available in the SciPy optimization package [40]. The mean squared error (MSE) was used as the cost function to measure prediction performance, which is minimized as
$$\begin{aligned} \min_{\textbf{p}}\; \{ \text{MSE}(\hat{\textbf{y}}(\textbf{p}),\textbf{y}) : p_i \in [0,1], \forall i \}, \end{aligned}$$
(12)
where \(\hat{\textbf{y}} = W^T_{opt} \textbf{X}(\textbf{p})\) is the QNIR test set prediction and \(\textbf{X}(\textbf{p})\) is the reservoir signals matrix, which depends on the noise probabilities \(\textbf{p}\). In this work, we use only reset noise channels, which can be implemented simply with a classical ancilla system (see next subsection: Reset noise).
Reset noise
We propose a simple hybrid quantum-classical algorithm for a reset noise channel that consists of probabilistically triggering a reset instruction using a classical ancillary system.
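The optimization of Eq. (12) can be sketched with SciPy's `dual_annealing` on a toy single-qubit reservoir. Everything specific here is an assumption for illustration: the RX-then-RZ gate sequence with one reset probability per gate, the one-step-ahead sine task, the washout and train/test split, and the bias row. The helper `n_noise_params` shows one counting that reproduces the parameter formulas quoted above, assuming one channel per RX gate and five per decomposed \(RZ\!Z\) gate (two for each \(C\!X\), one for RZ).

```python
import numpy as np
from scipy.optimize import dual_annealing

Z = np.diag([1.0, -1.0]).astype(complex)
ZERO = np.array([[1, 0], [0, 0]], dtype=complex)
PLUS = np.full((2, 2), 0.5, dtype=complex)

def n_noise_params(n, scheme):
    """Channel count under the assumed per-gate counting (see lead-in)."""
    n_rzz = n // 2 if scheme == "PS" else n - 1
    return n + 5 * n_rzz  # PS: 7n/2, LE: 6n - 5

def rx(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def rz(t):
    return np.diag([np.exp(-0.5j * t), np.exp(0.5j * t)])

def design_matrix(u, p):
    """Rows: <Z> feature and a bias (assumption); columns: time steps."""
    rho, feats = PLUS, []
    for u_t in u:
        for gate, p_i in zip((rx(u_t), rz(u_t)), p):
            rho = gate @ rho @ gate.conj().T
            rho = p_i * ZERO + (1 - p_i) * rho  # gate-associated reset noise
        feats.append(np.real(np.trace(Z @ rho)))
    return np.vstack([feats, np.ones(len(u))])

def prediction_mse(p, u, washout=20, n_train=100):
    """Train the readout (Eq. 2) and return the test-set MSE (Eqs. 3, 12)."""
    X = design_matrix(u, p)[:, :-1]
    y = u[1:]  # one-step-ahead target (assumed task)
    train, test = slice(washout, n_train), slice(n_train, None)
    W_opt, *_ = np.linalg.lstsq(X[:, train].T, y[train], rcond=None)
    return float(np.mean((W_opt @ X[:, test] - y[test]) ** 2))

u = 1.0 + np.sin(0.3 * np.arange(150))  # illustrative time series
result = dual_annealing(prediction_mse, bounds=[(0.0, 1.0)] * 2,
                        args=(u,), maxiter=20, seed=0)
p_opt, best_mse = result.x, result.fun
print(p_opt, best_mse)
```

The bounds passed to `dual_annealing` enforce the constraint \(p_i \in [0,1]\) of Eq. (12); the small `maxiter` only keeps the example fast.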
A deterministic reset instruction is an important element of a quantum instruction set because qubit states need to be reset. A quantum instruction set is an abstract quantum computer model [41,42]. In this work we consider a reset-to-\(|0\rangle\) noise channel given by \(\mathscr{E}_{PR}(\rho) = p|0\rangle\langle 0| + (1-p)\rho\), where p is the reset probability [43]. \(\mathscr{E}_{PR}(\rho)\) is trace-preserving, \(Tr(\mathscr{E}_{PR}(\rho))=1\). Using dynamic circuits, quantum computers can implement a reset instruction with a mid-circuit measurement followed by a classically controlled quantum X gate that depends on the measurement outcome [44] (see Fig. 6). For example, this is how a reset is now implemented on IBM quantum computers supported by OpenQASM 3 [41].
Figure 6. A deterministic RESET instruction (left) is executed with this dynamic circuit. This can be used as a basis for a reset noise channel, \(\mathscr{E}_{PR}\). A single line represents a qubit and a double line represents a classical bit. A model classical ancillary system (right) would be executed on a classical computer. The classical NOT gate, \(X_p\), is executed with probability p, which in turn triggers a classically controlled RESET instruction with probability p.
In classical computing, execution of a probabilistic instruction is triggered using a random number generator (RNG), such as those widely available in software as pseudorandom number generators (PRNGs) or in hardware as hardware random number generators (HRNGs). Here we employ a classical RNG to probabilistically activate a reset, which is identical to reset noise. In this way, artificial reset noise is implemented without ancilla qubits. Ancilla qubits would be an undesirable overhead in the larger scheme presented in this work, in which unitary gates require potentially many corresponding noise channels. This hybrid approach may be viable for other noise channels. For example, reset noise can approximate amplitude damping noise to high precision [43].
View post:
Optimizing quantum noise-induced reservoir computing for ... - Nature.com
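The equivalence between an RNG-triggered reset and the reset noise channel \(\mathscr{E}_{PR}(\rho) = p|0\rangle\langle 0| + (1-p)\rho\) described in the Reset noise subsection can be illustrated with a small Monte Carlo sketch. It is a classical simulation under assumptions (a single qubit in \(|+\rangle\), NumPy's pseudorandom generator as the classical ancillary system, an arbitrary probability and shot count), not an implementation on quantum hardware.

```python
import numpy as np

ZERO = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|

def noisy_reset_shot(rho, p, rng):
    """One shot: a classical random draw triggers a deterministic reset
    with probability p; otherwise the state is left untouched."""
    return ZERO if rng.random() < p else rho

def reset_channel(rho, p):
    """Reset noise channel on a trace-one state: p|0><0| + (1 - p) rho."""
    return p * ZERO + (1 - p) * rho

rng = np.random.default_rng(0)             # PRNG as the classical ancilla
rho = np.full((2, 2), 0.5, dtype=complex)  # |+><+| (illustrative input)
p, shots = 0.3, 20000

# Averaging the per-shot states estimates the channel output.
avg = sum(noisy_reset_shot(rho, p, rng) for _ in range(shots)) / shots
print(np.max(np.abs(avg - reset_channel(rho, p))))
```

The printed deviation shrinks roughly as the inverse square root of the number of shots, which is the sense in which the probabilistically triggered reset reproduces the channel on average.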