October 17, 2025

How quantum state preparation can make or break your hackathon project

Hugo Jaunin

The following is a guest post by Hugo Jaunin, a quantum algorithms engineer, presenting their winning project of one of the challenges of QHack 2024.

Interested in learning more about state preparation? Check out the following resources, and state preparation functions in PennyLane:

Möttonen multi-controlled gates decomposition

Initial state preparation

Matrix product states

Our journey and participation in QHack 2024 started with a simple question: are quantum computers beneficial for solving differential equations? This led our team us down the path of exploring algorithms to solve partial differential equations (PDEs) such as the Black-Scholes equation, later expanding the scope to solve a more complex case---systems of differential equations to simulate a pressure wave for the Airbus-BMW Group Quantum Computing Challenge.

Working on this project, like many others, inevitably required us to use some kind of state preparation algorithm. In this case, to load the initial condition to the PDE. In this blog post, we present the basics of amplitude embedding, and what we ended up using it for our project.

State preparation is the first and essential step for any quantum algorithm. It involves creating a specific quantum state before feeding it to a quantum algorithm. For some algorithms, such as the Variational Quantum Eigensolver (VQE), the state preparation can be trivial, since all qubits are usually set to $|0\rangle$ . However, for many others, it requires constructing complex superpositions or entangled states. For example, Quantum Monte Carlo or some Quantum Machine Learning algorithms require to encode the initial state. In practice, the circuit to create such state will usually take an exponential number of gates, hindering the complexity advantage of the rest of the algorithm. It is therefore crucial to find efficient ways to load data in a quantum state. Note that a completely general state preparation will at least require an exponential number of gates [1][2]. To preserve a potential quantum advantage, one will have to resort to use symmetry in the data, or consider approximations in the state preparation.

In this blog post, we will consider amplitude embedding, meaning that the information will be stored in the amplitudes of the quantum state. More precisely, we will load real value data $x_i$ :

\{x_i\}_i, x_i \in \mathbb{R} \longrightarrow |\psi\rangle=\sum_i x_i |i\rangle,

where $|i\rangle$ denotes the computational basis. Those values can be taken from an array or from a real function $f(x)$ . The latter was utilized in our submission to QHack 2024. The goal was to approximately load a function corresponding to the initial condition of a differential equation, which was later solved by a quantum algorithm. The interest of using this approximate function loader is to greatly reduce the number of gates while keeping a controlled fidelity [3].

State Preparation

Grover-Rudolph algorithm

Before considering the approximate function loader, we will first look at exact state preparation. Let's say we want to implement amplitude embedding with only one qubit. Because of the normalization condition, there is only one degree of freedom, and the state can be written as

\begin{split} |\psi\rangle & =a|0\rangle+\sqrt{1-a^2}|1\rangle \\ & = \cos\frac{\theta}{2}|0\rangle + \sin\frac{\theta}{2} |1\rangle \\ & = Ry(\theta)|0\rangle, \end{split}

with $a\in [-1,1]$ . This embedding can be parametrized using trigonometric functions and an angle $\theta\in[0,2\pi]$ . This state can be obtained by a single rotation on the $y$ axis of angle $\theta=2 \arccos(a)$ . The rotation "distributes" the amplitude between $|0\rangle$ and $|1\rangle$ .

For two qubits, we can proceed similarly: first, distribute the correct amplitude between both halves of the states (let say $\{|00\rangle,|01\rangle\}$ and $\{|10\rangle,|11\rangle\}$ ), and distribute a second time as done before for one qubit. However, as the splitting is not necessarily the same between $\{|00\rangle,|01\rangle\}$ and $\{|10\rangle,|11\rangle\}$ , controlled and anti-controlled rotations must be used to specify in which half we are. The corresponding circuit and the intermediate states are shown in the following figure, with the convention that white-dot controlled gate symbolizes an anti-controlled gate being applied only if the controlled qubit is $|0\rangle$ .

Fig. 1. Circuit and intermediate states for 2 qubits state preparation.

It is easy to generalize this to $n$ qubits. The actual computation of the angles is done by considering how much you need to distribute the amplitude at each bisection. The first angle $\theta^{0}$ is obtained with the ratio of the weight of the first half to the total of the function. Then, the angle $\theta^{1}_0$ (respectively $\theta^{(1)}_1$ ) is obtained with the ratio of the weight of the first quarter to the first half (the third quarter over the second half), and so on. A graphical representation of each step is shown below (this time for loading a continuous function). The $k^{\text{th}}$ step adds twice as many controlled rotations than the preceding step, $k-1$ , with one more controlled qubit. We can clearly see the exponential growth of the number of multi-controlled gates as $n$ increases.

Fig. 2. Sketch of iterations of the state preparation algorithm with the corresponding circuit.

The general formula is

\theta_l^{(k)} = 2\arccos\sqrt{\frac{\sum_{i=l\Delta_k}^{(l+1/2)\Delta_k}x_i^2}{\sum_{i=l\Delta_k}^{(l+1)\Delta_k}x_i^2}}, \hspace{1cm} \theta_l^{(k)} = 2\arccos\sqrt{\frac{\int_{x_{\text{min}}+l\Delta_k}^{x_{\text{min}}+(l+1/2)\Delta_k}x^2\mathrm{d}x}{\int_{x_{\text{min}}+l\Delta_k}^{x_{\text{min}}+(l+1)\Delta_k}x^2\mathrm{d}x}},

for the discrete (continuous) case, with $\Delta_k = 2^{n-k}$ ( $\Delta_k = \frac{x_{\text{max}}-x_{\text{min}}}{2^{k}}$ ) and $l=0,\cdots, 2^{k}-1$ .

Möttonen algorithm

The multi-controlled gates previously mentioned are usually not directly realizable on quantum hardware and therefore need to be transpiled into multiple basis gates. In our case, the multi-controlled NOT needs to be decomposed into multiple CNOT gates [4]. It is therefore crucial to use the least amount of multi-controlled gates as possible. It is in fact possible to completely remove all multi-controlled gates, and only use the same amount of single CNOT gates. This is achieved by the Möttonen algorithm [5, 6]. It is based on the Gray code: a rearrangement of the binary numbers such that two successive values differ in only one bit. The Gray code for 3 bits can be seen at the bottom of the below figure. As a consequence of this reordering, only gates controlled by a single qubit are needed instead of multi-controlled ones, greatly reducing the number of gates. The corresponding circuit is shown below. Another consequence is a small change of variable on the rotation angles (only changing the sign of some angles), denoted by the tilde. This algorithm is still exact and scales exponentially with $n$ .

Fig. 3.(a) Grover-Rudolph and Möttonen circuit for multi-controlled rotations, (b) Gray code for 3 bits.

Approximate function loader

Finally, we reach the approximate function loader. The idea is rather simple: as seen in the previous section on the Grover-Rudolph algorithm, each step takes twice as many controlled rotations as the previous one, and provides more and more details in the distribution. By simply truncating the number of steps, let say by doing only $k$ bisections out of the $n$ , we remove the most costly steps and only lose the smallest details in the distribution. To partially compensate for this loss, the removed angles are clustered into a mean rotation applied without control. The circuit is shown below for 6 qubits, and only 4 bisections.

Fig. 4. Circuit for approximate function loading, with 6 qubits and 4 bisections.

Denoting $\bar{\theta}$ , the average of the angles that should have been in the bisection: $\bar{\theta}^{(k)} = \frac{1}{2^k}\sum_{i=0}^{2^k-1} \theta^{(k)}_i$ . An important result from [3] is theorem 1. It gives the minimum number of bisections $k(\epsilon)$ needed to reach a fidelity of $1-\epsilon$ , depending on the condition number $\eta$ . This $\eta$ only depends on the function to load ( $\eta=\text{sup }\partial_x^2\:\text{log}\:f(x)$ ) and needs to be bounded by $8\pi$ for the theorem to be applicable. Notably, $k(\epsilon)$ is asymptotically independent of $n$ , meaning that for sufficiently smooth functions, it is possible to efficiently load them with a controlled fidelity. To illustrate the potential gain of this approximation, we use the aforementioned function for the Grover-Rudolph algorithm, discretized for 10 qubits. The exact state preparation using Möttonen algorithm takes 2047 CNOT gates, and naturally yields a fidelity of 1. Alternatively, if we only perform 5 bisections instead of 10, the number of CNOT gates drops to 68, with a fidelity of 0.9992. This trade-off is particularly beneficial when dealing with noisy computers. Indeed, it is preferable to have far fewer gates and a slightly reduced maximal achievable fidelity rather than aiming for perfect fidelity that would not be reached anyways because of the circuit depth.

Conclusion

In this blog post, we have presented the start of our journey through quantum state preparation. We have seen the standard Grover-Rudolph algorithm, as well as the optimization via the Möttonen algorithm. While those two methods provide exact results, they come at the cost of an exponential number of gates, hindering a potential quantum advantage. We then turn our attention to approximate function loading for our project, providing us with an efficient way to load a function (under some smoothness condition) with a controlled fidelity. This method showed its real value when adding a noise model to the simulation, greatly reducing the circuit depth, while keeping a good fidelity. State preparation remains a major bottleneck for many algorithms and much research is being carried out in this area.

References

[1] I. L. Chuang, M. A. Nielsen, Quantum Computation and Quantum Information. Cambridge University Press, Chap. 4 (2010).

[2] X. Sun et al., Asymptotically Optimal Circuit Depth for Quantum State Preparation and General Unitary Synthesis. arXiv: 2108.06150 (2023) [quant-ph]. https://arxiv.org/abs/2108.06150

[3] G. Marin-Sanchez, J. Gonzalez-Conde, M. Sanz, Quantum algorithms for approximate function loading. Phys. Rev. Res. 5, 033114 (2023). http://dx.doi.org/10.1103/PhysRevResearch.5.033114

[4] A. Barenco et al., Elementary gates for quantum computation. Phys. Rev. A 52, 3457 (1995). https://doi.org/10.1103/PhysRevA.52.3457

[5] M. Möttönen et al., Quantum Circuits for General Multiqubit Gates. Phys. Rev. Lett. 93, 130502 (2004). https://doi.org/10.1103/PhysRevLett.93.130502

[6] PennyLane team, qml.MottonenStatePreparation. PennyLane Documentation, Version 0.42.3 (2025). https://docs.pennylane.ai/en/stable/code/api/pennylane.MottonenStatePreparation.html

About the author

Hugo Jaunin

During my master’s degree in Physics at École Polytechnique Fédérale de Lausanne (EPFL), I developed a strong interest toward quantum information and computing. After my master thesis in the Computational Quantum Science Laboratory, I began working ...

Last modified: October 17, 2025

About

Software & Documentation

Resources

Topics

Community & Support