


Alleviating barren plateaus with local cost functions

Thomas Storwick

Published: September 08, 2020. Last updated: November 05, 2024.

Barren Plateaus

Barren plateaus are large regions of the cost function’s parameter space where the variance of the gradient is almost 0; or, put another way, the cost function landscape is flat. This means that a variational circuit initialized in one of these areas will be untrainable using any gradient-based algorithm.
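To make this concrete, the short, self-contained sketch below (an illustration added here, not part of the original demo) estimates the variance of a single gradient component of a global cost function over random parameter initializations, using qml.StronglyEntanglingLayers as an arbitrary generic ansatz. As the number of qubits grows, this variance is expected to shrink rapidly, which is the barren-plateau signature.

import pennylane as qml
from pennylane import numpy as np

def gradient_variance(n_qubits, n_layers=5, n_samples=50):
    dev = qml.device("default.qubit", wires=n_qubits)

    @qml.qnode(dev)
    def circuit(weights):
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
        return qml.probs(wires=range(n_qubits))

    def cost(weights):
        # global cost: 1 minus the probability of measuring |00...0>
        return 1 - circuit(weights)[0]

    shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
    grads = []
    for _ in range(n_samples):
        weights = np.random.uniform(0, 2 * np.pi, shape, requires_grad=True)
        # keep a single, fixed gradient component from each random initialization
        grads.append(qml.grad(cost)(weights)[0, 0, 0])
    return np.var(grads)

for n in (2, 4, 6):
    print(f"{n} qubits: sample variance of dC/dw_0 = {float(gradient_variance(n)):.5f}")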

In “Cost-Function-Dependent Barren Plateaus in Shallow Quantum Neural Networks” [1], Cerezo et al. show that the barren plateau phenomenon can, under some circumstances, be avoided by using cost functions that only draw information from part of the circuit. These local cost functions can be more robust against noise, and may have better-behaved gradients with no plateaus for shallow circuits.

(Figure illustrating local versus global cost functions, taken from Cerezo et al. [1].)

Many variational quantum algorithms are constructed to use global cost functions. Information from the entire measurement is used to analyze the result of the circuit, and a cost function is calculated from this to quantify the circuit’s performance. A local cost function only considers information from a few qubits, and attempts to analyze the behavior of the entire circuit from this limited scope.

Cerezo et al. also handily prove that these local cost functions are bounded by the global ones, i.e., if a global cost function is formulated in the manner described by Cerezo et al., then the value of its corresponding local cost function will always be less than or equal to the value of the global cost function.
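This bound is easy to check numerically. The following self-contained sketch (an illustration, not taken from the paper) draws random state vectors and compares a global cost, \(1 - p_{|00\ldots0\rangle}\), with a local cost defined as 1 minus the average probability of each individual qubit being in \(|0\rangle\) (the same definitions used later in this demo). Because the all-zeros outcome is a sub-event of "qubit \(j\) is in \(|0\rangle\)" for every \(j\), the local cost can never exceed the global one.

import numpy as np

rng = np.random.default_rng(1234)
n = 4  # number of qubits for this check

for _ in range(5):
    # draw a random normalized n-qubit state vector
    psi = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
    psi /= np.linalg.norm(psi)
    probs = np.abs(psi) ** 2

    c_global = 1 - probs[0]  # 1 - p(|00...0>)

    # marginal probability of |0> for each qubit
    p0 = [
        probs.reshape([2] * n).sum(axis=tuple(k for k in range(n) if k != j))[0]
        for j in range(n)
    ]
    c_local = 1 - np.mean(p0)

    print(f"C_L = {c_local:.4f} <= C_G = {c_global:.4f}: {c_local <= c_global}")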

In this notebook, we investigate the effect of barren plateaus in variational quantum algorithms, and how they can be mitigated using local cost functions.

We first need to import the following modules.

import pennylane as qml
from pennylane import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LinearLocator, FormatStrFormatter

np.random.seed(42)

Visualizing the problem

To start, let’s look at the task of learning the identity gate across multiple qubits. This will help us visualize the problem and get a sense of what is happening in the cost landscape.

First, we define the number of wires we want to train on. The work by Cerezo et al. shows that circuits are trainable under certain regimes, so how many qubits we train on will affect our results.

wires = 6
dev = qml.device("lightning.qubit", wires=wires, shots=10000)

Next, we want to define our QNodes and our circuit ansatz. For this simple example, an ansatz that works well is simply a rotation about X and a rotation about Y, repeated across all the qubits.

We will also define our cost functions here. Since we are trying to learn the identity gate, a natural cost function is 1 minus the probability of measuring the zero state, denoted here as \(1 - p_{|0\rangle}.\)

\[C = \langle \psi(\theta) | \left(I - |0\rangle \langle 0|\right) | \psi(\theta) \rangle =1-p_{|0\rangle}\]

We will apply this across all qubits for our global cost function, i.e.,

\[C_{G} = \langle \psi(\theta) | \left(I - |00 \ldots 0\rangle \langle 00 \ldots 0|\right) | \psi(\theta) \rangle = 1-p_{|00 \ldots 0\rangle}\]

and for the local cost function, we will average the individual contributions from each qubit:

\[C_L = \langle \psi(\theta) | \left(I - \frac{1}{n} \sum_j |0\rangle \langle 0|_j\right)|\psi(\theta)\rangle = 1 - \frac{1}{n}\sum_j p_{|0\rangle_j}.\]

It may already be clear why this cost function can perform better: by formulating the local cost in this way, we have essentially divided the problem into multiple single-qubit terms and averaged the results.

To implement this, we will define a separate QNode for the local cost function and the global cost function.

def global_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    return qml.probs(wires=range(wires))

def local_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    return [qml.probs(wires=i) for i in range(wires)]

global_circuit = qml.QNode(global_cost_simple, dev, interface="autograd")

local_circuit = qml.QNode(local_cost_simple, dev, interface="autograd")

def cost_local(rotations):
    # local_circuit returns [p0, p1] for each qubit; average p(|0>) over all qubits
    return 1 - np.sum([i for (i, _) in local_circuit(rotations)]) / wires

def cost_global(rotations):
    return 1 - global_circuit(rotations)[0]

To analyze each of the circuits, we draw one random angle for the X rotations and one for the Y rotations, shared across all the qubits.

RX = np.random.uniform(low=-np.pi, high=np.pi)
RY = np.random.uniform(low=-np.pi, high=np.pi)
rotations = [[RX for i in range(wires)], [RY for i in range(wires)]]

Examining the results:

print("Global Cost: {: .7f}".format(cost_global(rotations)))
print("Local Cost: {: .7f}".format(cost_local(rotations)))

qml.drawer.use_style('black_white')
fig1, ax1 = qml.draw_mpl(global_circuit, decimals=2)(rotations)
fig1.suptitle("Global Cost", fontsize='xx-large')
plt.show()

fig2, ax2 = qml.draw_mpl(local_circuit, decimals=2)(rotations)
fig2.suptitle("Local Cost", fontsize='xx-large')
plt.show()
Global Cost:  0.9694000
Local Cost:  0.4423667

With this simple example, we can visualize the cost function and see the barren plateau effect graphically. Although there are \(2n\) parameters (where \(n\) is the number of qubits), in order to plot the cost landscape we must restrict ourselves to two of them. We will consider the case where all the X rotations have the same value and all the Y rotations have the same value.

First, we look at the global cost function. When plotting the cost function across 6 qubits, much of the cost landscape is flat and difficult to train (even with a circuit depth of only 2!). This effect will worsen as the number of qubits increases.

def generate_surface(cost_function):
    # evaluate the cost on a grid of shared X- and Y-rotation angles
    Z = []
    Z_assembler = []

    X = np.arange(-np.pi, np.pi, 0.25)
    Y = np.arange(-np.pi, np.pi, 0.25)
    X, Y = np.meshgrid(X, Y)

    for x in X[0, :]:
        for y in Y[:, 0]:
            rotations = [[x for i in range(wires)], [y for i in range(wires)]]
            Z_assembler.append(cost_function(rotations))
        Z.append(Z_assembler)
        Z_assembler = []

    Z = np.asarray(Z)
    return Z

def plot_surface(surface):
    X = np.arange(-np.pi, np.pi, 0.25)
    Y = np.arange(-np.pi, np.pi, 0.25)
    X, Y = np.meshgrid(X, Y)
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    surf = ax.plot_surface(X, Y, surface, cmap="viridis", linewidth=0, antialiased=False)
    ax.set_zlim(0, 1)
    ax.zaxis.set_major_locator(LinearLocator(10))
    ax.zaxis.set_major_formatter(FormatStrFormatter("%.02f"))
    plt.show()


global_surface = generate_surface(cost_global)
plot_surface(global_surface)

However, when we change to the local cost function, the cost landscape becomes much more trainable as the size of the barren plateau decreases.

local_surface = generate_surface(cost_local)
plot_surface(local_surface)

Those are some nice pictures, but how do they reflect actual trainability? Let us try training both the local and global cost functions. To simplify this model, let’s modify our cost function from

\[C_{L} = 1-\frac{1}{n}\sum_j p_{|0\rangle_j},\]

where we average the marginal probabilities of each qubit, to

\[C_{L} = 1-p_{|0\rangle_0},\]

where we only consider the probability of a single qubit (here, the first qubit) being in the state \(|0\rangle\).

While we’re at it, let us make our ansatz a little more like one we would encounter while trying to solve a VQE problem, and add entanglement.

def global_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    for i in range(wires - 1):
        qml.CNOT([i, i + 1])
    return qml.probs(wires=range(wires))

def local_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    for i in range(wires - 1):
        qml.CNOT([i, i + 1])
    # the local cost now measures only the first qubit
    return qml.probs(wires=[0])

global_circuit = qml.QNode(global_cost_simple, dev, interface="autograd")

local_circuit = qml.QNode(local_cost_simple, dev, interface="autograd")

def cost_local(rotations):
    return 1 - local_circuit(rotations)[0]

def cost_global(rotations):
    return 1 - global_circuit(rotations)[0]

Of course, now that we’ve changed both our cost function and our circuit, we will need to scan the cost landscape again.

global_surface = generate_surface(cost_global)
plot_surface(global_surface)

local_surface = generate_surface(cost_local)
plot_surface(local_surface)

It seems our changes didn't significantly alter the overall cost landscape. This probably isn't a general trend, but it is a nice surprise. Now, let us get back to training the local and global cost functions. Because we have a visualization of the total cost landscape, let's pick a point that exaggerates the problem. One of the worst points in the landscape is \((\pi, 0)\), right in the middle of the plateau, so let's start near there (below we use 3.0, which is close to \(\pi\)).
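Before running the optimization, we can sanity-check this starting point by looking at the gradients directly. The short snippet below (not part of the original demo) uses qml.grad on the two cost functions defined above; we expect the largest component of the global gradient to be essentially zero (buried in shot noise), while the local gradient should have an appreciable component.

start = np.array([[3.0] * wires, [0.0] * wires], requires_grad=True)

grad_global = qml.grad(cost_global)(start)
grad_local = qml.grad(cost_local)(start)

print("Largest |gradient| component, global cost:", np.max(np.abs(grad_global)))
print("Largest |gradient| component, local cost: ", np.max(np.abs(grad_local)))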

rotations = np.array([[3.] * wires, [0.] * wires], requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 100
params_global = rotations
for i in range(steps):
    # update the circuit parameters
    params_global = opt.step(cost_global, params_global)

    if (i + 1) % 5 == 0:
        print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_global(params_global)))
    if cost_global(params_global) < 0.1:
        break
fig, ax = qml.draw_mpl(global_circuit, decimals=2)(params_global)
plt.show()
Cost after step     5:  1.0000000
Cost after step    10:  1.0000000
Cost after step    15:  1.0000000
Cost after step    20:  1.0000000
Cost after step    25:  1.0000000
Cost after step    30:  1.0000000
Cost after step    35:  1.0000000
Cost after step    40:  1.0000000
Cost after step    45:  1.0000000
Cost after step    50:  1.0000000
Cost after step    55:  1.0000000
Cost after step    60:  1.0000000
Cost after step    65:  1.0000000
Cost after step    70:  1.0000000
Cost after step    75:  1.0000000
Cost after step    80:  1.0000000
Cost after step    85:  1.0000000
Cost after step    90:  1.0000000
Cost after step    95:  1.0000000
Cost after step   100:  1.0000000

After 100 steps, the cost function is still exactly 1. Clearly we are in an “untrainable” area. Now, let us limit ourselves to the local cost function and see how it performs.

rotations = np.array([[3. for i in range(wires)], [0. for i in range(wires)]], requires_grad=True)
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 100
params_local = rotations
for i in range(steps):
    # update the circuit parameters
    params_local = opt.step(cost_local, params_local)

    if (i + 1) % 5 == 0:
        print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_local(params_local)))
    if cost_local(params_local) < 0.05:
        break

fig, ax = qml.draw_mpl(local_circuit, decimals=2)(params_local)
plt.show()
Cost after step     5:  0.9887000
Cost after step    10:  0.9646000
Cost after step    15:  0.9208000
Cost after step    20:  0.8151000
Cost after step    25:  0.6294000
Cost after step    30:  0.3794000
Cost after step    35:  0.1827000
Cost after step    40:  0.0720000

It trained, and much faster than the global case! However, we know the local cost function is bounded by the global one, so just how much have we actually trained the global cost?

cost_global(params_local)
1.0

Interestingly, the global cost function is still 1. If we have trained the local cost function, why hasn't the global cost changed?

The answer is that we have trained the global cost a little bit, but not enough to see a change with only 10000 shots. To see the effect, we would need to increase the number of shots to an unreasonable amount. Instead, we can make the simulator analytic by setting shots to None, which gives us the exact result.

_dev = qml.device("lightning.qubit", wires=wires, shots=None)
global_circuit = qml.QNode(global_cost_simple, _dev, interface="autograd")
print(
    "Current cost: "
    + str(cost_global(params_local))
    + ".\nInitial cost: "
    + str(cost_global([[3.0 for i in range(wires)], [0 for i in range(wires)]]))
    + ".\nDifference: "
    + str(
        cost_global([[3.0 for i in range(wires)], [0 for i in range(wires)]])
        - cost_global(params_local)
    )
)
Current cost: 0.9999999999971548.
Initial cost: 0.9999999999999843.
Difference: 2.829514400559674e-12

Our circuit has definitely been trained, but not by a useful amount. If we attempted to use this circuit, it would behave almost exactly as if we had never trained it at all. Furthermore, if we now attempt to train the global cost function, we are still firmly in the plateau region. In order to fully train the global circuit, we will need to increase the locality (the number of qubits included in the cost function) gradually as we train.

def tunable_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    for i in range(wires - 1):
        qml.CNOT([i, i + 1])
    # measure the first `locality` qubits; `locality` is a global variable updated during training
    return qml.probs(range(locality))

def cost_tunable(rotations):
    return 1 - tunable_circuit(rotations)[0]

tunable_circuit = qml.QNode(tunable_cost_simple, dev, interface="autograd")
locality = 2
params_tunable = params_local
fig, ax = qml.draw_mpl(tunable_circuit, decimals=2)(params_tunable)
plt.show()
print(cost_tunable(params_tunable))

locality = 2
opt = qml.GradientDescentOptimizer(stepsize=0.1)
steps = 600
for i in range(steps):
    # update the circuit parameters
    params_tunable = opt.step(cost_tunable, params_tunable)

    runCost = cost_tunable(params_tunable)
    if (i + 1) % 10 == 0:
        print(
            "Cost after step {:5d}: {: .7f}".format(i + 1, runCost)
            + ". Locality: "
            + str(locality)
        )

    if runCost < 0.1 and locality < wires:
        print("---Switching Locality---")
        locality += 1
        continue
    elif runCost < 0.1 and locality >= wires:
        break
fig, ax = qml.draw_mpl(tunable_circuit, decimals=2)(params_tunable)
plt.show()
0.9953
Cost after step    10:  0.9891000. Locality: 2
Cost after step    20:  0.9693000. Locality: 2
Cost after step    30:  0.9282000. Locality: 2
Cost after step    40:  0.8390000. Locality: 2
Cost after step    50:  0.6754000. Locality: 2
Cost after step    60:  0.4344000. Locality: 2
Cost after step    70:  0.2155000. Locality: 2
Cost after step    80:  0.0911000. Locality: 2
---Switching Locality---
Cost after step    90:  0.9889000. Locality: 3
Cost after step   100:  0.9695000. Locality: 3
Cost after step   110:  0.9394000. Locality: 3
Cost after step   120:  0.8542000. Locality: 3
Cost after step   130:  0.6997000. Locality: 3
Cost after step   140:  0.4731000. Locality: 3
Cost after step   150:  0.2610000. Locality: 3
Cost after step   160:  0.1110000. Locality: 3
---Switching Locality---
Cost after step   170:  0.9907000. Locality: 4
Cost after step   180:  0.9781000. Locality: 4
Cost after step   190:  0.9505000. Locality: 4
Cost after step   200:  0.8807000. Locality: 4
Cost after step   210:  0.7427000. Locality: 4
Cost after step   220:  0.5386000. Locality: 4
Cost after step   230:  0.3042000. Locality: 4
Cost after step   240:  0.1419000. Locality: 4
---Switching Locality---
Cost after step   250:  0.9915000. Locality: 5
Cost after step   260:  0.9812000. Locality: 5
Cost after step   270:  0.9561000. Locality: 5
Cost after step   280:  0.9006000. Locality: 5
Cost after step   290:  0.7869000. Locality: 5
Cost after step   300:  0.5916000. Locality: 5
Cost after step   310:  0.3537000. Locality: 5
Cost after step   320:  0.1672000. Locality: 5
---Switching Locality---
Cost after step   330:  0.9956000. Locality: 6
Cost after step   340:  0.9863000. Locality: 6
Cost after step   350:  0.9671000. Locality: 6
Cost after step   360:  0.9211000. Locality: 6
Cost after step   370:  0.8242000. Locality: 6
Cost after step   380:  0.6524000. Locality: 6
Cost after step   390:  0.4139000. Locality: 6
Cost after step   400:  0.2184000. Locality: 6

A more thorough analysis

Now the circuit can be trained, even though we started from a point where the global cost function has a barren plateau. As we will see below, this tunable-locality strategy lets us train successfully from every starting location we sample in this example.

But, how often does this problem occur? If we wanted to train this circuit from a random starting point, how often would we be stuck in a plateau? To investigate this, let’s attempt to train the global cost function using random starting positions and count how many times we run into a barren plateau.

Let's use a number of qubits we are more likely to use in a real variational circuit: \(n = 8\). We will say that, after 400 steps, any run with a cost function of less than 0.9 (chosen arbitrarily) is probably trainable given more time. Any run with a greater cost function is probably stuck in a plateau.

This may take up to 15 minutes.

samples = 10
plateau = 0
trained = 0
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 400
wires = 8

dev = qml.device("lightning.qubit", wires=wires, shots=10000)
global_circuit = qml.QNode(global_cost_simple, dev, interface="autograd")

for runs in range(samples):
    print("--- New run! ---")
    has_been_trained = False

    params_global = np.random.uniform(-np.pi, np.pi, (2, wires), requires_grad=True)

    for i in range(steps):
        # update the circuit parameters
        params_global = opt.step(cost_global, params_global)

        if (i + 1) % 20 == 0:
            print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_global(params_global)))
        if cost_global(params_global) < 0.9:
            has_been_trained = True
            break
    if has_been_trained:
        trained = trained + 1
    else:
        plateau = plateau + 1
    print("Trained: {:5d}".format(trained))
    print("Plateau'd: {:5d}".format(plateau))


samples = 10
plateau = 0
trained = 0
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 400
wires = 8

dev = qml.device("lightning.qubit", wires=wires, shots=10000)
tunable_circuit = qml.QNode(tunable_cost_simple, dev, interface="autograd")

for runs in range(samples):
    locality = 1
    print("--- New run! ---")
    has_been_trained = False

    params_tunable = np.random.uniform(-np.pi, np.pi, (2, wires), requires_grad=True)
    for i in range(steps):
        # update the circuit parameters
        params_tunable = opt.step(cost_tunable, params_tunable)

        runCost = cost_tunable(params_tunable)
        if (i + 1) % 10 == 0:
            print(
                "Cost after step {:5d}: {: .7f}".format(i + 1, runCost)
                + ". Locality: "
                + str(locality)
            )

        if runCost < 0.5 and locality < wires:
            print("---Switching Locality---")
            locality += 1
            continue
        elif runCost < 0.1 and locality >= wires:
            trained = trained + 1
            has_been_trained = True
            break
    if not has_been_trained:
        plateau = plateau + 1
    print("Trained: {:5d}".format(trained))
    print("Plateau'd: {:5d}".format(plateau))
--- New run! ---
Cost after step    20:  1.0000000
Cost after step    40:  1.0000000
Cost after step    60:  1.0000000
Cost after step    80:  1.0000000
Cost after step   100:  1.0000000
Cost after step   120:  1.0000000
Cost after step   140:  1.0000000
Cost after step   160:  1.0000000
Cost after step   180:  1.0000000
Cost after step   200:  1.0000000
Cost after step   220:  1.0000000
Cost after step   240:  1.0000000
Cost after step   260:  1.0000000
Cost after step   280:  1.0000000
Cost after step   300:  0.9999000
Cost after step   320:  0.9999000
Cost after step   340:  1.0000000
Cost after step   360:  1.0000000
Cost after step   380:  1.0000000
Cost after step   400:  0.9999000
Trained:     0
Plateau'd:     1
--- New run! ---
Cost after step    20:  0.9997000
Cost after step    40:  0.9999000
Cost after step    60:  1.0000000
Cost after step    80:  1.0000000
Cost after step   100:  0.9999000
Cost after step   120:  1.0000000
Cost after step   140:  0.9998000
Cost after step   160:  1.0000000
Cost after step   180:  0.9999000
Cost after step   200:  0.9997000
Cost after step   220:  0.9997000
Cost after step   240:  0.9998000
Cost after step   260:  0.9998000
Cost after step   280:  1.0000000
Cost after step   300:  0.9998000
Cost after step   320:  0.9996000
Cost after step   340:  0.9999000
Cost after step   360:  0.9998000
Cost after step   380:  0.9998000
Cost after step   400:  0.9999000
Trained:     0
Plateau'd:     2
--- New run! ---
Cost after step    20:  0.9695000
Cost after step    40:  0.9602000
Cost after step    60:  0.9465000
Cost after step    80:  0.9182000
Trained:     1
Plateau'd:     2
--- New run! ---
Cost after step    20:  0.9996000
Cost after step    40:  1.0000000
Cost after step    60:  0.9999000
Cost after step    80:  0.9995000
Cost after step   100:  0.9999000
Cost after step   120:  0.9994000
Cost after step   140:  0.9998000
Cost after step   160:  0.9996000
Cost after step   180:  0.9996000
Cost after step   200:  0.9995000
Cost after step   220:  0.9995000
Cost after step   240:  0.9997000
Cost after step   260:  0.9998000
Cost after step   280:  0.9999000
Cost after step   300:  0.9998000
Cost after step   320:  0.9993000
Cost after step   340:  0.9995000
Cost after step   360:  0.9996000
Cost after step   380:  0.9993000
Cost after step   400:  0.9994000
Trained:     1
Plateau'd:     3
--- New run! ---
Cost after step    20:  0.9998000
Cost after step    40:  0.9999000
Cost after step    60:  0.9998000
Cost after step    80:  0.9998000
Cost after step   100:  1.0000000
Cost after step   120:  0.9999000
Cost after step   140:  0.9998000
Cost after step   160:  0.9995000
Cost after step   180:  0.9999000
Cost after step   200:  0.9998000
Cost after step   220:  0.9995000
Cost after step   240:  0.9999000
Cost after step   260:  0.9998000
Cost after step   280:  0.9996000
Cost after step   300:  0.9997000
Cost after step   320:  1.0000000
Cost after step   340:  0.9999000
Cost after step   360:  0.9998000
Cost after step   380:  0.9996000
Cost after step   400:  0.9997000
Trained:     1
Plateau'd:     4
--- New run! ---
Cost after step    20:  0.9996000
Cost after step    40:  0.9998000
Cost after step    60:  0.9999000
Cost after step    80:  0.9997000
Cost after step   100:  0.9999000
Cost after step   120:  1.0000000
Cost after step   140:  0.9999000
Cost after step   160:  1.0000000
Cost after step   180:  1.0000000
Cost after step   200:  0.9999000
Cost after step   220:  0.9999000
Cost after step   240:  0.9999000
Cost after step   260:  1.0000000
Cost after step   280:  0.9999000
Cost after step   300:  0.9999000
Cost after step   320:  1.0000000
Cost after step   340:  0.9998000
Cost after step   360:  0.9999000
Cost after step   380:  0.9999000
Cost after step   400:  0.9997000
Trained:     1
Plateau'd:     5
--- New run! ---
Cost after step    20:  0.9987000
Cost after step    40:  0.9976000
Cost after step    60:  0.9964000
Cost after step    80:  0.9970000
Cost after step   100:  0.9977000
Cost after step   120:  0.9959000
Cost after step   140:  0.9955000
Cost after step   160:  0.9938000
Cost after step   180:  0.9949000
Cost after step   200:  0.9937000
Cost after step   220:  0.9934000
Cost after step   240:  0.9896000
Cost after step   260:  0.9900000
Cost after step   280:  0.9848000
Cost after step   300:  0.9802000
Cost after step   320:  0.9804000
Cost after step   340:  0.9695000
Cost after step   360:  0.9536000
Cost after step   380:  0.9280000
Trained:     2
Plateau'd:     5
--- New run! ---
Cost after step    20:  0.9996000
Cost after step    40:  0.9997000
Cost after step    60:  0.9999000
Cost after step    80:  0.9996000
Cost after step   100:  0.9998000
Cost after step   120:  0.9997000
Cost after step   140:  0.9999000
Cost after step   160:  1.0000000
Cost after step   180:  0.9998000
Cost after step   200:  0.9995000
Cost after step   220:  0.9999000
Cost after step   240:  1.0000000
Cost after step   260:  0.9998000
Cost after step   280:  1.0000000
Cost after step   300:  0.9997000
Cost after step   320:  0.9997000
Cost after step   340:  0.9997000
Cost after step   360:  1.0000000
Cost after step   380:  0.9997000
Cost after step   400:  0.9998000
Trained:     2
Plateau'd:     6
--- New run! ---
Cost after step    20:  0.9998000
Cost after step    40:  0.9996000
Cost after step    60:  0.9992000
Cost after step    80:  0.9997000
Cost after step   100:  0.9995000
Cost after step   120:  0.9992000
Cost after step   140:  0.9993000
Cost after step   160:  0.9992000
Cost after step   180:  0.9987000
Cost after step   200:  0.9996000
Cost after step   220:  0.9991000
Cost after step   240:  0.9991000
Cost after step   260:  0.9986000
Cost after step   280:  0.9985000
Cost after step   300:  0.9979000
Cost after step   320:  0.9986000
Cost after step   340:  0.9982000
Cost after step   360:  0.9988000
Cost after step   380:  0.9978000
Cost after step   400:  0.9980000
Trained:     2
Plateau'd:     7
--- New run! ---
Cost after step    20:  0.9958000
Cost after step    40:  0.9941000
Cost after step    60:  0.9891000
Cost after step    80:  0.9852000
Cost after step   100:  0.9750000
Cost after step   120:  0.9596000
Cost after step   140:  0.9232000
Trained:     3
Plateau'd:     7
--- New run! ---
Cost after step    10:  0.6276000. Locality: 1
---Switching Locality---
Cost after step    20:  0.5827000. Locality: 2
---Switching Locality---
Cost after step    30:  0.5426000. Locality: 3
---Switching Locality---
Cost after step    40:  0.5542000. Locality: 4
---Switching Locality---
Cost after step    50:  0.6839000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step    60:  0.8324000. Locality: 7
Cost after step    70:  0.6394000. Locality: 7
---Switching Locality---
Cost after step    80:  0.3874000. Locality: 8
Cost after step    90:  0.0986000. Locality: 8
Trained:     1
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.5067000. Locality: 4
---Switching Locality---
Cost after step    20:  0.5451000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step    30:  0.7808000. Locality: 7
Cost after step    40:  0.6740000. Locality: 7
Cost after step    50:  0.5561000. Locality: 7
---Switching Locality---
Cost after step    60:  0.4887000. Locality: 8
Cost after step    70:  0.2089000. Locality: 8
Trained:     2
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.7037000. Locality: 4
Cost after step    20:  0.5492000. Locality: 4
---Switching Locality---
Cost after step    30:  0.8289000. Locality: 5
Cost after step    40:  0.5817000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step    50:  0.8410000. Locality: 7
Cost after step    60:  0.6219000. Locality: 7
---Switching Locality---
Cost after step    70:  0.7283000. Locality: 8
Cost after step    80:  0.4918000. Locality: 8
Cost after step    90:  0.1767000. Locality: 8
Trained:     3
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.7776000. Locality: 3
Cost after step    20:  0.5554000. Locality: 3
---Switching Locality---
Cost after step    30:  0.6237000. Locality: 4
---Switching Locality---
Cost after step    40:  0.6767000. Locality: 5
Cost after step    50:  0.4903000. Locality: 5
---Switching Locality---
Cost after step    60:  0.6747000. Locality: 6
Cost after step    70:  0.5928000. Locality: 6
Cost after step    80:  0.5447000. Locality: 6
Cost after step    90:  0.4935000. Locality: 6
---Switching Locality---
Cost after step   100:  0.7910000. Locality: 7
Cost after step   110:  0.6156000. Locality: 7
---Switching Locality---
Cost after step   120:  0.8165000. Locality: 8
Cost after step   130:  0.6776000. Locality: 8
Cost after step   140:  0.4852000. Locality: 8
Cost after step   150:  0.1939000. Locality: 8
Trained:     4
Plateau'd:     0
--- New run! ---
---Switching Locality---
Cost after step    10:  0.8457000. Locality: 2
Cost after step    20:  0.6231000. Locality: 2
---Switching Locality---
Cost after step    30:  0.7126000. Locality: 3
Cost after step    40:  0.5408000. Locality: 3
---Switching Locality---
Cost after step    50:  0.7150000. Locality: 4
---Switching Locality---
Cost after step    60:  0.9484000. Locality: 5
Cost after step    70:  0.8675000. Locality: 5
Cost after step    80:  0.7424000. Locality: 5
Cost after step    90:  0.5702000. Locality: 5
---Switching Locality---
Cost after step   100:  0.9808000. Locality: 6
Cost after step   110:  0.9457000. Locality: 6
Cost after step   120:  0.8652000. Locality: 6
Cost after step   130:  0.6719000. Locality: 6
---Switching Locality---
Cost after step   140:  0.5561000. Locality: 7
---Switching Locality---
Cost after step   150:  0.5357000. Locality: 8
Cost after step   160:  0.2158000. Locality: 8
Trained:     5
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.8391000. Locality: 3
Cost after step    20:  0.6534000. Locality: 3
---Switching Locality---
Cost after step    30:  0.9679000. Locality: 4
Cost after step    40:  0.9102000. Locality: 4
Cost after step    50:  0.7977000. Locality: 4
Cost after step    60:  0.5852000. Locality: 4
---Switching Locality---
---Switching Locality---
Cost after step    70:  0.7308000. Locality: 6
Cost after step    80:  0.5911000. Locality: 6
---Switching Locality---
Cost after step    90:  0.5643000. Locality: 7
---Switching Locality---
Cost after step   100:  0.8318000. Locality: 8
Cost after step   110:  0.7003000. Locality: 8
Cost after step   120:  0.5667000. Locality: 8
Cost after step   130:  0.3821000. Locality: 8
Cost after step   140:  0.1264000. Locality: 8
Trained:     6
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.6272000. Locality: 3
---Switching Locality---
Cost after step    20:  0.7447000. Locality: 4
---Switching Locality---
Cost after step    30:  0.5708000. Locality: 5
---Switching Locality---
Cost after step    40:  0.5962000. Locality: 6
---Switching Locality---
---Switching Locality---
Cost after step    50:  0.6477000. Locality: 8
Cost after step    60:  0.2926000. Locality: 8
Trained:     7
Plateau'd:     0
--- New run! ---
---Switching Locality---
Cost after step    10:  0.8113000. Locality: 2
Cost after step    20:  0.5471000. Locality: 2
---Switching Locality---
---Switching Locality---
Cost after step    30:  0.7613000. Locality: 4
Cost after step    40:  0.4689000. Locality: 4
---Switching Locality---
Cost after step    50:  0.5644000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step    60:  0.9028000. Locality: 7
Cost after step    70:  0.7716000. Locality: 7
Cost after step    80:  0.4803000. Locality: 7
---Switching Locality---
Cost after step    90:  0.1707000. Locality: 8
Trained:     8
Plateau'd:     0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step    10:  0.7504000. Locality: 3
---Switching Locality---
Cost after step    20:  0.5293000. Locality: 4
---Switching Locality---
Cost after step    30:  0.6625000. Locality: 5
Cost after step    40:  0.5834000. Locality: 5
Cost after step    50:  0.5361000. Locality: 5
---Switching Locality---
Cost after step    60:  0.6729000. Locality: 6
Cost after step    70:  0.5219000. Locality: 6
---Switching Locality---
---Switching Locality---
Cost after step    80:  0.7785000. Locality: 8
Cost after step    90:  0.5859000. Locality: 8
Cost after step   100:  0.2681000. Locality: 8
Trained:     9
Plateau'd:     0
--- New run! ---
---Switching Locality---
Cost after step    10:  0.6859000. Locality: 2
Cost after step    20:  0.5361000. Locality: 2
---Switching Locality---
Cost after step    30:  0.7822000. Locality: 3
Cost after step    40:  0.5473000. Locality: 3
---Switching Locality---
Cost after step    50:  0.5477000. Locality: 4
---Switching Locality---
Cost after step    60:  0.9658000. Locality: 5
Cost after step    70:  0.9145000. Locality: 5
Cost after step    80:  0.7916000. Locality: 5
Cost after step    90:  0.5346000. Locality: 5
---Switching Locality---
Cost after step   100:  0.6039000. Locality: 6
---Switching Locality---
Cost after step   110:  0.7382000. Locality: 7
Cost after step   120:  0.6301000. Locality: 7
Cost after step   130:  0.5456000. Locality: 7
---Switching Locality---
Cost after step   140:  0.7332000. Locality: 8
Cost after step   150:  0.6814000. Locality: 8
Cost after step   160:  0.5575000. Locality: 8
Cost after step   170:  0.2818000. Locality: 8
Trained:    10
Plateau'd:     0

In the global case, roughly 70-80% of starting positions are untrainable (7 of the 10 runs above plateaued), a significant fraction. It is likely that this fraction will increase as the complexity of our ansatz and the number of qubits increase.

We can compare that to our tunable local cost function, where every single run trained, and most did so in fewer steps. While these examples are simple, this local-versus-global cost behaviour has been shown to extend to more complex problems [1].

References

[1] Cerezo, M., Sone, A., Volkoff, T., Cincio, L., and Coles, P. (2020). Cost-Function-Dependent Barren Plateaus in Shallow Quantum Neural Networks. arXiv:2001.00550.

About the author

Thomas Storwick

Thomas is a graduate student at the University of Waterloo, studying 2-dimensional materials. He is also the co-creator of bloqit, an open source game for building quantum intuition in a fun and competitive way.

Total running time of the script: (5 minutes 20.430 seconds)
