# Alleviating barren plateaus with local cost functions

*Author: Thomas Storwick (tstorwick@gmail.com). Posted: 9 Sep 2020. Last updated: 28 Jan 2021.*

## Barren Plateaus

Barren plateaus are large regions of the cost function’s parameter space where the variance of the gradient is almost 0; or, put another way, the cost function landscape is flat. This means that a variational circuit initialized in one of these areas will be untrainable using any gradient-based algorithm.

In “Cost-Function-Dependent Barren Plateaus in Shallow Quantum Neural
Networks” [1], Cerezo et al. show that the barren plateau
phenomenon can, under some circumstances, be avoided by using cost functions
that only use information from part of the circuit. These *local* cost
functions can be more robust against noise, and may have better-behaved
gradients with no plateaus for shallow circuits.

Many variational quantum algorithms are constructed to use global cost functions. Information from the entire measurement is used to analyze the result of the circuit, and a cost function is calculated from this to quantify the circuit’s performance. A local cost function only considers information from a few qubits, and attempts to analyze the behavior of the entire circuit from this limited scope.

Cerezo et al. also handily prove that these local cost functions are bounded by the global ones, i.e., if a global cost function is formulated in the manner described by Cerezo et al., then the value of its corresponding local cost function will always be less than or equal to the value of the global cost function.
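
For the product-state ansatz we build below, this bound is easy to check numerically: the probability of measuring the all-zero state is the product of the single-qubit zero-state probabilities, and a product of numbers in \([0, 1]\) can never exceed their mean. Here is a quick sanity check, a sketch in plain NumPy (independent of any quantum library) rather than the paper's general proof:

```python
import numpy as np

rng = np.random.default_rng(1)

# For p_i in [0, 1]: prod(p) <= min(p) <= mean(p), hence
# C_local = 1 - mean(p) <= 1 - prod(p) = C_global (product-state case).
for _ in range(1000):
    p = rng.uniform(0, 1, size=6)  # hypothetical single-qubit |0> probabilities
    assert 1 - p.mean() <= 1 - p.prod() + 1e-12

print("local cost never exceeded global cost")
```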

In this notebook, we investigate the effect of barren plateaus in variational quantum algorithms, and how they can be mitigated using local cost functions.

We first need to import the following modules.

```
import pennylane as qml
from pennylane import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LinearLocator, FormatStrFormatter
np.random.seed(42)
```

## Visualizing the problem

To start, let’s look at the task of learning the identity gate across multiple qubits. This will help us visualize the problem and get a sense of what is happening in the cost landscape.

First we define a number of wires we want to train on. The work by Cerezo et al. shows that circuits are trainable under certain regimes, so the number of qubits we train on will affect our results.

```
wires = 6
dev = qml.device("default.qubit", wires=wires, shots=10000, analytic=False)
```

Next, we want to define our QNodes and our circuit ansatz. For this simple example, an ansatz that works well is simply a rotation along X, and a rotation along Y, repeated across all the qubits.

We will also define our cost functions here. Since we are trying to learn the identity gate, a natural cost function is 1 minus the probability of measuring the zero state, denoted here as \(1 - p_{|0\rangle}\).

We will apply this across all qubits for our global cost function, i.e.,

\[C_{global} = 1 - p_{|00\cdots 0\rangle},\]

and for the local cost function, we will sum the individual contributions from each qubit:

\[C_{local} = 1 - \frac{1}{n}\sum_{i=1}^{n} p_{|0\rangle_i}.\]

It may now be clear to some readers why this function can perform better: by formulating the local cost function this way, we have essentially divided the problem into multiple single-qubit terms and summed up the results.
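
We can see the flip side of this with a small NumPy experiment. This is only an illustrative sketch, not the paper's formal gradient-variance argument, and it assumes the closed form \(p_{|0\rangle}(x, y) = \tfrac{1}{2}(1 + \cos x \cos y)\) for a single qubit after \(RX(x)\,RY(y)\), which you can verify by hand. Sampling random parameters, the spread of the global cost collapses exponentially fast toward zero as qubits are added (the landscape flattens), while the local cost's spread shrinks only polynomially:

```python
import numpy as np

# Closed-form single-qubit probability of |0> after RX(x) RY(y) on |0>
# (derived by hand; stands in for a simulator in this sketch):
def p0(x, y):
    return 0.5 * (1 + np.cos(x) * np.cos(y))

rng = np.random.default_rng(42)
for n in (2, 6, 12):
    x, y = rng.uniform(-np.pi, np.pi, size=(2, 50_000, n))
    probs = p0(x, y)                      # shape (samples, n)
    global_cost = 1 - probs.prod(axis=1)  # product over all qubits
    local_cost = 1 - probs.mean(axis=1)   # average over qubits
    print(f"n={n:2d}  global std={global_cost.std():.4f}  local std={local_cost.std():.4f}")
```

At \(n = 12\) the global cost's standard deviation is already three orders of magnitude below the local one: almost every random point sits on the plateau where the global cost is indistinguishable from 1.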

To implement this, we will define a separate QNode for the local cost function and the global cost function.

```
def global_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    return qml.probs(wires=range(wires))


def local_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    return [qml.probs(wires=i) for i in range(wires)]


global_circuit = qml.QNode(global_cost_simple, dev)
local_circuit = qml.QNode(local_cost_simple, dev)


def cost_local(rotations):
    return 1 - np.sum(local_circuit(rotations)[:, 0]) / wires


def cost_global(rotations):
    return 1 - global_circuit(rotations)[0]
```

To analyze each of the circuits, we provide random initial parameters: a single random RX angle and a single random RY angle, repeated across all qubits.

```
RX = np.random.uniform(low=-np.pi, high=np.pi)
RY = np.random.uniform(low=-np.pi, high=np.pi)
rotations = [[RX for i in range(wires)], [RY for i in range(wires)]]
```

Examining the results:

```
print("Global Cost: {: .7f}".format(cost_global(rotations)))
print("Local Cost: {: .7f}".format(cost_local(rotations)))
print("--- Global Circuit ---")
print(global_circuit.draw())
print("--- Local Circuit ---")
print(local_circuit.draw())
```

Out:

```
Global Cost: 0.9999000
Local Cost: 0.8373000
--- Global Circuit ---
0: ──RX(-0.788)──RY(2.83)──╭┤ Probs
1: ──RX(-0.788)──RY(2.83)──├┤ Probs
2: ──RX(-0.788)──RY(2.83)──├┤ Probs
3: ──RX(-0.788)──RY(2.83)──├┤ Probs
4: ──RX(-0.788)──RY(2.83)──├┤ Probs
5: ──RX(-0.788)──RY(2.83)──╰┤ Probs
--- Local Circuit ---
0: ──RX(-0.788)──RY(2.83)──┤ Probs
1: ──RX(-0.788)──RY(2.83)──┤ Probs
2: ──RX(-0.788)──RY(2.83)──┤ Probs
3: ──RX(-0.788)──RY(2.83)──┤ Probs
4: ──RX(-0.788)──RY(2.83)──┤ Probs
5: ──RX(-0.788)──RY(2.83)──┤ Probs
```

With this simple example, we can visualize the cost function, and see the barren plateau effect graphically. Although there are \(2n\) (where \(n\) is the number of qubits) parameters, in order to plot the cost landscape we must constrain ourselves. We will consider the case where all X rotations have the same value, and all the Y rotations have the same value.

First, we look at the global cost function. When plotting the cost function across 6 qubits, much of the cost landscape is flat and difficult to train (even with a circuit depth of only 2!). This effect worsens as the number of qubits increases.

```
def generate_surface(cost_function):
    Z = []
    Z_assembler = []
    X = np.arange(-np.pi, np.pi, 0.25)
    Y = np.arange(-np.pi, np.pi, 0.25)
    X, Y = np.meshgrid(X, Y)
    for x in X[0, :]:
        for y in Y[:, 0]:
            rotations = [[x for i in range(wires)], [y for i in range(wires)]]
            Z_assembler.append(cost_function(rotations))
        Z.append(Z_assembler)
        Z_assembler = []
    Z = np.asarray(Z)
    return Z


def plot_surface(surface):
    X = np.arange(-np.pi, np.pi, 0.25)
    Y = np.arange(-np.pi, np.pi, 0.25)
    X, Y = np.meshgrid(X, Y)
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    surf = ax.plot_surface(X, Y, surface, cmap="viridis", linewidth=0, antialiased=False)
    ax.set_zlim(0, 1)
    ax.zaxis.set_major_locator(LinearLocator(10))
    ax.zaxis.set_major_formatter(FormatStrFormatter("%.02f"))
    plt.show()


global_surface = generate_surface(cost_global)
plot_surface(global_surface)
```

However, when we change to the local cost function, the cost landscape becomes much more trainable as the size of the barren plateau decreases.

```
local_surface = generate_surface(cost_local)
plot_surface(local_surface)
```

Those are some nice pictures, but how do they reflect actual trainability? Let us try training both the local and global cost functions. To simplify this model, let's modify our cost function from

\[C_{local} = 1 - \frac{1}{n}\sum_{i=1}^{n} p_{|0\rangle_i},\]

where we sum the marginal probabilities of each qubit, to

\[C_{local} = 1 - p_{|0\rangle_0},\]

where we only consider the probability of a single qubit (the first one) to be in the zero state.

While we’re at it, let us make our ansatz a little more like one we would encounter while trying to solve a VQE problem, and add entanglement.

```
def global_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    qml.broadcast(qml.CNOT, wires=range(wires), pattern="chain")
    return qml.probs(wires=range(wires))


def local_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    qml.broadcast(qml.CNOT, wires=range(wires), pattern="chain")
    return qml.probs(wires=[0])


global_circuit = qml.QNode(global_cost_simple, dev)
local_circuit = qml.QNode(local_cost_simple, dev)


def cost_local(rotations):
    return 1 - local_circuit(rotations)[0]


def cost_global(rotations):
    return 1 - global_circuit(rotations)[0]
```

Of course, now that we’ve changed both our cost function and our circuit, we will need to scan the cost landscape again.

```
global_surface = generate_surface(cost_global)
plot_surface(global_surface)
local_surface = generate_surface(cost_local)
plot_surface(local_surface)
```

It seems our changes didn't significantly alter the overall cost landscape. This probably isn't a general trend, but it is a nice surprise. Now, let us get back to training the local and global cost functions. Since we have a visualization of the total cost landscape, let's pick a point that exaggerates the problem. One of the worst points in the landscape is near \((\pi, 0)\), right in the middle of the plateau, so we will start there.

```
rotations = np.array([[3.0] * wires, [0.0] * wires])
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 100
params_global = rotations
for i in range(steps):
    # update the circuit parameters
    params_global = opt.step(cost_global, params_global)
    if (i + 1) % 5 == 0:
        print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_global(params_global)))
    if cost_global(params_global) < 0.1:
        break
print(global_circuit.draw())
```

Out:

```
Cost after step 5: 1.0000000
Cost after step 10: 1.0000000
Cost after step 15: 1.0000000
Cost after step 20: 1.0000000
Cost after step 25: 1.0000000
Cost after step 30: 1.0000000
Cost after step 35: 1.0000000
Cost after step 40: 1.0000000
Cost after step 45: 1.0000000
Cost after step 50: 1.0000000
Cost after step 55: 1.0000000
Cost after step 60: 1.0000000
Cost after step 65: 1.0000000
Cost after step 70: 1.0000000
Cost after step 75: 1.0000000
Cost after step 80: 1.0000000
Cost after step 85: 1.0000000
Cost after step 90: 1.0000000
Cost after step 95: 1.0000000
Cost after step 100: 1.0000000
0: ──RX(3)──RY(0)──╭C──────────────────╭┤ Probs
1: ──RX(3)──RY(0)──╰X──╭C──────────────├┤ Probs
2: ──RX(3)──RY(0)──────╰X──╭C──────────├┤ Probs
3: ──RX(3)──RY(0)──────────╰X──╭C──────├┤ Probs
4: ──RX(3)──RY(0)──────────────╰X──╭C──├┤ Probs
5: ──RX(3)──RY(0)──────────────────╰X──╰┤ Probs
```

After 100 steps, the cost function is still exactly 1. Clearly we are in an “untrainable” area. Now, let us limit ourselves to the local cost function and see how it performs.

```
rotations = np.array([[3.0 for i in range(wires)], [0.0 for i in range(wires)]])
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 100
params_local = rotations
for i in range(steps):
    # update the circuit parameters
    params_local = opt.step(cost_local, params_local)
    if (i + 1) % 5 == 0:
        print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_local(params_local)))
    if cost_local(params_local) < 0.05:
        break
print(local_circuit.draw())
```

Out:

```
Cost after step 5: 0.9871000
Cost after step 10: 0.9651000
Cost after step 15: 0.9173000
Cost after step 20: 0.8059000
Cost after step 25: 0.6213000
Cost after step 30: 0.3703000
Cost after step 35: 0.1821000
Cost after step 40: 0.0684000
0: ──RX(0.44)──RY(0.00146)──╭C──────────────────┤ Probs
1: ──RX(2.99)──RY(-4e-05)───╰X──╭C──────────────┤
2: ──RX(3)─────RY(0)────────────╰X──╭C──────────┤
3: ──RX(3)─────RY(0)────────────────╰X──╭C──────┤
4: ──RX(3)─────RY(0)────────────────────╰X──╭C──┤
5: ──RX(3)─────RY(0)────────────────────────╰X──┤
```

It trained, and much faster than the global case! However, we know the local cost function is bounded by the global one, so just how much has the global cost improved?

```
print(cost_global(params_local))
```

Out:

```
1.0
```

Interestingly, the global cost function is still 1. If we trained the local cost function, why hasn’t the global cost function changed?

The answer is that we *have* trained the global cost a little bit, but
not enough to see a change with only 10,000 shots. To see the effect,
we would need to increase the number of shots to an unreasonable amount.
Instead, making the device analytic gives us the exact representation.
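
As a rough back-of-the-envelope check (this shot-count heuristic is our own assumption, not part of the tutorial's code): to resolve a probability of order \(p\) by sampling, we need on the order of \(1/p\) shots before we even expect to observe a single all-zero outcome. For an improvement of roughly \(3\times10^{-12}\), as measured below, that is hundreds of billions of shots:

```python
import math

# Hypothetical estimate: observing an outcome of probability p requires
# on the order of 1/p shots in expectation.
p_improvement = 3.3e-12  # order of the trained change in p(|00...0>)
shots_needed = math.ceil(1 / p_improvement)
print(f"roughly {shots_needed:.1e} shots needed")
```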

```
dev.analytic = True
global_circuit = qml.QNode(global_cost_simple, dev)
print(
    "Current cost: "
    + str(cost_global(params_local))
    + ".\nInitial cost: "
    + str(cost_global([[3.0 for i in range(wires)], [0 for i in range(wires)]]))
    + ".\nDifference: "
    + str(
        cost_global([[3.0 for i in range(wires)], [0 for i in range(wires)]])
        - cost_global(params_local)
    )
)
```

Out:

```
Current cost: 0.9999999999966592.
Initial cost: 0.9999999999999843.
Difference: 3.325117958752344e-12
```

Our circuit has definitely been trained, but not by a useful amount. If we attempted to use this circuit, it would behave the same as if we had never trained it at all. Furthermore, if we now attempt to train the global cost function, we are still firmly in the plateau region. In order to fully train the global circuit, we will need to increase the locality gradually as we train.

```
def tunable_cost_simple(rotations):
    for i in range(wires):
        qml.RX(rotations[0][i], wires=i)
        qml.RY(rotations[1][i], wires=i)
    qml.broadcast(qml.CNOT, wires=range(wires), pattern="chain")
    return qml.probs(range(locality))


def cost_tunable(rotations):
    return 1 - tunable_circuit(rotations)[0]


dev.analytic = False
tunable_circuit = qml.QNode(tunable_cost_simple, dev)
locality = 2
params_tunable = params_local
print(cost_tunable(params_tunable))
print(tunable_circuit.draw())
opt = qml.GradientDescentOptimizer(stepsize=0.1)
steps = 600
for i in range(steps):
    # update the circuit parameters
    params_tunable = opt.step(cost_tunable, params_tunable)
    runCost = cost_tunable(params_tunable)
    if (i + 1) % 10 == 0:
        print(
            "Cost after step {:5d}: {: .7f}".format(i + 1, runCost)
            + ". Locality: "
            + str(locality)
        )
    if runCost < 0.1 and locality < wires:
        print("---Switching Locality---")
        locality += 1
        continue
    elif runCost < 0.1 and locality >= wires:
        break
print(tunable_circuit.draw())
```

Out:

```
0.9947
0: ──RX(0.44)──RY(0.00146)──╭C──────────────────╭┤ Probs
1: ──RX(2.99)──RY(-4e-05)───╰X──╭C──────────────╰┤ Probs
2: ──RX(3)─────RY(0)────────────╰X──╭C───────────┤
3: ──RX(3)─────RY(0)────────────────╰X──╭C───────┤
4: ──RX(3)─────RY(0)────────────────────╰X──╭C───┤
5: ──RX(3)─────RY(0)────────────────────────╰X───┤
Cost after step 10: 0.9887000. Locality: 2
Cost after step 20: 0.9694000. Locality: 2
Cost after step 30: 0.9111000. Locality: 2
Cost after step 40: 0.8119000. Locality: 2
Cost after step 50: 0.6358000. Locality: 2
Cost after step 60: 0.3875000. Locality: 2
Cost after step 70: 0.1897000. Locality: 2
---Switching Locality---
Cost after step 80: 0.9951000. Locality: 3
Cost after step 90: 0.9869000. Locality: 3
Cost after step 100: 0.9668000. Locality: 3
Cost after step 110: 0.9264000. Locality: 3
Cost after step 120: 0.8318000. Locality: 3
Cost after step 130: 0.6609000. Locality: 3
Cost after step 140: 0.4262000. Locality: 3
Cost after step 150: 0.2277000. Locality: 3
Cost after step 160: 0.0977000. Locality: 3
---Switching Locality---
Cost after step 170: 0.9885000. Locality: 4
Cost after step 180: 0.9750000. Locality: 4
Cost after step 190: 0.9377000. Locality: 4
Cost after step 200: 0.8543000. Locality: 4
Cost after step 210: 0.7118000. Locality: 4
Cost after step 220: 0.4924000. Locality: 4
Cost after step 230: 0.2623000. Locality: 4
Cost after step 240: 0.1131000. Locality: 4
---Switching Locality---
Cost after step 250: 0.9928000. Locality: 5
Cost after step 260: 0.9789000. Locality: 5
Cost after step 270: 0.9500000. Locality: 5
Cost after step 280: 0.8842000. Locality: 5
Cost after step 290: 0.7569000. Locality: 5
Cost after step 300: 0.5619000. Locality: 5
Cost after step 310: 0.3174000. Locality: 5
Cost after step 320: 0.1463000. Locality: 5
---Switching Locality---
Cost after step 330: 0.9925000. Locality: 6
Cost after step 340: 0.9848000. Locality: 6
Cost after step 350: 0.9619000. Locality: 6
Cost after step 360: 0.9085000. Locality: 6
Cost after step 370: 0.8013000. Locality: 6
Cost after step 380: 0.6221000. Locality: 6
Cost after step 390: 0.3884000. Locality: 6
Cost after step 400: 0.1932000. Locality: 6
0: ──RX(0.00087)──RY(0.00119)────╭C──────────────────╭┤ Probs
1: ──RX(0.00283)──RY(-0.000325)──╰X──╭C──────────────├┤ Probs
2: ──RX(0.0141)───RY(0.000665)───────╰X──╭C──────────├┤ Probs
3: ──RX(0.0506)───RY(-0.00338)───────────╰X──╭C──────├┤ Probs
4: ──RX(0.171)────RY(0.000395)───────────────╰X──╭C──├┤ Probs
5: ──RX(0.597)────RY(-0.0122)────────────────────╰X──╰┤ Probs
```

## A more thorough analysis

Now the circuit can be trained, even though we started from a place where the global function has a barren plateau. The significance of this is that we can now train from every starting location in this example.

But, how often does this problem occur? If we wanted to train this circuit from a random starting point, how often would we be stuck in a plateau? To investigate this, let’s attempt to train the global cost function using random starting positions and count how many times we run into a barren plateau.

Let’s use a number of qubits we are more likely to use in a real variational circuit: \(n = 8\). We will say that, after 400 steps, any run with a cost function of less than 0.9 (chosen arbitrarily) will probably be trainable given more time. Any run with a greater cost function will probably be in a plateau.

This may take up to 15 minutes.

```
samples = 10
plateau = 0
trained = 0
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 400
wires = 8
dev = qml.device("default.qubit", wires=wires, shots=10000, analytic=False)
global_circuit = qml.QNode(global_cost_simple, dev)
for runs in range(samples):
    print("--- New run! ---")
    has_been_trained = False
    params_global = [
        [np.random.uniform(-np.pi, np.pi) for i in range(wires)],
        [np.random.uniform(-np.pi, np.pi) for i in range(wires)],
    ]
    for i in range(steps):
        # update the circuit parameters
        params_global = opt.step(cost_global, params_global)
        if (i + 1) % 20 == 0:
            print("Cost after step {:5d}: {: .7f}".format(i + 1, cost_global(params_global)))
        if cost_global(params_global) < 0.9:
            has_been_trained = True
            break
    if has_been_trained:
        trained = trained + 1
    else:
        plateau = plateau + 1
    print("Trained: {:5d}".format(trained))
    print("Plateau'd: {:5d}".format(plateau))


samples = 10
plateau = 0
trained = 0
opt = qml.GradientDescentOptimizer(stepsize=0.2)
steps = 400
wires = 8
dev = qml.device("default.qubit", wires=wires, shots=10000, analytic=False)
tunable_circuit = qml.QNode(tunable_cost_simple, dev)
for runs in range(samples):
    locality = 1
    print("--- New run! ---")
    has_been_trained = False
    params_tunable = [
        [np.random.uniform(-np.pi, np.pi) for i in range(wires)],
        [np.random.uniform(-np.pi, np.pi) for i in range(wires)],
    ]
    for i in range(steps):
        # update the circuit parameters
        params_tunable = opt.step(cost_tunable, params_tunable)
        runCost = cost_tunable(params_tunable)
        if (i + 1) % 10 == 0:
            print(
                "Cost after step {:5d}: {: .7f}".format(i + 1, runCost)
                + ". Locality: "
                + str(locality)
            )
        if runCost < 0.5 and locality < wires:
            print("---Switching Locality---")
            locality += 1
            continue
        elif runCost < 0.1 and locality >= wires:
            trained = trained + 1
            has_been_trained = True
            break
    if not has_been_trained:
        plateau = plateau + 1
    print("Trained: {:5d}".format(trained))
    print("Plateau'd: {:5d}".format(plateau))
```

Out:

```
--- New run! ---
Cost after step 20: 0.9997000
Cost after step 40: 0.9997000
Cost after step 60: 0.9997000
Cost after step 80: 1.0000000
Cost after step 100: 1.0000000
Cost after step 120: 0.9998000
Cost after step 140: 0.9997000
Cost after step 160: 0.9999000
Cost after step 180: 0.9996000
Cost after step 200: 0.9998000
Cost after step 220: 0.9999000
Cost after step 240: 0.9997000
Cost after step 260: 1.0000000
Cost after step 280: 1.0000000
Cost after step 300: 0.9998000
Cost after step 320: 0.9995000
Cost after step 340: 0.9997000
Cost after step 360: 0.9995000
Cost after step 380: 0.9998000
Cost after step 400: 0.9998000
Trained: 0
Plateau'd: 1
--- New run! ---
Cost after step 20: 0.9980000
Cost after step 40: 0.9981000
Cost after step 60: 0.9989000
Cost after step 80: 0.9982000
Cost after step 100: 0.9985000
Cost after step 120: 0.9978000
Cost after step 140: 0.9981000
Cost after step 160: 0.9976000
Cost after step 180: 0.9987000
Cost after step 200: 0.9984000
Cost after step 220: 0.9980000
Cost after step 240: 0.9973000
Cost after step 260: 0.9965000
Cost after step 280: 0.9969000
Cost after step 300: 0.9964000
Cost after step 320: 0.9971000
Cost after step 340: 0.9966000
Cost after step 360: 0.9964000
Cost after step 380: 0.9959000
Cost after step 400: 0.9938000
Trained: 0
Plateau'd: 2
--- New run! ---
Cost after step 20: 0.9984000
Cost after step 40: 0.9976000
Cost after step 60: 0.9973000
Cost after step 80: 0.9969000
Cost after step 100: 0.9968000
Cost after step 120: 0.9959000
Cost after step 140: 0.9954000
Cost after step 160: 0.9945000
Cost after step 180: 0.9947000
Cost after step 200: 0.9933000
Cost after step 220: 0.9885000
Cost after step 240: 0.9849000
Cost after step 260: 0.9825000
Cost after step 280: 0.9746000
Cost after step 300: 0.9609000
Cost after step 320: 0.9293000
Trained: 1
Plateau'd: 2
--- New run! ---
Cost after step 20: 0.9961000
Cost after step 40: 0.9956000
Cost after step 60: 0.9951000
Cost after step 80: 0.9952000
Cost after step 100: 0.9960000
Cost after step 120: 0.9933000
Cost after step 140: 0.9939000
Cost after step 160: 0.9907000
Cost after step 180: 0.9926000
Cost after step 200: 0.9904000
Cost after step 220: 0.9880000
Cost after step 240: 0.9869000
Cost after step 260: 0.9875000
Cost after step 280: 0.9828000
Cost after step 300: 0.9832000
Cost after step 320: 0.9777000
Cost after step 340: 0.9711000
Cost after step 360: 0.9647000
Cost after step 380: 0.9556000
Cost after step 400: 0.9373000
Trained: 1
Plateau'd: 3
--- New run! ---
Cost after step 20: 1.0000000
Cost after step 40: 1.0000000
Cost after step 60: 0.9999000
Cost after step 80: 1.0000000
Cost after step 100: 1.0000000
Cost after step 120: 1.0000000
Cost after step 140: 1.0000000
Cost after step 160: 0.9999000
Cost after step 180: 1.0000000
Cost after step 200: 1.0000000
Cost after step 220: 0.9999000
Cost after step 240: 0.9999000
Cost after step 260: 1.0000000
Cost after step 280: 1.0000000
Cost after step 300: 1.0000000
Cost after step 320: 1.0000000
Cost after step 340: 1.0000000
Cost after step 360: 1.0000000
Cost after step 380: 1.0000000
Cost after step 400: 1.0000000
Trained: 1
Plateau'd: 4
--- New run! ---
Cost after step 20: 0.9990000
Cost after step 40: 0.9981000
Cost after step 60: 0.9980000
Cost after step 80: 0.9973000
Cost after step 100: 0.9970000
Cost after step 120: 0.9977000
Cost after step 140: 0.9972000
Cost after step 160: 0.9955000
Cost after step 180: 0.9953000
Cost after step 200: 0.9948000
Cost after step 220: 0.9935000
Cost after step 240: 0.9918000
Cost after step 260: 0.9909000
Cost after step 280: 0.9884000
Cost after step 300: 0.9860000
Cost after step 320: 0.9832000
Cost after step 340: 0.9781000
Cost after step 360: 0.9732000
Cost after step 380: 0.9669000
Cost after step 400: 0.9495000
Trained: 1
Plateau'd: 5
--- New run! ---
Cost after step 20: 0.9994000
Cost after step 40: 0.9997000
Cost after step 60: 0.9993000
Cost after step 80: 0.9994000
Cost after step 100: 0.9988000
Cost after step 120: 0.9993000
Cost after step 140: 0.9990000
Cost after step 160: 0.9992000
Cost after step 180: 0.9993000
Cost after step 200: 0.9988000
Cost after step 220: 0.9995000
Cost after step 240: 0.9988000
Cost after step 260: 0.9988000
Cost after step 280: 0.9986000
Cost after step 300: 0.9984000
Cost after step 320: 0.9992000
Cost after step 340: 0.9982000
Cost after step 360: 0.9984000
Cost after step 380: 0.9989000
Cost after step 400: 0.9984000
Trained: 1
Plateau'd: 6
--- New run! ---
Cost after step 20: 0.9814000
Cost after step 40: 0.9729000
Cost after step 60: 0.9575000
Cost after step 80: 0.9346000
Trained: 2
Plateau'd: 6
--- New run! ---
Cost after step 20: 0.9989000
Cost after step 40: 0.9995000
Cost after step 60: 0.9990000
Cost after step 80: 0.9993000
Cost after step 100: 0.9992000
Cost after step 120: 0.9992000
Cost after step 140: 0.9992000
Cost after step 160: 0.9987000
Cost after step 180: 0.9993000
Cost after step 200: 0.9993000
Cost after step 220: 0.9984000
Cost after step 240: 0.9987000
Cost after step 260: 0.9990000
Cost after step 280: 0.9982000
Cost after step 300: 0.9984000
Cost after step 320: 0.9985000
Cost after step 340: 0.9986000
Cost after step 360: 0.9981000
Cost after step 380: 0.9979000
Cost after step 400: 0.9976000
Trained: 2
Plateau'd: 7
--- New run! ---
Cost after step 20: 0.9901000
Cost after step 40: 0.9903000
Cost after step 60: 0.9893000
Cost after step 80: 0.9882000
Cost after step 100: 0.9841000
Cost after step 120: 0.9848000
Cost after step 140: 0.9804000
Cost after step 160: 0.9800000
Cost after step 180: 0.9792000
Cost after step 200: 0.9736000
Cost after step 220: 0.9700000
Cost after step 240: 0.9626000
Cost after step 260: 0.9531000
Cost after step 280: 0.9456000
Cost after step 300: 0.9271000
Cost after step 320: 0.8935000
Trained: 3
Plateau'd: 7
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step 10: 0.6818000. Locality: 3
Cost after step 20: 0.5936000. Locality: 3
Cost after step 30: 0.5349000. Locality: 3
Cost after step 40: 0.5129000. Locality: 3
---Switching Locality---
Cost after step 50: 0.8093000. Locality: 4
Cost after step 60: 0.6964000. Locality: 4
Cost after step 70: 0.6095000. Locality: 4
Cost after step 80: 0.5189000. Locality: 4
---Switching Locality---
Cost after step 90: 0.7434000. Locality: 5
Cost after step 100: 0.6862000. Locality: 5
Cost after step 110: 0.5919000. Locality: 5
---Switching Locality---
Cost after step 120: 0.7953000. Locality: 6
Cost after step 130: 0.7080000. Locality: 6
Cost after step 140: 0.6256000. Locality: 6
Cost after step 150: 0.5049000. Locality: 6
---Switching Locality---
Cost after step 160: 0.8548000. Locality: 7
Cost after step 170: 0.7156000. Locality: 7
Cost after step 180: 0.4901000. Locality: 7
---Switching Locality---
Cost after step 190: 0.5170000. Locality: 8
Cost after step 200: 0.1984000. Locality: 8
Trained: 1
Plateau'd: 0
--- New run! ---
Cost after step 10: 0.4785000. Locality: 1
---Switching Locality---
Cost after step 20: 0.4841000. Locality: 2
---Switching Locality---
Cost after step 30: 0.6351000. Locality: 3
---Switching Locality---
---Switching Locality---
Cost after step 40: 0.7132000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step 50: 0.8944000. Locality: 7
Cost after step 60: 0.7610000. Locality: 7
Cost after step 70: 0.4972000. Locality: 7
---Switching Locality---
Cost after step 80: 0.6263000. Locality: 8
Cost after step 90: 0.3527000. Locality: 8
Cost after step 100: 0.0908000. Locality: 8
Trained: 2
Plateau'd: 0
--- New run! ---
Cost after step 10: 0.5506000. Locality: 1
---Switching Locality---
Cost after step 20: 0.7588000. Locality: 2
Cost after step 30: 0.6270000. Locality: 2
Cost after step 40: 0.4958000. Locality: 2
---Switching Locality---
Cost after step 50: 0.6983000. Locality: 3
Cost after step 60: 0.5846000. Locality: 3
---Switching Locality---
Cost after step 70: 0.6953000. Locality: 4
Cost after step 80: 0.4642000. Locality: 4
---Switching Locality---
Cost after step 90: 0.7602000. Locality: 5
---Switching Locality---
Cost after step 100: 0.8020000. Locality: 6
Cost after step 110: 0.5603000. Locality: 6
---Switching Locality---
Cost after step 120: 0.6565000. Locality: 7
---Switching Locality---
Cost after step 130: 0.6687000. Locality: 8
Cost after step 140: 0.4490000. Locality: 8
Cost after step 150: 0.1722000. Locality: 8
Trained: 3
Plateau'd: 0
--- New run! ---
---Switching Locality---
Cost after step 10: 0.6595000. Locality: 2
---Switching Locality---
Cost after step 20: 0.5993000. Locality: 3
---Switching Locality---
Cost after step 30: 0.7070000. Locality: 4
Cost after step 40: 0.4787000. Locality: 4
---Switching Locality---
---Switching Locality---
Cost after step 50: 0.8938000. Locality: 6
Cost after step 60: 0.7780000. Locality: 6
Cost after step 70: 0.5802000. Locality: 6
---Switching Locality---
Cost after step 80: 0.7025000. Locality: 7
Cost after step 90: 0.6265000. Locality: 7
Cost after step 100: 0.5473000. Locality: 7
---Switching Locality---
Cost after step 110: 0.9421000. Locality: 8
Cost after step 120: 0.8700000. Locality: 8
Cost after step 130: 0.7485000. Locality: 8
Cost after step 140: 0.5564000. Locality: 8
Cost after step 150: 0.2454000. Locality: 8
Trained: 4
Plateau'd: 0
--- New run! ---
Cost after step 10: 0.4730000. Locality: 1
---Switching Locality---
Cost after step 20: 0.5685000. Locality: 2
---Switching Locality---
Cost after step 30: 0.8118000. Locality: 3
Cost after step 40: 0.5318000. Locality: 3
---Switching Locality---
---Switching Locality---
---Switching Locality---
Cost after step 50: 0.4757000. Locality: 6
---Switching Locality---
---Switching Locality---
Cost after step 60: 0.7905000. Locality: 8
Cost after step 70: 0.5000000. Locality: 8
Cost after step 80: 0.1406000. Locality: 8
Trained: 5
Plateau'd: 0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step 10: 0.4605000. Locality: 3
---Switching Locality---
Cost after step 20: 0.7569000. Locality: 4
Cost after step 30: 0.4888000. Locality: 4
---Switching Locality---
Cost after step 40: 0.7302000. Locality: 5
---Switching Locality---
Cost after step 50: 0.8086000. Locality: 6
Cost after step 60: 0.6818000. Locality: 6
Cost after step 70: 0.5580000. Locality: 6
---Switching Locality---
Cost after step 80: 0.6163000. Locality: 7
---Switching Locality---
Cost after step 90: 0.6984000. Locality: 8
Cost after step 100: 0.4357000. Locality: 8
Cost after step 110: 0.1255000. Locality: 8
Trained: 6
Plateau'd: 0
--- New run! ---
---Switching Locality---
Cost after step 10: 0.7342000. Locality: 2
Cost after step 20: 0.6722000. Locality: 2
Cost after step 30: 0.5659000. Locality: 2
---Switching Locality---
---Switching Locality---
Cost after step 40: 0.6285000. Locality: 4
---Switching Locality---
Cost after step 50: 0.6082000. Locality: 5
---Switching Locality---
Cost after step 60: 0.6759000. Locality: 6
---Switching Locality---
Cost after step 70: 0.6289000. Locality: 7
---Switching Locality---
Cost after step 80: 0.4981000. Locality: 8
Cost after step 90: 0.1649000. Locality: 8
Trained: 7
Plateau'd: 0
--- New run! ---
---Switching Locality---
Cost after step 10: 0.9897000. Locality: 2
Cost after step 20: 0.9720000. Locality: 2
Cost after step 30: 0.9319000. Locality: 2
Cost after step 40: 0.8583000. Locality: 2
Cost after step 50: 0.7400000. Locality: 2
Cost after step 60: 0.5604000. Locality: 2
---Switching Locality---
Cost after step 70: 0.5029000. Locality: 3
---Switching Locality---
---Switching Locality---
Cost after step 80: 0.8253000. Locality: 5
Cost after step 90: 0.5584000. Locality: 5
---Switching Locality---
---Switching Locality---
Cost after step 100: 0.4805000. Locality: 7
---Switching Locality---
Cost after step 110: 0.4104000. Locality: 8
Cost after step 120: 0.1146000. Locality: 8
Trained: 8
Plateau'd: 0
--- New run! ---
---Switching Locality---
---Switching Locality---
Cost after step 10: 0.6584000. Locality: 3
---Switching Locality---
Cost after step 20: 0.8869000. Locality: 4
Cost after step 30: 0.7494000. Locality: 4
Cost after step 40: 0.5086000. Locality: 4
---Switching Locality---
---Switching Locality---
Cost after step 50: 0.7283000. Locality: 6
Cost after step 60: 0.5970000. Locality: 6
---Switching Locality---
Cost after step 70: 0.9338000. Locality: 7
Cost after step 80: 0.8387000. Locality: 7
Cost after step 90: 0.6197000. Locality: 7
---Switching Locality---
Cost after step 100: 0.9334000. Locality: 8
Cost after step 110: 0.8439000. Locality: 8
Cost after step 120: 0.6706000. Locality: 8
Cost after step 130: 0.3985000. Locality: 8
Cost after step 140: 0.1158000. Locality: 8
Trained: 9
Plateau'd: 0
--- New run! ---
---Switching Locality---
Cost after step 10: 0.8999000. Locality: 2
Cost after step 20: 0.7906000. Locality: 2
Cost after step 30: 0.6457000. Locality: 2
Cost after step 40: 0.5161000. Locality: 2
---Switching Locality---
Cost after step 50: 0.5413000. Locality: 3
---Switching Locality---
Cost after step 60: 0.8664000. Locality: 4
Cost after step 70: 0.6748000. Locality: 4
---Switching Locality---
Cost after step 80: 0.6596000. Locality: 5
---Switching Locality---
Cost after step 90: 0.8711000. Locality: 6
Cost after step 100: 0.7097000. Locality: 6
---Switching Locality---
Cost after step 110: 0.5214000. Locality: 7
---Switching Locality---
Cost after step 120: 0.7057000. Locality: 8
Cost after step 130: 0.4188000. Locality: 8
Cost after step 140: 0.1190000. Locality: 8
Trained: 10
Plateau'd: 0
```

In the global case, 7 of our 10 random starting positions (70%) were untrainable, a significant fraction. It is likely that this fraction will grow as the complexity of our ansatz, and the number of qubits, increases.

We can compare that to our local cost function, where every single starting position trained, and most did so in fewer steps. While these examples are simple, this local-vs-global cost behaviour has been shown to extend to more complex problems.

## References

[1] Cerezo, M., Sone, A., Volkoff, T., Cincio, L., and Coles, P. (2020).
Cost-Function-Dependent Barren Plateaus in Shallow Quantum Neural Networks.
arXiv:2001.00550

**Total running time of the script:** ( 8 minutes 38.271 seconds)
