PennyLane QNodes don’t just support computing the derivative; they also support computing the second derivative. Further, the second derivative can be computed on both simulators and quantum hardware.
In this how-to, we’ll show you how you can extract the Hessian of a hybrid model using PennyLane in three different ways.
Consider the following hybrid model:
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(weights):
    # two layers of single-qubit rotations followed by an entangling CNOT
    for i in range(2):
        qml.Rot(*weights[i, 0], wires=0)
        qml.Rot(*weights[i, 1], wires=1)
        qml.CNOT(wires=[0, 1])
    return qml.probs(wires=0)

def f(weights, x):
    # classical pre- and post-processing around the quantum circuit
    quantum = circuit(np.sin(weights))
    return np.sum(np.abs(quantum - x) / np.cos(x))
We can compute the Hessian of this cost function f by taking the Jacobian of the gradient. Let’s begin by defining our trainable parameters weights and our input data x, and computing the gradient:
>>> weights = np.random.random(size=[2, 2, 3], requires_grad=True)
>>> x = np.array([0.54, 0.1], requires_grad=False)
>>> grad_fn = qml.grad(f)
If we want, we can evaluate our gradient function, but this isn’t required for computing the Hessian:
>>> grad = grad_fn(weights, x)
>>> print(np.round(grad, 5))
[[[-0.      -0.43247  0.02584]
  [-0.      -0.00441  0.00115]]

 [[ 0.02654 -0.20502 -0.     ]
  [-0.      -0.      -0.     ]]]
We can now compute the Jacobian of our gradient function; this gives us the Hessian.
>>> hess_fn = qml.jacobian(grad_fn)
>>> H = hess_fn(weights, x)
>>> H.shape
(2, 2, 3, 2, 2, 3)
Note the shape of our Hessian: since our input weights have shape (2, 2, 3), the Hessian has shape (2, 2, 3, 2, 2, 3)!
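Since the cost is a smooth scalar function, the Hessian should be symmetric. As a quick sanity check (a small sketch, not part of the original example), we can flatten it into a conventional 12 × 12 matrix, where 12 is the total number of trainable parameters, and verify the symmetry:

>>> p = weights.size
>>> H_matrix = H.reshape(p, p)
>>> np.allclose(H_matrix, H_matrix.T)
True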
TensorFlow
In the above, we used the built-in Autograd interface. However, the Hessian can also be computed if you are using TensorFlow. We recreate our model from above, but this time using TensorFlow, and making sure to specify interface="tf":
import pennylane as qml
import tensorflow as tf
import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, interface="tf")
def circuit(weights):
    # same ansatz as before; TensorFlow tensors don't support *-unpacking
    # of rows, so we index the rotation angles explicitly
    for i in range(2):
        qml.Rot(weights[i, 0, 0], weights[i, 0, 1], weights[i, 0, 2], wires=0)
        qml.Rot(weights[i, 1, 0], weights[i, 1, 1], weights[i, 1, 2], wires=1)
        qml.CNOT(wires=[0, 1])
    return qml.probs(wires=0)

@tf.function
def f(weights, x):
    quantum = circuit(tf.sin(weights))
    return tf.reduce_sum(tf.abs(quantum - x) / tf.cos(x))

weights = tf.Variable(np.random.random(size=[2, 2, 3]), dtype=tf.float64)
x = tf.constant([0.54, 0.1], dtype=tf.float64)
As before, we simply take the Jacobian of the gradient. Let’s set up the computation. Since we are taking a second derivative, we need to use two nested gradient tapes:
with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        res = f(weights, x)
    # the inner tape provides the gradient
    grad = tape2.gradient(res, weights)

# the outer tape differentiates the gradient, yielding the Hessian
hess = tape1.jacobian(grad, weights)
We can check the shape of the Hessian, and extract the value of one of its elements:
>>> hess.shape
TensorShape([2, 2, 3, 2, 2, 3])
>>> hess[0, 1, 2, 1, 0, 1]
<tf.Tensor: shape=(), dtype=float64, numpy=2.904043724838658e-06>
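If you only need the action of the Hessian on a vector, rather than all of its elements, the same nested-tape pattern yields Hessian-vector products without materializing the full Hessian. Below is a minimal sketch of this standard TensorFlow idiom; the direction vector v is an arbitrary choice for illustration:

# Hessian-vector product: differentiate <grad, v> with respect to the weights
v = tf.ones_like(weights)  # illustrative direction vector

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        res = f(weights, x)
    grad = inner.gradient(res, weights)
    # contract the gradient with v while still inside the outer tape
    gv = tf.reduce_sum(grad * v)

hvp = outer.gradient(gv, weights)  # same shape as weights: (2, 2, 3)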
PyTorch and the parameter-shift rule
If we want to extract the Hessian using PyTorch, we must use the torch.autograd.functional module. Aside from that, the process is very much the same. This time, let’s explicitly use the parameter-shift rule in order to compute the Hessian in a way that is compatible with hardware devices.
import pennylane as qml
import torch
import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, interface="torch", diff_method="parameter-shift")
def circuit(weights):
    for i in range(2):
        qml.Rot(*weights[i, 0], wires=0)
        qml.Rot(*weights[i, 1], wires=1)
        qml.CNOT(wires=[0, 1])
    return qml.probs(wires=0)

def f(weights, x):
    quantum = circuit(torch.sin(weights))
    return torch.sum(torch.abs(quantum - x) / torch.cos(x))

weights = torch.tensor(np.random.random(size=[2, 2, 3]), requires_grad=True)
x = torch.tensor([0.54, 0.1], requires_grad=False)

hess = torch.autograd.functional.hessian(f, (weights, x))
Note that the torch.autograd.functional.hessian function does not respect our requires_grad attributes! So the output will be a nested tuple where the derivative has been taken with respect to both weights and x. Since we are interested in just the Hessian with respect to the weights, let’s extract the [0][0] element:
>>> hess[0][0].shape
torch.Size([2, 2, 3, 2, 2, 3])
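Alternatively, we can avoid the nested tuple altogether by closing over x, so that torch.autograd.functional.hessian only sees the weights as an input. A small sketch of this workaround (note that evaluating it will trigger additional device executions, which would affect the count in the next step):

# differentiate only with respect to the weights by fixing x in a closure
hess_weights = torch.autograd.functional.hessian(lambda w: f(w, x), weights)
print(hess_weights.shape)  # torch.Size([2, 2, 3, 2, 2, 3])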
We can also explore how many device evaluations were performed to extract the Hessian:
>>> dev.num_executions
313
To compute the Hessian via the parameter-shift rule, 2p[(p - 1) + 1] + 1 evaluations are required, where p is the number of parameters in the model: four shifted evaluations for each of the p(p - 1)/2 off-diagonal elements, two for each of the p diagonal elements, plus one unshifted evaluation. Further, PyTorch will always implicitly compute the Jacobian prior to computing the Hessian, requiring an additional 2p evaluations. Let’s verify this:
>>> p = weights.numel()
>>> 2*p * ((p-1) + 1) + 1 + 2*p
313
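For transparency, here is one way to account for all 313 executions, following the breakdown above (a sketch of the bookkeeping, not of PennyLane internals; the variable names are my own):

p = 12                                 # weights.numel()
off_diagonal = 4 * (p * (p - 1) // 2)  # four shifted evaluations per parameter pair
diagonal = 2 * p                       # two shifted evaluations per diagonal element
unshifted = 1                          # the single unshifted evaluation
jacobian = 2 * p                       # PyTorch's implicit Jacobian pass
print(off_diagonal + diagonal + unshifted + jacobian)  # 313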
About the author
Josh Izaac
Josh is a theoretical physicist, software tinkerer, and occasional baker. At Xanadu, he contributes to the development and growth of Xanadu’s open-source quantum software products.