PennyLane QNodes don’t just support computing the derivative—they also support computing the second derivative. Further, the second derivative can be computed on both simulators and quantum hardware.
In this how-to, we’ll show you how you can extract the Hessian of a hybrid model using PennyLane in three different ways.
Consider the following hybrid model:
import pennylane as qml from pennylane import numpy as np dev = qml.device("default.qubit", wires=2) @qml.qnode(dev) def circuit(weights): for i in range(2): qml.Rot(*weights[i, 0], wires=0) qml.Rot(*weights[i, 1], wires=1) qml.CNOT(wires=[0, 1]) return qml.probs(wires=0) def f(weights, x): quantum = circuit(np.sin(weights)) return np.sum(np.abs(quantum - x) / np.cos(x))
We can compute the Hessian of this cost function f
,
by taking the Jacobian of the gradient. Let’s begin by defining our trainable parameters weights
and our input data x
, and computing the gradient:
>>> weights = np.random.random(size=[2, 2, 3], requires_grad=True) >>> x = np.array([0.54, 0.1], requires_grad=False) >>> grad_fn = qml.grad(f)
If we want, we can evaluate our gradient function, but it’s not required!
>>> grad = grad_fn(weights, x) >>> print(np.round(grad, 5)) [[[-0. -0.43247 0.02584] [-0. -0.00441 0.00115]] [[ 0.02654 -0.20502 -0. ] [-0. -0. -0. ]]]
We can now compute the Jacobian of our gradient function; this gives us the Hessian.
>>> hess_fn = qml.jacobian(grad_fn) >>> H = hess_fn(weights, x) >>> H.shape (2, 2, 3, 2, 2, 3)
Note the shape of our Hessian matrix; this is because our input weights are of shape (2, 2, 3)
!
TensorFlow
In the above, we used the built-in Autograd interface. However, the Hessian can also be computed if you are using TensorFlow. We recreate our model from above, but this time using TensorFlow, and making sure to specify interface="tf"
:
import pennylane as qml import tensorflow as tf import numpy as np dev = qml.device("default.qubit", wires=2) @qml.qnode(dev, interface="tf") def circuit(weights): for i in range(2): qml.Rot(weights[i, 0, 0], weights[i, 0, 1], weights[i, 0, 2], wires=0) qml.Rot(weights[i, 1, 0], weights[i, 1, 1], weights[i, 1, 2], wires=1) qml.CNOT(wires=[0, 1]) return qml.probs(wires=0) @tf.function def f(weights, x): quantum = circuit(tf.sin(weights)) return tf.reduce_sum(tf.abs(quantum - x) / tf.cos(x)) weights = tf.Variable(np.random.random(size=[2, 2, 3]), dtype=tf.float64) x = tf.constant([0.54, 0.1], dtype=tf.float64)
As before, we simply take the Jacobian of the gradient. Let’s set up the computation. Since we are taking the double derivative, we need to use two gradient tapes:
with tf.GradientTape() as tape1: with tf.GradientTape() as tape2: res = f(weights, x) grad = tape2.gradient(res, weights) hess = tape1.jacobian(grad, weights)
We can check the shape of the Hessian, and extract the value of one of its elements:
>>> hess.shape TensorShape([2, 2, 3, 2, 2, 3]) >>> hess[0, 1, 2, 1, 0, 1] <tf.Tensor: shape=(), dtype=float64, numpy=2.904043724838658e-06>
PyTorch and the parameter-shift rule
If we want to extract the Hessian using PyTorch, we must use the torch.autograd.functional
module. Aside from that, the process is very much the same. This time, let’s explicitly use the parameter-shift rule in order to compute the Hessian in a way that is compatible with hardware devices.
import pennylane as qml import torch import numpy as np dev = qml.device("default.qubit", wires=2) @qml.qnode(dev, interface="torch", diff_method="parameter-shift") def circuit(weights): for i in range(2): qml.Rot(*weights[i, 0], wires=0) qml.Rot(*weights[i, 1], wires=1) qml.CNOT(wires=[0, 1]) return qml.probs(wires=0) def f(weights, x): quantum = circuit(torch.sin(weights)) return torch.sum(torch.abs(quantum - x) / torch.cos(x)) weights = torch.tensor(np.random.random(size=[2, 2, 3]), requires_grad=True) x = torch.tensor([0.54, 0.1], requires_grad=False) hess = torch.autograd.functional.hessian(f, (weights, x))
Note that the torch.autograd.functional.hessian
function does not respect our requires_grad
attributes! So the output will be a nested tuple where the derivative has been taken with respect to both weights
and x
. Since we are interested in just the Hessian with respect to the weights
, lets extract the 00th element:
>>> hess[0][0].shape torch.Size([2, 2, 3, 2, 2, 3])
We can also explore how many device evaluations were performed to extract the Hessian:
>>> dev.num_executions 313
To compute the Hessian, 2𝑝[(𝑝−1)+1]+1 evaluations are required, where 𝑝 is the number of parameters in the model. Further, PyTorch will always implicitly compute the Jacobian prior to computing the Hessian, requiring an additional 2𝑝 evaluations. Let’s verify this:
>>> p = weights.numel() >>> 2*p * ((p-1) + 1) + 1 + 2*p 313
About the author
Josh Izaac
Josh is a theoretical physicist, software tinkerer, and occasional baker. At Xanadu, he contributes to the development and growth of Xanadu’s open-source quantum software products.