How to use the Hartree-Fock method in PennyLane

Matija Medvidović (Xanadu resident)

Quantum chemistry is one of the flagship applications of quantum computers. There is a reason why the Variational Quantum Eigensolver (VQE) method stirred the proverbial pot so much - it relies on the power of quantum circuits to represent molecular electronic states while relying on well-established math used in classical methods. However, to access all of this amazing physics, one often needs to have an initial guess as to what the electronic structure (think shells and orbitals) of a given molecule roughly looks like.

The weapon of choice in that step is the Hartree-Fock (HF) method. Intuitively, it tells you what individual electrons would do in a molecule, taking only the average (mean-field) interactions with all other electrons. This allows quantum algorithms (VQE and the gang) to refine that picture and get you closer to the full answer.

But, let’s not get ahead of ourselves. In this how-to, we will

  • Briefly go over why we like to use the Hartree-Fock method (why-to as well as how-to).
  • Demonstrate how PennyLane’s end-to-end differentiability allows us to optimize energies, orbitals and molecular geometries on the fly.

You might also want to take a look at this demo on the differentiable HF for more details.

The Hartree-Fock method

In quantum mechanics, each particle is either a fermion or a boson. Bosons are more intuitive from our biased classical perspective - an unlimited number of bosons can occupy the same quantum state (extroverted, very social particles). On the other hand, fermions like to be alone. Only a single fermion can occupy a quantum state and no more (introverted, private particles).

For example, photons (light particles) are bosons. Electrons are fermions. So in quantum chemistry, when you “pour” several fermions into a system with predefined quantum energy states, they will occupy the lowest-lying states one by one, in an orderly fashion.

To get slightly more technical, HF aims to approximately solve the molecular Schrödinger equation \(H \psi = E \psi\) with the Hamiltonian

$$ H = T_e + T_n + V_{ee} + V_{en} + V_{nn} \; , $$

where \(V_{ee}\), \(V_{en}\) and \(V_{nn}\) describe electron-electron, electron-nucleus and nucleus-nucleus Coulomb interactions along with corresponding kinetic energies \(T\).

Usually, HF is performed within the popular Born-Oppenheimer approximation which assumes that all nuclei are frozen in space while electrons buzz around them. Indeed, it is often an excellent way of reducing problem complexity because nuclei are much heavier and move orders of magnitude slower than electrons. In practice, that means that we treat nuclear positions as external variational parameters (they don’t usually change within the HF algorithm itself).

From there, we solve the Schrödinger equation by assuming that the solution is in the form of a simple Slater determinant. If we denote individual electron orbitals with \(\phi_i\), then the Slater determinant reads

$$ \psi (x_1, x_2, \ldots, x_N) \approx \frac{1}{\sqrt{N!}} \begin{vmatrix} \phi _1 (x_1) & \phi _2 (x_1) & \cdots & \phi _N (x_1) \\ \phi _1 (x_2) & \phi _2 (x_2) & \cdots & \phi _N (x_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi _1 (x_N) & \phi _2 (x_N) & \cdots & \phi _N (x_N) \\ \end{vmatrix} \; . $$

The upshot is that many modern software libraries will calculate all of this for you, like PySCF. Under the hood, what these packages do is solve the mean-field equations, accounting for the average effect of all other electrons. A lot of the details stored in electron correlation get left out, but hey - that is precisely the reason why we refine HF solutions with other methods, like VQE.

Each \(\phi _i\) is called a molecular orbital and is often approximated as a linear combination of atomic orbitals (LCAO)

$$ \phi _i (x) = \sum _\mu c _{i \mu} \; \chi _\mu (x) $$

with atomic orbitals \(\chi _\mu\) and expansion coefficients \(c _{i \mu}\). We can find all \(c\)s and any parameters internal to \(\chi _\mu\)s by minimizing the energy expectation.

Additionally, modern software packages use a basis set expansion of atomic orbitals \(\chi _\mu\) in solving HF equations.

That means that the workflow for finding optimal geometry, LCAO parameters and basis parameters is as follows:

  1. Determine the nuclear positions (coordinates)
  2. Determine the LCAO expansion coefficients \(c _{i \mu}\) (the HF solver thakes care of that under the hood)
  3. Determine basis expansion for atomic orbitals \(\chi _\mu\)
  4. Solve the HF equations for molecular orbitals \(\phi _i\)
  5. Run a quantum circuit to include the electronic interactions
  6. Go back to steps 1, 2 & 3 and reevaluate your choices

In summary - we end up with three nested optimization problems. Yikes!

For the rest of this how-to, we will forget about VQEs and just focus on the HF optimization. For a more in-depth look at the HF optimization within a VQE algorithm, see the demo on the optimization of molecular geometries and on differentiable HF.

In quantum chemistry, it is useful to think of qubits as encoding electronic energy states, sometimes called spin orbitals. Take a look at PennyLane’s qchem.hf_state function. It returns the configuration in which the lowest-lying states are occupied - it is precisely what electrons do because they are fermions! This is our initial guess for the electronic wavefunction.

import pennylane as qml
from pennylane import numpy as np
from pennylane import qchem

hf = qchem.hf_state(electrons=2, orbitals=4)
>>> print(hf)
tensor([1, 1, 0, 0], requires_grad=True)

If you play around with qchem.hf_state, you will notice that it always returns a 1-D array of length orbitals with the first electrons elements set to 1 and the remaining orbitals-electrons elements set to 0. Why does that happen? Precisely because electrons are fermions.

That also means that the qchem.hf_state is equivalent to the following function:

def my_hf_state(electrons: int, orbitals: int) -> np.ndarray:
    return np.array([1]*electrons + [0]*(orbitals-electrons))

The difficult part comes when we try to calculate what these orbitals look like in our own 3D space (getting all of the \(\phi _i\)s), as opposed to their qubit encodings. In other words, the qubit representation of a HF state is always:

$$ \left| \text{HF} \right> = | \overset{\text{orbitals}}{\overbrace{\underset{\text{electrons}}{\underbrace{1,\ldots, 1}}, 0, \ldots , 0}} \rangle \; , $$

but the precise meaning of those ones and zeros change depending on the molecule.

Differentiable Hartree-Fock

Because PennyLane is end-to-end automatically differentiable, we can take advantage of classical optimization methods to obtain optimal orbital coefficients \(c\), nuclear positions \(R\) and basis set parameters \(\alpha\). Let’s take the \(H_2\) molecule as a toy problem - because it’s always good to know what to expect when trying new methods out.

symbols = ['H', 'H']

Now we initialize the parameters. Since the only relevant coordinate in the \(H_2\) molecule is the distance \(d\) between nuclei, we can set all other coordinates to 0. We also have to keep in mind that PennyLane uses atomic units by default.

d = 1.0

geometry = np.array([
    [0.0, 0.0, -d/2], # First nuclear position
    [0.0, 0.0,  d/2]], # Second nuclear position
    requires_grad=True
)

Next, initialize parameters \(coeffs\) and \(\alpha\):

alpha = np.array([
    [3.42525091, 0.62391373, 0.1688554],
    [3.42525091, 0.62391373, 0.1688554]],
    requires_grad = True
)

coeffs = np.array([
    [0.15432897, 0.53532814, 0.44463454],
    [0.15432897, 0.53532814, 0.44463454]],
    requires_grad = True
)

Technical note:

The specific shape of alpha and coeffs tensors is (n_electrons, n_basis_functions) which depends on the specific basis set in use. We are using the standard STO-3G basis, which uses 3 Gaussians per orbital.


Note how we have initialized basis parameters for both basis functions to same values! That is exploiting the symmetry between the two nuclei once again, whatever the ground state ends up being, it has to preserve that symmetry.

Sanity check: How do we know if PennyLane is going to optimize all of our parameters? Just check the requires_grad attribute of each array:

>>> print('Optimize geometry:', geometry.requires_grad)
Optimize geometry: True
>>> print('Optimize alpha:', alpha.requires_grad)
Optimize alpha: True
>>> print('Optimize coeffs:', coeffs.requires_grad)
Optimize coeffs: True

Finally, let’s create the qchem.Molecule object:

molecule = qchem.Molecule(symbols, geometry, alpha=alpha, coeff=coeffs)

the cost function

cost = qchem.hf_energy(molecule)
params = [geometry, alpha, coeffs]

and the standard gradient descent optimizer:

opt = qml.GradientDescentOptimizer(stepsize=0.2)

Let’s write a simple optimization loop. At each step, we will calculate the energy and store the current value in a list energies as well as the changing nuclear distances in the list distances.

energies = []
distances = []

bohr_to_angstrom = 0.5291772

for n in range(50):

    params, energy = opt.step_and_cost(cost, *params)
    energies.append(energy.item())

    distance = np.linalg.norm(
        (params[0][0] - params[0][1]) * bohr_to_angstrom,
        axis=0
    )

    distances.append(distance.item())

    if n%10 == 0:
        print(
            f'Iteration {n:2} | '
            f'Energy: {energy.item():.6f} | '
            f'Distance: {distance.item():.6f}'
        )
Iteration  0 | Energy: -1.065999 | Distance: 0.606446
Iteration 10 | Energy: -1.120126 | Distance: 0.720048
Iteration 20 | Energy: -1.120524 | Distance: 0.735426
Iteration 30 | Energy: -1.120555 | Distance: 0.738554
Iteration 40 | Energy: -1.120567 | Distance: 0.739279

Let’s plot the results!

import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize=(12, 6))
ax2 = ax1.twinx()

ax1.plot(energies, c='forestgreen', lw=3, label='HF Energy')
ax2.plot(distances, c='firebrick', lw=3, label='Nuclear distance')
ax1.axhline(-1.1373060483, c='k', ls='-.', label='Exact Energy')
ax2.axhline(0.74, c='k', ls=':', lw=2, label='Exact distance')

fig.legend(frameon=True, fontsize=15, loc='center', ncol=2)

ax1.set_xlabel('Iteration #', fontsize=20)
ax1.set_ylabel('Energy [Ha]', fontsize=20)
ax2.set_ylabel(r'Nuclear distance [$\AA $]', fontsize=20)
ax1.tick_params(axis='x', labelsize=16)
ax1.tick_params(axis='y', labelsize=16)
ax2.tick_params(axis='y', labelsize=16)

plt.show()

So what exactly happened here? Under the hood, PennyLane is doing the HF calculation to evaluate the energy expectation

$$ E_{HF} (\alpha, c, d) = \left\langle HF (\alpha, c, d) \right\vert H \left\vert HF (\alpha, c, d)\right\rangle $$

and calculating the relevant derivatives

$$ \frac{\partial E_{HF}}{\partial \alpha} \; , \quad \frac{\partial E_{HF}}{\partial c} \; , \quad \frac{\partial E_{HF}}{\partial d} $$

using methods of automatic differentiation (AD). After these derivatives (“gradients”) have been obtained, our optimizer updates the parameters \(\alpha\), \(c\) and \(d\) in a simple gradient descent step. All of this happens when we call qml.GradientDescentOptimizer.step_and_cost.

Normally, without AD support, one would have to do a parameter sweep to find the optimal values of \(\alpha\), \(c\) and \(d\). That means just computing the energy for a broad range of different parameter values (at a much larger computational cost) and then hand-picking the lowest value.

For example, if we were interested in the equilibrium nuclear distance, we would have to solve the problem using finite differences. This is a more numerically expensive operation, and less reliable than using AD. Another option is symbolic differentiation (automatic or manual manipulation of formulas) which may not always be available or realiable.

Note that the energy and the nuclear distance have still not converged to the exact values because the HF scheme itself is an approximation. That is okay because it is often only required as an initial guess to refine with more sophisticated methods.

Finally, we can check out, the final nuclear distance, for example:

>>> print(distances[-1]) # in angstroms
0.7394400720936742

Conclusion

In this how (and why)-to, we have seen how automatic differentiation can allow us to simultaneously optimize molecular geometries and orbital parameters. These modern methods have allowed us to pre-optimize molecular geometries and orbital parameters to get the most out of existing classical methods before we even start looking at quantum algorithms.

Matija Medvidović

Matija Medvidović

Matija is a PhD student at Columbia University and the Flatiron Institute in New York. He works with machine learning methods to study quantum many-body physics and quantum computers. He is currently a part of the Xanadu residency program. He is a firm believer in keeping bios short and concise.