
g-sim: Lie-algebraic classical simulations for variational quantum computing

Published: June 06, 2024. Last updated: March 16, 2025.

For the most part, we now know that the phenomenon of barren plateaus can be traced back to the dimension of the circuit’s dynamical Lie algebra (DLA). In particular, exponentially sized DLAs lead to exponentially vanishing gradients (barren plateaus). Conversely, it has been realized that circuits with polynomially sized DLAs can be efficiently simulated using a technique called \(\mathfrak{g}\)-sim, leading to discussions on whether all trainable parametrized circuits are also efficiently classically simulable.

So what is all the fuss about? How does \(\mathfrak{g}\)-sim work? What are its restrictions? How can I run it in PennyLane? We are going to try to answer all these questions in the demo below.

Introduction

Lie algebras are tightly connected to quantum physics 1. While Lie algebra theory is an integral part of high energy and condensed matter physics, recent developments have shown connections to quantum simulation and quantum computing. In particular, the infamous barren plateau problem has been fully characterized in terms of the underlying dynamical Lie algebra (DLA) 2 3. The main result of these works is that the variance of the gradient (over a uniform parameter distribution) is inversely proportional to the dimension of the circuit’s DLA, leading to exponentially vanishing gradients in the uniform average case whenever the DLA scales exponentially in system size.

At the same time, there exist Lie algebraic techniques with which one can classically simulate expectation values of circuits with a complexity polynomial in the dimension of the circuit’s DLA 4 5 6 7. Hence, circuits with guaranteed non-exponentially vanishing gradients in the uniform average case are classically simulable, leading to some debate on whether the field of variational quantum computing is doomed or not 8. The majority of DLAs are in fact exponentially sized 9, shifting this debate towards the question of whether or not uniform average case results are relevant in practice for variational methods 10, with some arguing for better initialization methods 11.

In this demo, we want to focus on those cases where efficient classical simulation is possible due to polynomially sized DLAs. These instances are rather limited, as they mainly concern DLAs of non-interacting systems as well as the transverse-field Ising model and variations thereof (see 9 for details).

Lie algebra basics

Before going into the specifics of Lie algebra simulation (\(\mathfrak{g}\)-sim), we want to briefly recap the most important concepts of Lie algebra theory that are relevant for us. More info can be found in our Intro to (Dynamical) Lie Algebras for quantum practitioners.

Given Hermitian operators \(G = \{h_i\}\) (think Hermitian observables like terms of a Hamiltonian), the dynamical Lie algebra \(\mathfrak{g}\) can be computed via the Lie closure \(\langle \cdot \rangle_\text{Lie}\) (see lie_closure()),

\[\mathfrak{g} = \langle \{h_i\} \rangle_\text{Lie} \subseteq \mathfrak{su}(2^n).\]

That is, we compute all possible nested commutators until no new operators emerge. This leads to a set of operators that is closed under commutation, hence the name. In particular, the result of the commutator between any two elements in \(\mathfrak{g}\) can be decomposed as a linear combination of other elements in \(\mathfrak{g},\)

\[[h_\alpha, h_\beta] = \sum_\gamma f^\gamma_{\alpha \beta} h_\gamma.\]

The coefficients \(f^\gamma_{\alpha \beta}\) are called the structure constants of the DLA and can be computed via the standard projection in vector spaces (which \(\mathfrak{g}\) is),

\[f^\gamma_{\alpha \beta} = \frac{\langle h_\gamma, [h_\alpha, h_\beta]\rangle}{\langle h_\gamma, h_\gamma\rangle}.\]

The main difference from the usual vector spaces like \(\mathbb{R}^N\) or \(\mathbb{C}^N\) is that here we use the trace inner product between operators, \(\langle h_\alpha, h_\beta \rangle = \text{tr}\left[h_\alpha^\dagger h_\beta \right].\)
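As a concrete example (a small NumPy sketch for illustration, not part of the demo code), the single-qubit Paulis satisfy \([X, Y] = 2iZ,\) so projecting the commutator onto \(Z\) should return \(f^Z_{XY} = 2i:\)

import numpy as np

X_ = np.array([[0, 1], [1, 0]], dtype=complex)
Y_ = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z_ = np.array([[1, 0], [0, -1]], dtype=complex)

comm = X_ @ Y_ - Y_ @ X_  # [X, Y] = 2iZ
f_ZXY = np.trace(Z_.conj().T @ comm) / np.trace(Z_.conj().T @ Z_)
print(f_ZXY)  # 2j -- imaginary in the Hermitian convention, see the note below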

Note

Technically, the (dynamical) Lie algebra is formed by skew-Hermitian operators \(\{i h_i\}.\) We avoid this distinction here since for all practical purposes one can also look at Hermitian operators and explicitly add imaginary units in the exponents where appropriate. For more details, see the note in the “Lie algebras” section of our Intro to (dynamical) Lie algebras for quantum practitioners.
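To see the Lie closure in action before the full example below, here is a minimal sketch using lie_closure(): the two-qubit TFIM generators \(\{X_0 X_1, Z_0, Z_1\}\) close into a six-dimensional algebra, matching \(n(2n-1) = 6\) for \(n=2.\)

import pennylane as qml
from pennylane import X, Z

# nested commutators add three Y-type operators, after which
# no new, linearly independent operators appear
dla = qml.lie_closure([X(0) @ X(1), Z(0), Z(1)])
print(len(dla))  # 6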

\(\mathfrak{g}\)-sim theory

In Lie algebra simulation, \(\mathfrak{g}\)-sim, we are interested in how expectation values of Lie algebra elements are transformed under unitary evolution. We start from an initial expectation value vector of the input state \(\rho^0\) with respect to each DLA element,

\[(\vec{e}^0)_\alpha = \text{tr}\left[h_\alpha \rho^0 \right].\]

Graphically, we can represent this as a tensor with one leg.

[Figure: the expectation value vector \(\vec{e}\) drawn as a tensor with one leg]

When we transform the state \(\rho^0\) with a unitary evolution \(U,\) we can use the cyclic property of the trace to shift the evolution onto the DLA element,

\[(\vec{e}^1)_\alpha = \text{tr}\left[ h_\alpha U \rho^0 U^\dagger \right] = \text{tr}\left[ U^\dagger h_\alpha U \rho^0 \right].\]

In the context of \(\mathfrak{g}\)-sim, we assume the unitary operator to be generated by DLA elements \(h_\mu \in \mathfrak{g};\) in particular, we have

\[U = e^{-i \theta h_\mu}\]

with some real parameter \(\theta \in \mathbb{R}.\)

As a consequence of the Baker–Campbell–Hausdorff formula, we know that any \(h_\alpha \in \mathfrak{g}\) transformed under such a \(U\) is again in \(\mathfrak{g}\) (because it leads to a sum of nested commutators between DLA elements, and the DLA is closed under commutation). In fact, it is a well-known result that the resulting operator is given by the exponential of the structure constants

\[e^{i \theta h_\mu} h_\alpha e^{-i \theta h_\mu} = \sum_\beta \left(e^{-i \theta\, \text{ad}_{h_\mu}}\right)_{\alpha \beta} h_\beta.\]

This is the identity connecting the adjoint representation of a Lie group, \(\text{Ad}_{e^{-ih_\mu}}(x) = e^{ih_\mu} x e^{-ih_\mu},\) and the adjoint representation of the associated Lie algebra, \(\left(\text{ad}_{h_\mu}\right)_{\alpha \beta} = f^\mu_{\alpha \beta}.\) It can be summarized as

\[\text{Ad}_{e^{-ih_\mu}} = e^{-i \text{ad}_{h_\mu}}.\]

To the best of our knowledge there is no universally accepted name for this identity (see, e.g. Adjoint representation (Wikipedia) or Lemma 3.14 in Introduction to Lie Groups and Lie Algebras), so we shall refer to it as the “adjoint identity” from here on.

With this, we can see how the initial expectation value vector is transformed under unitary evolution,

\[(\vec{e}^1)_\alpha = \sum_\beta \left(e^{-i \theta\, \text{ad}_{h_\mu}}\right)_{\alpha \beta} \text{tr}\left[h_\beta \rho^0 \right].\]

This is simply the matrix-vector product between the adjoint representation of the unitary gate and the initial expectation value vector. For a unitary circuit composed of multiple gates,

\[\mathcal{U} = \prod_j e^{-i \theta_j h_j},\]

this becomes the product of multiple adjoint representations of said gates,

\[\tilde{U} = \prod_j e^{-i \theta_j \text{ad}_{h_j}}.\]

So overall, the evolution can be summarized graphically as follows.

[Figure: the adjoint representation \(\tilde{U}\) of the circuit applied to the expectation value vector \(\vec{e}\)]

We are typically interested in expectation values of observables composed of DLA elements, \(\hat{O} = \sum_\alpha w_\alpha h_\alpha.\) Overall, the computation in \(\mathfrak{g}\)-sim is a vector-matrix-vector product,

\[\langle \hat{O} \rangle = \text{tr}\left[\hat{O} \mathcal{U} \rho^0 \mathcal{U}^\dagger \right] = \sum_{\alpha \beta} w_\alpha \tilde{U}_{\alpha \beta} e_\beta = \vec{w} \cdot \tilde{U} \cdot \vec{e}.\]

Or, graphically:

[Figure: the vector-matrix-vector contraction \(\vec{w} \cdot \tilde{U} \cdot \vec{e}\)]

The dimension of \(\left(\text{ad}_{h_j}\right)_{\alpha \beta} = f^j_{\alpha \beta}\) is \(\text{dim}(\mathfrak{g}) \times \text{dim}(\mathfrak{g}).\) So while we evolve a \(2^n\)-dimensional complex vector in state vector simulators, we evolve a \(\text{dim}(\mathfrak{g})\)-dimensional expectation vector in \(\mathfrak{g}\)-sim, which is more efficient whenever \(\text{dim}(\mathfrak{g}) < 2^n.\) In general, it is efficient whenever \(\text{dim}(\mathfrak{g}) = O\left(\text{poly}(n)\right).\)
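Before implementing the full TFIM example in the next section, here is a minimal single-qubit sketch of the whole pipeline (an illustration with arbitrarily chosen values, using the same structure_constants() convention as the code below): we evolve the expectation value vector of \(\mathfrak{su}(2)\) with the adjoint representation and compare against direct conjugation of the state.

import numpy as np
import pennylane as qml
from pennylane import X, Y, Z
from scipy.linalg import expm

ops = [X(0), Y(0), Z(0)]  # su(2) is already closed under commutation
ad = qml.structure_constants([op.pauli_rep for op in ops])  # shape (3, 3, 3)

theta, mu = 0.62, 0  # a single gate U = exp(-i theta X)
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
U = expm(-1j * theta * qml.matrix(ops[mu]))

# g-sim: evolve the 3-dimensional expectation value vector
e_in = np.array([np.trace(qml.matrix(op) @ rho0).real for op in ops])
e_out = expm(theta * ad[mu]) @ e_in

# reference: conjugate the state directly and recompute the expectations
e_ref = np.array([np.trace(qml.matrix(op) @ U @ rho0 @ U.conj().T).real for op in ops])
print(np.allclose(e_out, e_ref))  # True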

\(\mathfrak{g}\)-sim in PennyLane

Let us put this into practice and write a differentiable \(\mathfrak{g}\)-simulator in PennyLane. We start with some boilerplate PennyLane imports.

import pennylane as qml
from pennylane import X, Z, I
import numpy as np

import jax
import jax.numpy as jnp
from jax.scipy.linalg import expm

jax.config.update("jax_enable_x64", True)
jax.config.update("jax_platform_name", "cpu")

System DLA

As mentioned before, polynomially sized DLAs are rare; the transverse-field Ising model (TFIM) with nearest-neighbor interactions is one of the few examples. For simplicity, we take the one-dimensional variant with open boundary conditions,

\[H_\text{TFIM} = \sum_{j=1}^{n-1} J X_j X_{j+1} + \sum_{j=1}^{n} h Z_j.\]

We define its generators and compute the lie_closure().

n = 10 # number of qubits.
generators = [X(i) @ X(i+1) for i in range(n-1)]
generators += [Z(i) for i in range(n)]

# work with PauliSentence instances for efficiency
generators = [op.pauli_rep for op in generators]

dla = qml.lie_closure(generators, pauli=True)
dim_g = len(dla)

We are using the PauliSentence representation of the operators via the op.pauli_rep attribute for more efficient arithmetic and processing.
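For instance (a quick illustration, here using qml.commutator()), products and commutators of PauliSentence instances are computed symbolically in the Pauli basis rather than through \(2^n\)-dimensional matrices:

ps1 = (X(0) @ X(1)).pauli_rep
ps2 = Z(0).pauli_rep

print(ps1 @ ps2)  # -1j * Y(0) @ X(1), computed without building any matrices
print(qml.commutator(ps1, ps2, pauli=True))  # -2j * Y(0) @ X(1)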

Initial expectation value vector

With that, we can compute the initial expectation value vector with respect to every DLA element for the initial state \(\rho^0 = |0 \rangle \langle 0 |^{\otimes n}.\) We use the trick of representing the initial state as a Pauli operator, \(|0 \rangle \langle 0 |^{\otimes n} = \prod_{i=1}^n (I_i + Z_i)/2.\) We take advantage of the locality of the DLA elements and use the analytic, normalized trace method trace(), all to avoid having to go to the full Hilbert space.

# compute initial expectation value vector
e_in = np.zeros(dim_g, dtype=float)

for i, h_i in enumerate(dla):
    # initial state |0><0| = prod_w (I_w + Z_w)/2 on the wires of h_i;
    # the trace() method below already normalizes by the dimension,
    # so we can omit the explicit factor 1/2 per wire
    rho_in = qml.prod(*(I(w) + Z(w) for w in h_i.wires))
    rho_in = rho_in.pauli_rep

    e_in[i] = (h_i @ rho_in).trace()

e_in = jnp.array(e_in)
e_in
Array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.], dtype=float64)

Observable

We can compute the expectation value of any linear combination of DLA elements. We choose the TFIM Hamiltonian itself,

\[\hat{O} = H_\text{TFIM} = \sum_j \left(J X_j X_{j+1} + h Z_j\right).\]

That is, just the generators with some coefficients. Here we choose \(J=h=0.5\) for simplicity and generate the \(\vec{w}\) vector by setting the corresponding entries to 0.5.

w = np.zeros(dim_g, dtype=float)
w[:len(generators)] = 0.5
w = jnp.array(w)

Forward and backward pass

Together with the structure constants computed via structure_constants(), we now have all the ingredients to define the forward pass of the expectation value computation. For demonstration purposes, we choose depth=10 gate generators at random from the DLA.

adjoint_repr = qml.structure_constants(dla)

depth = 10
gate_choice = np.random.choice(dim_g, size=depth)
gates = adjoint_repr[gate_choice]

def forward(theta):
    # simulation
    e_t = e_in
    for i in range(depth):
        e_t = expm(theta[i] * gates[i]) @ e_t

    # final expectation value
    result_g_sim = w @ e_t

    return result_g_sim.real

theta = jax.random.normal(jax.random.PRNGKey(0), shape=(depth,))

gsim_forward, gsim_backward = forward(theta), jax.grad(forward)(theta)
gsim_forward, gsim_backward
(Array(-0.67197512, dtype=float64), Array([ 1.10456836e-01,  6.39487037e-01,  9.36127642e-01, -2.81773877e-01,
        2.78389788e-01,  7.44798608e-04, -1.32203868e+00, -1.34292386e+00,
       -6.19891111e-01,  6.35542101e-01], dtype=float64))

As a sanity check, we compare the computation with the equivalent full state vector circuit.

H = 0.5 * qml.sum(*[op.operation() for op in generators])

@qml.qnode(qml.device("default.qubit"), interface="jax")
def qnode(theta):
    for i, mu in enumerate(gate_choice):
        qml.exp(-1j * theta[i] * dla[mu].operation())
    return qml.expval(H)

statevec_forward, statevec_backward = qnode(theta), jax.grad(qnode)(theta)
statevec_forward, statevec_backward
(Array(-0.67197512, dtype=float64), Array([ 1.10456836e-01,  6.39487037e-01,  9.36127642e-01, -2.81773877e-01,
        2.78389788e-01,  7.44798608e-04, -1.32203868e+00, -1.34292386e+00,
       -6.19891111e-01,  6.35542101e-01], dtype=float64))

We see that both simulations yield the same results. The full state vector simulation evolves a \(2^n = 1024\)-dimensional complex state vector, whereas \(\mathfrak{g}\)-sim evolves a \(\text{dim}(\mathfrak{g}) = 2n(2n-1)/2 = 190\)-dimensional expectation value vector.

print(
    qml.math.allclose(statevec_forward, gsim_forward),
    qml.math.allclose(statevec_backward, gsim_backward),
)
True True
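As a quick consistency check of this dimension formula (a one-liner reusing dim_g and n from above):

print(dim_g, n * (2 * n - 1))  # 190 190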

Beyond 6 qubits, \(\mathfrak{g}\)-sim is more efficient in simulating circuits generated by the TFIM Hamiltonian.

import matplotlib.pyplot as plt
ns = np.arange(2, 17)

plt.plot(ns, 2*ns*(2*ns-1)/2, "x-", label="dim(g)")
plt.plot(ns, 2**ns, ".-", label="2^n")
plt.yscale("log")
plt.legend()
plt.xlabel("n qubits")
plt.show()
[Plot: \(\text{dim}(\mathfrak{g}) = n(2n-1)\) versus \(2^n\) as a function of the number of qubits]

VQE

Let us do a quick run of the variational quantum eigensolver (VQE) on the system at hand.

First, we define our optimization loop in JAX. Consider this boilerplate code; see our demo on how to optimize a QML model using JAX and Optax for details.

import optax
from datetime import datetime

def run_opt(value_and_grad, theta, n_epochs=100, lr=0.1, b1=0.9, b2=0.999, E_exact=0., verbose=True):

    optimizer = optax.adam(learning_rate=lr, b1=b1, b2=b2)
    opt_state = optimizer.init(theta)

    energy = np.zeros(n_epochs)
    gradients = []
    thetas = []

    @jax.jit
    def step(opt_state, theta):
        val, grad_circuit = value_and_grad(theta)
        updates, opt_state = optimizer.update(grad_circuit, opt_state)
        theta = optax.apply_updates(theta, updates)

        return opt_state, theta, val


    t0 = datetime.now()

    ## Optimization loop
    for n in range(n_epochs):
        opt_state, theta, val = step(opt_state, theta)

        energy[n] = val
        thetas.append(theta)
    t1 = datetime.now()
    if verbose:
        print(f"final loss: {val - E_exact}; min loss: {np.min(energy) - E_exact}; after {t1 - t0}")

    return thetas, energy, gradients

We can use the Hamiltonian variational ansatz as a natural parametrization of the circuit to obtain the ground-state energy.

In particular, we use the full Hamiltonian generator with a trainable parameter for each term in each layer,

\[\prod_{\ell=1}^{\text{depth}} e^{-i \sum_j \left(\theta^X_{\ell,j} X_j X_{j+1} + \theta^Z_{\ell,j} Z_j\right)},\]

with depth=10 layers.

# Pick the adjoint repr of only the Hamiltonian generators
ham_terms = adjoint_repr[:len(generators)]

def forward(theta):
    # simulation
    e_t = jnp.array(e_in)

    for i in range(depth):
        e_t = expm(jnp.einsum("j,jkl->kl", theta[i], ham_terms)) @ e_t

    # final expectation values
    result_g_sim = w @ e_t

    return result_g_sim.real

Now we can run the optimization to find the ground-state energy.

theta = jax.random.normal(jax.random.PRNGKey(0), shape=(depth, len(generators),))

value_and_grad = jax.jit(jax.value_and_grad(forward))

value_and_grad(theta) # jit-compile first

E_exact = H.eigvals().min()

_, energies, _ = run_opt(value_and_grad, theta, E_exact=E_exact, verbose=True)

plt.plot(energies - E_exact)
plt.yscale("log")
plt.ylabel("$E - E_{exact}$")
plt.xlabel("epochs")
plt.show()
[Plot: energy error \(E - E_\text{exact}\) versus training epochs]
final loss: 0.00461206752936949; min loss: 0.00461206752936949; after 0:00:23.107567

We see good convergence to the true ground-state energy after 100 epochs.

Conclusion

We learned about the conceptually intriguing connection between unitary evolution and the adjoint representation of the system DLA via the adjoint identity, and saw how this connection can be used for classical simulation. In particular, for specific systems like the TFIM we can efficiently simulate circuit expectation values.

In case you are more curious about Lie algebraic simulation techniques, check out the follow-up demo on (g+P)-sim, an extension of \(\mathfrak{g}\)-sim.

References

1

Korbinian Kottmann “Introducing (Dynamical) Lie Algebras for quantum practitioners” PennyLane Demos, 2024.

2

Enrico Fontana, Dylan Herman, Shouvanik Chakrabarti, Niraj Kumar, Romina Yalovetzky, Jamie Heredge, Shree Hari Sureshbabu, Marco Pistoia “The Adjoint Is All You Need: Characterizing Barren Plateaus in Quantum Ansätze” arXiv:2309.07902, 2023.

3

Michael Ragone, Bojko N. Bakalov, Frédéric Sauvage, Alexander F. Kemper, Carlos Ortiz Marrero, Martin Larocca, M. Cerezo “A Unified Theory of Barren Plateaus for Deep Parametrized Quantum Circuits” arXiv:2309.09342, 2023.

4

Rolando D. Somma “Quantum Computation, Complexity, and Many-Body Physics” arXiv:quant-ph/0512209, 2005.

5

Rolando Somma, Howard Barnum, Gerardo Ortiz, Emanuel Knill “Efficient solvability of Hamiltonians and limits on the power of some quantum computational models” arXiv:quant-ph/0601030, 2006.

6

Victor Galitski “Quantum-to-Classical Correspondence and Hubbard-Stratonovich Dynamical Systems, a Lie-Algebraic Approach” arXiv:1012.2873, 2010.

7

Matthew L. Goh, Martin Larocca, Lukasz Cincio, M. Cerezo, Frédéric Sauvage “Lie-algebraic classical simulations for variational quantum computing” arXiv:2308.01432, 2023.

8

M. Cerezo, Martin Larocca, Diego García-Martín, N. L. Diaz, Paolo Braccia, Enrico Fontana, Manuel S. Rudolph, Pablo Bermejo, Aroosa Ijaz, Supanut Thanasilp, Eric R. Anschuetz, Zoë Holmes “Does provable absence of barren plateaus imply classical simulability? Or, why we need to rethink variational quantum computing” arXiv:2312.09121, 2023.

9

Roeland Wiersema, Efekan Kökcü, Alexander F. Kemper, Bojko N. Bakalov “Classification of dynamical Lie algebras for translation-invariant 2-local spin systems in one dimension” arXiv:2309.05690, 2023.

10

Guglielmo Mazzola “Quantum computing for chemistry and physics applications from a Monte Carlo perspective” arXiv:2308.07964, 2023.

11

Chae-Yeun Park, Minhyeok Kang, Joonsuk Huh “Hardware-efficient ansatz without barren plateaus in any depth” arXiv:2403.04844, 2024.

About the author

Korbinian Kottmann

Quantum simulation & open source software
