April 19, 2023
Quantum NLP with the lambeq–PennyLane integration
The following is a guest post by Charlie London, Thomas Cervoni and Bob Coecke from Quantinuum, showcasing the recently published integration between lambeq and PennyLane, which allows for the automatic differentiation and training of natural language processing models.
Quantum natural language processing (QNLP) is an exciting research area at the intersection of quantum computing and computational linguistics, concerned with the practical implementation of natural language processing on quantum hardware. In this post we give an introduction to QNLP and show how you can run your own QNLP model by training a hybrid PennyLane model, built with the lambeq library, using PyTorch.
Quantinuum has built the open-source lambeq library so that quantum software engineers interested in NLP, NLP engineers interested in quantum computing, and anybody who finds this field exciting can build and train a QNLP pipeline from beginning to end.
QNLP pipeline in lambeq
In lambeq, a QNLP pipeline looks like this:

The structure of the QNLP model is determined by the grammatical structure of the input sentence, so lambeq first calls a parser. In its current version, lambeq includes bobcat, a state-of-the-art neural-trained parser, and also supports several others.
Once the parsing is complete, lambeq outputs a syntax tree for the sentence, which is then encoded into an abstract representation called a string diagram. The string diagram reflects the relationships between the words, as defined by a compositional model of choice, and is independent of any implementation decisions that occur at a lower level.
String diagrams can be formalized using category theory and further simplified using rewriting rules, which allow for the removal of specific interactions between words that might be considered redundant for the task at hand, or which make the computation more amenable to implementation on a quantum processing unit.
Finally, lambeq converts the resulting rewritten string diagram into a concrete quantum circuit, based on a specific parameterization scheme and concrete choices of ansätze. With this, the output quantum circuit is ready to be used for training.
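To make these stages concrete, here is a minimal sketch of the pipeline on a single sentence. It is not part of the original experiment: the example sentence and hyperparameter values are illustrative, but the components (BobcatParser, remove_cups, IQPAnsatz) are the same ones we use in the full experiment below.

from lambeq import AtomicType, BobcatParser, IQPAnsatz, remove_cups

# 1. Parse the sentence and encode it as a string diagram
parser = BobcatParser()
diagram = parser.sentence2diagram('skillful person cooks meal .')

# 2. Rewrite: remove cups to reduce postselections in the final circuit
diagram = remove_cups(diagram)

# 3. Apply an ansatz to turn the abstract diagram into a quantum circuit
ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 1},
                   n_layers=1, n_single_qubit_params=3)
circuit = ansatz(diagram)
circuit.draw()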
To demonstrate, we will train a hybrid model based on lambeq's PennyLaneModel, which will determine whether a given pair of sentences is talking about different topics. We will use an IQPAnsatz to convert string diagrams into quantum circuits; when circuits are passed to the model, they will be automatically converted into PennyLane circuits.
Preparing the hybrid QNLP model
To train our hybrid model, we start by importing PyTorch and NumPy and setting a seed for reproducibility.
import torch
import random
import numpy as np

# set seed for reproducibility
SEED = 12
torch.manual_seed(SEED)
random.seed(SEED)
np.random.seed(SEED)
Inputting data
Let's read the data and print some example sentences. These data come from an example task provided with lambeq, which includes train and test sets, as well as a development/validation set that can be used for hyperparameter optimisation and early stopping. In this task, our data are pairs of sentences, each about cooking or computing. The label is 1 if both sentences in a pair are about the same topic, and 0 otherwise.
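For reference, the read_data function below assumes that each line of these CSV files contains two comma-separated sentences followed by the binary same-topic label. The lines shown here are illustrative, reconstructed from the samples printed further down:

person runs program .,skillful person cooks meal .,0
man bakes tasty dinner .,person cooks meal .,1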
def read_data(filename):
    labels, sentences = [], []
    with open(filename) as f:
        for line in f:
            line = line.split(',')
            labels.append(int(line[2]))
            sentences.append((line[0], line[1]))
    return labels, sentences

train_labels, train_data = read_data('mc_pair_train_data.csv')
dev_labels, dev_data = read_data('mc_pair_dev_data.csv')
test_labels, test_data = read_data('mc_pair_test_data.csv')

print(list(zip(train_data[:5], train_labels[:5])))
[(('person runs program .', 'skillful person cooks meal .'), 0),
 (('skillful man bakes dinner .', 'woman runs program .'), 0),
 (('man bakes tasty dinner .', 'person cooks meal .'), 1),
 (('woman runs application .', 'skillful man prepares program .'), 1),
 (('man bakes tasty meal .', 'man runs software .'), 0)]
Creating and parameterizing diagrams
The first step in this process is to convert the sentences into string diagrams using the BobcatParser, a state-of-the-art combinatory categorial grammar parser included with lambeq. BobcatParser expects a list of strings, so we will flatten our paired data for the parsing steps and re-pair the sentences later.
train_data_l, train_data_r = zip(*train_data)
train_data_unpaired = list(train_data_l) + list(train_data_r)

dev_data_l, dev_data_r = zip(*dev_data)
dev_data_unpaired = list(dev_data_l) + list(dev_data_r)

test_data_l, test_data_r = zip(*test_data)
test_data_unpaired = list(test_data_l) + list(test_data_r)
from lambeq import BobcatParser

reader = BobcatParser(verbose='text')

raw_train_diagrams = reader.sentences2diagrams(train_data_unpaired)
raw_dev_diagrams = reader.sentences2diagrams(dev_data_unpaired)
raw_test_diagrams = reader.sentences2diagrams(test_data_unpaired)
Simplifying diagrams
We simplify the diagrams by calling remove_cups; because a cup in a string diagram corresponds to a Bell-effect postselection in a quantum circuit, removing cups reduces the number of postselections per diagram, making the circuits more efficient to evaluate.
from lambeq import remove_cups

train_diagrams = [remove_cups(diagram) for diagram in raw_train_diagrams]
dev_diagrams = [remove_cups(diagram) for diagram in raw_dev_diagrams]
test_diagrams = [remove_cups(diagram) for diagram in raw_test_diagrams]
We can visualise these diagrams using discopy.monoidal.Diagram.draw.
train_diagrams[1].draw()

Creating circuits
To run the experiments on a quantum computer, we apply a quantum ansatz to the string diagrams. For this experiment we use an IQPAnsatz, where one-qubit systems represent noun wires (n) and sentence wires (s).
from lambeq import AtomicType, IQPAnsatz

ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 1},
                   n_layers=1, n_single_qubit_params=3)

train_circuits = [ansatz(diagram) for diagram in train_diagrams]
dev_circuits = [ansatz(diagram) for diagram in dev_diagrams]
test_circuits = [ansatz(diagram) for diagram in test_diagrams]

train_circuits[0].draw(figsize=(6, 8))

In general, sentence circuits will have outputs of size 2^n, where n is the number of open output wires in the circuit. We have constructed our circuits so that each has a single output wire, so our outputs will be of the form [a, b], where a corresponds to the probability that the output state is 0, and b to the probability that it is 1.
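If you want to check this before building the full model, here is a minimal sketch that evaluates a single circuit with the same PennyLaneModel API we use below. The probe_model name is only an illustration, and the printed values will depend on the randomly initialised weights; the point is that the result is a two-element probability vector of the form [a, b].

from lambeq import PennyLaneModel

# build a throwaway model from the circuits so each token gets trainable parameters
probe_model = PennyLaneModel.from_diagrams(train_circuits,
                                           probabilities=True,
                                           normalize=True)
probe_model.initialise_weights()

# evaluate one circuit: the result has shape (1, 2), i.e. [a, b]
print(probe_model.get_diagram_output([train_circuits[0]]))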
Hybrid QNLP model
This hybrid model determines whether a pair of diagrams refers to the same or different topics, by first running the two circuits to get a probability output for each, then concatenating the outputs and passing them to a simple neural network.
We expect the circuits to learn to output [0, 1] or [1, 0], depending on the topic they are referring to (in the case of our example, cooking or computing), and the neural network to learn the XOR function to determine whether the topics are the same (output 0) or different (output 1).
PennyLane allows us to train both the circuits and the neural network (NN) simultaneously, using PyTorch autograd.
BATCH_SIZE = 50
EPOCHS = 100
SEED = 2
As the probability outputs from our circuits are guaranteed to be positive, we transform each output x to 2 * (x - 0.5), so the inputs given to the neural network will be in the range [-1, 1]. This helps us avoid "dying ReLUs", which could otherwise occur if all the input weights to a given hidden neuron were negative. In that case, the overall input to the neuron would also be negative, ReLU would set its output to 0, and the gradient of all these weights would be 0 for every sample, so the neuron would never learn. (Alternative approaches include initializing all the NN weights to be positive, or using LeakyReLU as the activation function.)
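To make the tensor shapes concrete, here is a small illustrative sketch (the probability values are made up) of the transformation that the model's forward pass applies below: the two [a, b] circuit outputs are concatenated into a 4-dimensional feature vector, which is why the first linear layer in the network has 4 input features, and then rescaled into [-1, 1].

import torch

# illustrative circuit outputs for the two sentences in a pair (values made up)
first_out = torch.tensor([[0.96, 0.04]])   # roughly "cooking"
second_out = torch.tensor([[0.07, 0.93]])  # roughly "computing"

# concatenate along the feature dimension: shape (1, 4)
pair_features = torch.cat((first_out, second_out), dim=1)

# rescale from [0, 1] to [-1, 1] before the neural network
pair_features = 2 * (pair_features - 0.5)
print(pair_features)  # tensor([[ 0.9200, -0.9200, -0.8600,  0.8600]])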
from torch import nn
from lambeq import PennyLaneModel

# inherit from PennyLaneModel to use the PennyLane circuit evaluation
class XORSentenceModel(PennyLaneModel):
    def __init__(self, **kwargs):
        PennyLaneModel.__init__(self, **kwargs)

        self.xor_net = nn.Sequential(nn.Linear(4, 10),
                                     nn.ReLU(),
                                     nn.Linear(10, 1),
                                     nn.Sigmoid())

    def forward(self, diagram_pairs):
        first_d, second_d = zip(*diagram_pairs)
        # evaluate each circuit and concatenate the results
        evaluated_pairs = torch.cat((self.get_diagram_output(first_d),
                                     self.get_diagram_output(second_d)),
                                    dim=1)
        evaluated_pairs = 2 * (evaluated_pairs - 0.5)
        # pass the concatenated results through a simple neural network
        return self.xor_net(evaluated_pairs)
Creating paired dataset
We flattened our data in order to parse the sentences into string diagrams; now we need to pair up the resulting circuits again.
def make_pair_data(diagrams):
    pair_diags = list(zip(diagrams[:len(diagrams)//2],
                          diagrams[len(diagrams)//2:]))
    return pair_diags

train_pair_circuits = make_pair_data(train_circuits)
dev_pair_circuits = make_pair_data(dev_circuits)
test_pair_circuits = make_pair_data(test_circuits)
Initializing the model
As XORSentenceModel inherits from PennyLaneModel, we can pass in probabilities=True and normalize=True to XORSentenceModel.from_diagrams, so that the outputs of circuit evaluation will be normalized probabilities of each potential output state.
from lambeq import Dataset

all_pair_circuits = (train_pair_circuits + dev_pair_circuits + test_pair_circuits)
a, b = zip(*all_pair_circuits)

# initialise our model by passing in the diagrams,
# so that we have trainable parameters for each token
model = XORSentenceModel.from_diagrams(a + b, probabilities=True, normalize=True)
model.initialise_weights()
model = model.double()

# initialise datasets and optimizers as in PyTorch
train_pair_dataset = Dataset(train_pair_circuits, train_labels, batch_size=BATCH_SIZE)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
Training and accuracies
We train the model using pure PyTorch, with a standard training loop.
def accuracy(circs, labels):
    predicted = model(circs)
    return (torch.round(torch.flatten(predicted)) ==
            torch.DoubleTensor(labels)).sum().item() / len(circs)
best = {'acc': 0, 'epoch': 0}

for i in range(EPOCHS):
    epoch_loss = 0
    for circuits, labels in train_pair_dataset:
        optimizer.zero_grad()
        predicted = model(circuits)
        # use BCELoss as our outputs are probabilities, and labels are binary
        loss = torch.nn.functional.binary_cross_entropy(
            torch.flatten(predicted), torch.DoubleTensor(labels))
        epoch_loss += loss.item()
        loss.backward()
        optimizer.step()

    # evaluate on dev set every 5 epochs
    # save the model if it's the best so far
    # stop training if the model hasn't improved for 10 epochs
    if i % 5 == 0:
        dev_acc = accuracy(dev_pair_circuits, dev_labels)

        print('Epoch: {}'.format(i))
        print('Train loss: {}'.format(epoch_loss))
        print('Dev acc: {}'.format(dev_acc))

        if dev_acc > best['acc']:
            best['acc'] = dev_acc
            best['epoch'] = i
            model.save('xor_model.lt')
        elif i - best['epoch'] >= 10:
            print('Early stopping')
            break

# load the best performing iteration of the model on the dev set
if best['acc'] > accuracy(dev_pair_circuits, dev_labels):
    model.load('xor_model.lt')
    model = model.double()
Epoch: 0
Train loss: 4.288879457948259
Dev acc: 0.485
Epoch: 5
Train loss: 4.077649859011359
Dev acc: 0.54
Epoch: 10
Train loss: 2.0303092858583365
Dev acc: 0.77
Epoch: 15
Train loss: 1.212296025928454
Dev acc: 0.805
Epoch: 20
Train loss: 1.4002607296794438
Dev acc: 0.915
Epoch: 25
Train loss: 0.1731687529107136
Dev acc: 1.0
Epoch: 30
Train loss: 0.0760011289161605
Dev acc: 1.0
Epoch: 35
Train loss: 0.048206356983370065
Dev acc: 1.0
Early stopping
Let's use the test data to check how well our model is performing:
print('Final test accuracy: {}'.format(accuracy(test_pair_circuits, test_labels)))
Final test accuracy: 1.0
Analysing the internal representations of the model
We hypothesized that the quantum circuits would be able to separate the representations of sentences about cooking and computing, and that the classical NN would learn to XOR these representations to give the model output. Here we can look at parts of the model separately to determine whether this hypothesis was accurate.
First, we can look at the output of the NN when given the four possible binary inputs to XOR.
xor_labels = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]
# the first two entries correspond to the same label for both sentences,
# the last two to different labels
xor_tensors = torch.tensor(xor_labels).double()

model.xor_net(xor_tensors).detach().numpy()
array([[0.93369337],
       [0.86299455],
       [0.0272331 ],
       [0.00149232]])
In the case that the labels are the same (the first two outputs), the outputs are significantly greater than 0.5 (close to the correct label 1), and in the case that the labels are different (the final two outputs), the outputs are significantly less than 0.5 (close to the correct label 0), so the NN seems to have learned the XOR function.
We can also look at the outputs of some of the test circuits to determine whether they have been able to separate the two classes of sentences.
# cooking sentence
print(test_data[0][0])

p_circ = test_pair_circuits[0][0].to_pennylane(probabilities=True)
symbol_weight_map = dict(zip(model.symbols, model.weights))
p_circ.initialise_concrete_params(symbol_weight_map)
unnorm = p_circ.eval().detach().numpy()
print(unnorm / np.sum(unnorm))
skillful man bakes sauce .
array([0.03813219, 0.96186781])
Let's compare the result with a sentence of the other category.
# computing sentence
print(test_data[1][0])

p_circ = test_pair_circuits[1][0].to_pennylane(probabilities=True)
p_circ.initialise_concrete_params(symbol_weight_map)
unnorm = p_circ.eval().detach().numpy()
print(unnorm / np.sum(unnorm))
skillful person prepares software .
array([0.93893711, 0.06106289])
From these examples, it seems that the circuits can differentiate between the two topics strongly, assigning approximately [0, 1] to the sentence about food and [1, 0] to the sentence about computing.
Summary
In this blog post, we have briefly introduced lambeq and its new integration with PennyLane, including a simple example of training a hybrid model using PennyLane as the backend. The latest lambeq release also enables users to access PennyLane's differentiation methods and to use PennyLane plugins to run models on real quantum devices. A more in-depth tutorial showcasing many of these features can be found in the lambeq documentation.
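If you want to try one of those plugins, the rough shape of the change is sketched below. This is illustrative only: the backend_config keyword and its keys, the plugin name, and the device identifier are assumptions here and depend on the lambeq version and the PennyLane plugins you have installed, so please check the lambeq documentation for the exact supported options.

# Illustrative sketch only: consult the lambeq documentation for the exact
# backend_config keys and the PennyLane plugins/devices available to you.
backend_config = {'backend': 'qiskit.aer',    # assumed plugin name
                  'device': 'aer_simulator',  # assumed device name
                  'shots': 1000}

hw_model = XORSentenceModel.from_diagrams(a + b,
                                          probabilities=True,
                                          normalize=True,
                                          backend_config=backend_config)
hw_model.initialise_weights()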
We always appreciate questions and suggestions from the community; any feedback on the lambeq GitHub or the lambeq Discord is very welcome!
To learn more about lambeq, check out these resources:
About the authors
Charlie London
Charlie London is a theoretical computer scientist and software developer. At Quantinuum, Charlie has been working on QNLP experiments and recently developed the PennyLane module in lambeq.
Thomas Cervoni
Thomas Cervoni has a mathematical physics and business background. At Quantinuum, Thomas contributes to developing educational content for the community; you can contact him if you have any questions/suggestions.
Bob Coecke
Bob Coecke is chief scientist at Quantinuum and professor emeritus at Oxford University. Bob pioneered QNLP and diagrammatic reasoning for quantum computing, the most used instance of which is the ZX calculus.