Conventional Datasets for QML

In addition to quantum data, quantum algorithms can also work with conventional data, frequently referred to as “classical” data. For example, ongoing research investigates whether quantum machine learning algorithms can be used to label, categorize, or recognize features in objects like images, video or sound recordings.

Using machine learning datasets with PennyLane

Various datasets have become an integral part of machine learning research, and are typically used to benchmark new machine learning algorithms and techniques. Many of these are too large for current quantum models, so we either fall back on small datasets from the early days of machine learning (such as the Iris flower dataset from 1936 or the MNIST handwritten digit dataset from 1994), apply dimensionality reduction techniques to larger data, or both.
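As a rough illustration of dimensionality reduction, the sketch below projects a feature matrix onto its leading principal components using plain NumPy. The random array `X` is a hypothetical stand-in for a larger dataset, and the choice of four components (for example, one feature per qubit) is an assumption made only for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical stand-in for a larger dataset: 200 samples, 64 features each.
X = rng.normal(size=(200, 64))

# Assumed target size, e.g. one feature per qubit on a 4-qubit device.
n_components = 4

# Centre the data so the principal directions describe variance around the mean.
X_centered = X - X.mean(axis=0)

# The rows of Vt are the principal directions, ordered by decreasing singular value.
_, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project every sample onto the leading principal directions.
X_reduced = X_centered @ Vt[:n_components].T

print(X_reduced.shape)  # (200, 4)
```

Library routines such as scikit-learn's `PCA` perform the same kind of projection and are usually more convenient in practice.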

Standard machine learning datasets can be downloaded easily from within Python, and they fit naturally into PennyLane workflows. For example, scikit-learn, TensorFlow, and PyTorch all provide built-in loaders for common datasets. Once loaded in your machine learning library of choice, the data can be used with PennyLane via our embedding templates.

Embedding classical data in quantum algorithms

When working with classical data in quantum algorithms, an important (and non-trivial) aspect is how the classical data is embedded within the quantum system.

For more details, check out the Embedding Templates that come with PennyLane. These templates help automate various data pre-processing steps that convert classical information into normalized vectors or other quantum-compatible input.

Note that how best to incorporate and embed classical data for use in quantum algorithms is still an active area of research!

Example demos

PennyLane also has various demos that showcase the use of classical data. Examples include:

  1. The MNIST database - labelled digital images of handwritten digits. An example can be found in the Quanvolutional Neural Networks demo.
  2. Iris flower dataset - measurements of various features of flower samples from three different Iris species and the species labels. An example can be found in the Variational Classifier demo.
  3. Hymenoptera dataset - digital images of ants and bees useful for binary classification. An example can be found in the Quantum Transfer Learning demo.