r"""
.. role:: html(raw)
   :format: html

.. _variational_classifier:

Variational classifier
======================

.. meta::
    :property="og:description": Using PennyLane to implement quantum circuits that can be trained from labelled data to
        classify new data samples.
    :property="og:image": https://pennylane.ai/qml/_images/classifier_output_59_0.png

.. related::

   tutorial_data_reuploading_classifier Data-reuploading classifier
   tutorial_multiclass_classification Multiclass margin classifier
   tutorial_ensemble_multi_qpu Ensemble classification

*Author: PennyLane dev team. Last updated: 19 Jan 2021.*

In this tutorial, we show how to use PennyLane to implement variational
quantum classifiers: quantum circuits that can be trained from labelled
data to classify new data samples. The architecture is inspired by
`Farhi and Neven (2018) <https://arxiv.org/abs/1802.06002>`__ as well as
`Schuld et al. (2018) <https://arxiv.org/abs/1804.00633>`__.
"""

##############################################################################
#
# We will first show that the variational quantum classifier can reproduce
# the parity function
#
# .. math::
#
#     f: x \in \{0,1\}^{\otimes n} \rightarrow y =
#     \begin{cases} 1 & \text{if there is an odd number of ones in } x \\
#     0 & \text{otherwise.} \end{cases}
#
# This optimization example demonstrates how to encode binary inputs into
# the initial state of the variational circuit, which is simply a
# computational basis state.
#
# We then show how to encode real vectors as amplitude vectors (*amplitude
# encoding*) and train the model to recognize the first two classes of
# flowers in the Iris dataset.
#
# 1. Fitting the parity function
# ------------------------------
#
# Imports
# ~~~~~~~
#
# As before, we import PennyLane, the PennyLane-provided version of NumPy,
# and an optimizer.





##############################################################################
# Quantum and classical nodes
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# We create a quantum device with four “wires” (or qubits).



##############################################################################
# Variational classifiers usually define a “layer” or “block”, which is an
# elementary circuit architecture that gets repeated to build the
# variational circuit.
#
# Our circuit layer consists of an arbitrary rotation on every qubit, as
# well as CNOTs that entangle each qubit with its neighbour.















##############################################################################
# We also need a way to encode data inputs :math:`x` into the circuit, so
# that the measured output depends on the inputs. In this first example,
# the inputs are bitstrings, which we encode into the state of the qubits.
# The quantum state :math:`|\psi\rangle` after
# state preparation is a computational basis state that has 1s where
# :math:`x` has 1s, for example

#
# .. math::  x = 0101 \rightarrow |\psi \rangle = |0101 \rangle .
#
# We use the :class:`~pennylane.BasisState` operation provided by PennyLane, which expects
# ``x`` to be a list of zeros and ones, e.g., ``[0, 1, 0, 1]``.






##############################################################################
# Now we define the quantum node as a state preparation routine, followed
# by a repetition of the layer structure. Borrowing from machine learning,
# we call the parameters ``weights``.













##############################################################################
# Unlike in previous examples, the quantum node takes the data as a
# keyword argument ``x`` (with the default value ``None``). Keyword
# arguments of a quantum node are treated as fixed when calculating a
# gradient; they are never trained.
#
# If we want to add a “classical” bias parameter, the variational quantum
# classifier also needs some post-processing. We define the final model by
# a classical node that uses the first variable, and feeds the remainder
# into the quantum node. Before this, we reshape the list of remaining
# variables for easy use in the quantum node.






##############################################################################
# Cost
# ~~~~
#
# In supervised learning, the cost function is usually the sum of a loss
# function and a regularizer. We use the standard square loss that
# measures the distance between target labels and model predictions.
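# One way to write the square loss (it works on plain numbers as well as
# on PennyLane's tensors):

```python
def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2
    return loss / len(labels)
```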











##############################################################################
# To monitor how many inputs the current classifier predicted correctly,
# we also define the accuracy given target labels and model predictions.
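# The accuracy simply counts the fraction of predictions that hit their
# target label (up to a small numerical tolerance):

```python
def accuracy(labels, predictions):
    acc = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            acc = acc + 1
    return acc / len(labels)
```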













##############################################################################
# For learning tasks, the cost depends on the data: here, the features
# and labels considered in the current iteration of the optimization
# routine.







##############################################################################
# Optimization
# ~~~~~~~~~~~~
#
# Let’s now load and preprocess some data.
#
# .. note::
#
#     The parity dataset can be downloaded
#     :html:`<a href="https://raw.githubusercontent.com/XanaduAI/qml/master/demonstrations/variational_classifier/data/parity.txt"
#     download=parity.txt target="_blank">here</a>` and
#     should be placed in the subfolder ``variational_classifier/data``.











##############################################################################
# We initialize the variables randomly (but fix a seed for
# reproducibility). The first variable in the list is used as a bias,
# while the rest is fed into the gates of the variational circuit.









##############################################################################
# Next we create an optimizer and choose a batch size…




##############################################################################
# …and run the optimization. We track the accuracy, i.e., the share of
# correctly classified data samples. For this we compute the outputs of
# the variational classifier and turn them into predictions in
# :math:`\{-1,1\}` by taking the sign of the output.






















##############################################################################
# 2. Iris classification
# ----------------------
#
# Quantum and classical nodes
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# To encode real-valued vectors into the amplitudes of a quantum state, we
# use a 2-qubit simulator.



##############################################################################
# State preparation is not as simple as when we represent a bitstring
# with a basis state. Every input :math:`x` has to be translated into a
# set of angles which are fed into a small routine for state
# preparation. To simplify things a little, we will work with data from
# the positive subspace, so that we can ignore signs (which would
# require another cascade of rotations around the z-axis).
#
# The circuit is coded according to the scheme in `Möttönen et al.
# (2004) <https://arxiv.org/abs/quant-ph/0407010>`__, or, as presented
# for positive vectors only, in `Schuld and Petruccione
# (2018) <https://link.springer.com/book/10.1007/978-3-319-96424-9>`__. We
# also had to decompose controlled Y-axis rotations into more basic
# circuits, following `Nielsen and Chuang
# (2010) <http://www.michaelnielsen.org/qcqi/>`__.






























##############################################################################
# Let’s test if this routine actually works.




















##############################################################################
# Note that the ``default.qubit`` simulator provides a shortcut to
# ``statepreparation`` with the command
# ``qml.QubitStateVector(x, wires=[0, 1])``. However, some devices may not
# support an arbitrary state-preparation routine.
#
# Since we are working with only 2 qubits now, we need to update the layer
# function as well.








##############################################################################
# The variational classifier model and its cost remain essentially the
# same, but we have to reload them with the new state preparation and
# layer functions.





















##############################################################################
# Data
# ~~~~
#
# We then load the Iris data set. There is a bit of preprocessing to do in
# order to encode the inputs into the amplitudes of a quantum state. In
# the last preprocessing step, we translate the inputs x to rotation
# angles using the ``get_angles`` function we defined above.
#
# .. note::
#
#     The Iris dataset can be downloaded
#     :html:`<a href="https://raw.githubusercontent.com/XanaduAI/qml/master/demonstrations/variational_classifier/data/iris_classes1and2_scaled.txt"
#     download=iris_classes1and2_scaled.txt target="_blank">here</a>` and should be placed
#     in the subfolder ``variational_classifier/data``.





# pad the vectors to size 2^2 with constant values




# normalize each input




# angles for state preparation are new features





##############################################################################
# These angles are our new features, which is why we have renamed ``X``
# to ``features`` above. Let’s plot the stages of preprocessing and play
# around with the dimensions (``dim1``, ``dim2``). Some of them still
# separate the classes well, while others are less informative.
#
# *Note: To run the following code you need the matplotlib library.*


































##############################################################################
# This time we want to generalize from the data samples. To monitor the
# generalization performance, the data is split into training and
# validation sets.










# We need these later for plotting



##############################################################################
# Optimization
# ~~~~~~~~~~~~
#
# First we initialize the variables.







##############################################################################
# Again we optimize the cost. This may take a little patience.




# train the variational classifier
























##############################################################################
# We can plot the continuous output of the variational classifier for the
# first two dimensions of the Iris data set.




# make data for decision regions



# preprocess grid points like data inputs above










# plot decision regions








# plot data



































