
Quantum Neural Network (QNN)

This tutorial shows how to build a Quantum Neural Network (QNN) that classifies MNIST handwritten digits, using the FRQI quantum image encoding and the quantum neural network module built into QuICT.

Importing the Libraries

First, import the required libraries and dependencies, and fix the random seed:

import collections
import yaml
import time
import tqdm
import numpy as np  # used explicitly below; avoids relying on the star import
import matplotlib.pyplot as plt
from torchvision import datasets, transforms

from QuICT.algorithm.quantum_machine_learning.ansatz_library import BasicQNN
from QuICT.algorithm.quantum_machine_learning.encoding import FRQI
from QuICT.algorithm.quantum_machine_learning.model.QNN import QuantumNet
from QuICT.algorithm.quantum_machine_learning.optimizer.optimizer import Adam
from QuICT.algorithm.quantum_machine_learning.utils.data import Dataset, DataLoader
from QuICT.algorithm.quantum_machine_learning.utils.loss import MSELoss
from QuICT.algorithm.quantum_machine_learning.utils.ml_utils import *


SEED = 0
set_seed(SEED)

Loading and Preprocessing the MNIST Data

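In this tutorial we classify the digits 3 and 6, and use the FRQI encoding [2] to encode grayscale images as quantum circuits. The main goal of preprocessing MNIST is to let each image be encoded as a quantum circuit while retaining as much information as possible.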

1. Load the raw MNIST data

The datasets module in PyTorch's torchvision library can download the MNIST handwritten digit dataset automatically:

train_data = datasets.MNIST(root="./data/", train=True, download=True)
test_data = datasets.MNIST(root="./data/", train=False, download=True)
train_X = train_data.data
train_Y = train_data.targets
test_X = test_data.data
test_Y = test_data.targets
print("Training examples: ", len(train_Y))
print("Testing examples: ", len(test_Y))
Training examples:  60000
Testing examples:  10000

2. Filter the dataset so that it contains only the digits 3 and 6

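To perform binary classification of 3 versus 6, we remove all other digits and keep only the samples labeled 3 or 6. The label y = 6 is defined as the positive class and y = 3 as the negative class: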

def filter_targets(X, Y, class0=3, class1=6):
    idx = (Y == class0) | (Y == class1)
    X, Y = (X[idx], Y[idx])
    Y = Y == class1
    return X, Y
train_X, train_Y = filter_targets(train_X, train_Y)
test_X, test_Y = filter_targets(test_X, test_Y)
print("Filtered training examples: ", len(train_Y))
print("Filtered testing examples: ", len(test_Y))
Filtered training examples:  12049
Filtered testing examples:  1968

Pick an arbitrary sample and display it:

print("Label: ", train_Y[200])
plt.imshow(train_X[200], cmap="gray")
Label:  tensor(False)

Image

3. Downscale the images

The original MNIST images are \(28\times28\), whereas FRQI encoding only applies to images with a resolution of \(2^n \times 2^n\), so they need to be downscaled to \(16\times16\):

def downscale(X, resize):
    transform = transforms.Resize(size=resize)
    X = transform(X) / 255.0
    return X
RESIZE = (16, 16)
resized_train_X = downscale(train_X, RESIZE)
resized_test_X = downscale(test_X, RESIZE)

As before, display the image at index 200:

plt.imshow(resized_train_X[200], cmap="gray")

Resized Image
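The target size follows from the power-of-two constraint: 16 is the largest power of two that does not exceed 28. A minimal sketch of that arithmetic is given here; `largest_pow2_side` is a hypothetical helper name used for illustration and is not part of QuICT or torchvision.

```python
def largest_pow2_side(side: int) -> int:
    """Largest power of two that does not exceed `side` (side >= 1)."""
    return 1 << (side.bit_length() - 1)


side = largest_pow2_side(28)
n = side.bit_length() - 1  # exponent n such that side == 2**n
print(side, n)  # 16 4
```

A smaller size such as 8x8 would also satisfy the constraint, at the cost of discarding more image detail.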

4. Remove conflicting data

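After downsampling, some images may become duplicates, and those duplicates may be labeled as both the positive and the negative class. To keep this from affecting the classification result, samples whose image is labeled as both 3 and 6 are removed: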

def remove_conflict(X, Y, resize):
    x_dict = collections.defaultdict(set)
    for x, y in zip(X, Y):
        x_dict[tuple(x.numpy().flatten())].add(y.item())
    X_rmcon = []
    Y_rmcon = []
    for x in x_dict.keys():
        if len(x_dict[x]) == 1:
            X_rmcon.append(np.array(x).reshape(resize))
            Y_rmcon.append(list(x_dict[x])[0])
    X = np.array(X_rmcon)
    Y = np.array(Y_rmcon)
    return X, Y
train_X, train_Y = remove_conflict(resized_train_X, train_Y, RESIZE)
test_X, test_Y = remove_conflict(resized_test_X, test_Y, RESIZE)
print("Remaining training examples: ", len(train_Y))
print("Remaining testing examples: ", len(test_Y))
Remaining training examples:  12049
Remaining testing examples:  1968
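To see what the conflict filter does when conflicts are present, here is a toy sketch in plain Python. The function name `remove_conflict_toy` and the four-pixel "images" are made up for illustration; the logic mirrors the dictionary-of-label-sets idea used in the function above.

```python
import collections


def remove_conflict_toy(images, labels):
    """Keep each distinct image once, and only if all its copies share one label."""
    seen = collections.defaultdict(set)
    for img, label in zip(images, labels):
        seen[tuple(img)].add(label)
    return [
        (list(img), next(iter(lbls)))
        for img, lbls in seen.items()
        if len(lbls) == 1
    ]


# The first and third images are identical but carry different labels.
images = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0], [1, 1, 0, 0]]
labels = [True, False, False, False]
print(remove_conflict_toy(images, labels))  # [([1, 1, 0, 0], False)]
```

Note that grouping by image content also collapses exact duplicates that share a label into a single sample.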

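The result shows that this dataset contains no conflicting samples. The step is still necessary: if the digits being classified or the preprocessing change (for example a different downsampling size or a different number of gray levels), conflicting data is quite likely to appear.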

Encoding Image Data into Quantum States with Quantum Circuits

This tutorial uses the FRQI encoding (Flexible Representation of Quantum Images) [2] to convert image data into quantum states.

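For a \(2^n \times 2^n\) image, FRQI encoding uses \(2n + 1\) qubits. The first \(n\) qubits describe the Y coordinate of a pixel and the next \(n\) qubits describe its X coordinate; these \(2n\) qubits are collectively called position qubits and describe where the pixel is in the image. The last qubit describes the pixel's gray value and is called the color qubit. FRQI encoding turns the image into the quantum state: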

\[ |I(\theta)⟩ = \frac{1}{2^n} \sum_{i=0}^{2^{2n}-1} (\cos \theta_{i}|0⟩ + \sin \theta_{i}|1⟩) \otimes |i⟩ \]

where

\[ \theta_{i} \in \left [ 0, \frac{\pi}{2} \right ], i = 0, 1, ..., 2^{2n}-1 \]
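As a worked example of the formula, the following sketch builds the FRQI amplitudes for a tiny 2x2 image directly, without QuICT. It assumes that a pixel value p in [0, 1] maps to the angle theta = p * pi / 2, and it orders the amplitudes with the color qubit as the leftmost tensor factor, as written in the formula; `frqi_state` is a hypothetical helper name.

```python
import math


def frqi_state(pixels):
    """Amplitudes of the FRQI state for a flattened 2^n x 2^n image in [0, 1].

    Assumption: pixel value p maps to theta = p * pi / 2.
    Amplitude index = color_bit * N + i, where N is the number of pixels.
    """
    n_pixels = len(pixels)
    norm = 1.0 / math.sqrt(n_pixels)  # equals 1 / 2^n since n_pixels = 2^(2n)
    state = [0.0] * (2 * n_pixels)
    for i, p in enumerate(pixels):
        theta = p * math.pi / 2
        state[i] = norm * math.cos(theta)             # color qubit in |0>
        state[n_pixels + i] = norm * math.sin(theta)  # color qubit in |1>
    return state


state = frqi_state([0.0, 1.0, 0.5, 0.0])     # a 2x2 image, n = 1
print(len(state))                             # 8 amplitudes = 2^(2n+1)
print(round(sum(a * a for a in state), 12))   # 1.0, the state is normalized
```

The normalization holds for any image because each pixel contributes (cos^2 + sin^2) / N = 1 / N to the total probability.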

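FRQI is therefore a hybrid encoding. The first part, \(\cos \theta_{i}|0⟩ + \sin \theta_{i}|1⟩\), is a continuous encoding of the pixel color; in the circuit, different gray values are represented by controlled Ry gates with different rotation angles. The second part, \(|i⟩\), is a discrete encoding of the pixel position; in the circuit, the control pattern on the position qubits represents the pixel's horizontal and vertical coordinates.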
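The link between the Ry rotation angle and the color amplitudes can be checked numerically. The sketch below assumes the standard convention Ry(phi) = [[cos(phi/2), -sin(phi/2)], [sin(phi/2), cos(phi/2)]], under which Ry(2 * theta) maps |0> to cos(theta)|0> + sin(theta)|1>. It uses plain Python lists rather than QuICT gates, and the helper names are illustrative.

```python
import math


def ry(phi):
    """Matrix of the single-qubit Ry(phi) rotation in the standard convention."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    return [[c, -s], [s, c]]


def apply(matrix, vec):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


theta = math.pi / 6
amp0, amp1 = apply(ry(2 * theta), [1.0, 0.0])  # rotate |0>
print(math.isclose(amp0, math.cos(theta)), math.isclose(amp1, math.sin(theta)))  # True True
```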

QuICT's encoding library has FRQI built in. Take this 4x4 binary image as an example:

img = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
plt.imshow(img, cmap="gray")

sample_image

frqi = FRQI(grayscale=2)
cir = frqi(img)
cir.gate_decomposition(decomposition=False)
cir.draw()

FRQI

Info

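QuICT's encoding library currently provides three image encodings: QubitLattice, FRQI, and NEQR. FRQI and NEQR support grayscale images and include a quantum image compression option that reduces the number of gates in the encoded circuit.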

Here we generate the FRQI encoding circuits in batch:

def encoding_img(X, encoding):
    data_circuits = []
    for i in tqdm.tqdm(range(len(X))):
        data_circuit = encoding(X[i])
        data_circuits.append(data_circuit)
    return data_circuits
frqi = FRQI(grayscale=256)
train_X = encoding_img(train_X, frqi)
test_X = encoding_img(test_X, frqi)
100%|██████████| 12049/12049 [00:11<00:00, 1035.43it/s]
100%|██████████| 1968/1968 [00:02<00:00, 788.43it/s]

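Finally, set batch_size, the number of samples used in each iteration, and load the encoded image circuits into a DataLoader that supports iterating over the data. The data is shuffled before each training epoch so that the model does not learn the data order, and when the total number of samples is not divisible by batch_size the last batch is dropped: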

BATCH_SIZE = 32
train_dataset = Dataset(train_X, train_Y)
test_dataset = Dataset(test_X, test_Y)
train_loader = DataLoader(
    dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
)
test_loader = DataLoader(
    dataset=test_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
)
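The effect of drop_last on the number of iterations can be worked out by hand. The following sketch uses a hypothetical helper `num_batches` (not a QuICT function) to compute how many batches each loader yields for the dataset sizes reported earlier.

```python
def num_batches(n_samples: int, batch_size: int, drop_last: bool) -> int:
    """Number of batches produced when iterating over n_samples items."""
    full, remainder = divmod(n_samples, batch_size)
    return full if drop_last or remainder == 0 else full + 1


print(num_batches(12049, 32, drop_last=True))  # 376
print(num_batches(1968, 32, drop_last=True))   # 61
print(12049 - 376 * 32, 1968 - 61 * 32)        # 17 16 samples left out per pass
```

Because the loaders shuffle, the samples left out differ from one pass to the next.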

Quantum Neural Network

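This tutorial builds the model circuit with a simplified version of the method used by Farhi et al. [1], constructing the circuit mainly from two-qubit gates that always act on the readout qubit.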

1. Build the model circuit

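In addition to the data qubits, the model circuit has one extra readout qubit that stores the predicted class: whether the Measure gate yields \(\left | 0 \right \rangle\) or \(\left | 1 \right \rangle\) determines whether the input image belongs to the positive or the negative class. Simulations usually use a Pauli-Z measurement. The spectral decomposition of the Z operator is: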

\[ Z=\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}= 1 \times \left | 0 \right \rangle \left \langle 0 \right | + (-1) \times \left | 1 \right \rangle \left \langle 1 \right | \]

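The Z operator therefore has two eigenvalues, \(+1\) and \(-1\), with eigenvectors \(\left | 0 \right \rangle\) and \(\left | 1 \right \rangle\) respectively. In a projective measurement with Z, an outcome of \(+1\) projects the measured state onto \(\left | 0 \right \rangle\), and an outcome of \(-1\) projects it onto \(\left | 1 \right \rangle\).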

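In QuICT, the measurement expectation value can be computed by setting a Hamiltonian \(H\), and the predicted probabilities of the positive and negative classes are then obtained from the expectation value: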

\[ E = \left \langle \psi \right | H \left | \psi \right \rangle = 1 \times p(+1) + (-1) \times p(-1) \]
\[ p(+1) = \frac{1+E}{2} \quad p(-1) = \frac{1-E}{2} \]
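These two relations can be verified on a single-qubit state. The sketch below is a plain Python illustration, not QuICT's Hamiltonian API; the function names are hypothetical. It computes the Z expectation value from the amplitudes and then recovers the outcome probabilities from it.

```python
import math


def z_expectation(state):
    """<psi|Z|psi> for a single-qubit state [a0, a1] (amplitudes may be complex)."""
    p0, p1 = abs(state[0]) ** 2, abs(state[1]) ** 2
    return p0 - p1


def probabilities(expectation):
    """Recover p(+1) and p(-1) from the Z expectation value."""
    return (1 + expectation) / 2, (1 - expectation) / 2


psi = [math.sqrt(0.8), math.sqrt(0.2)]
e = z_expectation(psi)
p_plus, p_minus = probabilities(e)
print(round(e, 6), round(p_plus, 6), round(p_minus, 6))  # 0.6 0.8 0.2
```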

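The model circuit used by Farhi et al. [1] is built from two-qubit gates (usually RXX, RYY, RZZ, and RZX gates) that always act on the readout qubit and, in turn, on every data qubit. QuICT has this QNN model circuit built in; by convention the last qubit is the readout qubit and the other qubits are data qubits.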

For example, with 4 data qubits, a two-layer QNN model circuit with an RYY layer and an RXX layer is:

basic_qnn = BasicQNN(5, ["YY", "XX"])
basic_qnn_cir = basic_qnn.init_circuit()
basic_qnn_cir.draw()

PQC
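The structure described above can be enumerated without QuICT. The sketch below lists one (gate, data qubit, readout qubit) triple per two-qubit gate, assuming the last qubit is the readout qubit and each layer couples every data qubit to it once. It is an illustration of the description only; `qnn_layout` is a hypothetical name, and the gate order inside QuICT's BasicQNN may differ.

```python
def qnn_layout(n_qubits, layers):
    """List (gate, data_qubit, readout_qubit) triples, layer by layer.

    Illustrative only: the last qubit is the readout qubit and each layer
    couples every data qubit to it with one parameterized two-qubit gate.
    """
    readout = n_qubits - 1
    return [
        ("R" + layer.lower(), data, readout)
        for layer in layers
        for data in range(readout)
    ]


layout = qnn_layout(5, ["YY", "XX"])
print(len(layout))            # 8 gates, one trainable angle each
print(layout[0], layout[-1])  # ('Ryy', 0, 4) ('Rxx', 3, 4)
```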

Info

QuICT's ansatz library provides four built-in ansatzes: BasicQNN, CRADL, CRAML, and Hardware-Efficient Ansatz.

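This tutorial uses a three-layer network consisting of an RYY layer, an RZZ layer, and an RZX layer. The number of trainable parameters is the number of data qubits times the number of layers, that is, 9 × 3 = 27: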

n_qubits = int(np.log2(RESIZE[0] * RESIZE[1])) + 2
ansatz = BasicQNN(n_qubits, ["YY", "ZZ", "ZX"])
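The qubit and parameter counts can be checked step by step. The following sketch repeats the arithmetic with only the standard library; the intermediate variable names are chosen here for readability and do not come from QuICT.

```python
import math

RESIZE = (16, 16)
layers = ["YY", "ZZ", "ZX"]

position_qubits = int(math.log2(RESIZE[0] * RESIZE[1]))  # 8 (4 for Y, 4 for X)
data_qubits = position_qubits + 1                        # plus the color qubit: 9
n_qubits = data_qubits + 1                               # plus the readout qubit: 10
n_params = data_qubits * len(layers)                     # one angle per data qubit per layer

print(n_qubits, n_params)  # 10 27
```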

2. Train with QuICT's built-in QuantumNet

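Once the encoded image data is wrapped and the model circuit is built as needed, training can be done with QuICT's QuantumNet module. QuantumNet supports automatic differentiation: passing in the model circuit and the optimizer is enough to perform the forward and backward passes needed to update the QNN.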

First, set the machine learning hyperparameters:

EPOCH = 10  # total number of training epochs
LR = 0.001  # learning rate for gradient descent

Define the QNN to be trained, the loss function, and the classical optimizer. Here we use the mean squared error loss and the Adam optimizer:

loss_fun = MSELoss()
optimizer = Adam(lr=LR)
net = QuantumNet(
    n_qubits=n_qubits,
    ansatz=ansatz,
    optimizer=optimizer,
    device="CPU",
)

Info

QuICT has three built-in loss functions (MSE, Hinge, and BCE) and four classical optimizers (SGD, AdaGrad, RMSProp, and Adam).

Start training, running one round of validation after each training epoch:

for ep in range(EPOCH):
    # Train
    loader = tqdm.tqdm(
        train_loader, desc="Training epoch {}".format(ep + 1), leave=True
    )
    for it, (x_train, y_train) in enumerate(loader):
        expectations = net.forward(x_train)  # (32, 1)
        y_true = (2 * y_train - 1.0).reshape(expectations.shape)
        y_pred = -expectations
        loss = loss_fun(y_pred, y_true)
        # optimize
        net.backward(loss)
        # update
        net.update()

        correct = np.where(y_true * y_pred.pargs > 0)[0].shape[0]
        accuracy = correct / len(y_train)
        loader.set_postfix(
            it=it, loss="{:.3f}".format(loss.item), accuracy="{:.3f}".format(accuracy)
        )

    # Validation
    loader_val = tqdm.tqdm(
        test_loader, desc="Validating epoch {}".format(ep + 1), leave=True
    )
    loss_val_list = []
    total_correct = 0
    for it, (x_test, y_test) in enumerate(loader_val):
        expectations_val = net.forward(x_test, train=False)
        y_true_val = (2 * y_test - 1.0).reshape(expectations_val.shape)
        y_pred_val = -expectations_val

        loss_val = loss_fun(y_pred_val, y_true_val)
        loss_val_list.append(loss_val.item)
        correct_val = np.where(y_true_val * y_pred_val.pargs > 0)[0].shape[0]

        total_correct += correct_val
        accuracy_val = correct_val / len(y_test)
        loader_val.set_postfix(
            it=it,
            loss="{:.3f}".format(loss_val.item),
            accuracy="{:.3f}".format(accuracy_val),
        )
    avg_loss = np.mean(loss_val_list)
    qnn_avg_acc = total_correct / (len(loader_val) * BATCH_SIZE)
    print("Validation Average Loss: {}, Accuracy: {}".format(avg_loss, qnn_avg_acc))
Training epoch 1: 100%|██████████| 376/376 [02:41<00:00,  2.32it/s, accuracy=0.906, it=375, loss=0.935]
Validating epoch 1: 100%|██████████| 61/61 [00:18<00:00,  3.34it/s, accuracy=0.938, it=60, loss=0.938]
Validation Average Loss: 0.9394127107271177, Accuracy: 0.8837090163934426
Training epoch 2: 100%|██████████| 376/376 [02:41<00:00,  2.32it/s, accuracy=0.844, it=375, loss=0.934]
Validating epoch 2: 100%|██████████| 61/61 [00:18<00:00,  3.35it/s, accuracy=0.844, it=60, loss=0.907]
Validation Average Loss: 0.9182945990390312, Accuracy: 0.9118852459016393
Training epoch 3: 100%|██████████| 376/376 [02:42<00:00,  2.32it/s, accuracy=0.906, it=375, loss=0.883]
Validating epoch 3: 100%|██████████| 61/61 [00:18<00:00,  3.35it/s, accuracy=0.938, it=60, loss=0.874]
Validation Average Loss: 0.8638879795421284, Accuracy: 0.9252049180327869
Training epoch 4: 100%|██████████| 376/376 [02:41<00:00,  2.32it/s, accuracy=0.969, it=375, loss=0.823]
Validating epoch 4: 100%|██████████| 61/61 [00:18<00:00,  3.33it/s, accuracy=0.906, it=60, loss=0.804]
Validation Average Loss: 0.7808000876160641, Accuracy: 0.9646516393442623
Training epoch 5: 100%|██████████| 376/376 [02:41<00:00,  2.32it/s, accuracy=0.938, it=375, loss=0.760]
Validating epoch 5: 100%|██████████| 61/61 [00:18<00:00,  3.35it/s, accuracy=0.969, it=60, loss=0.755]
Validation Average Loss: 0.7496410566096504, Accuracy: 0.9605532786885246
Training epoch 6: 100%|██████████| 376/376 [02:42<00:00,  2.32it/s, accuracy=0.938, it=375, loss=0.728]
Validating epoch 6: 100%|██████████| 61/61 [00:18<00:00,  3.37it/s, accuracy=0.906, it=60, loss=0.742]
Validation Average Loss: 0.7384923100156554, Accuracy: 0.9600409836065574
Training epoch 7: 100%|██████████| 376/376 [02:41<00:00,  2.32it/s, accuracy=1.000, it=375, loss=0.716]
Validating epoch 7: 100%|██████████| 61/61 [00:18<00:00,  3.35it/s, accuracy=0.938, it=60, loss=0.740]
Validation Average Loss: 0.7300563092932201, Accuracy: 0.9646516393442623
Training epoch 8: 100%|██████████| 376/376 [02:42<00:00,  2.31it/s, accuracy=1.000, it=375, loss=0.716]
Validating epoch 8: 100%|██████████| 61/61 [00:18<00:00,  3.36it/s, accuracy=1.000, it=60, loss=0.688]
Validation Average Loss: 0.724440254241452, Accuracy: 0.9697745901639344
Training epoch 9: 100%|██████████| 376/376 [02:43<00:00,  2.31it/s, accuracy=1.000, it=375, loss=0.713]
Validating epoch 9: 100%|██████████| 61/61 [00:18<00:00,  3.36it/s, accuracy=1.000, it=60, loss=0.732]
Validation Average Loss: 0.7224605226930816, Accuracy: 0.9707991803278688
Training epoch 10: 100%|██████████| 376/376 [02:42<00:00,  2.32it/s, accuracy=1.000, it=375, loss=0.703]
Validating epoch 10: 100%|██████████| 61/61 [00:17<00:00,  3.42it/s, accuracy=0.969, it=60, loss=0.722]
Validation Average Loss: 0.7216466591301091, Accuracy: 0.9702868852459017
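The loop above maps boolean labels to +1 and -1, negates the readout expectation to obtain the prediction, and counts a prediction as correct when it has the same sign as the label. The sketch below reproduces that convention on made-up numbers with plain Python; the `mse` helper is the textbook mean squared error and is only assumed to match what MSELoss computes.

```python
def to_signed(labels):
    """Map boolean labels to +1.0 (positive class) or -1.0 (negative class)."""
    return [2 * int(y) - 1.0 for y in labels]


def mse(pred, true):
    """Textbook mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def accuracy(pred, true):
    """A prediction is correct when it has the same sign as the label."""
    return sum(p * t > 0 for p, t in zip(pred, true)) / len(pred)


expectations = [-0.5, 0.25, 0.75, -0.125]  # made-up readout expectation values
y_true = to_signed([True, False, True, False])
y_pred = [-e for e in expectations]        # the loop negates the expectation
print(accuracy(y_pred, y_true))            # 0.5
print(mse(y_pred, y_true))                 # 1.28515625
```

Accuracy depends only on the sign of the prediction, while the MSE also penalizes predictions whose magnitude is far from 1, so a high accuracy can coexist with a fairly large loss.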

3. Test with the trained model

QuICT has built-in functions for saving and loading models. If you already have a trained model, you can load it to test how well the QNN classifies.

# Restore checkpoint
model_path = "YOUR MODEL PATH"
restore_checkpoint(net, model_path, restore_optim=True)
loader_test = tqdm.tqdm(
    test_loader, desc="Testing: ", leave=True
)
loss_test_list = []
total_correct = 0
for it, (x_test, y_test) in enumerate(loader_test):
    expectations_test = net.forward(x_test, train=False)
    y_true_test = (2 * y_test - 1.0).reshape(expectations_test.shape)
    y_pred_test = -expectations_test

    loss_test = loss_fun(y_pred_test, y_true_test)
    loss_test_list.append(loss_test.item)
    correct_test = np.where(y_true_test * y_pred_test.pargs > 0)[0].shape[0]

    total_correct += correct_test
    accuracy_test = correct_test / len(y_test)
    loader_test.set_postfix(
        it=it,
        loss="{:.3f}".format(loss_test.item),
        accuracy="{:.3f}".format(accuracy_test),
    )
avg_loss = np.mean(loss_test_list)
qnn_avg_acc = total_correct / (len(loader_test) * BATCH_SIZE)
print("Testing Average Loss: {}, Accuracy: {}".format(avg_loss, qnn_avg_acc))
Testing: 100%|██████████| 61/61 [00:21<00:00,  2.90it/s, accuracy=0.969, it=60, loss=0.759]
Testing Average Loss: 0.7215073446063779, Accuracy: 0.9702868852459017

Comparison with a Classical Artificial Neural Network

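Next we build a classical artificial neural network (ANN) and run a comparison under the same conditions (the same preprocessed data, optimizer, and loss function).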

First, import PyTorch and build a simple ANN for \(16 \times 16\) images:

import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassicalNet(nn.Module):
    def __init__(self):
        super(ClassicalNet, self).__init__()
        self.fc1 = torch.nn.Sequential(torch.nn.Linear(256, 16), torch.nn.ReLU())
        self.fc2 = torch.nn.Linear(16, 1)

    def forward(self, x):
        x = torch.from_numpy(x).type(torch.float)
        x = x.view(x.shape[0], -1)  # flatten each image without hard-coding the batch size
        x = self.fc1(x)
        x = self.fc2(x)
        x = x.flatten()
        return x
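For reference, the number of trainable parameters in this network can be counted by hand. The sketch below uses a hypothetical helper `linear_params` for a fully connected layer with bias and sets the result next to the 27 parameters stated for the QNN.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


ann_params = linear_params(256, 16) + linear_params(16, 1)
qnn_params = 9 * 3  # data qubits * layers, as stated for the QNN
print(ann_params, qnn_params)  # 4129 27
```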

The task is simple for an ANN and the network has few parameters, so to avoid overfitting the number of training epochs is reduced to 1:

EPOCH = 1
classical_net = ClassicalNet().to(torch.device("cpu"))
classical_optim = torch.optim.Adam([dict(params=classical_net.parameters(), lr=LR)])
loss_func = nn.MSELoss()
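# Note: classical_train_loader and classical_test_loader are not defined in this
# tutorial; they are assumed to yield batches of the preprocessed 16x16 images
# (before FRQI encoding) together with their labels.
# train epoch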
for ep in range(EPOCH):
    classical_net.train()
    loader = tqdm.tqdm(
        classical_train_loader, desc="Training epoch {}".format(ep + 1), leave=True
    )
    # train iteration
    for it, (x_train, y_train) in enumerate(loader):
        classical_optim.zero_grad()
        y_pred = classical_net(x_train).type(torch.float)
        y_train = 2 * y_train - 1.0
        y_train = torch.tensor(y_train).type(torch.float)
        loss = loss_func(y_pred, y_train)
        correct = np.sum((y_pred * y_train).detach().numpy() > 0)
        accuracy = correct / len(y_train)
        loss.backward()
        classical_optim.step()
        loader.set_postfix(
            it=it,
            loss="{:.3f}".format(loss),
            accuracy="{:.3f}".format(accuracy),
        )

    # Validation
    classical_net.eval()
    loader_val = tqdm.tqdm(
        classical_test_loader, desc="Validating epoch {}".format(ep + 1), leave=True
    )
    loss_val_list = []
    total_correct = 0
    for it, (x_test, y_test) in enumerate(loader_val):
        y_pred = classical_net(x_test).type(torch.float)
        y_test = 2 * y_test - 1.0
        y_test = torch.tensor(y_test).type(torch.float)
        loss_val = loss_func(y_pred, y_test)
        loss_val_list.append(loss_val.cpu().detach().numpy())
        correct = np.sum((y_pred * y_test).detach().numpy() > 0)
        total_correct += correct
        accuracy_val = correct / len(y_test)
        loader_val.set_postfix(
            it=it,
            loss="{:.3f}".format(loss_val),
            accuracy="{:.3f}".format(accuracy_val),
        )
    avg_loss = np.mean(loss_val_list)
    ann_avg_acc = total_correct / (len(loader_val) * BATCH_SIZE)
    print(
        "Validation Average Loss: {}, Accuracy: {}".format(avg_loss, ann_avg_acc)
    )
Training epoch 1: 100%|██████████| 376/376 [00:00<00:00, 482.75it/s, accuracy=1.000, it=375, loss=0.069]
Validating epoch 1: 100%|██████████| 61/61 [00:00<00:00, 1983.92it/s, accuracy=1.000, it=60, loss=0.076]
Validation Average Loss: 0.06602335721254349, Accuracy: 0.9959016393442623

Accuracy of the QNN and the ANN in classifying the digits 3 and 6 from the MNIST handwritten digit dataset:

import seaborn as sns

ax = sns.barplot(x=["QNN", "ANN"], y=[qnn_avg_acc, ann_avg_acc], palette="muted")
ax.set_yticks(ticks=[0, 0.2, 0.4, 0.6, 0.8, 1.0])

QNN_ANN


References

[1] Farhi, E. & Neven, H. Classification with Quantum Neural Networks on Near Term Processors. arXiv:1802.06002 (2018)

[2] Le, P.Q., Dong, F. & Hirota, K. A flexible representation of quantum images for polynomial preparation, image compression, and processing operations. Quantum Inf Process 10, 63–84 (2011). https://doi.org/10.1007/s11128-010-0177-y