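Quantum Neural Network (QNN)¶
This tutorial shows how to build a quantum neural network (QNN) that classifies digits from the MNIST handwritten digit dataset, using the FRQI quantum image encoding and the quantum neural network module built into QuICT.
Import Libraries¶
First, import the required libraries and dependencies, and fix the random seed: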
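import collections
import yaml
import time
import tqdm
import matplotlib.pyplot as plt
# numpy, seaborn and torch are used later in this tutorial, so import them explicitly
import numpy as np
import seaborn as sns
import torch
from torchvision import datasets, transforms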
from QuICT.algorithm.quantum_machine_learning.ansatz_library import BasicQNN
from QuICT.algorithm.quantum_machine_learning.encoding import FRQI
from QuICT.algorithm.quantum_machine_learning.model.QNN import QuantumNet
from QuICT.algorithm.quantum_machine_learning.optimizer.optimizer import Adam
from QuICT.algorithm.quantum_machine_learning.utils.data import Dataset, DataLoader
from QuICT.algorithm.quantum_machine_learning.utils.loss import MSELoss
from QuICT.algorithm.quantum_machine_learning.utils.ml_utils import *
SEED = 0
set_seed(SEED)
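Load and Preprocess the MNIST Data¶
In this tutorial we classify the digits 3 and 6, and use the FRQI encoding [2] to encode grayscale images as quantum circuits. The main purpose of preprocessing the MNIST dataset is to make sure each image can be encoded as a quantum circuit while retaining as much information as possible.
1. Load the raw MNIST data¶
The datasets module in PyTorch's torchvision library can download the MNIST handwritten digit dataset automatically: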
train_data = datasets.MNIST(root="./data/", train=True, download=True)
test_data = datasets.MNIST(root="./data/", train=False, download=True)
train_X = train_data.data
train_Y = train_data.targets
test_X = test_data.data
test_Y = test_data.targets
print("Training examples: ", len(train_Y))
print("Testing examples: ", len(test_Y))
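2. Filter the dataset to keep only the digits 3 and 6¶
To perform binary classification of the digits 3 and 6, we remove all other digits and keep only the samples labeled 3 or 6. We define the label y = 6 as the positive class and y = 3 as the negative class: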
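def filter_targets(X, Y, class0=3, class1=6):
    # Keep only the samples whose label is class0 or class1
    idx = (Y == class0) | (Y == class1)
    X, Y = (X[idx], Y[idx])
    # True marks the positive class (class1), False the negative class (class0)
    Y = Y == class1
    return X, Y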
train_X, train_Y = filter_targets(train_X, train_Y)
test_X, test_Y = filter_targets(test_X, test_Y)
print("Filtered training examples: ", len(train_Y))
print("Filtered testing examples: ", len(test_Y))
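Pick a random sample and display it:
3. Downscale the images¶
The original MNIST images are \(28\times28\), but FRQI encoding only applies to images with a resolution of \(2^n \times 2^n\), so we downscale them to \(16\times16\):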
def downscale(X, resize):
    transform = transforms.Resize(size=resize)
    # Resize, then rescale pixel values from [0, 255] to [0, 1]
    X = transform(X) / 255.0
    return X
RESIZE = (16, 16)
resized_train_X = downscale(train_X, RESIZE)
resized_test_X = downscale(test_X, RESIZE)
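Similarly, display the image with index 200:
4. Remove conflicting samples¶
Downsampling may produce duplicate images, and some of these duplicates may carry both the positive and the negative label. To keep this from affecting the classification result, we remove every image that is labeled as both 3 and 6: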
def remove_conflict(X, Y, resize):
    # Map each distinct image (as a flat tuple of pixels) to the set of its labels
    x_dict = collections.defaultdict(set)
    for x, y in zip(X, Y):
        x_dict[tuple(x.numpy().flatten())].add(y.item())
    X_rmcon = []
    Y_rmcon = []
    for x in x_dict.keys():
        # Keep an image only if it carries exactly one label
        if len(x_dict[x]) == 1:
            X_rmcon.append(np.array(x).reshape(resize))
            Y_rmcon.append(list(x_dict[x])[0])
    X = np.array(X_rmcon)
    Y = np.array(Y_rmcon)
    return X, Y
train_X, train_Y = remove_conflict(resized_train_X, train_Y, RESIZE)
test_X, test_Y = remove_conflict(resized_test_X, test_Y, RESIZE)
print("Remaining training examples: ", len(train_Y))
print("Remaining testing examples: ", len(test_Y))
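The result shows that this dataset contains no conflicting samples. The step is still necessary: changing the digits to classify or the preprocessing (for example the downsampled image size or the number of gray levels) makes conflicting samples very likely.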
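To make the effect of this step concrete, here is a minimal sketch on toy data. It uses plain Python lists instead of tensors, and the helper name `remove_conflict_toy` and the sample images are hypothetical, chosen only for illustration. It follows the same idea as `remove_conflict`: group identical images, then keep only those that carry a single label.

```python
import collections


def remove_conflict_toy(images, labels):
    """Keep each distinct image once, and only if all its copies share one label."""
    seen = collections.defaultdict(set)
    for img, label in zip(images, labels):
        key = tuple(v for row in img for v in row)  # flatten to a hashable key
        seen[key].add(label)
    return [(key, next(iter(lbls))) for key, lbls in seen.items() if len(lbls) == 1]


images = [
    [[0, 1], [1, 0]],  # same pixels, labeled True ...
    [[0, 1], [1, 0]],  # ... and False: a conflict, so both copies are dropped
    [[1, 1], [0, 0]],  # same pixels, same label twice: kept once
    [[1, 1], [0, 0]],
    [[0, 0], [0, 1]],  # unique image: kept
]
labels = [True, False, True, True, False]

kept = remove_conflict_toy(images, labels)
print(len(kept))  # 2
```

Note that grouping by pixel content also collapses exact duplicates with the same label into a single sample.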
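Encode Image Data as Quantum States with a Quantum Circuit¶
This tutorial uses the FRQI encoding (Flexible Representation of Quantum Images) [2] to convert image data into quantum states.
For a \(2^n \times 2^n\) image, FRQI encoding uses \(2n + 1\) qubits. The first \(n\) qubits describe the Y coordinate of a pixel and the next \(n\) qubits describe its X coordinate; these \(2n\) qubits are called position qubits and describe where the pixel lies in the image. The last qubit describes the gray value of the pixel and is called the color qubit. FRQI encoding transforms the image into the quantum state:
\[ |I(\theta)\rangle = \frac{1}{2^n} \sum_{i=0}^{2^{2n}-1} \left( \cos\theta_i |0\rangle + \sin\theta_i |1\rangle \right) \otimes |i\rangle \]
where
\[ \theta_i \in \left[ 0, \frac{\pi}{2} \right], \quad i = 0, 1, \dots, 2^{2n}-1. \]
FRQI is thus a hybrid encoding. The first factor, \(\cos \theta_{i}|0\rangle + \sin \theta_{i}|1\rangle\), is a continuous encoding of the pixel color, realized in the circuit by controlled Ry gates whose rotation angles represent the different gray values. The second factor, \(|i\rangle\), is a discrete encoding of the pixel position, realized in the circuit by the control pattern on the position qubits, which represents the row and column coordinates of the pixel.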
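As a rough illustration of the FRQI state, the sketch below computes the amplitude pair of each pixel for a tiny image and checks that the state is normalized. The linear mapping from gray level to angle, \(\theta = \frac{\pi}{2} \cdot \frac{g}{G - 1}\) for \(G\) gray levels, is an assumption made for this example; QuICT's FRQI implementation may use a different convention. The function name `frqi_amplitudes` is hypothetical.

```python
import math


def frqi_amplitudes(pixels, grayscale=256):
    """Return the (cos, sin) amplitude pair of every pixel in an FRQI state.

    pixels: flat list of 2^(2n) integer gray levels in [0, grayscale - 1].
    """
    count = len(pixels)
    is_power_of_four = (
        count > 0 and count & (count - 1) == 0 and (count.bit_length() - 1) % 2 == 0
    )
    if not is_power_of_four:
        raise ValueError("the number of pixels must be 2^(2n)")
    norm = 1.0 / math.sqrt(count)  # equals 1 / 2^n for 2^(2n) pixels
    amplitudes = []
    for gray in pixels:
        # Assumed mapping: gray level -> rotation angle in [0, pi/2]
        theta = (math.pi / 2) * gray / (grayscale - 1)
        amplitudes.append((norm * math.cos(theta), norm * math.sin(theta)))
    return amplitudes


# A 2 x 2 image (n = 1) with pixels black, white, white, black
amps = frqi_amplitudes([0, 255, 255, 0])
total_probability = sum(a * a + b * b for a, b in amps)
print(amps[0])  # (0.5, 0.0): a black pixel puts all its weight on |0> of the color qubit
print(round(total_probability, 12))
```

Each pixel contributes \(\cos^2\theta_i + \sin^2\theta_i = 1\) before normalization, so the prefactor \(1/2^n\) is exactly what makes the total probability equal to 1.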
QuICT's encoding library has FRQI encoding built in. Take this 4x4 binary image as an example:
img = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
plt.imshow(img, cmap="gray")
Info
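QuICT's encoding library currently has three built-in image encodings: QubitLattice, FRQI, and NEQR. FRQI and NEQR support grayscale images and include a quantum image compression option that reduces the number of quantum gates in the encoded circuit.
Here, we generate the quantum circuits for the FRQI encoding in batch: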
def encoding_img(X, encoding):
    # Encode every image in X as a quantum circuit
    data_circuits = []
    for i in tqdm.tqdm(range(len(X))):
        data_circuit = encoding(X[i])
        data_circuits.append(data_circuit)
    return data_circuits
frqi = FRQI(grayscale=256)
train_X = encoding_img(train_X, frqi)
test_X = encoding_img(test_X, frqi)
100%|██████████| 12049/12049 [00:11<00:00, 1035.43it/s]
100%|██████████| 1968/1968 [00:02<00:00, 788.43it/s]
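Finally, set batch_size, the number of samples used in each iteration, and load the encoded image circuits into a DataLoader, which supports iterating over the data. The data is shuffled before each training epoch so that the model does not learn the sample order, and when the total number of samples is not divisible by batch_size, the last batch is dropped: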
BATCH_SIZE = 32
train_dataset = Dataset(train_X, train_Y)
test_dataset = Dataset(test_X, test_Y)
train_loader = DataLoader(
    dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
)
test_loader = DataLoader(
    dataset=test_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
)
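The effect of drop_last on the number of iterations can be worked out with a short calculation. The sketch below assumes that QuICT's DataLoader drops the incomplete final batch in the same way as PyTorch's; the helper `num_batches` is hypothetical. The sample counts are the ones shown in the encoding progress bars above.

```python
def num_batches(n_samples, batch_size, drop_last=True):
    """Number of batches per epoch, with or without the incomplete last batch."""
    full, rest = divmod(n_samples, batch_size)
    return full if drop_last or rest == 0 else full + 1


train_batches = num_batches(12049, 32)
test_batches = num_batches(1968, 32)
print(train_batches, 12049 - train_batches * 32)  # 376 batches, 17 samples dropped
print(test_batches, 1968 - test_batches * 32)     # 61 batches, 16 samples dropped
```

These values agree with the 376 training iterations and 61 validation iterations per epoch reported in the training logs of this tutorial.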
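Quantum Neural Network¶
This tutorial builds the model circuit with a simplified version of the method of Farhi et al. [1]: the circuit consists mainly of two-qubit gates that always act on the readout qubit.
1. Build the model circuit¶
In addition to the data qubits, the model circuit has one extra readout qubit that stores the predicted class. Whether the Measure gate yields \(\left | 0 \right \rangle\) or \(\left | 1 \right \rangle\) determines whether the input image is assigned to the positive or the negative class. In simulation a Pauli-Z measurement is usually used. The spectral decomposition of the Z operator is:
\[ Z = \left | 0 \right \rangle \left \langle 0 \right | - \left | 1 \right \rangle \left \langle 1 \right | \]
So the Z operator has two eigenvalues, \(+1\) and \(-1\), with eigenvectors \(\left | 0 \right \rangle\) and \(\left | 1 \right \rangle\) respectively. In a projective measurement of Z, the outcome \(+1\) projects the measured state onto \(\left | 0 \right \rangle\), and the outcome \(-1\) projects it onto \(\left | 1 \right \rangle\).
In QuICT, the measurement expectation value can be computed by setting a Hamiltonian \(H\), and the predicted probabilities of the positive and negative classes are then obtained from the expectation value: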
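The relation between the expectation value and the two outcome probabilities follows from \(\langle Z \rangle = P(0) - P(1)\) together with \(P(0) + P(1) = 1\). The sketch below is a generic illustration of that arithmetic, not QuICT's API; the function name `readout_probabilities` is hypothetical.

```python
def readout_probabilities(expectation_z):
    """Convert a Pauli-Z expectation value in [-1, 1] into (P(0), P(1))."""
    if not -1.0 <= expectation_z <= 1.0:
        raise ValueError("a Pauli-Z expectation value lies in [-1, 1]")
    p0 = (1.0 + expectation_z) / 2.0  # probability of measuring |0> (eigenvalue +1)
    p1 = (1.0 - expectation_z) / 2.0  # probability of measuring |1> (eigenvalue -1)
    return p0, p1


print(readout_probabilities(1.0))   # (1.0, 0.0)
print(readout_probabilities(-0.5))  # (0.25, 0.75)
```

The training loop later in this tutorial uses the negated expectation value as the prediction for labels in \(\{-1, +1\}\), which suggests that a larger \(P(1)\) corresponds to the positive class there.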
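The model circuit of Farhi et al. [1] is built from two-qubit gates (typically RXX, RYY, RZZ, and RZX gates) that always act on the readout qubit and, in turn, on every data qubit. QuICT has this QNN model circuit built in. By convention, the last qubit is the readout qubit and all other qubits are data qubits:
For example, with 4 data qubits, a two-layer QNN model circuit with an RYY layer and an RXX layer looks like this:
Info
QuICT's ansatz library has four built-in ansatzes available to users: BasicQNN, CRADL, CRAML, and Hardware-Efficient Ansatz.
This tutorial uses a three-layer network consisting of an RYY layer, an RZZ layer, and an RZX layer. The number of trainable parameters is the number of data qubits times the number of layers, that is, 27: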
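The parameter count can be checked with a short calculation. This sketch assumes one trainable angle per data qubit in each layer, which is what the formula above implies; the helper names are hypothetical and do not belong to QuICT.

```python
def frqi_qubits(image_side):
    """Number of qubits FRQI needs for an image of size image_side x image_side."""
    n = image_side.bit_length() - 1
    if image_side <= 0 or 2 ** n != image_side:
        raise ValueError("image side must be a power of two")
    return 2 * n + 1  # 2n position qubits plus 1 color qubit


def qnn_parameter_count(n_data_qubits, layers):
    """One trainable angle per data qubit per layer (assumption)."""
    return n_data_qubits * len(layers)


n_data_qubits = frqi_qubits(16)
n_params = qnn_parameter_count(n_data_qubits, ["RYY", "RZZ", "RZX"])
print(n_data_qubits, n_params)  # 9 27
```

With the extra readout qubit, the full model circuit acts on \(9 + 1 = 10\) qubits.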
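2. Train with QuICT's built-in QuantumNet¶
Once the encoded image data is wrapped in a DataLoader and the model circuit is built as required, training can be done with QuICT's QuantumNet module. QuantumNet supports automatic differentiation: passing in the model circuit and the optimizer is enough to run the forward and backward passes needed to update the QNN.
First, set the machine learning hyperparameters:
Define the QNN to train, the loss function, and the classical optimizer. Here we use the mean squared error loss and the Adam optimizer: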
loss_fun = MSELoss()
optimizer = Adam(lr=LR)
net = QuantumNet(
    n_qubits=n_qubits,
    ansatz=ansatz,
    optimizer=optimizer,
    device="CPU",
)
Info
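QuICT has three built-in loss functions: the MSE loss, the Hinge loss, and the BCE loss. It also includes four classical optimizers: SGD, AdaGrad, RMSProp, and Adam.
Start training. After each training epoch, run one round of validation: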
for ep in range(EPOCH):
    # Train
    loader = tqdm.tqdm(
        train_loader, desc="Training epoch {}".format(ep + 1), leave=True
    )
    for it, (x_train, y_train) in enumerate(loader):
        expectations = net.forward(x_train)  # (32, 1)
        # Map labels {0, 1} to {-1, +1}
        y_true = (2 * y_train - 1.0).reshape(expectations.shape)
        y_pred = -expectations
        loss = loss_fun(y_pred, y_true)
        # optimize
        net.backward(loss)
        # update
        net.update()
        # A prediction is correct when it has the same sign as the label
        correct = np.where(y_true * y_pred.pargs > 0)[0].shape[0]
        accuracy = correct / len(y_train)
        loader.set_postfix(
            it=it, loss="{:.3f}".format(loss.item), accuracy="{:.3f}".format(accuracy)
        )

    # Validation
    loader_val = tqdm.tqdm(
        test_loader, desc="Validating epoch {}".format(ep + 1), leave=True
    )
    loss_val_list = []
    total_correct = 0
    for it, (x_test, y_test) in enumerate(loader_val):
        expectations_val = net.forward(x_test, train=False)
        y_true_val = (2 * y_test - 1.0).reshape(expectations_val.shape)
        y_pred_val = -expectations_val
        loss_val = loss_fun(y_pred_val, y_true_val)
        loss_val_list.append(loss_val.item)
        correct_val = np.where(y_true_val * y_pred_val.pargs > 0)[0].shape[0]
        total_correct += correct_val
        accuracy_val = correct_val / len(y_test)
        loader_val.set_postfix(
            it=it,
            loss="{:.3f}".format(loss_val.item),
            accuracy="{:.3f}".format(accuracy_val),
        )
    avg_loss = np.mean(loss_val_list)
    qnn_avg_acc = total_correct / (len(loader_val) * BATCH_SIZE)
    print("Validation Average Loss: {}, Accuracy: {}".format(avg_loss, qnn_avg_acc))
Training epoch 1: 100%|██████████| 376/376 [02:41<00:00, 2.32it/s, accuracy=0.906, it=375, loss=0.935]
Validating epoch 1: 100%|██████████| 61/61 [00:18<00:00, 3.34it/s, accuracy=0.938, it=60, loss=0.938]
Validation Average Loss: 0.9394127107271177, Accuracy: 0.8837090163934426
Training epoch 2: 100%|██████████| 376/376 [02:41<00:00, 2.32it/s, accuracy=0.844, it=375, loss=0.934]
Validating epoch 2: 100%|██████████| 61/61 [00:18<00:00, 3.35it/s, accuracy=0.844, it=60, loss=0.907]
Validation Average Loss: 0.9182945990390312, Accuracy: 0.9118852459016393
Training epoch 3: 100%|██████████| 376/376 [02:42<00:00, 2.32it/s, accuracy=0.906, it=375, loss=0.883]
Validating epoch 3: 100%|██████████| 61/61 [00:18<00:00, 3.35it/s, accuracy=0.938, it=60, loss=0.874]
Validation Average Loss: 0.8638879795421284, Accuracy: 0.9252049180327869
Training epoch 4: 100%|██████████| 376/376 [02:41<00:00, 2.32it/s, accuracy=0.969, it=375, loss=0.823]
Validating epoch 4: 100%|██████████| 61/61 [00:18<00:00, 3.33it/s, accuracy=0.906, it=60, loss=0.804]
Validation Average Loss: 0.7808000876160641, Accuracy: 0.9646516393442623
Training epoch 5: 100%|██████████| 376/376 [02:41<00:00, 2.32it/s, accuracy=0.938, it=375, loss=0.760]
Validating epoch 5: 100%|██████████| 61/61 [00:18<00:00, 3.35it/s, accuracy=0.969, it=60, loss=0.755]
Validation Average Loss: 0.7496410566096504, Accuracy: 0.9605532786885246
Training epoch 6: 100%|██████████| 376/376 [02:42<00:00, 2.32it/s, accuracy=0.938, it=375, loss=0.728]
Validating epoch 6: 100%|██████████| 61/61 [00:18<00:00, 3.37it/s, accuracy=0.906, it=60, loss=0.742]
Validation Average Loss: 0.7384923100156554, Accuracy: 0.9600409836065574
Training epoch 7: 100%|██████████| 376/376 [02:41<00:00, 2.32it/s, accuracy=1.000, it=375, loss=0.716]
Validating epoch 7: 100%|██████████| 61/61 [00:18<00:00, 3.35it/s, accuracy=0.938, it=60, loss=0.740]
Validation Average Loss: 0.7300563092932201, Accuracy: 0.9646516393442623
Training epoch 8: 100%|██████████| 376/376 [02:42<00:00, 2.31it/s, accuracy=1.000, it=375, loss=0.716]
Validating epoch 8: 100%|██████████| 61/61 [00:18<00:00, 3.36it/s, accuracy=1.000, it=60, loss=0.688]
Validation Average Loss: 0.724440254241452, Accuracy: 0.9697745901639344
Training epoch 9: 100%|██████████| 376/376 [02:43<00:00, 2.31it/s, accuracy=1.000, it=375, loss=0.713]
Validating epoch 9: 100%|██████████| 61/61 [00:18<00:00, 3.36it/s, accuracy=1.000, it=60, loss=0.732]
Validation Average Loss: 0.7224605226930816, Accuracy: 0.9707991803278688
Training epoch 10: 100%|██████████| 376/376 [02:42<00:00, 2.32it/s, accuracy=1.000, it=375, loss=0.703]
Validating epoch 10: 100%|██████████| 61/61 [00:17<00:00, 3.42it/s, accuracy=0.969, it=60, loss=0.722]
Validation Average Loss: 0.7216466591301091, Accuracy: 0.9702868852459017
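The logs show a fairly high loss next to a high accuracy. The sketch below illustrates why both can hold at once, using the same label mapping and sign rule as the training loop. The expectation values are made up for illustration, and the plain-mean `mse` helper is an assumption about how the loss is reduced, not QuICT's MSELoss itself.

```python
def to_signed_labels(labels):
    """Map boolean labels to -1.0 (negative class) and +1.0 (positive class)."""
    return [2 * int(y) - 1.0 for y in labels]


def mse(y_pred, y_true):
    """Mean squared error over a batch (plain mean, an assumption)."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def sign_accuracy(y_pred, y_true):
    """A prediction counts as correct when it has the same sign as its label."""
    return sum(1 for p, t in zip(y_pred, y_true) if p * t > 0) / len(y_true)


expectations = [-0.3, 0.2, 0.1, -0.4]  # hypothetical <Z> values of the readout qubit
labels = [True, False, True, True]     # True is the positive class (digit 6)

y_true = to_signed_labels(labels)
y_pred = [-e for e in expectations]    # same sign convention as the training loop
print(sign_accuracy(y_pred, y_true))   # 0.75
print(round(mse(y_pred, y_true), 3))   # 0.675
```

Accuracy depends only on the sign of each prediction, whereas the MSE also penalizes the distance to \(\pm 1\). Predictions with the right sign but a small magnitude therefore give high accuracy together with a large loss.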
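3. Test with a trained model¶
QuICT has built-in functions for saving and loading models. If you already have a trained model, you can load it to test the classification performance of the QNN.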
# Restore checkpoint
model_path = "YOUR MODEL PATH"
restore_checkpoint(net, model_path, restore_optim=True)

loader_test = tqdm.tqdm(
    test_loader, desc="Testing: ", leave=True
)
loss_test_list = []
total_correct = 0
for it, (x_test, y_test) in enumerate(loader_test):
    expectations_test = net.forward(x_test, train=False)
    y_true_test = (2 * y_test - 1.0).reshape(expectations_test.shape)
    y_pred_test = -expectations_test
    loss_test = loss_fun(y_pred_test, y_true_test)
    loss_test_list.append(loss_test.item)
    correct_test = np.where(y_true_test * y_pred_test.pargs > 0)[0].shape[0]
    total_correct += correct_test
    accuracy_test = correct_test / len(y_test)
    loader_test.set_postfix(
        it=it,
        loss="{:.3f}".format(loss_test.item),
        accuracy="{:.3f}".format(accuracy_test),
    )
avg_loss = np.mean(loss_test_list)
qnn_avg_acc = total_correct / (len(loader_test) * BATCH_SIZE)
print("Testing Average Loss: {}, Accuracy: {}".format(avg_loss, qnn_avg_acc))
Testing: 100%|██████████| 61/61 [00:21<00:00, 2.90it/s, accuracy=0.969, it=60, loss=0.759]
Testing Average Loss: 0.7215073446063779, Accuracy: 0.9702868852459017
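Comparison with a Classical Artificial Neural Network¶
Next, we build a classical artificial neural network (ANN) and run a comparison under the same conditions (the same preprocessed data, optimizer, and loss function).
First, import the PyTorch library and build a simple ANN for \(16 \times 16\) images: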
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassicalNet(nn.Module):
    def __init__(self):
        super(ClassicalNet, self).__init__()
        # 256 input pixels (16 x 16) -> 16 hidden units -> 1 output
        self.fc1 = torch.nn.Sequential(torch.nn.Linear(256, 16), torch.nn.ReLU())
        self.fc2 = torch.nn.Linear(16, 1)

    def forward(self, x):
        x = torch.from_numpy(x).type(torch.float)
        # Flatten each image; use the actual batch size rather than a hard-coded 32
        x = x.view(x.shape[0], -1)
        x = self.fc1(x)
        x = self.fc2(x)
        x = x.flatten()
        return x
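For a sense of scale, the number of trainable parameters in this network can be worked out by hand. The sketch below counts the weights and biases of the two fully connected layers; the helper `linear_params` is hypothetical and only performs the arithmetic.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


fc1 = linear_params(256, 16)  # 4096 weights + 16 biases
fc2 = linear_params(16, 1)    # 16 weights + 1 bias
ann_params = fc1 + fc2
print(fc1, fc2, ann_params)  # 4112 17 4129
```

This ANN is small by classical standards, yet its 4129 parameters are far more than the 27 trainable parameters of the QNN above.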
The task is simple for the ANN and the network is small. To avoid overfitting, the number of training epochs is reduced to 1:
EPOCH = 1
classical_net = ClassicalNet().to(torch.device("cpu"))
classical_optim = torch.optim.Adam([dict(params=classical_net.parameters(), lr=LR)])
loss_func = nn.MSELoss()

# train epoch
for ep in range(EPOCH):
    classical_net.train()
    loader = tqdm.tqdm(
        classical_train_loader, desc="Training epoch {}".format(ep + 1), leave=True
    )
    # train iteration
    for it, (x_train, y_train) in enumerate(loader):
        classical_optim.zero_grad()
        y_pred = classical_net(x_train).type(torch.float)
        y_train = 2 * y_train - 1.0
        y_train = torch.tensor(y_train).type(torch.float)
        loss = loss_func(y_pred, y_train)
        correct = np.sum((y_pred * y_train).detach().numpy() > 0)
        accuracy = correct / len(y_train)
        loss.backward()
        classical_optim.step()
        loader.set_postfix(
            it=it,
            loss="{:.3f}".format(loss),
            accuracy="{:.3f}".format(accuracy),
        )

    # Validation
    classical_net.eval()
    loader_val = tqdm.tqdm(
        classical_test_loader, desc="Validating epoch {}".format(ep + 1), leave=True
    )
    loss_val_list = []
    total_correct = 0
    for it, (x_test, y_test) in enumerate(loader_val):
        y_pred = classical_net(x_test).type(torch.float)
        y_test = 2 * y_test - 1.0
        y_test = torch.tensor(y_test).type(torch.float)
        loss_val = loss_func(y_pred, y_test)
        loss_val_list.append(loss_val.cpu().detach().numpy())
        correct = np.sum((y_pred * y_test).detach().numpy() > 0)
        total_correct += correct
        accuracy_val = correct / len(y_test)
        loader_val.set_postfix(
            it=it,
            loss="{:.3f}".format(loss_val),
            accuracy="{:.3f}".format(accuracy_val),
        )
    avg_loss = np.mean(loss_val_list)
    ann_avg_acc = total_correct / (len(loader_val) * BATCH_SIZE)
    print(
        "Validation Average Loss: {}, Accuracy: {}".format(avg_loss, ann_avg_acc)
    )
Training epoch 1: 100%|██████████| 376/376 [00:00<00:00, 482.75it/s, accuracy=1.000, it=375, loss=0.069]
Validating epoch 1: 100%|██████████| 61/61 [00:00<00:00, 1983.92it/s, accuracy=1.000, it=60, loss=0.076]
Validation Average Loss: 0.06602335721254349, Accuracy: 0.9959016393442623
Accuracy of the QNN and the ANN on classifying the digits 3 and 6 from the MNIST handwritten digit dataset:
ax = sns.barplot(x=["QNN", "ANN"], y=[qnn_avg_acc, ann_avg_acc], palette="muted")
ax.set_yticks(ticks=[0, 0.2, 0.4, 0.6, 0.8, 1.0])
References¶
[1] Farhi, E. & Neven, H. Classification with Quantum Neural Networks on Near Term Processors. arXiv:1802.06002 (2018).
[2] Le, P.Q., Dong, F. & Hirota, K. A flexible representation of quantum images for polynomial preparation, image compression, and processing operations. Quantum Inf Process 10, 63–84 (2011). https://doi.org/10.1007/s11128-010-0177-y





