```python
def make_data(sentences):
    """Convert word sequences into index sequences."""
    enc_inputs, dec_inputs, dec_outputs = [], [], []
    for i in range(len(sentences)):
        enc_input = [[src_vocab[n] for n in sentences[i][0].split()]]
        dec_input = [[tgt_vocab[n] for n in sentences[i][1].split()]]
        dec_output = [[tgt_vocab[n] for n in sentences[i][2].split()]]
        enc_inputs.extend(enc_input)
        dec_inputs.extend(dec_input)
        dec_outputs.extend(dec_output)
    return torch.LongTensor(enc_inputs), torch.LongTensor(dec_inputs), torch.LongTensor(dec_outputs)
```
```python
sentences = [
    # the number of Chinese and English tokens need not match
    #  enc_input                 dec_input                      dec_output
    ['我 有 一 个 好 朋 友 P', 'S I have a good friend .', 'I have a good friend . E'],
    ['我 有 零 个 女 朋 友 P', 'S I have zero girl friend .', 'I have zero girl friend . E'],
    ['我 有 一 个 男 朋 友 P', 'S I have a boy friend .', 'I have a boy friend . E'],
]
```
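As a sanity check, the conversion can be exercised end to end. The sketch below is hypothetical, not the post's code: the post defines `src_vocab` / `tgt_vocab` by hand and wraps the results in `torch.LongTensor`, while here the vocabularies are derived automatically and plain lists are returned so the snippet runs standalone.

```python
# Hypothetical standalone sketch of the data pipeline above.
sentences = [
    ['我 有 一 个 好 朋 友 P', 'S I have a good friend .', 'I have a good friend . E'],
    ['我 有 零 个 女 朋 友 P', 'S I have zero girl friend .', 'I have zero girl friend . E'],
]

# Derive word -> index maps from the data (the post writes them out by hand).
src_vocab, tgt_vocab = {}, {}
for src, dec_in, dec_out in sentences:
    for w in src.split():
        src_vocab.setdefault(w, len(src_vocab))      # Chinese side
    for w in (dec_in + ' ' + dec_out).split():
        tgt_vocab.setdefault(w, len(tgt_vocab))      # English side

def make_data_lists(sentences):
    """Same idea as make_data, but returning nested Python lists."""
    enc, dec_in_ids, dec_out_ids = [], [], []
    for s in sentences:
        enc.append([src_vocab[n] for n in s[0].split()])
        dec_in_ids.append([tgt_vocab[n] for n in s[1].split()])
        dec_out_ids.append([tgt_vocab[n] for n in s[2].split()])
    return enc, dec_in_ids, dec_out_ids

enc, dec_i, dec_o = make_data_lists(sentences)
print(len(enc), len(enc[0]))  # 2 sentences, 8 source tokens each
```

Each row of `enc` lines up position-for-position with one source sentence, which is what lets the batch later be stacked into a single tensor.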
```
66%|██████▋   | 7298/10990 [13:28<06:17,  9.78it/s]
Generated Text: cause being free is a state of mind steady he written control charm he he ’ your bottle wedding des bottle des des bottle

25%|██▌       | 2799/10990 [04:31<10:29, 13.02it/s]
Generated Text: i give her all my love yo tell turn girl girl girl girl turn yo girl met

62%|██████▏   | 6799/10990 [12:30<07:09,  9.75it/s]
Generated Text: that's why i've done it again. no - no - roof. double rides rides rides writtenves double rides rides bottle rides des rides rides rides he written rides rides bottle pity forth
```
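The repetitive tails ("rides rides rides …", "girl girl girl …") are characteristic of greedy decoding: the argmax token at each step is fed back in, and once the distribution collapses onto one token the loop never escapes until the length limit or an end token. A toy illustration with a hypothetical step function (not the post's model):

```python
# Toy illustration of why greedy decoding produces repetitive tails.
def greedy_decode(step_fn, max_len=10, eos='E'):
    """Autoregressive greedy decoding: take the argmax token each step."""
    out = []
    for _ in range(max_len):
        tok = step_fn(out)   # step_fn plays the role of argmax(model(prefix))
        if tok == eos:
            break
        out.append(tok)
    return out

# Hypothetical next-token function whose distribution collapses onto 'rides'.
def toy_step(prefix):
    if len(prefix) < 2:
        return ['i', 'have'][len(prefix)]
    return 'rides'

print(greedy_decode(toy_step))
```

Sampling with temperature, or top-k / top-p filtering, is the usual way to break such loops, since it stops the decoder from deterministically re-selecting the same token.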
```
40%|████      | 2301/5708 [09:51<14:29, 3.92it/s] Epoch [2/2], Loss: 4.5555
predicted_tokens: i'm not alone about i i'm not the place [SEP] the [SEP] [SEP] [SEP] [SEP] [SEP]
ground_truth:     i'm not worried, and i'm in a hurry to die [SEP] [PAD] [PAD] [PAD] [PAD]

91%|█████████ | 5201/5708 [22:41<02:16, 3.71it/s] Epoch [2/2], Loss: 4.4399
predicted_tokens: you ready for go? [SEP]??????????????
ground_truth:     you ready to roll? [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
```
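When eyeballing `predicted_tokens` against `ground_truth`, the trailing `[PAD]` positions should not count toward quality; this is the same reason the training loss is typically computed with `ignore_index` set to the pad id so padded positions contribute no gradient. A minimal sketch with a hypothetical helper (not from the post):

```python
from itertools import zip_longest

def token_accuracy(pred, truth, pad='[PAD]'):
    """Token-level accuracy, skipping positions whose ground truth is [PAD]."""
    pairs = [(p, t) for p, t in zip_longest(pred, truth, fillvalue='')
             if t != pad]
    return sum(p == t for p, t in pairs) / len(pairs)

pred  = "you ready for go? [SEP]".split()
truth = "you ready to roll? [SEP] [PAD] [PAD]".split()
print(token_accuracy(pred, truth))  # 3 of 5 non-pad positions match -> 0.6
```

The same masking idea applies to the reported loss: averaging over pad positions would understate it, since predicting `[PAD]` after `[SEP]` is trivial.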
As these examples show, the model has quickly learned to produce meaningful text, breaking from its earlier habit of generating meaningless strings that always started with "i". Let's now look at the TensorBoard curves from training.
Only the purple and yellow lines matter here: both runs load the same initial model weights and train for two epochs on the same dataset (the purple run uses batch size 512, the yellow 64; see the PS for why). The purple run reaches a lower loss (4.5 vs. 4.583) and converges noticeably faster (steeper curve), which indirectly confirms our earlier observation: the problem we identified is real, and the proposed fix is sound. Perfect!