This project is an introductory tutorial on audio classification built on the Paddle API. It first covers audio fundamentals (what sound is, its three attributes, common formats, and basic processing concepts); then introduces feature-extraction methods such as the short-time Fourier transform and LogFBank; and finally uses audio clips of 杰斯, 金克斯 and 狼母 from Arcane Season 2 to build an LSTM classifier, covering data loading, model training, and testing, with good test results.
A piece of audio is usually split into frames, each holding a fixed-length slice of the signal, typically 25 ms. The distance a frame advances relative to the previous one is the frame shift, typically 10 ms. Each frame is then windowed and passed through a discrete Fourier transform (DFT) to obtain its spectrum.
After framing the audio as above, the Fourier transform reveals the frequency content of each frame. Concatenating the per-frame spectra over time yields the frequency characteristics of the audio at each moment, known as the spectrogram.
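As a sanity check on the framing arithmetic above, here is a small sketch; the 16 kHz sample rate is an assumption for illustration, not taken from the text:

```python
sample_rate = 16000                      # assumed sample rate, for illustration only
frame_len = int(0.025 * sample_rate)     # 25 ms frame  -> 400 samples
frame_shift = int(0.010 * sample_rate)   # 10 ms shift  -> 160 samples

def num_frames(n_samples, frame_len, frame_shift):
    """Number of complete frames that fit in a signal (no padding)."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // frame_shift

# one second of audio at 16 kHz yields 98 full frames
print(num_frames(sample_rate, frame_len, frame_shift))  # 98
```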
The following example uses paddle.signal.stft to extract and visualize the spectral features of a sample audio clip:
!pip install paddlespeech==1.2.0  # install paddlespeech and its dependencies
!pip install paddleaudio==1.0.1
!pip install typeguard==2.13.3
import paddle
import numpy as np
from paddleaudio import load
import matplotlib.pyplot as plt
data, sr = load(file='/home/aistudio/Arcane_3class/test/杰斯_audio60.wav', sr=32000, mono=True, dtype='float32')
x = paddle.to_tensor(data)
n_fft = 1024
win_length = 1024
hop_length = 320

# [D, T]; note the STFT call below uses a hop of 512 samples,
# while hop_length = 320 is reused later by the LogMel extractor
spectrogram = paddle.signal.stft(
    x,
    n_fft=n_fft,
    win_length=win_length,
    hop_length=512,
    onesided=True)
print('spectrogram.shape: {}'.format(spectrogram.shape))
print('spectrogram.dtype: {}'.format(spectrogram.dtype))
spec = np.log(np.abs(spectrogram.numpy())**2)
plt.figure()
plt.title("Log Power Spectrogram")
plt.imshow(spec[:100, :], origin='lower')
plt.show()
spectrogram.shape: [513, 88]
spectrogram.dtype: paddle.complex64
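To see what such a spectrogram encodes, here is a minimal sketch of the same frame-window-FFT pipeline on a synthetic 1 kHz sine; numpy is used instead of paddle so it runs anywhere, and the parameters mirror the cell above:

```python
import numpy as np

sr, n_fft, hop = 16000, 1024, 512
t = np.arange(sr) / sr                     # 1 second of audio
x = np.sin(2 * np.pi * 1000 * t)           # 1 kHz sine

# frame, window, and FFT each frame (one-sided), like paddle.signal.stft
frames = [x[i:i + n_fft] * np.hanning(n_fft)
          for i in range(0, len(x) - n_fft + 1, hop)]
spec = np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2  # [T, D]

# the strongest frequency bin maps back to the sine's frequency
peak_bin = int(spec.mean(axis=0).argmax())
print(peak_bin, peak_bin * sr / n_fft)     # bin 64 -> 1000.0 Hz
```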
Research shows that human perception of sound is nonlinear: as frequency rises, our ability to tell higher frequencies apart keeps dropping.
For example, given the same 500 Hz gap, most people can easily hear the difference between 500 Hz and 1,000 Hz, but can hardly distinguish 10,000 Hz from 10,500 Hz.
The mel scale was therefore proposed: under this frequency measure, equal numerical steps are perceived as equally large changes by the human ear.
The mel mapping samples the low-frequency end of the raw frequency axis densely (assigning it many mel values) and the high-frequency end sparsely (assigning it few), so that low and high mel frequencies are equally discriminable to the ear.
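One common definition of this mapping is the HTK-style formula mel = 2595 * log10(1 + f / 700); the source does not state which variant it uses, but this sketch makes the perceptual compression concrete:

```python
import math

def hz_to_mel(f):
    # HTK-style mel scale: low frequencies are sampled densely
    return 2595.0 * math.log10(1.0 + f / 700.0)

# the same 500 Hz step shrinks on the mel axis as frequency grows
low_step = hz_to_mel(1000) - hz_to_mel(500)      # ~392 mel
high_step = hz_to_mel(10500) - hz_to_mel(10000)  # ~51 mel
print(low_step > high_step)  # True
```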
The Mel FBank computation proceeds as follows; in practice, LogFBank is generally used as the recognition feature:
The following example uses paddleaudio.features.LogMelSpectrogram to extract the LogFBank of the sample audio:
from paddleaudio.features import LogMelSpectrogram
f_min = 50.0
f_max = 14000.0

# - sr: sample rate of the audio file.
# - n_fft: number of FFT points.
# - hop_length: hop between adjacent frames.
# - win_length: window length.
# - window: window type.
# - n_mels: number of mel bins.
feature_extractor = LogMelSpectrogram(
sr=sr,
n_fft=n_fft,
hop_length=hop_length,
win_length=win_length,
window='hann',
f_min=f_min,
f_max=f_max,
n_mels=64)
x = paddle.to_tensor(data).unsqueeze(0)  # [B, L]
log_fbank = feature_extractor(x)         # [B, D, T]
log_fbank = log_fbank.squeeze(0)         # [D, T]
print('log_fbank.shape: {}'.format(log_fbank.shape))
plt.figure()
plt.imshow(log_fbank.numpy(), origin='lower')
plt.show()
log_fbank.shape: [64, 141]
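The shape can be cross-checked: 64 is the number of mel bins, and, assuming centered framing where the frame count is 1 + floor(L / hop_length), 141 frames at a hop of 320 correspond to roughly 44,800 samples, about 1.4 s at the 32 kHz rate the clip was loaded at. A minimal sketch of that arithmetic:

```python
hop_length = 320
sr = 32000  # sample rate used when loading the clip above

def frames_centered(n_samples, hop_length):
    # frame count for a centered STFT: 1 + floor(L / hop)
    return 1 + n_samples // hop_length

n_samples = 140 * hop_length  # smallest length that yields 141 frames
print(frames_centered(n_samples, hop_length), n_samples / sr)  # 141 1.4
```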
!unzip /home/aistudio/data/data310325/Arcane_3class.zip -d ~  # unpack the dataset
!tree -L 1 Arcane_3class/*/  # the dataset has a train split and a test split; train holds audio for the three main characters 杰斯, 狼母 and 金克斯
Arcane_3class/test/
├── 杰斯_audio60.wav
├── 杰斯_audio62.wav
├── 杰斯_audio63.wav
├── 狼母_audio43.wav
├── 狼母_audio44.wav
├── 狼母_audio45.wav
├── 金克斯_audio42.wav
├── 金克斯_audio44.wav
└── 金克斯_audio45.wav
Arcane_3class/train/
├── label.txt
├── 杰斯
├── 狼母
└── 金克斯

3 directories, 10 files
# initialize the LogMelSpectrogram feature extractor
import paddle
from paddleaudio.features import LogMelSpectrogram
n_fft = 1024
win_length = 1024
hop_length = 320
sr = 16000
f_min = 50.0
f_max = 14000.0

# - sr: sample rate of the audio file.
# - n_fft: number of FFT points.
# - hop_length: hop between adjacent frames.
# - win_length: window length.
# - window: window type.
# - n_mels: number of mel bins.
feature_extractor = LogMelSpectrogram(
sr=sr,
n_fft=n_fft,
hop_length=hop_length,
win_length=win_length,
window='hann',
f_min=f_min,
f_max=f_max,
n_mels=40)
# generate the classification label file label.txt
import os
import glob

label_list = ["杰斯", "金克斯", "狼母"]
with open("/home/aistudio/Arcane_3class/train/label.txt", "w") as f:
    audio_list = glob.glob("/home/aistudio/Arcane_3class/train/*/*.wav")
    for audio in audio_list:
        audio_name = os.path.basename(audio)
        label_name = audio_name.split("_")[0]   # character name before the first "_"
        label = label_list.index(label_name)
        print("audio:", audio)
        print("label:", label)
        f.write(f"{audio}\t{label}\n")
audio: /home/aistudio/Arcane_3class/train/杰斯/杰斯_audio14.wav label: 0
audio: /home/aistudio/Arcane_3class/train/杰斯/杰斯_audio18.wav label: 0
...
audio: /home/aistudio/Arcane_3class/train/狼母/狼母_audio15.wav label: 2
audio: /home/aistudio/Arcane_3class/train/狼母/狼母_audio33.wav label: 2
...
audio: /home/aistudio/Arcane_3class/train/金克斯/金克斯_audio28.wav label: 1
audio: /home/aistudio/Arcane_3class/train/金克斯/金克斯_audio38.wav label: 1
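The filename-to-label mapping used above can be checked in isolation with plain Python (no Paddle needed); `label_for` is a hypothetical helper wrapping the same `split("_")` logic:

```python
import os

label_list = ["杰斯", "金克斯", "狼母"]

def label_for(path):
    # class index = position of the character name before the first "_"
    return label_list.index(os.path.basename(path).split("_")[0])

print(label_for("/home/aistudio/Arcane_3class/train/杰斯/杰斯_audio14.wav"))   # 0
print(label_for("/home/aistudio/Arcane_3class/train/狼母/狼母_audio15.wav"))   # 2
```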
# build the audio data loader on top of the Paddle API
import paddle
from paddle.io import Dataset
import numpy as np
from paddleaudio import load

class CustomDataset(Dataset):
    def __init__(self, data, seq_len):
        super(CustomDataset, self).__init__()
        self.all_audio_list = data
        self.seq_len = seq_len

    def __getitem__(self, index):
        audio_path = self.all_audio_list[index][0]
        label = int(self.all_audio_list[index][1])
        data, sr = load(file=audio_path, mono=True, dtype='float32')  # mono, float32 samples
        x = paddle.to_tensor(data).unsqueeze(0)   # [B, L]
        log_fbank = feature_extractor(x)          # [B, D, T]
        log_fbank = log_fbank.squeeze(0)          # [D, T]
        # zero-pad or truncate along the time axis so every sample has seq_len frames
        if log_fbank.shape[1] < self.seq_len:
            pad = paddle.zeros(
                [log_fbank.shape[0], self.seq_len - log_fbank.shape[1]],
                dtype=log_fbank.dtype)
            log_fbank = paddle.concat([log_fbank, pad], axis=1)
        else:
            log_fbank = log_fbank[:, :self.seq_len]
        return log_fbank.transpose([1, 0]), label  # [T, D] for the LSTM, plus the class label

    def __len__(self):
        return len(self.all_audio_list)
all_audio_list = []
with open("/home/aistudio/Arcane_3class/train/label.txt", "r") as f:
    for line in f:
        all_audio_list.append(line.split())
# define the model (an LSTM followed by fully connected layers)
import paddle.nn as nn

class MyLSTMModel2(nn.Layer):
    def __init__(self):
        super(MyLSTMModel2, self).__init__()
        self.rnn = paddle.nn.LSTM(input_size=40, hidden_size=128, num_layers=1)
        self.bn = paddle.nn.BatchNorm1D(128)
        self.fc = nn.Sequential(
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, input):
        # forward defines how the network runs at execution time
        out, (h, c) = self.rnn(input)    # h: [num_layers, B, hidden]
        h = self.bn(h.squeeze(axis=0))   # [B, hidden]
        h = self.fc(h)                   # [B, 3]
        return h

# instantiate the model
model = MyLSTMModel2()
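As a rough size check of the model above, the standard LSTM parameter formula can be worked out by hand (a sketch assuming, as in Paddle's LSTM, separate input-hidden and hidden-hidden weights and biases for each of the four gates):

```python
input_size, hidden_size, num_classes = 40, 128, 3

# 4 gates, each with input->hidden and hidden->hidden weights plus two biases
lstm_params = 4 * (input_size * hidden_size      # W_ih
                   + hidden_size * hidden_size   # W_hh
                   + 2 * hidden_size)            # b_ih + b_hh
fc_params = (hidden_size * 64 + 64) + (64 * num_classes + num_classes)
print(lstm_params, fc_params)  # 87040 8451
```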
import paddle
import paddle.nn as nn
from paddle.optimizer import Adam
from paddle.io import DataLoader

# hyperparameters
batch_size = 8
learning_rate = 0.001
epochs = 200

# instantiate the custom dataset
dataset = CustomDataset(data=all_audio_list, seq_len=64)
# create the data loader
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = Adam(parameters=model.parameters(), learning_rate=learning_rate)

# training loop
for epoch in range(epochs):
    for batch_id, data in enumerate(train_loader()):
        audio, labels = data
        # forward pass
        preds = model(audio)
        loss = loss_fn(preds, labels)
        # backward pass
        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
        if batch_id % 10 == 0:
            print(f"Epoch [{epoch+1}/{epochs}], Step [{batch_id+1}/{len(train_loader)}], Loss: {loss.numpy()}")
    paddle.save(model.state_dict(), "save_model/epoch_{}.pdparams".format(epoch))
    paddle.save(optimizer.state_dict(), "save_model/epoch_{}.pdopt".format(epoch))
Epoch [1/200], Step [1/18], Loss: 0.4725416898727417
Epoch [1/200], Step [11/18], Loss: 0.0440545529127121
Epoch [2/200], Step [1/18], Loss: 0.470591276884079
Epoch [2/200], Step [11/18], Loss: 0.09851406514644623
Epoch [3/200], Step [1/18], Loss: 0.08877842873334885
Epoch [3/200], Step [11/18], Loss: 0.0233285091817379
Epoch [4/200], Step [1/18], Loss: 0.009626179933547974
Epoch [4/200], Step [11/18], Loss: 0.019429411739110947
Epoch [5/200], Step [1/18], Loss: 0.2107754945755005
Epoch [5/200], Step [11/18], Loss: 0.23881441354751587
Epoch [6/200], Step [1/18], Loss: 0.1475854516029358
Epoch [6/200], Step [11/18], Loss: 0.07661047577857971
Epoch [7/200], Step [1/18], Loss: 0.04603996500372887
Epoch [7/200], Step [11/18], Loss: 0.0187629796564579
Epoch [8/200], Step [1/18], Loss: 0.0401885025203228
Epoch [8/200], Step [11/18], Loss: 0.03683056682348251
Epoch [9/200], Step [1/18], Loss: 0.021549057215452194
Epoch [9/200], Step [11/18], Loss: 0.048799484968185425
Epoch [10/200], Step [1/18], Loss: 0.06304647773504257
Epoch [10/200], Step [11/18], Loss: 0.09490914642810822
Epoch [11/200], Step [1/18], Loss: 0.04316396266222
Epoch [11/200], Step [11/18], Loss: 0.023982465267181396
Epoch [12/200], Step [1/18], Loss: 0.03854821241249842644
Epoch [12/200], Step [11/18], Loss: 0.00421815924346447
Epoch [13/200], Step [1/18], Loss: 0.007414224557578564
Epoch [13/200], Step [11/18], Loss: 0.01622013747692108
Epoch [14/200], Step [1/18], Loss: 0.03377651050686836
Epoch [14/200], Step [11/18], Loss: 0.4106796383857727
Epoch [15/200], Step [1/18], Loss: 0.011543096974492073
Epoch [15/200], Step [11/18], Loss: 0.5009737014770508
Epoch [16/200], Step [1/18], Loss: 0.32573240995407104
Epoch [16/200], Step [11/18], Loss: 0.06828346848487854
Epoch [17/200], Step [1/18], Loss: 0.05403272435069084
Epoch [17/200], Step [11/18], Loss: 0.15716595947742462
Epoch [18/200], Step [1/18], Loss: 0.027571331709623337
Epoch [18/200], Step [11/18], Loss: 0.30799034237861633
Epoch [19/200], Step [1/18], Loss: 0.177289217710495
Epoch [19/200], Step [11/18], Loss: 0.09615731239318848
Epoch [20/200], Step [1/18], Loss: 0.09176662564277649
Epoch [20/200], Step [11/18], Loss: 0.048372216522693634
Epoch [21/200], Step [1/18], Loss: 0.021251996979117393
Epoch [21/200], Step [11/18], Loss: 0.023149274289608
Epoch [22/200], Step [1/18], Loss: 0.0194238293915987
Epoch [22/200], Step [11/18], Loss: 0.009129498153924942
Epoch [23/200], Step [1/18], Loss: 0.033666759729385376
Epoch [23/200], Step [11/18], Loss: 0.3238777816295624
Epoch [24/200], Step [1/18], Loss: 0.026807384565472603
Epoch [24/200], Step [11/18], Loss: 0.05678751319646835
Epoch [25/200], Step [1/18], Loss: 0.010233682580292225
Epoch [25/200], Step [11/18], Loss: 0.017964594066143036
Epoch [26/200], Step [1/18], Loss: 0.018377654254436493
Epoch [26/200], Step [11/18], Loss: 0.06426376104354858
Epoch [27/200], Step [1/18], Loss: 0.0289956983178854
Epoch [27/200], Step [11/18], Loss: 0.006790440529584885
Epoch [28/200], Step [1/18], Loss: 0.03905341029167175
Epoch [28/200], Step [11/18], Loss: 0.028408877551555634
Epoch [29/200], Step [1/18], Loss: 0.021818600594997406
Epoch [29/200], Step [11/18], Loss: 0.03449644148349762
Epoch [30/200], Step [1/18], Loss: 0.008687200956046581
Epoch [30/200], Step [11/18], Loss: 0.010629050433635712
Epoch [31/200], Step [1/18], Loss: 0.11608363687992096
Epoch [31/200], Step [11/18], Loss: 0.4328106641769409
Epoch [32/200], Step [1/18], Loss: 0.05060550570487976
Epoch [32/200], Step [11/18], Loss: 0.039930470287799835
Epoch [33/200], Step [1/18], Loss: 0.008874151855707169
Epoch [33/200], Step [11/18], Loss: 0.08507397025823593
Epoch [34/200], Step [1/18], Loss: 0.0058389026671648026
Epoch [34/200], Step [11/18], Loss: 0.0046081929467618465
Epoch [35/200], Step [1/18], Loss: 0.12371867895126343
Epoch [35/200], Step [11/18], Loss: 0.038248419761657715
Epoch [36/200], Step [1/18], Loss: 0.16413459181785583
Epoch [36/200], Step [11/18], Loss: 0.032994143664836884
Epoch [37/200], Step [1/18], Loss: 0.03072711080312729
Epoch [37/200], Step [11/18], Loss: 0.03439655900001526
Epoch [38/200], Step [1/18], Loss: 0.006345064379274845
Epoch [38/200], Step [11/18], Loss: 0.012138884514570236
Epoch [39/200], Step [1/18], Loss: 0.011927744373679161
Epoch [39/200], Step [11/18], Loss: 0.008709677495062351
Epoch [40/200], Step [1/18], Loss: 0.029146356508135796
Epoch [40/200], Step [11/18], Loss: 0.11137421429157257
Epoch [41/200], Step [1/18], Loss: 0.031168609857559204
Epoch [41/200], Step [11/18], Loss: 0.012156662531197071
Epoch [42/200], Step [1/18], Loss: 0.019586797803640366
Epoch [42/200], Step [11/18], Loss: 0.0041570719331502914
Epoch [43/200], Step [1/18], Loss: 0.005333346780389547
Epoch [43/200], Step [11/18], Loss: 0.07366456836462025
Epoch [44/200], Step [1/18], Loss: 0.02151975780725479
Epoch [44/200], Step [11/18], Loss: 0.10591024160385132
Epoch [45/200], Step [1/18], Loss: 0.008960297331213951
Epoch [45/200], Step [11/18], Loss: 0.028149280697107315
Epoch [46/200], Step [1/18], Loss: 0.028821412412473917007
Epoch [46/200], Step [11/18], Loss: 0.017873506993055344
Epoch [47/200], Step [1/18], Loss: 0.01558565441519022
Epoch [47/200], Step [11/18], Loss: 0.45754313468933105
Epoch [48/200], Step [1/18], Loss: 0.03192921727895737
Epoch [48/200], Step [11/18], Loss: 0.5914071202578137
Epoch [49/200], Step [1/18], Loss: 0.38782456517219543
Epoch [49/200], Step [11/18], Loss: 0.0011864150874316692
Epoch [50/200], Step [1/18], Loss: 0.055461883544921875
Epoch [50/200], Step [11/18], Loss: 0.10176341235637665
Epoch [51/200], Step [1/18], Loss: 0.015766264870762825
Epoch [51/200], Step [11/18], Loss: 0.003621053881943226
Epoch [52/200], Step [1/18], Loss: 0.018626023083925247
Epoch [52/200], Step [11/18], Loss: 0.004425644408911467
Epoch [53/200], Step [1/18], Loss: 0.007729521952569485
Epoch [53/200], Step [11/18], Loss: 0.0073486268520355225
Epoch [54/200], Step [1/18], Loss: 0.01855909451842308
Epoch [54/200], Step [11/18], Loss: 0.003889220766723156
Epoch [55/200], Step [1/18], Loss: 0.005263946484774351
Epoch [55/200], Step [11/18], Loss: 0.1581307053565979
Epoch [56/200], Step [1/18], Loss: 0.021987076848745346
Epoch [56/200], Step [11/18], Loss: 0.05597549304366112
Epoch [57/200], Step [1/18], Loss: 0.380911260843277
Epoch [57/200], Step [11/18], Loss: 0.03639882430434227
Epoch [58/200], Step [1/18], Loss: 0.009338815696537495
Epoch [58/200], Step [11/18], Loss: 0.0846019983291626
Epoch [59/200], Step [1/18], Loss: 0.026767950505018234
Epoch [59/200], Step [11/18], Loss: 0.11035145819187164
Epoch [60/200], Step [1/18], Loss: 0.027456166222691536
Epoch [60/200], Step [11/18], Loss: 0.014665378257632256
Epoch [61/200], Step [1/18], Loss: 0.0067307911813259125
Epoch [61/200], Step [11/18], Loss: 0.014259519055485725
Epoch [62/200], Step [1/18], Loss: 0.0066595980897545815
Epoch [62/200], Step [11/18], Loss: 0.012577966786921024
Epoch [63/200], Step [1/18], Loss: 0.008954830467700958
Epoch [63/200], Step [11/18], Loss: 0.025002148002386093
Epoch [64/200], Step [1/18], Loss: 0.4726946949958801
Epoch [64/200], Step [11/18], Loss: 0.7237932682037354
Epoch [65/200], Step [1/18], Loss: 0.08082826435565948
Epoch [65/200], Step [11/18], Loss: 0.06651557981967926
Epoch [66/200], Step [1/18], Loss: 0.011343597434461117
Epoch [66/200], Step [11/18], Loss: 0.017044804990291595
Epoch [67/200], Step [1/18], Loss: 0.024660665541887283
Epoch [67/200], Step [11/18], Loss: 0.008989336900413036
Epoch [68/200], Step [1/18], Loss: 0.014209344983100891
Epoch [68/200], Step [11/18], Loss: 1.0901525020599365
Epoch [69/200], Step [1/18], Loss: 0.019023453816771507
Epoch [69/200], Step [11/18], Loss: 0.012965338304638863
Epoch [70/200], Step [1/18], Loss: 0.03433838486671448
Epoch [70/200], Step [11/18], Loss: 0.016142964363098145
Epoch [71/200], Step [1/18], Loss: 0.0070319571532309055
Epoch [71/200], Step [11/18], Loss: 0.0030483559239655733
Epoch [72/200], Step [1/18], Loss: 0.055289510637521744
Epoch [72/200], Step [11/18], Loss: 0.017122216522693634
Epoch [73/200], Step [1/18], Loss: 0.06607518345117569
Epoch [73/200], Step [11/18], Loss: 0.016266677528619766
Epoch [74/200], Step [1/18], Loss: 0.08082333207130432
Epoch [74/200], Step [11/18], Loss: 0.0279204361140728
Epoch [75/200], Step [1/18], Loss: 0.004301637876778841
Epoch [75/200], Step [11/18], Loss: 0.021140452474355698
Epoch [76/200], Step [1/18], Loss: 0.013559591956436634
Epoch [76/200], Step [11/18], Loss: 0.01826651394367218
Epoch [77/200], Step [1/18], Loss: 0.0061076125130057335
Epoch [77/200], Step [11/18], Loss: 0.002152765402570367
Epoch [78/200], Step [1/18], Loss: 0.007121165283024311
Epoch [78/200], Step [11/18], Loss: 0.009206684306263924
Epoch [79/200], Step [1/18], Loss: 0.04074825346469879
Epoch [79/200], Step [11/18], Loss: 0.0065627931617200375
Epoch [80/200], Step [1/18], Loss: 0.01308474037796259
Epoch [80/200], Step [11/18], Loss: 0.10330627858638763
Epoch [81/200], Step [1/18], Loss: 0.013666542246937752
Epoch [81/200], Step [11/18], Loss: 0.03404892981052399
Epoch [82/200], Step [1/18], Loss: 0.00352277560159564
Epoch [82/200], Step [11/18], Loss: 0.02659265138208866
Epoch [83/200], Step [1/18], Loss: 0.003314829431474209
Epoch [83/200], Step [11/18], Loss: 0.03542104363441467
Epoch [84/200], Step [1/18], Loss: 0.1119469627737999
Epoch [84/200], Step [11/18], Loss: 0.01624973490834236
Epoch [85/200], Step [1/18], Loss: 0.013936529867351055
Epoch [85/200], Step [11/18], Loss: 0.001210562651976943
Epoch [86/200], Step [1/18], Loss: 0.004351664334535599
Epoch [86/200], Step [11/18], Loss: 0.016089780256152153
Epoch [87/200], Step [1/18], Loss: 0.0008625991176813841
Epoch [87/200], Step [11/18], Loss: 0.0026758601889014244
Epoch [88/200], Step [1/18], Loss: 0.04442567750811577
Epoch [88/200], Step [11/18], Loss: 0.015446970239281654
Epoch [89/200], Step [1/18], Loss: 0.0005034751957282424
Epoch [89/200], Step [11/18], Loss: 0.005449206568300724
Epoch [90/200], Step [1/18], Loss: 0.2731953561306
Epoch [90/200], Step [11/18], Loss: 0.0018118073930963874
Epoch [91/200], Step [1/18], Loss: 0.0004698720294982195
Epoch [91/200], Step [11/18], Loss: 0.0032033436000347137
Epoch [92/200], Step [1/18], Loss: 0.019340932369232178
Epoch [92/200], Step [11/18], Loss: 0.007545084692537785
Epoch [93/200], Step [1/18], Loss: 0.004510772414505482
Epoch [93/200], Step [11/18], Loss: 0.02083991840481758
Epoch [94/200], Step [1/18], Loss: 0.002590321935713291
Epoch [94/200], Step [11/18], Loss: 0.03413398563861847
Epoch [95/200], Step [1/18], Loss: 0.020762939006090164
Epoch [95/200], Step [11/18], Loss: 0.013762586750090122
Epoch [96/200], Step [1/18], Loss: 0.006758917588740587
Epoch [96/200], Step [11/18], Loss: 0.015020255441625118
Epoch [97/200], Step [1/18], Loss: 0.026167867705225945
Epoch [97/200], Step [11/18], Loss: 0.0027395961806178093
Epoch [98/200], Step [1/18], Loss: 0.0034174644388258457
Epoch [98/200], Step [11/18], Loss: 0.024442892521619797
Epoch [99/200], Step [1/18], Loss: 0.02677156589925289
Epoch [99/200], Step [11/18], Loss: 0.004590878263115883
Epoch [100/200], Step [1/18], Loss: 0.004817421548068523
Epoch [100/200], Step [11/18], Loss: 0.003802280407398939
Epoch [101/200], Step [1/18], Loss: 0.029921934008598328
Epoch [101/200], Step [11/18], Loss: 0.00753698218613863
Epoch [102/200], Step [1/18], Loss: 0.0037281459663063288
Epoch [102/200], Step [11/18], Loss: 0.014354806393384933
Epoch [103/200], Step [1/18], Loss: 0.3014940023422241
Epoch [103/200], Step [11/18], Loss: 0.010186146013438702
Epoch [104/200], Step [1/18], Loss: 0.009418152272701263
Epoch [104/200], Step [11/18], Loss: 0.0036066388711333275
Epoch [105/200], Step [1/18], Loss: 0.010349811986088753
Epoch [105/200], Step [11/18], Loss: 0.008489744737744331
Epoch [106/200], Step [1/18], Loss: 0.013178281486034393
Epoch [106/200], Step [11/18], Loss: 0.016042709350585938
Epoch [107/200], Step [1/18], Loss: 0.31146734952926636
Epoch [107/200], Step [11/18], Loss: 0.0026772483251988888
Epoch [108/200], Step [1/18], Loss: 0.003950158599764109
Epoch [108/200], Step [11/18], Loss: 0.08102487772703171
Epoch [109/200], Step [1/18], Loss: 0.003397746477276087
Epoch [109/200], Step [11/18], Loss: 0.02333853393793106
Epoch [110/200], Step [1/18], Loss: 0.0010620998218655586
Epoch [110/200], Step [11/18], Loss: 0.0037958319298923016
Epoch [111/200], Step [1/18], Loss: 0.001602979376912117
Epoch [111/200], Step [11/18], Loss: 0.5138437151908875
Epoch [112/200], Step [1/18], Loss: 0.009153843857347965
Epoch [112/200], Step [11/18], Loss: 0.05277949944138527
Epoch [113/200], Step [1/18], Loss: 0.008996937423944473
Epoch [113/200], Step [11/18], Loss: 0.026577509939670563
Epoch [114/200], Step [1/18], Loss: 0.07632716000080109
Epoch [114/200], Step [11/18], Loss: 0.6511035561561584
Epoch [115/200], Step [1/18], Loss: 1.1457531452178955
Epoch [115/200], Step [11/18], Loss: 0.6809557676315308
Epoch [116/200], Step [1/18], Loss: 0.2524426281452179
Epoch [116/200], Step [11/18], Loss: 0.428086519241333
Epoch [117/200], Step [1/18], Loss: 0.6725389957427979
Epoch [117/200], Step [11/18], Loss: 0.09876668453216553
Epoch [118/200], Step [1/18], Loss: 1.044818639755249
Epoch [118/200], Step [11/18], Loss: 0.32220691442489624
Epoch [119/200], Step [1/18], Loss: 0.23680479824543
Epoch [119/200], Step [11/18], Loss: 0.4861851930618286
Epoch [120/200], Step [1/18], Loss: 0.30875518918037415
Epoch [120/200], Step [11/18], Loss: 0.6225939989089966
Epoch [121/200], Step
[1/18], Loss: 0.049871932715177536
Epoch [121/200], Step [11/18], Loss: 0.8413558006286621
Epoch [122/200], Step [1/18], Loss: 0.36755722761154175
Epoch [122/200], Step [11/18], Loss: 0.07630021870136261
Epoch [123/200], Step [1/18], Loss: 0.051008716225624084
Epoch [123/200], Step [11/18], Loss: 0.6509355902671814
Epoch [124124/200], Step [1/18], Loss: 0.0252146627753973
Epoch [124124/200], Step [11/18], Loss: 0.5530363917350769
Epoch [125/200], Step [1/18], Loss: 0.05530519038438797
Epoch [125/200], Step [11/18], Loss: 0.25131523609161377
Epoch [126/200], Step [1/18], Loss: 0.01882755197584629
Epoch [126/200], Step [11/18], Loss: 0.29099002480506897
Epoch [127/200], Step [1/18], Loss: 0.28348326683044434
Epoch [127/200], Step [11/18], Loss: 0.08514250814914703
Epoch [128/200], Step [1/18], Loss: 0.02945907786488533
Epoch [128/200], Step [11/18], Loss: 0.07844363152980804
Epoch [129/200], Step [1/18], Loss: 0.01358343381434679
Epoch [129/200], Step [11/18], Loss: 0.27571654319763184
Epoch [130/200], Step [1/18], Loss: 0.10371901839971542
Epoch [130/200], Step [11/18], Loss: 0.03517628461122513
Epoch [131/200], Step [1/18], Loss: 0.00815553218126297
Epoch [131/200], Step [11/18], Loss: 0.09514375776052475
Epoch [132/200], Step [1/18], Loss: 0.45774367451667786
Epoch [132/200], Step [11/18], Loss: 0.3840058743953705
Epoch [133/200], Step [1/18], Loss: 0.06872424483299255
Epoch [133/200], Step [11/18], Loss: 0.03554276004433632
Epoch [134/200], Step [1/18], Loss: 0.3256757855415344
Epoch [134/200], Step [11/18], Loss: 0.04108048602938652
Epoch [135/200], Step [1/18], Loss: 0.013568353839218616
Epoch [135/200], Step [11/18], Loss: 0.03537318855524063
Epoch [136/200], Step [1/18], Loss: 0.08078506588935852
Epoch [136/200], Step [11/18], Loss: 0.16535770893096924
Epoch [137/200], Step [1/18], Loss: 0.07062816619873047
Epoch [137/200], Step [11/18], Loss: 0.23596513271331787
Epoch [138/200], Step [1/18], Loss: 0.017027437686920166
Epoch [138/200], Step [11/18], Loss: 0.00647686468437314
Epoch [139/200], Step [1/18], Loss: 0.013125029392540455
Epoch [139/200], Step [11/18], Loss: 0.27549853920936584
Epoch [140/200], Step [1/18], Loss: 0.007153613492846489
Epoch [140/200], Step [11/18], Loss: 0.017620528116822243
Epoch [141/200], Step [1/18], Loss: 0.0321669727563858
Epoch [141/200], Step [11/18], Loss: 0.028842061758041382
Epoch [142/200], Step [1/18], Loss: 0.01732991263270378
Epoch [142/200], Step [11/18], Loss: 0.08353880792856216
Epoch [143/200], Step [1/18], Loss: 0.01723271794617176
Epoch [143/200], Step [11/18], Loss: 0.019574100151658058
Epoch [144/200], Step [1/18], Loss: 0.03397369384765625
Epoch [144/200], Step [11/18], Loss: 0.10844092816114426
Epoch [145/200], Step [1/18], Loss: 0.3786786198616028
Epoch [145/200], Step [11/18], Loss: 0.1694055199623108
Epoch [146/200], Step [1/18], Loss: 0.11119166761636734
Epoch [146/200], Step [11/18], Loss: 0.17424573004245758
Epoch [147/200], Step [1/18], Loss: 0.15077194571495056
Epoch [147/200], Step [11/18], Loss: 0.5065086483955383
Epoch [148/200], Step [1/18], Loss: 0.1338863968849182
Epoch [148/200], Step [11/18], Loss: 0.41857266426086426
Epoch [149/200], Step [1/18], Loss: 0.14975376427173615
Epoch [149/200], Step [11/18], Loss: 0.1162782609462738
Epoch [150/200], Step [1/18], Loss: 0.3046249747276306
Epoch [150/200], Step [11/18], Loss: 0.2820568382740021
Epoch [151/200], Step [1/18], Loss: 0.1767234355211258
Epoch [151/200], Step [11/18], Loss: 0.5894790291786194
Epoch [152/200], Step [1/18], Loss: 0.0710759088397026
Epoch [152/200], Step [11/18], Loss: 0.2845103144645691
Epoch [153/200], Step [1/18], Loss: 0.007126178592443466
Epoch [153/200], Step [11/18], Loss: 0.1113148108124733
Epoch [154/200], Step [1/18], Loss: 0.04131874442100525
Epoch [154/200], Step [11/18], Loss: 0.06208159029483795
Epoch [155/200], Step [1/18], Loss: 0.11961569637060165
Epoch [155/200], Step [11/18], Loss: 0.08468692749738693
Epoch [156/200], Step [1/18], Loss: 0.21016210317611694
Epoch [156/200], Step [11/18], Loss: 0.020117301493883133
Epoch [157/200], Step [1/18], Loss: 0.31543296575546265
Epoch [157/200], Step [11/18], Loss: 0.03285551816225052
Epoch [158/200], Step [1/18], Loss: 0.025582782924175262
Epoch [158/200], Step [11/18], Loss: 0.22900016605854034
Epoch [159/200], Step [1/18], Loss: 0.3325921893119812
Epoch [159/200], Step [11/18], Loss: 0.8100109100341797
Epoch [160/200], Step [1/18], Loss: 0.006363577675074339
Epoch [160/200], Step [11/18], Loss: 0.022655433043837547
Epoch [161/200], Step [1/18], Loss: 0.094673752784729
Epoch [161/200], Step [11/18], Loss: 0.09117478132247925
Epoch [162/200], Step [1/18], Loss: 0.06463250517845154
Epoch [162/200], Step [11/18], Loss: 0.047544147819280624
Epoch [163/200], Step [1/18], Loss: 0.03960324078798294
Epoch [163/200], Step [11/18], Loss: 0.009391479194164276
Epoch [164/200], Step [1/18], Loss: 0.08041112124919891
Epoch [164/200], Step [11/18], Loss: 0.017049631103873253
Epoch [165/200], Step [1/18], Loss: 0.013496411964297295
Epoch [165/200], Step [11/18], Loss: 0.02232395112514496
Epoch [166/200], Step [1/18], Loss: 0.04993608221411705
Epoch [166/200], Step [11/18], Loss: 0.5434579849243164
Epoch [167/200], Step [1/18], Loss: 0.06688367575407028
Epoch [167/200], Step [11/18], Loss: 0.0397261306643486
Epoch [168/200], Step [1/18], Loss: 0.00531834876164794
Epoch [168/200], Step [11/18], Loss: 0.017009573057293892
Epoch [169/200], Step [1/18], Loss: 0.014699848368763924
Epoch [169/200], Step [11/18], Loss: 0.12913461029529572
Epoch [170/200], Step [1/18], Loss: 0.04674993082880974
Epoch [170/200], Step [11/18], Loss: 0.008987809531390667
Epoch [171/200], Step [1/18], Loss: 0.3470563292503357
Epoch [171/200], Step [11/18], Loss: 0.014212577603757381
Epoch [172/200], Step [1/18], Loss: 0.014295908622443676
Epoch [172/200], Step [11/18], Loss: 0.01740555465221405
Epoch [173/200], Step [1/18], Loss: 0.05029941722750664
Epoch [173/200], Step [11/18], Loss: 0.053891196846961975
Epoch [174/200], Step [1/18], Loss: 0.08363781869411469
Epoch [174/200], Step [11/18], Loss: 0.0013446426019072533
Epoch [175/200], Step [1/18], Loss: 0.05936658754944801
Epoch [175/200], Step [11/18], Loss: 0.7805588245391846
Epoch [176/200], Step [1/18], Loss: 0.029463572427630424
Epoch [176/200], Step [11/18], Loss: 0.33329683542251587
Epoch [177/200], Step [1/18], Loss: 0.002200994174927473
Epoch [177/200], Step [11/18], Loss: 0.09669584780931473
Epoch [178/200], Step [1/18], Loss: 0.013506107032299042
Epoch [178/200], Step [11/18], Loss: 0.021688269451260567
Epoch [179/200], Step [1/18], Loss: 0.005644794087857008
Epoch [179/200], Step [11/18], Loss: 0.24360783398151398
Epoch [180/200], Step [1/18], Loss: 0.5215405225753784
Epoch [180/200], Step [11/18], Loss: 0.03189065679907799
Epoch [181/200], Step [1/18], Loss: 0.04039095342159271
Epoch [181/200], Step [11/18], Loss: 0.04796888679265976
Epoch [182/200], Step [1/18], Loss: 0.02029312402009964
Epoch [182/200], Step [11/18], Loss: 0.027300354093313217
Epoch [183/200], Step [1/18], Loss: 0.00514404708519578
Epoch [183/200], Step [11/18], Loss: 0.014119967818260193
Epoch [184/200], Step [1/18], Loss: 0.03561864793300629
Epoch [184/200], Step [11/18], Loss: 0.004448604304343462
Epoch [185/200], Step [1/18], Loss: 0.19735635817050934
Epoch [185/200], Step [11/18], Loss: 0.03777945786714554
Epoch [186/200], Step [1/18], Loss: 0.05664841830730438
Epoch [186/200], Step [11/18], Loss: 0.026479505002498627
Epoch [187/200], Step [1/18], Loss: 0.005472094751894474
Epoch [187/200], Step [11/18], Loss: 0.0316663533449173
Epoch [188/200], Step [1/18], Loss: 0.007353189866989851
Epoch [188/200], Step [11/18], Loss: 0.0011719940230250359
Epoch [189/200], Step [1/18], Loss: 0.007917560636997223
Epoch [189/200], Step [11/18], Loss: 0.0023582070134580135
Epoch [190/200], Step [1/18], Loss: 0.035708069801330566
Epoch [190/200], Step [11/18], Loss: 0.05112633854150772
Epoch [191/200], Step [1/18], Loss: 0.002874162746593356
Epoch [191/200], Step [11/18], Loss: 0.009168487042188644
Epoch [192/200], Step [1/18], Loss: 0.003728349693119526
Epoch [192/200], Step [11/18], Loss: 0.01515199150890112
Epoch [193/200], Step [1/18], Loss: 0.008820852264761925
Epoch [193/200], Step [11/18], Loss: 0.0008103932486847043
Epoch [194/200], Step [1/18], Loss: 0.001101092784665525
Epoch [194/200], Step [11/18], Loss: 0.0012981765903532505
Epoch [195/200], Step [1/18], Loss: 0.004354500211775303
Epoch [195/200], Step [11/18], Loss: 0.018854297697544098
Epoch [196/200], Step [1/18], Loss: 0.6042110323905945
Epoch [196/200], Step [11/18], Loss: 0.0017237844876945019
Epoch [197/200], Step [1/18], Loss: 0.008842434734106064
Epoch [197/200], Step [11/18], Loss: 0.0016365956980735064
Epoch [198/200], Step [1/18], Loss: 0.15027210116386414
Epoch [198/200], Step [11/18], Loss: 0.024806607514619827
Epoch [199/200], Step [1/18], Loss: 0.34135231375694275
Epoch [199/200], Step [11/18], Loss: 0.15495596826076508
Epoch [200/200], Step [1/18], Loss: 0.7455297112464905
Epoch [200/200], Step [11/18], Loss: 0.0043668486177921295
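The raw training log above can be parsed back into `(epoch, step, loss)` records if you want to plot the loss curve afterwards. Below is a minimal sketch; the regex assumes the exact `Epoch [.../...], Step [.../...], Loss: ...` format printed above, and `parse_log` is a hypothetical helper, not part of the tutorial code:

```python
import re

# Matches the "Epoch [e/E], Step [s/S], Loss: v" lines printed during training
LOG_LINE = re.compile(r"Epoch \[(\d+)/\d+\], Step \[(\d+)/\d+\], Loss: ([0-9.]+)")

def parse_log(lines):
    # Return a list of (epoch, step, loss) tuples extracted from log lines;
    # lines that do not match the pattern are skipped.
    records = []
    for line in lines:
        m = LOG_LINE.search(line)
        if m:
            records.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    return records

sample = ["Epoch [200/200], Step [11/18], Loss: 0.0043668486177921295"]
print(parse_log(sample))
```

The resulting list can be fed directly to `matplotlib.pyplot.plot` to visualize how the loss decreases over the 200 epochs.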
In [17]
# Save the model
import paddle
from paddle.static import InputSpec

path = "./export_model/audionet"
paddle.jit.save(
    layer=model,
    path=path,
    input_spec=[InputSpec(shape=[1, 64, 40])])
/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/norm.py:818: UserWarning: When training, we now always track global mean and variance.
warnings.warn(
In [8]
# Load the model
import paddle
import numpy as np

path = "./export_model/audionet"
loaded_model = paddle.jit.load(path)
loaded_model.eval()
In [12]
# Model testing
import glob
import os
import paddle
from paddleaudio import load
from paddleaudio.features import LogMelSpectrogram

seq_len = 64
speaker = ["杰斯", "金克斯", "狼母"]

test_audio_list = glob.glob("/home/aistudio/Arcane_3class/test/*.wav")
for audio in test_audio_list:
    file_name = os.path.basename(audio)
    data, sr = load(file=audio, mono=True, dtype='float32')  # mono channel, float32 samples
    x = paddle.to_tensor(data).unsqueeze(0)  # [B, L]
    # feature_extractor is the LogMelSpectrogram instance created in the training section
    log_fbank = feature_extractor(x)  # [B, D, T]
    log_fbank = log_fbank.squeeze(0)  # [D, T]
    # The rest of this cell was truncated in the source; a plausible reconstruction:
    # truncate (or zero-pad) the time axis to exactly seq_len frames
    if log_fbank.shape[1] >= seq_len:
        log_fbank = log_fbank[:, :seq_len]
    else:
        pad = paddle.zeros([log_fbank.shape[0], seq_len - log_fbank.shape[1]])
        log_fbank = paddle.concat([log_fbank, pad], axis=1)
    feats = log_fbank.transpose([1, 0]).unsqueeze(0)  # [1, T, D] = [1, 64, 40]
    result = paddle.nn.functional.softmax(loaded_model(feats))
    print(result)
    print("{} recognized as {}".format(file_name, speaker[result.argmax().item()]))
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00001624, 0.00006163, 0.99992216]])
狼母_audio44.wav recognized as 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00002558, 0.00004835, 0.99992597]])
狼母_audio45.wav recognized as 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00970634, 0.98993659, 0.00035706]])
金克斯_audio42.wav recognized as 金克斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.03899606, 0.94868642, 0.01231745]])
金克斯_audio45.wav recognized as 金克斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.99920183, 0.00043954, 0.00035864]])
杰斯_audio60.wav recognized as 杰斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00002513, 0.00004669, 0.99992812]])
狼母_audio43.wav recognized as 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00015131, 0.99914122, 0.00070748]])
金克斯_audio44.wav recognized as 金克斯
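Since each test file encodes its ground-truth speaker as the filename prefix before "_" (e.g. 狼母_audio44.wav), the per-file results above can be summarized into an overall accuracy. The helpers below are hypothetical, not part of the tutorial code; the result list is copied from the seven outputs shown above:

```python
import os

def filename_label(path):
    # Ground-truth speaker is the filename prefix before "_",
    # e.g. "狼母_audio44.wav" -> "狼母"
    return os.path.basename(path).split("_")[0]

def accuracy(results):
    # results: list of (audio_path, predicted_label) pairs
    correct = sum(1 for path, pred in results if filename_label(path) == pred)
    return correct / len(results)

# The seven test files shown above were all classified correctly:
results = [
    ("狼母_audio44.wav", "狼母"), ("狼母_audio45.wav", "狼母"),
    ("金克斯_audio42.wav", "金克斯"), ("金克斯_audio45.wav", "金克斯"),
    ("杰斯_audio60.wav", "杰斯"), ("狼母_audio43.wav", "狼母"),
    ("金克斯_aud44.wav".replace("audio4", "audio4"), "金克斯"),
]
results[-1] = ("金克斯_audio44.wav", "金克斯")
print(accuracy(results))  # 1.0
```

All seven predictions match their filename labels, consistent with the tutorial's conclusion that the model performs well on the test set.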