Regularization strategies in Keras


I have been trying to set up a nonlinear regression problem in Keras. Unfortunately, the results show overfitting. Here is the code:

# Assumed imports for this snippet (standalone Keras API, matching the lr= argument used below)
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers, regularizers

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)

The results without regularization are shown here: Without regularization. The training mean absolute error is much smaller than the validation one, and there is a persistent gap between the two, which is a sign of overfitting.

L2 regularization was then specified for each layer, as shown below:

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(outdim, activation='linear'))
Adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=Adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)

Those results are shown here: With L2 regularization. The test MAE is close to the training MAE, which is good. However, the training MAE is now poor at 0.03 (without regularization it was much lower, at 0.0028).

What can I do to reduce the training MAE while still using regularization?

1 Answer


Based on your results, it looks like you need to find the right amount of regularization to balance training accuracy against good generalization to the test set. This may be as simple as reducing the L2 parameter. Try lowering lambda from 0.001 to 0.0001 and comparing the results.
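As a minimal sketch of that change, reusing the layer sizes and variable names from your own code (only the L2 factor differs):

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.0001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.0001)))
# ... the remaining hidden layers follow the same pattern ...
model.add(Dense(outdim, activation='linear'))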

If you can't find a good parameter setting for L2, you could try dropout regularization instead. Just add model.add(Dropout(0.2)) between each pair of Dense layers, and experiment with the dropout rate if necessary. A higher dropout rate corresponds to stronger regularization. A sketch of this variant follows.
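A minimal sketch of the dropout variant, assuming the same layer sizes and variable names as in your code (the 0.2 rate is just a starting point):

from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu'))
model.add(Dropout(0.2))  # randomly drop 20% of this layer's activations during training
model.add(Dense(int(number_of_neurons), activation='relu'))
model.add(Dropout(0.2))
# ... repeat Dense + Dropout for the remaining hidden layers ...
model.add(Dense(outdim, activation='linear'))
model.compile(loss='mean_squared_error', optimizer=optimizers.Adam(lr=0.001), metrics=['mae'])

Note that Keras only applies dropout during training; it is disabled automatically at evaluation and prediction time, so the validation MAE is computed on the full network.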

