Keras-tuner Hyperband runs only 2 epochs

31

The code below is the same as the Hello-World example on the keras-tuner website, but uses Hyperband instead of RandomSearch.

from tensorflow import keras
from tensorflow.keras import layers

from kerastuner.tuners import RandomSearch, Hyperband
from kerastuner.engine.hypermodel import HyperModel
from kerastuner.engine.hyperparameters import HyperParameters

(x, y), (val_x, val_y) = keras.datasets.mnist.load_data()
x = x.astype('float32') / 255.
val_x = val_x.astype('float32') / 255.

x = x[:10000]
y = y[:10000]

def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model

tuner = Hyperband(
    build_model,
    max_epochs=50,
    objective='val_accuracy',
    seed=20,
    executions_per_trial=1,
    directory='test_dir',
    project_name='daninhas_hyperband'
)    

# tuner.search_space_summary()

tuner.search(x=x,
             y=y,
             epochs=50,
             validation_data=(val_x, val_y))

tuner.results_summary()

Even with max_epochs=50 and epochs=50 set, model training still runs for only 2 epochs.

(...)
2020-06-03 12:55:23.245993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-03 12:55:23.246022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      
Epoch 1/2
313/313 [==============================] - 4s 12ms/step - loss: 2.3247 - accuracy: 0.1109 - val_loss: 2.3025 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 9ms/step - loss: 2.3020 - accuracy: 0.1081 - val_loss: 2.3033 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
 |-Trial ID: 32396974f43cade5b6c3ef511b5548f3
 |-Score: 0.11349999904632568
 |-Best step: 0
 > Hyperparameters:
 |-learning_rate: 0.01
 |-num_layers: 19
 |-tuner/bracket: 3
 |-tuner/epochs: 2
 |-tuner/initial_epoch: 0
 |-tuner/round: 0
 |-units_0: 448
 |-units_1: 384
 |-units_10: 416
 |-units_11: 160
 |-units_12: 384
 |-units_13: 480
 |-units_14: 288
 |-units_15: 64
 |-units_16: 288
 |-units_17: 64
 |-units_18: 32
 |-units_2: 96
 |-units_3: 160
 |-units_4: 480
 |-units_5: 416
 |-units_6: 256
 |-units_7: 32
 |-units_8: 160
 |-units_9: 448
Epoch 1/2
313/313 [==============================] - 4s 11ms/step - loss: 2.3109 - accuracy: 0.1081 - val_loss: 2.3028 - val_accuracy: 0.1135
Epoch 2/2
313/313 [==============================] - 3s 10ms/step - loss: 2.3022 - accuracy: 0.1067 - val_loss: 2.3019 - val_accuracy: 0.1135
[Trial complete]
[Trial summary]
 |-Trial ID: 98376f698826a2068c3412301a7aece4
 |-Score: 0.11349999904632568
 |-Best step: 0
 > Hyperparameters:
 |-learning_rate: 0.01
 |-num_layers: 19
 |-tuner/bracket: 3
 |-tuner/epochs: 2
 |-tuner/initial_epoch: 0
 |-tuner/round: 0
 |-units_0: 480
 |-units_1: 320
 |-units_10: 320
 |-units_11: 64
 |-units_12: 128
 |-units_13: 32
 |-units_14: 416
 |-units_15: 288
 |-units_16: 320
 |-units_17: 480
 |-units_18: 256
 |-units_2: 480
 |-units_3: 320
 |-units_4: 288
 |-units_5: 192
 |-units_6: 224
 |-units_7: 256
 |-units_8: 256
 |-units_9: 352
(...)

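Is there some configuration that forces training to stop at 2 epochs? Could this be a bug? Or am I missing something?

How can I make the models train for more epochs?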


5
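This is how the Hyperband algorithm works. It first samples the hyperparameter space with a limited number of epochs to get a picture of that space, then trains the more promising models for more epochs. Use a small test dataset and let it run for a while to see what it does. - Joe
@Joe Should the surface after 2 epochs bear any relation to the surface after N=50 (or more...) epochs..? - jtlz2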
8
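@jtlz2 Yes, that is a key assumption of Hyperband. If the model being optimized behaves such that results after a few epochs bear no relation to results after many more epochs, it is better to use something like a Bayesian hyperparameter algorithm, which lets each test model run through all of its epochs before making a decision. - Joe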
1
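Otherwise, you can also try changing the default hyperband_iterations=1 parameter to make the results more stable. Alternatively, keras-tuner now has a BayesianOptimization class, but I have never tried it. - gregoruar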
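The two alternatives mentioned in the comments could be set up roughly as follows. This is only a sketch: the helper name make_alternative_tuners, the project names, and the max_trials value are hypothetical, and the import uses the kerastuner package name from the question.

```python
def make_alternative_tuners(build_model):
    """Build the two tuners suggested in the comments (hypothetical helper)."""
    # Imported inside the function so the sketch can be read and loaded
    # without keras-tuner installed. Package name as used in the question.
    from kerastuner.tuners import BayesianOptimization, Hyperband

    # Option 1: repeat the whole Hyperband schedule more than once.
    hyperband = Hyperband(
        build_model,
        objective='val_accuracy',
        max_epochs=50,
        hyperband_iterations=2,       # default is 1
        seed=20,
        directory='test_dir',
        project_name='hyperband_x2',  # hypothetical project name
    )

    # Option 2: Bayesian optimization, where each trial trains for the
    # number of epochs passed to tuner.search().
    bayesian = BayesianOptimization(
        build_model,
        objective='val_accuracy',
        max_trials=20,                # hypothetical trial budget
        seed=20,
        directory='test_dir',
        project_name='bayesian',      # hypothetical project name
    )
    return hyperband, bayesian
```

Either tuner is then searched the same way as in the question, for example bayesian.search(x=x, y=y, epochs=50, validation_data=(val_x, val_y)).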
1 Answer

0

You can change the factor parameter to alter this. It defaults to 3, but you can increase it to get more than 2 epochs per trial.
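To see what factor changes, the epoch budget per round can be tabulated. This is a minimal sketch, not keras-tuner's own code: the function name hyperband_schedule, the bracket-counting loop, and the formula ceil(max_epochs / factor ** (bracket - round)) are assumptions. They are chosen because they reproduce the log in the question (tuner/bracket: 3, tuner/round: 0, tuner/epochs: 2 with max_epochs=50 and the default factor).

```python
import math


def hyperband_schedule(max_epochs, factor=3, min_epochs=1):
    """Return {bracket: [epochs for round 0, 1, ...]} under the assumed schedule."""
    # Assumed bracket count: divide the budget by `factor` until it
    # drops below `min_epochs`.
    num_brackets = 0
    budget = max_epochs
    while budget >= min_epochs:
        budget /= factor
        num_brackets += 1

    schedule = {}
    for bracket in range(num_brackets):
        # Assumed: bracket b has b + 1 rounds, and round r trains for
        # ceil(max_epochs / factor ** (b - r)) epochs.
        schedule[bracket] = [
            math.ceil(max_epochs / factor ** (bracket - rnd))
            for rnd in range(bracket + 1)
        ]
    return schedule


for factor in (3, 4, 10):
    sched = hyperband_schedule(50, factor)
    largest = max(sched)
    print(f"factor={factor}: bracket {largest} rounds -> {sched[largest]}")
```

Under these assumptions, factor=3 gives rounds of 2, 6, 17 and 50 epochs in bracket 3, so the 2-epoch trials in the log are only the first round. A larger factor changes the first-round budget (4 epochs for factor=4, 5 epochs for factor=10), though not always upward for every value.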

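See: docs

The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to quickly converge on a high-performing model. This is done using a sports championship style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing half of models to the next round. Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs), where log_factor is the logarithm to base factor, and rounding it up to the nearest integer.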
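The bracket idea in the quoted passage can be illustrated with a toy in plain Python. Everything here is an assumption made for illustration: run_bracket and toy_score are hypothetical names, the "models" are just random learning rates, and the score is a made-up function rather than real training. The quoted text speaks of keeping the top half; this sketch keeps the top 1/factor of the models each round, which is the same thing only when factor=2.

```python
import math
import random


def toy_score(lr, epochs):
    """Made-up validation score: best near lr=1e-3, improves with more epochs."""
    return 1 - abs(math.log10(lr) + 3) / 2 - 1 / (epochs + 1)


def run_bracket(num_models, min_epochs, max_epochs, factor, score_fn, seed=0):
    """Toy tournament: score all models briefly, keep the best, train longer."""
    rng = random.Random(seed)
    # Each "model" is represented only by a sampled learning rate.
    models = [10 ** rng.uniform(-4, -2) for _ in range(num_models)]
    epochs = min_epochs
    history = []  # (epochs, number of models trained) per round
    while True:
        ranked = sorted(models, key=lambda m: score_fn(m, epochs), reverse=True)
        history.append((epochs, len(ranked)))
        if epochs >= max_epochs or len(ranked) == 1:
            return ranked[0], history
        # Advance only the top 1/factor of the models (at least one).
        models = ranked[:max(1, len(ranked) // factor)]
        epochs = min(max_epochs, epochs * factor)


best_lr, history = run_bracket(27, 2, 50, 3, toy_score)
for epochs, count in history:
    print(f"{count:2d} models trained for {epochs:2d} epochs")
```

With 27 starting models and factor=3, the toy trains 27 models for 2 epochs, then 9 for 6, 3 for 18, and finally 1 for 50. Most trials are short by design, which is why the first trials in the question stop at 2 epochs.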

