I need to save a Keras model and load it in Java, where I want to use DL4J. The problem is that when I save my model, the Embedding layer does not keep its weights. I hit the same problem when reloading the model in Keras, but in that case I can work around it by recreating the same architecture and loading only the weights.
Specifically, I start with this architecture:
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 300, 300) 219184200
_________________________________________________________________
lstm_1 (LSTM) (None, 300, 256) 570368
_________________________________________________________________
dropout_1 (Dropout) (None, 300, 256) 0
_________________________________________________________________
lstm_2 (LSTM) (None, 128) 197120
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 2) 258
=================================================================
After saving and loading, I get the following instead (in both Keras and DL4J):
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, None, 300) 219184200
_________________________________________________________________
lstm_1 (LSTM) (None, None, 256) 570368
_________________________________________________________________
dropout_1 (Dropout) (None, None, 256) 0
_________________________________________________________________
lstm_2 (LSTM) (None, 128) 197120
_________________________________________________________________
dropout_2 (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 2) 258
=================================================================
Is there a solution, or a way to achieve the following in Java?
1) Is it possible to correctly save and load both the structure and the weights in Keras?
2) Is it possible to implement a function that converts words to embeddings, and feed the already-embedded input to the neural network?
3) Can I load the weights of the Embedding layer in Java with DL4J?
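For the first question, a minimal sketch of one common workaround, assuming TensorFlow's Keras: serialize the architecture to JSON and the weights to HDF5 separately, then rebuild. The tiny model and the file names here are illustrative stand-ins, not the real network:

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model, model_from_json

def build_tiny_model(vocab=50, emb_dim=8, max_len=10):
    # Fixing the input length in Input(shape=...) keeps the
    # (None, max_len, emb_dim) shape from the original summary.
    inp = Input(shape=(max_len,), dtype='int32')
    x = Embedding(vocab, emb_dim)(inp)
    x = LSTM(4)(x)
    out = Dense(2, activation='softmax')(x)
    return Model(inp, out)

def save_split(model, json_path, weights_path):
    # Architecture as JSON, weights separately as HDF5.
    with open(json_path, 'w') as f:
        f.write(model.to_json())
    model.save_weights(weights_path)

def load_split(json_path, weights_path):
    # Rebuild the graph from JSON, then restore the weights,
    # Embedding weights included.
    with open(json_path) as f:
        model = model_from_json(f.read())
    model.load_weights(weights_path)
    return model
```

After a round trip through `save_split`/`load_split`, the restored model should produce the same predictions as the original, which is a quick way to check that the Embedding weights actually survived serialization.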
Here is the code for my network:
import numpy as np
from keras.layers import Input, LSTM, Dropout, Dense
from keras.models import Model, Sequential

sentence_indices = Input(shape=input_shape, dtype=np.int32)
emb_dim = 300  # 300-dimensional embeddings for Italian words
embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index, emb_dim)
embeddings = embedding_layer(sentence_indices)
X = LSTM(256, return_sequences=True)(embeddings)
X = Dropout(0.15)(X)
X = LSTM(128)(X)
X = Dropout(0.15)(X)
X = Dense(num_activation, activation='softmax')(X)
model = Model(sentence_indices, X)
sequentialModel = Sequential(model.layers)
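For the question about converting words to embeddings outside the network, a sketch using plain NumPy follows. The names `word_to_vec_map` and `word_to_index` mirror the ones in the code above, but the helpers themselves are hypothetical:

```python
import numpy as np

def build_embedding_matrix(word_to_vec_map, word_to_index, emb_dim):
    # Row i holds the vector of the word whose index is i; row 0 is
    # reserved for padding / unknown words.
    matrix = np.zeros((len(word_to_index) + 1, emb_dim))
    for word, idx in word_to_index.items():
        matrix[idx] = word_to_vec_map[word]
    return matrix

def sentences_to_embeddings(sentences, word_to_index, emb_matrix, max_len):
    # Map each word to its index (0 if unknown), pad to max_len,
    # then look the vectors up in one vectorized indexing step.
    indices = np.zeros((len(sentences), max_len), dtype=np.int32)
    for i, sent in enumerate(sentences):
        for j, word in enumerate(sent.split()[:max_len]):
            indices[i, j] = word_to_index.get(word, 0)
    return emb_matrix[indices]  # shape: (batch, max_len, emb_dim)
```

The resulting array can be fed to a model whose first layer is the LSTM (with `Input(shape=(max_len, emb_dim))`), skipping the Embedding layer entirely; this sidesteps the Embedding-weight serialization problem in both Keras and DL4J, at the cost of doing the lookup on the Java side as well.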
Thanks in advance for your help.